Gait Analysis

One-line summary: Recognizing individuals by their walking pattern — a soft biometric that works at a distance without subject cooperation, using silhouette-based or skeleton-based representations.

Modality: Gait
Related concepts: Deep Learning Architectures for Biometrics, Transformer Architectures for Biometrics, Multimodal Biometrics, Bias and Fairness in Biometrics, Real World Biometric Deployments, Biometric Datasets and Benchmarks
Last updated: 2026-04-04


Overview

Gait recognition identifies people by how they walk. It is one of the few biometrics that can operate at long range (roughly 50–100+ meters) and without subject awareness or cooperation, which makes it valuable for surveillance and forensic applications.

Two dominant representation paradigms:

  1. Appearance-based (silhouette) — Extract binary silhouettes from video frames, aggregate into a compact temporal representation (Gait Energy Image), and match via CNN or metric learning.
  2. Model-based (skeleton/pose) — Estimate body joint positions per frame using pose estimation (OpenPose, HRNet, MediaPipe), then model temporal dynamics of the skeleton sequence using RNNs, GCNs, or Transformers.
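
To make the appearance-based representation concrete, here is a minimal sketch of a Gait Energy Image computed as the per-pixel average of binary silhouettes. It assumes the silhouettes are already cropped, size-normalized, centered, and span roughly one gait cycle; the function name `gait_energy_image` and the toy data are illustrative only.

```python
import numpy as np

def gait_energy_image(silhouettes: np.ndarray) -> np.ndarray:
    """Average a stack of binary silhouettes (T, H, W) into one GEI (H, W)."""
    frames = np.asarray(silhouettes)
    if frames.ndim != 3 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty array of shape (T, H, W)")
    # Binarize defensively so 0/255 masks and 0/1 masks behave the same.
    binary = (frames > 0).astype(np.float32)
    # Each output pixel is the fraction of frames in which it was foreground.
    return binary.mean(axis=0)

# Toy example (assumed data): 4 frames of a 2x2 mask.
toy = np.zeros((4, 2, 2), dtype=np.uint8)
toy[:, 0, 0] = 255   # foreground in every frame
toy[:2, 1, 1] = 255  # foreground in half of the frames
gei = gait_energy_image(toy)
```

Bright GEI pixels correspond to body regions that stay put across the cycle (torso, head), while intermediate values trace the swinging limbs.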

Pipeline

  1. Video capture — Surveillance camera, depth sensor, or radar.
  2. Person detection + tracking — Detect and track the subject across frames.
  3. Silhouette extraction / Pose estimation — Background subtraction or segmentation for silhouettes; 2D/3D pose estimation for skeletons.
  4. Temporal representation — GEI (silhouette), or joint coordinate sequences (skeleton).
  5. Feature extraction — CNN on GEI; GCN/Transformer on skeleton sequences.
  6. Matching — Cosine similarity on embeddings.
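
The final matching step can be sketched as a nearest-neighbor search under cosine similarity. This is a minimal illustration that assumes embeddings have already been produced by some feature extractor; `match_probes`, the identity labels, and the 2-D toy embeddings are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    x = np.asarray(x, dtype=np.float64)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def match_probes(probe_emb, gallery_emb, gallery_ids):
    """Return the best-matching gallery identity and its score for each probe."""
    sims = l2_normalize(probe_emb) @ l2_normalize(gallery_emb).T  # (P, G)
    best = sims.argmax(axis=1)
    scores = sims[np.arange(len(best)), best]
    return [gallery_ids[i] for i in best], scores

# Toy example (assumed data): three enrolled identities, two probes.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
gallery_ids = ["alice", "bob", "carol"]
probes = np.array([[0.9, 0.1], [0.1, 2.0]])
pred_ids, scores = match_probes(probes, gallery, gallery_ids)
```

In a verification setting the returned score would be compared against a threshold rather than simply taking the best match.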

Technical Details

Silhouette-Based Methods

| Method | Year | Key idea |
|---|---|---|
| Gait Energy Image (GEI) | 2006 | Average silhouette over one gait cycle; simple and effective baseline |
| GaitSet (Chao et al.) | 2019 | Set-based: treats gait as an unordered set of silhouettes; no explicit temporal modeling |
| GaitPart | 2020 | Part-based: splits silhouette into horizontal strips and learns local features |
| GaitGL (Lin et al.) | 2021 | Global-local feature extraction; combines set-level and part-level features |
| OpenGait (Fan et al.) | 2023 | Unified open-source framework; reproduces GaitSet, GaitPart, GaitGL, and adds new baselines |
| GaitBase / GaitGCI | 2023 | Strong ResNet-based baseline that outperforms many specialized methods; shows backbone matters more than architecture tricks |
| DeepGaitV2 | 2024 | Explores diverse backbone architectures (ConvNeXt, Swin, etc.) for gait |
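
Two ideas from the table, order-invariant set pooling (GaitSet) and horizontal strip partitioning (GaitPart), can be illustrated without any learned layers. The sketch below operates on an assumed array of per-frame feature maps and is not the published architectures; function names are hypothetical, and max-plus-mean strip pooling is just one common choice.

```python
import numpy as np

def set_pool(frame_feats):
    """Max over the frame axis: (T, C, H, W) -> (C, H, W).

    The result does not depend on frame order, which is the sense in
    which the sequence is treated as a set.
    """
    return np.asarray(frame_feats).max(axis=0)

def horizontal_strips(feat_map, num_strips):
    """Split (C, H, W) into strips along the height and pool each strip.

    Returns an array of shape (num_strips, C), one descriptor per body band.
    """
    c, h, w = feat_map.shape
    if h % num_strips != 0:
        raise ValueError("H must be divisible by num_strips")
    strips = feat_map.reshape(c, num_strips, h // num_strips, w)
    pooled = strips.max(axis=(2, 3)) + strips.mean(axis=(2, 3))  # (C, S)
    return pooled.T

rng = np.random.default_rng(0)
feats = rng.random((6, 3, 8, 5))              # 6 frames, 3 channels, 8x5 maps
pooled = set_pool(feats)
shuffled = set_pool(feats[rng.permutation(6)])  # same frames, different order
parts = horizontal_strips(pooled, num_strips=4)
```

Here `pooled` and `shuffled` are identical, and each row of `parts` describes one horizontal band of the body (head, torso, legs, and so on).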

Skeleton-Based Methods

| Method | Year | Key idea |
|---|---|---|
| PoseGait (Liao et al.) | 2020 | 3D pose features (joint angles, limb lengths) + FC network |
| GaitGraph (Teepe et al.) | 2021 | Graph Convolutional Network on skeleton sequences |
| GaitGraph2 | 2022 | Multi-scale graph convolutions + augmentation strategies |
| GaitMixer | 2023 | MLP-Mixer on skeleton sequences; competitive with GCN approaches |
| GaitTR | 2024 | Transformer on skeleton sequences; best skeleton-based results on GREW |
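
As a rough illustration of what a graph convolution on a skeleton sequence does, the sketch below applies one spatial graph-convolution layer with a symmetrically normalized adjacency matrix. It is a generic GCN layer, not the GaitGraph architecture; the 5-joint chain, the random weights, and the function names are assumptions for illustration.

```python
import numpy as np

# Hypothetical 5-joint chain; real skeletons (e.g. 17 COCO keypoints) are larger.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]

def normalized_adjacency(num_joints, edges):
    """Build D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph."""
    a = np.eye(num_joints)  # self-loops keep each joint's own features
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, a_hat, weight):
    """One layer: mix features across neighboring joints, project, apply ReLU.

    x: (T, J, C_in), a_hat: (J, J), weight: (C_in, C_out) -> (T, J, C_out)
    """
    return np.maximum(np.einsum("jk,tkc,cd->tjd", a_hat, x, weight), 0.0)

rng = np.random.default_rng(0)
seq = rng.standard_normal((30, 5, 2))  # 30 frames, 5 joints, (x, y) coordinates
w = rng.standard_normal((2, 8))        # random stand-in for learned weights
a_hat = normalized_adjacency(5, EDGES)
out = graph_conv(seq, a_hat, w)
embedding = out.mean(axis=(0, 1))      # naive pooling over time and joints
```

Practical models stack several such layers, interleave them with temporal convolutions or attention, and train the weights with a metric-learning loss.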

Covariate Challenges

  • Clothing and carrying conditions — Coats, bags, and different shoes alter silhouette shape significantly.
  • View angle — Cross-view matching is a core challenge; most methods learn view-invariant features.
  • Speed — Walking speed affects gait cycle length and joint dynamics.
  • Surface/terrain — Inclines, stairs, different flooring change gait patterns.
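
Because walking speed changes the gait cycle length, pipelines that work on single cycles need some way to estimate the period. One simple heuristic, sketched below, takes a per-frame 1-D signal (for example the foreground pixel count in the leg region, which is an assumption here) and picks the lag with the highest autocorrelation. The function name and the synthetic sine signals are illustrative.

```python
import numpy as np

def estimate_period(signal, min_lag=5):
    """Estimate the dominant period (in frames) of a roughly periodic signal."""
    x = np.asarray(signal, dtype=np.float64)
    x = x - x.mean()
    # Autocorrelation for non-negative lags 0..N-1.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    max_lag = len(x) // 2  # longer lags have too little overlap to trust
    return int(min_lag + np.argmax(ac[min_lag:max_lag]))

# Synthetic stand-ins for a slow and a fast walker (assumed data).
t = np.arange(200)
slow_walk = np.sin(2 * np.pi * t / 40)
fast_walk = np.sin(2 * np.pi * t / 25)
slow_period = estimate_period(slow_walk)
fast_period = estimate_period(fast_walk)
```

Real silhouette signals are noisier, and depending on the signal chosen the detected period may correspond to a single step rather than a full two-step cycle.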

Datasets

| Dataset | Subjects | Setting | Notes |
|---|---|---|---|
| CASIA-B | 124 | Indoor, 11 views, 3 conditions (normal, bag, coat) | Classic benchmark; limited scale |
| OU-MVLP | 10,307 | Indoor, 14 views | Largest lab-based gait dataset |
| GREW (Gait Recognition in the Wild) | 26,345 | Outdoor, natural, unconstrained | Wild gait from street cameras; 4 subsets |
| Gait3D | 4,000 | In the wild (supermarket), multi-camera video | 3D body meshes (SMPL) recovered from video, alongside silhouettes and poses |
| OUMVLP-Pose | 10,307 | Indoor, skeleton | Pose-estimated version of OU-MVLP |
| FVG (Frontal-View Gait) | 226 | Indoor, frontal view only | Frontal gait recognition |

Challenges

  • In-the-wild performance — Lab-to-wild generalization remains the biggest gap; GREW benchmark is humbling even for SOTA methods (Rank-1 ~70–80% vs. >95% on CASIA-B).
  • Silhouette quality — Background subtraction fails in crowded scenes, poor lighting, and dynamic backgrounds. Semantic segmentation (Mask R-CNN) helps but adds compute.
  • Clothing and carrying conditions — Still the primary covariate, degrading accuracy by roughly 10–30%.
  • Occlusion — Partial body visibility in real scenes; part-based methods handle this better.
  • Privacy and ethics — Gait can be captured covertly; raises surveillance concerns. See Bias and Fairness in Biometrics, Privacy Preserving Biometrics.
  • Low uniqueness — Gait has lower discriminative power than face, iris, or fingerprint; typically used as a soft biometric or for re-identification rather than 1:N identification at scale.
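
To see why silhouette quality is fragile, consider the simplest form of background subtraction: estimate the background as the per-pixel median over time and threshold the difference. The sketch below assumes a static camera, grayscale frames, and a subject that covers any given pixel in fewer than half of the frames; the function name and threshold are illustrative.

```python
import numpy as np

def median_background_silhouettes(frames, threshold=30):
    """Return binary masks (T, H, W) from grayscale frames (T, H, W)."""
    frames = np.asarray(frames, dtype=np.float32)
    # The median is a robust background estimate only if the scene is mostly static.
    background = np.median(frames, axis=0)
    return (np.abs(frames - background) > threshold).astype(np.uint8)

# Toy example (assumed data): uniform background with a bright 2-pixel
# "subject" moving one column to the right per frame.
frames = np.full((5, 4, 6), 100, dtype=np.uint8)
for t in range(5):
    frames[t, 1:3, t] = 200
masks = median_background_silhouettes(frames)
```

Each assumption maps onto a failure mode from the list above: crowds and dynamic backgrounds break the static-scene assumption, and poor lighting shrinks the intensity gap that the threshold relies on.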

State of the Art (SOTA)

As of early 2026:

  • CASIA-B (NM, BG, CL): Rank-1 ~98%, ~95%, and ~88%, respectively, for top silhouette methods (GaitBase + augmentation).
  • OU-MVLP: Rank-1 ~93% (GaitGL, DeepGaitV2).
  • GREW (in the wild): Rank-1 ~75% (best published); significant room for improvement.
  • Gait3D: Rank-1 ~55–65% (challenging in-the-wild 3D benchmark).
  • Skeleton-based on GREW: Rank-1 ~60% (GaitTR); silhouette methods still dominate.
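
For readers unfamiliar with the metric, Rank-1 is the fraction of probe sequences whose nearest gallery entry has the correct identity, and Rank-k relaxes this to the k nearest entries. The sketch below is a generic computation on an assumed probe-by-gallery distance matrix; benchmark-specific protocol details (which views or conditions go in the gallery, for example) are not modeled, and the function name is hypothetical.

```python
import numpy as np

def rank_k_accuracy(dist, probe_ids, gallery_ids, k=1):
    """Fraction of probes whose true identity appears among the k nearest gallery entries."""
    dist = np.asarray(dist)
    probe_ids = np.asarray(probe_ids)
    gallery_ids = np.asarray(gallery_ids)
    nearest = np.argsort(dist, axis=1)[:, :k]  # (P, k) gallery indices
    hits = (gallery_ids[nearest] == probe_ids[:, None]).any(axis=1)
    return float(hits.mean())

# Toy example (assumed data): 3 probes against 3 gallery entries.
dist = np.array([[0.1, 0.9, 0.5],
                 [0.7, 0.6, 0.2],
                 [0.3, 0.8, 0.4]])
probe_ids = [1, 2, 3]
gallery_ids = [1, 2, 3]
rank1 = rank_k_accuracy(dist, probe_ids, gallery_ids, k=1)
rank2 = rank_k_accuracy(dist, probe_ids, gallery_ids, k=2)
```

In this toy case only the first probe is matched correctly at rank 1, while all three are recovered within the top two candidates.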

Open Questions

  • Can multimodal (silhouette + skeleton + depth) fusion close the gap on in-the-wild benchmarks?
  • Will self-supervised pre-training on large-scale unlabeled walking video improve generalization?
  • How can gait recognition be made truly invariant to clothing without losing gait-specific discriminative features?
  • Can radar-based gait (through-wall, privacy-preserving) become practical for smart home or healthcare applications?
  • Ethical framework: should gait recognition be regulated differently from face recognition given its covert nature?

References

  • Chao, H. et al. (2019). GaitSet: Regarding Gait as a Set for Cross-View Gait Recognition. AAAI.
  • Fan, C. et al. (2023). OpenGait: Revisiting Gait Recognition Towards Better Practicality. CVPR.
  • Zhu, Z. et al. (2021). Gait Recognition in the Wild: A Benchmark. ICCV.
  • Teepe, T. et al. (2021). GaitGraph: Graph Convolutional Network for Skeleton-Based Gait Recognition. ICIP.

Backlinks: Deep Learning Architectures for Biometrics, Transformer Architectures for Biometrics, Multimodal Biometrics, Bias and Fairness in Biometrics, Biometric Datasets and Benchmarks, Real World Biometric Deployments