Alpha

Fix & expand

Current phase

Labeling tool — S/E single-pass

Keys: S = start, E = end, W = save. A/D step ±5 frames, B/F step ±50 frames. Bold overlay.

LR mirror of new dataset

Negate X, swap left/right joint pairs. ~20 labeled videos → 2× dataset.
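A minimal sketch of the mirror step, assuming each frame is a list of (x, y) joints with X centred on 0 (for coordinates normalised to [0, 1], use 1 - x instead of -x). The joint indices in `SWAP_PAIRS` are hypothetical placeholders, not the project's actual layout.

```python
# Hypothetical left/right joint index pairs (assumption, not the real layout),
# e.g. (left_ankle, right_ankle), (left_heel, right_heel).
SWAP_PAIRS = [(0, 1), (2, 3)]


def mirror_frame(joints, swap_pairs=SWAP_PAIRS):
    """Mirror one frame: negate X, then swap each left/right joint pair."""
    mirrored = [(-x, y) for x, y in joints]
    for left, right in swap_pairs:
        mirrored[left], mirrored[right] = mirrored[right], mirrored[left]
    return mirrored


def mirror_clip(frames, swap_pairs=SWAP_PAIRS):
    """Mirror every frame of a clip; gesture labels may need a left/right swap too."""
    return [mirror_frame(frame, swap_pairs) for frame in frames]
```

Mirroring twice should give back the original clip, which makes a cheap sanity check.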

Merge FERN v1 clips

ffmpeg concat + auto-label JSON from 5s clips by gesture folder.
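One way the merge could be sketched, assuming clips are laid out as <gesture>/<clip>.mp4, each exactly 5 s long, at a fixed 30 fps (the fps is an assumption; real frame counts should be checked with ffprobe). `build_labels` and `concat_list` are hypothetical helper names.

```python
from pathlib import Path

CLIP_SECONDS = 5  # from the plan: FERN v1 clips are 5 s long
FPS = 30          # assumption: frame rate is not stated in the plan


def build_labels(clips, fps=FPS, clip_seconds=CLIP_SECONDS):
    """Return one labeled segment per clip, in concat order.

    The label is taken from the clip's parent (gesture) folder name.
    """
    frames_per_clip = fps * clip_seconds
    segments = []
    for i, clip in enumerate(clips):
        start = i * frames_per_clip
        segments.append({
            "label": Path(clip).parent.name,
            "start": start,
            "end": start + frames_per_clip - 1,  # inclusive end frame
            "source": Path(clip).name,
        })
    return segments


def concat_list(clips):
    """Build the text of a list file for ffmpeg's concat demuxer."""
    return "".join(f"file '{Path(c).as_posix()}'\n" for c in clips)
```

The list file would then be passed to `ffmpeg -f concat -safe 0 -i list.txt -c copy merged.mp4`, and the segments dumped with `json.dump`.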

Idle class

Record stand-still videos. Add as class 0. Retrain.

Window offset fix

Shift window ~15 frames earlier to catch gesture onset, not midpoint.
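A sketch of the shifted window, assuming windows are currently centred on a reference frame. The window length of 60 in the usage below is a hypothetical value; only the ~15-frame shift comes from the plan.

```python
def window_bounds(center, length, shift=15, n_frames=None):
    """Return (start, end) for a window moved `shift` frames earlier.

    `end` is exclusive. The window is clamped to [0, n_frames) when
    n_frames is given, keeping its length where possible.
    """
    start = max(center - length // 2 - shift, 0)
    end = start + length
    if n_frames is not None and end > n_frames:
        end = n_frames
        start = max(end - length, 0)
    return start, end
```

For example, `window_bounds(100, 60)` gives `(55, 115)` instead of the centred `(70, 130)`.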

Subject-independent eval

Leave-one-subject-out split. True generalisation accuracy.
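The split can be sketched in plain Python as below, assuming every sample carries a subject ID (the IDs in the usage note are made up). scikit-learn's `LeaveOneGroupOut` does the same job if it is already a dependency.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), one fold per subject.

    All samples from the held-out subject go to the test set, so no
    subject appears on both sides of a fold.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test
```

With subjects `["s1", "s1", "s2", "s3", "s2"]` this yields three folds. Mirrored copies of a clip should keep the original subject ID, otherwise the held-out subject leaks into training.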

CUDA utilisation probe

Check DataLoader workers, batch size, mixed precision.

Beta

Multi-camera + streaming

After Alpha milestones done

45° camera — Option B

Feature-level fusion: concat front+angle → 60-feature input. Fixes heel_tap.
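A sketch of the concat step, assuming the 60 features split evenly into 30 per camera and that both streams are already frame-aligned. The even split and the function names are assumptions.

```python
FEATURES_PER_CAMERA = 30  # assumption: 60 fused features split evenly


def fuse_frame(front, angle, per_camera=FEATURES_PER_CAMERA):
    """Concatenate front and 45° feature vectors for one frame."""
    if len(front) != per_camera or len(angle) != per_camera:
        raise ValueError("each camera must supply %d features" % per_camera)
    return list(front) + list(angle)


def fuse_clip(front_frames, angle_frames):
    """Fuse two frame-aligned clips into one 60-feature sequence."""
    if len(front_frames) != len(angle_frames):
        raise ValueError("clips must have the same number of frames")
    return [fuse_frame(f, a) for f, a in zip(front_frames, angle_frames)]
```

Keeping the front features first means an existing single-camera input layout would stay unchanged in the first 30 columns.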

3D reconstruction — Option C

Stereo triangulation. Paper ablation row. Checkerboard calibration session.

Frame sync buffer

Hold frames until both cameras deliver same timestamp.
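A possible shape for the buffer, assuming each camera delivers (timestamp in ms, frame) and that two cameras named "front" and "angle" are used. The tolerance parameter is an addition: with the default of 0 it matches the "same timestamp" rule, while real cameras may need a few ms of slack.

```python
from collections import deque


class FrameSyncBuffer:
    """Hold frames per camera and release them in timestamp-matched pairs."""

    def __init__(self, tolerance_ms=0, max_len=30):
        self.tolerance_ms = tolerance_ms
        # Bounded queues so a stalled camera cannot grow memory without limit.
        self.queues = {
            "front": deque(maxlen=max_len),
            "angle": deque(maxlen=max_len),
        }

    def push(self, camera, timestamp_ms, frame):
        """Add a frame; return a (front, angle) pair if one is ready, else None."""
        self.queues[camera].append((timestamp_ms, frame))
        return self._pop_pair()

    def _pop_pair(self):
        front, angle = self.queues["front"], self.queues["angle"]
        while front and angle:
            dt = front[0][0] - angle[0][0]
            if abs(dt) <= self.tolerance_ms:
                return front.popleft(), angle.popleft()
            # Heads do not match: drop whichever one is older.
            (front if dt < 0 else angle).popleft()
        return None
```

Unmatched frames are dropped rather than interpolated, which keeps the sketch simple at the cost of losing frames when one camera runs behind.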

DroidGrid integration

Phone → MJPEG socket → MediaPipe pipeline. Overlay streamed back.

Confidence smoothing

Temporal N-frame majority vote. Stops label flicker in live inference.
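The vote can be sketched with a fixed-length history, as below. The default N of 7 is an arbitrary placeholder; a larger N suppresses more flicker but adds latency to label changes.

```python
from collections import Counter, deque


class MajorityVoteSmoother:
    """Return the most common label over the last n per-frame predictions."""

    def __init__(self, n=7):
        self.history = deque(maxlen=n)

    def update(self, label):
        """Record the newest prediction and return the smoothed label."""
        self.history.append(label)
        # On ties, the label that appears earliest in the window wins.
        return Counter(self.history).most_common(1)[0][0]
```

Usage: call `update` once per frame with the raw model prediction and display the returned label instead.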

Gold

Paper-ready

Freeze → write

Augmentation expansion

Mirror + rotation ±5° + brightness. Applied last, after all real data collected.
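A sketch of the ±5° rotation on skeleton data, assuming (x, y) joints rotated about the origin (for example a hip-centred skeleton). Brightness only applies when augmenting raw video frames, so it is not shown here.

```python
import math
import random


def rotate_frame(joints, degrees):
    """Rotate every (x, y) joint about the origin by `degrees`."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(x * c - y * s, x * s + y * c) for x, y in joints]


def random_rotation(frames, max_degrees=5.0, rng=random):
    """Apply one random angle in [-max_degrees, max_degrees] to a whole clip.

    Returns (rotated_frames, angle) so the angle can be logged.
    """
    angle = rng.uniform(-max_degrees, max_degrees)
    return [rotate_frame(frame, angle) for frame in frames], angle
```

Using one angle per clip, not per frame, keeps the motion itself intact; passing a seeded `random.Random` as `rng` keeps the augmented set reproducible.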

Ablation study

Single-cam vs Option B vs Option C. Three rows in results table.

Dataset freeze

No new data after this point. All eval numbers reproducible.

Paper sections

Methodology, results, related work, conclusion. Architecture diagrams.

Deployment package

Self-contained ZIP. install.ps1. Tested on 3 machines including CPU-only.

Release

Publish & open

~18 weeks from now

Paper submitted

IEEE Sensors Journal (primary) or MDPI Sensors. ISMAR as backup.

GitHub open-source

Full source + weights + README + demo video.

Dataset public

Skeleton CSVs released. Raw videos optional (consent required).

Post-release

Future paths

Choose one or more

Industrial deployment

Factory floor ergonomics. Edge device (Jetson). ISO compliance.

VR/AR locomotion

Foot gestures as controller input. ISMAR / IEEE VR target.

Physiotherapy

Rehab exercise counting and form correction. Clinical partnership.

Unsupervised segmentation

Auto-segment unlabeled videos. PhD-level research direction.

Multi-person

Track multiple subjects simultaneously. Crowd analysis use case.