Research
Speculative Decoding for Vision-Language-Action Models
Accelerating robot control with efficient draft models
Role
Draft model architecture design, training pipeline implementation, evaluation
Investigated whether Mamba state-space models, used as draft models for speculative decoding, could improve inference speed for Vision-Language-Action (VLA) systems in robotic manipulation. Implemented Mamba-based draft models and evaluated them against Llama baselines on the LIBERO-Goal benchmark, exploring architectural approaches for real-time robot action generation. Team project for CS229 (Machine Learning).
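For readers unfamiliar with speculative decoding, the draft-then-verify loop can be sketched roughly as follows. This is a minimal greedy variant for illustration only, not the project's implementation: `toy_target` and `toy_draft` are hypothetical stand-ins for the large model and the small draft model, and the verification loop is written sequentially where a real system would score all proposed tokens in a single batched forward pass.

```python
from typing import Callable, List

# A "model" here is any function mapping a token prefix to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(target: NextToken, draft: NextToken,
                     prefix: List[int], k: int) -> List[int]:
    """Run one draft-then-verify round and return the accepted tokens."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposal: List[int] = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft(ctx)
        proposal.append(tok)
        ctx.append(tok)

    # 2. The target model checks each proposed position in order.
    accepted: List[int] = []
    ctx = list(prefix)
    for tok in proposal:
        expected = target(ctx)
        if tok != expected:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. Every draft token matched, so the target contributes one extra token.
    accepted.append(target(ctx))
    return accepted


def generate(target: NextToken, draft: NextToken,
             prefix: List[int], n_new: int, k: int = 4) -> List[int]:
    """Generate n_new tokens with repeated speculative rounds."""
    out = list(prefix)
    while len(out) - len(prefix) < n_new:
        out.extend(speculative_step(target, draft, out, k))
    return out[: len(prefix) + n_new]


def greedy(model: NextToken, prefix: List[int], n_new: int) -> List[int]:
    """Plain autoregressive greedy decoding, used as the reference."""
    out = list(prefix)
    for _ in range(n_new):
        out.append(model(out))
    return out


# Hypothetical toy models over a 7-token vocabulary (assumptions for the demo).
def toy_target(ctx: List[int]) -> int:
    return (ctx[-1] * 3 + 1) % 7


def toy_draft(ctx: List[int]) -> int:
    # Agrees with the target except after token 5, to force some rejections.
    return 0 if ctx[-1] == 5 else toy_target(ctx)


result = generate(toy_target, toy_draft, [1], n_new=10)
baseline = greedy(toy_target, [1], n_new=10)
print(result == baseline)
```

In this greedy form the output always matches what the target model would have produced on its own, so any speedup depends on how often the draft's proposals are accepted and how cheap the draft is to run.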
Tech Stack
Python, PyTorch, Mamba SSM, OpenVLA, LIBERO
