Research

Speculative Decoding for Vision-Language-Action Models

Accelerating robot control with efficient draft models

Role

Draft model architecture design, training pipeline implementation, evaluation

Investigated whether Mamba state-space models can serve as efficient draft models for speculative decoding, speeding up inference for Vision-Language-Action (VLA) systems in robotic manipulation. Implemented and evaluated Mamba-based draft models against Llama baselines on the LIBERO-Goal benchmark, exploring architectural approaches for real-time robot action generation. Team project for CS229 Machine Learning.
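For context, the core mechanism is the standard draft-and-verify loop of speculative decoding: a small draft model proposes several tokens cheaply, and the larger target model checks them in a single forward pass, keeping the longest agreeing prefix. The sketch below is a minimal greedy version of that loop, not this project's exact implementation; `target` and `draft` are assumed to be Hugging Face-style causal language models exposing `.logits`, and the draft length `k` is an illustrative parameter.

```python
import torch

@torch.no_grad()
def speculative_decode(target, draft, prompt_ids, num_new_tokens, k=4):
    """Greedy speculative decoding: the draft proposes k tokens, the target
    verifies them in one forward pass and keeps the longest agreeing prefix."""
    ids = prompt_ids
    stop_len = prompt_ids.shape[1] + num_new_tokens
    while ids.shape[1] < stop_len:
        t = ids.shape[1]

        # 1) Draft model proposes k tokens autoregressively (cheap per step).
        draft_ids = ids
        for _ in range(k):
            next_tok = draft(draft_ids).logits[:, -1, :].argmax(dim=-1, keepdim=True)
            draft_ids = torch.cat([draft_ids, next_tok], dim=-1)
        proposed = draft_ids[:, t:]                            # (1, k)

        # 2) Target scores prompt + proposal in a single forward pass.
        tgt_logits = target(draft_ids).logits                  # (1, t + k, vocab)
        tgt_pred = tgt_logits[:, t - 1:-1, :].argmax(dim=-1)   # target's pick at each proposed slot

        # 3) Accept the longest prefix where draft and target agree.
        agree = (proposed == tgt_pred)[0].long()
        n_accept = int(agree.cumprod(dim=0).sum().item())

        # 4) Append accepted tokens plus one correction token from the target,
        #    so each iteration makes progress even when n_accept == 0.
        correction = tgt_logits[:, t - 1 + n_accept, :].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, proposed[:, :n_accept], correction], dim=-1)
    return ids[:, :stop_len]
```

In the VLA setting, the tokens being drafted and verified would be OpenVLA's discretized action tokens rather than text, but the verification logic is the same; under greedy decoding the output matches what the target model alone would produce, so the speedup comes purely from how many drafted tokens are accepted per target pass.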

Tech Stack

Python, PyTorch, Mamba SSM, OpenVLA, LIBERO
