Projects

An overview of the research directions we work on.

Featured

Systems for low-latency, high-throughput LLM serving and for emerging paradigms such as compound AI systems, agentic workflows, and inference-time scaling.

Cross-stack optimizations and innovations for state space models (SSMs; e.g., Mamba) and attention-SSM hybrid models.

ML for Systems

Machine learning applied to systems problems such as caching, resource management, and network protocol design.

Edge AI Systems

Accelerator Toolkit

Expanding the accelerator toolkit for AI serving.