Projects
An overview of our research directions.
Featured
LLM Serving
Systems for low-latency, high-throughput LLM serving, including emerging paradigms such as compound AI systems, agentic workflows, and inference-time scaling.
Novel Model Architectures
Cross-stack optimizations and innovations for state space models (SSMs, e.g., Mamba) and hybrid attention-SSM models.
ML for Systems
Applying machine learning to systems problems such as caching, resource management, and network protocol design.
Accelerator Toolkit
Expanding the accelerator toolkit for AI serving.