Distributed Systems

Efficient Cross-GPU Communication for Disaggregated LLM Serving

CommBridge is a portable communication runtime for disaggregated LLM serving that decouples LLM communication primitives from RDMA backends, improving deployment portability across …

Dezhi Yu

KVDirect: Distributed Disaggregated LLM Inference

Distributed disaggregated inference for efficient LLM serving.

s.-chen