Efficient Cross-GPU Communication for Disaggregated LLM Serving
CommBridge is a portable communication runtime for disaggregated LLM serving that decouples LLM communication primitives from RDMA backends, improving deployment portability across …
Distributed disaggregated inference for efficient LLM serving.