ml:distributed_training

  * [[https://arxiv.org/pdf/2401.10241.pdf|Qi et al 2024 - Zero Bubble Pipeline Parallelism]]
  * [[https://arxiv.org/pdf/2401.02669|Lin et al 2024 - Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache]]
  * [[https://arxiv.org/pdf/2408.04093|Shyam et al 2024 - Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters]]
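
The pipeline-parallel work above builds on the basic layer-wise split of a model across devices. As a point of reference, here is a minimal sketch of naive model parallelism in PyTorch; the device names, layer sizes, and two-stage split are illustrative assumptions, not drawn from the papers.

<code python>
# Minimal sketch of naive model (layer-wise) parallelism in PyTorch.
# Sizes, device names, and the two-stage split are illustrative only.
import torch
import torch.nn as nn

class TwoStageMLP(nn.Module):
    def __init__(self, dev0="cuda:0", dev1="cuda:1"):
        super().__init__()
        self.dev0, self.dev1 = dev0, dev1
        # Stage 1 lives on the first device, stage 2 on the second.
        self.stage1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to(dev0)
        self.stage2 = nn.Linear(4096, 10).to(dev1)

    def forward(self, x):
        # Activations are copied across devices at the stage boundary.
        h = self.stage1(x.to(self.dev0))
        return self.stage2(h.to(self.dev1))

if torch.cuda.device_count() >= 2:
    model = TwoStageMLP()
    out = model(torch.randn(8, 1024))
    print(out.shape)  # torch.Size([8, 10])
</code>

With this naive split, each device sits idle while the other computes; that idle time is the "bubble" that pipeline schedules such as Qi et al's zero-bubble approach aim to eliminate by interleaving forward and backward passes of micro-batches.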
  
===== Distributed Serving (Inference) =====