====== Mixture of Expert Models ======

===== MoE Large Language Models =====
  * [[https://arxiv.org/pdf/2406.18219|Lo et al 2024 - A Closer Look into Mixture-of-Experts in Large Language Models]]
  * [[https://arxiv.org/pdf/2410.07348|Jin et al 2024 - MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts]]
  * [[https://arxiv.org/pdf/2505.21411|Tang et al 2025 - Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity]]
  * [[https://arxiv.org/pdf/2505.22323|Guo et al 2025 - Advancing Expert Specialization for Better MoE]]
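For orientation across the papers above, here is a minimal sketch of the sparse top-k routing at the core of MoE layers: a learned router scores each token against every expert, only the k highest-scoring experts run on that token, and their outputs are mixed with renormalized gate weights. This is an illustrative NumPy toy, not code from any of the listed papers; the function name, shapes, and the random linear "experts" are all assumptions.

<code python>
import numpy as np

def top_k_moe(x, gate_W, experts, k=2):
    """Sparse top-k MoE layer (illustrative sketch, not from the papers above).

    x:       (n_tokens, d_model) token activations
    gate_W:  (d_model, n_experts) router weights
    experts: list of callables, each mapping (d_model,) -> (d_model,)
    """
    logits = x @ gate_W                    # router scores, (n_tokens, n_experts)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top = np.argsort(logits[t])[-k:]   # indices of the k best-scoring experts
        w = np.exp(logits[t, top])
        w /= w.sum()                       # softmax over the selected experts only
        for weight, e in zip(w, top):      # run and mix just those k experts
            out[t] += weight * experts[e](x[t])
    return out

# Toy usage: 4 tokens, model dim 8, 4 random linear "experts"
rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.standard_normal((4, d))
gate_W = rng.standard_normal((d, n_experts))
experts = [lambda v, W=rng.standard_normal((d, d)): v @ W
           for _ in range(n_experts)]
print(top_k_moe(x, gate_W, experts).shape)  # (4, 8)
</code>

Because each token activates only k experts, per-token compute stays roughly flat as the expert count grows; that sparsity is what the papers above refine, e.g. with zero-computation experts (Jin et al) or grouped experts (Tang et al).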
  
===== People =====