Differences
This shows you the differences between two versions of the page.
| ml:model_compression [2025/03/26 03:46] – [Related Pages] jmflanig | ml:model_compression [2025/05/12 09:00] (current) – [After Training] jmflanig | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Model Compression ====== | ====== Model Compression ====== | ||
| + | See also [[nn_sparsity|Sparsity in Neural Networks]]. | ||
| ===== Overviews ===== | ===== Overviews ===== | ||
| * **General** | * **General** | ||
| * [[https:// | * [[https:// | ||
| + | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| * See chapter 4 of [[https:// | * See chapter 4 of [[https:// | ||
| Line 13: | Line 15: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| + | * **Distillation** | ||
| + | * [[https:// | ||
| + | |||
| ===== General Papers ===== | ===== General Papers ===== | ||
| - | * [[https:// | + | * [[https:// |
| * [[https:// | * [[https:// | ||
| Line 29: | Line 34: | ||
| * [[https:// | * [[https:// | ||
| * **LLM Pruning** | * **LLM Pruning** | ||
| - | * See overview in section 2.1.2 of [[https:// | + | * See overview in section 2.1.2 of [[https:// |
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| Line 36: | Line 41: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| + | * [[https:// | ||
| ===== Quantization ===== | ===== Quantization ===== | ||
| Line 44: | Line 50: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| + | * **Empirical Studies** | ||
| + | * [[https:// | ||
| + | * [[https:// | ||
| + | * [[https:// | ||
| ==== During Training ==== | ==== During Training ==== | ||
| Line 67: | Line 77: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| + | |||
| + | ===== Parameter Sharing ===== | ||
| + | * HashedNets: [[https:// | ||
| ===== Conferences and Workshops ===== | ===== Conferences and Workshops ===== | ||
| Line 78: | Line 91: | ||
| * [[Conditional Computation]]: invokes only selected parts of the network for each instance (e.g., early exit) | * [[Conditional Computation]]: invokes only selected parts of the network for each instance (e.g., early exit) | ||
| * [[Edge Computing]] | * [[Edge Computing]] | ||
| + | * [[Efficient NNs]] | ||
| * [[Knowledge Distillation]] | * [[Knowledge Distillation]] | ||
| + | * [[nn_sparsity|Sparsity in Neural Networks]] | ||
ml/model_compression.1742960800.txt.gz · Last modified: 2025/03/26 03:46 by jmflanig