      * **[[https://arxiv.org/pdf/2303.18223.pdf|Zhao et al 2023 - A Survey of Large Language Models]]**
      * [[https://arxiv.org/pdf/2404.09022|Weng 2024 - Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies]]
      * **[[https://arxiv.org/pdf/2501.17805|2025 - International AI Safety Report]]** (Has a good non-technical overview of AI, ML & LLMs)
  * **Language models in the news, etc**
    * [[https://www.wired.com/story/ai-text-generator-gpt-3-learning-language-fitfully/|Wired - GPT-3]]
| [[https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf|Gemma]] | 2024 | 7B, 2B | | Yes | [[https://blog.google/technology/developers/gemma-open-models/|blog]] |
| [[https://arxiv.org/pdf/2403.19887|Jamba]] | 2024 | 52B | | Yes | [[https://www.ai21.com/blog/announcing-jamba|blog]] [[https://huggingface.co/ai21labs/Jamba-v0.1|HuggingFace]] |
| [[https://arxiv.org/pdf/2404.14619|OpenELM]] | 2024 | 1.1B | | Yes | |
| [[https://arxiv.org/pdf/2507.20534|Kimi K2]] | 2025 | 1T | | Yes | |
  
===== Abilities and Analysis of LLMs =====
    * [[https://arxiv.org/pdf/1805.04623.pdf|Khandelwal et al 2018 - Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context]] (Old, no longer applies to transformer models)
    * [[https://arxiv.org/pdf/2307.03172|Liu 2023 - Lost in the Middle: How Language Models Use Long Contexts]]
      * [[https://link.springer.com/chapter/10.1007/978-3-031-88708-6_16|Hutter et al 2025 - Lost but Not Only in the Middle]]
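
A quick way to probe the positional effects studied in the papers above is a needle-in-a-haystack test: insert a key fact at different depths of a long distractor context and check whether the model can retrieve it. A minimal sketch (''query_model'' is a hypothetical stand-in for whatever LLM API is under test):

<code python>
# Needle-in-a-haystack probe for positional context usage.
# query_model() is a hypothetical placeholder, not a real API.

FILLER = "The sky was clear and the market was quiet that day. "  # distractor sentence
NEEDLE = "The secret code is 4812. "                              # fact to retrieve
QUESTION = "What is the secret code?"

def build_prompt(depth: float, n_sentences: int = 200) -> str:
    """Place the needle at a relative depth (0.0 = start, 1.0 = end)."""
    sentences = [FILLER] * n_sentences
    sentences.insert(int(depth * n_sentences), NEEDLE)
    return "".join(sentences) + "\n" + QUESTION

def query_model(prompt: str) -> str:
    raise NotImplementedError("replace with a call to the LLM under test")

for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
    answer = query_model(build_prompt(depth))
    print(f"depth={depth:.2f}  correct={'4812' in answer}")
</code>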

==== Origin of Capabilities ====
  * [[https://arxiv.org/pdf/2505.23323|Madabushi et al 2025 - Neither Stochastic Parroting nor AGI: LLMs Solve Tasks through Context-Directed Extrapolation from Training Data Priors]]
  * **Machine Translation**
    * [[https://arxiv.org/pdf/2305.10266|Briakou et al 2023 - Searching for Needles in a Haystack: On the Role of Incidental Bilingualism in PaLM's Translation Capability]]
    * [[https://arxiv.org/pdf/2505.23548|Balashov 2025 - Translation in the Wild]]
  
===== Evaluation of LLMs and Benchmarks =====
  * **Overviews**
    * [[https://arxiv.org/pdf/2307.03109|Chang et al 2023 - A Survey on Evaluation of Large Language Models]]
    * For common evaluation datasets for LLMs, see recent LLM system description papers such as the [[https://arxiv.org/pdf/2407.21783|Llama 3 paper]] (table 2) or [[https://www.anthropic.com/news/claude-sonnet-4-5|Claude Sonnet 4.5]] (evaluation table).
  * [[https://github.com/EleutherAI/lm-evaluation-harness|LM Evaluation Harness (EleutherAI)]] (released May 2021; a usage sketch follows at the end of this section)
  * [[https://arxiv.org/pdf/2401.00595|Mizrahi et al 2024 - State of What Art? A Call for Multi-Prompt LLM Evaluation]]
  * **Effects of Length and Irrelevant Context**
    * [[https://arxiv.org/pdf/2402.14848|Levy et al 2024 - Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models]]
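
A usage sketch for the EleutherAI harness linked above, based on its documented Python entry point (the checkpoint and task names here are placeholders, not recommendations):

<code python>
# Minimal run of lm-evaluation-harness from Python (pip install lm-eval).
# Checkpoint and task names below are placeholders.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                    # HuggingFace-backed causal LM
    model_args="pretrained=gpt2",  # any HF checkpoint id
    tasks=["hellaswag"],           # benchmark task(s) to run
    batch_size=8,
)
print(results["results"]["hellaswag"])  # per-task metrics dict
</code>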

===== Tool-Use in LLMs =====
See also [[prompting#Chained or Tool-based Prompting]]. A minimal sketch of the basic tool-calling loop appears at the end of this section.
  * **Overviews and Background**
    * [[https://modelcontextprotocol.io/docs/getting-started/intro|Model Context Protocol]]
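
To make the tool-use idea concrete, here is a library-free sketch of the basic loop (all tool and function names are illustrative; real protocols such as MCP define richer schemas, discovery, and transports):

<code python>
# Library-free sketch of a basic LLM tool-use loop.
# All names are illustrative; nothing here is from a real tool-use API.
import json

def get_weather(city: str) -> str:
    """Example tool the model is allowed to call."""
    return f"Sunny in {city}"  # stubbed result

TOOLS = {"get_weather": get_weather}

def handle_model_output(model_output: str) -> str:
    """If the model emitted a JSON tool call, run the tool and return its
    result (to be fed back to the model); otherwise it is a final answer."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # plain text: no tool call
    tool = TOOLS[call["tool"]]
    return tool(**call["args"])

# The model would emit something like this when it decides to use a tool:
print(handle_model_output('{"tool": "get_weather", "args": {"city": "Oslo"}}'))
</code>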

===== Retrieval-Augmented Generation (RAG) =====
See [[Retrieval-Augmented Methods]].

===== Limitations of Current LLMs =====
  * [[https://aclanthology.org/2025.acl-long.1016.pdf|Shaikh et al 2025 - Navigating Rifts in Human-LLM Grounding: Study and Benchmark]]
  
===== Questions and Critiques of LLMs =====
  * [[https://s10251.pcdn.co/pdf/2021-bender-parrots.pdf|Bender et al 2021 - On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?]]
  * [[https://arxiv.org/pdf/2308.07120|Rogers & Luccioni 2023 - Position: Key Claims in LLM Research Have a Long Tail of Footnotes]]
  
  * Extracting Training Data
    * [[https://arxiv.org/pdf/2012.07805.pdf|Carlini et al 2020 - Extracting Training Data from Large Language Models]] [[https://github.com/ftramer/LM_Memorization|github]]
    * [[https://arxiv.org/pdf/2601.02671|Ahmed et al 2026 - Extracting Books from Production Language Models]]
  * Membership Inference for Training Data
    * (Decide if some sample data is in the training data or not; a loss-based baseline sketch follows below)
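
A common baseline behind both training-data extraction and membership inference is to score candidate text by the model's own loss: text seen during training tends to get unusually low per-token negative log-likelihood. A minimal sketch with HuggingFace transformers (the decision threshold is illustrative only; real attacks such as Carlini et al calibrate scores against reference models or zlib entropy):

<code python>
# Loss-based membership-inference baseline: score text by mean per-token
# negative log-likelihood under the model (pip install torch transformers).
# The decision threshold below is illustrative, not calibrated.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def nll(text: str) -> float:
    """Mean per-token negative log-likelihood of text under the model."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # labels=ids gives shifted cross-entropy loss
    return out.loss.item()

candidate = "My social security number is 078-05-1120."
score = nll(candidate)
print(f"NLL = {score:.2f}  member = {score < 3.0}")  # lower NLL => more suspect
</code>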
  * [[https://arxiv.org/pdf/2201.07207.pdf|Huang et al 2022 - Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents]]
  * [[https://arxiv.org/pdf/2205.11482.pdf|Akyürek et al 2022 - Tracing Knowledge in Language Models Back to the Training Data]]
  * [[https://arxiv.org/pdf/2404.15146|Schwarzschild et al 2024 - Rethinking LLM Memorization through the Lens of Adversarial Compression]]
  
  
  
===== Theoretical and Foundational Papers =====
See also [[Prompting#Analysis of In-Context-Learning]] and [[Language Model#Origin of Capabilities|Language Model - Origin of Capabilities]].
  
=== Emergent Abilities ===
    * [[https://arxiv.org/pdf/2202.07105|Xu & McAuley 2022 - A Survey on Model Compression and Acceleration for Pretrained Language Models]]
    * **[[https://arxiv.org/pdf/2312.03863|Wan et al 2023 - Efficient Large Language Models: A Survey]]** Updated continuously. **See paper list [[https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey|here]]**

===== Economics of LLMs =====
  * [[https://arxiv.org/pdf/2306.07402|Howell et al 2023 - The Economic Trade-offs of Large Language Models: A Case Study]]
  
===== Miscellaneous =====