nlp:prompting · revisions 2025/04/05 00:55 and 2026/02/13 00:31 (current), by jmflanig

==== Chain of Thought Prompting ====
See also [[Reasoning#Reasoning Chains|Reasoning - Reasoning Chains]].

  * **Overviews**
  * [[https://arxiv.org/pdf/2210.01240.pdf|Saparov & He 2022 - Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought]]
  * [[https://arxiv.org/pdf/2210.03629.pdf|Yao et al 2022 - ReAct: Synergizing Reasoning and Acting in Language Models]] - The basis of LangChain
  * **[[https://arxiv.org/pdf/2211.12588|Chen et al 2022 - Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks]]**
  * [[https://arxiv.org/pdf/2305.04091.pdf|Wang et al 2023 - Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models]]
  * [[https://arxiv.org/pdf/2305.14992|Hao et al 2023 - Reasoning with Language Model is Planning with World Model]]
  * [[https://arxiv.org/pdf/2402.10200.pdf|Wang & Zhou et al 2024 - Chain-of-Thought Reasoning Without Prompting]]
  * [[https://arxiv.org/pdf/2403.02178|Chen et al 2024 - Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models]] Masks the CoT to get better results
  * [[https://arxiv.org/pdf/2502.15589|Zhang et al 2025 - LightThinker: Thinking Step-by-Step Compression]]
  * [[https://arxiv.org/pdf/2505.24217|Leng et al 2025 - Semi-structured LLM Reasoners Can Be Rigorously Audited]] A William Cohen paper
  * **Analysis of Chain of Thought**
    * [[https://arxiv.org/pdf/2310.07923|Merrill & Sabharwal 2024 - The Expressive Power of Transformers with Chain of Thought]]
    * [[https://arxiv.org/pdf/2502.21212|Huang et al 2025 - Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought]] See the related work section for further references
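The entries above center on eliciting intermediate reasoning before the final answer. A minimal sketch of the two standard prompt shapes: the zero-shot "Let's think step by step." trigger from Kojima et al 2022, and a few-shot variant using the well-known tennis-ball demonstration from Wei et al 2022. The function name and formatting are illustrative, not code from any cited paper.

```python
def make_cot_prompt(question: str, zero_shot: bool = True) -> str:
    """Build a chain-of-thought prompt for a reasoning question."""
    if zero_shot:
        # Zero-shot CoT: append a reasoning trigger phrase (Kojima et al 2022).
        return f"Q: {question}\nA: Let's think step by step."
    # Few-shot CoT: prepend a worked example whose answer shows its reasoning
    # (the demonstration below is the tennis-ball problem from Wei et al 2022).
    demo = (
        "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 "
        "tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
    )
    return demo + f"Q: {question}\nA:"
```

The model's continuation is then expected to contain the reasoning chain; an answer extractor typically looks for a pattern like "The answer is ...".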
  
==== Cross-lingual Prompting ====
  * **Overviews**
    * [[https://arxiv.org/pdf/2304.08354.pdf|Qin et al 2023 - Tool Learning with Foundation Models]]
    * [[https://modelcontextprotocol.io/docs/getting-started/intro|Model Context Protocol]] A standard introduced by Anthropic in 2024
  * [[https://arxiv.org/pdf/2210.03629.pdf|Yao et al 2022 - ReAct: Synergizing Reasoning and Acting in Language Models]]. This kind of thing is implemented in [[https://github.com/hwchase17/langchain|LangChain]]
  * [[https://arxiv.org/abs/2302.04761|Schick et al 2023 - Toolformer: Language Models Can Teach Themselves to Use Tools]]
    * Uses [[https://rapidapi.com/|RapidAPI]]
  * [[https://arxiv.org/pdf/2402.01869|2024 - InferCept: Efficient Intercept Support for Augmented Large Language Model Inference]]
  * [[https://arxiv.org/pdf/2409.00920|Liu et al 2024 - ToolACE: Winning the Points of LLM Function Calling]]
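ReAct-style tool use (Yao et al 2022, listed above) interleaves free-text "Thought" lines with "Action" lines naming a tool and its argument, followed by "Observation" lines fed back from the tool. A minimal sketch of extracting tool calls from such a trace, assuming the common `Action: tool[input]` line format; the tool names `search` and `finish` are hypothetical.

```python
import re

# Assumed line format: "Action: tool_name[argument]", one call per line,
# as in ReAct-style traces. MULTILINE lets ^ and $ anchor at each line.
ACTION_RE = re.compile(r"^Action:\s*(\w+)\[(.*)\]\s*$", re.MULTILINE)

def parse_actions(model_output: str) -> list[tuple[str, str]]:
    """Extract (tool, argument) pairs from a ReAct-style trace."""
    return ACTION_RE.findall(model_output)

trace = (
    "Thought: I need the capital of France.\n"
    "Action: search[capital of France]\n"
    "Observation: Paris.\n"
    "Thought: I can answer now.\n"
    "Action: finish[Paris]\n"
)
# parse_actions(trace) -> [('search', 'capital of France'), ('finish', 'Paris')]
```

In a full agent loop, each parsed action would be executed and its result appended to the prompt as an Observation before the model continues.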
  
==== Prompt Compression ====
  
==== Data Contamination Issues ====
See also [[ml:Membership Inference]].
  * **Overviews**
    * [[https://arxiv.org/pdf/2404.00699.pdf|Ravaut et al 2024 - How Much are LLMs Contaminated? A Comprehensive Survey and the LLMSanitize Library]]
  * [[https://arxiv.org/pdf/2312.16337|Li & Flanigan 2023 - Task Contamination: Language Models May Not Be Few-Shot Anymore]]
  * LLMSanitize: [[https://arxiv.org/pdf/2404.00699.pdf|Ravaut et al 2024 - How Much are LLMs Contaminated? A Comprehensive Survey and the LLMSanitize Library]]
  * [[https://arxiv.org/pdf/2404.18543|Drinkall et al 2024 - Time Machine GPT]]
  * GSM1k: [[https://arxiv.org/pdf/2405.00332|Zhang et al 2024 - A Careful Examination of Large Language Model Performance on Grade School Arithmetic]] Re-evaluates GSM8K with a new dataset
  
==== Dependence on Number of Examples ====
  * [[https://arxiv.org/pdf/2211.15661.pdf|Akyürek et al 2022 - What learning algorithm is in-context learning? Investigations with linear models]]
  * [[https://arxiv.org/pdf/2310.15916.pdf|Hendel et al 2023 - In-Context Learning Creates Task Vectors]]
  * [[https://arxiv.org/pdf/2505.05145|Hu et al 2025 - Understanding In-context Learning of Addition via Activation Subspaces]] Great paper. Fig 1 is awesome.
  * [[https://arxiv.org/pdf/2504.00132|Bakalova et al 2025 - Contextualize-then-Aggregate: Circuits for In-Context Learning in Gemma-2 2B]]
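The papers above probe how in-context learning works: the model picks up a task from input/output demonstrations placed in the prompt, with no weight updates. A minimal sketch of formatting k demonstrations plus a query; the `Input:`/`Output:` labels and separators are illustrative, and the addition task merely echoes the setting studied in Hu et al 2025.

```python
def build_icl_prompt(demos: list[tuple[str, str]], query: str) -> str:
    """Format (input, output) demonstrations followed by a final query.

    The model is expected to continue the pattern and emit the output
    for the last input.
    """
    parts = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

demos = [("2 + 3", "5"), ("7 + 4", "11")]  # toy addition demonstrations
prompt = build_icl_prompt(demos, "6 + 8")
```

Varying `len(demos)` in a harness like this is the usual way to measure the dependence on the number of examples that this section is about.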
  
===== Datasets =====