nlp:explainability

Differences

This shows you the differences between two versions of the page.

nlp:explainability [2025/03/27 00:01] – [Surveys] jmflanig
nlp:explainability [2025/06/01 23:17] (current) – [Related Pages] jmflanig
Line 35: Line 35:
   * **Overviews**
     * [[https://arxiv.org/pdf/2401.12874|Luo & Specia 2024 - From Understanding to Utilization: A Survey on Explainability for Large Language Models]]
-    * [[https://arxiv.org/pdf/2402.10688|Zhao et al 2024 - Towards Uncovering How Large Language Model Works: An Explainability Perspective]]
+    * [[https://arxiv.org/pdf/2402.10688|Zhao et al 2024 - Towards Uncovering How Large Language Model Works: An Explainability Perspective]] This is an ok paper, but it cites almost none of the work before 2021 or work outside of the mechanistic interpretability literature.
   * **Resources**
     * [[https://burnycoder.github.io/Landing/Contents/Exobrain/Topics/Mechanistic%20interpretability/|Paper list]]
Line 42: Line 42:
     * [[https://arxiv.org/pdf/2305.08809|Wu et al 2023 - Interpretability at Scale: Identifying Causal Mechanisms in Alpaca]]
     * [[https://arxiv.org/pdf/2305.19911|Foote et al 2023 - Neuron to Graph: Interpreting Language Model Neurons at Scale]]
-    * **[[https://arxiv.org/pdf/2402.10688|Zhao et al 2024 - Towards Uncovering How Large Language Model Works: An Explainability Perspective]]** Good review paper

 ===== Natural Language Explanations =====
Line 64: Line 63:
   * [[ml:Neural Network Psychology]]
   * [[Probing Experiments]]
-  * [[Reasoning Chains]]
+  * [[Reasoning#Reasoning Chains|Reasoning - Reasoning Chains]]
   * [[ml:Trustworthy AI]]
   * [[ml:Visualizing Neural Networks]]
nlp/explainability.1743033699.txt.gz · Last modified: 2025/03/27 00:01 by jmflanig
