====== Model Editing and Unlearning ======

//Model editing// modifies a trained model, such as a large language model, to change specific facts that the model has stored. //Machine unlearning// adjusts a trained model to "remove" the influence of one or more datapoints that were used to train it, so that it behaves like a model that was trained without those datapoints. The datapoints to remove can be either specific datapoints from the training set or whole classes of datapoints, such as all datapoints about bioweapons.

===== Model Editing =====

==== In NLP ====

See also [[nlp:Knowledge Editing]].

  * [[https://arxiv.org/pdf/2311.04661.pdf|Tan et al 2023 - Massive Editing for Large Language Models via Meta Learning]]
  * [[https://arxiv.org/pdf/2404.13752|Zhang et al 2024 - Towards General Conceptual Model Editing via Adversarial Representation Engineering]]

===== Machine Unlearning =====

==== Overviews ====

  * [[https://arxiv.org/pdf/2209.02299|Nguyen et al 2022 - A Survey of Machine Unlearning]]
  * [[https://arxiv.org/pdf/2306.03558|Xu et al 2023 - Machine Unlearning: A Survey]]
  * [[https://arxiv.org/pdf/2405.07406|Wang et al 2024 - Machine Unlearning: A Comprehensive Survey]]
  * [[https://arxiv.org/pdf/2407.20516|Liu et al 2024 - Machine Unlearning in Generative AI: A Survey]]
  * **Paper lists**
    * [[https://github.com/chrisliu298/awesome-llm-unlearning|Awesome LLM Unlearning]]
  * **For NLP or LLMs**
    * [[https://arxiv.org/pdf/2402.08787.pdf|Liu et al 2024 - Rethinking Machine Unlearning for Large Language Models]] (This is also a survey paper.)
    * [[https://arxiv.org/pdf/2503.01854|Geng et al 2025 - A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models]]

==== Key Papers ====

  * [[https://browse.arxiv.org/pdf/1912.03817.pdf|Bourtoule et al 2019 - Machine Unlearning]]

==== In NLP or LLMs ====

  * [[https://arxiv.org/pdf/2310.02238.pdf|Eldan & Russinovich 2023 - Who’s Harry Potter? Approximate Unlearning in LLMs]]
  * [[https://arxiv.org/pdf/2310.10683.pdf|Yao et al 2023 - Large Language Model Unlearning]]
  * [[https://arxiv.org/pdf/2401.06121.pdf|Maini et al 2024 - TOFU: A Task of Fictitious Unlearning for LLMs]]
  * [[https://arxiv.org/pdf/2402.08787.pdf|Liu et al 2024 - Rethinking Machine Unlearning for Large Language Models]]
  * [[https://arxiv.org/pdf/2403.03329.pdf|Thaker et al 2024 - Guardrail Baselines for Unlearning in LLMs]]
  * [[https://arxiv.org/pdf/2410.02760|Gandikota et al 2024 - Erasing Conceptual Knowledge from Language Models]]
  * [[https://arxiv.org/pdf/2505.22586|Gur-Arieh et al 2025 - Precise In-Parameter Concept Erasure in Large Language Models]]

==== Theory Papers ====

  * [[https://arxiv.org/pdf/1911.03030|Guo et al 2019 - Certified Data Removal from Machine Learning Models]]

===== Related Pages =====

  * [[Privacy]]
  * [[nlp:Knowledge Editing]]