    * [[https://arxiv.org/pdf/2401.14556|2024 - Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling]] Says "LLMs fall short of achieving state-of-the-art results in information extraction (IE) tasks, many of which are formulated as sequence labeling (SL)"
    * [[https://arxiv.org/pdf/2404.05961|LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders]] Shows that Mistral was probably pre-trained using some bi-directional attention.
    * [[https://arxiv.org/pdf/2504.06225|Zhang et al 2025 - Encoder-Decoder Gemma: Improving the Quality-Efficiency Trade-Off via Adaptation]] They claim to be the first to adapt pretrained decoder-only LLMs into encoder-decoder models, which is incorrect.
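
The bidirectional-attention point in the LLM2Vec entry above comes down to the attention mask: a decoder-only model uses a lower-triangular (causal) mask, and the adaptation simply allows every position to attend to every other, as in an encoder. A minimal numpy sketch (function names are illustrative, not from the LLM2Vec codebase):

```python
import numpy as np

def causal_mask(seq_len):
    # Decoder-only attention: position i may attend only to positions <= i,
    # so the allowed entries form a lower-triangular matrix.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def bidirectional_mask(seq_len):
    # Encoder-style adaptation: drop the causal constraint so every
    # position can attend to every other position.
    return np.ones((seq_len, seq_len), dtype=bool)

# For seq_len=4: the causal mask allows 10 attention pairs, the
# bidirectional mask allows all 16.
print(causal_mask(4).sum(), bidirectional_mask(4).sum())
```

In practice the mask swap alone degrades a causally pre-trained model, which is why such adaptations typically follow it with additional training on a bidirectional objective.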
  
===== Parameter-Efficient Tuning (PET) =====
ml/fine-tuning.1747290955.txt.gz · Last modified: 2025/05/15 06:35 by jmflanig
