====== Beam Search ====== For an introduction, see [[https://en.wikipedia.org/wiki/Beam_search|Wikipedia - Beam Search]]. ===== Papers ===== * **Beam search for RNNs**: [[https://arxiv.org/pdf/1509.00685.pdf|Rush et al 2015 - A Neural Attention Model for Abstractive Sentence Summarization]] This may be the first paper to use beam search in seq2seq models (as far as Jeff knows) * **[[https://arxiv.org/pdf/1708.00111|Goyal et al 2017 - A Continuous Relaxation of Beam Search for End-to-end Training of Neural Sequence Models]]** ===== History ===== Invented for decoding in speech recognition, beam search was the de-facto decoding algorithm for statistical machine translation and was also used for statistical parsing. Historical references are here: * Speech Recognition * [[http://jpk.pku.edu.cn/course/sjjg/chapter2/resource/A%20fast%20sequential%20decoding%20algorithm%20using%20a%20stack.pdf|Jelinek 1969 - Fast Sequential Decoding Algorithm Using a Stack]] This is the standard stack-based beam search decoder, used extensively in statistical MT. Cited and used by [[https://aclanthology.org/1988.tmi-1.19.pdf|Brown et al 1988]] for MT. Didn't use the term beam search however. * Machine translation, see [[Statistical Machine Translation]] * [[https://www.aclweb.org/anthology/N03-1017.pdf|Och et al 2003 - Statistical Phrase-Based Translation]] * [[http://webpages.iust.ac.ir/morteza_zakeri/repo/iust_course_materials/NaturalLanguageProcessing/Project/refs/2004_pharaoh_a%20baem%20search_amta2004.pdf|Khoen 2004 - Pharaoh: a beam search decoder for phrase-based statistical machine translation models]] ===== Related Pages ===== * [[Decoding]] * [[Seq2seq]]