nlp:patent_domain_nlp

Differences

This shows you the differences between two versions of the page.

nlp:patent_domain_nlp [2022/06/01 07:42] – [Datasets] jmflanig
nlp:patent_domain_nlp [2023/06/15 07:36] (current) – external edit 127.0.0.1
Line 9: Line 9:
   * **Translation**
     * [[https://aclanthology.org/volumes/2005.mtsummit-wpt/|2005 Workshop on Patent Translation]]
 +    * [[http://www.lrec-conf.org/proceedings/lrec2012/pdf/1043_Paper.pdf|Nanba et al 2012 - Automatic Translation of Scholarly Terms into Patent Terms Using Synonym Extraction Techniques]]
     * [[https://aclanthology.org/2020.lrec-1.465.pdf|Soares et al 2020 - ParaPat: The Multi-Million Sentences Parallel Corpus of Patents Abstracts]]
   * **Retrieval**
Line 29: Line 30:
  
 ===== Datasets =====
-  * CMUmine: [[https://aclanthology.org/2021.nllp-1.21.pdf|paper]] [[https://drive.google.com/drive/u/0/folders/1J4sAcM_21G39VuZT1jv6RqLTEM_UngWS|dataset]] Patent application dataset. Contains patent claims section, used for automatic construction of patent claims
+  * EuroPat: Sentence-Aligned European Patent Corpus: [[https://europat.net/|website]] [[https://aclanthology.org/2011.eamt-1.25.pdf|2011 paper]]
   * ParaPat: [[https://aclanthology.org/2020.lrec-1.465.pdf|paper]] Large parallel corpus of patent abstract (68M sentences total)
 +  * CMUmine: [[https://aclanthology.org/2021.nllp-1.21.pdf|paper]] [[https://drive.google.com/drive/u/0/folders/1J4sAcM_21G39VuZT1jv6RqLTEM_UngWS|dataset]] [[https://drive.google.com/file/d/18YF3RvOzwIZQvLVilcAVNLYUCm6v_Wx8/view?usp=sharing|backup copy]] Patent application dataset. Contains patent claims section, used for automatic construction of patent claims. Warning: They seem to be unaware of the EuroPat dataset, as well as the large amount of prior work on NLP for patents.
  
 ===== Workshops =====
nlp/patent_domain_nlp.1654069377.txt.gz · Last modified: 2023/06/15 07:36 (external edit)
