nlp:patent_domain_nlp

Differences

This shows you the differences between two versions of the page.

nlp:patent_domain_nlp [2022/06/01 07:27] – [Papers] jmflanig
nlp:patent_domain_nlp [2023/06/15 07:36] (current) – external edit 127.0.0.1
Line 1: Line 1:
 ====== Patent Domain NLP =====
 +NLP in the patent domain. Overlaps with [[Legal Domain NLP]] and [[Scientific Text Processing]].
  
 ===== Papers =====
Line 8: Line 9:
   * **Translation**
     * [[https://aclanthology.org/volumes/2005.mtsummit-wpt/|2005 Workshop on Patent Translation]]
 +    * [[http://www.lrec-conf.org/proceedings/lrec2012/pdf/1043_Paper.pdf|Nanba et al 2012 - Automatic Translation of Scholarly Terms into Patent Terms Using Synonym Extraction Techniques]]
     * [[https://aclanthology.org/2020.lrec-1.465.pdf|Soares et al 2020 - ParaPat: The Multi-Million Sentences Parallel Corpus of Patents Abstracts]]
   * **Retrieval**
Line 17: Line 19:
   * **Information Extraction and Other Analysis**
     * [[https://aclanthology.org/W03-2008.pdf|Sheremetyeva 2003 - Natural Language Analysis of Patent Claims]]
 +    * [[http://www.lrec-conf.org/proceedings/lrec2010/pdf/81_Paper.pdf|Galibert et al 2010 - Hybrid Citation Extraction from Patents]]
 +    * [[https://aclanthology.org/W19-5035.pdf|Zhai et al 2019 - Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings]]
 +    * [[https://aclanthology.org/2020.icon-main.1.pdf|Nittala & Shrivastava 2020 - The WEAVE Corpus: Annotating Synthetic Chemical Procedures in Patents with Chemical Named Entities]]
     * [[https://aclanthology.org/2020.coling-main.54.pdf|Hu & Verberne 2020 - Named Entity Recognition for Chinese Biomedical Patents]]
   * **Patent Claim Construction**
Line 25: Line 30:
  
 ===== Datasets =====
-  * CMUmine: [[https://aclanthology.org/2021.nllp-1.21.pdf|paper]] Patent application dataset. Contains patent claims section, used for automatic construction of patent claims
+  * EuroPat: Sentence-Aligned European Patent Corpus[[https://europat.net/|website]] [[https://aclanthology.org/2011.eamt-1.25.pdf|2011 paper]]
   * ParaPat: [[https://aclanthology.org/2020.lrec-1.465.pdf|paper]] Large parallel corpus of patent abstract (68M sentences total)
 +  * CMUmine: [[https://aclanthology.org/2021.nllp-1.21.pdf|paper]] [[https://drive.google.com/drive/u/0/folders/1J4sAcM_21G39VuZT1jv6RqLTEM_UngWS|dataset]] [[https://drive.google.com/file/d/18YF3RvOzwIZQvLVilcAVNLYUCm6v_Wx8/view?usp=sharing|backup copy]] Patent application dataset. Contains patent claims section, used for automatic construction of patent claims. Warning: They seem to be unaware of the EuroPat dataset, as well as the large amount of prior work on NLP for patents.
  
 ===== Workshops =====
Line 33: Line 39:
  
 ===== Related Pages =====
-  * [[Legal Domain NLP]]
+  * [[Legal Domain NLP]] Some overlap with patent domain NLP
 +  * [[Scientific Text Processing]] Some overlap with patent domain NLP
nlp/patent_domain_nlp.1654068429.txt.gz · Last modified: 2023/06/15 07:36 (external edit)