nlp:patent_domain_nlp

Differences

This shows you the differences between two versions of the page.

nlp:patent_domain_nlp [2022/06/01 07:05] – [Papers] jmflanig
nlp:patent_domain_nlp [2023/06/15 07:36] (current) – external edit 127.0.0.1
Line 1: Line 1:
 ====== Patent Domain NLP ===== ====== Patent Domain NLP =====
 +NLP in the patent domain. Overlaps with [[Legal Domain NLP]] and [[Scientific Text Processing]].
  
 ===== Papers ===== ===== Papers =====
Line 5: Line 6:
     * [[https://aclanthology.org/J13-3009.pdf|D'hondt et al 2013 - Text Representations for Patent Classification]]     * [[https://aclanthology.org/J13-3009.pdf|D'hondt et al 2013 - Text Representations for Patent Classification]]
     * [[https://aclanthology.org/U18-1013.pdf|Hepburn 2018 - Universal Language Model Fine-tuning for Patent Classification]]     * [[https://aclanthology.org/U18-1013.pdf|Hepburn 2018 - Universal Language Model Fine-tuning for Patent Classification]]
 +    * [[https://aclanthology.org/D19-1344.pdf|Niu & Cai 2019 - A Label Informative Wide & Deep Classifier for Patents and Papers]]
   * **Translation**   * **Translation**
     * [[https://aclanthology.org/volumes/2005.mtsummit-wpt/|2005 Workshop on Patent Translation]]     * [[https://aclanthology.org/volumes/2005.mtsummit-wpt/|2005 Workshop on Patent Translation]]
 +    * [[http://www.lrec-conf.org/proceedings/lrec2012/pdf/1043_Paper.pdf|Nanba et al 2012 - Automatic Translation of Scholarly Terms into Patent Terms Using Synonym Extraction Techniques]]
     * [[https://aclanthology.org/2020.lrec-1.465.pdf|Soares et al 2020 - ParaPat: The Multi-Million Sentences Parallel Corpus of Patents Abstracts]]     * [[https://aclanthology.org/2020.lrec-1.465.pdf|Soares et al 2020 - ParaPat: The Multi-Million Sentences Parallel Corpus of Patents Abstracts]]
   * **Retrieval**   * **Retrieval**
Line 14: Line 17:
   * **Keyword Extraction**   * **Keyword Extraction**
     * [[https://aclanthology.org/C16-1113.pdf|Suzuki & Takatsuka 2016 - Extraction of Keywords of Novelties From Patent Claims]]     * [[https://aclanthology.org/C16-1113.pdf|Suzuki & Takatsuka 2016 - Extraction of Keywords of Novelties From Patent Claims]]
-  * **Information Extraction or Analysis**
 +  * **Information Extraction and Other Analysis**
     * [[https://aclanthology.org/W03-2008.pdf|Sheremetyeva 2003 - Natural Language Analysis of Patent Claims]]     * [[https://aclanthology.org/W03-2008.pdf|Sheremetyeva 2003 - Natural Language Analysis of Patent Claims]]
 +    * [[http://www.lrec-conf.org/proceedings/lrec2010/pdf/81_Paper.pdf|Galibert et al 2010 - Hybrid Citation Extraction from Patents]]
 +    * [[https://aclanthology.org/W19-5035.pdf|Zhai et al 2019 - Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings]]
 +    * [[https://aclanthology.org/2020.icon-main.1.pdf|Nittala & Shrivastava 2020 - The WEAVE Corpus: Annotating Synthetic Chemical Procedures in Patents with Chemical Named Entities]]
 +    * [[https://aclanthology.org/2020.coling-main.54.pdf|Hu & Verberne 2020 - Named Entity Recognition for Chinese Biomedical Patents]]
   * **Patent Claim Construction**   * **Patent Claim Construction**
     * [[https://aclanthology.org/W96-0407.pdf|Sheremetyeva, Nirenburg & Nirenburg 1996 - Generating Patent Claims from Interactive Input]]     * [[https://aclanthology.org/W96-0407.pdf|Sheremetyeva, Nirenburg & Nirenburg 1996 - Generating Patent Claims from Interactive Input]]
Line 23: Line 30:
  
 ===== Datasets ===== ===== Datasets =====
-  * CMUmine: [[https://aclanthology.org/2021.nllp-1.21.pdf|paper]] Patent application dataset. Contains patent claims section, used for automatic construction of patent claims
 +  * EuroPat: Sentence-Aligned European Patent Corpus [[https://europat.net/|website]] [[https://aclanthology.org/2011.eamt-1.25.pdf|2011 paper]]
   * ParaPat: [[https://aclanthology.org/2020.lrec-1.465.pdf|paper]] Large parallel corpus of patent abstracts (68M sentences total)   * ParaPat: [[https://aclanthology.org/2020.lrec-1.465.pdf|paper]] Large parallel corpus of patent abstracts (68M sentences total)
 +  * CMUmine: [[https://aclanthology.org/2021.nllp-1.21.pdf|paper]] [[https://drive.google.com/drive/u/0/folders/1J4sAcM_21G39VuZT1jv6RqLTEM_UngWS|dataset]] [[https://drive.google.com/file/d/18YF3RvOzwIZQvLVilcAVNLYUCm6v_Wx8/view?usp=sharing|backup copy]] Patent application dataset. Contains the patent claims section; used for automatic construction of patent claims. Warning: the authors seem to be unaware of the EuroPat dataset and of the large body of prior work on NLP for patents.
  
 ===== Workshops ===== ===== Workshops =====
Line 31: Line 39:
  
 ===== Related Pages ===== ===== Related Pages =====
-  * [[Legal Domain NLP]]
 +  * [[Legal Domain NLP]] Some overlap with patent domain NLP
 +  * [[Scientific Text Processing]] Some overlap with patent domain NLP
nlp/patent_domain_nlp.1654067114.txt.gz · Last modified: 2023/06/15 07:36 (external edit)
