ChEMU: Cheminformatics Elsevier Melbourne University
Information Extraction from Chemical Patents
If you are interested in participating in the CLEF 2020 ChEMU task on information extraction from chemical patents, please register here:
We will be running a new evaluation lab named ChEMU, part of the 11th Conference and Labs of the Evaluation Forum (CLEF 2020).
ChEMU proposes two key information extraction tasks over chemical reactions from patents.
- Task 1: Named Entity Recognition involves identifying chemical compounds and their types in context, that is, assigning each compound a label according to the role it plays within a chemical reaction.
- Task 2: Event extraction over chemical reactions involves event trigger detection and argument recognition.
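To make the two tasks more concrete, the following is a minimal sketch of how their outputs could be represented. It is only an illustration: the class names, the role labels (`STARTING_MATERIAL`, `SOLVENT`, `PRODUCT`, `REACTION_STEP`), the argument names, and the example sentence are all assumptions made for this sketch and are not the official ChEMU annotation schema or data format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Entity:
    """A text span with a role label (the kind of output Task 1 asks for)."""
    start: int   # character offset, inclusive
    end: int     # character offset, exclusive
    label: str   # hypothetical role label, not an official ChEMU label


@dataclass
class Event:
    """A trigger span plus its labelled arguments (the kind of output Task 2 asks for)."""
    trigger: Entity
    arguments: Dict[str, List[Entity]] = field(default_factory=dict)


def find_span(text: str, phrase: str, label: str) -> Entity:
    """Locate the first occurrence of `phrase` in `text` and wrap it as an Entity."""
    start = text.index(phrase)
    return Entity(start, start + len(phrase), label)


def span_text(text: str, entity: Entity) -> str:
    """Return the surface string covered by an entity span."""
    return text[entity.start:entity.end]


# Invented example sentence, not taken from any patent.
sentence = "The amine was dissolved in ethanol and stirred to give the amide."

# Task 1 (sketch): the same kind of mention gets a different label
# depending on the role it plays in the reaction.
amine = find_span(sentence, "amine", "STARTING_MATERIAL")
ethanol = find_span(sentence, "ethanol", "SOLVENT")
amide = find_span(sentence, "amide", "PRODUCT")

# Task 2 (sketch): a trigger word is detected and linked to its arguments.
trigger = find_span(sentence, "stirred", "REACTION_STEP")
event = Event(trigger=trigger, arguments={"solvent": [ethanol], "product": [amide]})

for role, entities in event.arguments.items():
    print(role, [span_text(sentence, e) for e in entities])
```

Storing character offsets rather than raw strings keeps each annotation tied to its exact position in the text, which matters when the same compound name occurs more than once in a passage.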
More details coming soon!
This project is a collaboration between the University of Melbourne natural language processing group in the School of Computing and Information Systems, the Elsevier Content Transformations, Life Science team, and RMIT University. The principal investigator of the project is Karin Verspoor. The research is supported by an Australian Research Council Linkage Project, LP160101469, and Elsevier.
- Zhai Z, Nguyen DQ, Akhondi S, Thorne C, Druckenbrodt C, Cohn T, Gregory M and Verspoor K. (2019) Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings. Proceedings of the Workshop on Biomedical Natural Language Processing (BioNLP) at ACL 2019. https://www.aclweb.org/anthology/W19-5035.pdf
- Yoshikawa H, Verspoor K, Baldwin T, Nguyen DQ, Zhai Z, Akhondi S, Thorne C, Druckenbrodt C. (2019) Detecting Chemical Reaction Schemes in Patents. Australian Language Technology Association Workshop (ALTA 2019). Sydney, Australia, December 2019. https://www.aclweb.org/anthology/U19-1014.pdf
- Nguyen DQ, Zhai Z, Yoshikawa H, Fang B, Druckenbrodt C, Thorne C, Hoessel R, Akhondi SA, Cohn T, Baldwin T and Verspoor K. (2020) ChEMU: Named Entity Recognition and Event Extraction of Chemical Reactions from Patents. To appear in ECIR 2020.
- Registration opens: 20 November 2019
- Registration closes: 26 April 2020
- Sample set release: (early March)
- Training set release: (mid March)
- Test set release: (TBD)
- End of Evaluation Cycle: 10 May 2020
- Submission of Participant Papers [CEUR-WS]: 24 May 2020
- Review process of participant papers: 24 May – 14 June 2020
- Submission of Condensed Lab Overviews [LNCS]: 31 May 2020
- Notification of Acceptance Participant Papers [CEUR-WS]: 14 June 2020
- Notification of Acceptance Condensed Lab Overviews [LNCS]: 7 June 2020
- Camera Ready Copy of Condensed Lab Overviews [LNCS]: 21 June 2020
- Camera Ready Copy of Participant Papers and Extended Lab Overviews [CEUR-WS]: 28 June 2020
- CEUR-WS Working Notes Preview for Checking by Authors and Lab Organizers: 17 – 22 July 2020