Computer Science > Computation and Language
[Submitted on 1 May 2020 (this version), latest version 22 Aug 2021 (v4)]
Title: MedType: Improving Medical Entity Linking with Semantic Type Prediction
Abstract: Medical entity linking is the task of identifying and standardizing concepts referred to in a scientific article or clinical record. Existing methods adopt a two-step approach of detecting mentions and identifying a list of candidate concepts for them. In this paper, we probe the impact of incorporating an entity disambiguation step in existing entity linkers. For this, we present MedType, a novel method that leverages the surrounding context to identify the semantic type of a mention and uses it for filtering out candidate concepts of the wrong types. We further present two novel large-scale, automatically created datasets of medical entity mentions: WIKIMED, a Wikipedia-based dataset for cross-domain transfer learning, and PUBMEDDS, a distantly supervised dataset of medical entity mentions in biomedical abstracts. Through extensive experiments across several datasets and methods, we demonstrate that MedType pre-trained on our proposed datasets substantially improves medical entity linking and gives state-of-the-art performance. We make our source code and datasets publicly available for medical entity linking research.
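The type-based filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` class, the `filter_by_type` function, the concept identifiers, and the type labels are all hypothetical, and the predicted semantic types are passed in directly instead of coming from a trained context model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate concept proposed by an entity linker (hypothetical structure)."""
    concept_id: str      # made-up identifier, not a real ontology ID
    semantic_type: str   # coarse semantic type label (illustrative)
    score: float         # linker's own confidence (illustrative)


def filter_by_type(candidates, predicted_types):
    """Keep only candidates whose semantic type is among the predicted types.

    `predicted_types` stands in for the output of a semantic type predictor
    that reads the mention's surrounding context; here it is simply given.
    """
    return [c for c in candidates if c.semantic_type in predicted_types]


# Mention "cold" in "the patient presented with a cold": two candidates of
# different semantic types (both entries are invented for illustration).
candidates = [
    Candidate("EX:common_cold", "Disorder", 0.61),
    Candidate("EX:cold_temperature", "Phenomenon", 0.58),
]

# Assume the type predictor returned "Disorder" for this context.
kept = filter_by_type(candidates, {"Disorder"})
print([c.concept_id for c in kept])
```

In this sketch the filter only prunes the candidate list; ranking the surviving candidates is left to the underlying entity linker, in line with the abstract's framing of type prediction as an added disambiguation step.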
Submission history
From: Shikhar Vashishth
[v1] Fri, 1 May 2020 15:55:50 UTC (2,725 KB)
[v2] Wed, 16 Sep 2020 15:07:32 UTC (1,506 KB)
[v3] Thu, 11 Feb 2021 23:10:29 UTC (8,025 KB)
[v4] Sun, 22 Aug 2021 06:53:08 UTC (2,718 KB)