Probabilistic Bag-Of-Hyperlinks Model for Entity Linking

Ganea, Octavian-Eugen; Horlescu, Marina; Lucchi, Aurelien; Eickhoff, Carsten; Hofmann, Thomas

arXiv:1509.02301v1 (cs)

[Submitted on 8 Sep 2015 (this version), latest version 29 Jan 2016 (v3)]

Abstract: The goal of entity linking is to map spans of text to canonical entity representations such as Freebase entries or Wikipedia articles. It provides a foundation for various natural language processing tasks, including text understanding, summarization and machine translation. Name ambiguity, word polysemy, context dependencies, and a heavy-tailed distribution of entities contribute to the complexity of this problem.
We propose a simple, yet effective, probabilistic graphical model for collective entity linking, which resolves entity links jointly across an entire document. Our model captures local information from linkable token spans (i.e., mentions) and their surrounding context and combines it with a document-level prior of entity co-occurrences. The model is acquired automatically from entity-linked text repositories with a lightweight computational step for parameter adaptation. Loopy belief propagation is used as an efficient approximate inference algorithm.
In contrast to state-of-the-art methods, our model is conceptually simple and easy to reproduce. It comes with a small memory footprint and is sufficiently fast for real-time usage. We demonstrate its benefits on a wide range of well-known entity linking benchmark datasets. Our empirical results show the merits of the proposed approach and its competitiveness in comparison to state-of-the-art methods.
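
The collective inference described in the abstract can be illustrated with a minimal sketch. It combines a local score per mention candidate with a pairwise entity co-occurrence score and runs message passing over all mention pairs. Everything here is an assumption for illustration: the function name `loopy_bp`, the toy scores, and the choice of max-product updates are hypothetical and not taken from the paper, whose exact potentials and update schedule are not given in the abstract.

```python
def loopy_bp(unary, pairwise, n_iters=20):
    """Max-product message passing on a fully connected mention graph.

    unary:    list (one item per mention) of {entity: positive local score}
    pairwise: function (entity_a, entity_b) -> positive co-occurrence score
    Returns (chosen entity per mention, normalized beliefs per mention).
    """
    n = len(unary)
    # msgs[(i, j)][e] is the message from mention i to mention j
    # about candidate entity e of mention j; start uniform.
    msgs = {(i, j): {e: 1.0 for e in unary[j]}
            for i in range(n) for j in range(n) if i != j}

    for _ in range(n_iters):
        new_msgs = {}
        for (i, j) in msgs:
            out = {}
            for e_j in unary[j]:
                best = 0.0
                for e_i, local in unary[i].items():
                    # Product of messages into i from every mention except j.
                    incoming = 1.0
                    for k in range(n):
                        if k != i and k != j:
                            incoming *= msgs[(k, i)][e_i]
                    best = max(best, local * pairwise(e_i, e_j) * incoming)
                out[e_j] = best
            z = sum(out.values())  # normalize for numerical stability
            new_msgs[(i, j)] = {e: v / z for e, v in out.items()}
        msgs = new_msgs

    beliefs = []
    for i in range(n):
        b = {}
        for e, local in unary[i].items():
            v = local
            for k in range(n):
                if k != i:
                    v *= msgs[(k, i)][e]
            b[e] = v
        z = sum(b.values())
        beliefs.append({e: v / z for e, v in b.items()})
    return [max(b, key=b.get) for b in beliefs], beliefs


# Toy data (invented): two mentions, "Jordan" and "Bulls".
unary = [
    {"Michael_Jordan": 0.4, "Jordan_(country)": 0.6},
    {"Chicago_Bulls": 0.9, "Bull": 0.1},
]
# Symmetric co-occurrence scores; unseen pairs default to 1.0.
cooc = {frozenset(("Michael_Jordan", "Chicago_Bulls")): 5.0}


def cooc_score(a, b):
    return cooc.get(frozenset((a, b)), 1.0)


links, beliefs = loopy_bp(unary, cooc_score)
print(links)  # ['Michael_Jordan', 'Chicago_Bulls']
```

Taken alone, the local scores would link "Jordan" to the country. The co-occurrence term with "Chicago_Bulls" flips that decision, which is the point of resolving links jointly. With only two mentions the graph has no cycles, so the result is exact; with three or more mentions the same updates become the loopy, approximate variant.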

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1509.02301 [cs.CL]
	(or arXiv:1509.02301v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1509.02301

Submission history

From: Octavian Ganea
[v1] Tue, 8 Sep 2015 09:43:13 UTC (241 KB)
[v2] Sun, 18 Oct 2015 13:40:31 UTC (246 KB)
[v3] Fri, 29 Jan 2016 19:22:44 UTC (246 KB)
