Cross-Lingual Adaptation using Structural Correspondence Learning

Prettenhofer, Peter; Stein, Benno

arXiv:1008.0716 (cs)

[Submitted on 4 Aug 2010 (v1), last revised 25 Aug 2010 (this version, v2)]

Abstract: Cross-lingual adaptation, a special case of domain adaptation, refers to the transfer of classification knowledge between two languages. In this article we describe an extension of Structural Correspondence Learning (SCL), a recently proposed algorithm for domain adaptation, to cross-lingual adaptation. The proposed method uses unlabeled documents from both languages, along with a word translation oracle, to induce cross-lingual feature correspondences. From these correspondences a cross-lingual representation is created that enables the transfer of classification knowledge from the source to the target language. The main advantages of this approach over other approaches are its resource efficiency and task specificity.
We conduct experiments in cross-language topic and sentiment classification, with English as the source language and German, French, and Japanese as target languages. The results show a significant improvement of the proposed method over a machine translation baseline, reducing the relative error due to cross-lingual adaptation by an average of 30% (topic classification) and 59% (sentiment classification). We further report on empirical analyses that reveal insights into the use of unlabeled data, the sensitivity with respect to important hyperparameters, and the nature of the induced cross-lingual correspondences.
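
To make the idea of inducing a cross-lingual representation from unlabeled documents and translated word pairs more concrete, the following is a minimal sketch that follows the general SCL recipe (one linear predictor per pivot, then a singular value decomposition of the stacked weight vectors). It is not the authors' implementation: the function name `cl_scl_projection`, the parameters `pivot_pairs`, `k`, and `ridge`, and the toy data are all hypothetical, and ridge regression is swapped in here for whichever linear learner the paper actually uses.

```python
import numpy as np

def cl_scl_projection(X_unlabeled, pivot_pairs, k, ridge=1.0):
    """Sketch of an SCL-style projection (assumed setup, not the paper's code).

    X_unlabeled: (n_docs, n_features) bag-of-words matrix over a joint
                 source+target vocabulary, built from unlabeled documents.
    pivot_pairs: list of (source_col, target_col) index pairs, standing in
                 for the output of a word translation oracle.
    Returns theta with shape (k, n_features).
    """
    X = np.asarray(X_unlabeled, dtype=float)
    n_features = X.shape[1]

    # Hide the pivot words so each predictor must rely on co-occurring words.
    pivot_cols = sorted({col for pair in pivot_pairs for col in pair})
    X_masked = X.copy()
    X_masked[:, pivot_cols] = 0.0

    # Ridge regression normal equations, shared by all pivot predictors.
    gram = X_masked.T @ X_masked + ridge * np.eye(n_features)

    W = np.zeros((n_features, len(pivot_pairs)))
    for j, (s, t) in enumerate(pivot_pairs):
        # Label is +1 if the document contains either word of the pivot pair.
        y = np.where((X[:, s] > 0) | (X[:, t] > 0), 1.0, -1.0)
        W[:, j] = np.linalg.solve(gram, X_masked.T @ y)

    # The top-k left singular vectors of W span the shared representation.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :k].T

# Toy data: 8 features (columns 0-3 "source" words, 4-7 "target" words).
rng = np.random.default_rng(0)
X_toy = rng.integers(0, 3, size=(40, 8))
pivots = [(0, 4), (1, 5)]  # hypothetical translation pairs
theta = cl_scl_projection(X_toy, pivots, k=2)
X_projected = X_toy @ theta.T  # documents mapped into the k-dim space
```

In a setup like this, a classifier trained on the projected source-language documents could be applied to projected target-language documents, since both live in the same `k`-dimensional space. The random toy matrix mixes source and target words within a document, which real monolingual documents would not do.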

Subjects: Information Retrieval (cs.IR)
ACM classes: H.3.3; I.2.7
Cite as: arXiv:1008.0716 [cs.IR] (or arXiv:1008.0716v2 [cs.IR] for this version)
DOI: https://doi.org/10.48550/arXiv.1008.0716

Submission history

From: Peter Prettenhofer
[v1] Wed, 4 Aug 2010 08:42:07 UTC (1,147 KB)
[v2] Wed, 25 Aug 2010 15:52:09 UTC (1,147 KB)
