Browsing Research Institute in Information and Language Processing by Subjects
Bilingual contexts from comparable corpora to mine for translations of collocations

Due to the limited availability of parallel data in many languages, we propose a methodology that benefits from comparable corpora to find translation equivalents for collocations (as a specific type of difficult-to-translate multi-word expressions). Finding translations is known to be more difficult for collocations than for words. We propose a method based on bilingual context extraction and build a word (distributional) representation model drawing on these bilingual contexts (bilingual English-Spanish contexts in our case). We show that the bilingual context construction is effective for the task of translation equivalent learning and that our method outperforms a simplified distributional similarity baseline in finding translation equivalents.
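
The general idea of ranking translation candidates by distributional similarity over bilingual contexts can be illustrated with a toy sketch. This is not the authors' implementation: it assumes that each pair of comparable documents is merged into one shared bilingual context, that collocations have already been joined into single tokens (for example `heavy_rain`), and that raw co-occurrence counts with cosine similarity stand in for the representation model. The function names `build_vectors`, `cosine`, and `rank_translations`, and the example data, are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_vectors(doc_pairs):
    """Build co-occurrence vectors over merged bilingual contexts.

    doc_pairs: list of (english_tokens, spanish_tokens) taken from
    comparable (not parallel) documents. Assumption: each pair forms
    one bilingual context shared by all of its tokens.
    """
    vectors = defaultdict(Counter)
    for en_tokens, es_tokens in doc_pairs:
        context = Counter(en_tokens) + Counter(es_tokens)
        for term in set(en_tokens) | set(es_tokens):
            vectors[term].update(context)
            # Do not count a term as its own context feature.
            vectors[term][term] -= context[term]
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_translations(source_term, candidates, vectors):
    """Rank target-language candidates by similarity to source_term."""
    scored = [(c, cosine(vectors[source_term], vectors[c])) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy comparable document pairs (illustrative data only).
pairs = [
    (["heavy_rain", "flooded", "streets"], ["lluvia_intensa", "inundó", "calles"]),
    (["heavy_rain", "storm", "warning"], ["lluvia_intensa", "tormenta", "aviso"]),
    (["strong_tea", "breakfast", "cup"], ["té_cargado", "desayuno", "taza"]),
]

vectors = build_vectors(pairs)
ranked = rank_translations("heavy_rain", ["lluvia_intensa", "té_cargado"], vectors)
print(ranked)
```

In this toy setting the candidate that shares bilingual contexts with the source collocation ranks first. A realistic system would need collocation extraction, weighting of the counts, and a much larger corpus.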