Búsqueda de Colocaciones en la Web para Sinónimos de Wordnet
PDF (Español (España))

Keywords

Mining of the web
Lexical patterns.

Abstract

There is no doubt that the Web is the biggest information repository ever before constructed by human beings. With more than four billion pages index-linked by the public search engines, the Web represents the biggest and the widest textual corpus available in our days. Because his high linguistic value, because contains information in more than 1500 different languages, this corpus is being used with great success in many tasks related to the natural language process. In particular, several methods of data mining have been applied to extract from the Web some types of linguistic patterns which are useful for a diversity of tasks, such as automatic translations and answer search systems. In this work we presenta method that allows finding significant laying to the different senses attributable to a polysemic word. A collocation is an arbitrary and recurrent combination of words. The experiment results show a great potential of the proposed method to find collocations between words by using the Web as linguistic corpus, as well as the feasibility of incorporating the lexical patterns obtained in word sense disambiguation systems that can be used, for example, in translation machines or information recovery systems. by using the Web as a linguistic corpus, as well as the feasibility of incorporation the lexical patterns obtained in word sense disambiguation systems that can be used, for example, in machines translation or information recovery systems.
https://doi.org/10.15174/au.2005.213
PDF (Español (España))