Analysis of method of discovering synsets representing identical concepts
Abstract
Incoming article date: 16.07.2015
The paper analyses a method for detecting synsets that represent the same concept. The method rests on the assumption that two synonyms together determine a particular meaning (a concept, or sense). It follows that if two synsets share at least two words, they represent the same concept. Since a concept should not appear in a thesaurus more than once, the method can be used to clean up the synsets of lexical resources. However, this criterion of synset similarity has not previously been studied. The paper describes an experiment that used data from the open Russian thesaurus YARN together with a survey of native Russian speakers. The experiment showed that the precision of the synset similarity criterion is 73%.
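The criterion described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the lowercase normalization step, the pairwise comparison, and the English toy synsets are all assumptions made for illustration, and none of the data comes from YARN. Precision is taken here in the usual sense, as the share of flagged pairs that human judges confirm as the same concept.

```python
from itertools import combinations


def normalize(synset):
    """Lowercase and strip the words of a synset (an assumed preprocessing step)."""
    return frozenset(w.strip().lower() for w in synset if w.strip())


def find_duplicate_candidates(synsets, min_shared=2):
    """Return (i, j, shared_words) for every pair of synsets that
    shares at least `min_shared` words (two, per the criterion)."""
    normalized = [normalize(s) for s in synsets]
    pairs = []
    for i, j in combinations(range(len(normalized)), 2):
        shared = normalized[i] & normalized[j]
        if len(shared) >= min_shared:
            pairs.append((i, j, sorted(shared)))
    return pairs


def precision(confirmed, flagged):
    """Share of flagged pairs that judges confirmed as identical concepts."""
    if flagged <= 0:
        raise ValueError("flagged must be positive")
    return confirmed / flagged


# Hypothetical toy synsets, invented for this example.
synsets = [
    ["car", "automobile", "motorcar"],
    ["Automobile", "car", "machine"],
    ["car", "carriage", "railcar"],
    ["bank", "shore"],
]

candidates = find_duplicate_candidates(synsets)
for i, j, shared in candidates:
    print(f"synsets {i} and {j} share {shared}")

# Hypothetical survey outcome: 3 of 4 flagged pairs confirmed.
print(precision(3, 4))
```

In this toy data only the first two synsets are flagged, because the third shares just one word ("car") with each of them. The pairwise loop is quadratic in the number of synsets; for a full thesaurus, an inverted index from words to synsets would likely be needed to keep the comparison tractable.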
Keywords: lexical resource, dictionary, Wiktionary, crowdsourcing, thesaurus, synonym, synset, semantic relations, similarity measure, Russian language