33 research outputs found
Recommended from our members
Extrapolating Subjectivity Research to Other Languages
Socrates articulated it best: "Speak, so I may see you." Indeed, language represents an invisible probe into the mind. It is the medium through which we express our deepest thoughts, our aspirations, our views, our feelings, our inner reality. From the beginning of artificial intelligence, researchers have sought to impart human-like understanding to machines. Much of our language is a form of self-expression, capturing thoughts, beliefs, evaluations, opinions, and emotions that are not available for scrutiny by an outside observer; in natural language processing, research on these aspects has crystallized under the name of subjectivity and sentiment analysis. While subjectivity classification labels text as either subjective or objective, sentiment classification further divides subjective text into positive, negative, or neutral. In this thesis, I investigate techniques for generating tools and resources for subjectivity analysis that do not rely on an existing natural language processing infrastructure in a given language. This constraint is motivated by the fact that the vast majority of human languages are electronically under-resourced: they lack basic tools such as part-of-speech taggers and parsers, and basic resources such as electronic text, annotated corpora, or lexica. This severely limits the implementation of techniques on par with those developed for English. By applying methods that are lighter in their use of text processing infrastructure, we are able to conduct multilingual subjectivity research in these languages as well. Since my aim is also to minimize the amount of manual work required to develop lexica or corpora in these languages, the proposed techniques employ a lever approach, in which English often acts as the donor language (the fulcrum of the lever) and makes it possible, with relatively little effort, to establish preliminary subjectivity research in a target language.
Multilingual subjectivity analysis using machine translation
Although research in other languages is increasing, much of the work in subjectivity analysis has been applied to English data, mainly due to the large body of electronic resources and tools available for this language. In this paper, we propose and evaluate methods that can be employed to transfer a repository of subjectivity resources across languages. Specifically, we attempt to leverage the resources available for English and, by employing machine translation, generate resources for subjectivity analysis in other languages. Through comparative evaluations on two different languages (Romanian and Spanish), we show that automatic translation is a viable alternative for the construction of resources and tools for subjectivity analysis in a new target language.
Random-Walk Term Weighting for Improved Text Classification
This paper describes a new approach for estimating term weights in a document and shows how the new weighting scheme can be used to improve the accuracy of a text classifier.
Multilingual Subjectivity: Are More Languages Better?
This paper discusses multilingual subjectivity.
A Bootstrapping Method for Building Subjectivity Lexicons for Languages with Scarce Resources
This article discusses a bootstrapping method for building subjectivity lexicons for languages with scarce resources.
Learning Multilingual Subjective Language via Cross-Lingual Projections
This paper discusses learning multilingual subjective language via cross-lingual projections.
Multilingual Subjectivity Analysis Using Machine Translation
This paper discusses multilingual subjectivity analysis using machine translation.
UNT: SubFinder: Combining Knowledge Sources for Automatic Lexical Substitution
This paper describes the University of North Texas SubFinder system. The system provides the most likely set of substitutes for a word in a given context by combining several techniques and knowledge sources. SubFinder participated successfully in the best and out of ten (oot) tracks of the SEMEVAL lexical substitution task, consistently ranking first or second.