Search CORE

33 research outputs found

Recommended from our members

Extrapolating Subjectivity Research to Other Languages

Author: Banea Carmen
Publication venue: 'University of North Texas Libraries'
Publication date: 01/05/2013

Socrates articulated it best: "Speak, so I may see you." Indeed, language represents an invisible probe into the mind. It is the medium through which we express our deepest thoughts, our aspirations, our views, our feelings, our inner reality. From the beginning of artificial intelligence, researchers have sought to impart human-like understanding to machines. As much of our language represents a form of self-expression, capturing thoughts, beliefs, evaluations, opinions, and emotions which are not available for scrutiny by an outside observer, in the field of natural language processing, research involving these aspects has crystallized under the name of subjectivity and sentiment analysis. While subjectivity classification labels text as either subjective or objective, sentiment classification further divides subjective text into positive, negative, or neutral. In this thesis, I investigate techniques of generating tools and resources for subjectivity analysis that do not rely on an existing natural language processing infrastructure in a given language. This constraint is motivated by the fact that the vast majority of human languages are resource-poor from an electronic point of view: they lack basic tools such as part-of-speech taggers and parsers, or basic resources such as electronic text, annotated corpora, or lexica. This severely limits the implementation of techniques on par with those developed for English. By applying methods that are lighter in their use of text processing infrastructure, we are able to conduct multilingual subjectivity research in these languages as well. Since my aim is also to minimize the amount of manual work required to develop lexica or corpora in these languages, the techniques proposed employ a lever approach, where English often acts as the donor language (the fulcrum in a lever) and makes it possible, with relatively little effort, to establish preliminary subjectivity research in a target language.

UNT Digital Library

Random-Walk Term Weighting for Improved Text Classification

Author: Carmen Banea
Rada Mihalcea
Samer Hassan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

Crossref

Multilingual subjectivity analysis using machine translation

Author: Carmen Banea
Janyce Wiebe
Rada Mihalcea
Samer Hassan
Publication date: 01/01/2008

Although research in other languages is increasing, much of the work in subjectivity analysis has been applied to English data, mainly due to the large body of electronic resources and tools that are available for this language. In this paper, we propose and evaluate methods that can be employed to transfer a repository of subjectivity resources across languages. Specifically, we attempt to leverage the resources available for English and, by employing machine translation, generate resources for subjectivity analysis in other languages. Through comparative evaluations on two different languages (Romanian and Spanish), we show that automatic translation is a viable alternative for the construction of resources and tools for subjectivity analysis in a new target language.

CiteSeerX

Crossref

Random-Walk Term Weighting for Improved Text Classification

Author: Banea Carmen
Hassan Samer
Mihalcea Rada, 1974-
Publication date: 01/09/2007

This paper describes a new approach for estimating term weights in a document and shows how the new weighting scheme can be used to improve the accuracy of a text classifier.

UNT Digital Library

Multilingual Subjectivity: Are More Languages Better?

Author: Banea Carmen
Mihalcea Rada, 1974-
Wiebe Janyce M.
Publication date: 01/08/2010

This paper discusses multilingual subjectivity.

UNT Digital Library

Women's Syntactic Resilience and Men's Grammatical Luck: Gender-Bias in Part-of-Speech Tagging and Dependency Parsing

Author: Aparna Garimella
Carmen Banea
Dirk Hovy
Rada Mihalcea
Publication date: 01/01/2019

Archivio istituzionale della Ricerca - Bocconi

Crossref

Open Access Repository

A Bootstrapping Method for Building Subjectivity Lexicons for Languages with Scarce Resources

Author: Banea Carmen
Mihalcea Rada, 1974-
Wiebe Janyce M.
Publication venue: Evaluations and Language Resources Distribution Agency
Publication date: 01/05/2008

This article discusses a bootstrapping method for building subjectivity lexicons for languages with scarce resources.

UNT Digital Library

Learning Multilingual Subjective Language via Cross-Lingual Projections

Author: Banea Carmen
Mihalcea Rada, 1974-
Wiebe Janyce M.
Publication date: 01/06/2007

This paper discusses learning multilingual subjective language via cross-lingual projections.

UNT Digital Library

Multilingual Subjectivity Analysis Using Machine Translation

Author: Banea Carmen
Hassan Samer
Mihalcea Rada, 1974-
Wiebe Janyce M.
Publication date: 01/10/2008

This paper discusses multilingual subjectivity analysis using machine translation.

UNT Digital Library

UNT: SubFinder: Combining Knowledge Sources for Automatic Lexical Substitution

Author: Banea Carmen
Csomai Andras
Hassan Samer
Mihalcea Rada, 1974-
Sinha Ravi
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/06/2007

This paper describes the University of North Texas SubFinder system. The system provides the most likely set of substitutes for a word in a given context by combining several techniques and knowledge sources. SubFinder participated in the "best" and "out of ten" (oot) tracks of the SEMEVAL lexical substitution task, consistently ranking in first or second place.

UNT Digital Library