25 research outputs found
Combining granularity-based topic-dependent and topic-independent evidences for opinion detection
Opinion mining, a sub-discipline within Information Retrieval (IR) and Computational Linguistics, refers to the computational techniques for extracting, classifying, understanding, and assessing the opinions expressed in various online sources such as news articles, social media comments, and other user-generated content. It is also known by many other terms, such as opinion finding, opinion detection, sentiment analysis, sentiment classification, polarity detection, etc. Defined in a more specific and simpler context, opinion mining is the task of retrieving opinions on an information need expressed by the user in the form of a query. There are many problems and challenges associated with opinion mining, and in this thesis we focus on some of them. One of the major challenges of opinion mining is finding opinions that specifically concern the given topic (query). A document may contain information on many topics at once, and it may contain opinionated text about each of those topics or about only a few of them. It therefore becomes very important to select the topic-relevant segments of the document together with their corresponding opinions. We address this problem at two levels of granularity: sentences and passages. In our first, sentence-level approach, we use semantic relations from WordNet to find this association between topic and opinion. In our second, passage-level approach, we use a more robust IR model, namely the language model, to focus on this problem.
The basic idea behind both contributions to topic-opinion association is that if a document contains more topic-relevant opinionated text segments (sentences or passages), it is more opinionated than a document with fewer such segments. Most machine-learning-based approaches to opinion mining are domain-dependent, i.e. their performance varies from one domain to another. On the other hand, a domain- or topic-independent approach is more generalizable and can maintain its effectiveness across different domains. However, domain-independent approaches generally suffer from poor performance. Developing an approach that is both effective and generalizable is a major challenge in the field of opinion mining. Our contributions in this thesis include the development of an approach that uses simple heuristic features to find opinionated documents. Entity-based opinion mining is becoming very popular among researchers in the IR community. It aims to identify the entities relevant to a given topic and to extract the opinions associated with them from a set of text documents. However, identifying entities and determining their relevance is already a difficult task. We propose a system that takes into account information from both the current news article and relevant earlier articles in order to detect the most important entities in the current news. In addition, we also present our framework for opinion analysis and related tasks. This framework relies on content evidence and social evidence from the blogosphere for the tasks of opinion finding, opinion prediction, and multidimensional review ranking. This early contribution lays the groundwork for our future work.
The evaluation of our methods includes the use of the TREC 2006 Blog collection and the TREC 2004 Novelty track collection. Most of the evaluations were carried out within the framework of the TREC Blog track.
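The sentence-level topic-opinion association idea described in this abstract can be illustrated with a minimal sketch: score a document by counting sentences that are both relevant to the query topic and opinionated. The tiny opinion lexicon and the term-overlap relevance test below are illustrative assumptions, not the thesis's actual WordNet-based method.

```python
# Hypothetical sketch: score a document by counting sentences that are
# both topic-relevant and opinionated. The opinion lexicon and the
# overlap-based relevance heuristic are illustrative assumptions only.
OPINION_WORDS = {"good", "bad", "great", "terrible", "love", "hate", "excellent"}

def sentence_relevant(sentence, query_terms):
    # Relevance heuristic: the sentence shares at least one term with the query.
    return bool(set(sentence.lower().split()) & query_terms)

def sentence_opinionated(sentence):
    # Opinion heuristic: the sentence contains at least one opinion word.
    return bool(set(sentence.lower().split()) & OPINION_WORDS)

def opinion_score(document, query):
    """Count sentences that are both topic-relevant and opinionated."""
    query_terms = set(query.lower().split())
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    return sum(
        1 for s in sentences
        if sentence_relevant(s, query_terms) and sentence_opinionated(s)
    )

doc = ("The new phone camera is great. The screen resolution is 1080p. "
       "I hate the phone battery life.")
print(opinion_score(doc, "phone"))  # 2 topic-relevant opinionated sentences
```

Under this scheme, a document with more topic-relevant opinionated sentences receives a higher opinion score, mirroring the ranking intuition stated above.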
Predictive Modeling for Navigating Social Media
Social media changes the way people use the Web. It has transformed ordinary Web users from information consumers into content contributors. One popular form of content contribution is social tagging, in which users assign tags to Web resources. Through the collective efforts of the social tagging community, a new information space has been created for information navigation. Navigation allows serendipitous discovery of information by examining the information objects linked to one another in the social tagging space. In this dissertation, we study prediction tasks that facilitate navigation in social tagging systems. For social tagging systems to meet the complex navigation needs of users, two issues are fundamental, namely link sparseness and object selection. Link sparseness is observed for many resources that are untagged or inadequately tagged, hindering navigation to those resources. Object selection arises when a large number of information objects are linked to the current object, requiring the more interesting or relevant ones to be selected in order to guide navigation effectively. This dissertation focuses on three dimensions, namely the semantic, social and temporal dimensions, to address link sparseness and object selection. To address link sparseness, we study the task of tag prediction. This task aims to enrich tags for the untagged or inadequately tagged resources, such that the predicted tags can serve as navigable links to these resources. For this task, we take a topic modeling approach to exploit the latent semantic relationships between resource content and tags. To address object selection, we study the tasks of personalized tag recommendation and trend discovery using social annotations. Personalized tag recommendation leverages the collective wisdom of the social tagging community to recommend tags that are semantically relevant to the target resource, while being tailored to the tagging preferences of individual users. 
For this task, we propose a probabilistic framework which leverages the implicit social links between like-minded users, i.e. who show similar tagging preferences, to recommend suitable tags. Social tags capture the interest of the users in the annotated resources at different times. These social annotations allow us to construct temporal profiles for the annotated resources. By analyzing these temporal profiles, we unveil the non-trivial temporal trends of the annotated resources, which provide novel metrics for selecting relevant and interesting resources for guiding navigation. For trend discovery using social annotations, we propose a trend discovery process which enables us to analyze trends for a multitude of semantics encapsulated in the temporal profiles of the annotated resources
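The temporal-profile idea described above can be sketched as follows: bucket each resource's tag assignments into time windows, then use window-over-window growth as a simple trend metric for selecting resources. The annotation format, window scheme, and growth metric here are illustrative assumptions, not the dissertation's actual trend discovery process.

```python
from collections import Counter, defaultdict

# Hypothetical sketch: build temporal profiles for annotated resources by
# counting tag assignments per time window, then score trends by growth
# relative to the previous window. All details are illustrative assumptions.
def temporal_profiles(annotations, window_size):
    """annotations: iterable of (resource, tag, timestamp) triples."""
    profiles = defaultdict(Counter)
    for resource, tag, ts in annotations:
        window = ts // window_size          # discretise time into windows
        profiles[resource][window] += 1     # count annotations per window
    return profiles

def trend_score(profile, current_window):
    """Growth of annotation activity versus the previous window."""
    prev = profile.get(current_window - 1, 0)
    now = profile.get(current_window, 0)
    return (now - prev) / (prev + 1)        # +1 smoothing for empty windows

annotations = [
    ("photo1", "sunset", 100), ("photo1", "beach", 105),
    ("photo1", "sunset", 210), ("photo1", "hdr", 215),
    ("photo1", "sunset", 220),
    ("photo2", "cat", 101),
]
profiles = temporal_profiles(annotations, window_size=100)
print(trend_score(profiles["photo1"], current_window=2))  # growth of 1/3
```

Resources whose profiles show rising activity would then be favoured when selecting objects to guide navigation.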
Making sense of strangers' expertise from digital artifacts
In organizations, individuals typically rely on their personal networks to obtain expertise when faced with ill-defined problems that require answers beyond the scope of their own knowledge. However, individuals cannot always get the needed expertise from their local colleagues. This issue is particularly acute for members of large, geographically dispersed organizations, since it is difficult to know "who knows what" among numerous colleagues. The proliferation of social computing technologies such as blogs, online forums, social tags and bookmarks, and social network connection information has expanded the reach and ease with which knowledge workers may become aware of others' expertise. While all these technologies facilitate access to a stranger who can potentially provide needed expertise or advice, there has been little theoretical work on how individuals actually go about this process. I refer to the process of gathering complex, changing and potentially equivocal information, and comprehending it by connecting nuggets of information from many sources to answer vague, non-procedural questions, as the process of "sensemaking". Through a study of 81 full-time IBM employees in 21 countries, I look at how existing models and theories of sensemaking and information search may be inadequate to describe the "people sensemaking" process individuals go through when considering contacting strangers for expertise. Using signaling theory as an interpretive framework, I describe how certain "signals" in various social software are hard to fake, and are thus more reliable indicators of expertise, approachability, and responsiveness. This research has the potential to inform models of sensemaking and information search when the search is for people, as opposed to documents
Corporate impression formation in online communities - determinants and consequences of online community corporate impressions
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.
The purpose of this study is to gain in-depth knowledge of how the members of
online communities form impressions of organisations that use online communities
in their communication activities. Online impression formation has its peculiarities
and in order to succeed companies need to better understand this phenomenon.
In order to appreciate and evaluate an interaction, those involved in it must know
their own identity. Hence, individuals as well as companies engage in identity
production by trying to project a favourable impression. The process of identity
production can take place in both the offline and the online world. This study focuses
on the online world, more specifically on online communities, by investigating how
online community members form impressions of companies that produce their
identities in online communities.
Technology has changed customer behaviours dramatically. People have embraced
the Internet to meet and interact with one another. This behaviour is in line with the
postmodern assumption that there is a movement towards re-socialisation. Online
communication platforms connect people globally and give them the possibility to
interact and form online social networks. These platforms are interactive, and thus
change the traditional way of communication. Companies therefore have to embrace
those interactive ways of communication. In the online world consumers are quick to
react to communication weaknesses. Inappropriate corporate communication
activities can affect the image they have formed of the company in question.
Big Data for Social Sciences: Measuring patterns of human behavior through large-scale mobile phone data
Through seven publications this dissertation shows how anonymized mobile
phone data can contribute to the social good and provide insights into human
behaviour on a large scale. The size of the datasets analysed ranges from 500
million to 300 billion phone records, covering millions of people. The key
contributions are two-fold:
1. Big Data for Social Good: Through prediction algorithms the results show
how mobile phone data can be useful to predict important socio-economic
indicators, such as income, illiteracy and poverty in developing countries.
Such knowledge can be used to identify where vulnerable groups in society are,
reduce economic shocks and is a critical component for monitoring poverty rates
over time. Further, the dissertation demonstrates how mobile phone data can be
used to better understand human behaviour during large shocks in society,
exemplified by an analysis of data from the terror attack in Norway and a
natural disaster on the south-coast in Bangladesh. This work leads to an
increased understanding of how information spreads, and how millions of people
move around. The intention is to identify displaced people faster, cheaper and
more accurately than existing survey-based methods.
2. Big Data for efficient marketing: Finally, the dissertation offers an
insight into how anonymised mobile phone data can be used to map out large
social networks, covering millions of people, to understand how products spread
inside these networks. Results show that by including social patterns and
machine learning techniques in a large-scale marketing experiment in Asia, the
adoption rate is increased by 13 times compared to the approach used by
experienced marketers. A data-driven and scientific approach to marketing,
through more tailored campaigns, contributes to less irrelevant offers for the
customers, and better cost efficiency for the companies.
Comment: 166 pages, PhD thesis
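Contribution 1 above, predicting socio-economic indicators from aggregated phone records, can be sketched as a two-step pipeline: derive per-region features from anonymised call records, then fit a regression against a known indicator so it can be estimated for unsurveyed regions. The record format, features, and one-dimensional least-squares model below are illustrative assumptions, not the dissertation's actual setup.

```python
from collections import defaultdict

# Hypothetical sketch: aggregate anonymised call records into per-region
# features, then fit a simple least-squares regression against a known
# socio-economic indicator. All data and modelling choices are illustrative.
def region_features(call_records):
    """call_records: iterable of (region, duration_sec, is_international)."""
    totals = defaultdict(lambda: [0, 0.0, 0])   # calls, total duration, intl
    for region, duration, intl in call_records:
        t = totals[region]
        t[0] += 1
        t[1] += duration
        t[2] += int(intl)
    # Feature vector per region: mean call duration, share of intl calls.
    return {r: (dur / n, intl / n) for r, (n, dur, intl) in totals.items()}

def fit_simple_model(features, indicator):
    """Least-squares fit of the indicator on the first feature (1-D for brevity)."""
    xs = [features[r][0] for r in indicator]
    ys = [indicator[r] for r in indicator]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

records = [("A", 60, False), ("A", 120, True), ("B", 300, False), ("B", 340, True)]
feats = region_features(records)
slope, intercept = fit_simple_model(feats, {"A": 0.8, "B": 0.3})
```

With the model fitted on surveyed regions, `slope * mean_duration + intercept` gives an estimate of the indicator for regions where only phone data is available.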
Unsupervised Graph-Based Similarity Learning Using Heterogeneous Features.
Relational data refers to data that contains explicit relations among objects. Nowadays, relational
data are universal and have a broad appeal in many different application domains. The
problem of estimating similarity between objects is a core requirement for many standard
Machine Learning (ML), Natural Language Processing (NLP) and Information Retrieval
(IR) problems such as clustering, classification, word sense disambiguation, etc. Traditional
machine learning approaches represent the data using simple, concise representations such
as feature vectors. While this works very well for homogeneous data, i.e., data with a single
feature type such as text, it does not fully exploit the availability of different feature types.
For example, scientific publications have text, citations, authorship information, and venue information.
Each of the features can be used for estimating similarity. Representing such
objects has been a key issue in efficient mining (Getoor and Taskar, 2007). In this thesis,
we propose natural representations for relational data using multiple, connected layers of
graphs, one for each feature type. We also propose novel algorithms for estimating similarity
using multiple heterogeneous features, and novel algorithms for tasks such as topic detection and music recommendation using the estimated similarity measure. We
demonstrate superior performance of the proposed algorithms (root mean squared error of
24.81 on the Yahoo! KDD Music recommendation data set and classification accuracy of
88% on the ACL Anthology Network data set) over many state-of-the-art algorithms,
such as Latent Semantic Analysis (LSA), Multiple Kernel Learning (MKL) and spectral
clustering, and baselines on large, standard data sets.
Ph.D. Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/89824/1/mpradeep_1.pd
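The multi-layer representation described in this abstract can be sketched as one similarity layer per feature type (e.g. text, citations, authors), combined by a weighted sum. The Jaccard measure and the layer weights below are illustrative assumptions, not the thesis's actual learned similarity.

```python
# Hypothetical sketch: estimate pairwise similarity separately on each
# feature type (one "layer" per type), then combine the layers with a
# weighted sum. The Jaccard measure and weights are illustrative only.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def combined_similarity(obj1, obj2, weights):
    """obj: dict mapping feature type -> set of feature values."""
    return sum(
        w * jaccard(obj1.get(layer, ()), obj2.get(layer, ()))
        for layer, w in weights.items()
    )

paper1 = {"text": {"graph", "similarity", "learning"},
          "citations": {"p10", "p11"},
          "authors": {"alice"}}
paper2 = {"text": {"graph", "kernel", "learning"},
          "citations": {"p11", "p12"},
          "authors": {"alice", "bob"}}
weights = {"text": 0.5, "citations": 0.3, "authors": 0.2}
print(round(combined_similarity(paper1, paper2, weights), 3))  # 0.45
```

Keeping the layers separate makes it easy to reweight or drop a feature type, which is the main practical appeal of a multi-layer graph representation over a single flattened feature vector.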
High-Performance Modelling and Simulation for Big Data Applications
This open access book was prepared as a Final Publication of the COST Action IC1406 "High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)" project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to allow a better discernment of the domain at hand, their representation becomes increasingly demanding for computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications