5,015 research outputs found

    Automatic domain ontology extraction for context-sensitive opinion mining

    Get PDF
    Automated analysis of the sentiments presented in online consumer feedbacks can facilitate both organizations’ business strategy development and individual consumers’ comparison shopping. Nevertheless, existing opinion mining methods either adopt a context-free sentiment classification approach or rely on a large number of manually annotated training examples to perform context sensitive sentiment classification. Guided by the design science research methodology, we illustrate the design, development, and evaluation of a novel fuzzy domain ontology based contextsensitive opinion mining system. Our novel ontology extraction mechanism underpinned by a variant of Kullback-Leibler divergence can automatically acquire contextual sentiment knowledge across various product domains to improve the sentiment analysis processes. Evaluated based on a benchmark dataset and real consumer reviews collected from Amazon.com, our system shows remarkable performance improvement over the context-free baseline

    On Quantifying Qualitative Geospatial Data: A Probabilistic Approach

    Full text link
    Living in the era of data deluge, we have witnessed a web content explosion, largely due to the massive availability of User-Generated Content (UGC). In this work, we specifically consider the problem of geospatial information extraction and representation, where one can exploit diverse sources of information (such as image and audio data, text data, etc), going beyond traditional volunteered geographic information. Our ambition is to include available narrative information in an effort to better explain geospatial relationships: with spatial reasoning being a basic form of human cognition, narratives expressing such experiences typically contain qualitative spatial data, i.e., spatial objects and spatial relationships. To this end, we formulate a quantitative approach for the representation of qualitative spatial relations extracted from UGC in the form of texts. The proposed method quantifies such relations based on multiple text observations. Such observations provide distance and orientation features which are utilized by a greedy Expectation Maximization-based (EM) algorithm to infer a probability distribution over predefined spatial relationships; the latter represent the quantified relationships under user-defined probabilistic assumptions. We evaluate the applicability and quality of the proposed approach using real UGC data originating from an actual travel blog text corpus. To verify the quality of the result, we generate grid-based maps visualizing the spatial extent of the various relations

    Visual analysis of sensor logs in smart spaces: Activities vs. situations

    Get PDF
    Models of human habits in smart spaces can be expressed by using a multitude of representations whose readability influences the possibility of being validated by human experts. Our research is focused on developing a visual analysis pipeline (service) that allows, starting from the sensor log of a smart space, to graphically visualize human habits. The basic assumption is to apply techniques borrowed from the area of business process automation and mining on a version of the sensor log preprocessed in order to translate raw sensor measurements into human actions. The proposed pipeline is employed to automatically extract models to be reused for ambient intelligence. In this paper, we present an user evaluation aimed at demonstrating the effectiveness of the approach, by comparing it wrt. a relevant state-of-the-art visual tool, namely SITUVIS

    Quality of Information in Mobile Crowdsensing: Survey and Research Challenges

    Full text link
    Smartphones have become the most pervasive devices in people's lives, and are clearly transforming the way we live and perceive technology. Today's smartphones benefit from almost ubiquitous Internet connectivity and come equipped with a plethora of inexpensive yet powerful embedded sensors, such as accelerometer, gyroscope, microphone, and camera. This unique combination has enabled revolutionary applications based on the mobile crowdsensing paradigm, such as real-time road traffic monitoring, air and noise pollution, crime control, and wildlife monitoring, just to name a few. Differently from prior sensing paradigms, humans are now the primary actors of the sensing process, since they become fundamental in retrieving reliable and up-to-date information about the event being monitored. As humans may behave unreliably or maliciously, assessing and guaranteeing Quality of Information (QoI) becomes more important than ever. In this paper, we provide a new framework for defining and enforcing the QoI in mobile crowdsensing, and analyze in depth the current state-of-the-art on the topic. We also outline novel research challenges, along with possible directions of future work.Comment: To appear in ACM Transactions on Sensor Networks (TOSN

    Entity Linking for the Biomedical Domain

    Get PDF
    Entity linking is the process of detecting mentions of different concepts in text documents and linking them to canonical entities in a target lexicon. However, one of the biggest issues in entity linking is the ambiguity in entity names. The ambiguity is an issue that many text mining tools have yet to address since different names can represent the same thing and every mention could indicate a different thing. For instance, search engines that rely on heuristic string matches frequently return irrelevant results, because they are unable to satisfactorily resolve ambiguity. Thus, resolving named entity ambiguity is a crucial step in entity linking. To solve the problem of ambiguity, this work proposes a heuristic method for entity recognition and entity linking over the biomedical knowledge graph concerning the semantic similarity of entities in the knowledge graph. Named entity recognition (NER), relation extraction (RE), and relationship linking make up a conventional entity linking (EL) system pipeline (RL). We have used the accuracy metric in this thesis. Therefore, for each identified relation or entity, the solution comprises identifying the correct one and matching it to its corresponding unique CUI in the knowledge base. Because KBs contain a substantial number of relations and entities, each with only one natural language label, the second phase is directly dependent on the accuracy of the first. The framework developed in this thesis enables the extraction of relations and entities from the text and their mapping to the associated CUI in the UMLS knowledge base. This approach derives a new representation of the knowledge base that lends it to the easy comparison. Our idea to select the best candidates is to build a graph of relations and determine the shortest path distance using a ranking approach. We test our suggested approach on two well-known benchmarks in the biomedical field and show that our method exceeds the search engine's top result and provides us with around 4% more accuracy. In general, when it comes to fine-tuning, we notice that entity linking contains subjective characteristics and modifications may be required depending on the task at hand. The performance of the framework is evaluated based on a Python implementation

    COMMUNITY DETECTION AND INFLUENCE MAXIMIZATION IN ONLINE SOCIAL NETWORKS

    Get PDF
    The detecting and clustering of data and users into communities on the social web are important and complex issues in order to develop smart marketing models in changing and evolving social ecosystems. These marketing models are created by individual decision to purchase a product and are influenced by friends and acquaintances. This leads to novel marketing models, which view users as members of online social network communities, rather than the traditional view of marketing to individuals. This thesis starts by examining models that detect communities in online social networks. Then an enhanced approach to detect community which clusters similar nodes together is suggested. Social relationships play an important role in determining user behavior. For example, a user might purchase a product that his/her friend recently bought. Such a phenomenon is called social influence and is used to study how far the action of one user can affect the behaviors of others. Then an original metric used to compute the influential power of social network users based on logs of common actions in order to infer a probabilistic influence propagation model. Finally, a combined community detection algorithm and suggested influence propagation approach reveals a new influence maximization model by identifying and using the most influential users within their communities. In doing so, we employed a fuzzy logic based technique to determine the key users who drive this influence in their communities and diffuse a certain behavior. This original approach contrasts with previous influence propagation models, which did not use similarity opportunities among members of communities to maximize influence propagation. The performance results show that the model activates a higher number of overall nodes in contemporary social networks, starting from a smaller set of key users, as compared to existing landmark approaches which influence fewer nodes, yet employ a larger set of key users
    corecore