
    A Semantic Method to Information Extraction for Decision Support Systems

    In this paper, we describe a novel schema for a more semantic text mining process that supports more comprehensive decision making by decision support systems (DSS) by providing more effective and accurate textual information. Using two semantic lexical resources, FrameNet and WordNet, to extract required text snippets from unstructured free text yields a better and more accurate information extraction process that delivers more precise information either to a DSS or to a decision maker. We explain how these lexical resources could improve a focused text mining process that could be applied to an information provider system in a decision support paradigm. The preliminary results of an initial experiment show that the hybrid information extraction schema performs well in some semantic failure situations.

    Drug-drug interaction extraction-based system: an natural language processing approach

    The number of poly-medicated patients, especially those over 65, has increased. Multiple drug use and inappropriate prescribing increase drug-drug interactions, adverse drug reactions, morbidity, and mortality. Recommendation systems were introduced to address this issue, but health professionals have not adopted them because of their poor alert quality and incomplete databases. Recent research shows a growing interest in using text mining via NLP to extract drug-drug interactions from unstructured data sources to support clinical prescribing decisions. This work uses NLP text mining and the training of machine learning classifiers for drug relation extraction. In this context, the proposed solution enables the development of an extraction system for drug-drug interactions from unstructured data sources. The system produces structured information, which can be inserted into a database that contains information acquired from three different data sources. The architecture outlined for the drug-drug interaction extraction system is capable of receiving unstructured text, identifying drug entities sentence by sentence, and determining whether or not there are interactions between them. - Fundacao para a Ciencia e a Tecnologi

    A Machine Learning Approach For Opinion Holder Extraction In Arabic Language

    Opinion mining aims at extracting useful subjective information from reliable amounts of text. Opinion holder recognition is a task that has not yet been considered for the Arabic language. This task essentially requires a deep understanding of clause structures. Unfortunately, the lack of a robust, publicly available Arabic parser further complicates the research. This paper presents pioneering research on opinion holder extraction in Arabic news that is independent of any lexical parser. We investigate constructing a comprehensive feature set to compensate for the lack of structural parsing output. The proposed feature set is adapted from previous work on English and coupled with our proposed semantic field and named entity features. Our feature analysis is based on Conditional Random Fields (CRF) and semi-supervised pattern recognition techniques. Different models are evaluated via cross-validation experiments, achieving an F-measure of 54.03. We publicly release our corpus and lexicon to the opinion mining community to encourage further research.

    TOWARDS MINING BRAND ASSOCIATIONS FROM USER-GENERATED CONTENT (UGC): EVIDENCE FROM LINGUISTIC CHARACTERISTICS

    Consumers’ brand associations offer qualitative explanations of a brand’s success or failure and are typically elicited using survey-based instruments. Marketers are interested in time- and cost-efficient, automated approaches to brand association elicitation. To enable automated brand association elicitation, we show that brand associations can be formalized and described by patterns of linguistic part-of-speech sequences that differ from ordinary speech, which is a requirement for automated extraction via text mining. Furthermore, we provide evidence that UGC is an adequate data source for automated brand association elicitation. We do that by comparing survey-based and UGC data sources using linguistic part-of-speech sequence and n-gram analysis as well as sequential pattern mining. We contribute to existing research by establishing prerequisites for the construction of novel information systems that use text mining to extract brand associations automatically from UGC.

    Ranking deep web text collections for scalable information extraction

    Information extraction (IE) systems discover structured information from natural language text, to enable much richer querying and data mining than possible directly over the unstructured text. Unfortunately, IE is generally a computationally expensive process, and hence improving its efficiency, so that it scales over large volumes of text, is of critical importance. State-of-the-art approaches for scaling the IE process focus on one text collection at a time. These approaches prioritize the extraction effort by learning keyword queries to identify the “useful” documents for the IE task at hand, namely, those that lead to the extraction of structured “tuples.” These approaches, however, do not attempt to predict which text collections are useful for the IE task (and hence merit further processing) and which ones will not contribute any useful output (and hence should be ignored altogether, for efficiency). In this paper, we focus on an especially valuable family of text sources, the so-called deep web collections, whose (remote) contents are only accessible via querying. Specifically, we introduce and study techniques for ranking deep web collections for an IE task, to prioritize the extraction effort by focusing on collections with substantial numbers of useful documents for the task. We study both (adaptations of) state-of-the-art resource selection strategies for distributed information retrieval, and IE-specific approaches. Our extensive experimental evaluation over realistic deep web collections, and for several different IE tasks, shows the merits and limitations of the alternative families of approaches, and provides a roadmap for addressing this critically important building block for efficient, scalable information extraction.

    Sentiment Analysis using an ensemble of Feature Selection Algorithms

    Sentiment analysis, an ongoing area of research in text mining, is commonly used to determine the opinion of a person who has used a service or bought a product. It is the process of using computation to identify and categorize opinions expressed in a piece of text. Individuals post their opinions via reviews, tweets, comments or discussions, which constitute the unstructured information analyzed here. Sentiment analysis gives an overall summary of reviews that helps clients, individuals or organizations make decisions. The main aim of this paper is to apply an ensemble approach to feature reduction methods used in natural language processing and to base the analysis on the results. An ensemble approach is a process of combining two or more methodologies. The feature reduction methods used are Principal Component Analysis (PCA) for feature extraction and the Pearson chi-squared statistical test for feature selection. The main contribution of this paper is to test experimentally whether the combined use of careful feature selection and existing classification methodologies can yield better accuracy.
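The chi-squared feature selection step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `chi_squared` and `select_top_k`, the whitespace tokenization, and the toy documents are assumptions made for illustration, and the PCA step is not covered.

```python
from collections import Counter


def chi_squared(feature_present, labels):
    """Pearson chi-squared statistic for one binary feature against class labels."""
    n = len(labels)
    observed = Counter(zip(feature_present, labels))
    feature_totals = Counter(feature_present)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            # Expected count if the feature and the class were independent.
            expected = f_count * c_count / n
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat


def select_top_k(docs, labels, k):
    """Keep the k terms whose presence is most strongly associated with the label."""
    token_sets = [set(doc.split()) for doc in docs]  # hypothetical tokenizer
    vocab = sorted(set().union(*token_sets))
    scores = {
        term: chi_squared([term in tokens for tokens in token_sets], labels)
        for term in vocab
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(vocab, key=lambda term: (-scores[term], term))[:k]


docs = ["good great", "good fine", "bad awful", "bad poor"]
labels = ["pos", "pos", "neg", "neg"]
print(select_top_k(docs, labels, 2))
```

On this toy data the terms "good" and "bad" each separate the two classes perfectly and score 4.0, so they are the two terms kept.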

    Extracting collective trends from Twitter using social-based data mining

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-40495-5_62. Proceedings of the 5th International Conference, ICCCI 2013, Craiova, Romania, September 11-13, 2013. Social Networks have become an important environment for Collective Trends extraction. The interactions amongst users provide information about their preferences and relationships. This information can be used to measure the influence of ideas or opinions and how they spread within the Network. Currently, one of the most relevant and popular Social Networks is Twitter. This Social Network was created to share comments and opinions. The information provided by users is especially useful in different fields and research areas such as marketing. This data is presented as short text strings containing different ideas expressed by real people. With this representation, different Data Mining and Text Mining techniques (such as classification and clustering) might be used for knowledge extraction, trying to distinguish the meaning of the opinions. This work focuses on analyzing how these techniques can interpret these opinions within the Social Network, using information related to the IKEA® company. The preparation of this manuscript has been supported by the Spanish Ministry of Science and Innovation under the following projects: TIN2010-19872, ECO2011-30105 (National Plan for Research, Development and Innovation) and the Multidisciplinary Project of Universidad Autónoma de Madrid (CEMU-2012-034

    Automatic domain ontology extraction for context-sensitive opinion mining

    Automated analysis of the sentiments presented in online consumer feedback can facilitate both organizations’ business strategy development and individual consumers’ comparison shopping. Nevertheless, existing opinion mining methods either adopt a context-free sentiment classification approach or rely on a large number of manually annotated training examples to perform context-sensitive sentiment classification. Guided by the design science research methodology, we illustrate the design, development, and evaluation of a novel fuzzy domain ontology-based context-sensitive opinion mining system. Our novel ontology extraction mechanism, underpinned by a variant of Kullback-Leibler divergence, can automatically acquire contextual sentiment knowledge across various product domains to improve the sentiment analysis processes. Evaluated on a benchmark dataset and real consumer reviews collected from Amazon.com, our system shows remarkable performance improvement over the context-free baseline.
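The abstract above does not specify which variant of Kullback-Leibler divergence it uses. For orientation only, the sketch below computes the standard (unmodified) KL divergence between two term distributions. The helper names, the add-one smoothing, and the toy token lists are assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def term_distribution(tokens, vocab, alpha=1.0):
    """Smoothed relative frequency of each vocabulary term (additive smoothing is an assumption)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {term: (counts[term] + alpha) / total for term in vocab}


def kl_divergence(p, q):
    """Standard KL divergence D(p || q) in nats; q must be nonzero wherever p is."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)


# Hypothetical toy data: tokens from reviews in one product domain vs. a general corpus.
domain_tokens = "battery lasts long battery small screen".split()
general_tokens = "long day small town long road".split()
vocab = sorted(set(domain_tokens) | set(general_tokens))

p = term_distribution(domain_tokens, vocab)
q = term_distribution(general_tokens, vocab)
print(kl_divergence(p, q))
```

A larger value indicates that term usage in the domain text departs more strongly from the general text, which is the kind of signal a context-sensitive method could exploit.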

    Using Neural Networks for Relation Extraction from Biomedical Literature

    Using different sources of information to support the automated extraction of relations between biomedical concepts contributes to the development of our understanding of biological systems. The primary comprehensive source of these relations is biomedical literature. Several relation extraction approaches have been proposed to identify relations between concepts in biomedical literature, notably those using neural network algorithms. The use of multichannel architectures composed of multiple data representations, as in deep neural networks, is leading to state-of-the-art results. The right combination of data representations can eventually lead to even higher evaluation scores in relation extraction tasks. Thus, biomedical ontologies play a fundamental role by providing semantic and ancestry information about an entity. The incorporation of biomedical ontologies has already been shown to enhance previous state-of-the-art results. Comment: Artificial Neural Networks book (Springer) - Chapter 1