825 research outputs found

    Semantics-based information extraction for detecting economic events

    As today's financial markets are sensitive to breaking news on economic events, accurate and timely automatic identification of events in news items is crucial. Unstructured news items originating from many heterogeneous sources have to be mined in order to extract knowledge useful for guiding decision making processes. Hence, we propose the Semantics-Based Pipeline for Economic Event Detection (SPEED), focusing on extracting financial events from news articles and annotating these with meta-data at a speed that enables real-time use. In our implementation, we use some components of an existing framework as well as new components, e.g., a high-performance Ontology Gazetteer, a Word Group Look-Up component, a Word Sense Disambiguator, and components for detecting economic events. Through their interaction with a domain-specific ontology, our novel, semantically enabled components constitute a feedback loop which fosters future reuse of acquired knowledge in the event detection process

    Automated Detection of Financial Events in News Text

    Today’s financial markets are inextricably linked with financial events like acquisitions, profit announcements, or product launches. Information extracted from news messages that report on such events could hence be beneficial for financial decision making. The ubiquity of news, however, makes manual analysis impossible, and due to the unstructured nature of text, the (semi-)automatic extraction and application of financial events remains a non-trivial task. Therefore, the studies composing this dissertation investigate 1) how to accurately identify financial events in news text, and 2) how to effectively use such extracted events in financial applications. Based on a detailed evaluation of current event extraction systems, this thesis presents a competitive, knowledge-driven, semi-automatic system for financial event extraction from text. A novel pattern language, which makes clever use of the system’s underlying knowledge base, allows for the definition of simple, yet expressive event extraction rules that can be applied to natural language texts. The system’s knowledge-driven internals remain synchronized with the latest market developments through the accompanying event-triggered update language for knowledge bases, enabling the definition of update rules. Additional research covered by this dissertation investigates the practical applicability of extracted events. In automated stock trading experiments, the best performing trading rules do not only make use of traditional numerical signals, but also employ news-based event signals. Moreover, when cleaning stock data from disruptions caused by financial events, financial risk analyses yield more accurate results. These results suggest that events detected in news can be used advantageously as supplementary parameters in financial applications

    Ontology Enrichment from Free-text Clinical Documents: A Comparison of Alternative Approaches

    While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships, as well as difficulty in updating the ontology as domain knowledge changes. Methodologies developed in the fields of Natural Language Processing (NLP), Information Extraction (IE), Information Retrieval (IR), and Machine Learning (ML) provide techniques for automating the enrichment of ontology from free-text documents. In this dissertation, I extended these methodologies into biomedical ontology development. First, I reviewed existing methodologies and systems developed in the fields of NLP, IR, and IE, and discussed how existing methods can benefit the development of biomedical ontologies. This previously unconducted review was published in the Journal of Biomedical Informatics. Second, I compared the effectiveness of three methods from two different approaches, the symbolic (the Hearst method) and the statistical (the Church and Lin methods), using clinical free-text documents. Third, I developed a methodological framework for Ontology Learning (OL) evaluation and comparison. This framework permits evaluation of the two types of OL approaches that include three OL methods. The significance of this work is as follows: 1) The results from the comparative study showed the potential of these methods for biomedical ontology enrichment. For the two targeted domains (NCIT and RadLex), the Hearst method revealed an average of 21% and 11% new concept acceptance rates, respectively. The Lin method produced a 74% acceptance rate for NCIT; the Church method, 53%. 
    As a result of this study (published in the Journal of Methods of Information in Medicine), many suggested candidates have been incorporated into the NCIT; 2) The evaluation framework is flexible and general enough that it can analyze the performance of ontology enrichment methods for many domains, thus expediting the process of automation and minimizing the likelihood that key concepts and relationships would be missed as domain knowledge evolves

    Constructive Ontology Engineering

    The proliferation of the Semantic Web depends on ontologies for knowledge sharing, semantic annotation, data fusion, and descriptions of data for machine interpretation. However, ontologies are difficult to create and maintain. In addition, their structure and content may vary depending on the application and domain. Several methods described in the literature have been used in creating ontologies from various data sources such as structured data in databases or unstructured text found in text documents or HTML documents. Various data mining techniques, natural language processing methods, syntactical analysis, machine learning methods, and other techniques have been used in building ontologies with automated and semi-automated processes. Due to the vast amount of unstructured text and its continued proliferation, the problem of constructing ontologies from text has attracted considerable attention for research. However, the constructed ontologies may be noisy, with missing and incorrect knowledge. Thus ontology construction continues to be a challenging research problem. The goal of this research is to investigate a new method for guiding a process of extracting and assembling candidate terms into domain-specific concepts and relationships. The process is part of an overall semi-automated system for creating ontologies from unstructured text sources and is driven by the user’s goals in an incremental process. The system applies natural language processing techniques and uses a series of syntactical analysis tools for extracting grammatical relations from a list of text terms representing the parts of speech of a sentence. The extraction process focuses on evaluating the subject-predicate-object sequences of the text for potential concept-relation-concept triples to be built into an ontology. Users can guide the system by selecting seedling concept-relation-concept triples to assist building concepts from the extracted domain-specific terms.
    As a result, the ontology building process develops into an incremental one that allows the user to interact with the system, to guide the development of an ontology, and to tailor the ontology for the user’s application needs. The main contribution of this work is the implementation and evaluation of a new semi-automated methodology for constructing domain-specific ontologies from an unstructured text corpus

    Mining answers for causation questions

    Causal relations of various kinds are a pervasive feature of human language and theorising about the world. Despite this, the specification of a satisfactory general analysis of causal relations has long proved difficult. This paper provides a syntactic and semantic classification of cause-effect lexico-syntactic patterns for automatic detection and extraction of causation relationships in English texts. We also present a semi-automatic method of discovering generally applicable lexico-syntactic patterns that refer to the causation relation. The patterns are found automatically, but their validation is done semi-automatically. Our final purpose is to add a new module to our existing Question Answering (QA) system that will answer complex cause-effect questions

    Forecasting the State of the Stock Market Based on Financial News

    This paper discusses existing methods of applying Semantic Web technologies to financial news processing

    Ontologies and Information Extraction

    This report argues that, even in the simplest cases, IE is an ontology-driven process. It is not a mere text filtering method based on simple pattern matching and keywords, because the extracted pieces of text are interpreted with respect to a predefined partial domain model. This report shows that, depending on the nature and the depth of the interpretation needed to extract the information, more or less knowledge must be involved. The report is illustrated mainly with examples from biology, a domain in which there are critical needs for content-based exploration of the scientific literature and which is becoming a major application domain for IE

    Visualizing the Assessment of Bio-object Condition Monitoring Using the "Chernoff Faces" Method

    The article is about the use of Chernoff face pictographs. The idea behind using faces is that humans easily recognize faces and notice small changes without difficulty. Chernoff faces themselves can be plotted on a standard X-Y graph. The main aim of the article is to find the right way to treat a person with different diseases. It can be useful for all medical staff who are in some way connected with such a problem. The article can also help young programmers and students of medical universities with their scientific papers

    Automatic domain-specific learning: towards a methodology for ontology enrichment

    At the current rate of technological development, in a world where enormous amounts of data are constantly created and in which the Internet is used as the primary means of information exchange, there is a need for tools that help process, analyze and use that information. However, while the growth of information offers many opportunities for social and scientific advance, it has also highlighted the difficulties of extracting meaningful patterns from massive data. Ontologies have been claimed to play a major role in the processing of large-scale data, as they serve as universal models of knowledge representation, and are being studied as possible solutions to this. This paper presents a method for the automatic expansion of ontologies based on corpus and terminological data exploitation. The proposed “ontology enrichment method” (OEM) consists of a sequence of tasks aimed at classifying an input keyword automatically under its corresponding node within a target ontology. Results prove that the method can be successfully applied for the automatic classification of specialized units into a reference ontology. Financial support for this research has been provided by the DGI, Spanish Ministry of Education and Science, grant FFI2011-29798-C0201. Ureña Gómez-Moreno, P.; Mestre-Mestre, EM. (2017). Automatic domain-specific learning: towards a methodology for ontology enrichment. LFE. Revista de Lenguas para Fines Específicos. 23(2):63-85. http://hdl.handle.net/10251/148357