25,403 research outputs found

    Towards Building a Knowledge Base of Monetary Transactions from a News Collection

    Full text link
    We address the problem of extracting structured representations of economic events from a large corpus of news articles, using a combination of natural language processing and machine learning techniques. The developed techniques allow for semi-automatic population of a financial knowledge base, which, in turn, may be used to support a range of data mining and exploration tasks. The key challenge we face in this domain is that the same event is often reported multiple times, with varying correctness of details. We address this challenge by first collecting all information pertinent to a given event from the entire corpus, then considering all possible representations of the event, and finally, using a supervised learning method, to rank these representations by the associated confidence scores. A main innovative element of our approach is that it jointly extracts and stores all attributes of the event as a single representation (quintuple). Using a purpose-built test set we demonstrate that our supervised learning approach can achieve 25% improvement in F1-score over baseline methods that consider the earliest, the latest or the most frequent reporting of the event.Comment: Proceedings of the 17th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '17), 201

    Joint Video and Text Parsing for Understanding Events and Answering Queries

    Full text link
    We propose a framework for parsing video and text jointly for understanding events and answering user queries. Our framework produces a parse graph that represents the compositional structures of spatial information (objects and scenes), temporal information (actions and events) and causal information (causalities between events and fluents) in the video and text. The knowledge representation of our framework is based on a spatial-temporal-causal And-Or graph (S/T/C-AOG), which jointly models possible hierarchical compositions of objects, scenes and events as well as their interactions and mutual contexts, and specifies the prior probabilistic distribution of the parse graphs. We present a probabilistic generative model for joint parsing that captures the relations between the input video/text, their corresponding parse graphs and the joint parse graph. Based on the probabilistic model, we propose a joint parsing system consisting of three modules: video parsing, text parsing and joint inference. Video parsing and text parsing produce two parse graphs from the input video and text respectively. The joint inference module produces a joint parse graph by performing matching, deduction and revision on the video and text parse graphs. The proposed framework has the following objectives: Firstly, we aim at deep semantic parsing of video and text that goes beyond the traditional bag-of-words approaches; Secondly, we perform parsing and reasoning across the spatial, temporal and causal dimensions based on the joint S/T/C-AOG representation; Thirdly, we show that deep joint parsing facilitates subsequent applications such as generating narrative text descriptions and answering queries in the forms of who, what, when, where and why. We empirically evaluated our system based on comparison against ground-truth as well as accuracy of query answering and obtained satisfactory results

    state of the art analysis ; working packages in project phase II

    Get PDF
    In this report, we introduce our goals and present our requirement analysis for the second phase of the Corporate Semantic Web project. Corporate ontology engineering will improve the facilitation of agile ontology engineering to lessen the costs of ontology development and, especially, maintenance. Corporate semantic collaboration focuses the human-centered aspects of knowledge management in corporate contexts. Corporate semantic search is settled on the highest application level of the three research areas and at that point it is a representative for applications working on and with the appropriately represented and delivered background knowledge

    AVOIDIT IRS: An Issue Resolution System To Resolve Cyber Attacks

    Get PDF
    Cyber attacks have greatly increased over the years and the attackers have progressively improved in devising attacks against specific targets. Cyber attacks are considered a malicious activity launched against networks to gain unauthorized access causing modification, destruction, or even deletion of data. This dissertation highlights the need to assist defenders with identifying and defending against cyber attacks. In this dissertation an attack issue resolution system is developed called AVOIDIT IRS (AIRS). AVOIDIT IRS is based on the attack taxonomy AVOIDIT (Attack Vector, Operational Impact, Defense, Information Impact, and Target). Attacks are collected by AIRS and classified into their respective category using AVOIDIT.Accordingly, an organizational cyber attack ontology was developed using feedback from security professionals to improve the communication and reusability amongst cyber security stakeholders. AIRS is developed as a semi-autonomous application that extracts unstructured external and internal attack data to classify attacks in sequential form. In doing so, we designed and implemented a frequent pattern and sequential classification algorithm associated with the five classifications in AVOIDIT. The issue resolution approach uses inference to educate the defender on the plausible cyber attacks. The AIRS can work in conjunction with an intrusion detection system (IDS) to provide a heuristic to cyber security breaches within an organization. AVOIDIT provides a framework for classifying appropriate attack information, which is fundamental in devising defense strategies against such cyber attacks. The AIRS is further used as a knowledge base in a game inspired defense architecture to promote game model selection upon attack identification. Future work will incorporate honeypot attack information to improve attack identification, classification, and defense propagation.In this dissertation, 1,025 common vulnerabilities and exposures (CVEs) and over 5,000 lines of log files instances were captured in the AIRS for analysis. Security experts were consulted to create rules to extract pertinent information and algorithms to correlate identified data for notification. The AIRS was developed using the Codeigniter [74] framework to provide a seamless visualization tool for data mining regarding potential cyber attacks relative to web applications. Testing of the AVOIDIT IRS revealed a recall of 88%, precision of 93%, and a 66% correlation metric

    ViSOR: VIdeo Surveillance On-line Repository for annotation retrieval

    Full text link
    Aim of the Visor Project [1] is to gather and make freely available a repository of surveillance and video footages for the research community on pattern recognition and multimedia retrieval. Th

    Use-cases on evolution

    Get PDF
    This report presents a set of use cases for evolution and reactivity for data in the Web and Semantic Web. This set is organized around three different case study scenarios, each of them is related to one of the three different areas of application within Rewerse. Namely, the scenarios are: “The Rewerse Information System and Portal”, closely related to the work of A3 – Personalised Information Systems; “Organizing Travels”, that may be related to the work of A1 – Events, Time, and Locations; “Updates and evolution in bioinformatics data sources” related to the work of A2 – Towards a Bioinformatics Web
    • …
    corecore