25,409 research outputs found
Towards Building a Knowledge Base of Monetary Transactions from a News Collection
We address the problem of extracting structured representations of economic
events from a large corpus of news articles, using a combination of natural
language processing and machine learning techniques. The developed techniques
allow for semi-automatic population of a financial knowledge base, which, in
turn, may be used to support a range of data mining and exploration tasks. The
key challenge we face in this domain is that the same event is often reported
multiple times, with varying correctness of details. We address this challenge
by first collecting all information pertinent to a given event from the entire
corpus, then considering all possible representations of the event, and
finally, using a supervised learning method, to rank these representations by
the associated confidence scores. A main innovative element of our approach is
that it jointly extracts and stores all attributes of the event as a single
representation (quintuple). Using a purpose-built test set we demonstrate that
our supervised learning approach can achieve 25% improvement in F1-score over
baseline methods that consider the earliest, the latest or the most frequent
reporting of the event.Comment: Proceedings of the 17th ACM/IEEE-CS Joint Conference on Digital
Libraries (JCDL '17), 201
Joint Video and Text Parsing for Understanding Events and Answering Queries
We propose a framework for parsing video and text jointly for understanding
events and answering user queries. Our framework produces a parse graph that
represents the compositional structures of spatial information (objects and
scenes), temporal information (actions and events) and causal information
(causalities between events and fluents) in the video and text. The knowledge
representation of our framework is based on a spatial-temporal-causal And-Or
graph (S/T/C-AOG), which jointly models possible hierarchical compositions of
objects, scenes and events as well as their interactions and mutual contexts,
and specifies the prior probabilistic distribution of the parse graphs. We
present a probabilistic generative model for joint parsing that captures the
relations between the input video/text, their corresponding parse graphs and
the joint parse graph. Based on the probabilistic model, we propose a joint
parsing system consisting of three modules: video parsing, text parsing and
joint inference. Video parsing and text parsing produce two parse graphs from
the input video and text respectively. The joint inference module produces a
joint parse graph by performing matching, deduction and revision on the video
and text parse graphs. The proposed framework has the following objectives:
Firstly, we aim at deep semantic parsing of video and text that goes beyond the
traditional bag-of-words approaches; Secondly, we perform parsing and reasoning
across the spatial, temporal and causal dimensions based on the joint S/T/C-AOG
representation; Thirdly, we show that deep joint parsing facilitates subsequent
applications such as generating narrative text descriptions and answering
queries in the forms of who, what, when, where and why. We empirically
evaluated our system based on comparison against ground-truth as well as
accuracy of query answering and obtained satisfactory results
Recommended from our members
Proceedings ICPW'07: 2nd International Conference on the Pragmatic Web, 22-23 Oct. 2007, Tilburg: NL
Proceedings ICPW'07: 2nd International Conference on the Pragmatic Web, 22-23 Oct. 2007, Tilburg: N
state of the art analysis ; working packages in project phase II
In this report, we introduce our goals and present our requirement analysis
for the second phase of the Corporate Semantic Web project. Corporate ontology
engineering will improve the facilitation of agile ontology engineering to
lessen the costs of ontology development and, especially, maintenance.
Corporate semantic collaboration focuses the human-centered aspects of
knowledge management in corporate contexts. Corporate semantic search is
settled on the highest application level of the three research areas and at
that point it is a representative for applications working on and with the
appropriately represented and delivered background knowledge
AVOIDIT IRS: An Issue Resolution System To Resolve Cyber Attacks
Cyber attacks have greatly increased over the years and the attackers have progressively improved in devising attacks against specific targets. Cyber attacks are considered a malicious activity launched against networks to gain unauthorized access causing modification, destruction, or even deletion of data. This dissertation highlights the need to assist defenders with identifying and defending against cyber attacks. In this dissertation an attack issue resolution system is developed called AVOIDIT IRS (AIRS). AVOIDIT IRS is based on the attack taxonomy AVOIDIT (Attack Vector, Operational Impact, Defense, Information Impact, and Target). Attacks are collected by AIRS and classified into their respective category using AVOIDIT.Accordingly, an organizational cyber attack ontology was developed using feedback from security professionals to improve the communication and reusability amongst cyber security stakeholders. AIRS is developed as a semi-autonomous application that extracts unstructured external and internal attack data to classify attacks in sequential form. In doing so, we designed and implemented a frequent pattern and sequential classification algorithm associated with the five classifications in AVOIDIT. The issue resolution approach uses inference to educate the defender on the plausible cyber attacks. The AIRS can work in conjunction with an intrusion detection system (IDS) to provide a heuristic to cyber security breaches within an organization. AVOIDIT provides a framework for classifying appropriate attack information, which is fundamental in devising defense strategies against such cyber attacks. The AIRS is further used as a knowledge base in a game inspired defense architecture to promote game model selection upon attack identification. Future work will incorporate honeypot attack information to improve attack identification, classification, and defense propagation.In this dissertation, 1,025 common vulnerabilities and exposures (CVEs) and over 5,000 lines of log files instances were captured in the AIRS for analysis. Security experts were consulted to create rules to extract pertinent information and algorithms to correlate identified data for notification. The AIRS was developed using the Codeigniter [74] framework to provide a seamless visualization tool for data mining regarding potential cyber attacks relative to web applications. Testing of the AVOIDIT IRS revealed a recall of 88%, precision of 93%, and a 66% correlation metric
ViSOR: VIdeo Surveillance On-line Repository for annotation retrieval
Aim of the Visor Project [1] is to gather and make freely available a repository of surveillance and video footages for the research community on pattern recognition and multimedia retrieval. Th
Use-cases on evolution
This report presents a set of use cases for evolution and reactivity for data in the Web and
Semantic Web. This set is organized around three different case study scenarios, each of them
is related to one of the three different areas of application within Rewerse. Namely, the scenarios
are: “The Rewerse Information System and Portal”, closely related to the work of A3
– Personalised Information Systems; “Organizing Travels”, that may be related to the work
of A1 – Events, Time, and Locations; “Updates and evolution in bioinformatics data sources”
related to the work of A2 – Towards a Bioinformatics Web
- …