142 research outputs found
Using Synchronic and Diachronic Relations for Summarizing Multiple Documents Describing Evolving Events
In this paper we present a fresh look at the problem of summarizing evolving
events from multiple sources. After a discussion of the nature of evolving
events, we introduce a distinction between linearly and non-linearly evolving
events. We then present a general methodology for the automatic creation of
summaries of evolving events. At its heart lie the notions of Synchronic and
Diachronic cross-document Relations (SDRs), whose aim is the identification of
similarities and differences between sources from a synchronic and a
diachronic perspective. SDRs do not connect documents or the textual elements
found therein, but structures one might call messages. Applying this
methodology yields a set of messages and the SDRs connecting them, that is, a
graph which we call a grid. We show how such a grid can serve as the starting
point of a natural language generation system. The methodology is evaluated in
two case studies, one for linearly evolving events (descriptions of football
matches) and another for non-linearly evolving events (terrorist incidents
involving hostages). In both
cases we evaluate the results produced by our computational systems.
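To make the grid concrete, here is a minimal sketch of one way such a structure could be represented; the message fields and the relation rules (synchronic: different sources, same time; diachronic: same source, successive times) are our illustrative reading of the abstract, not the authors' actual formalism:

```python
# Illustrative sketch, not the authors' system: messages extracted from
# time-stamped sources, connected by synchronic and diachronic relations
# to form the "grid" graph described in the abstract.
from dataclasses import dataclass

@dataclass(frozen=True)
class Message:
    source: str   # which outlet reported it
    time: int     # publication time slot
    content: str  # hypothetical extracted message

messages = [
    Message("source_A", 1, "hostages taken"),
    Message("source_B", 1, "gunmen seize hostages"),
    Message("source_A", 2, "negotiations begin"),
]

# Edges of the grid: (relation_type, message_i, message_j).
grid = []
for i, m in enumerate(messages):
    for n in messages[i + 1:]:
        if m.time == n.time and m.source != n.source:
            grid.append(("synchronic", m, n))
        elif m.source == n.source and m.time != n.time:
            grid.append(("diachronic", m, n))

for relation, m, n in grid:
    print(relation, "|", m.content, "<->", n.content)
```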
Identifying gene-disease associations using centrality on a literature mined gene-interaction network
Motivation: Understanding the role of genetics in diseases is one of the most important aims of the biological sciences. The completion of the Human Genome Project has led to a rapid increase in the number of publications in this area. However, the coverage of curated databases that provide information manually extracted from the literature is limited. Another challenge is that determining disease-related genes requires laborious experiments. Therefore, predicting good candidate genes before experimental analysis will save time and effort. We introduce an automatic approach based on text mining and network analysis to predict gene-disease associations. We collected an initial set of known disease-related genes and built an interaction network by automatic literature mining based on dependency parsing and support vector machines. Our hypothesis is that the central genes in this disease-specific network are likely to be related to the disease. We used the degree, eigenvector, betweenness and closeness centrality metrics to rank the genes in the network.
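The ranking step described above maps directly onto standard graph libraries. The following is a minimal sketch, assuming a hypothetical edge list of literature-mined interactions; the gene names are placeholders, not the paper's data:

```python
# Illustrative sketch (not the authors' code): ranking genes in a
# disease-specific interaction network by the four centrality metrics
# named in the abstract.
import networkx as nx

# Hypothetical literature-mined gene interactions.
edges = [("BRCA1", "TP53"), ("TP53", "MDM2"), ("BRCA1", "RAD51"),
         ("MDM2", "CDKN1A"), ("TP53", "CDKN1A")]
G = nx.Graph(edges)

# The four centrality metrics used to rank candidate genes.
metrics = {
    "degree": nx.degree_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
    "betweenness": nx.betweenness_centrality(G),
    "closeness": nx.closeness_centrality(G),
}

for name, scores in metrics.items():
    ranking = sorted(scores, key=scores.get, reverse=True)
    print(f"{name:12s} {ranking}")
```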
The Structure of the EU Mediasphere
Background.
A trend towards automation of scientific research has recently resulted in what has been termed "data-driven inquiry" in various disciplines, including physics and biology. The automation of many tasks has been identified as a possible future for the humanities and the social sciences as well, particularly in those disciplines concerned with the analysis of text, due to the recent availability of millions of books and news articles in digital format. In the social sciences, the analysis of news media is done largely by hand and in a hypothesis-driven fashion: the scholar needs to formulate a very specific assumption about the patterns that might be in the data, and then set out to verify whether they are present.
Methodology/Principal Findings.
In this study, we report what we believe is the first large-scale content analysis of cross-linguistic text in the social sciences, using various artificial intelligence techniques. We analyse 1.3M news articles in 22 languages, detecting a clear structure in the choice of stories covered by the various outlets. This choice is significantly affected by objective national, geographic, economic and cultural relations among outlets and countries; e.g., outlets from countries sharing strong economic ties are more likely to cover the same stories. We also show that deviation from average content is significantly correlated with membership in the eurozone, as well as with the year of accession to the EU.
Conclusions/Significance.
While independently making a multitude of small editorial decisions, the leading media of the 27 EU countries, over a period of six months, shaped the contents of the EU mediasphere in a way that reflects its deep geographic, economic and cultural relations. Detecting these subtle signals in a statistically rigorous way would be out of the reach of traditional methods. This analysis demonstrates the power of the available methods for significant automation of media content analysis.
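One simple way to quantify the "cover the same stories" relation analysed above is cosine similarity between outlets' story-coverage vectors; the representation and the numbers below are our illustrative assumptions, not the paper's pipeline:

```python
# Illustrative sketch: coverage overlap between outlets, each represented
# as a vector of article counts over shared story clusters.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical counts of articles three outlets devoted to five
# cross-lingual story clusters.
outlet_a = np.array([12, 0, 3, 7, 1])
outlet_b = np.array([10, 1, 2, 9, 0])
outlet_c = np.array([0, 8, 0, 1, 6])

print(cosine(outlet_a, outlet_b))  # similar story choices: high overlap
print(cosine(outlet_a, outlet_c))  # dissimilar story choices: low overlap
```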
Turkish information retrieval: Past changes future
One of the most exciting accomplishments of computer science in the lifetime of this generation is the World Wide Web. The Web is a global electronic publishing medium. Its size has been growing at an enormous speed for over a decade. Most of its content is objectionable, but it also contains a huge amount of valuable information. The Web adds a new dimension to the concept of information explosion and tries to solve the very same problem by means of information retrieval systems known as Web search engines. We briefly review the information explosion problem and information retrieval systems, convey the past and state of the art of Turkish information retrieval research, illustrate some recent developments, and propose some future actions in this research area in Turkey.
U-Compare bio-event meta-service: compatible BioNLP event extraction services
Background.
Bio-molecular event extraction from literature is recognized as an important task of bio text mining and, as such, many relevant systems have been developed and made available during the last decade. While such systems provide useful services individually, there is a need for a meta-service to enable comparison and ensemble of such services, offering optimal solutions for various purposes.
Results.
We have integrated nine event extraction systems in the U-Compare framework, making them inter-compatible and interoperable with other U-Compare components. The U-Compare event meta-service provides various meta-level features for comparison and ensemble of multiple event extraction systems. Experimental results show that the performance improvements achieved by the ensemble are significant.
Conclusions.
While individual event extraction systems themselves provide useful features for bio text mining, the U-Compare meta-service is expected to improve accessibility to the individual systems, and to enable meta-level uses over multiple event extraction systems, such as comparison and ensemble.
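A minimal sketch of one possible ensemble scheme, majority voting over per-system event predictions, follows; this is an illustrative assumption, not U-Compare's actual combination logic:

```python
# Illustrative sketch: keep only events proposed by a majority of systems.
from collections import Counter

def majority_vote(system_outputs, threshold=None):
    """Keep events proposed by at least `threshold` systems (default: strict majority)."""
    if threshold is None:
        threshold = len(system_outputs) // 2 + 1
    votes = Counter(event for output in system_outputs for event in set(output))
    return {event for event, count in votes.items() if count >= threshold}

# Hypothetical (trigger, event type) predictions from three systems.
system_a = [("phosphorylates", "Phosphorylation"), ("binds", "Binding")]
system_b = [("phosphorylates", "Phosphorylation")]
system_c = [("phosphorylates", "Phosphorylation"), ("inhibits", "Negative_regulation")]

print(majority_vote([system_a, system_b, system_c]))
# -> {('phosphorylates', 'Phosphorylation')}
```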
Identification and reconstruction of low-energy electrons in the ProtoDUNE-SP detector
Measurements of electrons from neutrino interactions are crucial for the Deep
Underground Neutrino Experiment (DUNE) neutrino oscillation program, as well as
for searches for physics beyond the standard model, supernova neutrino
detection, and solar neutrino measurements. This article describes the
selection and reconstruction of low-energy (Michel) electrons in the
ProtoDUNE-SP detector. ProtoDUNE-SP is one of the prototypes for the DUNE far
detector, built and operated at CERN as a charged-particle test beam
experiment. A sample of low-energy electrons produced by the decay of cosmic
muons is selected with a purity of 95%. This sample is used to calibrate the
low-energy electron energy scale with two techniques. An electron energy
calibration based on a cosmic-ray muon sample uses calibration constants
derived from measured and simulated cosmic-ray muon events. Another calibration
technique makes use of the theoretically well-understood Michel electron energy
spectrum to convert reconstructed charge to electron energy. In addition, the
effects of the detector response on the low-energy electron energy scale and
its resolution, including readout electronics threshold effects, are
quantified. Finally, the relation between the theoretical and reconstructed
low-energy electron energy spectra is derived and the energy resolution is
characterized. The low-energy electron selection presented here accounts for
about 75% of the total electron deposited energy. After the addition of lost
energy using a Monte Carlo simulation, the energy resolution improves from
about 40% to 25% at 50 MeV. These results are used to validate the expected
capabilities of the DUNE far detector to reconstruct low-energy electrons.
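The theoretical spectrum that the second calibration technique relies on is the standard Michel spectrum for muon decay at rest. The sketch below encodes it in Python; this is textbook physics for illustration, not the collaboration's calibration code:

```python
# Illustrative sketch: the normalized Michel electron energy spectrum
# (massless-electron limit), against which reconstructed charge is
# calibrated. x = E / E_max, with E_max = m_mu / 2 ~ 52.8 MeV.
import numpy as np

M_MU = 105.66       # muon mass, MeV/c^2
E_MAX = M_MU / 2.0  # Michel spectrum endpoint, ~52.8 MeV

def michel_spectrum(energy_mev):
    """dN/dE for Michel electrons, normalized to unit area over [0, E_MAX]."""
    x = np.clip(energy_mev / E_MAX, 0.0, 1.0)
    # dN/dx = 2 x^2 (3 - 2x); dividing by E_MAX converts dN/dx to dN/dE.
    return (6.0 * x**2 - 4.0 * x**3) / E_MAX

energies = np.linspace(0.0, E_MAX, 6)
print(np.round(michel_spectrum(energies), 4))
```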
Snowmass Neutrino Frontier: DUNE Physics Summary
The Deep Underground Neutrino Experiment (DUNE) is a next-generation long-baseline neutrino oscillation experiment with a primary physics goal of observing neutrino and antineutrino oscillation patterns to precisely measure the parameters governing long-baseline neutrino oscillation in a single experiment, and to test the three-flavor paradigm. DUNE's design has been developed by a large, international collaboration of scientists and engineers to have the unique capability to measure neutrino oscillation as a function of energy in a broadband beam, to resolve degeneracies among oscillation parameters, and to control systematic uncertainty using the exquisite imaging capability of massive LArTPC far detector modules and an argon-based near detector. DUNE's neutrino oscillation measurements will unambiguously resolve the neutrino mass ordering and provide the sensitivity to discover CP violation in neutrinos for a wide range of possible values of δCP. DUNE is also uniquely sensitive to electron neutrinos from a galactic supernova burst, and to a broad range of physics beyond the Standard Model (BSM), including nucleon decays. DUNE is anticipated to begin collecting physics data with Phase I, an initial experiment configuration consisting of two far detector modules and a minimal suite of near detector components, with a 1.2 MW proton beam. Realizing its extensive, world-leading physics potential requires that the full scope of DUNE be completed in Phase II. The three Phase II upgrades are all necessary to achieve DUNE's physics goals: (1) the addition of far detector modules three and four, for a total FD fiducial mass of at least 40 kt; (2) an upgrade of the proton beam power from 1.2 MW to 2.4 MW; and (3) replacement of the near detector's temporary muon spectrometer with a magnetized, high-pressure gaseous argon TPC and calorimeter.
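For orientation, the energy dependence of the oscillation patterns DUNE measures has the following generic form; the sketch uses the simplified two-flavor vacuum formula with placeholder parameter values, not DUNE's full three-flavor, matter-effect analysis:

```python
# Illustrative sketch: two-flavor vacuum oscillation probability,
# P = sin^2(2*theta) * sin^2(1.27 * dm^2[eV^2] * L[km] / E[GeV]).
import numpy as np

def p_oscillation(L_km, E_GeV, sin2_2theta, dm2_eV2):
    return sin2_2theta * np.sin(1.27 * dm2_eV2 * L_km / E_GeV) ** 2

# A DUNE-like 1300 km baseline scanned across a broadband beam energy range;
# the mixing and mass-splitting values below are placeholders.
energies = np.linspace(0.5, 5.0, 10)
print(np.round(p_oscillation(1300.0, energies, 0.95, 2.5e-3), 3))
```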
- …