The Early Bird Catches The Term: Combining Twitter and News Data For Event Detection and Situational Awareness
Twitter updates now represent an enormous stream of information originating
from a wide variety of formal and informal sources, much of which is relevant
to real-world events. In this paper we adapt existing bio-surveillance
algorithms to detect localised spikes in Twitter activity corresponding to real
events with a high level of confidence. We then develop a methodology to
automatically summarise these events, both by providing the tweets which fully
describe the event and by linking to highly relevant news articles. We apply
our methods to outbreaks of illness and events strongly affecting sentiment. In
both case studies we are able to detect events verifiable by third party
sources and produce high-quality summaries.
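The spike-detection idea in this abstract can be illustrated with a minimal sketch. The abstract does not say which bio-surveillance algorithm is adapted, so the trailing-baseline z-score rule below, the function name `detect_spikes`, the parameters `baseline`, `threshold` and `min_sigma`, and the sample counts are all assumptions made for illustration only.

```python
from statistics import mean, stdev


def detect_spikes(counts, baseline=7, threshold=3.0, min_sigma=1.0):
    """Flag time steps whose count is far above a trailing baseline.

    counts    -- tweet counts per time step for one term or region (hypothetical data)
    baseline  -- number of preceding steps used to estimate normal activity
    threshold -- how many standard deviations above the mean counts as a spike
    min_sigma -- floor on the standard deviation, so a flat baseline cannot divide by zero
    """
    spikes = []
    for t in range(baseline, len(counts)):
        window = counts[t - baseline:t]
        mu = mean(window)
        sigma = max(stdev(window), min_sigma)
        if (counts[t] - mu) / sigma > threshold:
            spikes.append(t)
    return spikes


# Hypothetical daily counts with one obvious burst at index 9.
daily_counts = [12, 15, 11, 14, 13, 12, 16, 14, 13, 95, 15, 12]
print(detect_spikes(daily_counts))  # prints [9]
```

Note that the burst inflates the baseline for the following steps, which is one reason practical detectors often add a guard band or exclude flagged points from later baselines.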
Generalized Bayesian Record Linkage and Regression with Exact Error Propagation
Record linkage (de-duplication or entity resolution) is the process of
merging noisy databases to remove duplicate entities. While record linkage
removes duplicate entities from such databases, the downstream task is any
inferential, predictive, or post-linkage task on the linked data. One goal of
the downstream task is obtaining a larger reference data set, allowing one to
perform more accurate statistical analyses. In addition, there is inherent
record linkage uncertainty passed to the downstream task. Motivated by the
above, we propose a generalized Bayesian record linkage method and consider
multiple regression analysis as the downstream task. Records are linked via a
random partition model, which allows for a wide class to be considered. In
addition, we jointly model the record linkage and downstream task, which allows
one to account for the record linkage uncertainty exactly. Moreover, one is
able to generate a feedback propagation mechanism of the information from the
proposed Bayesian record linkage model into the downstream task. This feedback
effect is essential to eliminate potential biases that can jeopardize the resulting
downstream task. We apply our methodology to multiple linear regression, and
illustrate empirically that the "feedback effect" is able to improve the
performance of record linkage. Comment: 18 pages, 5 figures
Coupling CFD and visualisation to model the behaviour and effect on visibility of small particles in air
The use of computational fluid dynamics (CFD) and lighting simulation software is becoming commonplace in building design. This study looks at a novel linkage between these two tools in the visualization of droplets or particles suspended in air. CFD is used to predict the distribution of the particles, which is then processed and passed to the lighting simulation tool. The mechanism for transforming CFD contaminant concentration predictions to a form suitable for visual simulation is explained in detail, and an example is presented which demonstrates this linkage. The CFD-visualisation simulations described in this paper have applications in both automotive and fire safety through the modelling of fog and smoke respectively. Historically, smoke and fog effects have been rendered in images with no attempt at modelling physical reality. The novelty of the work presented in this paper is that, for the first time, an attempt is made to model both the fluid mechanics and optical physics of small particles suspended in a primary fluid.
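One simple way to picture how a predicted concentration field can feed a visibility calculation is Beer-Lambert attenuation along a line of sight. The abstract does not state the transformation actually used, so this sketch is an assumption; the function name `transmittance`, its parameters and the sample values are hypothetical.

```python
import math


def transmittance(concentrations, step_length, extinction_coefficient):
    """Fraction of light surviving a ray that crosses a row of CFD cells.

    concentrations         -- particle mass concentration in each cell along the ray (kg/m^3)
    step_length            -- distance the ray travels through each cell (m)
    extinction_coefficient -- assumed mass-specific extinction coefficient (m^2/kg)
    """
    # Optical depth is the extinction coefficient times the path integral of concentration.
    optical_depth = extinction_coefficient * sum(c * step_length for c in concentrations)
    return math.exp(-optical_depth)


# Hypothetical smoke concentrations sampled every 0.5 m along a sight line.
sample = [0.0, 0.0002, 0.0005, 0.0005, 0.0001]
print(transmittance(sample, step_length=0.5, extinction_coefficient=8000.0))
```

A lighting tool would typically need such a quantity per ray or per volume element; how the original work passes it to the renderer is not described in the abstract.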
The genomic Make-Up of a Hybrid Species - Analysis of the Invasive Cottus Lineage (Pisces, Teleostei) in the River Rhine system
In recent years a new invasive lineage of sculpins (Cottus species complex) has been studied that is currently expanding in the Lower River Rhine. Molecular analysis showed that this lineage originated through hybridization of Cottus perifretum from the River Scheldt and Cottus rhenanus from the Lower River Rhine system. The emergence of the hybrid lineage is correlated with new habitat adaptations that allow expansion along river habitats that were previously not used by Cottus. Thus the question arises whether the hybridization event facilitated the invasion of, and adaptation to, such a new environment. To start tackling this question, an estimate is required of how much each of the parental species contributed to the hybrid genome and which chromosomal fragments became fixed. Several genomic resources had to be developed in order to map the ancestries of chromosomal fragments in the hybrid genome. As a basic genomic resource for Cottus, a genetic map based on already established microsatellite markers was created. This map was compared with the physical maps of sequenced fish genomes, and a high degree of conserved synteny was detected between Cottus and Tetraodon nigroviridis and between Cottus and Gasterosteus aculeatus. These model fish genomes could then be used as a reference in the further analysis of the Cottus genome. Finally, a set of ancestry-informative markers was developed in order to determine the ancestries of chromosomal fragments in the hybrid lineage. These tools made it possible to map the hybrid genome and to assess the contribution of each parental species to the hybrid lineage. Twenty-five genomic fragments were identified that were fixed for material from only one parental species and thus might harbor genes that are relevant for the specific adaptations in the hybrid species.
Measuring Organizational Performance in Strategic Human Resource Management: Looking Beyond the Lamppost
A major challenge for Strategic Human Resource Management research in the next decade will be to establish a clear, coherent and consistent construct for organizational performance. This paper describes the variety of measures used in current empirical research linking human resource management and organizational performance. Implications for future research are discussed amidst the challenges of construct definition, divergent stakeholder criteria and the temporal dynamics of performance. A model for performance information markets to address these challenges is introduced. The model uses a multi-dimensional weighted performance measurement system and a free information flow exchange mechanism for determining performance achievement criteria.
A dynamic network model with persistent links and node-specific latent variables, with an application to the interbank market
We propose a dynamic network model where two mechanisms control the
probability of a link between two nodes: (i) the existence or absence of this
link in the past, and (ii) node-specific latent variables (dynamic fitnesses)
describing the propensity of each node to create links. Assuming Markov
dynamics for both mechanisms, we propose an Expectation-Maximization algorithm
for model estimation and inference of the latent variables. The estimated
parameters and fitnesses can be used to forecast the presence of a link in the
future. We apply our methodology to the e-MID interbank network for which the
two linkage mechanisms are associated with two different trading behaviors in
the process of network formation, namely preferential trading and trading
driven by node-specific characteristics. The empirical results allow to
recognise preferential lending in the interbank market and indicate how a
method that does not account for time-varying network topologies tends to
overestimate preferential linkage.Comment: 19 pages, 6 figure
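The two linkage mechanisms named in this abstract can be sketched as a mixture: with some weight the link copies its previous state, and otherwise it is drawn from the nodes' fitnesses. The abstract does not give the functional form, so the mixture weight `persistence`, the logistic fitness term and the function name `link_probability` are assumptions for illustration, not the authors' specification.

```python
import math


def link_probability(prev_link, fitness_i, fitness_j, persistence):
    """Probability that a link between nodes i and j exists at the current step.

    prev_link   -- 1 if the link existed at the previous step, else 0
    fitness_i   -- latent fitness of node i (assumed to enter additively)
    fitness_j   -- latent fitness of node j
    persistence -- assumed weight in [0, 1] on copying the past link state
    """
    if not 0.0 <= persistence <= 1.0:
        raise ValueError("persistence must lie in [0, 1]")
    # Fitness mechanism: a logistic function of the summed fitnesses (an assumption).
    fitness_prob = 1.0 / (1.0 + math.exp(-(fitness_i + fitness_j)))
    # Persistence mechanism: the previous state carries over with weight `persistence`.
    return persistence * prev_link + (1.0 - persistence) * fitness_prob


# With neutral fitnesses, a link that existed before is much more likely to persist.
print(link_probability(1, 0.0, 0.0, persistence=0.8))
print(link_probability(0, 0.0, 0.0, persistence=0.8))
```

Under this sketch, a model that ignores the fitness term would attribute all repeated links to persistence, which is consistent with the abstract's remark about overestimating preferential linkage.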
An incremental approach to genetic algorithms based classification
Incremental learning has been widely addressed in the machine learning literature to cope with learning tasks where the learning environment is ever changing or training samples become available over time. However, most research work explores incremental learning with statistical algorithms or neural networks, rather than evolutionary algorithms. The work in this paper employs genetic algorithms (GAs) as basic learning algorithms for incremental learning within one or more classifier agents in a multi-agent environment. Four new approaches with different initialization schemes are proposed. They keep the old solutions and use an “integration” operation to integrate them with new elements to accommodate new attributes, while biased mutation and crossover operations are adopted to further evolve a reinforced solution. The simulation results on benchmark classification data sets show that the proposed approaches can deal with the arrival of new input attributes and integrate them with the original input space. It is also shown that the proposed approaches can be successfully used for incremental learning and improve classification rates as compared to the retraining GA. Possible applications for continuous incremental training and feature selection are also discussed.
A content-based retrieval system for UAV-like video and associated metadata
In this paper we provide an overview of a content-based retrieval (CBR) system that has been specifically designed for handling UAV video and associated metadata. Our emphasis in designing this system is on managing large quantities of such information and providing intuitive and efficient access mechanisms to this content, rather than on analysis of the video content. The retrieval unit in our system is termed a "trip". At capture time, each trip consists of an MPEG-1 video stream and a set of time-stamped GPS locations. An analysis process automatically selects and associates GPS locations with the video timeline. The indexed trip is then stored in a shared trip repository. The repository forms the backend of an MPEG-21 compliant Web 2.0 application for subsequent querying, browsing, annotation and video playback. The system interface allows users to search and browse across the entire archive of trips and, depending on their access rights, to annotate other users' trips with additional information. Interaction with the CBR system is via a novel interactive map-based interface. This interface supports content access by time, date, region of interest on the map, previously annotated specific locations of interest and combinations of these. Developing such a system and investigating its practical usefulness in real-world scenarios clearly requires a significant amount of appropriate data. In the absence of a large volume of UAV data with which to work, we have simulated UAV-like data using GPS-tagged video content captured from moving vehicles.
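Associating GPS locations with a video timeline, as this abstract describes, can be pictured as looking up a position for any playback time. The abstract does not explain how the analysis process does this, so the linear interpolation below, the function name `position_at` and the sample fixes are assumptions for illustration.

```python
from bisect import bisect_right


def position_at(fixes, t):
    """Estimate (lat, lon) at video time t from time-stamped GPS fixes.

    fixes -- list of (seconds_from_video_start, lat, lon) tuples, sorted by time
    t     -- playback time in seconds
    """
    times = [fix[0] for fix in fixes]
    # Clamp to the first or last fix outside the recorded interval.
    if t <= times[0]:
        return fixes[0][1:]
    if t >= times[-1]:
        return fixes[-1][1:]
    # Find the pair of fixes that bracket t and interpolate linearly between them.
    i = bisect_right(times, t)
    t0, lat0, lon0 = fixes[i - 1]
    t1, lat1, lon1 = fixes[i]
    w = (t - t0) / (t1 - t0)
    return (lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0))


# Hypothetical trip: three fixes over twenty seconds of video.
trip = [(0.0, 53.0, -6.0), (10.0, 53.1, -6.2), (20.0, 53.1, -6.4)]
print(position_at(trip, 5.0))
```

Linear interpolation in latitude and longitude is only reasonable over short gaps between fixes; a production system might instead snap to the nearest fix or use a geodesic interpolation.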