
    The Early Bird Catches The Term: Combining Twitter and News Data For Event Detection and Situational Awareness

    Twitter updates now represent an enormous stream of information originating from a wide variety of formal and informal sources, much of which is relevant to real-world events. In this paper we adapt existing bio-surveillance algorithms to detect localised spikes in Twitter activity corresponding to real events with a high level of confidence. We then develop a methodology to automatically summarise these events, both by providing the tweets which fully describe the event and by linking to highly relevant news articles. We apply our methods to outbreaks of illness and events strongly affecting sentiment. In both case studies we are able to detect events verifiable by third-party sources and produce high-quality summaries.
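The abstract does not say which bio-surveillance algorithm is adapted, so the following is only a minimal sketch of the general idea of spike detection against a moving baseline. The function name `detect_spikes`, the window sizes, the threshold and the `min_sigma` floor are all illustrative assumptions, not the authors' method.

```python
from statistics import mean, stdev


def detect_spikes(counts, baseline=7, guard=2, threshold=3.0, min_sigma=1.0):
    """Flag intervals whose count is far above a moving baseline.

    counts:    tweet counts per time interval (hypothetical input).
    baseline:  number of past intervals used to estimate normal activity.
    guard:     gap between the baseline window and the tested interval, so
               that the start of a spike does not inflate its own baseline.
    threshold: number of standard deviations above the mean that counts
               as a spike (assumed value).
    min_sigma: floor on the standard deviation, so that a perfectly flat
               baseline does not turn tiny changes into alarms.
    """
    spikes = []
    for t in range(baseline + guard, len(counts)):
        window = counts[t - guard - baseline : t - guard]
        mu = mean(window)
        sigma = max(stdev(window), min_sigma)
        score = (counts[t] - mu) / sigma
        if score > threshold:
            spikes.append(t)
    return spikes


if __name__ == "__main__":
    hourly = [10, 12, 11, 9, 10, 11, 10, 12, 11, 60, 10]
    print(detect_spikes(hourly))
```

In a localised setting, a rule like this would presumably be run separately per region and per keyword; that partitioning is not shown here.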

    Generalized Bayesian Record Linkage and Regression with Exact Error Propagation

    Record linkage (de-duplication or entity resolution) is the process of merging noisy databases to remove duplicate entities. While record linkage removes duplicate entities from such databases, the downstream task is any inferential, predictive, or post-linkage task on the linked data. One goal of the downstream task is obtaining a larger reference data set, allowing one to perform more accurate statistical analyses. In addition, there is inherent record linkage uncertainty passed to the downstream task. Motivated by the above, we propose a generalized Bayesian record linkage method and consider multiple regression analysis as the downstream task. Records are linked via a random partition model, which allows for a wide class to be considered. In addition, we jointly model the record linkage and downstream task, which allows one to account for the record linkage uncertainty exactly. Moreover, one is able to generate a feedback propagation mechanism of the information from the proposed Bayesian record linkage model into the downstream task. This feedback effect is essential to eliminate potential biases that can jeopardize the resulting downstream task. We apply our methodology to multiple linear regression, and illustrate empirically that the "feedback effect" is able to improve the performance of record linkage. Comment: 18 pages, 5 figures

    Coupling CFD and visualisation to model the behaviour and effect on visibility of small particles in air

    The use of computational fluid dynamics (CFD) and lighting simulation software is becoming commonplace in building design. This study looks at a novel linkage between these two tools in the visualisation of droplets or particles suspended in air. CFD is used to predict the distribution of the particles, which is then processed and passed to the lighting simulation tool. The mechanism for transforming CFD contaminant concentration predictions to a form suitable for visual simulation is explained in detail, and an example is presented which demonstrates this linkage. The CFD-visualisation simulations described in this paper have applications in both automotive and fire safety through the modelling of fog and smoke respectively. Historically, smoke and fog effects have been rendered in images with no attempt at modelling physical reality. The novelty of the work presented in this paper is that, for the first time, an attempt is made to model both the fluid mechanics and optical physics of small particles suspended in a primary fluid.

    The genomic Make-Up of a Hybrid Species - Analysis of the Invasive Cottus Lineage (Pisces, Teleostei) in the River Rhine system

    In recent years, a new invasive lineage of sculpins (Cottus species complex) that is currently expanding in the Lower River Rhine has been studied. Molecular analysis showed that this lineage originated through hybridization of Cottus perifretum from the River Scheldt and Cottus rhenanus from the Lower River Rhine system. The emergence of the hybrid lineage is correlated with new habitat adaptations that allow expansion along river habitats that were not previously used by Cottus. The question therefore arises whether the hybridization event facilitated the invasion of, and the adaptation to, such a new environment. To begin tackling this question, an estimate is required of how much each of the parental species contributed to the hybrid genome and which chromosomal fragments became fixed. Several genomic resources had to be developed in order to map the ancestries of chromosomal fragments in the hybrid genome. As a basic genomic resource for Cottus, a genetic map based on already established microsatellite markers was created. This map was compared with the physical maps of sequenced fish genomes, and a high degree of conserved synteny was detected between Cottus and Tetraodon nigroviridis and between Cottus and Gasterosteus aculeatus. These model fish genomes could then be used as a reference in the further analysis of the Cottus genome. Finally, a set of ancestry-informative markers was developed in order to determine the ancestries of chromosomal fragments in the hybrid lineage. These tools made it possible to map the hybrid genome and to assess the contribution of each parental species to the hybrid lineage. Twenty-five genomic fragments were identified that were fixed for material from only one parental species and thus might harbor genes that are relevant for the specific adaptations in the hybrid species.

    Measuring Organizational Performance in Strategic Human Resource Management: Looking Beyond the Lamppost

    A major challenge for Strategic Human Resource Management research in the next decade will be to establish a clear, coherent and consistent construct for organizational performance. This paper describes the variety of measures used in current empirical research linking human resource management and organizational performance. Implications for future research are discussed amidst the challenges of construct definition, divergent stakeholder criteria and the temporal dynamics of performance. A model for performance information markets to address these challenges is introduced. The model uses a multi-dimensional weighted performance measurement system and a free information flow exchange mechanism for determining performance achievement criteria.

    A dynamic network model with persistent links and node-specific latent variables, with an application to the interbank market

    We propose a dynamic network model where two mechanisms control the probability of a link between two nodes: (i) the existence or absence of this link in the past, and (ii) node-specific latent variables (dynamic fitnesses) describing the propensity of each node to create links. Assuming Markov dynamics for both mechanisms, we propose an Expectation-Maximization algorithm for model estimation and inference of the latent variables. The estimated parameters and fitnesses can be used to forecast the presence of a link in the future. We apply our methodology to the e-MID interbank network, for which the two linkage mechanisms are associated with two different trading behaviors in the process of network formation, namely preferential trading and trading driven by node-specific characteristics. The empirical results allow us to recognise preferential lending in the interbank market and indicate how a method that does not account for time-varying network topologies tends to overestimate preferential linkage. Comment: 19 pages, 6 figures
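The abstract names the two mechanisms but not the functional form that combines them. The sketch below shows one possible combination, stated purely as an assumption: with weight `alpha` the previous link state is copied, and otherwise the link is drawn from a logistic function of the two nodes' fitnesses. The names `link_probability`, `simulate_link`, `alpha`, `theta_i` and `theta_j` are hypothetical, and the fitnesses are held constant here for brevity, whereas the paper describes them as dynamic.

```python
import math
import random


def link_probability(prev_link, theta_i, theta_j, alpha):
    """Probability of a link at time t given its state at time t - 1.

    prev_link: 1 if the link existed at the previous step, else 0.
    theta_i, theta_j: node fitnesses (assumed fixed in this sketch).
    alpha: assumed persistence weight in [0, 1]; alpha = 1 means links are
           copied from the past, alpha = 0 means only the fitnesses matter.
    """
    fitness_term = 1.0 / (1.0 + math.exp(-(theta_i + theta_j)))
    return alpha * prev_link + (1.0 - alpha) * fitness_term


def simulate_link(initial, theta_i, theta_j, alpha, steps, seed=0):
    """Simulate the Markov chain of one link's states over `steps` steps."""
    rng = random.Random(seed)
    states = [initial]
    for _ in range(steps):
        p = link_probability(states[-1], theta_i, theta_j, alpha)
        states.append(1 if rng.random() < p else 0)
    return states


if __name__ == "__main__":
    print(link_probability(1, 0.0, 0.0, 0.5))
    print(simulate_link(1, -1.0, 0.5, 0.8, steps=20))
```

Under this assumed form, a model that ignores the fitness term would have to attribute every repeated link to persistence, which gives some intuition for the overestimation of preferential linkage mentioned in the abstract.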

    An incremental approach to genetic algorithms based classification

    Incremental learning has been widely addressed in the machine learning literature to cope with learning tasks where the learning environment is ever changing or training samples become available over time. However, most research work explores incremental learning with statistical algorithms or neural networks, rather than evolutionary algorithms. The work in this paper employs genetic algorithms (GAs) as basic learning algorithms for incremental learning within one or more classifier agents in a multi-agent environment. Four new approaches with different initialization schemes are proposed. They keep the old solutions and use an "integration" operation to integrate them with new elements to accommodate new attributes, while biased mutation and crossover operations are adopted to further evolve a reinforced solution. The simulation results on benchmark classification data sets show that the proposed approaches can deal with the arrival of new input attributes and integrate them with the original input space. It is also shown that the proposed approaches can be successfully used for incremental learning and improve classification rates as compared to the retraining GA. Possible applications for continuous incremental training and feature selection are also discussed.

    A content-based retrieval system for UAV-like video and associated metadata

    In this paper we provide an overview of a content-based retrieval (CBR) system that has been specifically designed for handling UAV video and associated metadata. Our emphasis in designing this system is on managing large quantities of such information and providing intuitive and efficient access mechanisms to this content, rather than on analysis of the video content. The retrieval unit in our system is termed a "trip". At capture time, each trip consists of an MPEG-1 video stream and a set of time-stamped GPS locations. An analysis process automatically selects and associates GPS locations with the video timeline. The indexed trip is then stored in a shared trip repository. The repository forms the backend of an MPEG-21 compliant Web 2.0 application for subsequent querying, browsing, annotation and video playback. The system interface allows users to search/browse across the entire archive of trips and, depending on their access rights, to annotate other users' trips with additional information. Interaction with the CBR system is via a novel interactive map-based interface. This interface supports content access by time, date, region of interest on the map, previously annotated specific locations of interest and combinations of these. Developing such a system and investigating its practical usefulness in real-world scenarios clearly requires a significant amount of appropriate data. In the absence of a large volume of UAV data with which to work, we have simulated UAV-like data using GPS-tagged video content captured from moving vehicles.
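The abstract does not describe how GPS locations are associated with the video timeline. As a rough illustration only, one simple approach would be to interpolate linearly between the two GPS fixes that bracket a given video time. The function `position_at` and the `(timestamp, lat, lon)` tuple layout are assumptions made for this sketch, not the system's actual data model.

```python
from bisect import bisect_right


def position_at(fixes, t):
    """Estimate (lat, lon) at video time t from time-stamped GPS fixes.

    fixes: list of (timestamp_seconds, lat, lon) tuples, sorted by
           timestamp with strictly increasing timestamps (assumed layout).
    t:     time on the video timeline, in the same clock as the fixes.

    Times before the first fix or after the last one are clamped to the
    nearest fix. Linear interpolation in latitude and longitude is a
    reasonable approximation only over short distances.
    """
    times = [fix[0] for fix in fixes]
    if t <= times[0]:
        return fixes[0][1:]
    if t >= times[-1]:
        return fixes[-1][1:]
    i = bisect_right(times, t)  # index of the first fix strictly after t
    t0, lat0, lon0 = fixes[i - 1]
    t1, lat1, lon1 = fixes[i]
    w = (t - t0) / (t1 - t0)
    return (lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0))


if __name__ == "__main__":
    trip = [(0.0, 53.0, -6.0), (10.0, 53.1, -6.2), (20.0, 53.1, -6.4)]
    print(position_at(trip, 5.0))
```

A lookup of this kind would let a map-based interface jump from a point on the map to the nearest moment in the video, or the reverse.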