23,215 research outputs found

    Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art

    Get PDF
    Information Extraction, data Integration, and uncertain data management are different areas of research that got vast focus in the last two decades. Many researches tackled those areas of research individually. However, information extraction systems should have integrated with data integration methods to make use of the extracted information. Handling uncertainty in extraction and integration process is an important issue to enhance the quality of the data in such integrated systems. This article presents the state of the art of the mentioned areas of research and shows the common grounds and how to integrate information extraction and data integration under uncertainty management cover

    Multi-camera Realtime 3D Tracking of Multiple Flying Animals

    Full text link
    Automated tracking of animal movement allows analyses that would not otherwise be possible by providing great quantities of data. The additional capability of tracking in realtime - with minimal latency - opens up the experimental possibility of manipulating sensory feedback, thus allowing detailed explorations of the neural basis for control of behavior. Here we describe a new system capable of tracking the position and body orientation of animals such as flies and birds. The system operates with less than 40 msec latency and can track multiple animals simultaneously. To achieve these results, a multi target tracking algorithm was developed based on the Extended Kalman Filter and the Nearest Neighbor Standard Filter data association algorithm. In one implementation, an eleven camera system is capable of tracking three flies simultaneously at 60 frames per second using a gigabit network of nine standard Intel Pentium 4 and Core 2 Duo computers. This manuscript presents the rationale and details of the algorithms employed and shows three implementations of the system. An experiment was performed using the tracking system to measure the effect of visual contrast on the flight speed of Drosophila melanogaster. At low contrasts, speed is more variable and faster on average than at high contrasts. Thus, the system is already a useful tool to study the neurobiology and behavior of freely flying animals. If combined with other techniques, such as `virtual reality'-type computer graphics or genetic manipulation, the tracking system would offer a powerful new way to investigate the biology of flying animals.Comment: pdfTeX using libpoppler 3.141592-1.40.3-2.2 (Web2C 7.5.6), 18 pages with 9 figure

    Digital-forensics based pattern recognition for discovering identities in electronic evidence

    Get PDF
    With the pervasiveness of computers and mobile devices, digital forensics becomes more important in law enforcement. Detectives increasingly depend on the scarce support of digital specialists which impedes efficiency of criminal investigations. This paper proposes and algorithm to extract, merge and rank identities that are encountered in the electronic evidence during processing. Two experiments are described demonstrating that our approach can assist with the identification of frequently occurring identities so that investigators can prioritize the investigation of evidence units accordingly

    JigsawNet: Shredded Image Reassembly using Convolutional Neural Network and Loop-based Composition

    Full text link
    This paper proposes a novel algorithm to reassemble an arbitrarily shredded image to its original status. Existing reassembly pipelines commonly consist of a local matching stage and a global compositions stage. In the local stage, a key challenge in fragment reassembly is to reliably compute and identify correct pairwise matching, for which most existing algorithms use handcrafted features, and hence, cannot reliably handle complicated puzzles. We build a deep convolutional neural network to detect the compatibility of a pairwise stitching, and use it to prune computed pairwise matches. To improve the network efficiency and accuracy, we transfer the calculation of CNN to the stitching region and apply a boost training strategy. In the global composition stage, we modify the commonly adopted greedy edge selection strategies to two new loop closure based searching algorithms. Extensive experiments show that our algorithm significantly outperforms existing methods on solving various puzzles, especially those challenging ones with many fragment pieces

    Recovering complete and draft population genomes from metagenome datasets.

    Get PDF
    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improves the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution

    Objective Classification of Galaxy Spectra using the Information Bottleneck Method

    Get PDF
    A new method for classification of galaxy spectra is presented, based on a recently introduced information theoretical principle, the `Information Bottleneck'. For any desired number of classes, galaxies are classified such that the information content about the spectra is maximally preserved. The result is classes of galaxies with similar spectra, where the similarity is determined via a measure of information. We apply our method to approximately 6000 galaxy spectra from the ongoing 2dF redshift survey, and a mock-2dF catalogue produced by a Cold Dark Matter-based semi-analytic model of galaxy formation. We find a good match between the mean spectra of the classes found in the data and in the models. For the mock catalogue, we find that the classes produced by our algorithm form an intuitively sensible sequence in terms of physical properties such as colour, star formation activity, morphology, and internal velocity dispersion. We also show the correlation of the classes with the projections resulting from a Principal Component Analysis.Comment: submitted to MNRAS, 17 pages, Latex, with 14 figures embedde

    A Survey of Volunteered Open Geo-Knowledge Bases in the Semantic Web

    Full text link
    Over the past decade, rapid advances in web technologies, coupled with innovative models of spatial data collection and consumption, have generated a robust growth in geo-referenced information, resulting in spatial information overload. Increasing 'geographic intelligence' in traditional text-based information retrieval has become a prominent approach to respond to this issue and to fulfill users' spatial information needs. Numerous efforts in the Semantic Geospatial Web, Volunteered Geographic Information (VGI), and the Linking Open Data initiative have converged in a constellation of open knowledge bases, freely available online. In this article, we survey these open knowledge bases, focusing on their geospatial dimension. Particular attention is devoted to the crucial issue of the quality of geo-knowledge bases, as well as of crowdsourced data. A new knowledge base, the OpenStreetMap Semantic Network, is outlined as our contribution to this area. Research directions in information integration and Geographic Information Retrieval (GIR) are then reviewed, with a critical discussion of their current limitations and future prospects
    • …
    corecore