526 research outputs found

    Scaling DBSCAN-like algorithms for event detection systems in Twitter

    Get PDF
    The increasing use of mobile social networks has lately transformed news media. Real-world events are nowadays reported in social networks much faster than in traditional channels. As a result, the autonomous detection of events from networks like Twitter has gained lot of interest in both research and media groups. DBSCAN-like algorithms constitute a well-known clustering approach to retrospective event detection. However, scaling such algorithms to geographically large regions and temporarily long periods present two major shortcomings. First, detecting real-world events from the vast amount of tweets cannot be performed anymore in a single machine. Second, the tweeting activity varies a lot within these broad space-time regions limiting the use of global parameters. Against this background, we propose to scale DBSCAN-like event detection techniques by parallelizing and distributing them through a novel density-aware MapReduce scheme. The proposed scheme partitions tweet data as per its spatial and temporal features and tailors local DBSCAN parameters to local tweet densities. We implement the scheme in Apache Spark and evaluate its performance in a dataset composed of geo-located tweets in the Iberian peninsula during the course of several football matches. The results pointed out to the benefits of our proposal against other state-of-the-art techniques in terms of speed-up and detection accuracy.Peer ReviewedPostprint (author's final draft

    A Connected Component Labeling Algorithm for Implicitly-Defined Domains

    Full text link
    A connected component labeling algorithm is developed for implicitly-defined domains specified by multivariate polynomials. The algorithm operates by recursively subdividing the constraint domain into hyperrectangular subcells until the topology thereon is sufficiently simple; in particular, we devise a topology test using properties of Bernstein polynomials. In many cases the algorithm produces a certificate guaranteeing its correctness, i.e., two points yield the same label if and only if they are path-connected. To robustly handle various kinds of edge cases, the algorithm may assign identical labels to distinct components, but only when they are exactly or nearly touching, relative to a user-controlled length scale. A variety of numerical experiments assess the effectiveness of the overall approach, including statistical analyses on randomly generated multi-component geometry in 2D and 3D, as well as specific examples involving cusps, self-intersections, junctions, and other kinds of singularities.Comment: 15 pages, 7 figures, 3 algorithm

    Sensory processing and world modeling for an active ranging device

    Get PDF
    In this project, we studied world modeling and sensory processing for laser range data. World Model data representation and operation were defined. Sensory processing algorithms for point processing and linear feature detection were designed and implemented. The interface between world modeling and sensory processing in the Servo and Primitive levels was investigated and implemented. In the primitive level, linear features detectors for edges were also implemented, analyzed and compared. The existing world model representations is surveyed. Also presented is the design and implementation of the Y-frame model, a hierarchical world model. The interfaces between the world model module and the sensory processing module are discussed as well as the linear feature detectors that were designed and implemented
    • …
    corecore