
    IJA: An Efficient Algorithm for Query Processing in Sensor Networks

    A key feature of sensor networks is the ability to process real-time state information after gathering the needed data from many domains. The component technologies of each sensor node, including physical sensors, processors, actuators, and power supplies, have advanced significantly over the last decade. Thanks to these advances, sensor networks have been adopted across many industries to sense physical phenomena. However, sensor nodes are considerably constrained: with their limited energy and memory resources, they have far less ability to process information than conventional computer systems. Query processing over the nodes must therefore take these limitations into account. For this reason, join operations in sensor networks are typically processed, and have been studied, in a distributed manner over a set of nodes. While simple queries, such as select and aggregate queries, have been addressed in the literature, the processing of join queries in sensor networks remains to be investigated. In this paper, we therefore propose and describe an Incremental Join Algorithm (IJA) for sensor networks to reduce the overhead caused by moving a join pair to the final join node, or to minimize the communication cost, which is the main consumer of battery power when processing distributed queries in sensor network environments. Simulation results show that the proposed IJA significantly reduces the number of bytes moved to join nodes compared to the popular synopsis join algorithm.

    Effectiveness of note duration information for music retrieval

    Content-based music information retrieval uses features extracted from music to answer queries. For melodic queries, the two main features are the pitch and duration of notes. The note pitch feature has been well researched whereas duration has not been fully explored. In this paper, we discuss how the note duration feature can be used to alter music retrieval effectiveness. Notes are represented by strings called standardisations. A standardisation is designed for approximate string matching and may not capture melodic information precisely. To represent pitches, we use a string of pitch differences. Our duration standardisation uses a string of five symbols representing the relative durations of adjacent notes. For both features, the Smith-Waterman alignment is used for matching. We demonstrate combining the similarity in both features using a vector model. Results of our experiments in retrieval effectiveness show that note duration similarity by itself is not useful for effective music retrieval. Combining pitch and duration similarity using the vector model does not improve retrieval effectiveness over the use of pitch on its own
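The pitch matching described in this abstract (a string of pitch differences compared with Smith-Waterman local alignment) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of MIDI note numbers, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are all assumptions chosen for the example.

```python
def pitch_intervals(midi_pitches):
    """Turn absolute pitches into the sequence of successive differences.

    Assumption: pitches are MIDI note numbers; the paper's exact
    standardisation alphabet is not given in the abstract.
    """
    return [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]


def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring parameters are illustrative assumptions, with a linear gap penalty.
    Only the previous row of the score matrix is kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


query = pitch_intervals([60, 62, 64, 65])        # C D E F
transposed = pitch_intervals([67, 69, 71, 72])   # G A B C, same contour
other = pitch_intervals([60, 59, 55, 60])        # unrelated melody

print(smith_waterman(query, transposed))  # 6: all three intervals match
print(smith_waterman(query, other))       # 0: no interval in common
```

Because the strings hold differences rather than absolute pitches, a transposed copy of the query aligns perfectly, which is the usual motivation for this representation.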

    Building K-Anonymous User Cohorts with Consecutive Consistent Weighted Sampling (CCWS)

    To retrieve personalized campaigns and creatives while protecting user privacy, digital advertising is shifting from member-based identity to cohort-based identity. Under such an identity regime, an accurate and efficient cohort building algorithm is desired to group users with similar characteristics. In this paper, we propose a scalable K-anonymous cohort building algorithm called consecutive consistent weighted sampling (CCWS). The proposed method combines the spirit of (p-powered) consistent weighted sampling and hierarchical clustering, so that K-anonymity is ensured by enforcing a lower bound on the size of cohorts. Evaluations on a LinkedIn dataset consisting of >70M users and ads campaigns demonstrate that CCWS achieves substantial improvements over several hashing-based methods, including sign random projections (SignRP), minwise hashing (MinHash), and vanilla CWS.
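For readers unfamiliar with the hashing baselines named in this abstract, the sketch below shows plain minwise hashing (MinHash), one of the methods CCWS is compared against. It is not CCWS itself and is not taken from the paper: the function names, the choice of salted BLAKE2b as the hash family, and the example feature strings are assumptions made for illustration.

```python
import hashlib


def minhash_signature(items, num_hashes=64):
    """Return a MinHash signature for a non-empty set of strings.

    Assumption: each of the num_hashes hash functions is BLAKE2b with a
    different salt; any family of independent hash functions would do.
    """
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"),
                                digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree (estimates Jaccard similarity)."""
    agree = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return agree / len(sig_a)


# Hypothetical user feature sets.
user_a = {"skill:python", "industry:software", "region:emea"}
user_b = {"skill:python", "industry:software", "region:apac"}
user_c = {"skill:welding", "industry:construction"}

sig_a, sig_b, sig_c = (minhash_signature(u) for u in (user_a, user_b, user_c))
print(estimate_jaccard(sig_a, sig_b))  # estimate of the true Jaccard value 0.5
print(estimate_jaccard(sig_a, sig_c))  # disjoint sets, so close to 0
```

Users whose signatures agree in many positions can be bucketed together; MinHash treats features as binary, whereas consistent weighted sampling extends the same idea to weighted feature vectors.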

    Distributed Database Management Techniques for Wireless Sensor Networks

    In sensor networks, the large amount of data generated by sensors greatly influences the lifetime of the network. To manage this amount of sensed data in an energy-efficient way, new methods of storage and data query are needed. In this respect, the distributed database approach has proved to be one of the most energy-efficient data storage and query techniques for sensor networks. This paper surveys the state of the art of the techniques used to manage data and queries in wireless sensor networks based on the distributed paradigm. A classification of these techniques is also proposed. The goal of this work is not only to present how data and query management techniques have advanced, but also to show their benefits and drawbacks and to identify open issues, providing guidelines for further contributions to this type of distributed architecture.
    This work was partially supported by the Instituto de Telecomunicacoes, Next Generation Networks and Applications Group (NetGNA), Portugal, by the Ministerio de Ciencia e Innovacion, through the Plan Nacional de I+D+i 2008-2011 in the Subprograma de Proyectos de Investigacion Fundamental, project TEC2011-27516, by the Polytechnic University of Valencia, through the PAID-05-12 multidisciplinary projects, by Government of Russian Federation, Grant 074-U01, and by National Funding from the FCT-Fundacao para a Ciencia e a Tecnologia through the Pest-OE/EEI/LA0008/2013 Project.
    Diallo, O.; Rodrigues, JJPC.; Sene, M.; Lloret, J. (2013). Distributed Database Management Techniques for Wireless Sensor Networks. IEEE Transactions on Parallel and Distributed Systems. PP(99):1-17. https://doi.org/10.1109/TPDS.2013.207

    Stealth databases: ensuring user-controlled queries in untrusted cloud environments

    Sensitive data is increasingly being hosted online in ubiquitous cloud storage services. Recent advances in multi-cloud service integration through provider multiplexing and data dispersion have alleviated most of the associated risks for hosting files which are retrieved by users for further processing. However, for structured data managed in databases, many issues remain, including the need to perform operations directly on the remote data to avoid costly transfers. In this paper, we motivate the need for distributed stealth databases which combine properties from structure-preserving dispersed file storage for capacity-saving increased availability with emerging work on structure-preserving encryption for on-demand increased confidentiality with controllable performance degradation. We contribute an analysis of operators executing in map-reduce or map-carry-reduce phases and derive performance statistics. Our prototype, StealthDB, demonstrates that for typical amounts of personal structured data, stealth databases are a convincing concept for taming untrusted and unsafe cloud environments

    From data acquisition to data fusion: a comprehensive review and a roadmap for the identification of activities of daily living using mobile devices

    This paper focuses on the research on the state of the art for sensor fusion techniques, applied to the sensors embedded in mobile devices, as a means to help identify the mobile device user’s daily activities. Sensor data fusion techniques are used to consolidate the data collected from several sensors, increasing the reliability of the algorithms for the identification of the different activities. However, mobile devices have several constraints, e.g., low memory, low battery life and low processing power, and some data fusion techniques are not suited to this scenario. The main purpose of this paper is to present an overview of the state of the art to identify examples of sensor data fusion techniques that can be applied to the sensors available in mobile devices aiming to identify activities of daily living (ADLs)

    The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives

    The Archive Query Log (AQL) is a previously unused, comprehensive query log collected at the Internet Archive over the last 25 years. Its first version includes 356 million queries, 166 million search result pages, and 1.7 billion search results across 550 search providers. Although many query logs have been studied in the literature, the search providers that own them generally do not publish their logs to protect user privacy and vital business data. Of the few query logs publicly available, none combines size, scope, and diversity. The AQL is the first to do so, enabling research on new retrieval models and (diachronic) search engine analyses. Provided in a privacy-preserving manner, it promotes open research as well as more transparency and accountability in the search industry.
    Comment: SIGIR 2023 resource paper, 13 pages

    The potential of e-ticketing for public transport planning: the Piedmont region case study

    In the Piedmont region (Italy), the electronic ticketing system called BIP is currently active across much of the territory, and the data collected in the Province of Cuneo since the full activation of the system (2014) now provide a sound source of information. Two different travel documents are available, travel passes and pay-per-use, with different validation rules: check-in only for travel passes, and check-in and check-out for pay-per-use. Data produced by this smart-card-based electronic ticketing system allow a detailed analysis of each user’s behaviour and the calculation of time and space distributions of each passenger trip. In detail, data originating from smart card transactions make it possible to trace back trip chains, establish journey origins and destinations, and produce a “travel diary” for each passenger. Based on these data, performance indicators (e.g. load factor) as well as user mobility patterns and origin-destination matrices can be calculated in an automated and reliable way. This article presents a methodology for assessing the quality of the data collected when information about boarding and alighting stops is available from the (on-board) validation system. It also presents an algorithm to assign a destination to each trip where only the boarding information is available. In the case study of the Province of Cuneo, it was found that 91% of the pay-per-use journey data are reliable and can be used for further analysis, whereas the proposed algorithm made it possible to estimate the destinations for 82% of the travel pass trips.
    DOI: http://dx.doi.org/10.4995/CIT2016.2016.1999
    Arnone, M.; Delmastro, T.; Giacosa, G.; Paoleti, M.; Villalta, P. (2016). The potential of e-ticketing for public transport planning: the Piedmont region case study. In XII Congreso de ingeniería del transporte, 7, 8 and 9 June, Valencia (Spain). Editorial Universitat Politècnica de València. 2380-2387. https://doi.org/10.4995/CIT2016.2015.1999