4,251 research outputs found

    When Things Matter: A Data-Centric View of the Internet of Things

    Full text link
    With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, typically produced in dynamic and volatile environments, which is not only extremely large in scale and volume, but also noisy, and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed

    GLOVE: towards privacy-preserving publishing of record-level-truthful mobile phone trajectories

    Get PDF
    Datasets of mobile phone trajectories collected by network operators offer an unprecedented opportunity to discover new knowledge from the activity of large populations of millions. However, publishing such trajectories also raises significant privacy concerns, as they contain personal data in the form of individual movement patterns. Privacy risks induce network operators to enforce restrictive confidential agreements in the rare occasions when they grant access to collected trajectories, whereas a less involved circulation of these data would fuel research and enable reproducibility in many disciplines. In this work, we contribute a building block toward the design of privacy-preserving datasets of mobile phone trajectories that are truthful at the record level. We present GLOVE, an algorithm that implements k-anonymity, hence solving the crucial unicity problem that affects this type of data while ensuring that the anonymized trajectories correspond to real-life users. GLOVE builds on original insights about the root causes behind the undesirable unicity of mobile phone trajectories, and leverages generalization and suppression to remove them. Proof-of-concept validations with large-scale real-world datasets demonstrate that the approach adopted by GLOVE allows preserving a substantial level of accuracy in the data, higher than that granted by previous methodologies.This work was supported by the Atracción de Talento Investigador program of the Comunidad de Madrid under Grant No. 2019-T1/TIC-16037 NetSense

    Architecture and Information Requirements to Assess and Predict Flight Safety Risks During Highly Autonomous Urban Flight Operations

    Get PDF
    As aviation adopts new and increasingly complex operational paradigms, vehicle types, and technologies to broaden airspace capability and efficiency, maintaining a safe system will require recognition and timely mitigation of new safety issues as they emerge and before significant consequences occur. A shift toward a more predictive risk mitigation capability becomes critical to meet this challenge. In-time safety assurance comprises monitoring, assessment, and mitigation functions that proactively reduce risk in complex operational environments where the interplay of hazards may not be known (and therefore not accounted for) during design. These functions can also help to understand and predict emergent effects caused by the increased use of automation or autonomous functions that may exhibit unexpected non-deterministic behaviors. The envisioned monitoring and assessment functions can look for precursors, anomalies, and trends (PATs) by applying model-based and data-driven methods. Outputs would then drive downstream mitigation(s) if needed to reduce risk. These mitigations may be accomplished using traditional design revision processes or via operational (and sometimes automated) mechanisms. The latter refers to the in-time aspect of the system concept. This report comprises architecture and information requirements and considerations toward enabling such a capability within the domain of low altitude highly autonomous urban flight operations. This domain may span, for example, public-use surveillance missions flown by small unmanned aircraft (e.g., infrastructure inspection, facility management, emergency response, law enforcement, and/or security) to transportation missions flown by larger aircraft that may carry passengers or deliver products. Caveat: Any stated requirements in this report should be considered initial requirements that are intended to drive research and development (R&D). These initial requirements are likely to evolve based on R&D findings, refinement of operational concepts, industry advances, and new industry or regulatory policies or standards related to safety assurance

    Process Monitoring and Data Mining with Chemical Process Historical Databases

    Get PDF
    Modern chemical plants have distributed control systems (DCS) that handle normal operations and quality control. However, the DCS cannot compensate for fault events such as fouling or equipment failures. When faults occur, human operators must rapidly assess the situation, determine causes, and take corrective action, a challenging task further complicated by the sheer number of sensors. This information overload as well as measurement noise can hide information critical to diagnosing and fixing faults. Process monitoring algorithms can highlight key trends in data and detect faults faster, reducing or even preventing the damage that faults can cause. This research improves tools for process monitoring on different chemical processes. Previously successful monitoring methods based on statistics can fail on non-linear processes and processes with multiple operating states. To address these challenges, we develop a process monitoring technique based on multiple self-organizing maps (MSOM) and apply it in industrial case studies including a simulated plant and a batch reactor. We also use standard SOM to detect a novel event in a separation tower and produce contribution plots which help isolate the causes of the event. Another key challenge to any engineer designing a process monitoring system is that implementing most algorithms requires data organized into “normal” and “faulty”; however, data from faulty operations can be difficult to locate in databases storing months or years of operations. To assist in identifying faulty data, we apply data mining algorithms from computer science and compare how they cluster chemical process data from normal and faulty conditions. We identify several techniques which successfully duplicated normal and faulty labels from expert knowledge and introduce a process data mining software tool to make analysis simpler for practitioners. The research in this dissertation enhances chemical process monitoring tasks. MSOM-based process monitoring improves upon standard process monitoring algorithms in fault identification and diagnosis tasks. The data mining research reduces a crucial barrier to the implementation of monitoring algorithms. The enhanced monitoring introduced can help engineers develop effective and scalable process monitoring systems to improve plant safety and reduce losses from fault events

    An Introduction to Programming for Bioscientists: A Python-based Primer

    Full text link
    Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in the biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language's usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. Starting with basic concepts, such as that of a 'variable', the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences.Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables, numerous exercises, and 19 pages of Supporting Information; currently in press at PLOS Computational Biolog
    corecore