1,975 research outputs found

    When Things Matter: A Data-Centric View of the Internet of Things

    With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, which is typically produced in dynamic and volatile environments and is not only extremely large in scale and volume, but also noisy and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from a data-centric perspective, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed.

    Carbon Capture, Transport and Storage in Europe: A Problematic Energy Bridge to Nowhere?

    This paper is a follow-up to the SECURE project, financed by the European Commission to study “Security of Energy Considering its Uncertainties, Risks and Economic Implications”. It addresses the prospects for, and the obstacles to, a CCTS roll-out, as stipulated in some of the scenarios. Our main hypothesis is that given the substantial technical and institutional uncertainties, the lack of a clear political commitment, and the available alternatives of low-carbon technologies, CCTS is unlikely to play an important role in the future energy mix; it is even less likely to be an “energy bridge” into a low-carbon energy future. Carbon Capture, Transport, Storage

    Scaling up Labeling, Mining, and Inferencing on Event Extraction

    Numerous important events happen every day and are reported in different media sources with varying narrative styles across different knowledge domains and languages. Detecting the real-world events that have been reported in online articles and posts is one of the main tasks in event extraction. Other tasks include identifying event triggers and trigger types, identifying event arguments and argument types, clustering and tracking similar events from different texts, event prediction, and event evolution. As one of the most important research themes in natural language processing and understanding, event extraction has wide applications in diverse domains and has been intensively researched for decades. This work targets scaling up the end-to-end event extraction task in three ways. First, scaling up the event labeling process to different languages and domains. We designed and implemented four approaches to accurately and efficiently produce multi-lingual labels for events. Using the approaches we developed, we were able to complete Arabic actor and verb dictionaries with coverage equivalent to English in less than two years of work, compared to two decades for English dictionary development. Second, scaling up event extraction by using document topic information in a topic-aware deep learning framework. We propose a domain-aware event extraction method that uses topic name embeddings to enrich the sentences' contextual representations, together with a multi-task setup of event extraction and topic classification. With the topic-aware model we developed, we were able to improve F1 by 1.8% on all event types, and F1 by 13.34% on few-shot event types. Third, scaling up event extraction by designing containerized and efficient pipelines, which researchers can comfortably adopt. The pipeline has a container-based architecture that adapts to the available systems and load to process text. With Kalman-filter-based batch size optimization, we were able to achieve a 20.33% improvement in processing time compared to a static batch size. Using the pipeline we developed, we were able to publish the largest machine-coded political event dataset, covering 1979 to 2016 (2TB, 300 million documents).

    Giving RSEs a Larger Stage through the Better Scientific Software Fellowship

    The Better Scientific Software Fellowship (BSSwF) was launched in 2018 to foster and promote practices, processes, and tools to improve developer productivity and software sustainability of scientific codes. BSSwF's vision is to grow the community with practitioners, leaders, mentors, and consultants to increase the visibility of scientific software production and sustainability. Over the last five years, many fellowship recipients and honorable mentions have identified as research software engineers (RSEs). This paper provides case studies from several of the program's participants to illustrate some of the diverse ways BSSwF has benefited both the RSE and scientific communities. In an environment where the contributions of RSEs are too often undervalued, we believe that programs such as BSSwF can be a valuable means to recognize and encourage community members to step outside of their regular commitments and expand on their work, collaborations and ideas for a larger audience. Comment: submitted to Computing in Science & Engineering (CiSE), Special Issue on the Future of Research Software Engineers in the U

    Introducing distributed dynamic data-intensive (D3) science: Understanding applications and infrastructure

    A common feature across many science and engineering applications is the amount and diversity of data and computation that must be integrated to yield insights. Data sets are growing larger and becoming distributed, and their location, availability and properties are often time-dependent. Collectively, these characteristics give rise to dynamic distributed data-intensive applications. While "static" data applications have received significant attention, the characteristics, requirements, and software systems for the analysis of large volumes of dynamic, distributed data, and data-intensive applications have received relatively less attention. This paper surveys several representative dynamic distributed data-intensive application scenarios, provides a common conceptual framework to understand them, and examines the infrastructure used in support of applications. Comment: 38 pages, 2 figures

    Risky Business: The Economic Risks of Climate Change in the United States

    The American economy could face significant and widespread disruptions from climate change unless U.S. businesses and policymakers take immediate action to reduce climate risk. This report summarizes findings of an independent assessment of the impact of climate change at the county, state, and regional level, and shows that communities, industries, and properties across the U.S. face profound risks from climate change. The findings also show that the most severe risks can still be avoided through early investments in resilience, and through immediate action to reduce the pollution that causes global warming. The Risky Business report shows that two of the primary impacts of climate change -- extreme heat and sea level rise -- will disproportionately affect certain regions of the U.S., and pose highly variable risks across the nation. In the U.S. Gulf Coast, Northeast, and Southeast, for example, sea level rise and increased damage from storm surge are likely to lead to an additional $2 to $3.5 billion in property losses each year by 2030, with escalating costs in future decades. In interior states in the Midwest and Southwest, extreme heat will threaten human health, reduce labor productivity and strain electricity grids. Conversely, in northern latitudes such as North Dakota and Montana, winter temperatures will likely rise, reducing frost events and cold-related deaths, and lengthening the growing season for some crops. The report is a product of The Risky Business Project, a joint, non-partisan initiative of former Treasury Secretary Henry M. Paulson, Jr., Mayor of New York City from 2002-2013 Michael R. Bloomberg, and Thomas P. Steyer, former Senior Managing Member of Farallon Capital Management. They were joined by members of a high-level "Risk Committee" who helped scope the research and reviewed the research findings.

    The Analysis of Big Data on Cites and Regions - Some Computational and Statistical Challenges

    Big Data on cities and regions bring new opportunities and challenges to data analysts and city planners. On the one side, they hold great promise to combine increasingly detailed data for each citizen with critical infrastructures to plan, govern and manage cities and regions, improve their sustainability, optimize processes and maximize the provision of public and private services. On the other side, the massive sample size and high dimensionality of Big Data and their geo-temporal character introduce unique computational and statistical challenges. This chapter provides an overview of the salient characteristics of Big Data and of how these features drive a paradigm change in data management and analysis, as well as in the computing environment. Series: Working Papers in Regional Science

    Analysis domain model for shared virtual environments

    The field of shared virtual environments, which also encompasses online games and social 3D environments, has a system landscape consisting of multiple solutions that share great functional overlap. However, there is little system interoperability between the different solutions. A shared virtual environment has an associated problem domain that is highly complex, raising difficult challenges for the development process, starting with the architectural design of the underlying system. This paper makes two main contributions. The first contribution is a broad domain analysis of shared virtual environments, which enables developers to have a better understanding of the whole rather than the part(s). The second contribution is a reference domain model for discussing and describing solutions - the Analysis Domain Model.