140 research outputs found

    A Data-driven Methodology Towards Mobility- and Traffic-related Big Spatiotemporal Data Frameworks

    The human population is increasing at unprecedented rates, particularly in urban areas. This increase, along with the rise of a more economically empowered middle class, brings new and complex challenges to the mobility of people within urban areas. To tackle such challenges, transportation and mobility authorities and operators are trying to adopt innovative Big Data-driven Mobility- and Traffic-related solutions. Such solutions will help decision-making processes that aim to ease the load on an already overloaded transport infrastructure. The information collected from day-to-day mobility and traffic can help to mitigate some of these mobility challenges in urban areas. Road infrastructure and traffic management operators (RITMOs) face several limitations in effectively extracting value from the exponentially growing volumes of mobility- and traffic-related Big Spatiotemporal Data (MobiTrafficBD) that are being acquired and gathered. Research on Big Data, Spatiotemporal Data and especially MobiTrafficBD is scattered, and the existing literature does not offer a concrete, common methodological approach to set up, configure, deploy and use a complete Big Data-based framework that manages the lifecycle of mobility-related spatiotemporal data, mainly focused on geo-referenced time series (GRTS) and spatiotemporal events (ST Events), extracts value from it and supports the decision-making processes of RITMOs. This doctoral thesis proposes a data-driven, prescriptive methodological approach towards the design, development and deployment of MobiTrafficBD Frameworks focused on GRTS and ST Events. 
Besides a thorough literature review on Spatiotemporal Data, Big Data and the merging of these two fields through MobiTrafficBD, the methodological approach comprises a set of general characteristics, technical requirements, logical components, data flows and technological infrastructure models, as well as guidelines and best practices that aim to guide researchers, practitioners and stakeholders, such as RITMOs, throughout the design, development and deployment phases of any MobiTrafficBD Framework. This work is intended to be a supporting methodological guide, based on widely used Reference Architectures and guidelines for Big Data, but enriched with inherent characteristics and concerns brought about by Big Spatiotemporal Data, such as in the case of GRTS and ST Events. The proposed methodology was evaluated and demonstrated in various real-world use cases that deployed MobiTrafficBD-based Data Management, Processing, Analytics and Visualisation methods, tools and technologies, under the umbrella of several research projects funded by the European Commission and the Portuguese Government.

    Generic Object Detection and Segmentation for Real-World Environments


    Leveraging Spatiotemporal Relationships of High-frequency Activation in Human Electrocorticographic Recordings for Speech Brain-Computer-Interface

    Speech production is one of the most intricate yet natural human behaviors and is most keenly appreciated when it becomes difficult or impossible, as is the case for patients suffering from locked-in syndrome. Burgeoning understanding of the various cortical representations of language has brought into question the viability of a speech neuroprosthesis using implanted electrodes. The temporal resolution of intracranial electrophysiological recordings, frequently billed as a great asset of electrocorticography (ECoG), has actually been a hindrance, as speech decoders have struggled to take advantage of this timing information. There have been few demonstrations of how well a speech neuroprosthesis will realistically generalize across contexts when constructed using causal feature extraction and language models that can be applied and adapted in real time. The research detailed in this dissertation aims primarily to characterize the spatiotemporal relationships of high frequency activity across ECoG arrays during word production. Once identified, these relationships map to motor and semantic representations of speech through the use of algorithms and classifiers that rapidly quantify these relationships in single trials. The primary hypothesis put forward by this dissertation is that the onset, duration and temporal profile of high frequency activity in ECoG recordings are useful features for speech decoding. These features have rarely been used in state-of-the-art speech decoders, which tend to produce output from instantaneous high frequency power across cortical sites, or rely upon precise behavioral time-locking to take advantage of high frequency activity at several time-points relative to behavioral onset times. This hypothesis was examined in three separate studies. First, software was created that rapidly characterizes spatiotemporal relationships of neural features. 
Second, semantic representations of speech were examined using these spatiotemporal features. Finally, utterances were discriminated in single trials with low latency and high accuracy using spatiotemporal matched filters in a neural keyword-spotting paradigm. Outcomes from this dissertation inform implant placement for a human speech prosthesis and provide the scientific and methodological basis to motivate further research of an implant specifically for speech-based brain-computer interfaces.

    Ontology-based context-aware model for event processing in an IoT environment

    The Internet of Things (IoT) is increasingly becoming one of the fundamental sources of data. The observations produced by these sources are made accessible with heterogeneous vocabularies, models and data formats. The heterogeneity factor in such an enormous environment complicates the task of sharing and reusing this data in a more intelligent way (beyond the purposes it was initially set up for). In this research, we investigate these challenges, considering how we can transform raw sensor data into more meaningful information. This raw data will be modelled using ontology-based information that is accessible through continuous queries for sensor streaming data. Interoperability among heterogeneous entities is an important issue in an IoT environment. Semantic modelling is a key element to support interoperability. Most of the current ontologies for IoT mainly focus on resources and services information. This research builds upon the current state-of-the-art ontologies to provide contextual information and facilitate sensor data querying. In this research, we present an Ontology to represent an IoT environment, with emphasis on temporal and geospatial context enrichment. Furthermore, the Ontology is used alongside a proposed syntax based on Description Logic to build an Event Processing Model. The aim of this model is to interconnect ontology-based reasoning with event processing. This model enables event processing over high-level ontological concepts. The Ontology was developed using the NeOn methodology, which emphasises reuse and modularisation. The Competency Questions technique was used to develop the requirements of this Ontology. This was later evaluated by domain experts in software engineering and cloud computing. The ontology was evaluated based on its completeness, conciseness, consistency and expandability; over 70% of the domain experts agreed on the core modules, concepts and relationships within the ontology. 
The resulting Ontology provides a core IoT ontology that could be used for further development within a specific IoT domain. The proposed Ontology-Based Context-Aware model for Event-Processing in an IoT environment, “OCEM-IoT”, implements all the time operators used in complex event processing engines. Throughput and latency were used as performance comparison metrics for the syntax evaluation; the results obtained show an improved performance over existing event processing languages.

    Low-latency, query-driven analytics over voluminous multidimensional, spatiotemporal datasets

    2017 Summer. Includes bibliographical references. Ubiquitous data collection from sources such as remote sensing equipment, networked observational devices, location-based services, and sales tracking has led to the accumulation of voluminous datasets; IDC projects that by 2020 we will generate 40 zettabytes of data per year, while Gartner and ABI estimate 20-35 billion new devices will be connected to the Internet in the same time frame. The storage and processing requirements of these datasets far exceed the capabilities of modern computing hardware, which has led to the development of distributed storage frameworks that can scale out by assimilating more computing resources as necessary. While challenging in its own right, storing and managing voluminous datasets is only the precursor to a broader field of study: extracting knowledge, insights, and relationships from the underlying datasets. The basic building block of this knowledge discovery process is analytic queries, encompassing both query instrumentation and evaluation. This dissertation is centered around query-driven exploratory and predictive analytics over voluminous, multidimensional datasets. Both of these types of analysis represent a higher-level abstraction over classical query models; rather than indexing every discrete value for subsequent retrieval, our framework autonomously learns the relationships and interactions between dimensions in the dataset (including time series and geospatial aspects), and makes the information readily available to users. This functionality includes statistical synopses, correlation analysis, hypothesis testing, probabilistic structures, and predictive models that not only enable the discovery of nuanced relationships between dimensions, but also allow future events and trends to be predicted. 
This requires specialized data structures and partitioning algorithms, along with adaptive reductions in the search space and management of the inherent trade-off between timeliness and accuracy. The algorithms presented in this dissertation were evaluated empirically on real-world geospatial time-series datasets in a production environment, and are broadly applicable across other storage frameworks.

    An Analysis of the Allergy Comments on Twitter Using Data Mining Approach

    Allergies are one of the most common chronic illnesses in the world. The prevalence of social media allows people to express their opinions and exchange information, including symptoms of personal health. Mining publicly accessible health-related data on social media, such as Twitter, offers a unique approach to obtaining valuable healthcare insights. In this paper, a multi-component data mining framework was developed to collect Twitter data, detect time series patterns, discover topics of interest about allergies, and analyze the contents of tweets. From the 2.2 million tweets extracted in 2019, my experimental results show that allergy-related tweet volume is strongly correlated with the pollen data (r = .699, p < .01). Also, 152 unique topics are identified with a -28.36 perplexity score and a .67 coherence score. Furthermore, many linguistic dimensions, such as sentiment, are analyzed to learn about the tweet contents. I consider this to be one of the many studies examining a large-scale social media stream to deeply analyze allergy activities. With the growth of social media, publicly available data such as Twitter posts can be used to support healthcare practitioners and social scientists in better understanding common public opinions, not just about allergies. Master of Science
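    The correlation reported in this abstract (r = .699) is a Pearson coefficient between two time series. As a minimal sketch of how such a coefficient is computed, assuming two equally long series of daily values, the function below uses only the Python standard library. The function name `pearson_r` and the sample numbers are illustrative assumptions and are not taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up daily values, for illustration only (not data from the study).
tweet_volume = [120, 135, 180, 240, 210, 160, 130]
pollen_index = [3.1, 3.4, 5.0, 7.2, 6.1, 4.8, 3.0]
print(round(pearson_r(tweet_volume, pollen_index), 3))
```

    A significance level such as the reported p < .01 would need an additional test on the coefficient, which this sketch does not cover.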

    Analyzing Granger causality in climate data with time series classification methods

    Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.
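    The traditional autoregressive form of Granger causality mentioned in this abstract can be sketched as a comparison of two least-squares models: one predicts a series from its own past, the other also uses the past of a candidate cause. The sketch below is not the article's method. It assumes a single lag, uses only the Python standard library, and all names (`solve`, `rss`, `granger_f`) and the synthetic data are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def rss(rows, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, target))


def granger_f(cause, effect):
    """F statistic for 'cause Granger-causes effect', assuming one lag."""
    target = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    unrestricted = [[1.0, effect[t - 1], cause[t - 1]]
                    for t in range(1, len(effect))]
    rss_r = rss(restricted, target)
    rss_u = rss(unrestricted, target)
    dof = len(target) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)


# Synthetic example: y depends on the previous value of x, not the reverse.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0]
for t in range(1, 200):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1))
print(granger_f(x, y), granger_f(y, x))
```

    A large F statistic in one direction and a small one in the other suggests a one-way predictive relationship; real attribution work would also choose the lag order and test significance against the F distribution.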