3,380 research outputs found

    A schema-based P2P network to enable publish-subscribe for multimedia content in open hypermedia systems

    No full text
    Open Hypermedia Systems (OHS) aim to provide efficient dissemination, adaptation, and integration of hyperlinked multimedia resources. Content available in Peer-to-Peer (P2P) networks could add significant value to OHS, provided that the challenges of efficient discovery and prompt delivery of rich, up-to-date content are successfully addressed. This paper proposes an architecture that enables the operation of OHS over a P2P overlay network of OHS servers, based on semantic annotation of (a) peer OHS servers and (b) the multimedia resources that can be obtained through the link services of the OHS. The architecture provides efficient resource discovery. Semantic query-based subscriptions over this P2P network can enable access to up-to-date content, while caching at certain peers enables prompt delivery of multimedia content. Advanced query resolution techniques are employed to match different parts of subscription queries (subqueries). These subscriptions can be shared among different interested peers, thus increasing the efficiency of multimedia content dissemination.
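The subquery-matching and subscription-sharing idea in this abstract can be made concrete with a small example. The following Python sketch is purely illustrative and not taken from the paper: it decomposes each subscription into single-predicate subqueries so that overlapping subscriptions share matching work; the `SubscriptionIndex` class, the attribute names, and the peer identifiers are all hypothetical.

```python
from collections import defaultdict

class SubscriptionIndex:
    """Index that shares single-predicate subqueries across subscriptions."""

    def __init__(self):
        self.subquery_to_subs = defaultdict(set)  # (attr, value) -> subscriber ids
        self.sub_sizes = {}                       # subscriber id -> #predicates

    def subscribe(self, sub_id, predicates):
        # Register a subscription as a conjunction of shared subqueries.
        self.sub_sizes[sub_id] = len(predicates)
        for predicate in predicates:
            self.subquery_to_subs[predicate].add(sub_id)

    def match(self, annotation):
        # Count satisfied subqueries per subscriber; a subscription fires
        # only when all of its predicates are satisfied by the annotation.
        hits = defaultdict(int)
        for predicate in annotation.items():
            for sub_id in self.subquery_to_subs.get(predicate, ()):
                hits[sub_id] += 1
        return [s for s, n in hits.items() if n == self.sub_sizes[s]]

index = SubscriptionIndex()
index.subscribe("peer-A", [("genre", "documentary"), ("format", "video")])
index.subscribe("peer-B", [("format", "video")])  # shares a subquery with peer-A
print(index.match({"genre": "documentary", "format": "video"}))
# -> ['peer-A', 'peer-B']
```

Because peer-B's only subquery is also one of peer-A's, a single index entry serves both subscriptions, which is the sharing effect the abstract credits for more efficient dissemination.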

    Forecasting in Database Systems

    Get PDF
    Time series forecasting is a fundamental prerequisite for decision-making processes and crucial in a number of domains such as production planning and energy load balancing. In the past, forecasting was often performed by statistical experts in dedicated software environments outside of current database systems. However, forecasts are increasingly required by non-expert users or have to be computed fully automatically without any human intervention. Furthermore, we can observe an ever-increasing data volume and the need for accurate and timely forecasts over large multi-dimensional data sets. As most data subject to analysis is stored in database management systems, a rising trend addresses the integration of forecasting inside a DBMS. Yet, many existing approaches follow a black-box style and try to keep changes to the database system as minimal as possible. While such approaches are more general and easier to realize, they miss significant opportunities for improved performance and usability. In this thesis, we introduce a novel approach that seamlessly integrates time series forecasting into a traditional database management system. In contrast to flash-back queries, which allow a view on the data in the past, we have developed a Flash-Forward Database System (F2DB) that provides a view on the data in the future. It supports a new query type, the forecast query, which enables forecasting of time series data and is automatically and transparently processed by the core engine of an existing DBMS. We discuss necessary extensions to the parser, optimizer, and executor of a traditional DBMS. We furthermore introduce various optimization techniques for three different types of forecast queries: ad-hoc queries, recurring queries, and continuous queries. First, we ease the expensive model-creation step of ad-hoc forecast queries by reducing the amount of processed data with traditional sampling techniques. Second, we decrease the runtime of recurring forecast queries by materializing models in a specialized index structure. However, a large number of time series as well as high model creation and maintenance costs require a careful selection of such models. Therefore, we propose a model configuration advisor that determines a set of forecast models for a given query workload and multi-dimensional data set. Finally, we extend forecast queries with continuous aspects, allowing an application to register a query once at our system. As new time series values arrive, we send notifications to the application based on predefined time and accuracy constraints. All of our optimization approaches aim to increase the efficiency of forecast queries while ensuring high forecast accuracy.
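The gain from materializing models for recurring forecast queries can be illustrated with a minimal sketch. The model cache, the simple exponential smoothing stand-in, and the `forecast_query` entry point below are assumptions for illustration only; they are not F2DB's actual index structure or model classes.

```python
class ExpSmoothingModel:
    """Simple exponential smoothing, standing in for a forecast model."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.level = None

    def fit(self, series):
        for x in series:
            self.update(x)
        return self

    def update(self, x):
        # Incremental maintenance: far cheaper than refitting from scratch.
        self.level = x if self.level is None else self.alpha * x + (1 - self.alpha) * self.level

    def forecast(self, horizon):
        return [self.level] * horizon  # flat forecast from the current level

model_index = {}  # materialized models, keyed by time series identifier

def forecast_query(series_id, history, horizon):
    """Answer a forecast query, reusing a materialized model if present."""
    model = model_index.get(series_id)
    if model is None:                      # ad-hoc case: fit and materialize
        model = model_index[series_id] = ExpSmoothingModel().fit(history)
    return model.forecast(horizon)

print(forecast_query("energy_load", [10.0, 12.0, 11.0], horizon=3))
```

A recurring query over the hypothetical `energy_load` series would hit the materialized model on its second invocation and skip the expensive fitting step, while incremental `update` calls keep the model current as new values arrive, mirroring the model creation and maintenance costs the thesis weighs when selecting which models to materialize.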

    Top-k spatial-keyword publish/subscribe over sliding window

    Full text link
    © 2017, Springer-Verlag Berlin Heidelberg. With the prevalence of social media and GPS-enabled devices, a massive amount of geo-textual data is generated in a streaming fashion, enabling a variety of applications such as location-based recommendation and information dissemination. In this paper, we investigate a novel real-time top-k monitoring problem over a sliding window of streaming data; that is, we continuously maintain the top-k most relevant geo-textual messages (e.g., geo-tagged tweets) for a large number of spatial-keyword subscriptions (e.g., registered users interested in local events) simultaneously. To provide the most recent information under controllable memory cost, a sliding window model is employed on the streaming geo-textual data. To the best of our knowledge, this is the first work to study top-k spatial-keyword publish/subscribe over a sliding window. A novel centralized system, called Skype (Top-k Spatial-keyword Publish/Subscribe), is proposed in this paper. In Skype, to continuously maintain top-k results for massive subscriptions, we devise a novel indexing structure upon subscriptions such that each incoming message can be delivered immediately on its arrival. To reduce the expensive top-k re-evaluation cost triggered by message expiration, we develop a novel cost-based k-skyband technique that reduces the number of re-evaluations in a cost-effective way. Extensive experiments verify the efficiency and effectiveness of our proposed techniques. Furthermore, to support better scalability and higher throughput, we propose a distributed version of Skype, namely DSkype, on top of Storm, a popular distributed stream processing system. With the help of fine-tuned subscription/message distribution mechanisms, DSkype can achieve orders-of-magnitude speed-up over its centralized version.
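The k-skyband idea behind the reduced re-evaluation cost admits a compact sketch. In a count-based sliding window, newer messages always expire later, so a message can still enter some future top-k result only while fewer than k later arrivals score at least as high. The Python sketch below assumes a count-based window and a scalar relevance score; it illustrates the general k-skyband principle, not Skype's cost-based variant.

```python
from collections import deque

def skyband_insert(skyband, arrival, score, k, window):
    """Maintain a k-skyband of [arrival, score, dominators] over a
    count-based sliding window, oldest entries first."""
    # Expire messages that slid out of the window.
    while skyband and skyband[0][0] <= arrival - window:
        skyband.popleft()
    # The new message expires last, so it dominates every older message
    # whose score is no higher than its own.
    survivors = deque()
    for entry in skyband:
        if entry[1] <= score:
            entry[2] += 1
        if entry[2] < k:          # still fewer than k dominators: keep it
            survivors.append(entry)
    survivors.append([arrival, score, 0])
    return survivors

def top_k(skyband, k):
    return sorted(skyband, key=lambda e: -e[1])[:k]

# Usage: k = 2 over a window of the 4 most recent messages.
sb = deque()
for t, s in enumerate([0.9, 0.4, 0.7, 0.8, 0.3]):
    sb = skyband_insert(sb, t, s, k=2, window=4)
print([(t, s) for t, s, _ in top_k(sb, 2)])   # -> [(3, 0.8), (2, 0.7)]
```

A message evicted with k or more dominators can never re-enter the top-k, so when a result expires the re-evaluation only has to look inside the maintained skyband rather than re-scan the whole window.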

    When Things Matter: A Data-Centric View of the Internet of Things

    Full text link
    With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, typically produced in dynamic and volatile environments, which is not only extremely large in scale and volume but also noisy and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from a data-centric perspective, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed.
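Of the techniques this survey covers, complex event processing is perhaps the easiest to illustrate in a few lines. The sketch below, with hypothetical event types and thresholds, detects a composite event (a temperature spike followed by a humidity drop within a time window) over the kind of continuous, noisy sensor stream the article describes.

```python
def detect_composite(stream, window=10.0):
    """Yield (spike_time, drop_time) whenever a humidity drop follows
    a temperature spike within `window` seconds."""
    pending_spikes = []
    for ts, kind, value in stream:         # stream of (time, type, value)
        if kind == "temp" and value > 30.0:
            pending_spikes.append(ts)
        elif kind == "humidity" and value < 20.0:
            # Match only against spikes still inside the window.
            pending_spikes = [t for t in pending_spikes if ts - t <= window]
            for t in pending_spikes:
                yield (t, ts)
            pending_spikes.clear()

events = [(0.0, "temp", 31.5), (3.0, "humidity", 45.0),
          (5.0, "humidity", 15.0), (20.0, "humidity", 10.0)]
print(list(detect_composite(iter(events))))   # -> [(0.0, 5.0)]
```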

    Distributed information extraction from large-scale wireless sensor networks

    Get PDF

    DeMMon Decentralized Management and Monitoring Framework

    Get PDF
    The centralized model proposed by the Cloud computing paradigm mismatches the decentralized nature of mobile and IoT applications, given that most data production and consumption is performed by end-user devices outside of the Data Center (DC). As the number of these devices grows, and given the need to transport data to and from DCs for computation, application providers incur additional infrastructure costs and end-users incur delays when performing operations. These reasons have led us into a post-cloud era, where a new computing paradigm arose: Edge Computing. Edge Computing takes into account the broad spectrum of devices residing outside of the DC, closer to the clients, as potential targets for computation, potentially reducing infrastructure costs, improving the quality of service (QoS) for end-users, and allowing new interaction paradigms between users and applications. Managing and monitoring the execution of these devices raises new challenges previously unaddressed by Cloud computing, given the scale of these systems and the devices' (potentially) unreliable data connections and heterogeneous computational power. The study of the state of the art has revealed that existing resource monitoring and management solutions require manual configuration and have centralized components, which we believe do not scale for larger systems. In this work, we address these limitations by presenting a novel Decentralized Management and Monitoring ("DeMMon") system, targeted at edge settings. DeMMon provides primitives to ease the development of tools that manage the computational resources supporting edge-enabled applications, decomposed into components, through decentralized actions, taking advantage of partial knowledge of the system. Our solution was evaluated, with respect to its information dissemination and monitoring capabilities, across a set of realistic emulated scenarios of up to 750 nodes with variable failure rates. The results show the validity of our approach and that it can outperform state-of-the-art solutions in scalability and reliability.
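The decentralized dissemination DeMMon relies on is commonly realized with gossip-style protocols. The sketch below is an assumption-laden illustration of that general style, not DeMMon's actual protocol: each node periodically pushes its partial view of monitoring state to a few random peers, and views converge without any central component. The fan-out, the merge rule, and the round-stamp "metric" are all hypothetical.

```python
import random

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.view = {node_id: 0}   # node -> freshest monitoring round seen

    def gossip_round(self, peers, fanout=2, rnd=0):
        self.view[self.node_id] = rnd          # refresh our own entry
        # Push our partial view to a few random peers (epidemic spread).
        for peer in random.sample(peers, min(fanout, len(peers))):
            peer.merge(self.view)

    def merge(self, remote_view):
        # Keep the freshest value per node: partial, eventually
        # consistent knowledge of the whole system.
        for node_id, stamp in remote_view.items():
            if stamp > self.view.get(node_id, -1):
                self.view[node_id] = stamp

nodes = [Node(i) for i in range(20)]
for rnd in range(1, 6):                        # a few gossip rounds
    for n in nodes:
        others = [p for p in nodes if p is not n]
        n.gossip_round(others, rnd=rnd)
print(sum(len(n.view) for n in nodes) / len(nodes))  # average coverage
```

With a fan-out of 2, coverage grows roughly exponentially per round, which is why gossip scales to node counts like the 750-node scenarios in the evaluation without any node holding global state.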