
    An Architecture to Support the Collection of Big Data in the Internet of Things

    The Internet of Things (IoT) relies on physical objects interconnected with each other, creating a mesh of devices that produce information. In this context, sensors surround our environment (e.g., cars, buildings, smartphones) and continuously collect data about our living environment. The IoT is thus a prototypical example of Big Data. The contribution of this paper is to define a software architecture supporting the collection of sensor-based data in the context of the IoT. The architecture spans from the physical dimension of sensors to the storage of data in a cloud-based system. It supports Big Data research efforts, as its instantiation assists a user in collecting data from the IoT for experimental or production purposes. The results are instantiated and validated on a project named SMARTCAMPUS, which aims to equip the SophiaTech campus with sensors to build innovative applications that support end users.
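    The abstract stops at the architectural description, so a small concrete example may help. Below is a minimal Python sketch of the sensor-to-cloud collection flow it outlines; all names (SensorReading, InMemoryCloudStore, collect) are hypothetical illustrations and are not taken from the paper.

```python
# Minimal sketch of a sensor-to-cloud collection pipeline, loosely
# inspired by the architecture described above. Class and function
# names are hypothetical, not the paper's.
import time
from dataclasses import dataclass, field

@dataclass
class SensorReading:
    sensor_id: str   # e.g. "smartcampus/temp-42"
    value: float     # raw measurement from the physical sensor
    timestamp: float = field(default_factory=time.time)

class InMemoryCloudStore:
    """Stand-in for a cloud-based storage backend."""
    def __init__(self):
        self.readings = []

    def save(self, reading: SensorReading) -> None:
        self.readings.append(reading)

def collect(sensor_id: str, read_value, store: InMemoryCloudStore) -> None:
    """Read one value from a sensor and push it to storage."""
    store.save(SensorReading(sensor_id, read_value()))

if __name__ == "__main__":
    store = InMemoryCloudStore()
    collect("smartcampus/temp-42", lambda: 21.5, store)
    print(store.readings)
```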

    Using Semantic Modeling to Improve the Processing Efficiency of Big Data in the Internet of Things Domain

    The article considers the specifics of the Big Data generated by Internet of Things technology and presents a methodology for processing it with semantic modeling (ontologies) at all stages of the Big Data life cycle. Semantic modeling makes it possible to resolve such technological contradictions as the heterogeneity of devices and data. The purpose of the article is to apply deep machine learning based on convolutional neural networks, since this model of machine learning suits the unstructured and complex nature of the IoT domain; machine learning is proposed for analyzing the Big Data created by a smart home information system. Results: the proposed approach increases the efficiency of IoT Big Data processing and differs from traditional processing systems in its use of a NoSQL database, distributed architectures, and semantic modeling. A conceptual architecture of a Big Data processing system for the IoT is presented, using the example of a NoSQL database generated for a Smart Home; the architecture consists of five independent layers, each of which can use semantic modeling.
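    The abstract names convolutional networks suited to unstructured IoT streams but gives no model details. The following is a minimal sketch, assuming a 1D CNN over synthetic smart-home sensor windows; the layer sizes, data shapes, and labels are illustrative assumptions, not the paper's model.

```python
# Minimal sketch of a 1D convolutional network for sensor time series,
# in the spirit of the CNN-based processing the abstract proposes.
import numpy as np
import tensorflow as tf

# Synthetic "smart home" sensor windows: 200 samples, 64 time steps,
# 1 channel, with a binary label (e.g. normal vs. anomalous).
x = np.random.rand(200, 64, 1).astype("float32")
y = np.random.randint(0, 2, size=(200,))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 1)),
    tf.keras.layers.Conv1D(16, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Conv1D(32, kernel_size=3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(x, y, epochs=2, batch_size=32, verbose=0)
```

    A 1D convolution is the natural analogue of an image CNN here, since IoT readings arrive as time series rather than pixel grids.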

    Software Development Support for Shared Sensing Infrastructures: A Generative and Dynamic Approach

    Sensor networks are the backbone of large sensing infrastructures such as Smart Cities or Smart Buildings. Classical approaches suffer from several limitations that hamper developers' work (e.g., lack of sensor sharing, lack of dynamicity in data collection policies, the need to dig inside big data sets, absence of reuse between implementation platforms). This paper presents a tooled approach that tackles these issues. It couples (i) an abstract model of developers' requirements in a given infrastructure with (ii) timed automata and code-generation techniques, to support the efficient deployment of reusable data collection policies on different infrastructures. The approach has been validated on several real-world scenarios and is currently being experimented with on an academic campus.
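    The paper's policies are expressed as timed automata and then generated for each target platform; as a rough, much-simplified illustration of a reusable, time-driven collection policy, here is a toy Python sketch. The Policy class and its fields are hypothetical, not the paper's model.

```python
# Toy analogue of a reusable, periodic data collection policy: the
# abstract policy is platform-independent; a generator back-end would
# translate it for each concrete infrastructure.
from dataclasses import dataclass

@dataclass
class Policy:
    sensor: str        # logical sensor name, shared across platforms
    period_s: float    # sampling period in seconds
    duration_s: float  # total collection window

    def schedule(self):
        """Yield the time offsets at which the sensor is sampled."""
        t = 0.0
        while t < self.duration_s:
            yield t
            t += self.period_s

# Here we simply print the sampling plan instead of generating code.
for t in Policy(sensor="campus/room-12/temperature",
                period_s=60.0, duration_s=300.0).schedule():
    print(f"sample at t={t:.0f}s")
```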

    A Secure Mechanism for Big Data Collection in Large Scale Internet of Vehicle

    As an extension of the Internet of Things (IoT), the Internet of Vehicles (IoV) achieves unified management in the smart transportation area. With the development of the IoV, an increasing number of vehicles are connected to the network. A large-scale IoV collects data from different places and with various attributes, which conforms to the heterogeneous nature of big data in size, volume, and dimensionality. Big data collection between vehicles and the application platform becomes more and more frequent through various communication technologies, which invites evolving security attacks. However, the existing protocols in the IoT cannot be directly applied to big data collection in a large-scale IoV. The dynamic network structure and the growing number of vehicle nodes increase both the complexity and the necessity of a secure mechanism. In this paper, a secure mechanism for big data collection in a large-scale IoV is proposed for improved security performance and efficiency. To begin with, vehicles need to register with the big data center to connect to the network. Afterwards, vehicles associate with the big data center via mutual authentication and a single sign-on algorithm. Two different secure protocols are proposed for business data and confidential data collection. The collected big data is stored securely using distributed storage. The discussion and performance evaluation results show the security and efficiency of the proposed secure mechanism.
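    The abstract does not disclose the concrete protocol, so the sketch below shows only a generic textbook pattern for mutual authentication: HMAC-based challenge-response over a key established at registration. It should not be read as the paper's actual scheme.

```python
# Generic mutual challenge-response authentication between a "vehicle"
# and a "big data center", assuming a pre-shared key from registration.
import hmac, hashlib, secrets

def respond(key: bytes, challenge: bytes) -> bytes:
    return hmac.new(key, challenge, hashlib.sha256).digest()

# Registration establishes a shared key between vehicle and center.
shared_key = secrets.token_bytes(32)

# Center authenticates the vehicle.
c1 = secrets.token_bytes(16)              # center's fresh challenge
vehicle_proof = respond(shared_key, c1)   # computed on the vehicle
assert hmac.compare_digest(vehicle_proof, respond(shared_key, c1))

# Vehicle authenticates the center with its own, distinct challenge.
c2 = secrets.token_bytes(16)
center_proof = respond(shared_key, c2)
assert hmac.compare_digest(center_proof, respond(shared_key, c2))
print("mutual authentication succeeded")
```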

    Using Hadoop to Support Big Data Analysis: Design and Performance Characteristics

    Today, the amount of data generated is extremely large and is growing faster than computational speeds can keep up with. Therefore, the traditional approach of using a single machine to store and process data is no longer adequate and can take a huge amount of time. As a result, we need a better way to process data, such as distributing it over large computing clusters. Hadoop is a framework that allows the distributed processing of large data sets. Hadoop is an open source application available under the Apache License. It is designed to scale up from a single server to thousands of machines, where each machine can perform computations locally and store the results. The literature indicates that processing Big Data in a reasonable time frame can be a challenging task. One of the most promising platforms is the concept of Exascale computing. This paper created a testbed based on recommendations for Big Data within the Exascale architecture. The testbed featured three nodes running the Hadoop Distributed File System. Data from Twitter logs was stored both in the Hadoop file system and in a traditional MySQL database. The Hadoop file system consistently outperformed the MySQL database. Further research will use larger data sets and more complex queries to truly assess the capabilities of distributed file systems. This research also addresses optimizing the number of processing nodes and the intercommunication paths in the underlying infrastructure of the distributed file system. As HIVE.apache.org states, the Apache HIVE data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. The paper ends with an explanation of how to install and launch Hadoop and HIVE, how to configure the rules in a Hadoop ecosystem, and a few use cases to check performance.
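    As a small illustration of the SQL-over-HDFS access the abstract describes, here is a sketch of querying Hive from Python with the pyhive client. It assumes a reachable HiveServer2 on the default port and a hypothetical `tweets` table; neither is specified by the paper.

```python
# Minimal sketch: run a HiveQL query against a local HiveServer2.
# Requires: pip install 'pyhive[hive]'
from pyhive import hive

conn = hive.Connection(host="localhost", port=10000)
cursor = conn.cursor()
# HiveQL is close enough to SQL that the same query could run against
# MySQL, enabling the kind of side-by-side comparison the testbed made.
cursor.execute("SELECT user_id, COUNT(*) AS n FROM tweets "
               "GROUP BY user_id ORDER BY n DESC LIMIT 10")
for user_id, n in cursor.fetchall():
    print(user_id, n)
conn.close()
```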

    Machine-to-machine emergency system for urban safety

    Nowadays most people live in urban areas. As populations grow, demand on the city ecosystem increases, directly affecting the entities responsible for city control. Challenges like this make leaders adopt ways to engage with the surroundings of their city, making them more prepared and aware. The decisions they make not only directly affect the city in the short term, but are also a means to improve the decision-making process. This work aimed to develop a system which can act as an emergency and security supervisor in a city, generating alerts to empower the entities responsible for disaster management. The system is capable of monitoring data from sensors and deriving useful knowledge from it. This work presents an architecture for the collection of data in the Internet of Things (IoT). It delivers an analysis of the tools used and the choices made regarding the implemented system. It also provides the necessary inputs for developers to participate in the project, since it describes all the techniques, languages, strategies, and programming paradigms used. Finally, it describes the prototype that receives data and processes it to generate alerts with the purpose of warning emergency response teams, and the future implementation of a prediction module that can act as a useful tool to better manage emergency personnel. The completion of the internship allowed the learning of new concepts and techniques, as well as the deepening of those that were already familiar. With regard to the company, the developed system will integrate with the company's Citibrain platform and will act as a central point to which every application (e.g., water management, waste management) can subscribe to receive alerts.
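    The alert-generation step lends itself to a tiny example. Below is a minimal, hypothetical Python sketch of threshold-based alerting over sensor readings; the sensor names and thresholds are illustrative assumptions, not values from the project.

```python
# Toy version of the emergency-alert processing described above:
# compare incoming readings against per-sensor thresholds.
THRESHOLDS = {"air/co_ppm": 35.0, "river/level_m": 4.5}

def check(readings: dict[str, float]) -> list[str]:
    """Return one alert message per reading that exceeds its threshold."""
    return [
        f"ALERT {sensor}: {value} exceeds {THRESHOLDS[sensor]}"
        for sensor, value in readings.items()
        if sensor in THRESHOLDS and value > THRESHOLDS[sensor]
    ]

# A subscribed application (e.g. water management) would receive these.
for alert in check({"air/co_ppm": 12.0, "river/level_m": 5.1}):
    print(alert)
```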

    Machine Learning for Internet of Things Data Analysis: A Survey

    Rapid developments in hardware, software, and communication technologies have facilitated the emergence of Internet-connected sensory devices that provide observations and data measurements from the physical world. By 2020, it is estimated that the total number of Internet-connected devices being used will be between 25 and 50 billion. As these numbers grow and technologies become more mature, the volume of data being published will increase. The technology of Internet-connected devices, referred to as the Internet of Things (IoT), continues to extend the current Internet by providing connectivity and interactions between the physical and cyber worlds. In addition to an increased volume, the IoT generates big data characterized by its velocity in terms of time and location dependency, with a variety of multiple modalities and varying data quality. Intelligent processing and analysis of this big data are the key to developing smart IoT applications. This article assesses the various machine learning methods that deal with the challenges presented by IoT data by considering smart cities as the main use case. The key contribution of this study is the presentation of a taxonomy of machine learning algorithms explaining how different techniques are applied to the data in order to extract higher-level information. The potential and challenges of machine learning for IoT data analytics are also discussed. A use case of applying a Support Vector Machine (SVM) to Aarhus smart city traffic data is presented for a more detailed exploration.
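    For readers unfamiliar with the SVM use case, here is a minimal scikit-learn sketch of the same kind of classification. The Aarhus data is not bundled here, so synthetic features (vehicle count, average speed) and a synthetic congestion label stand in; they are assumptions, not the survey's data.

```python
# Minimal SVM classification sketch in the spirit of the survey's
# smart-city traffic use case, on synthetic data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))               # [vehicle_count, avg_speed]
y = (X[:, 0] - X[:, 1] > 0).astype(int)     # synthetic "congested" label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```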

    Designing & Implementing a Java Web Application to Interact with Data Stored in a Distributed File System

    Every day there is an exponential increase in information, and this data must be stored and analyzed. Traditional data warehousing solutions are expensive. Apache Hadoop is a popular open source data store which implements MapReduce concepts to create a distributed database architecture. In this paper, a performance analysis project was devised that compares Apache Hive, which is built on top of Apache Hadoop, with a traditional database such as MySQL. Hive supports the Hive Query Language (HiveQL), a SQL-like directive language whose statements are implemented as MapReduce jobs. These jobs can then be executed using Hadoop. Hive also has a system catalog, the Metastore, which is used to index data components. The Hadoop framework was extended to include a duplication detection system which helps manage multiple copies of the same data at the file level. The JavaServer Pages and Java Servlet frameworks were used to build a Java web application that provides a web interface for clients to access and analyze large data sets present in Apache Hive or MySQL databases.
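    The abstract mentions file-level duplication detection without describing its mechanism; one plausible core is content hashing, sketched below in Python. The hashing choice and directory-walk approach are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of file-level duplicate detection by content hashing:
# files with identical SHA-256 digests are grouped as duplicates.
import hashlib
from pathlib import Path

def file_digest(path: Path, chunk: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def find_duplicates(root: str) -> dict[str, list[Path]]:
    """Group files under `root` by content hash; groups >1 are duplicates."""
    groups: dict[str, list[Path]] = {}
    for p in Path(root).rglob("*"):
        if p.is_file():
            groups.setdefault(file_digest(p), []).append(p)
    return {h: ps for h, ps in groups.items() if len(ps) > 1}

if __name__ == "__main__":
    for digest, paths in find_duplicates(".").items():
        print(digest[:12], [str(p) for p in paths])
```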