1,024 research outputs found

    XML content warehousing: Improving sociological studies of mailing lists and web data

    In this paper, we present the guidelines for an XML-based approach for the sociological study of Web data such as the analysis of mailing lists or databases available online. The use of an XML warehouse is a flexible solution for storing and processing this kind of data. We propose an implemented solution and show possible applications with our case study of profiles of experts involved in W3C standard-setting activity. We illustrate the sociological use of semi-structured databases by presenting our XML Schema for mailing-list warehousing. An XML Schema allows many additions or crossings of data sources, without modifying existing data sets, while allowing possible structural evolution. We also show that the existence of hidden data implies increased complexity for traditional SQL users. XML content warehousing allows both exhaustive warehousing and recursive queries through contents, with far less dependence on the initial storage. We finally present the possibility of exporting the data stored in the warehouse to commonly used advanced software devoted to sociological analysis.

    Challenging Issues of Spatio-Temporal Data Mining

    The spatio-temporal database (STDB) has received considerable attention during the past few years, due to the emergence of numerous applications (e.g., flight control systems, weather forecasting, mobile computing, etc.) that demand efficient management of moving objects. These applications record objects' geographical locations (sometimes also shapes) at various timestamps and support queries that explore their historical and future (predictive) behaviors. The STDB significantly extends the traditional spatial database, which deals with only stationary data and hence is inapplicable to moving objects, whose dynamic behavior requires re-investigation of numerous topics including data modeling, indexes, and the related query algorithms. In many application areas, huge amounts of data are generated, explicitly or implicitly containing spatial or spatio-temporal information. However, the ability to analyze these data remains inadequate, and the need for adapted data mining tools has become a major challenge. In this paper, we present the challenging issues of spatio-temporal data mining. Keywords: database, data mining, spatial, temporal, spatio-temporal

    A UML Profile for Variety and Variability Awareness in Multidimensional Design: An application to Agricultural Robots

    Variety and variability are an inherent source of information wealth in schemaless sources, and executing OLAP sessions on multidimensional data in their presence has recently become an object of research. However, all models devised so far propose a "rigid" view of the multidimensional content, without taking into account variety and variability. To fill this gap, in this paper we propose V-ICSOLAP, an extension of the ICSOLAP UML profile that supports extensibility and type/name variability for each multidimensional element, as well as complex data types for measures and levels. The real case study we use to motivate and illustrate our approach is that of trajectory analysis for agricultural robots. As a proof of concept for V-ICSOLAP, we propose an implementation that relies on the PostgreSQL multi-model DBMS and we evaluate its performance. We also provide a validation of our UML profile by ranking it against other meta-models based on a set of quality metrics.

    Efficient AIS Data Processing for Environmentally Safe Shipping

    Reducing ship accidents at sea is important to all economic, environmental, and cultural sectors of Greece. Despite an increase in traffic and national monitoring, ships formulate routes according to their best judgment, risking an accident. In this study we take a dataset spanning 3 years from the AIS (Automatic Identification System) network, which publicly transmits a ship's identity and location at intervals of seconds, and we load it into a trajectory database supported by the Hermes Moving Objects Database (MOD) system. The presented analysis begins by extracting statistics for the dataset, both general (number of ships and position reports) and safety-related. Simple queries on the dataset illustrate the capabilities of Hermes and give insight into how ships move in the Greek Seas. An analysis of movement based on an Origin-Destination matrix between areas of interest in the Greek territory is presented. One of the newest challenges that emerged during this process is that the amount of positioning data is becoming more and more massive. In conclusion, we give a preliminary review of possible solutions to this challenge and to others, such as dealing with noise in AIS data, and we briefly discuss the need for interdisciplinary cooperation. This research was partially supported by the AMINESS project funded by the Greek government (www.aminess.eu). Cyril Ray was supported by a Short Term Scientific Mission performed at the University of Piraeus by the COST Action IC0903 on “Knowledge Discovery from Moving Objects” (http://www.move-cost.info). IMIS Hellas (www.imishellas.gr) kindly provided the AIS dataset for research purposes.

    Models for Storing Relationships: Relational vs. Graph Databases

    Relational databases have been the universal industry standard for almost as long as databases have existed. While relational databases are undoubtedly useful for storing tabular data that fits into a pre-defined schema of rows and columns, they are not very accommodating of interconnections within a data set. Forcing a highly connected data set into a relational database commonly results in severe performance issues in query return time. With the recent rise of social networks and other modern technological advancements, data is quickly becoming more connected and thus less suitable for relational databases. As a result, a new type of database, called a graph database, has emerged to store relationship-oriented data naturally and efficiently using nodes and edges. Deciding which database is more suitable for the task at hand is not always trivial, however. This paper sheds light on the differences between the two types of database and delves into why one might be more advantageous in certain situations.

    When Things Matter: A Data-Centric View of the Internet of Things

    With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, typically produced in dynamic and volatile environments, which is not only extremely large in scale and volume, but also noisy and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed.

    Auto-ID enabled tracking and tracing data sharing over dynamic B2B and B2G relationships

    RFID 2011 collocated with the 2011 IEEE MTT-S International Microwave Workshop Series on Millimeter Wave Integration Technologies (IMWS 2011). Growing complexity and uncertainty are still the key challenges enterprises face in managing and re-engineering their existing supply chains. To tackle these challenges, they continue to innovate management practices and pilot emerging technologies for achieving supply chain visibility, agility, adaptability and security. Subcontracting has already become a common practice in the modern logistics industry, through partnerships established between the involved stakeholders for delivering consignments from a consignor to a consignee. Companies involved in international supply chains are piloting various supply chain security and integrity initiatives promoted by customs to establish trusted business-to-customs partnerships, facilitating global trade and cutting out avoidable supply chain costs and delays due to compliance with governmental regulations and unnecessary customs inspections. While existing Auto-ID enabled tracking and tracing solutions are promising for implementing these practices, they provide few efficient privacy protection mechanisms for stakeholders involved in the international supply chain to communicate logistics data over dynamic business-to-business and business-to-government relationships. A unified privacy protection mechanism is proposed in this work to fill this gap. © 2011 IEEE.