XML content warehousing: Improving sociological studies of mailing lists and web data
In this paper, we present the guidelines for an XML-based approach for the
sociological study of Web data such as the analysis of mailing lists or
databases available online. The use of an XML warehouse is a flexible solution
for storing and processing this kind of data. We propose an implemented
solution and show possible applications with our case study of profiles of
experts involved in W3C standard-setting activity. We illustrate the
sociological use of semi-structured databases by presenting our XML Schema for
mailing-list warehousing. An XML Schema makes it possible to add or
cross-reference data sources without modifying existing data sets, while still
allowing the structure to evolve. We also show that the existence of hidden
data implies increased complexity for traditional SQL users. XML content
warehousing allows both exhaustive warehousing and recursive queries through
contents, with far less dependence on the initial storage. Finally, we present
the possibility of exporting the data stored in the warehouse to commonly used
advanced software devoted to sociological analysis.
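The idea of recursive queries over nested mailing-list content can be sketched in a few lines. This is only an illustration using Python's standard `xml.etree.ElementTree`; the element and attribute names (`list`, `message`, `subject`, `author`, `id`) are hypothetical and are not taken from the paper's XML Schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical archive: replies are nested inside the message they answer.
# Element and attribute names are illustrative, not the paper's schema.
ARCHIVE = """
<list name="example-list">
  <message id="m1" author="alice">
    <subject>Draft review</subject>
    <message id="m2" author="bob">
      <subject>Re: Draft review</subject>
      <message id="m3" author="alice">
        <subject>Re: Re: Draft review</subject>
      </message>
    </message>
  </message>
  <message id="m4" author="carol">
    <subject>Meeting notes</subject>
  </message>
</list>
"""


def thread_depth(message):
    """Recursively compute the depth of the reply tree under a message."""
    replies = message.findall("message")  # direct replies only
    return 1 + max((thread_depth(r) for r in replies), default=0)


def posts_per_author(root):
    """Count messages per author at any nesting level."""
    counts = {}
    for msg in root.iter("message"):  # walks all descendants
        author = msg.get("author")
        counts[author] = counts.get(author, 0) + 1
    return counts


root = ET.fromstring(ARCHIVE)
depths = {m.get("id"): thread_depth(m) for m in root.findall("message")}
authors = posts_per_author(root)
```

Because replies are nested rather than linked through foreign keys, the depth of a thread falls out of a plain recursive walk, which is the kind of query the abstract contrasts with a fixed relational layout.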
Challenging Issues of Spatio-Temporal Data Mining
The spatio-temporal database (STDB) has received considerable attention during the past few years, due to the emergence of numerous applications (e.g., flight control systems, weather forecasting, and mobile computing) that demand efficient management of moving objects. These applications record objects' geographical locations (sometimes also shapes) at various timestamps and support queries that explore their historical and future (predictive) behaviors. The STDB significantly extends the traditional spatial database, which deals with only stationary data and hence is inapplicable to moving objects, whose dynamic behavior requires re-investigation of numerous topics including data modeling, indexes, and the related query algorithms. In many application areas, huge amounts of data are generated that explicitly or implicitly contain spatial or spatio-temporal information. However, the ability to analyze these data remains inadequate, and the need for adapted data mining tools has become a major challenge. In this paper, we present the challenging issues of spatio-temporal data mining. Keywords: database, data mining, spatial, temporal, spatio-temporal
A UML Profile for Variety and Variability Awareness in Multidimensional Design: An application to Agricultural Robots
Variety and variability are an inherent source of information wealth in schemaless sources, and executing OLAP sessions on multidimensional data in their presence has recently become an object of research. However, all models devised so far propose a "rigid" view of the multidimensional content, without taking into account variety and variability. To fill this gap, in this paper we propose V-ICSOLAP, an extension of the ICSOLAP UML profile that supports extensibility and type/name variability for each multidimensional element, as well as complex data types for measures and levels. The real case study we use to motivate and illustrate our approach is that of trajectory analysis for agricultural robots. As a proof of concept for V-ICSOLAP, we propose an implementation that relies on the PostgreSQL multi-model DBMS and we evaluate its performance. We also provide a validation of our UML profile by ranking it against other meta-models based on a set of quality metrics.
Efficient AIS Data Processing for Environmentally Safe Shipping
Reducing ship accidents at sea is important to all economic, environmental, and cultural sectors of Greece. Despite an increase in traffic and national monitoring, ships plan their routes according to their own best judgment, which risks an accident. In this study we take a dataset spanning 3 years from the AIS (Automatic Identification System) network, which publicly broadcasts a ship's identity and location at intervals of seconds, and we load it into a trajectory database supported by the Hermes Moving Objects Database (MOD) system. The analysis begins by extracting statistics for the dataset, both general (number of ships and position reports) and safety related. Simple queries on the dataset illustrate the capabilities of Hermes and give insight into how ships move in the Greek Seas. An analysis of movement based on an Origin-Destination matrix between areas of interest in the Greek territory is presented. One of the newest challenges that emerged during this process is that the amount of positioning data is becoming more and more massive. In conclusion, we give a preliminary review of possible solutions to this challenge and to others, such as dealing with the noise in AIS data, and we briefly discuss the need for interdisciplinary cooperation. This research was partially supported by the AMINESS project funded by the Greek government (www.aminess.eu). Cyril Ray was supported by a Short Term Scientific Mission performed at the University of Piraeus by the COST Action IC0903 on "Knowledge Discovery from Moving Objects" (http://www.move-cost.info). IMIS Hellas (www.imishellas.gr) kindly provided the AIS dataset for research purposes.
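To make the Origin-Destination idea concrete, here is a minimal sketch in plain Python. It does not use Hermes or any of its API; the bounding-box areas, the sample reports, and the function names `area_of` and `od_matrix` are all hypothetical. It assumes an OD entry is counted whenever a ship's consecutive visits fall in two different areas of interest.

```python
from collections import Counter

# Hypothetical areas of interest as (min_lon, min_lat, max_lon, max_lat).
AREAS = {
    "A": (0.0, 0.0, 1.0, 1.0),
    "B": (5.0, 5.0, 6.0, 6.0),
    "C": (9.0, 0.0, 10.0, 1.0),
}


def area_of(lon, lat):
    """Return the name of the area containing the point, or None."""
    for name, (x0, y0, x1, y1) in AREAS.items():
        if x0 <= lon <= x1 and y0 <= lat <= y1:
            return name
    return None


def od_matrix(reports):
    """Count area-to-area transitions.

    reports: iterable of (ship_id, timestamp, lon, lat) tuples.
    Reports outside every area are skipped, so a trip A -> open sea -> B
    counts once for the pair (A, B).
    """
    last_area = {}
    counts = Counter()
    # Order by ship, then by time, so each ship's track is processed in order.
    for ship, _, lon, lat in sorted(reports, key=lambda r: (r[0], r[1])):
        area = area_of(lon, lat)
        if area is None:
            continue
        prev = last_area.get(ship)
        if prev is not None and prev != area:
            counts[(prev, area)] += 1
        last_area[ship] = area
    return counts


# Made-up position reports for two ships.
REPORTS = [
    ("s1", 1, 0.5, 0.5),  # in A
    ("s1", 2, 3.0, 3.0),  # open sea
    ("s1", 3, 5.5, 5.5),  # in B
    ("s2", 1, 5.2, 5.9),  # in B
    ("s2", 2, 9.5, 0.5),  # in C
    ("s1", 4, 9.1, 0.2),  # in C
]
matrix = od_matrix(REPORTS)
```

In a real moving-objects database the containment test and the ordering would be pushed into the query engine; the sketch only shows what the resulting matrix counts.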
Models for Storing Relationships: Relational vs. Graph Databases
Relational databases have been the universal industry standard for almost as long as databases have existed. While relational databases are undoubtedly useful for storing tabular data that fits into a pre-defined schema of rows and columns, they are not very accommodating of interconnections within a data set. Forcing a highly connected data set into a relational database commonly results in severe performance issues in query return time. With the recent rise of social networks and other modern technological advancements, data is quickly becoming more connected and thus less suitable for relational databases. As a result, a new type of database, called a graph database, has emerged to store relationship-oriented data naturally and efficiently using nodes and edges. Deciding which database is more suitable for the task at hand is not always trivial, however. This paper sheds light on the differences between the two database models and delves into why one might be more advantageous in certain situations.
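The contrast between the two storage models can be illustrated with a toy example. The sketch below is an assumption-laden simplification in plain Python, not either kind of database engine: the edge rows and both function names are hypothetical. The first function mimics answering a multi-hop question with one scan of the whole edge table per hop, as a self-join without an index would; the second follows adjacency lists outward from the start node.

```python
from collections import defaultdict, deque

# Hypothetical "follows" relationships stored as rows of an edge table.
EDGE_ROWS = [("ann", "ben"), ("ben", "cat"), ("cat", "dan"), ("ann", "eve")]


def reachable_by_joins(rows, start, hops):
    """Relational style: one pass over the full edge table per hop."""
    frontier = {start}
    seen = {start}
    for _ in range(hops):
        # Equivalent to joining the current frontier against the edge table.
        frontier = {dst for src, dst in rows if src in frontier} - seen
        seen |= frontier
    return seen - {start}


def reachable_by_traversal(rows, start, hops):
    """Graph style: breadth-first walk over per-node adjacency lists."""
    adjacency = defaultdict(list)
    for src, dst in rows:
        adjacency[src].append(dst)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen - {start}


two_hops_join = reachable_by_joins(EDGE_ROWS, "ann", 2)
two_hops_graph = reachable_by_traversal(EDGE_ROWS, "ann", 2)
```

Both functions return the same set; the difference is that the join-style version touches every edge row on every hop, while the traversal touches only the edges adjacent to nodes it actually visits.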
When Things Matter: A Data-Centric View of the Internet of Things
With the recent advances in radio-frequency identification (RFID), low-cost
wireless sensor devices, and Web technologies, the Internet of Things (IoT)
approach has gained momentum in connecting everyday objects to the Internet and
facilitating machine-to-human and machine-to-machine communication with the
physical world. While IoT offers the capability to connect and integrate both
digital and physical entities, enabling a whole new class of applications and
services, several significant challenges need to be addressed before these
applications and services can be fully realized. A fundamental challenge
centers around managing IoT data, typically produced in dynamic and volatile
environments, which is not only extremely large in scale and volume, but also
noisy and continuous. This article surveys the main techniques and
state-of-the-art research efforts in IoT from data-centric perspectives,
including data stream processing, data storage models, complex event
processing, and searching in IoT. Open research issues for IoT data management
are also discussed.
Auto-ID enabled tracking and tracing data sharing over dynamic B2B and B2G relationships
RFID 2011, collocated with the 2011 IEEE MTT-S International Microwave Workshop Series on Millimeter Wave Integration Technologies (IMWS 2011). Growing complexity and uncertainty are still the key challenges enterprises face in managing and re-engineering their existing supply chains. To tackle these challenges, they continue to innovate in management practices and to pilot emerging technologies for achieving supply chain visibility, agility, adaptability and security. Subcontracting has already become a common practice in the modern logistics industry, through partnerships established between the involved stakeholders for delivering consignments from a consignor to a consignee. Companies involved in international supply chains are piloting various supply chain security and integrity initiatives promoted by customs to establish trusted business-to-customs partnerships, in order to facilitate global trade and cut avoidable supply chain costs and delays caused by compliance with governmental regulations and unnecessary customs inspections. While existing Auto-ID enabled tracking and tracing solutions are promising for implementing these practices, they provide few efficient privacy protection mechanisms for stakeholders involved in the international supply chain to communicate logistics data over dynamic business-to-business and business-to-government relationships. A unified privacy protection mechanism is proposed in this work to fill this gap. © 2011 IEEE.