9 research outputs found

    Adapting a quality model for a Big Data application: the case of a feature prediction system

    In the last decade, we have witnessed a considerable increase in projects based on Big Data applications. Some of the most popular types of these applications are recommendation systems, feature prediction, and decision making. In this context, several quality models have been proposed for Big Data applications, but their great heterogeneity makes it difficult to select the ideal quality model for developing a specific type of Big Data application. This Master's thesis conducts a Systematic Mapping Study (SMS) driven by two key research questions. The first asks about the state of the art in identifying risks, issues, and challenges in Big Data applications. The second asks which quality models have been applied to date to Big Data applications, specifically to feature prediction systems. The main objective is to analyze the available quality models and to adapt from them a quality model applicable to a specific type of Big Data application: feature prediction systems. The defined model comprises a set of quality characteristics and a set of quality metrics to evaluate them. Finally, the model is applied in a case study, the defined quality characteristics are evaluated through their quality metrics, and the results are presented and discussed.
    Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos). Máster en Ingeniería Informática.
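    The thesis does not publish an implementation, but the structure it describes, quality characteristics evaluated through quality metrics, can be sketched in a few lines of Python. All names below (QualityMetric, QualityCharacteristic, the MAE metric) are hypothetical illustrations, not the thesis's actual model.

```python
# Minimal sketch of a quality model: characteristics grouping metrics that
# score a feature prediction system. Names are illustrative assumptions.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class QualityMetric:
    name: str
    # A metric maps predicted and observed feature values to a score.
    compute: Callable[[Sequence[float], Sequence[float]], float]


@dataclass
class QualityCharacteristic:
    name: str
    metrics: List[QualityMetric] = field(default_factory=list)

    def evaluate(self, predicted, observed) -> Dict[str, float]:
        return {m.name: m.compute(predicted, observed) for m in self.metrics}


def mean_absolute_error(predicted, observed):
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Example: evaluate a "functional correctness" characteristic of a predictor.
correctness = QualityCharacteristic(
    "functional correctness", [QualityMetric("MAE", mean_absolute_error)]
)
print(correctness.evaluate([2.1, 3.9, 5.2], [2.0, 4.0, 5.0]))  # {'MAE': 0.133...}
```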

    Geospatial Data Management Research: Progress and Future Directions

    Without geospatial data management, today's challenges in big data applications such as earth observation, geographic information system/building information modeling (GIS/BIM) integration, and 3D/4D city planning cannot be solved. Furthermore, geospatial data management plays a connecting role between data acquisition, data modelling, data visualization, and data analysis. It enables the continuous availability of geospatial data and the replicability of geospatial data analysis. In the first part of this article, five milestones of geospatial data management research achieved during the last decade are presented. The first reflects advancements in BIM/GIS integration at the data, process, and application levels. The second presents theoretical progress by introducing topology as a key concept of geospatial data management. In the third milestone, 3D/4D geospatial data management is described as a key concept for city modelling, including subsurface models. Progress in modelling and visualizing massive geospatial features on web platforms is the fourth milestone, which includes discrete global grid systems as an alternative geospatial reference framework. The intensive use of geosensor data sources is the fifth milestone, which opens the way to parallel data storage platforms supporting data analysis on geosensors. In the second part of this article, five future directions of geospatial data management research are presented that have the potential to become key research fields in the next decade. Geo-data science will have the task of extracting knowledge from unstructured and structured geospatial data and of bridging the gap between modern information technology concepts and the geo-related sciences. Topology is presented as a powerful and general concept for analyzing GIS and BIM data structures and spatial relations that will be of great importance in emerging applications such as smart cities and digital twins. Data-streaming libraries and “in-situ” geo-computing on objects executed directly on the sensors will revolutionize geo-information science and bridge geo-computing with geospatial data management. Advanced geospatial data visualization on web platforms will enable the representation of dynamically changing geospatial features or moving objects’ trajectories. Finally, geospatial data management will support big geospatial data analysis, and graph databases are expected to experience a revival on top of parallel and distributed data stores supporting big geospatial data analysis.
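    As a toy illustration of the discrete global grid systems mentioned above, the following Python sketch assigns hierarchical cell addresses by recursively subdividing latitude/longitude space. Production DGGS implementations (e.g., hierarchical hexagonal grids) are far more sophisticated; this function and its encoding are assumptions made purely for illustration.

```python
# Toy discrete-grid indexing: a quadtree-style cell address per point.
def dggs_cell_id(lat: float, lon: float, resolution: int) -> str:
    """Return a cell address of `resolution` digits (each digit 0-3)."""
    lat_min, lat_max = -90.0, 90.0
    lon_min, lon_max = -180.0, 180.0
    digits = []
    for _ in range(resolution):
        lat_mid = (lat_min + lat_max) / 2.0
        lon_mid = (lon_min + lon_max) / 2.0
        quadrant = 0
        if lat >= lat_mid:            # northern half of the current cell
            lat_min = lat_mid
        else:                         # southern half
            quadrant += 2
            lat_max = lat_mid
        if lon >= lon_mid:            # eastern half
            quadrant += 1
            lon_min = lon_mid
        else:                         # western half
            lon_max = lon_mid
        digits.append(str(quadrant))
    return "".join(digits)


# Nearby points share address prefixes, so prefix matching gives cheap
# coarse spatial filtering (Cologne Cathedral, two resolutions):
print(dggs_cell_id(50.9413, 6.9583, 4))   # '1022'
print(dggs_cell_id(50.9413, 6.9583, 8))   # '10220322' -- same '1022' prefix
```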

    User Perceptions of Information Quality in China: The Boomerang Decade

    China has adopted and implemented the Internet as a vehicle for economic development during the past several decades. As this has occurred, the Chinese national government has sought to control access to information in various ways. As political philosophies have changed over time, so has control over the ways in which users can publish and access information through the Internet in China. This study examines user perceptions of information quality in China over the decade beginning in 2007 and ending in 2017. Data were collected three times, at five-year intervals. The results show that user perceptions have changed in a way that is consistent with changes in control over use of the Internet in China during this ten-year period. Specifically, user perceptions of information quality along a number of dimensions are similar at the beginning and end of the decade and either significantly higher or lower in its middle, in ways that are consistent with Chinese control of the Internet at that time. Our research shows that users are sensitive to information quality issues: Chinese Internet users’ perceptions have shifted in parallel with public events and governmental practices. China is a prototypical case of tight government control of the Internet, and the findings of this study shed light on user perceptions in one society of this type. In the long run, information providers should strive to provide high-quality information as a strategy for mitigating the effects of fake news.

    Big data quality framework: a holistic approach to continuous quality management

    Big Data is an essential research area for governments, institutions, and private agencies seeking to support their analytics decisions. Big Data concerns every aspect of data: how it is collected, processed, and analyzed to generate value-added, data-driven insights and decisions. Degradation in data quality may have unpredictable consequences, as confidence in the data and its sources, and hence their worthiness, is lost. In the Big Data context, data characteristics such as volume, multiple heterogeneous data sources, and fast data generation increase the risk of quality degradation and require efficient mechanisms to check data worthiness. However, ensuring Big Data Quality (BDQ) is a costly and time-consuming process, since excessive computing resources are required. Maintaining quality throughout the Big Data lifecycle requires quality profiling and verification before any processing decision. This paper proposes a BDQ Management Framework for enhancing pre-processing activities while strengthening data control. The framework relies on a new concept, the Big Data Quality Profile, which captures the quality outline, requirements, attributes, dimensions, scores, and rules. Using the framework's profiling and sampling components, a fast and efficient data quality estimation is initiated before and after an intermediate pre-processing phase. The framework's exploratory profiling component plays the initial role in quality profiling; it uses a set of predefined quality metrics to evaluate important data quality dimensions, and it generates quality rules by applying various pre-processing activities and their related functions. These rules feed the Data Quality Profile and result in quality scores for the selected quality attributes. The framework implementation and the dataflow management across the various quality management processes are discussed, and the paper concludes with ongoing work on framework evaluation and deployment to support quality evaluation decisions.
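    To make the quality-profile idea concrete, here is a minimal Python sketch, assuming a tabular dataset, of sampling-based quality estimation before full pre-processing. The profile fields, the completeness metric, and the rule format are illustrative assumptions, not the authors' published implementation.

```python
# Sketch: estimate a quality dimension from a sample, record the score in a
# profile, and emit a quality rule when a requirement is not met.
import random
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BigDataQualityProfile:
    # Captures requirements, dimension scores, and derived quality rules.
    requirements: Dict[str, float]            # dimension -> minimum score
    scores: Dict[str, float] = field(default_factory=dict)
    rules: List[str] = field(default_factory=list)


def completeness(rows: List[dict], column: str) -> float:
    """Fraction of sampled rows with a non-null value in `column`."""
    return sum(r.get(column) is not None for r in rows) / len(rows)


def profile_sample(rows, columns, profile, sample_size=1_000):
    sample = random.sample(rows, min(sample_size, len(rows)))
    for col in columns:
        score = completeness(sample, col)
        profile.scores[f"completeness:{col}"] = score
        if score < profile.requirements.get("completeness", 1.0):
            # A quality rule triggers a pre-processing activity later on.
            profile.rules.append(f"impute_missing({col})")
    return profile


data = [{"id": i, "age": None if i % 5 == 0 else 30} for i in range(10_000)]
p = profile_sample(data, ["age"], BigDataQualityProfile({"completeness": 0.9}))
print(p.scores, p.rules)   # completeness ~0.8 -> rule 'impute_missing(age)'
```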

    Problems of Improving the Quality Management System of a University in the Era of Big Data (Experience of Professional Training of Railway Engineers)

    An analytical approach to quality management is particularly relevant in higher education. The formation, implementation, and practical use of quality management systems in higher education have not yet been sufficiently reflected in the studies of foreign and domestic specialists, although there is a large body of publications on building and implementing quality management systems in industrial production. This publication considers the main triad of education quality components: the conditions, the process, and the result of educational activity. The quality of education is presented as a system functionally related to all parameters and as a measurable characteristic of the functioning of an educational organization. The quality of such functioning is represented as the degree to which the main goal is achieved: reaching a given (normative) level of students' preparedness and providing the basis for sustainable development. As an illustration of the theoretical provisions, the experience of implementing and certifying a quality management system at a leading technical university in Russia is considered, and recommendations are given on developing strategic and operational management documents for implementing and using a quality management system at a university.

    Big Data and Its Applications in Smart Real Estate and the Disaster Management Life Cycle: A Systematic Analysis

    Big data is the concept of enormous amounts of data being generated daily in different fields due to the increased use of technology and Internet sources. Despite various advancements and the hope of better understanding, big data management and analysis remain a challenge, calling for more rigorous and detailed research and for the identification of methods and ways in which big data can be tackled and put to good use. The existing research falls short in discussing and evaluating the pertinent tools and technologies for analyzing big data efficiently, which calls for a comprehensive and holistic analysis of the published articles to summarize the concept of big data and survey field-specific applications. To address this gap with a recent focus, research articles published in the last decade in top-tier, high-impact journals were retrieved using the search engines of Google Scholar, Scopus, and Web of Science and narrowed down to a set of 139 relevant research articles. Different analyses were conducted on the retrieved papers, including bibliometric analysis, keyword analysis, big data search trends, and the authors' names, countries, and affiliated institutes contributing the most to the field of big data. The comparative analyses show that, conceptually, big data lies at the intersection of the storage, statistics, technology, and research fields and emerged as an amalgam of these four fields with interlinked aspects such as data hosting and computing, data management, data refining, data patterns, and machine learning. The results further show that the major characteristics of big data can be summarized using the seven Vs: variety, volume, variability, value, visualization, veracity, and velocity. Furthermore, the existing methods for big data analysis, their shortcomings, and possible directions for harnessing technology so that data analysis tools can be upgraded to be fast and efficient were also explored. The major challenges in handling big data include efficient storage, retrieval, analysis, and visualization of large heterogeneous data. These can be tackled through authentication mechanisms such as Kerberos and encrypted files, logging of attacks, secure communication through Secure Sockets Layer (SSL) and Transport Layer Security (TLS), data imputation, building learning models, dividing computations into sub-tasks, checkpointing applications for recursive tasks, and using Solid State Drives (SSD) and Phase Change Material (PCM) for storage. In terms of frameworks for big data management, two main frameworks exist, Hadoop and Apache Spark, which must be used together to capture the holistic essence of the data and make the analyses meaningful, swift, and speedy. Field-specific applications of big data in two promising and integrated fields, smart real estate and disaster management, were investigated, and a framework for field-specific applications, as well as a merger of the two areas through big data, was highlighted. The proposed frameworks show that big data can tackle the ever-present issue of customer regret caused by poor or missing information in smart real estate, increasing customer satisfaction through an intermediate organization that processes and checks the data provided to customers by sellers and real estate managers.
    Similarly, for disaster risk management, data from social media, drones, multimedia, and search engines can be used to tackle natural disasters such as floods, bushfires, and earthquakes, as well as to plan emergency responses. In addition, a merger framework for smart real estate and disaster risk management shows that big data generated by smart real estate, in the form of occupant data, facilities management, and building integration and maintenance, can be shared with disaster risk management and emergency response teams to help prevent, prepare for, respond to, or recover from disasters.
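    As a brief illustration of the checkpointing technique the survey mentions for recursive tasks, the following PySpark sketch truncates RDD lineage during an iterative computation so that failures do not force a replay of the whole chain. It assumes a local Spark installation and a writable checkpoint directory; the data and iteration counts are arbitrary.

```python
# Sketch: periodic checkpointing of an iteratively transformed RDD in Spark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("checkpoint-demo").getOrCreate()
sc = spark.sparkContext
sc.setCheckpointDir("/tmp/spark-checkpoints")   # assumed writable path

rdd = sc.parallelize(range(100_000))
for i in range(10):
    rdd = rdd.map(lambda x: x + 1)              # one step of a recursive task
    if i % 5 == 4:
        rdd.checkpoint()                        # cut the lineage here
        rdd.count()                             # action forces materialization

print(rdd.take(3))                              # [10, 11, 12]
spark.stop()
```

    Without the periodic checkpoint, the lineage graph grows with every iteration; with it, recovery restarts from the last materialized state instead of from the original input.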

    Quality Management in Big Data

    Due to the importance of quality issues in Big Data, Big Data quality management has attracted significant research attention on how to measure, improve, and manage the quality of Big Data. This special issue of the Journal of Informatics therefore aims to address quality problems in Big Data and to promote further research on Big Data quality. Our editorial describes the state-of-the-art research challenges in Big Data quality research and highlights the contributions of each paper accepted in this special issue.