    Investigating the attainment of optimum data quality for EHR Big Data: proposing a new methodological approach

    The value derivable from the use of data has been increasing continuously for some years. Both commercial and non-commercial organisations have realised the immense benefits that might be derived if all data at their disposal could be analysed and used as the basis for decision-making. The technological tools required to produce, capture, store, transmit and analyse huge amounts of data form the background to the development of the phenomenon of Big Data. With Big Data, the aim is to generate value from huge amounts of data, often in non-structured format and produced extremely frequently. However, the potential value derivable depends on the general level of data governance and, more precisely, on the quality of the data. The field of data quality is well researched for traditional data uses but is still in its infancy for the Big Data context. This dissertation focused on investigating effective methods to enhance data quality for Big Data. The principal deliverable of this research is a methodological approach which can be used to optimize the level of data quality in the Big Data context. Since data quality is contextual (that is, not generalizable), this research study focuses on applying the methodological approach in one use case, namely Electronic Health Records (EHR). The first main contribution to knowledge of this study is a systematic investigation of which data quality dimensions (DQDs) are most important for EHR Big Data. The two most important dimensions ascertained by the research methods applied in this study are accuracy and completeness. These are two well-known dimensions, and this study confirms that they are also very important for EHR Big Data.
The second important contribution to knowledge is an investigation into whether Artificial Intelligence, with a special focus on machine learning, could be used to improve the detection of dirty data, focusing on the two data quality dimensions of accuracy and completeness. Based on the experiments carried out, regression and clustering algorithms proved to be more adequate for accuracy- and completeness-related issues, respectively. However, the limits of implementing and using machine learning algorithms for detecting data quality issues for Big Data were also revealed and discussed in this research study. It can safely be deduced from this part of the research study that the use of machine learning to enhance the detection of data quality issues is a promising area but not yet a panacea which automates the entire process. The third important contribution is a proposed guideline for undertaking data repairs most efficiently for Big Data; this involved surveying and comparing existing data cleansing algorithms against a prototype developed for data reparation. Weaknesses of existing algorithms are highlighted and are considered as areas which efficient data reparation algorithms must focus upon. These three important contributions form the nucleus of a new data quality methodological approach which could be used to optimize Big Data quality, as applied in the context of EHR. Some of the activities and techniques discussed in the proposed methodological approach can be transposed to other industries and use cases to a large extent. The proposed data quality methodological approach can be used by practitioners of Big Data quality who follow a data-driven strategy. As opposed to existing Big Data quality frameworks, the proposed data quality methodological approach has the advantage of being more precise and specific.
It gives clear and proven methods for undertaking the main identified stages of a Big Data quality lifecycle and can therefore be applied by practitioners in the area. This research study provides some promising results and deliverables. It also paves the way for further research in the area. Big Data technology is evolving rapidly, and future research should focus on new representations of Big Data, the real-time streaming aspect, and replicating the research methods used in this study on new technologies to validate the current results.

    Towards A Formal And Scalable Approach For Quantifying Software Reliability At Early Development Stages

    Problems which originate in early development stages can have a lasting influence on the reliability, safety, and cost of a software system. The requirements document, which is usually available at the requirements analysis stage, must be correct, unambiguous, and complete if the rest of the development effort is to succeed. The ability to identify faults in requirements and predict the reliability of a software system early in its development can help organizations make informed decisions about corrective actions and improve the system's quality in a cost-effective manner. A review of the literature reveals that existing approaches are unsuited to providing trustworthy reliability prediction, either because they ignore the requirements documents or because they detect faults in requirements in an informal and fairly sketchy way. This study explores the use of a preselected software reliability measurement for early software fault detection and reliability prediction. This measurement, originally a black-box testing technique, was broadly recognized for its ability to detect incomplete and ambiguous requirements, although no information was found in the literature about how to take advantage of its power. This study mathematically formalized the measurement to enhance its rigor, repeatability and scalability, and further extended it as an effective requirements fault detection technique. An automation-oriented algorithm was developed for quantifying the impact of the detected requirements faults on software reliability. The feasibility and scalability of the proposed approach for early fault detection and reliability prediction were examined using two real applications. The results clearly confirmed its feasibility and usefulness, particularly when no failure data is available and other methods are not applicable. Scalability barriers in the approach were also identified.
An empirical study was thus conducted to gain insight into the nature of the technical barriers. In an attempt to overcome these barriers, a set of rules was proposed based on the observed patterns. Finally, a preliminary controlled experiment was conducted to evaluate the usability of the proposed rules. This study will enable software project stakeholders to effectively detect requirements faults and assess the quality of requirements early in development, and will ultimately lead to improved software reliability if the identified faults are removed in time. Software project practitioners, regulators, and policy makers involved in the certification of software systems can benefit most from the techniques proposed in this study.

    Maintenance Optimization and Inspection Planning of Wind Energy Assets: Models, Methods and Strategies

    Designing cost-effective inspection and maintenance programmes for wind energy farms is a complex task involving a high degree of uncertainty due to the diversity of assets and their corresponding damage mechanisms and failure modes, weather-dependent transport conditions, unpredictable spare parts demand, insufficient space or poor accessibility for maintenance and repair, limited availability of resources in terms of equipment and skilled manpower, etc. In recent years, maintenance optimization has attracted the attention of many researchers and practitioners from various sectors of the wind energy industry, including manufacturers, component suppliers, maintenance contractors and others. In this paper, we propose a conceptual classification framework for the available literature on maintenance policy optimization and inspection planning of wind energy systems and structures (turbines, foundations, power cables and electrical substations). The developed framework addresses a wide range of theoretical and practical issues, including the models, methods, and strategies employed to optimise maintenance decisions and inspection procedures in wind farms. The literature published to date on this subject is critically reviewed and several research gaps are identified. Moreover, the available studies are systematically classified using different criteria, and some research directions of potential interest to operational researchers are highlighted.

    Data Challenges and Data Analytics Solutions for Power Systems

    The abstract is in the attachment.

    Early aspects: aspect-oriented requirements engineering and architecture design

    This paper reports on the third Early Aspects: Aspect-Oriented Requirements Engineering and Architecture Design Workshop, which was held in Lancaster, UK, on March 21, 2004. The workshop included a presentation session and working sessions in which particular topics on early aspects were discussed. The primary goal of the workshop was to focus on the challenges of defining methodical software development processes for aspects from early on in the software life cycle and to explore the potential of proposed methods and techniques to scale up to industrial applications.

    Combining SOA and BPM Technologies for Cross-System Process Automation

    This paper summarizes the results of an industry case study that introduced a cross-system business process automation solution based on a combination of SOA and BPM standard technologies (i.e., BPMN, BPEL, WSDL). Besides discussing major weaknesses of the existing custom-built solution and comparing them against experiences with the developed prototype, the paper presents a course of action for transforming the current solution into the proposed solution. This includes a general approach, consisting of four distinct steps, as well as specific action items to be performed for every step. The discussion also covers language and tool support and the challenges arising from the transformation.

    Formal methods and digital systems validation for airborne systems

    This report has been prepared to supplement a forthcoming chapter on formal methods in the FAA Digital Systems Validation Handbook. Its purpose is as follows: to outline the technical basis for formal methods in computer science; to explain the use of formal methods in the specification and verification of software and hardware requirements, designs, and implementations; to identify the benefits, weaknesses, and difficulties in applying these methods to digital systems used on board aircraft; and to suggest factors for consideration when formal methods are offered in support of certification. These latter factors assume the context for software development and assurance described in RTCA document DO-178B, 'Software Considerations in Airborne Systems and Equipment Certification,' Dec. 1992

    Big Data and Its Applications in Smart Real Estate and the Disaster Management Life Cycle: A Systematic Analysis

    Big data is the concept of enormous amounts of data being generated daily in different fields due to the increased use of technology and internet sources. Despite the various advancements and the hopes of better understanding, big data management and analysis remain a challenge, calling for more rigorous and detailed research, as well as the identification of methods and ways in which big data could be tackled and put to good use. Existing research falls short in discussing and evaluating the pertinent tools and technologies for analyzing big data efficiently, which calls for a comprehensive and holistic analysis of the published articles to summarize the concept of big data and identify field-specific applications. To address this gap and keep a recent focus, research articles published in the last decade in top-tier and high-impact journals were retrieved using the search engines of Google Scholar, Scopus, and Web of Science and narrowed down to a set of 139 relevant research articles. Different analyses were conducted on the retrieved papers, including bibliometric analysis, keyword analysis, big data search trends, and the authors' names, countries, and affiliated institutes contributing the most to the field of big data. The comparative analyses show that, conceptually, big data lies at the intersection of the storage, statistics, technology, and research fields and emerged as an amalgam of these four fields with interlinked aspects such as data hosting and computing, data management, data refining, data patterns, and machine learning. The results further show that the major characteristics of big data can be summarized using the seven Vs: variety, volume, variability, value, visualization, veracity, and velocity.
Furthermore, the existing methods for big data analysis, their shortcomings, and the possible directions for harnessing technology to make data analysis tools fast and efficient were also explored. The major challenges in handling big data include efficient storage, retrieval, analysis, and visualization of large heterogeneous data, which can be tackled through authentication such as Kerberos and encrypted files, logging of attacks, secure communication through Secure Sockets Layer (SSL) and Transport Layer Security (TLS), data imputation, building learning models, dividing computations into sub-tasks, checkpoint applications for recursive tasks, and using Solid State Drives (SSD) and Phase Change Material (PCM) for storage. In terms of frameworks for big data management, two frameworks exist, Hadoop and Apache Spark, which must be used simultaneously to capture the holistic essence of the data and make the analyses meaningful and fast. Further field-specific applications of big data in two promising and integrated fields, i.e., smart real estate and disaster management, were investigated, and a framework for field-specific applications, as well as a merger of the two areas through big data, was highlighted. The proposed frameworks show that big data can tackle the ever-present issue of customer regret related to poor quality or lack of information in smart real estate and increase customer satisfaction by using an intermediate organization that can process and keep a check on the data being provided to customers by sellers and real estate managers. Similarly, for disaster and its risk management, data from social media, drones, multimedia, and search engines can be used to tackle natural disasters such as floods, bushfires, and earthquakes, as well as to plan emergency responses.
In addition, a merger framework for smart real estate and disaster risk management shows that big data generated from smart real estate in the form of occupant data, facilities management, and building integration and maintenance can be shared with disaster risk management and emergency response teams to help prevent, prepare for, respond to, or recover from disasters.