5,438 research outputs found

    Automatic generation of software interfaces for supporting decisionmaking processes. An application of domain engineering & machine learning

    Get PDF
    [EN] Data analysis is a key process to foster knowledge generation in particular domains or fields of study. With a strong informative foundation derived from the analysis of collected data, decision-makers can make strategic choices with the aim of obtaining valuable benefits in their specific areas of action. However, given the steady growth of data volumes, data analysis needs to rely on powerful tools to enable knowledge extraction. Information dashboards offer a software solution to analyze large volumes of data visually to identify patterns and relations and make decisions according to the presented information. But decision-makers may have different goals and, consequently, different necessities regarding their dashboards. Moreover, the variety of data sources, structures, and domains can hamper the design and implementation of these tools. This Ph.D. Thesis tackles the challenge of improving the development process of information dashboards and data visualizations while enhancing their quality and features in terms of personalization, usability, and flexibility, among others. Several research activities have been carried out to support this thesis. First, a systematic literature mapping and review was performed to analyze different methodologies and solutions related to the automatic generation of tailored information dashboards. The outcomes of the review led to the selection of a modeldriven approach in combination with the software product line paradigm to deal with the automatic generation of information dashboards. In this context, a meta-model was developed following a domain engineering approach. This meta-model represents the skeleton of information dashboards and data visualizations through the abstraction of their components and features and has been the backbone of the subsequent generative pipeline of these tools. The meta-model and generative pipeline have been tested through their integration in different scenarios, both theoretical and practical. Regarding the theoretical dimension of the research, the meta-model has been successfully integrated with other meta-model to support knowledge generation in learning ecosystems, and as a framework to conceptualize and instantiate information dashboards in different domains. In terms of the practical applications, the focus has been put on how to transform the meta-model into an instance adapted to a specific context, and how to finally transform this later model into code, i.e., the final, functional product. These practical scenarios involved the automatic generation of dashboards in the context of a Ph.D. Programme, the application of Artificial Intelligence algorithms in the process, and the development of a graphical instantiation platform that combines the meta-model and the generative pipeline into a visual generation system. Finally, different case studies have been conducted in the employment and employability, health, and education domains. The number of applications of the meta-model in theoretical and practical dimensions and domains is also a result itself. Every outcome associated to this thesis is driven by the dashboard meta-model, which also proves its versatility and flexibility when it comes to conceptualize, generate, and capture knowledge related to dashboards and data visualizations

    Research and Education in Computational Science and Engineering

    Get PDF
    Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers of all persuasions with algorithmic inventions and software systems that transcend disciplines and scales. Carried on a wave of digital technology, CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society; and the CSE community is at the core of this transformation. However, a combination of disruptive developments---including the architectural complexity of extreme-scale computing, the data revolution that engulfs the planet, and the specialization required to follow the applications to new frontiers---is redefining the scope and reach of the CSE endeavor. This report describes the rapid expansion of CSE and the challenges to sustaining its bold advances. The report also presents strategies and directions for CSE research and education for the next decade.Comment: Major revision, to appear in SIAM Revie

    A unified view of data-intensive flows in business intelligence systems : a survey

    Get PDF
    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.Peer ReviewedPostprint (author's final draft

    Geo-Spotting: Mining Online Location-based Services for Optimal Retail Store Placement

    Full text link
    The problem of identifying the optimal location for a new retail store has been the focus of past research, especially in the field of land economy, due to its importance in the success of a business. Traditional approaches to the problem have factored in demographics, revenue and aggregated human flow statistics from nearby or remote areas. However, the acquisition of relevant data is usually expensive. With the growth of location-based social networks, fine grained data describing user mobility and popularity of places has recently become attainable. In this paper we study the predictive power of various machine learning features on the popularity of retail stores in the city through the use of a dataset collected from Foursquare in New York. The features we mine are based on two general signals: geographic, where features are formulated according to the types and density of nearby places, and user mobility, which includes transitions between venues or the incoming flow of mobile users from distant areas. Our evaluation suggests that the best performing features are common across the three different commercial chains considered in the analysis, although variations may exist too, as explained by heterogeneities in the way retail facilities attract users. We also show that performance improves significantly when combining multiple features in supervised learning algorithms, suggesting that the retail success of a business may depend on multiple factors.Comment: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, Chicago, 2013, Pages 793-80

    Smart Maintenance: an empirically grounded conceptualization

    Get PDF
    How do modernized maintenance operations, often referred to as “Smart Maintenance”, impact the performance of manufacturing plants? The inability to answer this question backed by data is a problem for industrial maintenance management, especially in light of the ongoing rapid transition towards an industrial environment with pervasive digital technologies. To this end, this paper, which is the first part of a two-paper series, aims to investigate and answer the question, “What is Smart Maintenance?”. The authors deployed an empirical, inductive research approach to conceptualize Smart Maintenance using focus groups and interviews with more than 110 experts from over 20 different firms. By viewing our original data through the lens of multiple general theories, our findings chart new directions for contemporary and future maintenance research. This paper describes empirical observations and theoretical interpretations cumulating in the first empirically grounded definition of Smart Maintenance and its four underlying dimensions; data-driven decision-making, human capital resource, internal integration, and external integration. In addition, the relationships between the underlying dimensions are specified and the concept structure formally modeled. This study thus achieves concept clarity with respect to Smart Maintenance, thereby making several theoretical and managerial contributions that guide both scholars and practitioners within the field of industrial maintenance management

    The Elements of Big Data Value

    Get PDF
    This open access book presents the foundations of the Big Data research and innovation ecosystem and the associated enablers that facilitate delivering value from data for business and society. It provides insights into the key elements for research and innovation, technical architectures, business models, skills, and best practices to support the creation of data-driven solutions and organizations. The book is a compilation of selected high-quality chapters covering best practices, technologies, experiences, and practical recommendations on research and innovation for big data. The contributions are grouped into four parts: · Part I: Ecosystem Elements of Big Data Value focuses on establishing the big data value ecosystem using a holistic approach to make it attractive and valuable to all stakeholders. · Part II: Research and Innovation Elements of Big Data Value details the key technical and capability challenges to be addressed for delivering big data value. · Part III: Business, Policy, and Societal Elements of Big Data Value investigates the need to make more efficient use of big data and understanding that data is an asset that has significant potential for the economy and society. · Part IV: Emerging Elements of Big Data Value explores the critical elements to maximizing the future potential of big data value. Overall, readers are provided with insights which can support them in creating data-driven solutions, organizations, and productive data ecosystems. The material represents the results of a collective effort undertaken by the European data community as part of the Big Data Value Public-Private Partnership (PPP) between the European Commission and the Big Data Value Association (BDVA) to boost data-driven digital transformation

    PADL: A Modeling and Deployment Language for Advanced Analytical Services

    Get PDF
    In the smart city context, Big Data analytics plays an important role in processing the data collected through IoT devices. The analysis of the information gathered by sensors favors the generation of specific services and systems that not only improve the quality of life of the citizens, but also optimize the city resources. However, the difficulties of implementing this entire process in real scenarios are manifold, including the huge amount and heterogeneity of the devices, their geographical distribution, and the complexity of the necessary IT infrastructures. For this reason, the main contribution of this paper is the PADL description language, which has been specifically tailored to assist in the definition and operationalization phases of the machine learning life cycle. It provides annotations that serve as an abstraction layer from the underlying infrastructure and technologies, hence facilitating the work of data scientists and engineers. Due to its proficiency in the operationalization of distributed pipelines over edge, fog, and cloud layers, it is particularly useful in the complex and heterogeneous environments of smart cities. For this purpose, PADL contains functionalities for the specification of monitoring, notifications, and actuation capabilities. In addition, we provide tools that facilitate its adoption in production environments. Finally, we showcase the usefulness of the language by showing the definition of PADL-compliant analytical pipelines over two uses cases in a smart city context (flood control and waste management), demonstrating that its adoption is simple and beneficial for the definition of information and process flows in such environments.This work was partially supported by the SPRI–Basque Government through their ELKARTEK program (3KIA project, ref. KK-2020/00049). Aitor Almeida’s participation was supported by the FuturAAL-Ego project (RTI2018-101045-A-C22) granted by the Spanish Ministry of Science, Innovation and Universities. Javier Del Ser also acknowledges funding support from the Consolidated Research Group MATHMODE (IT1294-19), granted by the Department of Education of the Basque Government
    • …
    corecore