68 research outputs found

    Guidance in Business Intelligence & Analytics Systems: A Review and Research Agenda

    While the amount of data grows exponentially, the number of people with analytical and technical skills is increasing only slowly. This skill gap is putting pressure on the labor market and increasing the need for personnel with these skills. At the same time, companies are forced to think of alternative ways to empower their less-skilled workforce to take on Business Intelligence and Analytics (BI&A) tasks. One promising way to address these challenges is the concept of guidance. However, the current body of research on guidance in BI&A systems is scattered and lacks a structured investigation from which future research avenues can be derived. To address this gap, this article analyzes five categories, namely BI&A phases, guidance degree, guidance generation, user roles, and interactivity form. Reviewing 82 articles, our contribution is to synopsize articles on guidance in BI&A systems and to suggest five research avenues.

    Extension of a task-based model to functional programming

    Recently, efforts have been made to bring together the areas of high-performance computing (HPC) and massive data processing (Big Data). Traditional HPC frameworks, like COMPSs, are mostly task-based, while popular big-data environments, like Spark, are based on functional programming principles. The former are known for their good performance on regular, matrix-based computations; for fine-grained, data-parallel workloads, on the other hand, the latter have often been considered more successful. In this paper we present our experience with the integration of some dataflow techniques into COMPSs, a task-based framework, in an effort to bring together the best aspects of both worlds. We present our API, called DDF, which provides a new data abstraction that addresses the challenges of integrating Big Data application scenarios into COMPSs. DDF has a functional-based interface, similar to many Data Science tools, that allows us to use dynamic evaluation to adapt the task execution at runtime. Besides the performance optimization it provides, the API facilitates the development of applications by experts in the application domain. In this paper we evaluate DDF's effectiveness by comparing the resulting programs to their original versions in COMPSs and Spark. The results show that DDF can improve COMPSs execution time and even outperform Spark in many use cases. This work was partially supported by CAPES, CNPq, Fapemig and NIC.BR, and by projects Atmosphere (H2020-EU.2.1.1 777154) and INCT-Cyber. Peer reviewed. Postprint (author's final draft).

    Adaptive Automated Machine Learning

    The ever-growing demand for machine learning has led to the development of automated machine learning (AutoML) systems that can be used off the shelf by non-experts. Further, the demand for ML applications with high predictive performance exceeds the number of machine learning experts and makes the development of AutoML systems necessary. Automated Machine Learning tackles the problem of finding machine learning models with high predictive performance. Existing approaches incorporating deep learning techniques assume that all data is available at the beginning of the training process (offline learning). They configure and optimise a pipeline of preprocessing, feature engineering, and model selection by choosing suitable hyperparameters in each model pipeline step. Furthermore, they assume that the user is fully aware of the choice and, thus, the consequences of the underlying metric (such as precision, recall, or F1-measure). By variation of this metric, the search for suitable configurations and thus the adaptation of algorithms can be tailored to the user’s needs. With the creation of a vast amount of data from all kinds of sources every day, processing and understanding these data sets in a single batch is no longer viable. By training machine learning models incrementally (i.e., online learning), the flood of data can be processed sequentially within data streams. However, if one assumes an online learning scenario, where an AutoML instance executes on evolving data streams, the question of the best model and its configuration remains open. In this work, we address the adaptation of AutoML in an offline learning scenario toward a certain utility an end-user might pursue as well as the adaptation of AutoML towards evolving data streams in an online learning scenario with three main contributions: 1. We propose a system that allows the adaptation of AutoML and the search for neural architectures towards a particular utility an end-user might pursue. 2. We introduce an online deep learning framework that fosters the research of deep learning models under the online learning assumption and enables the automated search for neural architectures. 3. We introduce an online AutoML framework that allows the incremental adaptation of ML models. We evaluate the contributions individually, in accordance with predefined requirements and with state-of-the-art evaluation setups. The outcomes lead us to conclude that (i) AutoML, as well as systems for neural architecture search, can be steered towards individual utilities by learning a designated ranking model from pairwise preferences and using the latter as the target function for the offline learning scenario; (ii) architecturally small neural networks are in general suitable assuming an online learning scenario; (iii) the configuration of machine learning pipelines can be automatically adapted to ever-evolving data streams, leading to better performance.

    Machine Learning for Microcontroller-Class Hardware -- A Review

    The advancements in machine learning have opened a new opportunity to bring intelligence to low-end Internet-of-Things nodes such as microcontrollers. Conventional machine learning deployment has a high memory and compute footprint, hindering its direct use on ultra resource-constrained microcontrollers. This paper highlights the unique requirements of enabling onboard machine learning for microcontroller-class devices. Researchers use a specialized model development workflow for resource-limited applications to ensure the compute and latency budget is within the device limits while still maintaining the desired performance. We characterize a closed-loop, widely applicable workflow of machine learning model development for microcontroller-class devices and show that several classes of applications adopt a specific instance of it. We present both qualitative and numerical insights into different stages of model development by showcasing several use cases. Finally, we identify the open research challenges and unsolved questions demanding careful consideration moving forward. Comment: Accepted for publication at IEEE Sensors Journal.

    Artificial intelligence and machine learning : current applications in real estate

    Thesis: S.M. in Real Estate Development, Massachusetts Institute of Technology, Program in Real Estate Development in conjunction with the Center for Real Estate, 2018. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 113-117). Real estate meets machine learning: real contribution or just hype? Creating and managing the built environment is a complicated task fraught with difficult decisions, challenging relationships, and a multitude of variables. Today's technology experts are building computers and software that can help resolve many of these challenges, some of them using what is broadly called artificial intelligence and machine learning. This thesis will define machine learning and artificial intelligence for the investor and real estate audience, examine the ways in which these new analytic, predictive, and automating technologies are being used in the real estate industry, and postulate potential future applications and associated challenges. Machine learning and artificial intelligence can and will be used to facilitate real estate investment in myriad ways, spanning all aspects of the real estate profession -- from property management, to investment decisions, to development processes -- transforming real estate into a more efficient and data-driven industry. By Jennifer Conway.

    Open Government Data and sustainable value : multi-case comparative analyses of software startups in Brazil

    Undergraduate final course project, Universidade de Brasília, Faculdade de Economia, Administração e Contabilidade, Departamento de Administração, 2017. This work consists of a comparative analysis of three software startups that consider Open Government Data (OGD) a key resource of their value propositions. The main objective is to describe and compare the current and potential multistakeholder value generation in startups that use Open Government Data. To do this, the author referred to OGD theories and compared them with the primary qualitative data collected. The value generation and the barriers to value delivery were identified and analyzed. The external factors thought to influence the startups were contrasted, which contributed to the evaluation of the relationship that this group of OGD users has with the overall OGD ecosystem. All of the primary qualitative data used were based on the perceptions of the startups’ founders, collected through semi-structured interviews. The research is characterized as a descriptive and interpretative multi-case study. The results show strong relationships between OGD and sustainable value, signaling the potential for these organizations to be multipliers of a scalable solution that generates sustainable value through OGD. The author hopes that the research enriches and opens lines of investigation regarding the potential for private sector startups to contribute to the evolution of the OGD ecosystem.

    Adaptive algorithms for real-world transactional data mining.

    The accurate identification of the right customer to target with the right product at the right time, through the right channel, to satisfy the customer’s evolving needs, is a key performance driver and enhancer for businesses. Data mining is an analytic process designed to explore usually large amounts of data (typically business or market related) in search of consistent patterns and/or systematic relationships between variables for the purpose of generating explanatory/predictive data models from the detected patterns. It provides an effective and established mechanism for accurate identification and classification of customers. Data models derived from the data mining process can aid in effectively recognizing the status and preference of customers, individually and as a group. Such data models can be incorporated into business market segmentation, customer targeting and channelling decisions with the goal of maximizing the total customer lifetime profit. However, due to costs, privacy and/or data protection reasons, the customer data available for data mining is often restricted to verified and validated data (in most cases, only the business-owned transactional data is available). Transactional data is a valuable resource for generating such data models. Transactional data can be electronically collected and readily made available for data mining in large quantity at minimum extra cost. Transactional data is, however, inherently sparse and skewed. These inherent characteristics give rise to the poor performance of data models built using customer data based on transactional data. Data models for identifying, describing, and classifying customers, constructed using evolving transactional data, thus need to effectively handle the inherent sparseness and skewness of evolving transactional data in order to be efficient and accurate.
Using real-world transactional data, this thesis presents the findings and results from the investigation of data mining algorithms for analysing, describing, identifying and classifying customers with evolving needs. In particular, methods for handling the issues of scalability, uncertainty and adaptation whilst mining evolving transactional data are analysed and presented. A novel application of a new framework for integrating transactional data binning and classification techniques is presented alongside an effective prototype selection algorithm for efficient transactional data model building. A new change mining architecture for monitoring, detecting and visualizing the change in customer behaviour using transactional data is proposed and discussed as an effective means for analysing and understanding the change in customer buying behaviour over time. Finally, the challenging problem of discerning between the change in the customer profile (which may necessitate the effective change of the customer’s label) and the change in performance of the model(s) (which may necessitate changing or adapting the model(s)) is introduced and discussed by way of a novel flexible and efficient architecture for classifier model adaptation and customer profile class relabeling.

    Fintech report

    FinTech refers to Financial Technology companies, particularly businesses that use technology to provide products or solutions in the field of finance. The main goal of this report is to depict the Iberian FinTech environment. To do this, the report is broadly divided into three parts: the global and European FinTech environments, which show the growth of financial technology and its main players both in Europe and in the rest of the world, and a deep dive into Iberia, where the FinTech market is showcased through thorough research of its main players, sectors and industries.