46 research outputs found

    Temporal Information in Data Science: An Integrated Framework and its Applications

    Data science is a well-known buzzword that is in fact composed of two distinct keywords, i.e., data and science. Data itself is of great importance: each analysis task begins from a set of examples. Based on such a consideration, the present work starts with the analysis of a real case scenario, by considering the development of a data warehouse-based decision support system for an Italian contact center company. Then, relying on the information collected in the developed system, a set of machine learning-based analysis tasks has been developed to answer specific business questions, such as employee work anomaly detection and automatic call classification. Although such initial applications rely on already available algorithms, as we shall see, some clever analysis workflows also had to be developed. Afterwards, continuously driven by real data and real-world applications, we turned to the question of how to handle temporal information within classical decision tree models. Our research led to the development of J48SS, a decision tree induction algorithm based on Quinlan's C4.5 learner, which is capable of dealing with temporal (e.g., sequential and time series) as well as atemporal (such as numerical and categorical) data during the same execution cycle. The decision tree has been applied to several real-world analysis tasks, proving its worth. A key characteristic of J48SS is its interpretability, an aspect that we specifically addressed through the study of an evolutionary-based decision tree pruning technique. Next, since a lot of work concerning the management of temporal information has already been done in the automated reasoning and formal verification fields, a natural direction in which to proceed was that of investigating how such solutions may be combined with machine learning, following two main tracks. 
First, we show, through the development of an enriched decision tree capable of encoding temporal information by means of interval temporal logic formulas, how a machine learning algorithm can successfully exploit temporal logic to perform data analysis. Then, we focus on the opposite direction, i.e., that of employing machine learning techniques to generate temporal logic formulas, considering a natural language processing scenario. Finally, as a conclusive development, the architecture of a system is proposed, in which formal methods and machine learning techniques are seamlessly combined to perform anomaly detection and predictive maintenance tasks. Such an integration represents an original, thrilling research direction that may open up new ways of dealing with complex, real-world problems.

    Blockchain-IoT-driven nursing workforce planning for effective long-term care management in nursing homes

    Due to the global ageing population, the increasing demand for long-term care services for the elderly has directed considerable attention towards the renovation of nursing homes. Although nursing homes play an essential role within residential elderly care, professional shortages have created serious pressure on the elderly service sector. Effective workforce planning is vital for improving the efficacy and workload balance of existing nursing staff in today's complex and volatile long-term care service market. Currently, there is a lack of an integrated solution to monitor care services and determine the optimal nursing staffing strategy in nursing homes. This study addresses the above challenge through the formulation of nursing staffing optimisation under the blockchain-internet of things (BIoT) environment. Embedding a blockchain into IoT establishes the long-term care platform for the elderly and care workers, thereby decentralising long-term care information in the nursing home network to achieve effective care service monitoring. Moreover, such information is further utilised to optimise nursing staffing by using a genetic algorithm. A case study of a Hong Kong nursing home was conducted to illustrate the effectiveness of the proposed system. We found that the total monthly staffing cost under the proposed model, which considers the use of a heterogeneous workforce and temporary staff, was significantly lower than under the existing practice, a change of -13.48%. In addition, care monitoring and staffing flexibility are further enhanced by integrating the concept of skill substitution into nursing staffing optimisation.

    A Polyhedral Study of Mixed 0-1 Set

    We consider a variant of the well-known single node fixed charge network flow set with constant capacities. This set arises from the relaxation of more general mixed integer sets such as lot-sizing problems with multiple suppliers. We provide a complete polyhedral characterization of the convex hull of the given set.
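
For orientation, the standard single node fixed charge flow set with a constant capacity is usually written as below. This is the textbook form only: the abstract does not say how the studied variant differs from it, and the symbols $N^+$, $N^-$, $a$ and $b$ are notation assumed here for illustration.

```latex
X = \Bigl\{ (x, y) \in \mathbb{R}^{n}_{+} \times \{0,1\}^{n} \;:\;
      \sum_{j \in N^{+}} x_j - \sum_{j \in N^{-}} x_j \le b, \quad
      x_j \le a\, y_j \ \ \text{for all } j \in N^{+} \cup N^{-} \Bigr\}
```

Here $x_j$ is the flow on arc $j$, $y_j$ is the binary variable that opens the arc, $a$ is the common arc capacity and $b$ is the net inflow allowed at the node.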

    Explainable clinical decision support system: opening black-box meta-learner algorithm expert's based

    Mathematical optimization methods are the basic mathematical tools of all artificial intelligence theory. In the field of machine learning and deep learning, the examples from which algorithms learn (training data) are used by sophisticated cost functions whose solutions can be obtained in closed form or through approximations. The interpretability of the models used and their relative transparency, as opposed to the opacity of black boxes, is related to how the algorithm learns, and this occurs through the optimization and minimization of the errors that the machine makes in the learning process. In particular, the present work introduces a new method for determining the weights in an ensemble model, supervised or unsupervised, based on the well-known Analytic Hierarchy Process (AHP) method. This method is based on the concept that behind the choice of the different possible algorithms to be used in a machine learning problem, there is an expert who controls the decision-making process. The expert assigns a complexity score to each algorithm (based on the concept of the complexity-interpretability trade-off), which determines the weight with which each model contributes to the training and prediction phase. In addition, different methods are presented to evaluate the performance of these algorithms and explain how each feature in the model contributes to the prediction of the outputs. The interpretability techniques used in machine learning are also combined with the AHP-based method in the context of clinical decision support systems in order to make the (black-box) algorithms and their results interpretable and explainable, so that clinical decision-makers can take controlled decisions in line with the concept of the "right to explanation" introduced by the legislator, because decision-makers bear civil and legal responsibility for choices in the clinical field that are based on systems making use of artificial intelligence. 
Equally central is the interaction between the expert who controls the algorithm construction process and the domain expert, in this case the clinical one. Three applications on real data are implemented with the methods known in the literature and with those proposed in this work: one application concerns cervical cancer, another concerns diabetes, and the last one focuses on a specific pathology developed by HIV-infected individuals. All applications are supported by plots, tables and explanations of the results, implemented through Python libraries. The main case study of this thesis, regarding HIV-infected individuals, concerns an unsupervised ensemble-type problem in which a series of clustering algorithms is applied to a set of features; their outputs are in turn used as a set of meta-features to provide a set of labels for each given cluster. The meta-features and labels obtained by choosing the best algorithm are used to train a logistic regression meta-learner, which in turn is used, through some explainability methods, to quantify the contribution that each algorithm made in the training phase. The use of logistic regression as a meta-learner classifier is motivated by the fact that it provides appreciable results and by the easy explainability of its estimated coefficients.
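
The idea of deriving ensemble weights from expert judgements with AHP can be illustrated with a minimal sketch. It uses the standard AHP eigenvector method on a reciprocal pairwise comparison matrix. The abstract does not give the thesis's actual procedure, so the comparison values, the function names `ahp_weights` and `weighted_vote`, and the use of the weights for probability averaging are all assumptions made for illustration.

```python
import numpy as np

def ahp_weights(pairwise):
    """Standard AHP: weights from the principal eigenvector of a
    reciprocal pairwise comparison matrix, plus the consistency index."""
    a = np.asarray(pairwise, dtype=float)
    n = a.shape[0]
    eigvals, eigvecs = np.linalg.eig(a)
    k = np.argmax(eigvals.real)            # index of the principal eigenvalue
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                        # normalise so the weights sum to 1
    lam_max = eigvals[k].real
    ci = (lam_max - n) / (n - 1) if n > 1 else 0.0
    return w, ci

def weighted_vote(probas, weights):
    """Combine per-model class probabilities of shape
    (n_models, n_samples, n_classes) into (n_samples, n_classes)."""
    return np.tensordot(weights, np.asarray(probas, dtype=float), axes=1)

# Hypothetical expert judgements for three models: entry [i][j] says how
# strongly model i is preferred over model j (e.g. for being more interpretable).
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights, ci = ahp_weights(pairwise)
```

A consistency index near zero indicates that the expert's pairwise judgements are nearly transitive; the resulting `weights` can then scale each model's contribution, for example through `weighted_vote`.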

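    Modelo de decisión para el diseño conceptual de un sistema de suministro sostenible de energía para la Sede Leticia de la Universidad Nacional de Colombia [Decision model for the conceptual design of a sustainable energy supply system for the Leticia campus of the Universidad Nacional de Colombia]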

    The decentralized model of energy generation has emerged as a solution to provide electricity to isolated areas, ensuring energy security and increasing coverage. This model frequently leads to a dependency on a single energy source; thus, it is necessary to change the paradigm of energy generation by adding other, more sustainable sources. Unfortunately, there is no well-defined route to establish which energy sources should be linked and in what way, making this restructuring a very complex problem involving a decision-making process. Generally, decisions are made considering only the economic or technical dimensions, ignoring other dimensions such as the environmental, social, and political ones, which could provide a more contextualized perspective. The aim of this study is to develop and test a methodology to find an optimal arrangement of energy sources in a decentralized electricity production model considering all sustainability dimensions. A methodology such as the one proposed in this work can support stakeholders during the planning stages of energy supply systems. The methodology was applied to a specific case in Colombia, the Amazonia campus of the Universidad Nacional de Colombia, located in Leticia, a municipality where on-site generators are employed due to the difficulty of access. As a result, the proposed methodology generated nine different scenarios of energy arrangements according to an evaluation of energy sources using a sustainability approach that considered context aspects along with a carefully selected set of indicators and stakeholders' preferences. (Text taken from the source.) Includes illustrations, graphs, tables, and appendices. Master's thesis.

    Fuelling the zero-emissions road freight of the future: routing of mobile fuellers

    The future of zero-emissions road freight is closely tied to the sufficient availability of new and clean fuel options such as electricity and Hydrogen. In goods distribution using Electric Commercial Vehicles (ECVs) and Hydrogen Fuel Cell Vehicles (HFCVs), a major challenge in the transition period would pertain to their limited autonomy and to scarce and unevenly distributed refuelling stations. One viable solution to facilitate and speed up the adoption of ECVs/HFCVs in logistics, however, is to get the fuel to the point where it is needed (instead of diverting the route of delivery vehicles to refuelling stations) using "Mobile Fuellers (MFs)". These are mobile battery swapping/recharging vans or mobile Hydrogen fuellers that can travel to a running ECV/HFCV, at a rendezvous time and place, to provide the fuel it requires to complete its delivery route. In this presentation, new vehicle routing models will be presented for a third-party company that provides MF services. In the proposed problem variant, the MF provider company receives the routing plans of multiple customer companies and has to design routes for a fleet of capacitated MFs that have to synchronise their routes with the running vehicles to deliver the required amount of fuel on-the-fly. This presentation will discuss and compare several mathematical models based on different business models and collaborative logistics scenarios.

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
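
To make the MapReduce streaming pattern concrete, here is a minimal sketch of one generation of discrete Conway's Life written as a mapper and a reducer over tab-separated text lines, the format that streaming jobs exchange. It is a naive per-cell formulation on an unbounded grid, not the optimized strip-partitioning algorithm evaluated in the paper; the record layout and the names `mapper`, `reducer` and `step` are assumptions made for illustration, and the sort between the two phases is simulated locally with `sorted`.

```python
from itertools import groupby

def mapper(lines):
    """For each live cell 'x y', emit an 'A' (alive) record for the cell
    itself and an 'N' (neighbour) record for each of its 8 neighbours."""
    for line in lines:
        x, y = map(int, line.split())
        yield f"{x},{y}\tA"
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx or dy:
                    yield f"{x + dx},{y + dy}\tN"

def reducer(sorted_lines):
    """Group records by cell key and apply the Life rule: a cell is alive
    next turn with exactly 3 live neighbours, or with 2 if already alive."""
    for key, group in groupby(sorted_lines, key=lambda rec: rec.split("\t")[0]):
        tags = [rec.split("\t")[1] for rec in group]
        alive = "A" in tags
        neighbours = tags.count("N")
        if neighbours == 3 or (alive and neighbours == 2):
            x, y = key.split(",")
            yield f"{x} {y}"

def step(cells):
    """One generation: map, sort (the shuffle phase), then reduce."""
    return sorted(reducer(sorted(mapper(cells))))

# A horizontal "blinker" becomes vertical after one generation.
blinker = ["0 1", "1 1", "2 1"]
next_gen = step(blinker)
```

In an actual streaming job the mapper and reducer would be separate scripts reading standard input and writing standard output, with the framework performing the sort between them.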