
    CBR and MBR techniques: review for an application in the emergencies domain

    The purpose of this document is to provide an in-depth analysis of current reasoning-engine practice and of the integration strategies for Case-Based Reasoning (CBR) and Model-Based Reasoning (MBR) that will be used in the design and development of the RIMSAT system. RIMSAT (Remote Intelligent Management Support and Training) is a European Commission funded project designed to: (a) provide an innovative, 'intelligent', knowledge-based solution aimed at improving the quality of critical decisions, and (b) enhance the competencies and responsiveness of individuals and organisations involved in highly complex, safety-critical incidents, irrespective of their location. In other words, RIMSAT aims to design and implement a decision support system that applies Case-Based Reasoning together with Model-Based Reasoning technology to the management of emergency situations. This document is part of a deliverable for the RIMSAT project, and although it was written in close contact with the requirements of the project, it provides an overview broad enough to serve as a state of the art in integration strategies between CBR and MBR technologies.
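
    As a rough illustration of the kind of CBR/MBR integration the document surveys, the sketch below retrieves the most similar past emergency case and then applies a simple model-based rule check to validate the retrieved solution. All names here (EmergencyCase, the feature weights, the validation rule) are hypothetical and are not taken from RIMSAT.

```python
# Hypothetical sketch of a CBR retrieval step validated by a model-based check.
# None of these names or rules come from RIMSAT; they only illustrate the pattern.
from dataclasses import dataclass

@dataclass
class EmergencyCase:
    features: dict          # numeric descriptors of the incident (0..1 scale)
    solution: str           # action recorded for the past incident

def similarity(a: dict, b: dict, weights: dict) -> float:
    """Weighted similarity over shared numeric features (1 = identical)."""
    keys = a.keys() & b.keys()
    total = sum(weights.get(k, 1.0) for k in keys)
    score = sum(weights.get(k, 1.0) * (1 - abs(a[k] - b[k])) for k in keys)
    return score / total if total else 0.0

def model_check(query: dict, solution: str) -> bool:
    """Toy model-based validation: reject evacuation plans for low-severity incidents."""
    return not (solution == "full_evacuation" and query.get("severity", 0) < 0.3)

def retrieve(query: dict, case_base: list, weights: dict) -> str:
    ranked = sorted(case_base, key=lambda c: similarity(query, c.features, weights), reverse=True)
    for case in ranked:                      # fall through to the next case if the model rejects one
        if model_check(query, case.solution):
            return case.solution
    return "no_valid_case"

cases = [EmergencyCase({"severity": 0.9, "spread": 0.7}, "full_evacuation"),
         EmergencyCase({"severity": 0.2, "spread": 0.1}, "local_containment")]
print(retrieve({"severity": 0.25, "spread": 0.2}, cases, {"severity": 2.0, "spread": 1.0}))
```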

    Facing-up Challenges of Multiobjective Clustering Based on Evolutionary Algorithms: Representations, Scalability and Retrieval Solutions

    This thesis is focused on multiobjective clustering algorithms, which optimize several objectives simultaneously and obtain a collection of potential solutions with different trade-offs among the objectives. The goal of the thesis is to design and implement a new multiobjective clustering technique based on evolutionary algorithms that faces three current challenges related to these techniques. The first challenge is to properly define the area of possible solutions that is explored in order to find the best solution, which depends on the knowledge representation. The second challenge is to scale up the system by splitting the original data set into several subsets in order to work with less data in the clustering process. The third challenge is to retrieve the most suitable solution, according to the quality and shape of the clusters, from the most interesting region of the collection of solutions returned by the algorithm.
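
    A minimal sketch of the Pareto-trade-off idea behind multiobjective clustering: each candidate partition is scored on two objectives (here, within-cluster compactness and number of clusters, both to be minimized) and only non-dominated solutions are kept. The objectives, data and use of k-means are illustrative assumptions, not the thesis's actual formulation.

```python
# Illustrative Pareto filtering of candidate clusterings (not the thesis's actual algorithm).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 2))

def objectives(k: int):
    """Score one candidate: (compactness = inertia, model complexity = k), both minimized."""
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data)
    return model.inertia_, float(k)

candidates = {k: objectives(k) for k in range(2, 10)}

def dominated(a, b):
    """True if solution b is at least as good as a on every objective and better on one."""
    return all(bi <= ai for ai, bi in zip(a, b)) and any(bi < ai for ai, bi in zip(a, b))

pareto = {k: obj for k, obj in candidates.items()
          if not any(dominated(obj, other) for o, other in candidates.items() if o != k)}
print("non-dominated k values:", sorted(pareto))
```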

    A Principled Methodology: A Dozen Principles of Software Effort Estimation

    Software effort estimation (SEE) is the activity of estimating the total effort required to complete a software project. Correctly estimating the effort required for a software project is of vital importance for the competitiveness of organizations. Both under- and over-estimation lead to undesirable consequences. Under-estimation may result in overruns in budget and schedule, which in turn may cause the cancellation of projects, thereby wasting the entire effort spent until that point. Over-estimation may cause promising projects not to be funded, hence harming organizational competitiveness. Due to the significant role of SEE for software organizations, there is a considerable research effort invested in SEE. Thanks to the accumulation of decades of prior research, today we are able to identify the core issues and search for the right principles to tackle pressing questions. For example, despite decades of work, we still lack concrete answers to important questions such as: What is the best SEE method? The introduced estimation methods make use of local data, yet not all companies have their own data, so: How can we handle the lack of local data? Common SEE methods take size attributes for granted, yet size attributes are costly and practitioners place very little trust in them, hence we ask: How can we avoid the use of size attributes? Collection of data, particularly dependent variable information (i.e. effort values), is costly: How can we find an essential subset of the SEE data sets? Finally, studies make use of sampling methods to justify a new method's performance on SEE data sets, yet the trade-offs among different variants are ignored: How should we choose sampling methods for SEE experiments? This thesis is a rigorous investigation towards the identification and tackling of these pressing issues in SEE. Our findings rely on extensive experimentation performed with a large corpus of estimation techniques on a large set of public and proprietary data sets. We summarize our findings and industrial experience in the form of 12 principles: 1) Know your domain 2) Let the Experts Talk 3) Suspect your data 4) Data Collection is Cyclic 5) Use a Ranking Stability Indicator 6) Assemble Superior Methods 7) Weighting Analogies is Over-elaboration 8) Use Easy-path Design 9) Use Relevancy Filtering 10) Use Outlier Pruning 11) Combine Outlier and Synonym Pruning 12) Be Aware of Sampling Method Trade-off
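
    As a rough illustration of the analogy-based estimation setting that several of these principles refer to (e.g., weighting analogies, relevancy filtering), the sketch below estimates effort for a new project as the median effort of its k nearest past projects. The feature set and the data are invented for illustration; this is not the thesis's experimental setup.

```python
# Toy analogy-based effort estimation (k nearest neighbours over past projects).
# Features and values are invented; this is not the thesis's experimental setup.
import numpy as np

past_features = np.array([[10, 3, 0.8],    # e.g. team size, duration (months), complexity
                          [4, 6, 0.5],
                          [12, 12, 0.9],
                          [3, 2, 0.3]], dtype=float)
past_effort = np.array([120.0, 90.0, 400.0, 25.0])   # person-months

def estimate(query, k=2):
    """Median effort of the k most similar past projects (Euclidean distance on scaled features)."""
    scale = past_features.max(axis=0)                 # crude normalisation
    dist = np.linalg.norm(past_features / scale - np.asarray(query) / scale, axis=1)
    nearest = np.argsort(dist)[:k]
    return float(np.median(past_effort[nearest]))

print(estimate([5, 5, 0.5]))
```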

    Framework for data quality in knowledge discovery tasks

    The creation and consumption of data continue to grow by leaps and bounds. Due to advances in Information and Communication Technologies (ICT), the data explosion in the digital universe is a new trend, and Knowledge Discovery in Databases (KDD) gains importance because of this abundance of data. A successful knowledge discovery process requires careful data preparation: experts affirm that the preprocessing phase takes 50% to 70% of the total time of a knowledge discovery process. Software tools based on knowledge discovery methodologies offer algorithms for data preprocessing. According to the Gartner 2018 Magic Quadrant for Data Science and Machine Learning Platforms, KNIME, RapidMiner, SAS, Alteryx and H2O.ai are the leading tools for knowledge discovery. These tools provide different techniques and facilitate the evaluation of data analysis; however, they lack any kind of guidance as to which techniques can or should be used in which contexts. Consequently, selecting suitable data cleaning techniques is a headache for inexperienced users, who have no idea which methods can be confidently used and often resort to trial and error. This thesis presents three contributions to address these problems: (i) a conceptual framework that gives the user guidance for addressing data quality issues in knowledge discovery tasks, (ii) a case-based reasoning system that recommends suitable algorithms for data cleaning, and (iii) an ontology that represents knowledge about data quality issues and data cleaning methods. This ontology also supports the case-based reasoning system in the case representation and reuse phases.
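
    A minimal sketch of the recommendation idea, assuming a case is a profile of data-quality symptoms mapped to a cleaning algorithm that worked before; retrieval returns the cleaning method of the closest past case. The symptom names and cleaning methods are hypothetical and are not taken from the thesis.

```python
# Hypothetical CBR-style recommender for data-cleaning algorithms.
# Cases map a data-quality profile (fractions of affected cells) to a cleaning method.
case_base = [
    ({"missing": 0.30, "outliers": 0.02, "duplicates": 0.01}, "knn_imputation"),
    ({"missing": 0.02, "outliers": 0.15, "duplicates": 0.01}, "iqr_outlier_removal"),
    ({"missing": 0.01, "outliers": 0.01, "duplicates": 0.20}, "record_deduplication"),
]

def distance(a: dict, b: dict) -> float:
    """Manhattan distance over the shared quality indicators."""
    return sum(abs(a[k] - b.get(k, 0.0)) for k in a)

def recommend(profile: dict) -> str:
    """Return the cleaning method of the most similar past case (the reuse step)."""
    _, best_method = min((distance(profile, feats), method) for feats, method in case_base)
    return best_method

print(recommend({"missing": 0.25, "outliers": 0.03, "duplicates": 0.02}))
```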

    Constrained optimisation of distributed problems through cooperative self-organisation

    We solve problems and make decisions all day long, and some of them are very challenging: What is the best itinerary to deliver orders given the weather, the traffic and the hour? How can product manufacturing performance be improved? These problems are characterized by a high level of complexity due to the heterogeneity and diversity of the participating actors, the increasing volume of manipulated data and the dynamics of the application environments. Classical solving approaches have shown their limits in coping with this growing complexity. For the last several years, the scientific community has been interested in developing new solutions based on distributed computation and decentralized control. The AMAS (Adaptive Multi-Agent Systems) theory proposes to build solutions based on self-adaptive multi-agent systems using cooperative self-organization. This theory has shown its adequacy for solving various complex and dynamic problems, but it remains at a high level of abstraction. This work proposes a specialization of the theory for complex constrained optimization problem solving, making its use accessible to engineers who are not AMAS experts. To this end, the AMAS4Opt agent model, with cooperative, local and generic behaviours and interactions, has been defined. The model is validated on two well-known optimization problems: scheduling in manufacturing control and complex product design. Finally, in order to show the robustness and adequacy of the developed solutions, a set of evaluation criteria is proposed to underline the advantages and limits of adaptive systems and to compare them with existing systems.
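
    The sketch below only illustrates the general decentralized-control idea the abstract describes: each agent owns one decision variable and repeatedly makes the locally most cooperative move, i.e. the one that most reduces the shared constraint violation. It is a generic min-conflicts style local search, not the AMAS4Opt model itself.

```python
# Illustrative decentralized local search: each agent owns one variable and makes the
# locally most "cooperative" move (the one that most reduces total constraint violation).
# This is only a sketch of the decentralized-control idea, not the AMAS4Opt model.
import random

random.seed(1)
n = 6
values = [random.randint(0, n - 1) for _ in range(n)]           # agent i owns values[i]

def conflicts(vals):
    """Number of violated all-different constraints (pairs holding the same value)."""
    return sum(vals[i] == vals[j] for i in range(n) for j in range(i + 1, n))

for _ in range(50):                                              # repeated local-decision sweeps
    for agent in range(n):
        best = min(range(n), key=lambda v: conflicts(values[:agent] + [v] + values[agent + 1:]))
        values[agent] = best                                     # local decision, global effect
    if conflicts(values) == 0:
        break

print(values, "conflicts:", conflicts(values))
```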

    Enhanced evolutionary algorithm with cuckoo search for nurse scheduling and rescheduling problem

    Nurse shortage, uncertain absenteeism and stress are the constituents of an unhealthy working environment in a hospital. These matters affect nurses' social lives and lead to medication errors that threaten patients' safety, which in turn lead to nurse turnover and low-quality service. To address some of these issues, making the best use of the existing nurses through an effective work schedule is the best alternative. However, the shift schedules produced for nurses are often undesirable and unstable. This research therefore attempts to overcome these challenges by integrating the nurse scheduling and rescheduling problems, which have normally been addressed separately in previous studies. When impromptu schedule changes are required and a large number of constraints must be satisfied, most scheduling and rescheduling approaches lack flexibility. Embedding such flexibility provides a platform for enhancing the Evolutionary Algorithm (EA), which has been identified as the solution approach. Therefore, to minimize constraint violations and make small but careful changes to a postulated schedule during a disruption, an integrated model of EA with Cuckoo Search (CS) is proposed, in which a restriction-enzyme concept is adapted within the CS. A total of 11 EA model variants were constructed with three new parent selections, two new crossovers, and a crossover-based retrieval operator, which are the theoretical contributions. The proposed EA with Discovery Rate Tournament and Cuckoo Search Restriction Enzyme Point Crossover (DᵣT_CSREP) model emerges as the most effective, producing 100% feasible schedules with the minimum penalty value. Moreover, all tested disruptions were solved successfully through the preretrieval and Cuckoo Search Restriction Enzyme Point Retrieval (CSREPᵣ) operators. Consequently, the EA model is able to fulfil nurses' preferences, offer fair on-call delegation, provide better-quality shift changes during rescheduling, and give insight into the two-way dependency between scheduling and rescheduling by examining the seriousness of disruptions.
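
    For orientation only, the sketch below shows the basic penalty-minimizing evolutionary loop that such nurse-rostering models build on: a roster is a nurse-by-day grid of shifts, the penalty counts soft-constraint violations, and the population is improved by selection and mutation. The constraints, weights and operators are simplified stand-ins, not the thesis's DᵣT_CSREP or CSREPᵣ operators.

```python
# Minimal penalty-minimizing evolutionary algorithm for a toy nurse roster.
# Constraints and operators are simplified stand-ins, not the thesis's model.
import random

random.seed(0)
NURSES, DAYS, SHIFTS = 5, 7, 3          # shift 0 = off, 1 = day, 2 = night

def penalty(roster):
    """Soft-constraint violations: under-coverage and night-to-day transitions."""
    p = 0
    for d in range(DAYS):
        day_cover = sum(roster[n][d] == 1 for n in range(NURSES))
        night_cover = sum(roster[n][d] == 2 for n in range(NURSES))
        p += max(0, 2 - day_cover) + max(0, 1 - night_cover)     # required coverage: 2 day, 1 night
    for n in range(NURSES):
        p += sum(roster[n][d] == 2 and roster[n][d + 1] == 1 for d in range(DAYS - 1))
    return p

def random_roster():
    return [[random.randrange(SHIFTS) for _ in range(DAYS)] for _ in range(NURSES)]

def mutate(roster):
    child = [row[:] for row in roster]
    child[random.randrange(NURSES)][random.randrange(DAYS)] = random.randrange(SHIFTS)
    return child

population = [random_roster() for _ in range(30)]
for _ in range(300):                                             # (mu + lambda)-style loop
    population.sort(key=penalty)
    population = population[:10] + [mutate(random.choice(population[:10])) for _ in range(20)]

best = min(population, key=penalty)
print("best penalty:", penalty(best))
```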

    A new strategy for case-based reasoning retrieval using classification based on association

    Case-Based Reasoning (CBR) is an important area of research in the field of Artificial Intelligence. It aims to solve new problems by adapting solutions that were used to solve previous similar ones. Among the four typical phases - retrieve, reuse, revise and retain - retrieval is a key phase in the CBR approach, as the retrieval of wrong cases can lead to wrong decisions. To accomplish the retrieval process, a CBR system exploits Similarity-Based Retrieval (SBR). However, SBR tends to depend strongly on similarity knowledge, ignoring other forms of knowledge that can further improve retrieval performance. The aim of this study is to integrate class association rules (CARs), a special case of association rules (ARs), to discover a set of rules that can form an accurate classifier over a database; this is an efficient method for building a classifier when the target is pre-determined. The proposition of this research is to answer the question of whether CARs can be integrated into a CBR system. A new strategy is proposed that mines class association rules from previous cases, which can strengthen similarity-based retrieval. The proposition question is answered by adapting the pattern of CARs to be compared at the end of the retrieval phase. Previous experiments and their results to date show a link between CARs and CBR cases, and this link has been developed to achieve the aim and objectives. A novel strategy, Case-Based Reasoning using Association Rules (CBRAR), is proposed to improve the performance of SBR and to disambiguate wrongly retrieved cases in CBR. CBRAR uses CARs to generate an optimum frequent pattern tree (FP-tree) which holds a value for each node. The possible advantage offered is that more efficient results can be gained when SBR returns uncertain answers. In addition, CBRAR has been evaluated using two CBR frameworks - jCOLIBRI and FreeCBR - with the experimental evaluation on real datasets indicating that the proposed CBRAR is a better approach when compared to CBR systems, offering higher accuracy and a lower error rate.
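
    A toy illustration of the underlying idea of backing up similarity-based retrieval with class association rules: rules of the form itemset -> class are mined from past cases, and when SBR returns competing candidates, the candidate supported by the matching rules is preferred. The rule mining here is a naive single- and two-attribute enumeration, not the FP-tree construction used by CBRAR, and the case data are invented.

```python
# Toy illustration of class association rules (CARs) backing up similarity-based retrieval.
# Rule mining here is a naive enumeration, not the thesis's FP-tree method; data are invented.
from collections import Counter
from itertools import combinations

cases = [                                   # (attribute set, class label) pairs
    ({"fever", "cough"}, "flu"),
    ({"fever", "rash"}, "measles"),
    ({"cough", "sneeze"}, "cold"),
    ({"fever", "cough", "ache"}, "flu"),
]

def mine_cars(cases, min_conf=0.8):
    """Enumerate itemset -> class rules whose confidence reaches min_conf."""
    rules = {}
    items = set().union(*(attrs for attrs, _ in cases))
    for size in (1, 2):
        for itemset in combinations(sorted(items), size):
            covered = [label for attrs, label in cases if set(itemset) <= attrs]
            if covered:
                label, count = Counter(covered).most_common(1)[0]
                if count / len(covered) >= min_conf:
                    rules[frozenset(itemset)] = label
    return rules

def disambiguate(query, sbr_candidates, rules):
    """Prefer the SBR candidate whose class agrees with the matching rules."""
    votes = Counter(cls for itemset, cls in rules.items() if itemset <= query)
    return max(sbr_candidates, key=lambda c: votes.get(c, 0))

rules = mine_cars(cases)
print(disambiguate({"fever", "cough"}, ["cold", "flu"], rules))
```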

    A case-based reasoning approach to improve risk identification in construction projects

    Risk management is an important process for enhancing the understanding of a project so as to support decision making. Despite well-established existing methods, the application of risk management in practice is frequently poor. The reasons investigated for this include accuracy, complexity, the time and cost involved, and a lack of knowledge sharing. Appropriate risk identification is fundamental for successful risk management. Well-known risk identification methods require expert knowledge, hence risk identification depends on the involvement and sophistication of experts. Subjective judgment and intuition usually form part of experts' decisions, and sharing and transferring this knowledge is restricted by the availability of experts. Further, psychological research has shown that people have limitations in coping with complex reasoning. In order to reduce subjectivity and enhance knowledge sharing, artificial intelligence techniques can be utilised. An intelligent system accumulates retrievable knowledge and reasons in an impartial way so that a commonly acceptable solution can be achieved. Case-based reasoning enables learning from experience, which matches the manner in which human experts capture and process information and knowledge in relation to project risks. A case-based risk identification model is developed to support human experts in making final decisions. This approach exploits the advantage of knowledge sharing, increasing confidence and efficiency in investment decisions, and enhancing communication among project participants.
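
    As a minimal sketch of case-based risk identification, assume each past project is stored with the risks that actually materialised; a new project then inherits the risks of its most similar predecessors. The project attributes, similarity weights and risk names below are invented for illustration and are not the thesis's model.

```python
# Hypothetical sketch of case-based risk identification for construction projects.
# Attributes, weights and risks are invented, not the thesis's model.
past_projects = [
    ({"type": "bridge", "budget_m": 50, "duration_y": 3}, ["ground conditions", "steel price"]),
    ({"type": "bridge", "budget_m": 10, "duration_y": 1}, ["permit delays"]),
    ({"type": "tower",  "budget_m": 80, "duration_y": 4}, ["wind load design", "crane availability"]),
]

def similarity(a: dict, b: dict) -> float:
    """Mix of categorical match and normalised numeric closeness."""
    s = 1.0 if a["type"] == b["type"] else 0.0
    s += 1 - abs(a["budget_m"] - b["budget_m"]) / 100
    s += 1 - abs(a["duration_y"] - b["duration_y"]) / 10
    return s / 3

def identify_risks(query: dict, k: int = 2) -> list:
    """Union of the risk lists of the k most similar past projects (the retrieval step)."""
    ranked = sorted(past_projects, key=lambda p: similarity(query, p[0]), reverse=True)
    risks = []
    for _, project_risks in ranked[:k]:
        risks.extend(r for r in project_risks if r not in risks)
    return risks

print(identify_risks({"type": "bridge", "budget_m": 40, "duration_y": 2}))
```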