77 research outputs found

    Integration of Context Information through Probabilistic Ontological Knowledge into Image Classification

    The use of ontological knowledge to improve classification results is a promising line of research. The availability of a probabilistic ontology raises the possibility of combining the probabilities coming from the ontology with those produced by a multi-class classifier that detects particular objects in an image. This combination not only provides the relations existing between the different segments, but can also improve the classification accuracy, since contextual information often suggests the correct class. This paper proposes a model that implements this integration, and the experimental assessment shows the effectiveness of the integration, especially when the classifier’s accuracy is relatively low. To assess the performance of the proposed model, we designed and implemented a simulated classifier whose performance can be set a priori with sufficient precision.
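
One simple way to picture the combination of classifier and ontology probabilities is a product rule followed by renormalisation. The sketch below is an illustration under that assumption only; the abstract does not specify the paper's actual model, and the function name `fuse`, the class labels, and all probability values are hypothetical.

```python
def fuse(classifier_probs, context_probs):
    """Combine two per-class probability tables with a product rule.

    Assumption: both sources are treated as independent evidence, so the
    fused score of a class is the product of its two probabilities,
    renormalised to sum to 1. This is not taken from the paper.
    """
    joint = {c: p * context_probs.get(c, 0.0) for c, p in classifier_probs.items()}
    total = sum(joint.values())
    if total == 0.0:
        # The context rules out every class: fall back to the classifier alone.
        return dict(classifier_probs)
    return {c: p / total for c, p in joint.items()}


# Hypothetical numbers: the classifier slightly prefers "car", but the
# context (say, a segment adjacent to water) strongly favours "boat".
classifier_probs = {"boat": 0.40, "car": 0.45, "cow": 0.15}
context_probs = {"boat": 0.70, "car": 0.20, "cow": 0.10}

fused = fuse(classifier_probs, context_probs)
best_before = max(classifier_probs, key=classifier_probs.get)
best_after = max(fused, key=fused.get)
print(best_before, "->", best_after)
```

With these made-up numbers the context flips a weak classifier decision, which matches the abstract's observation that integration helps most when classifier accuracy is relatively low.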

    Three fundamental pillars of decision-centered teamwork

    This thesis introduces a novel paradigm in artificial intelligence: decision-centered teamwork. Decision-centered teamwork is the analysis of agent teams that iteratively take joint decisions to solve complex problems. Although teams of agents have been used to take decisions in many important domains, such as machine learning, crowdsourcing, forecasting systems, and even board games, a general framework for decision-centered teamwork has never been presented in the literature before. I divide decision-centered teamwork into three fundamental challenges: (i) Agent Selection, which consists of selecting a set of agents from an exponential universe of possible teams; (ii) Aggregation of Opinions, which consists of designing methods to aggregate the opinions of different agents into joint team decisions; (iii) Team Assessment, which consists of designing methods to identify whether a team is failing, allowing a “coordinator” to take remedial procedures. In this thesis, I handle all these challenges. For Agent Selection, I introduce novel models of diversity for teams of voting agents. My models rigorously show that teams made of the best agents are not necessarily optimal, and also clarify in which situations diverse teams should be preferred. In particular, I show that diverse teams get stronger as the number of actions increases, by analyzing how the agents’ probability distribution function over actions changes. This has never been presented before in the ensemble systems literature. I also show that diverse teams have great applicability to design problems, where the objective is to maximize the number of optimal solutions for human selection, combining for the first time social choice with number theory. All of these theoretical models and predictions are verified in real systems, such as Computer Go and architectural design. In particular, for architectural design I optimize the design of buildings with agent teams not only for cost and project requirements, but also for energy efficiency, making it an essential domain for sustainability. Concerning Aggregation of Opinions, I evaluate classical ranked voting rules from social choice in Computer Go, only to discover that plurality leads to the best results. This happens because real agents tend to have very noisy rankings. Hence, I create a ranking-by-sampling extraction technique, leading to significantly better results with the Borda voting rule. A similar study is also performed in the social networks domain, in the context of influence maximization. Additionally, I study a novel problem in social networks: I assume only a subgraph of the network is initially known, so that influence must be spread and the graph learned simultaneously. I analyze a linear combination of two greedy algorithms, which outperforms both of them. This domain has great potential for health, as I run experiments in four real-life social networks from the homeless population of Los Angeles, aiming at spreading HIV prevention information. Finally, with regard to Team Assessment, I develop a domain-independent team assessment methodology for teams of voting agents. My method is within a machine learning framework, and learns a prediction model over the voting patterns of a team, instead of learning over the possible states of the problem. The methodology is tested and verified in Computer Go and Ensemble Learning.
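
The two voting rules named in this abstract, plurality and Borda, can disagree on the same set of rankings. The following minimal sketch shows the standard definitions of both rules on a toy profile; the agent rankings and the alphabetical tie-breaking are assumptions made for illustration, not details from the thesis.

```python
from collections import Counter


def plurality(rankings):
    """Return the action ranked first by the most agents."""
    counts = Counter(ranking[0] for ranking in rankings)
    # sorted() makes ties resolve alphabetically (an assumption of this sketch).
    return max(sorted(counts), key=counts.get)


def borda(rankings):
    """Return the Borda winner: with m actions, position i earns m - 1 - i points."""
    scores = Counter()
    for ranking in rankings:
        m = len(ranking)
        for position, action in enumerate(ranking):
            scores[action] += m - 1 - position
    return max(sorted(scores), key=scores.get)


# Hypothetical team of four agents, each ranking three actions.
rankings = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["b", "c", "a"],
    ["c", "b", "a"],
]

print("plurality:", plurality(rankings))
print("borda:", borda(rankings))
```

Plurality only reads each agent's top choice, while Borda uses the full ranking, which is why Borda is more exposed to the noisy lower positions that the abstract mentions.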

    Models and algorithms for promoting diverse and fair query results

    Fairness and diversity of search results are two key concerns in search and recommendation applications. This work explicitly studies these two aspects given multiple users' preferences as inputs, in an effort to create a single ranking or top-k result set that satisfies different fairness and diversity criteria. From a group-fairness standpoint, it adapts demographic-parity-like group fairness criteria and proposes new models that are suitable for ranking or for producing a top-k set of results. This dissertation also studies equitable exposure of individual search results in long-tail data, a concept related to individual fairness. First, the dissertation focuses on aggregating ranks while achieving proportionate fairness (which ensures proportionate representation of every group) for multiple protected groups. Then, the dissertation explores how to minimally modify the original users' preferences under plurality voting, aiming to produce a top-k result set that satisfies complex fairness constraints. A concept referred to as manipulation by modifications is introduced, which involves making minimal changes to the original user preferences to ensure query satisfaction. This problem is formalized as the margin finding problem. A follow-up work studies this problem with a popular ranked-choice voting mechanism, Instant Run-off Voting (IRV), as the preference aggregation method. From the standpoint of individual fairness, this dissertation studies an exposure concern that top-k set based algorithms exhibit when the underlying data has long-tail properties, and designs techniques to make those results equitable. For result diversification, the work studies efficiency opportunities in existing diversification algorithms, and designs a generic access primitive called DivGetBatch() to enable them. The contributions of this dissertation lie in (a) formalizing principal problems and studying them analytically, (b) designing scalable algorithms with theoretical guarantees, and (c) an extensive experimental study that evaluates the efficacy and scalability of the designed solutions by comparing them with state-of-the-art solutions on large-scale datasets.
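
Instant Run-off Voting, the aggregation method mentioned above, repeatedly eliminates the candidate with the fewest first-place votes until one candidate holds a majority. The sketch below shows the basic procedure only; the ballots are hypothetical, and the alphabetical tie-breaking for eliminations is an assumption, since the abstract does not state one.

```python
from collections import Counter


def instant_runoff(ballots):
    """Return the IRV winner for a list of ranked ballots (best choice first)."""
    remaining = {candidate for ballot in ballots for candidate in ballot}
    if not remaining:
        raise ValueError("no candidates on any ballot")
    while True:
        # Count each ballot for its highest-ranked candidate still in the race.
        tally = Counter({candidate: 0 for candidate in remaining})
        for ballot in ballots:
            top = next((c for c in ballot if c in remaining), None)
            if top is not None:
                tally[top] += 1
        total = sum(tally.values())
        leader = max(sorted(tally), key=tally.get)
        if tally[leader] * 2 > total or len(remaining) == 1:
            return leader
        # Eliminate the weakest candidate; ties go to the alphabetically
        # first candidate (an assumption of this sketch).
        loser = min(sorted(tally), key=tally.get)
        remaining.remove(loser)


# Hypothetical profile: "a" leads on first choices but lacks a majority.
ballots = [["a", "b", "c"]] * 4 + [["b", "c", "a"]] * 3 + [["c", "b", "a"]] * 2
print(instant_runoff(ballots))
```

In this profile "c" is eliminated first and its two ballots transfer to "b", so the IRV winner differs from the plurality leader.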

    Individual-Based Modeling and Data Analysis of Ecological Systems Using Machine Learning Techniques

    Artificial life (Alife) studies the logic of living systems in an artificial environment in order to gain deeper insight into the complex processes and governing rules of such systems. EcoSim, an Alife simulation for ecological modeling, is an individual-based predator-prey ecosystem simulation and a generic platform designed to investigate several broad ecological questions, as well as long-term evolutionary patterns and processes in biology and ecology. Speciation and extinction of species are two essential phenomena in evolutionary biology. Many factors are involved in the emergence and disappearance of species. Due to the complexity of the interactions between different factors, such as the interaction of individuals with their environment, and the long time required for observation, studying such phenomena is not easy in the real world. Using data sets obtained from EcoSim and machine learning techniques, we predicted speciation and extinction of species based on numerous factors. Experimental results showed that factors such as demographics, genetics, and environment are important in the occurrence of these two events in EcoSim. We identified the best species-area relationship (SAR) models using EcoSim, and investigated how sampling approaches and sampling scales affect SARs. Further, we proposed a machine learning approach, based on the extraction of rules that provide an interpretation of SAR coefficients, to find plausible relationships between the models' coefficients and the spatial information that likely affects SARs. We found the power function family to be a reasonable choice for SAR. Furthermore, the simple power function was the best-ranked model in nested sampling amongst models with two coefficients. For some of the SAR model coefficients, we obtained clear correlations with spatial information, thereby providing an interpretation of these coefficients. Rule extraction is a method to discover the rules explaining a predictive model of a specific phenomenon. A procedure for rule extraction from Random Forest (RF) is proposed. The proposed methods are evaluated on eighteen data sets from the UCI Machine Learning Repository and four microarray data sets. Our experimental results show that the proposed methods outperform one of the state-of-the-art methods in terms of scalability and comprehensibility while preserving the same level of accuracy.
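
The simple power function for a species-area relationship is commonly written S = c * A^z, with S the number of species, A the sampled area, and c and z its two coefficients. A minimal sketch of fitting it by least squares on log-transformed data follows; the data are synthetic, the function name `fit_power_sar` is hypothetical, and the fitting procedure is an assumption rather than the one used in the thesis.

```python
import math


def fit_power_sar(areas, species):
    """Fit S = c * A**z by ordinary least squares on log S = log c + z * log A."""
    xs = [math.log(a) for a in areas]
    ys = [math.log(s) for s in species]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx                      # slope in log-log space
    c = math.exp(mean_y - z * mean_x)  # intercept, mapped back from log space
    return c, z


# Synthetic, noise-free data generated from c = 5 and z = 0.25 (not EcoSim output).
areas = [1, 4, 16, 64]
species = [5.0 * a ** 0.25 for a in areas]

c, z = fit_power_sar(areas, species)
print(round(c, 6), round(z, 6))
```

Because the synthetic data contain no noise, the fit recovers the generating coefficients up to floating-point error; real samples would also need a goodness-of-fit comparison against other SAR models.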

    Metalearning

    As one of the fastest-growing areas of research in machine learning, metalearning studies principled methods to obtain efficient models and solutions by adapting machine learning and data mining processes. This adaptation usually exploits information from past experience on other tasks, and the adaptive processes can involve machine learning approaches. As a related area and a current hot topic, automated machine learning (AutoML) is concerned with automating the machine learning processes. Metalearning and AutoML can help AI learn to control the application of different learning methods and acquire new solutions faster without unnecessary interventions from the user. This open access book offers a comprehensive and thorough introduction to almost all aspects of metalearning and AutoML, covering the basic concepts and architecture, evaluation, datasets, hyperparameter optimization, ensembles and workflows, and also how this knowledge can be used to select, combine, compose, adapt and configure both algorithms and models to yield faster and better solutions to data mining and data science problems. It can thus help developers to develop systems that can improve themselves through experience. This book is a substantial update of the first edition published in 2009. It includes 18 chapters, more than twice as many as the previous version. This enabled the authors to cover the most relevant topics in more depth and incorporate an overview of recent research in each area. The book will be of interest to researchers and graduate students in the areas of machine learning, data mining, data science and artificial intelligence.
    Metalearning is the study of principled methods that exploit metaknowledge to obtain efficient models and solutions by adapting machine learning and data mining processes. While the variety of machine learning and data mining techniques now available can, in principle, provide good model solutions, a methodology is still needed to guide the search for the most appropriate model in an efficient way. Metalearning provides one such methodology that allows systems to become more effective through experience. This book discusses several approaches to obtaining knowledge concerning the performance of machine learning and data mining algorithms. It shows how this knowledge can be reused to select, combine, compose and adapt both algorithms and models to yield faster, more effective solutions to data mining problems. It can thus help developers improve their algorithms and also develop learning systems that can improve themselves. The book will be of interest to researchers and graduate students in the areas of machine learning, data mining and artificial intelligence.

    Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)

    Peer reviewed


    Data-Driven Simulation Modeling of Construction and Infrastructure Operations Using Process Knowledge Discovery

    Within the architecture, engineering, and construction (AEC) domain, simulation modeling is mainly used to facilitate decision-making by enabling the assessment of different operational plans and resource arrangements that are otherwise difficult (if not impossible), expensive, or time-consuming to evaluate in real-world settings. The accuracy of such models directly affects their reliability as a basis for important decisions such as project completion time estimation and resource allocation. Compared to other industries, this is particularly important in construction and infrastructure projects due to the high resource costs and the societal impacts of these projects. Discrete event simulation (DES) is a decision-making tool that can benefit the design, control, and management of construction operations. Despite recent advancements, most DES models used in construction are created during the early planning and design stage, when the lack of factual information from the project prohibits the use of realistic data in simulation modeling. The resulting models, therefore, are often built using rigid (subjective) assumptions and design parameters (e.g., precedence logic, activity durations). In all such cases, and in the absence of an inclusive methodology to incorporate real field data as the project evolves, modelers rely on information from previous projects (a.k.a. secondary data), expert judgments, and subjective assumptions to generate simulations that predict future performance. These and similar shortcomings have to a large extent limited the use of traditional DES tools to preliminary studies and long-term planning of construction projects. In the realm of business process management, process mining, a relatively new research domain, seeks to automatically discover a process model by observing activity records and extracting information about processes. The research presented in this Ph.D. dissertation was in part inspired by the prospect of construction process mining using sensory data collected from field agents. This enabled the extraction of operational knowledge necessary to generate and maintain the fidelity of simulation models. A preliminary study was conducted to demonstrate the feasibility and applicability of data-driven knowledge-based simulation modeling, with a focus on data collection using a wireless sensor network (WSN) and a rule-based taxonomy of activities. The resulting knowledge-based simulation models performed very well in predicting key performance measures of real construction systems. Next, a pervasive mobile data collection and mining technique was adopted, and an activity recognition framework for construction equipment and worker tasks was developed. Data was collected using smartphone accelerometers and gyroscopes from construction entities to generate significant statistical time- and frequency-domain features. The extracted features served as the input to different types of machine learning algorithms that were applied to various construction activities. The trained predictive algorithms were then used to extract activity durations and calculate probability distributions to be fused into the corresponding DES models. Results indicated that the generated data-driven knowledge-based simulation models outperform static models created from engineering assumptions and estimations with regard to how closely their performance measure outputs match reality.
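
The time- and frequency-domain features mentioned above can be pictured with a small example that turns one window of accelerometer samples into three numbers: mean, standard deviation, and dominant frequency. This is a minimal sketch under assumptions; the particular feature set, the window length, the 50 Hz sampling rate, and the function name `window_features` are hypothetical and not taken from the dissertation.

```python
import cmath
import math


def window_features(samples, rate_hz):
    """Compute simple time- and frequency-domain features for one window."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)

    # Naive discrete Fourier transform over bins 1..n//2 (the DC bin is
    # skipped so a constant offset such as gravity does not dominate).
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)

    return {"mean": mean, "std": std, "dominant_hz": best_bin * rate_hz / n}


# Synthetic signal: a 5 Hz vibration on top of a constant offset of 1.0,
# sampled at an assumed 50 Hz for two seconds.
rate_hz = 50
signal = [1.0 + math.sin(2 * math.pi * 5 * i / rate_hz) for i in range(100)]

features = window_features(signal, rate_hz)
print(features)
```

A feature vector of this kind, computed per window and per sensor axis, is the sort of input a standard classifier could use to label activities; the naive transform is quadratic in the window length and would normally be replaced by an FFT.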