35 research outputs found

    The Grand Challenges and Myths of Neural-Symbolic Computation

    The construction of computational cognitive models integrating the connectionist and symbolic paradigms of artificial intelligence is a standing research issue in the field. The combination of logic-based inference and connectionist learning systems may lead to the construction of semantically sound computational cognitive models in artificial intelligence, computer and cognitive sciences. Over the last decades, results regarding the computation and learning of classical reasoning within neural networks have been promising. Nonetheless, there still remains much to be done. Artificial intelligence, cognitive and computer science are strongly based on several non-classical reasoning formalisms, methodologies and logics. In knowledge representation, distributed systems, hardware design, theorem proving, and systems specification and verification, classical and non-classical logics have had a great impact on theory and real-world applications. Several challenges for neural-symbolic computation are pointed out, in particular for classical and non-classical computation in connectionist systems. We also analyse myths about neural-symbolic computation and shed new light on them considering recent research advances.

    From Theory to Practice: A Data Quality Framework for Classification Tasks

    Data preprocessing is an essential step in knowledge discovery projects. Experts estimate that preprocessing tasks take between 50% and 70% of the total time of the knowledge discovery process. In this sense, several authors consider data cleaning one of the most cumbersome and critical tasks. Failure to provide high data quality in the preprocessing stage will significantly reduce the accuracy of any data analytics project. In this paper, we propose DQF4CT, a framework to address data quality issues in classification tasks. Our approach is composed of: (i) a conceptual framework to provide the user guidance on how to deal with data problems in classification tasks; and (ii) an ontology that represents the knowledge in data cleaning and suggests the proper data cleaning approaches. We present two case studies with real datasets: physical activity monitoring (PAM) and occupancy detection of an office room (OD). To evaluate our proposal, the datasets cleaned by DQF4CT were used to train the same algorithms used in classification tasks by the authors of PAM and OD. Additionally, we evaluated DQF4CT on datasets from the Repository of Machine Learning Databases of the University of California, Irvine (UCI). In addition, 84% of the results achieved by the models trained on the datasets cleaned by DQF4CT are better than those of the models of the datasets' authors.

    This work has also been supported by: Project: “Red de formación de talento humano para la innovación social y productiva en el Departamento del Cauca InnovAcción Cauca”. Convocatoria 03-2018 Publicación de artículos en revistas de alto impacto. Project: “Alternativas Innovadoras de Agricultura Inteligente para sistemas productivos agrícolas del departamento del Cauca soportado en entornos de IoT - ID 4633” financed by Convocatoria 04C–2018 “Banco de Proyectos Conjuntos UEES-Sostenibilidad” of Project “Red de formación de talento humano para la innovación social y productiva en el Departamento del Cauca InnovAcción Cauca”. Spanish Ministry of Economy, Industry and Competitiveness (Projects TRA2015-63708-R and TRA2016-78886-C3-1-R).

    Exploiting past users’ interests and predictions in an active learning method for dealing with cold start in recommender systems

    This paper focuses on the new-user cold-start issue in the context of recommender systems. New users who do not receive pertinent recommendations may abandon the system. To cope with this issue, we use active learning techniques. These methods engage new users to interact with the system by presenting them with a questionnaire that aims to understand their preferences for the related items. In this paper, we propose an active learning technique that exploits past users’ interests and past users’ predictions in order to identify the best questions to ask. Our technique achieves better performance in terms of prediction error (RMSE), which allows the users’ preferences to be learned with fewer questions. The experiments were carried out on a small public dataset to demonstrate the applicability of the technique to cold-start issues.
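
The RMSE metric named in the abstract above can be illustrated with a minimal sketch. The function name `rmse` and the sample ratings are hypothetical and chosen only for illustration; they are not taken from the paper. A lower value means predicted ratings lie closer to the actual ones.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each prediction error, average the squares, take the root.
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical ratings on a 1-5 scale (illustrative values only).
predicted_ratings = [3.5, 4.0, 2.0, 5.0]
actual_ratings = [4.0, 4.0, 1.0, 4.5]
print(rmse(predicted_ratings, actual_ratings))
```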

    Recognition and Exploitation of Gate Structure in SAT Solving

    In theoretical computer science, the SAT problem is the archetypal representative of the class of NP-complete problems, which is why efficient SAT solving is generally considered impossible. Nevertheless, practice often yields astonishing results: some applications generate problems with millions of variables that recent SAT solvers can solve in reasonable time. The practical success of SAT solving is due to current implementations of the Conflict-Driven Clause Learning (CDCL) algorithm, whose performance depends largely on the heuristics used, which implicitly exploit the structure of the instances generated in industrial practice. In this work, we present a new generic algorithm for the efficient recognition of gate structure in CNF encodings of SAT instances, as well as three approaches in which we exploit this structure explicitly. Our contributions also include the implementation of these approaches in our SAT solver Candy and the development of a tool for the distributed management of benchmark instances and their attributes, the Global Benchmark Database (GBD).

    Quality Assessment Methods for Textual Conversational Interfaces: A Multivocal Literature Review

    The evaluation and assessment of conversational interfaces is a complex task, since such software products are challenging to validate through traditional testing approaches. We conducted a systematic Multivocal Literature Review (MLR) on five different literature sources to provide a view of the quality attributes, evaluation frameworks, and evaluation datasets proposed to aid researchers and practitioners in the field. We arrived at a final pool of 118 contributions, including grey (35) and white literature (83). We categorized 123 different quality attributes and metrics under ten different categories and four macro-categories: Relational, Conversational, User-Centered and Quantitative attributes. While Relational and Conversational attributes are most commonly explored by the scientific literature, we observed a predominance of User-Centered attributes in industrial literature. We also identified five different academic frameworks/tools to automatically compute sets of metrics, and 28 datasets (subdivided into seven different categories based on the type of data contained) that can produce conversations for the evaluation of conversational interfaces. Our analysis highlights that a large number of qualitative and quantitative attributes are available in the literature to evaluate the performance of conversational interfaces. Our categorization can serve as a valid entry point for researchers and practitioners to select the proper functional and non-functional aspects to be evaluated for their products.

    Performance analysis of 2D-OCDMA system in long-reach passive optical network

    In this paper, a performance analysis is reported for an optical code division multiplexing (OCDM) system for long-reach passive optical network (LR-PON) systems, taking into account multiple access interference (MAI), single-mode fiber (SMF) channel effects and receiver noise. The mathematical model representing the 2-D optical code parameters for different receiver structures used in optical code division multiple access (OCDMA) is developed, optimized and implemented using Matlab simulations, where channel imperfections such as attenuation losses and chromatic dispersion have been considered. In the proposed system configuration, we have investigated the probability of error for Back-to-Back (B2B) with a conventional correlation receiver (CCR), an SMF channel with a CCR, and an SMF channel with a successive interference cancellation (SIC) receiver. Additionally, the performance of the SMF channel with SIC receiver has been addressed using two key metrics, BER and Q-factor, as functions of the number of simultaneous users and of fiber length, respectively. We have managed to substantially improve simultaneous multiuser data transmission over significant fiber lengths without the use of amplification, achieving a Q-factor of 6 at fiber lengths of 190 km and 120 km when a SIC receiver with 5 cancellation stages is employed for the 2D prime hop system (2D-PHS) and for 2D hybrid codes (2D-HC), respectively.
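
The two metrics named in the abstract above, BER and Q-factor, are commonly related by the Gaussian-noise approximation BER = 0.5 * erfc(Q / sqrt(2)). The sketch below assumes that textbook relation only; it is not the paper's own model, which also accounts for MAI and fiber effects, and the function name `ber_from_q` is hypothetical. Under this assumption, a Q-factor of 6 corresponds to a BER of roughly 10^-9.

```python
import math


def ber_from_q(q_factor):
    """Approximate bit error rate for a given Q-factor.

    Assumes Gaussian noise statistics at the decision circuit; this is a
    common textbook approximation, not the model used in the paper.
    """
    return 0.5 * math.erfc(q_factor / math.sqrt(2.0))


# Print the approximate BER for a few illustrative Q-factor values.
for q in (3.0, 6.0, 7.0):
    print(f"Q = {q:.1f} -> BER ~ {ber_from_q(q):.2e}")
```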

    On the Nature and Types of Anomalies: A Review

    Anomalies are occurrences in a dataset that are in some way unusual and do not fit the general patterns. The concept of the anomaly is generally ill-defined and perceived as vague and domain-dependent. Moreover, despite some 250 years of publications on the topic, no comprehensive and concrete overviews of the different types of anomalies have hitherto been published. By means of an extensive literature review this study therefore offers the first theoretically principled and domain-independent typology of data anomalies, and presents a full overview of anomaly types and subtypes. To concretely define the concept of the anomaly and its different manifestations, the typology employs five dimensions: data type, cardinality of relationship, anomaly level, data structure and data distribution. These fundamental and data-centric dimensions naturally yield 3 broad groups, 9 basic types and 61 subtypes of anomalies. The typology facilitates the evaluation of the functional capabilities of anomaly detection algorithms, contributes to explainable data science, and provides insights into relevant topics such as local versus global anomalies.

    Comment: 38 pages (30 pages content), 10 figures, 3 tables. Preprint; review comments will be appreciated. Improvements in version 2: Explicit mention of fifth anomaly dimension; Added section on explainable anomaly detection; Added section on variations on the anomaly concept; Various minor additions and improvements.

    Gene selection for cancer classification with the help of bees


    Developing a municipality typology for modelling decentralised energy systems

    The recent rapid expansion of renewable energy capacities in Germany has been dominated by decentralised wind, photovoltaic (PV) and bioenergy plants. The spatially dispersed and partly unpredictable nature of these resources necessitates an increasing exploitation of integration measures such as curtailment, supply and demand side flexibilities, network strengthening and storage capacities. Indeed, one solution to the large-scale integration of renewable energies could be decentralised autonomous municipal energy systems. The achievement of grid parity for some renewable energy technologies has strengthened the desire of some communities to become independent from central markets. Whilst many communities in Germany already strive for so-called energy autonomy, the vast majority do so only on an annual basis. Several studies have already analysed the technical and economic implications of the mainly decentralised future energy system, but most are restricted in their insights by limited temporal and spatial resolution. The large number (11,131) of German municipalities means that a national analysis at this resolution is not feasible. Hence, this study employs a cluster analysis to develop a municipality typology in order to analyse the techno-economic suitability of these municipalities for autonomous energy systems. A total of 34 socio-technical indicators are employed at the municipal level, with a particular focus on the sectors of Private Households and Transport, and the potentials for decentralised renewable energies. The first step is to scale the indicator values and reduce their number by using a factor analysis. Several alternative methods are weighed against each other, and the most suitable methods for the factor analysis are chosen. Secondly, selected quantitative cluster validation methods are employed alongside qualitative criteria to determine the optimal number of clusters.
This results in a total of ten clusters, which show a large variation as well as some overlap with respect to specific indicators. For example, one cluster contains all major German cities and has a low potential for renewable energies. Another cluster, on the other hand, contains the municipalities with a higher potential for renewable energies due to their high hydrothermal potential for geothermal power. An analysis of the municipalities from three German renewable energy projects, “Energy Municipalities”, “Bioenergy Villages” and “100% Renewable Energy Regions”, shows that in eight of the ten clusters municipalities are aiming for energy autonomy (to varying degrees). It is challenging, however, to differentiate between the clusters regarding readiness for energy autonomy projects, especially if the degree of social acceptance and engagement for such projects is to be considered. To answer the more techno-economic part of this question, future work will employ the developed clusters in the context of an energy system optimisation. Insights gained at the municipal level will then be qualitatively transferred to the national context to assess the implications for the whole energy system.