35 research outputs found

    Timed contract compliance under event timing uncertainty

    Although many real-life contracts include time constraints, for instance explicit deadlines by which actions must be performed or periods during which certain behaviour is prohibited, the literature formalising such notions is surprisingly sparse. Furthermore, one of the major challenges is that compliance is typically computed with respect to timed event traces whose event timestamps are assumed to be perfect. In this paper we present an approach for evaluating compliance under imperfect timing information, giving a semantics for analysing the likelihood of contract violation. Peer-reviewed.
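    To make the idea of a violation likelihood concrete, here is a minimal sketch for a single deadline obligation. It is not the semantics from the paper: the uniform error model and the names `violation_probability`, `observed`, `deadline`, and `max_error` are assumptions made purely for illustration.

```python
def violation_probability(observed: float, deadline: float, max_error: float) -> float:
    """Probability that an action actually happened after its deadline.

    Assumption (not from the paper): the true event time is uniformly
    distributed on [observed - max_error, observed + max_error].
    """
    if max_error < 0:
        raise ValueError("max_error must be non-negative")
    if max_error == 0:
        # Perfect timestamps: compliance is a yes/no question.
        return 1.0 if observed > deadline else 0.0
    lo, hi = observed - max_error, observed + max_error
    if deadline >= hi:
        return 0.0  # every possible true time meets the deadline
    if deadline <= lo:
        return 1.0  # every possible true time misses the deadline
    # Fraction of the uncertainty interval that lies after the deadline.
    return (hi - deadline) / (hi - lo)


if __name__ == "__main__":
    # Logged at t=9 with up to 2 time units of error, deadline at t=10.
    print(violation_probability(9.0, 10.0, 2.0))
```

    With perfect timestamps the function collapses to a plain compliant-or-not check, which is the classical setting the abstract contrasts against.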

    Combining Classification and Clustering for Tweet Sentiment Analysis

    The goal of sentiment analysis is to determine opinions, emotions, and attitudes presented in source material. In tweet sentiment analysis, opinions in messages can typically be categorized as positive or negative. To classify them, researchers have been using traditional classifiers like Naive Bayes, Maximum Entropy, and Support Vector Machines (SVM). In this paper, we show that an SVM classifier combined with a cluster ensemble can offer better classification accuracies than a stand-alone SVM. In our study, we employed an algorithm, named C³E-SL, capable of combining classifier and cluster ensembles. This algorithm can refine tweet classifications using additional information provided by clusterers, assuming that similar instances from the same clusters are more likely to share the same class label. The resulting classifier has been shown to be competitive with the best results found so far in the literature, thereby suggesting that the studied approach is promising for tweet sentiment classification. Capes (Proc. DS-7253238/D); CNPq (Proc. 303348/2013-5); FAPESP (Proc. 2013/07375-0 and 2010/20830-0)

    Adaptation of discourse parsing models for the Portuguese language

    Discourse parsing in Portuguese has two critical limitations. The first is that the task has been explored using only symbolic approaches, i.e., using manually extracted lexical patterns. The second is related to the domain of the lexical patterns, which were extracted through the analysis of a corpus of academic texts, generating many domain-specific patterns. For English, many approaches have been explored using machine learning with features based on a prominent lexicon-syntax notion of dominance sets. In this paper, two works were adapted to Portuguese, improving the results, outperforming the baselines and previous works for Portuguese, considering the task of rhetorical relation identification. São Paulo Research Foundation (FAPESP) (grant 2014/11632-0); Natural Sciences and Engineering Research Council of Canada; University of Toronto

    Decoding machine learning benchmarks

    Despite the availability of benchmark machine learning (ML) repositories (e.g., UCI, OpenML), there is not yet a standard evaluation strategy capable of pointing out which set of datasets is best suited to serve as a gold standard for testing different ML algorithms. In recent studies, Item Response Theory (IRT) has emerged as a new approach to elucidate what makes a good ML benchmark. This work applied IRT to explore the well-known OpenML-CC18 benchmark and to identify how suitable it is for the evaluation of classifiers. Several classifiers, ranging from classical to ensemble ones, were evaluated using IRT models, which could simultaneously estimate dataset difficulty and classifiers' ability. The Glicko-2 rating system was applied on top of IRT to summarize the innate ability and aptitude of classifiers. It was observed that not all datasets from OpenML-CC18 are really useful for evaluating classifiers. Most datasets evaluated in this work (84%) contain easy instances in general (e.g., only around 10% of difficult instances). Also, 80% of the instances in half of this benchmark are very discriminating ones, which can be of great use for pairwise algorithm comparison, but not useful for pushing classifiers' abilities. This paper presents this new evaluation methodology based on IRT as well as the tool decodIRT, developed to guide IRT estimation over ML benchmarks. Comment: Paper published at the BRACIS 2020 conference, 15 pages, 4 figures
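    As an illustration of how IRT separates difficulty from discrimination, the sketch below evaluates a logistic item response function, with a classifier playing the role of the respondent and a dataset instance the role of the item. The abstract does not say which IRT model is used; the three-parameter logistic (3PL) form and the function name `irt_3pl` are assumptions here.

```python
import math


def irt_3pl(theta: float, a: float, b: float, c: float = 0.0) -> float:
    """Probability of a correct response under the 3PL model.

    theta: ability of the respondent (here, a classifier)
    a: discrimination of the item (here, an instance)
    b: difficulty of the item
    c: guessing parameter (chance of success with very low ability)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # An easy instance (b = -2) versus a hard one (b = 2) for the same classifier.
    print(irt_3pl(theta=0.0, a=1.0, b=-2.0))
    print(irt_3pl(theta=0.0, a=1.0, b=2.0))
```

    A high `a` makes the curve steep around `b`, so two classifiers of slightly different ability get clearly different success probabilities. That is why very discriminating instances help pairwise comparison even when they are not difficult.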

    Forecasting Time Series Movement Direction with Hybrid Methodology

    Forecasting the tendencies of time series is a challenging task that can improve understanding of the data. This paper presents a hybrid model that combines support vector regression with the Autoregressive Integrated Moving Average (ARIMA) model. The proposed model is more convenient for practical use. The application of interest is modelling the tendencies of time series for the insurgency in southern Thailand. Empirical results on the monthly numbers of deaths, injuries, and incidents indicate that the proposed hybrid model outperforms both the classical time series model and support vector regression alone. Forecast accuracy is evaluated using mean square error.

    Analysis of label noise in graph-based semi-supervised learning

    In machine learning, one must acquire labels to help supervise a model that will be able to generalize to unseen data. However, the labeling process can be tedious, long, costly, and error-prone. It is often the case that most of our data is unlabeled. Semi-supervised learning (SSL) alleviates that by making strong assumptions about the relation between the labels and the input data distribution. This paradigm has been successful in practice, but most SSL algorithms end up fully trusting the few available labels. In real life, both humans and automated systems are prone to mistakes; it is essential that our algorithms are able to work with labels that are both few and unreliable. Our work aims to perform an extensive empirical evaluation of existing graph-based semi-supervised algorithms, such as Gaussian Fields and Harmonic Functions, Local and Global Consistency, Laplacian Eigenmaps, and Graph Transduction Through Alternating Minimization. To do that, we compare the accuracy of classifiers while varying the amount of labeled data and label noise for many different samples. Our results show that, if the dataset is consistent with SSL assumptions, we are able to detect the noisiest instances, although this gets harder when the number of available labels decreases. Also, the Laplacian Eigenmaps algorithm performed better than label propagation when the data came from high-dimensional clusters.
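    The remark that most SSL algorithms fully trust the available labels can be seen in a minimal sketch of harmonic label propagation on a graph. This is a simplified illustration, not the experimental code from the paper; the function name `propagate_labels`, the toy graph, and the fixed iteration count are assumptions.

```python
def propagate_labels(weights, labels, iterations=200):
    """Iteratively average neighbour scores on a weighted graph.

    weights: symmetric n x n list of lists of non-negative edge weights
    labels: dict mapping a labeled node index to its score (0.0 or 1.0)
    Labeled nodes are clamped, i.e. fully trusted, so a noisy label is
    never corrected and keeps influencing its neighbours.
    """
    n = len(weights)
    scores = [labels.get(i, 0.5) for i in range(n)]
    for _ in range(iterations):
        updated = list(scores)
        for i in range(n):
            if i in labels:
                continue  # clamped: the given label is never revised
            total = sum(weights[i])
            if total > 0:
                updated[i] = sum(w * s for w, s in zip(weights[i], scores)) / total
        scores = updated
    return scores


if __name__ == "__main__":
    # Chain graph 0 - 1 - 2 - 3 with the two end nodes labeled.
    chain = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    print(propagate_labels(chain, {0: 1.0, 3: 0.0}))
```

    On the chain, the unlabeled nodes settle at scores that interpolate between the two clamped ends. Flipping one of the given labels changes every unlabeled score, which is the sensitivity to label noise that the evaluation studies.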

    Anytime Guarantees for Reachability in Uncountable Markov Decision Processes

    We consider the problem of approximating the reachability probabilities in Markov decision processes (MDPs) with uncountable (continuous) state and action spaces. While there are algorithms that, for special classes of such MDPs, provide a sequence of approximations converging to the true value in the limit, our aim is to obtain an algorithm with guarantees on the precision of the approximation. As this problem is undecidable in general, assumptions on the MDP are necessary. Our main contribution is to identify sufficient assumptions that are as weak as possible, thus approaching the "boundary" of which systems can be correctly and reliably analyzed. To this end, we also argue why each of our assumptions is necessary for algorithms based on processing finitely many observations. We present two solution variants. The first one provides converging lower bounds under weaker assumptions than typical ones from previous works concerned with guarantees. The second one then utilizes stronger assumptions to additionally provide converging upper bounds. Altogether, we obtain an anytime algorithm, i.e. one yielding a sequence of approximants with known and iteratively improving precision, converging to the true value in the limit. Besides, due to the generality of our assumptions, our algorithms are very general templates, readily allowing for various heuristics from the literature, in contrast to, e.g., a specific discretization algorithm. Our theoretical contribution thus paves the way for future practical improvements without sacrificing correctness guarantees.
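    The notion of converging lower bounds can be illustrated on a finite MDP, where value iteration started from zero produces a non-decreasing sequence of under-approximations of the maximal reachability probability. This is a deliberately simplified, finite-state sketch and not the paper's algorithm for uncountable spaces; the function name, the dictionary encoding, and the toy model are assumptions.

```python
def reachability_lower_bounds(transitions, targets, steps):
    """Value iteration from below for maximal reachability in a finite MDP.

    transitions: dict state -> dict action -> list of (probability, next_state);
                 every state, including targets and dead ends, must be a key
    targets: set of goal states
    Returns the list of value tables after 0, 1, ..., steps iterations.
    """
    values = {s: (1.0 if s in targets else 0.0) for s in transitions}
    history = [dict(values)]
    for _ in range(steps):
        updated = {}
        for state, actions in transitions.items():
            if state in targets:
                updated[state] = 1.0
                continue
            # Best expected value over the available actions (0 if none).
            updated[state] = max(
                (sum(p * values[nxt] for p, nxt in succ) for succ in actions.values()),
                default=0.0,
            )
        values = updated
        history.append(dict(values))
    return history


if __name__ == "__main__":
    mdp = {
        "s": {
            "retry": [(0.5, "goal"), (0.5, "s")],
            "gamble": [(0.1, "goal"), (0.9, "sink")],
        },
        "goal": {},
        "sink": {},
    }
    for table in reachability_lower_bounds(mdp, {"goal"}, 4):
        print(table["s"])
```

    Each printed value is a valid lower bound, but on its own it says nothing about how far the true value still is. Upper bounds are what turn such a sequence into an anytime result with known precision, and the abstract notes that obtaining them requires stronger assumptions.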

    Agent programming in the cognitive era

    It is claimed that, in the nascent ‘Cognitive Era’, intelligent systems will be trained using machine learning techniques rather than programmed by software developers. A contrary point of view argues that machine learning has limitations, and, taken in isolation, cannot form the basis of autonomous systems capable of intelligent behaviour in complex environments. In this paper, we explore the contributions that agent-oriented programming can make to the development of future intelligent systems. We briefly review the state of the art in agent programming, focussing particularly on BDI-based agent programming languages, and discuss previous work on integrating AI techniques (including machine learning) in agent-oriented programming. We argue that the unique strengths of BDI agent languages provide an ideal framework for integrating the wide range of AI capabilities necessary for progress towards the next generation of intelligent systems. We identify a range of possible approaches to integrating AI into a BDI agent architecture. Some of these approaches, e.g., ‘AI as a service’, exploit immediate synergies between rapidly maturing AI techniques and agent programming, while others, e.g., ‘AI embedded into agents’, raise more fundamental research questions, and we sketch a programme of research directed towards identifying the most appropriate ways of integrating AI capabilities into agent programs.

    Análisis de las redes de colaboración entre las Instituciones de Educación Superior en Colombia de acuerdo con ResearchGate

    The aim of this article is to analyze the collaboration networks between Higher Education Institutions in Colombia, according to the parameter “Top collaborating institutions” in ResearchGate. The paper compares the networks of Higher Education Institutions accredited as high quality with those of non-accredited institutions, according to the indicators of the National Accreditation System in Colombia. The analysis of institutional collaboration is carried out by constructing joint work networks using the UCINET software. The first institution registered under “Top collaborating institutions”, published in the ResearchGate profile of each Higher Education Institution, is taken into account. The results show that accredited institutions have a well-connected and integrated collaboration network. The non-accredited institutions, on the other hand, have a weak and poorly integrated collaboration network. In addition, non-accredited universities seek to collaborate mostly with accredited institutions rather than with each other. As a result, the efforts of non-accredited institutions are not well coordinated and become diluted in the distribution of their collaborative relationships.