35 research outputs found

    Corridor Detection from Large GPS Trajectories Datasets

    Given the widespread use of mobile devices that track their geographical location, it has become increasingly easy to acquire information related to users' trips in real time. This availability has triggered several studies based on users' positions, such as the analysis of flows of people in cities, and also new applications, such as route recommendation systems. Given a dataset of geographical trajectories in an urban metropolitan area, we propose an algorithm to detect corridors. Corridors can be defined as geographical paths, with a minimum length, that are commonly traversed by a minimum number of different users. We propose an efficient strategy based on the Apriori algorithm to extract frequent trajectory patterns from the geo-spatial dataset. By discretizing the data and adapting the roles of itemsets and baskets of this algorithm to our context, we find the longest corridors formed by cells shared by a minimum number of trajectories. After that, we refine the results with a subsequent filtering step that uses a Radius Neighbors Graph. To illustrate the algorithm, the GeoLife dataset is analyzed by following the proposed method. Our approach is relevant for transportation analytics because it is the basis for detecting missing lines in public transportation systems and for recommending to private users which route to take when moving from one part of the city to another, based on the behavior of the users who provided their logs.
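
    The level-wise idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the uniform grid, the `cell_size` value, and the function names `discretize` and `frequent_paths` are all assumptions, and the Radius Neighbors Graph filtering step is omitted. Paths are contiguous runs of grid cells, and a path of length k is only counted if both of its length k-1 sub-paths were already frequent, which is the Apriori pruning rule.

```python
from collections import defaultdict


def discretize(points, cell_size=0.01):
    """Map (lat, lon) points to grid cells, dropping consecutive repeats.

    The uniform grid and the cell_size default are hypothetical choices.
    """
    cells = []
    for lat, lon in points:
        cell = (int(lat // cell_size), int(lon // cell_size))
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells


def frequent_paths(trajectories, min_users, min_length):
    """Find contiguous cell paths shared by at least min_users distinct users.

    trajectories: list of (user_id, list_of_cells) pairs.
    Returns a dict mapping each frequent path (tuple of cells) with at
    least min_length cells to the set of users that traversed it.
    """
    frequent = {}
    previous = None  # frequent paths of length k - 1
    k = 1
    while True:
        support = defaultdict(set)
        for user, cells in trajectories:
            for i in range(len(cells) - k + 1):
                path = tuple(cells[i:i + k])
                # Apriori pruning: both (k-1)-sub-paths must be frequent.
                if previous is not None and (
                    path[:-1] not in previous or path[1:] not in previous
                ):
                    continue
                support[path].add(user)
        current = {p: u for p, u in support.items() if len(u) >= min_users}
        if not current:
            break
        if k >= min_length:
            frequent.update(current)
        previous = current
        k += 1
    return frequent


# Toy example with symbolic cells instead of real GPS data.
toy = [
    ("u1", ["a", "b", "c", "d"]),
    ("u2", ["a", "b", "c", "e"]),
    ("u3", ["x", "b", "c", "d"]),
]
corridors = frequent_paths(toy, min_users=2, min_length=2)
longest = max(len(path) for path in corridors)
print(sorted(p for p in corridors if len(p) == longest))
```

    In this toy run the longest corridors have three cells each; no four-cell path is shared by two users, so the loop stops at that level.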

    What Does Your Robot Vacuum Cleaner Know About You?

    In recent years, one of the best-selling products during the sales events known as Black Friday and Cyber Monday has been the robot vacuum cleaner. Its advanced features, which allow it to start up and run while the user is away from home, have clearly won buyers over. But this sales success has also been accompanied by some doubts: are these robots acquiring sensitive data that can be sold on the global personal-data market?

    Estimand-Agnostic Causal Query Estimation with Deep Causal Graphs

    Causal Queries are usually estimated by means of an estimand, a formula consisting of observational terms that can be computed using passive data. Each query results in a different formula, which makes estimand-based methods extremely ad-hoc. In this work, we propose an estimand-agnostic framework capable of computing any identifiable causal query on an arbitrary Causal Graph (even in the presence of latent confounders) with only one general model. We provide multiple implementations of this general framework that leverage the expressive power of Neural Networks and Normalizing Flows to model complex distributions, and we derive estimation procedures for all kinds of observational, interventional and counterfactual queries, valid for any kind of graph for which the query is identifiable. Finally, we test our techniques in a modelling setting and an estimation benchmark to show how, despite being a query-agnostic framework, it can compete with query-specific models. Our proposal includes an open-source library that allows easy application and extension of our techniques for researchers and practitioners alike

    Uncertainty-based Rejection Wrappers for Black-box Classifiers

    Machine Learning as a Service platforms are a very sensible choice for practitioners who want to incorporate machine learning into their products while reducing time and cost. However, to benefit from their advantages, a method for assessing their performance when applied to a target application is needed. In this work, we present a robust uncertainty-based method for evaluating the performance of both probabilistic and categorical classification black-box models, in particular APIs, that enriches the predictions obtained with an uncertainty score. This uncertainty score enables the detection of inputs with very confident but erroneous predictions while protecting against out-of-distribution data points when deploying the model in a production setting. We validate the proposal in different natural language processing and computer vision scenarios. Moreover, taking advantage of the computed uncertainty score, we show that one can significantly increase the robustness and performance of the resulting classification system by rejecting uncertain predictions.
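
    The rejection step mentioned at the end of this abstract can be illustrated with a small sketch. It is not the wrapper proposed in the paper: predictive entropy of the black box's own probabilities is swapped in here as a stand-in uncertainty score, and the function names and the `max_entropy` threshold are hypothetical.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def predict_with_rejection(probs, max_entropy):
    """Return (label, score); label is None when the prediction is rejected.

    max_entropy is a hypothetical threshold that would normally be tuned
    on held-out data for the target application.
    """
    score = predictive_entropy(probs)
    if score > max_entropy:
        return None, score
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, score


# A confident output is accepted, a flat one is rejected.
confident = predict_with_rejection([0.98, 0.01, 0.01], max_entropy=0.5)
uncertain = predict_with_rejection([0.40, 0.35, 0.25], max_entropy=0.5)
print(confident[0], uncertain[0])
```

    A score computed only from the black box's own output cannot flag confident but wrong predictions, which is the case the abstract's learned uncertainty score is meant to address; the sketch only shows how any such score would gate the final decision.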

    A Survey on Uncertainty Estimation in Deep Learning Classification Systems from a Bayesian Perspective

    Decision-making based on machine learning systems, especially when this decision-making can affect human lives, is a subject of maximum interest in the Machine Learning community. It is, therefore, necessary to equip these systems with a means of estimating uncertainty in the predictions they emit in order to help practitioners make more informed decisions. In the present work, we introduce the topic of uncertainty estimation, and we analyze the peculiarities of such estimation when applied to classification systems. We analyze different methods that have been designed to provide classification systems based on deep learning with mechanisms for measuring the uncertainty of their predictions. We will take a look at how this uncertainty can be modeled and measured using different approaches, as well as practical considerations of different applications of uncertainty. Moreover, we review some of the properties that should be borne in mind when developing such metrics. All in all, the present survey aims at providing a pragmatic overview of the estimation of uncertainty in classification systems that can be very useful for both academic research and deep learning practitioners.

    Self-supervised out-of-distribution detection in wireless capsule endoscopy images.

    While deep learning has displayed excellent performance in a broad spectrum of application areas, neural networks still struggle to recognize what they have not seen, i.e., out-of-distribution (OOD) inputs. In the medical field, building robust models that are able to detect OOD images is highly critical, as these rare images could show diseases or anomalies that should be detected. In this study, we use wireless capsule endoscopy (WCE) images to present a novel patch-based self-supervised approach comprising three stages. First, we train a triplet network to learn vector representations of WCE image patches. Second, we cluster the patch embeddings to group patches in terms of visual similarity. Third, we use the cluster assignments as pseudolabels to train a patch classifier and use the Out-of-Distribution Detector for Neural Networks (ODIN) for OOD detection. The system has been tested on the Kvasir-capsule, a publicly released WCE dataset. Empirical results show an OOD detection improvement compared to baseline methods. Our method can detect unseen pathologies and anomalies such as lymphangiectasia, foreign bodies and blood with > 0.6. This work presents an effective solution for OOD detection models without needing labeled images
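
    The scoring part of the third stage can be sketched as below. This is a simplified illustration under stated assumptions, not the pipeline evaluated in the paper: ODIN combines temperature scaling of the softmax with a small input perturbation, and only the temperature-scaled score is shown here. The function names, the threshold, and the default temperature are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def odin_style_score(logits, temperature=1000.0):
    """Maximum temperature-scaled softmax probability (higher = more in-distribution)."""
    return max(softmax(logits, temperature))


def is_out_of_distribution(logits, threshold, temperature=1000.0):
    """Flag an input as OOD when its score falls below a hypothetical threshold."""
    return odin_style_score(logits, temperature) < threshold


# Logits from an imagined patch classifier over three pseudolabel clusters.
peaked_logits = [10.0, 0.0, 0.0]
flat_logits = [1.0, 1.0, 1.0]
print(odin_style_score(peaked_logits), odin_style_score(flat_logits))
```

    With a large temperature the scores of all inputs move close to 1 divided by the number of classes, so the threshold has to be chosen on that compressed scale, typically from validation data.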

    Building uncertainty models on top of black-box predictive APIs

    With the commoditization of machine learning, more and more off-the-shelf models are available as part of code libraries or cloud services. Typically, data scientists and other users apply these models as "black boxes" within larger projects. In the case of regressing a scalar quantity, such APIs typically offer a predict() function, which outputs the estimated target variable (often referred to as ŷ or, in code, y_hat). However, many real-world problems may require some sort of deviation interval or uncertainty score rather than a single point-wise estimate. In other words, a mechanism is needed with which to answer the question "How confident is the system about that prediction?" Motivated by the lack of this characteristic in most predictive APIs designed for regression purposes, we propose a method that adds an uncertainty score to every black-box prediction. Since the underlying model is not accessible, and therefore standard Bayesian approaches are not applicable, we adopt an empirical approach and fit an uncertainty model using a labelled dataset (x, y) and the outputs ŷ of the black box. In order to be able to use any predictive system as a black box and adapt to its complex behaviours, we propose three variants of an uncertainty model based on deep networks. The first adds a heteroscedastic noise component to the black-box output, the second predicts the residuals of the black box, and the third performs quantile regression using deep networks. Experiments using real financial data that contain an in-production black-box system and two public datasets (energy forecasting and biology responses) illustrate and quantify how uncertainty scores can be added to black-box outputs.
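
    Quantile regression, the basis of the third variant, is usually trained with the pinball (quantile) loss. The sketch below shows that loss in plain Python, without the deep networks used in the paper; the function names and the toy data are assumptions made for illustration.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball loss for a single prediction at quantile level q (0 < q < 1)."""
    diff = y_true - y_pred
    return q * diff if diff >= 0 else (q - 1.0) * diff


def mean_pinball_loss(ys, preds, q):
    """Average pinball loss over paired targets and predictions."""
    return sum(pinball_loss(y, p, q) for y, p in zip(ys, preds)) / len(ys)


def best_constant(ys, q):
    """Pick the sample value that minimizes the mean pinball loss.

    For q = 0.5 this recovers a median of the sample, which is why
    minimizing the pinball loss yields quantile estimates.
    """
    return min(ys, key=lambda c: mean_pinball_loss(ys, [c] * len(ys), q))


# Toy targets with one outlier: the q = 0.5 minimizer is the median.
targets = [1.0, 2.0, 3.0, 4.0, 100.0]
print(best_constant(targets, 0.5))
```

    Fitting one model at a low quantile level and another at a high one would give a deviation interval around the black-box output ŷ, which is the kind of answer the abstract asks for.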

    CatLC: Catalonia Multiresolution Land Cover Dataset

    The availability of large annotated image datasets represented one of the tipping points in the progress of object recognition in the realm of natural images, but other important visual spaces are still lacking this asset. In the case of remote sensing, only a few richly annotated datasets covering small areas are available. In this paper, we present the Catalonia Multiresolution Land Cover Dataset (CatLC), a remote sensing dataset corresponding to a mid-size geographical area which has been carefully annotated with a large variety of land cover classes. The dataset includes pre-processed images from the Cartographic and Geological Institute of Catalonia (ICGC) (https://www.icgc.cat/en/Downloads) and the European Space Agency (ESA) (https://scihub.copernicus.eu) catalogs, captured from both aircraft and satellites. Detailed topographic layers inferred from other sensors are also included. CatLC is a multiresolution, multimodal, multitemporal dataset that can be readily used by the machine learning community to explore new classification techniques for land cover mapping in different scenarios, such as area estimation in forest inventories, hydrologic studies involving microclimatic variables, or geologic hazard identification and assessment. Moreover, remote sensing data present some specific characteristics that are not shared by natural images and that have seldom been explored. In this vein, the CatLC dataset aims to engage computer vision experts interested in remote sensing and also to stimulate new research and development in the field of machine learning.

    Caront: implementation and improvement of assessment activities and first experiences with different teaching organizations

    The Moodle-based document manager Caront, first presented in a communication at the previous JID, continues its voyage through Engineering courses. Caront manages the different activities that make up a course: group management, activities, assessments, forums, wiki, etc. Some of these activities have been taken from those Moodle provides, and others have been modified to better fit the requirements of Engineering courses, although they can easily be extrapolated to courses in other specialities. This article presents the implementation or modification of two activities that involve assessment: assignments and self-assessment. An assignment poses activities in which the student must submit a file or set of files corresponding to an exercise, problem or lab assignment. This activity is gradable and can be submitted individually or as a group. Self-assessment is a module implemented entirely by the authors, based on the survey module presented at the previous JID. It is a survey that each member of a group completes to evaluate the work of their teammates and their own. It can only be set if the course has working groups, and it yields statistics and a list of the ratings recorded for each student with respect to their teammates. One of the important goals we pursue with Caront is its versatility across different teaching organizations. The last part of the article presents the experience of using Caront in different teaching organizations in compulsory and elective Computer Engineering courses during the 2006-07 academic year. As a result of these first experiences, proposals for improvement have begun to emerge, along with functionality better suited to the teaching organizations that were tried. For the coming academic year, we expect to be able to incorporate new courses from Computer Engineering and other degree programmes.