36 research outputs found

    Filtrando atributos para mejorar procesos de aprendizaje (Filtering attributes to improve learning processes)

    IX Conferencia de la Asociación Española para la Inteligencia Artificial, Gijón, Spain. Machine learning systems have traditionally been used to extract knowledge from sets of examples described by attributes. When the source data represent a real-world problem, it is generally not known which attributes influence its solution. In such cases, the only a priori option is to use all the available information. To avoid the problems this entails, the attributes can be filtered before learning so that only the most relevant ones are kept, those that hold the solution to the problem. This article describes a method that performs this selection. As will be shown, this technique improves the subsequent learning processes.

    Using A* for inference in probabilistic classifier chains

    IJCAI-15, Buenos Aires, Argentina, July 25–31, 2015. Probabilistic Classifier Chains (PCC) offer interesting properties for solving multi-label classification tasks because they can estimate the joint probability of the labels. However, PCC has the major drawback of a high computational cost in the inference process required to predict new samples. Lately, several approaches have been proposed to overcome this issue, including beam search and an ε-approximate algorithm based on uniform-cost search. Surprisingly, the obvious possibility of using heuristic search has not been considered yet. This paper studies this alternative and proposes an admissible heuristic that, applied in combination with the A* algorithm, guarantees not only optimal predictions in terms of subset 0/1 loss, but also that it always explores fewer nodes than the ε-approximate algorithm. In the experiments reported, the number of nodes explored by our method is less than two times the number of labels for all datasets analyzed. However, the difference in explored nodes must be large enough to compensate for the overhead of the heuristic in order to improve prediction time. Thus, our proposal may be a good choice for complex multi-label problems.
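
    As a rough illustration of the inference problem described in this abstract, the sketch below runs a best-first search over label prefixes, scoring each prefix by the product of the conditional probabilities along the chain times an upper bound on what the remaining labels can contribute. It is a minimal sketch under stated assumptions, not the paper's method: the names `pcc_best_first`, `cond_prob`, `heuristic` and `toy_cond_prob` are hypothetical, the chain model is a toy stand-in for trained classifiers, and the default bound of 1.0 is the trivial admissible one, which makes the search behave like plain uniform-cost search. The admissible heuristic proposed in the paper is not reproduced here.

```python
import heapq


def pcc_best_first(cond_prob, n_labels, heuristic=lambda prefix: 1.0):
    """Find the most probable label vector of a (hypothetical) classifier chain.

    cond_prob(prefix, value): P(y_j = value | x, y_1..y_{j-1} = prefix),
        where j = len(prefix) + 1. Stands in for the j-th chained classifier.
    heuristic(prefix): upper bound on the product of the remaining
        conditional probabilities (1.0 is always admissible).
    Returns (labels, probability, nodes_expanded).
    """
    # Heap entries are (-f, prefix, g); heapq is a min-heap, so f is negated.
    heap = [(-heuristic(()), (), 1.0)]
    expanded = 0
    while heap:
        _, prefix, g = heapq.heappop(heap)
        if len(prefix) == n_labels:
            # With an admissible bound, the first complete vector popped
            # is a mode of the joint distribution.
            return prefix, g, expanded
        expanded += 1
        for value in (0, 1):
            child = prefix + (value,)
            g_child = g * cond_prob(prefix, value)
            if len(child) == n_labels:
                f_child = g_child  # nothing left to bound
            else:
                f_child = g_child * heuristic(child)
            heapq.heappush(heap, (-f_child, child, g_child))
    raise ValueError("n_labels must be non-negative")


def toy_cond_prob(prefix, value):
    """Toy chain (assumption): a label is likely 1 if the previous one was 1."""
    p_one = 0.9 if (not prefix or prefix[-1] == 1) else 0.2
    return p_one if value == 1 else 1.0 - p_one


if __name__ == "__main__":
    print(pcc_best_first(toy_cond_prob, n_labels=3))
```

    A tighter admissible `heuristic` would prune more prefixes at the price of extra computation per node, which is the trade-off between explored nodes and prediction time that the abstract points out.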

    A simple and efficient method for variable ranking according to their usefulness for learning

    The selection of a subset of input variables is often based on the prior construction of a ranking that orders the variables according to a given relevance criterion. The objective is then to linearize the search, estimating the quality of subsets containing the topmost ranked variables. An algorithm devised to rank input variables according to their usefulness in the context of a learning task is presented. This algorithm combines simple and classical techniques, such as correlation and orthogonalization, which allow the construction of a fast algorithm that also deals explicitly with redundancy. Additionally, the proposed ranker is endowed with a simple polynomial expansion of the input variables to cope with nonlinear problems. The comparison with some state-of-the-art rankers showed that this combination of simple components is able to yield high-quality rankings of input variables. The experimental validation is made on a wide range of artificial data sets, and the quality of the rankings is assessed using a ROC-inspired setting to avoid biased estimations due to any particular learning algorithm.
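
    One classical way to combine correlation and orthogonalization into a ranker is a Gram-Schmidt forward pass: pick the variable most correlated with the target, project it out of the target and of the remaining variables, and repeat. The sketch below illustrates that general idea only. It is an assumption about the family of techniques the abstract mentions, not the paper's algorithm; the function name `rank_variables` and its helpers are hypothetical, and the polynomial expansion is left out.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _center(v):
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def rank_variables(columns, target, tol=1e-12):
    """Rank variables by squared correlation with the target, removing the
    contribution of each selected variable before scoring the rest.

    columns: list of equal-length lists, one per input variable.
    target:  list with the output values.
    Returns the column indices, most useful first.
    """
    residuals = {j: _center(col) for j, col in enumerate(columns)}
    y = _center(target)
    ranking = []

    while residuals:
        def score(j):
            c = residuals[j]
            norm = _dot(c, c) * _dot(y, y)
            # Squared correlation between residual variable and residual target.
            return _dot(c, y) ** 2 / norm if norm > tol else 0.0

        best = max(residuals, key=score)
        q = residuals.pop(best)
        ranking.append(best)

        qq = _dot(q, q)
        if qq <= tol:
            continue  # nothing left in this direction to project out

        # Orthogonalize the target and the remaining variables against q,
        # so that redundant variables lose their score.
        coef_y = _dot(y, q) / qq
        y = [a - coef_y * b for a, b in zip(y, q)]
        for j, c in residuals.items():
            coef = _dot(c, q) / qq
            residuals[j] = [a - coef * b for a, b in zip(c, q)]

    return ranking


if __name__ == "__main__":
    x0 = [1, 2, 3, 4, 5, 6]
    x1 = [1, 2, 3, 4, 5, 7]        # near-copy of x0, hence redundant
    x2 = [1, -1, 1, -1, 1, -1]
    target = [3 * a + b for a, b in zip(x0, x2)]
    print(rank_variables([x0, x1, x2], target))
```

    In this toy data the near-copy `x1` has the second-highest plain correlation with the target, yet it is ranked last once `x0` has been projected out, which is the explicit handling of redundancy the abstract refers to.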

    Utilización de técnicas de Inteligencia Artificial en la clasificación de canales bovinas (Use of Artificial Intelligence techniques in the classification of bovine carcasses)

    This communication presents an application of Artificial Intelligence techniques in the food industry. A methodology for representing the conformation of bovine carcasses has been developed, synthesizing the experts' knowledge by means of Machine Learning tools. The results show the viability of using automatic classifiers, which can perform their task effectively with a substantial reduction in the initial number of attributes. This work opens up a wide range of possible applications of Machine Learning in the food industry.

    The usefulness of artificial intelligence techniques to assess subjective quality of products in the food industry

    In this paper we advocate the application of Artificial Intelligence techniques to quality assessment of food products. Machine Learning algorithms can help us to: (a) extract operative human knowledge from a set of examples; (b) derive interpretable rules for classifying samples regardless of the non-linearity of the human behaviour or process; and (c) ascertain the degree of influence of each objective attribute of the assessed food on the final decision of an expert. We illustrate these topics with an example of how it is possible to clone the behaviour of bovine carcass classifiers, leading to possible further industrial applications.

    Clonal chromosomal mosaicism and loss of chromosome Y in elderly men increase vulnerability for SARS-CoV-2

    The pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2, COVID-19) had an estimated overall case fatality ratio of 1.38% (pre-vaccination), being 53% higher in males and increasing exponentially with age. Among 9578 individuals diagnosed with COVID-19 in the SCOURGE study, we found 133 cases (1.42%) with detectable clonal mosaicism for chromosome alterations (mCA) and 226 males (5.08%) with acquired loss of chromosome Y (LOY). Individuals with clonal mosaic events (mCA and/or LOY) showed a 54% increase in the risk of COVID-19 lethality. LOY is associated with transcriptomic biomarkers of immune dysfunction, pro-coagulation activity and cardiovascular risk. Interferon-induced genes involved in the initial immune response to SARS-CoV-2 are also down-regulated in LOY. Thus, mCA and LOY underlie at least part of the sex-biased severity and mortality of COVID-19 in aging patients. Given its potential therapeutic and prognostic relevance, evaluation of clonal mosaicism should be implemented as a biomarker of COVID-19 severity in elderly people.

    Diverse Large HIV-1 Non-subtype B Clusters Are Spreading Among Men Who Have Sex With Men in Spain

    In Western Europe, the HIV-1 epidemic among men who have sex with men (MSM) is dominated by subtype B. However, recently, other genetic forms have been reported to circulate in this population, as evidenced by their grouping in clusters predominantly comprising European individuals. Here we describe four large HIV-1 non-subtype B clusters spreading among MSM in Spain. Samples were collected in 9 regions. A pol fragment was amplified from plasma RNA or blood-extracted DNA. Phylogenetic analyses were performed via maximum likelihood, including database sequences of the same genetic forms as the identified clusters. Times and locations of the most recent common ancestors (MRCA) of clusters were estimated with a Bayesian method. Five large non-subtype B clusters associated with MSM were identified. The largest one, of F1 subtype, was reported previously. The other four were of CRF02_AG (CRF02_1; n = 115) and subtypes A1 (A1_1; n = 66), F1 (F1_3; n = 36), and C (C_7; n = 17). Most individuals belonging to them had been diagnosed with HIV-1 infection in the last 10 years. Each cluster comprised viruses from 3 to 8 Spanish regions and also comprised or was related to viruses from other countries: CRF02_1 comprised a Japanese subcluster and viruses from 8 other countries from Western Europe, Asia, and South America; A1_1 comprised viruses from Portugal, United Kingdom, and United States, and was related to the A1 strain circulating in Greece, Albania and Cyprus; F1_3 was related to viruses from Romania; and C_7 comprised viruses from Portugal and was related to a virus from Mozambique. A subcluster within CRF02_1 was associated with heterosexual transmission. Near full-length genomes of each cluster were of uniform genetic form. Times of MRCAs of CRF02_1, A1_1, F1_3, and C_7 were estimated around 1986, 1989, 2013, and 1983, respectively. MRCA locations for CRF02_1 and A1_1 were uncertain (although initial expansions in Spain were estimated in Madrid and Vigo, respectively) and were most probable in Bilbao, Spain, for F1_3 and in Portugal for C_7. These results show that the HIV-1 epidemic among MSM in Spain is becoming increasingly diverse through the expansion of diverse non-subtype B clusters, comprising or related to viruses circulating in other countries.

    SAFE: sistema de aprendizaje de funciones a partir de ejemplos (SAFE: a system for learning functions from examples)

    There are two fundamental approaches to machine learning from examples: systems that try to synthesize rules (or other equivalent mechanisms) capable of classifying unseen cases into a finite set of discrete categories, and algorithms that aim to induce functions. This thesis combines the two approaches to obtain a new method that can learn functions and that uses, as a central tool, a machine learning system for discrete categories. The result is a mechanism that naturally handles attributes with both discrete and continuous values and that produces functions partially defined by linear maps over a subset of the numeric attributes of the domain. The resulting system is called SAFE, an acronym for "sistema de aprendizaje de funciones a partir de ejemplos" (system for learning functions from examples), and it compares favorably with well-regarded algorithms designed for this task.

    A heuristic in A* for inference in nonlinear Probabilistic Classifier Chains

    Probabilistic Classifier Chains (PCC) is a very interesting method for multi-label classification, since it is able to obtain the entire joint probability distribution of the labels. However, this probability distribution is obtained at the expense of a high computational cost. Several efforts have been made to overcome this pitfall, proposing different inference methods for estimating the probability distribution. Beam search and the ε-approximate algorithm are two methods of this kind. A more recent approach is based on the A* algorithm with an admissible heuristic, but it is limited to linear classifiers as base methods for PCC. This paper follows that direction, presenting an alternative admissible heuristic for the A* algorithm with two promising advantages over the above-mentioned heuristic, namely, i) it is more dominant for the same depth and, hence, it explores fewer nodes and ii) it is suitable for nonlinear classifiers. Additionally, the paper proposes an efficient implementation for the computation of the heuristic that reduces the number of models that must be evaluated by half. The experiments show, as theoretically expected, that this new algorithm reaches Bayes-optimal predictions in terms of subset 0/1 loss and explores fewer nodes than other state-of-the-art methods that also provide optimal predictions. In spite of exploring fewer nodes, this new algorithm is not as fast as the ε-approximate algorithm when the search for an optimal solution is highly directed. However, it shows its strengths when the datasets present more uncertainty, making faster predictions than other state-of-the-art approaches.