
    Autonomous learning multi-model systems from data streams

    In this paper, an approach to autonomous learning of a multi-model system from streaming data, named ALMMo, is proposed. The proposed approach is generic and can also easily be applied to probabilistic or other types of local models forming multi-model systems. It is fully data-driven, and its structure is decided by the nonparametric data clouds extracted from the empirically observed data without making any prior assumptions about the data distribution or other data properties. All meta-parameters of the proposed system are obtained directly from the data and can be updated recursively, which improves the memory and computational efficiency of the proposed algorithm. The structural evolution mechanism and the online data cloud quality monitoring mechanism of the ALMMo system largely enhance its ability to handle shifts and/or drifts in the streaming data pattern. Numerical examples of the use of the ALMMo system for streaming data analytics, classification and prediction are presented as a proof of the proposed concept.
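To make the idea of recursively updated meta-parameters concrete, here is a minimal Python sketch (an illustration under assumptions, not the exact ALMMo formulation) that maintains a streaming mean and average squared norm and evaluates an empirical data density for new samples, without storing past data.

```python
import numpy as np


class RecursiveStats:
    """Recursively updated global mean and average squared norm: the kind of
    meta-parameters an ALMMo-style system can maintain sample by sample,
    without revisiting the stream (illustrative sketch only)."""

    def __init__(self, dim):
        self.k = 0                    # number of samples seen so far
        self.mu = np.zeros(dim)       # recursive mean of the stream
        self.X = 0.0                  # recursive mean of squared norms

    def update(self, x):
        x = np.asarray(x, dtype=float)
        self.k += 1
        self.mu += (x - self.mu) / self.k
        self.X += (x @ x - self.X) / self.k

    def density(self, x):
        """Cauchy-type empirical data density, a common choice in empirical
        data analytics; the exact ALMMo quantities may differ."""
        x = np.asarray(x, dtype=float)
        scatter = self.X - self.mu @ self.mu   # average squared distance to the mean
        return 1.0 / (1.0 + np.sum((x - self.mu) ** 2) / max(scatter, 1e-12))


# Usage: feed a (hypothetical) stream sample by sample.
stats = RecursiveStats(dim=2)
for x in np.random.randn(100, 2):
    stats.update(x)
print(stats.density(np.array([0.0, 0.0])))
```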

    Definitive Consensus for Distributed Data Inference

    Inference from data is of key importance in many applications of informatics. The current trend in performing such inference from data is to utilise machine learning algorithms. Moreover, in many applications it is either required or preferable to infer from the data in a distributed manner. Many practical difficulties arise from the fact that in many distributed applications we refrain from transferring data, or parts of it, due to cost, privacy and computation considerations. Admittedly, it would be advantageous if the final knowledge, attained through distributed data inference, were common to every participating computing node. The key to achieving this task is the distributed average consensus algorithm, or simply the consensus algorithm herein. The latter has been used in many applications. Initially, its main purpose was the estimation of the expectation of scalar-valued data distributed over a network of machines without a central node. Notably, the algorithm allows the final outcome to be the same for every participating node. Utilising the consensus algorithm as the centrepiece makes the task of distributed data inference feasible. However, there are many difficulties that hinder its direct applicability. Thus, we concentrate on the consensus algorithm with the purpose of addressing these difficulties. There are two main concerns. First, the consensus algorithm has asymptotic convergence, so we may only achieve maximum accuracy if the algorithm is left to run for a large number of iterations. Second, the accuracy attained at any iteration of the consensus algorithm is correlated with the standard deviation of the initial value distribution. The consensus algorithm is inherently imprecise at finite time, and this complicates the learning process. We solve this problem by introducing the definitive consensus algorithm. This algorithm attains maximum precision in a finite number of iterations, namely a number of iterations equal to the diameter of the graph, in a distributed and decentralised manner. Additionally, we introduce the nonlinear consensus algorithm and the adaptive consensus algorithm. These are modifications of the original consensus algorithm that allow improved precision with fewer iterations in cases of unknown, partially known and stochastically time-varying network topologies. The definitive consensus algorithm can be incorporated in a distributed data inference framework. We approach the problem of data inference from the perspective of machine learning. Specifically, we tailor this distributed inference framework for machine learning on a communication network with data partitioned over the participating computing nodes. In particular, the distributed data inference framework is detailed and applied to the case of a multilayer feed-forward neural network with error back-propagation. A substantial examination of its performance, and a comparison with the non-distributed case, is provided. A theoretical foundation for the definitive consensus algorithm is provided, and its superior performance is validated by numerical experiments. A brief theoretical examination of the nonlinear and the adaptive consensus algorithms is performed to justify their improved performance with respect to the original consensus algorithm. Moreover, extensive numerical simulations are given to compare the nonlinear and the adaptive algorithms with the original consensus algorithm.
The most important contributions of this research are principally the definitive consensus algorithm and the distributed data inference framework. Their combination yields a decentralised distributed process over a communication network capable of inference in agreement over the entire network.
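For context, the sketch below implements the classical asymptotic average consensus iteration that the thesis takes as its starting point; the ring topology, step size and initial values are illustrative assumptions, and the definitive, nonlinear and adaptive variants proposed in the thesis are not reproduced here.

```python
import numpy as np


def average_consensus(x0, A, eps, iters):
    """Standard (asymptotic) distributed average consensus: each node repeatedly
    moves toward its neighbours' values. A is the symmetric adjacency matrix of
    the communication graph; eps must be smaller than 1 / max_degree."""
    x = np.asarray(x0, dtype=float).copy()
    deg = A.sum(axis=1)
    for _ in range(iters):
        # x_i <- x_i + eps * sum_j A_ij * (x_j - x_i)
        x = x + eps * (A @ x - deg * x)
    return x


# 4-node ring graph; every node starts with a private scalar value.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
x0 = np.array([1.0, 3.0, 5.0, 7.0])                  # true average is 4.0
print(average_consensus(x0, A, eps=0.25, iters=50))  # all entries approach 4.0
```

Note how the agreement is only approached asymptotically: the number of iterations controls the residual error, which is exactly the limitation the definitive consensus algorithm is introduced to remove.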

    Optimal control and approximations

    Multivariate methods for interpretable analysis of magnetic resonance spectroscopy data in brain tumour diagnosis

    Malignant tumours of the brain represent one of the most difficult types of cancer to treat due to the sensitive organ they affect. Clinical management of the pathology becomes even more intricate as the tumour mass increases due to proliferation, suggesting that an early and accurate diagnosis is vital to prevent it from following its normal course of development. The standard clinical practice for diagnosis includes invasive techniques that might be harmful to the patient, a fact that has fostered intensive research towards the discovery of alternative non-invasive brain tissue measurement methods, such as nuclear magnetic resonance. One of its variants, magnetic resonance imaging, is already used on a regular basis to locate and bound the brain tumour; but a complementary variant, magnetic resonance spectroscopy, despite its higher spatial resolution and its capability to identify biochemical metabolites that might become biomarkers of tumour within a delimited area, lags behind in terms of clinical use, mainly due to its difficult interpretability. The interpretation of magnetic resonance spectra corresponding to brain tissue thus becomes an interesting field of research for automated methods of knowledge extraction such as machine learning, always understanding its secondary role behind human expert medical decision making. The current thesis aims at contributing to the state of the art in this domain by providing novel techniques for the assistance of radiology experts, focusing on complex problems and delivering interpretable solutions. In this respect, an ensemble learning technique to accurately discriminate amongst the most aggressive brain tumours, namely glioblastomas and metastases, has been designed; moreover, a strategy to increase the stability of biomarker identification in the spectra by means of instance weighting is provided. From a different analytical perspective, a tool based on signal source separation, guided by tumour type-specific information, has been developed to assess the existence of different tissues in the tumoural mass, quantifying their influence in the vicinity of tumoural areas. This development has led to the derivation of a probabilistic interpretation of some source separation techniques, which provides support for uncertainty handling and strategies for the estimation of the most accurate number of differentiated tissues within the analysed tumour volumes. The provided strategies should assist human experts through the use of automated decision support tools, tackling interpretability and accuracy from different angles.
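As a rough, hedged illustration of the source-separation building block, the sketch below applies plain non-negative matrix factorisation (scikit-learn's NMF) to a hypothetical matrix of magnitude spectra; the tumour type-specific guidance and the probabilistic interpretation developed in the thesis are not reproduced here, and all array shapes and names are assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical matrix of magnitude MR spectra: one row per voxel, one column per
# spectral frequency bin; values are assumed non-negative.
rng = np.random.default_rng(0)
spectra = np.abs(rng.normal(size=(200, 195)))

# Unsupervised non-negative source separation into k putative tissue sources.
k = 3
model = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=0)
mixing = model.fit_transform(spectra)   # (voxels x k): per-voxel source contributions
sources = model.components_             # (k x frequencies): candidate tissue spectra

# Normalised per-voxel rows give the relative influence of each recovered source,
# analogous to quantifying tissue influence in and around the tumoural area.
weights = mixing / (mixing.sum(axis=1, keepdims=True) + 1e-12)
```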

    Machine Learning Methods with Noisy, Incomplete or Small Datasets

    In many machine learning applications, available datasets are sometimes incomplete, noisy or affected by artifacts. In supervised scenarios, label information may be of low quality, including unbalanced training sets, noisy labels and other problems. Moreover, in practice it is very common that the available data samples are not enough to derive useful supervised or unsupervised classifiers. All these issues are commonly referred to as the low-quality data problem. This book collects novel contributions on machine learning methods for low-quality datasets, aiming to disseminate new ideas for solving this challenging problem and to provide clear examples of application in real scenarios.
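As one small, hedged example of coping with such low-quality label scenarios, the sketch below trains a class-weighted classifier on a deliberately imbalanced synthetic dataset; the data, model and metric are illustrative assumptions rather than examples taken from the book.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical low-quality setting: a heavily imbalanced binary problem.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" re-weights samples inversely to class frequency,
# one simple mitigation for unbalanced training sets.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr)
print(balanced_accuracy_score(y_te, clf.predict(X_te)))
```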

    Deep Learning in Medical Image Analysis

    The accelerating power of deep learning in diagnosing diseases will empower physicians and speed up decision making in clinical environments. Applications of modern medical instruments and the digitalisation of medical care have generated enormous amounts of medical images in recent years. In this big-data arena, new deep learning methods and computational models for efficient processing, analysis and modelling of the generated data are crucially important for clinical applications and for understanding the underlying biological processes. This book presents and highlights novel algorithms, architectures, techniques and applications of deep learning for medical image analysis.

    Assessing sustainability in cities: a complexity science approach to the concept of happiness for the urban environment

    Where we live affects all aspects of our life and thus our happiness. In recent years, and now for more than half of the Earth's population, our place of residence or activity has increasingly been transformed into an urban one. We start our quest for happiness using bibliometric research to investigate its framework as scientists have constructed it over the past years. We detect that, while the impact of happiness studies has grown in importance during the last twenty years, happiness-related concepts find it difficult to penetrate the field of urban studies. We map the temporal evolution of both the happiness and urban studies fields into dynamic networks obtained by keyword co-occurrence analysis of papers. We identify the main concepts of the "urban happiness" field and their capacity to agglomerate into coherent thematic clusters. We present a one-parameter spatial network model to reproduce the changes in the topology of these networks. The results explain the evolution and the level of interpenetration of these two fields as a function of "conceptual" distances, mapped into Euclidean ones. Complex network science appears as a valid alternative to other approaches (e.g., the co-frequency matrix of bibliometric analysis), and opens the way for the systematic study of other academic fields in terms of complex evolving networks. We then present a methodology based on the "human scale development" paradigm of Max-Neef et al. (1991) to measure current levels of Quality of Life (QoL) for urban environments. We use the fundamental human needs as our study domains. Drawing on the cases of the Vila de Gràcia neighbourhood and Virreina square in Barcelona, we assess the fulfilment of these needs with a set of questions reflecting the subjective dimension of QoL. We use two consecutive processes to sort questions into needs: a qualitative one involving local communities and/or expert groups, and a quantitative one involving the definition of weights for each question and need. We add objective indicators to reflect the objective dimension of QoL. We compare the two dimensions and define an integrative QoL. We identify intervention axes for a potential improvement in the results. We argue that this method can be used to define more holistic urban quality indexes to improve decision-making processes, policies and plans. It is a tool to enhance bottom-up approaches and processes of urban analysis to create more liveable places for dwellers. Next, we present a methodology based on weighted networks and dependence coefficients aimed at revealing connectivity patterns between categories. Using the same case studies and the human needs as our categories, we show that diverse spatial levels present different and nontrivial patterns of need emergence. A numerical model indicates that these patterns depend on the probability distribution of the weights. We suggest that this way of analysing the connectivity of categories (human needs in our case study) in social and ecological systems can be used to define new strategies to cope with complex processes, such as those related to transition management and governance, urban-making, and integrated planning. We conclude our journey with applications that show the strength of collective response regarding social matters. We study dwellers' perceptions through the following cases: experimental activities in the public space, discourse analysis, and reactions to emerging urban phenomena such as the massive migration of population in the Mediterranean during 2015.
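To illustrate the kind of keyword co-occurrence networks used above, the sketch below builds a small weighted network with networkx from hypothetical per-paper keyword lists and extracts thematic clusters with a standard modularity-based community detection routine; the keyword lists and the choice of community detection (rather than the thesis' one-parameter spatial network model) are assumptions for illustration.

```python
import itertools

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical per-paper keyword lists; in the thesis these come from the
# happiness and urban-studies bibliographic corpora.
papers = [
    ["happiness", "well-being", "urban"],
    ["urban", "quality of life", "well-being"],
    ["happiness", "quality of life", "urban"],
]

# Weighted keyword co-occurrence network: nodes are keywords, and an edge weight
# counts how many papers mention both keywords together.
G = nx.Graph()
for kws in papers:
    for a, b in itertools.combinations(sorted(set(kws)), 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

# Thematic clusters via modularity-based community detection (illustrative choice).
clusters = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in clusters])
```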