
    Spherical and Hyperbolic Toric Topology-Based Codes On Graph Embedding for Ising MRF Models: Classical and Quantum Topology Machine Learning

    The paper applies information geometry to describe the ground states of Ising models, using parity-check matrices of cyclic and quasi-cyclic codes on toric and spherical topologies. The approach establishes a connection between machine learning and error-correcting coding, with implications for the development of new embedding methods based on trapping sets. Statistical physics and number geometry are applied to optimize error-correcting codes, leading to these embedding and sparse factorization methods. The paper establishes a direct connection between DNN architecture and error-correcting coding by demonstrating how state-of-the-art architectures (ChordMixer, Mega, Mega-chunk, CDIL, ...) from the long-range arena can be equivalent to block and convolutional LDPC codes (Cage-graph, Repeat Accumulate). QC codes correspond to certain types of chemical elements, with the carbon element being represented by the mixed automorphism Shu-Lin-Fossorier QC-LDPC code. The connections between Belief Propagation and the Permanent, Bethe-Permanent, Nishimori Temperature, and Bethe-Hessian Matrix are elaborated upon in detail. The Quantum Approximate Optimization Algorithm (QAOA) used in the Sherrington-Kirkpatrick Ising model can be seen as analogous to the back-propagation loss function landscape in training DNNs. This similarity gives rise to a comparable problem with trapping-set (TS) pseudo-codewords, as in belief propagation. Additionally, the layer depth in QAOA correlates with the number of belief propagation decoding iterations in the Wiberg decoding tree. Overall, this work has the potential to advance multiple fields, from Information Theory, DNN architecture design (sparse and structured prior graph topology), and efficient hardware design for Quantum and Classical DPU/TPU (graph, quantization, and shift-register architectures) to Materials Science and beyond.
    Comment: 71 pages, 42 Figures, 1 Table, 1 Appendix. arXiv admin note: text overlap with arXiv:2109.08184 by other author

    DeepSphere: Efficient spherical Convolutional Neural Network with HEALPix sampling for cosmological applications

    Convolutional Neural Networks (CNNs) are a cornerstone of the Deep Learning toolbox and have led to many breakthroughs in Artificial Intelligence. These networks have mostly been developed for regular Euclidean domains such as those supporting images, audio, or video. Because of their success, CNN-based methods are becoming increasingly popular in Cosmology. Cosmological data often comes as spherical maps, which make the use of the traditional CNNs more complicated. The commonly used pixelization scheme for spherical maps is the Hierarchical Equal Area isoLatitude Pixelisation (HEALPix). We present a spherical CNN for analysis of full and partial HEALPix maps, which we call DeepSphere. The spherical CNN is constructed by representing the sphere as a graph. Graphs are versatile data structures that can act as a discrete representation of a continuous manifold. Using the graph-based representation, we define many of the standard CNN operations, such as convolution and pooling. With filters restricted to being radial, our convolutions are equivariant to rotation on the sphere, and DeepSphere can be made invariant or equivariant to rotation. This way, DeepSphere is a special case of a graph CNN, tailored to the HEALPix sampling of the sphere. This approach is computationally more efficient than using spherical harmonics to perform convolutions. We demonstrate the method on a classification problem of weak lensing mass maps from two cosmological models and compare the performance of the CNN with that of two baseline classifiers. The results show that the performance of DeepSphere is always superior or equal to both of these baselines. For high noise levels and for data covering only a smaller fraction of the sphere, DeepSphere achieves typically 10% better classification accuracy than those baselines. 
Finally, we show how learned filters can be visualized to introspect the neural network.
    Comment: arXiv admin note: text overlap with arXiv:astro-ph/0409513 by other author
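The graph-based convolution described in the DeepSphere abstract can be illustrated with a minimal sketch. It assumes a filter that is a polynomial of the graph Laplacian and uses a small ring graph as a toy stand-in for a HEALPix pixel graph; the function names, the ring graph, and the coefficients are illustrative assumptions, not the authors' code. Because such a filter depends only on the graph structure, it commutes with symmetries of the graph, which is the discrete analogue of the rotation equivariance mentioned above.

```python
import numpy as np

def ring_laplacian(n):
    """Combinatorial Laplacian L = D - A of an n-node ring graph (toy pixel graph, n >= 3)."""
    A = np.zeros((n, n))
    idx = np.arange(n)
    A[idx, (idx + 1) % n] = 1.0
    A[(idx + 1) % n, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def poly_graph_conv(L, x, theta):
    """Apply the filter sum_k theta[k] * L^k to the signal x without forming L^k explicitly."""
    out = np.zeros_like(x, dtype=float)
    Lk_x = x.astype(float)      # L^0 x
    for coeff in theta:
        out += coeff * Lk_x
        Lk_x = L @ Lk_x         # advance to the next power of L applied to x
    return out

n = 8
L = ring_laplacian(n)
rng = np.random.default_rng(0)
x = rng.normal(size=n)
theta = np.array([0.5, -0.2, 0.05])   # hypothetical filter coefficients

y = poly_graph_conv(L, x, theta)
# Shifting the signal around the ring is a symmetry of this graph:
# filtering the shifted signal should equal shifting the filtered signal.
y_shifted = poly_graph_conv(L, np.roll(x, 3), theta)
equivariant = np.allclose(np.roll(y, 3), y_shifted)
```

Each multiplication by L only mixes neighbouring nodes, so a degree-K filter costs K sparse matrix-vector products, which is one way to see why a graph formulation can be cheaper than a spherical-harmonic transform.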

    Let's Enhance: A Deep Learning Approach to Extreme Deblurring of Text Images

    This work presents a novel deep-learning-based pipeline for the inverse problem of image deblurring, leveraging augmentation and pre-training with synthetic data. Our results build on our winning submission to the recent Helsinki Deblur Challenge 2021, whose goal was to explore the limits of state-of-the-art deblurring algorithms in a real-world data setting. The task of the challenge was to deblur out-of-focus images of random text, thereby maximizing, in a downstream task, an optical-character-recognition-based score function. A key step of our solution is the data-driven estimation of the physical forward model describing the blur process. This enables a stream of synthetic data, generating pairs of ground-truth and blurry images on-the-fly, which is used for an extensive augmentation of the small amount of challenge data provided. The actual deblurring pipeline consists of an approximate inversion of the radial lens distortion (determined by the estimated forward model) and a U-Net architecture, which is trained end-to-end. Our algorithm was the only one passing the hardest challenge level, achieving over 70% character recognition accuracy. Our findings are well in line with the paradigm of data-centric machine learning, and we demonstrate its effectiveness in the context of inverse problems. Apart from a detailed presentation of our methodology, we also analyze the importance of several design choices in a series of ablation studies. The code of our challenge submission is available under https://github.com/theophil-trippe/HDC_TUBerlin_version_1.
    Comment: This article has been published in a revised form in Inverse Problems and Imagin
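The idea of streaming synthetic training pairs from a forward model can be sketched as follows. This is a minimal illustration under strong assumptions: the blur is modelled as a circular convolution with a uniform disk kernel plus Gaussian noise, and the "text" is replaced by a random binary image. The kernel shape, radius, noise level, and all function names are hypothetical; the abstract's actual forward model is estimated from the challenge data and also includes a radial lens distortion.

```python
import numpy as np

def disk_kernel(shape, radius):
    """Normalized disk-shaped point spread function centred on pixel (0, 0) with wrap-around."""
    h, w = shape
    yy, xx = np.mgrid[:h, :w]
    dy = np.minimum(yy, h - yy)   # wrap-around distance along rows
    dx = np.minimum(xx, w - xx)   # wrap-around distance along columns
    k = (dx**2 + dy**2 <= radius**2).astype(float)
    return k / k.sum()

def blur(img, kernel):
    """Circular convolution of img with kernel, computed via the FFT."""
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel)))

def synthetic_pairs(rng, shape=(32, 32), radius=3.0, noise_std=0.01):
    """Endless generator of (ground truth, blurry) pairs produced on the fly."""
    kernel = disk_kernel(shape, radius)
    while True:
        sharp = (rng.random(shape) > 0.8).astype(float)   # stand-in for a text image
        blurry = blur(sharp, kernel) + noise_std * rng.normal(size=shape)
        yield sharp, blurry

pairs = synthetic_pairs(np.random.default_rng(0))
sharp, blurry = next(pairs)   # one training pair; a training loop would keep calling next()
```

Because the generator never runs out, a network can be pre-trained on as many synthetic pairs as needed before fine-tuning on the small set of real challenge images.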

    Tensor-variate machine learning on graphs

    Traditional machine learning algorithms are facing significant challenges as the world enters the era of big data, with a dramatic expansion in volume and range of applications and an increase in the variety of data sources. The large- and multi-dimensional nature of data often increases the computational costs associated with their processing and raises the risks of model over-fitting - a phenomenon known as the curse of dimensionality. To this end, tensors have become a subject of great interest in the data analytics community, owing to their remarkable ability to super-compress high-dimensional data into a low-rank format, while retaining the original data structure and interpretability. This leads to a significant reduction in computational costs, from an exponential complexity to a linear one in the data dimensions. An additional challenge when processing modern big data is that they often reside on irregular domains and exhibit relational structures, which violates the regular grid assumptions of traditional machine learning models. To this end, there has been an increasing amount of research in generalizing traditional learning algorithms to graph data. This allows for the processing of graph signals while accounting for the underlying relational structure, such as user interactions in social networks, vehicle flows in traffic networks, transactions in supply chains, chemical bonds in proteins, and trading data in financial networks, to name a few. Although promising results have been achieved in these fields, there is a void in literature when it comes to the conjoint treatment of tensors and graphs for data analytics. Solutions in this area are increasingly urgent, as modern big data is both large-dimensional and irregular in structure. To this end, the goal of this thesis is to explore machine learning methods that can fully exploit the advantages of both tensors and graphs. 
In particular, the following approaches are introduced: (i) a graph-regularized tensor regression framework for modelling high-dimensional data while accounting for the underlying graph structure; (ii) a tensor-algebraic approach for computing efficient convolution on graphs; (iii) a graph tensor network framework for designing neural learning systems that is both general enough to describe most existing neural network architectures and flexible enough to model large-dimensional data on any and many irregular domains. The considered frameworks were employed in several real-world applications, including air quality forecasting, protein classification, and financial modelling. Experimental results validate the advantages of the proposed methods, which achieved better or comparable performance against state-of-the-art models. Additionally, these methods benefit from increased interpretability and reduced computational costs, which are crucial for tackling the challenges posed by the era of big data.
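To make the notion of graph regularization in approach (i) concrete, here is a deliberately simplified sketch. It assumes ordinary (vector-valued) least squares with a Laplacian penalty, min_w ||Xw - y||^2 + lam * w^T L w, and a path graph over the features; it leaves out the tensor (low-rank) structure that the thesis actually works with, and every name and value below is an illustrative assumption.

```python
import numpy as np

def path_laplacian(p):
    """Laplacian of a path graph linking feature i to feature i + 1."""
    A = np.zeros((p, p))
    i = np.arange(p - 1)
    A[i, i + 1] = 1.0
    A[i + 1, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def graph_ridge(X, y, L, lam):
    """Closed-form minimizer of ||Xw - y||^2 + lam * w^T L w (assumes X^T X + lam * L is invertible)."""
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

rng = np.random.default_rng(0)
n, p = 200, 10
L = path_laplacian(p)
w_true = np.linspace(0.0, 1.0, p)          # coefficients that vary smoothly along the graph
X = rng.normal(size=(n, p))
y = X @ w_true + 0.5 * rng.normal(size=n)

w_plain = graph_ridge(X, y, L, lam=0.0)    # ordinary least squares
w_graph = graph_ridge(X, y, L, lam=50.0)   # graph-regularized estimate

def roughness(w):
    """Graph smoothness penalty w^T L w: sum of squared differences across edges."""
    return float(w @ L @ w)
```

The penalty w^T L w is small when coefficients of connected features are similar, so increasing lam trades data fit for smoothness over the graph.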

    Energy-Efficient Recurrent Neural Network Accelerators for Real-Time Inference

    Over the past decade, Deep Learning (DL) and Deep Neural Networks (DNNs) have undergone rapid development. They are now widely applied and have profoundly changed the lives of human beings. As an essential element of DNNs, Recurrent Neural Networks (RNNs) are helpful in processing time-sequential data and are widely used in applications such as speech recognition and machine translation. RNNs are difficult to compute because of their massive arithmetic operations and large memory footprint. RNN inference workloads used to be executed on conventional general-purpose processors, including Central Processing Units (CPUs) and Graphics Processing Units (GPUs); however, these contain hardware blocks that are unnecessary for RNN computation, such as branch predictors and caching systems, making them suboptimal for RNN processing. To accelerate RNN computations and outperform conventional processors, previous work focused on optimization methods on both software and hardware. On the software side, previous works mainly used model compression to reduce the memory footprint and the arithmetic operations of RNNs. On the hardware side, previous works also designed domain-specific hardware accelerators based on Field Programmable Gate Arrays (FPGAs) or Application Specific Integrated Circuits (ASICs) with customized hardware pipelines optimized for efficient processing of RNNs. By following this software-hardware co-design strategy, previous works achieved at least 10X speedup over conventional processors. Many previous works focused on achieving high throughput with a large batch of input streams. However, in real-time applications, such as gaming Artificial Intelligence (AI) and dynamical system control, low latency is more critical. Moreover, there is a trend of offloading neural network workloads to edge devices to provide a better user experience and privacy protection. 
Edge devices, such as mobile phones and wearable devices, are usually resource-constrained with a tight power budget. They require RNN hardware that is more energy-efficient to realize both low-latency inference and long battery life. Brain neurons have sparsity in both the spatial domain and the time domain. Inspired by this property, previous work mainly explored model compression to induce spatial sparsity in RNNs. The delta network algorithm alternatively induces temporal sparsity in RNNs and can save over 10X arithmetic operations in RNNs, as shown by previous works. In this work, we have proposed customized hardware accelerators to exploit temporal sparsity in Gated Recurrent Unit (GRU)-RNNs and Long Short-Term Memory (LSTM)-RNNs to achieve energy-efficient real-time RNN inference. First, we have proposed DeltaRNN, the first-ever RNN accelerator to exploit temporal sparsity in GRU-RNNs. DeltaRNN has achieved 1.2 TOp/s effective throughput with a batch size of 1, which is 15X higher than its related works. Second, we have designed EdgeDRNN to accelerate GRU-RNN edge inference. Compared to DeltaRNN, EdgeDRNN does not rely on on-chip memory to store RNN weights and focuses on reducing off-chip Dynamic Random Access Memory (DRAM) data traffic using a more scalable architecture. EdgeDRNN has realized real-time inference of large GRU-RNNs with submillisecond latency and only 2.3 W wall plug power consumption, achieving 4X higher energy efficiency than commercial edge AI platforms like NVIDIA Jetson Nano. Third, we have used DeltaRNN to realize the first-ever continuous speech recognition system with the Dynamic Audio Sensor (DAS) as the front-end. The DAS is a neuromorphic event-driven sensor that produces a stream of asynchronous events instead of audio data sampled at a fixed sample rate. 
We have also showcased how an RNN accelerator can be integrated with an event-driven sensor on the same chip to realize ultra-low-power Keyword Spotting (KWS) on the extreme edge. Fourth, we have used EdgeDRNN to control a powered robotic prosthesis using an RNN controller to replace a conventional proportional–derivative (PD) controller. EdgeDRNN has achieved 21 μs latency when running the RNN controller and could maintain stable control of the prosthesis. These applications of DeltaRNN and EdgeDRNN demonstrate their value in solving real-world problems. Finally, we have applied the delta network algorithm to LSTM-RNNs and have combined it with a customized structured pruning method, called Column-Balanced Targeted Dropout (CBTD), to induce spatio-temporal sparsity in LSTM-RNNs. Then, we have proposed another FPGA-based accelerator called Spartus, the first RNN accelerator that exploits spatio-temporal sparsity. Spartus achieved 9.4 TOp/s effective throughput with a batch size of 1, the highest among present FPGA-based RNN accelerators with a power budget around 10 W. Spartus can complete the inference of an LSTM layer having 5 million parameters within 1 μs
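The temporal sparsity exploited by the delta network algorithm can be sketched in a few lines. The following is a minimal software illustration, assuming the commonly described scheme in which a matrix-vector product is updated incrementally and only input entries whose change exceeds a threshold trigger work; the class name, the threshold value, and the toy input sequence are assumptions, and the sketch covers a single matrix-vector product rather than a full GRU or LSTM cell or any of the hardware designs named above.

```python
import numpy as np

class DeltaMatVec:
    """Incrementally tracks W @ x, skipping columns whose input barely changed."""

    def __init__(self, W, threshold):
        self.W = W
        self.threshold = threshold
        self.x_ref = np.zeros(W.shape[1])   # last input values that were propagated
        self.acc = np.zeros(W.shape[0])     # running value of W @ x_ref
        self.cols_used = 0                  # weight columns actually fetched
        self.cols_total = 0                 # columns a dense product would fetch

    def step(self, x):
        delta = x - self.x_ref
        active = np.abs(delta) > self.threshold          # entries that changed enough
        self.acc += self.W[:, active] @ delta[active]    # only active columns do work
        self.x_ref[active] = x[active]                   # update references for those entries only
        self.cols_used += int(active.sum())
        self.cols_total += x.size
        return self.acc.copy()

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16))
x = rng.normal(size=16)

sparse_op = DeltaMatVec(W, threshold=0.1)
for _ in range(10):
    x = x + 0.001 * rng.normal(size=16)   # slowly varying input sequence
    approx = sparse_op.step(x)

fraction_skipped = 1.0 - sparse_op.cols_used / sparse_op.cols_total
```

With a threshold of zero the accumulator reproduces the dense product up to rounding; with a positive threshold the result is approximate, and the skipped columns correspond to weight fetches and multiply-accumulates that never have to happen.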

    Unsupervised learning for vascular heterogeneity assessment of glioblastoma based on magnetic resonance imaging: The Hemodynamic Tissue Signature

    The future of medical imaging is linked to Artificial Intelligence (AI). The manual analysis of medical images is nowadays an arduous, error-prone and often unaffordable task for humans, which has caught the attention of the Machine Learning (ML) community. Magnetic Resonance Imaging (MRI) provides us with a wide variety of rich representations of the morphology and behavior of lesions that are completely inaccessible without a risky invasive intervention. Nevertheless, harnessing the powerful but often latent information contained in MRI acquisitions is a very complicated task, which requires intelligent computational analysis techniques. Central nervous system tumors are one of the most critical diseases studied through MRI. Specifically, glioblastoma represents a major challenge, as it remains a lethal cancer that, to date, lacks a satisfactory therapy. Of the entire set of characteristics that make glioblastoma so aggressive, a particular aspect that has been widely studied is its vascular heterogeneity. The strong vascular proliferation of glioblastomas, as well as their robust angiogenesis and extensive microvasculature heterogeneity, have been claimed to be responsible for the high lethality of the neoplasm. This thesis focuses on the research and development of the Hemodynamic Tissue Signature (HTS) method: an unsupervised ML approach to describe the vascular heterogeneity of glioblastomas by means of perfusion MRI analysis. The HTS builds on the concept of habitats. A habitat is defined as a sub-region of the lesion with a particular MRI profile describing a specific physiological behavior. 
The HTS method delineates four habitats within the glioblastoma: the HAT habitat, as the most perfused region of the enhancing tumor; the LAT habitat, as the region of the enhancing tumor with a lower angiogenic profile; the IPE habitat, as the non-enhancing region adjacent to the tumor with elevated perfusion indexes; and the VPE habitat, as the remaining edema of the lesion with the lowest perfusion profile. The research and development of the HTS method has generated a number of contributions to this thesis. First, in order to verify that unsupervised learning methods are reliable for extracting MRI patterns that describe the heterogeneity of a lesion, a comparison among several unsupervised learning methods was conducted for the task of high-grade glioma segmentation. Second, a Bayesian unsupervised learning algorithm from the family of Spatially Varying Finite Mixture Models is proposed. The algorithm integrates a Markov Random Field prior density weighted by the probabilistic Non-Local Means function, to codify the idea that neighboring pixels tend to belong to the same semantic object. Third, the HTS method to describe the vascular heterogeneity of glioblastomas is presented. The HTS method has been applied to real cases, both in a local single-center cohort of patients and in an international retrospective cohort of more than 180 patients from 7 European centers. A comprehensive evaluation of the method was conducted to measure the prognostic potential of the HTS habitats. Finally, the technology developed in this thesis has been integrated into an online open-access platform for academic use. The ONCOhabitats platform is hosted at https://www.oncohabitats.upv.es and provides two main services: 1) glioblastoma tissue segmentation, and 2) vascular heterogeneity assessment of glioblastomas by means of the HTS method. 
The results of this thesis have been published in ten scientific contributions, including top-ranked journals and conferences in the areas of Medical Informatics, Statistics and Probability, Radiology & Nuclear Medicine, and Machine Learning. An industrial patent registered in Spain, Europe and the USA was also issued. Finally, the original ideas conceived in this thesis led to the foundation of ONCOANALYTICS CDX, a company framed within the business model of companion diagnostics for pharmaceutical compounds.
    In this regard, I would like to thank the various institutions and research funding bodies that have contributed to the development of this thesis. In particular, I would like to thank the Universitat Politècnica de València, where I have developed my entire academic and scientific career, as well as the Ministerio de Ciencia e Innovación, the Ministerio de Economía y Competitividad, the European Commission, the EIT Health Programme, and the Caixa Impulse foundation.
    Juan Albarracín, J. (2020). Unsupervised learning for vascular heterogeneity assessment of glioblastoma based on magnetic resonance imaging: The Hemodynamic Tissue Signature [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/149560