86 research outputs found

    A subpath kernel for learning hierarchical image representations

    Tree kernels have demonstrated their ability to deal with hierarchical data, as the intrinsic tree structure often plays a discriminative role. While such kernels have been successfully applied to various domains such as natural language processing and bioinformatics, they mostly concentrate on ordered trees whose nodes are described by symbolic data. Meanwhile, hierarchical representations have gained increasing interest as a way to describe image content. This is particularly true in remote sensing, where such representations reveal different objects of interest at various scales through a tree structure. However, the induced trees are unordered and their nodes are equipped with numerical features. In this paper, we propose a new structured kernel for hierarchical image representations, built on the concept of the subpath kernel. Experimental results on both artificial and remote sensing datasets show that the proposed kernel manages to deal with the hierarchical nature of the data, leading to better classification rates.

    Combining multiple resolutions into hierarchical representations for kernel-based image classification

    The geographic object-based image analysis (GEOBIA) framework has gained increasing interest recently. Following this popular paradigm, we propose a novel multiscale classification approach operating on a hierarchical image representation built from two images at different resolutions. The two images capture the same scene with different sensors and are naturally fused together through the hierarchical representation, where coarser levels are built from a Low Spatial Resolution (LSR) or Medium Spatial Resolution (MSR) image while finer levels are generated from a High Spatial Resolution (HSR) or Very High Spatial Resolution (VHSR) image. Such a representation allows one to benefit from context information thanks to the coarser levels, and from information on the spatial arrangement of subregions thanks to the finer levels. Two dedicated structured kernels are then used to perform machine learning directly on the constructed hierarchical representation. This strategy overcomes the limits of conventional GEOBIA classification procedures, which can handle only one or very few pre-selected scales. Experiments run on an urban classification task show that the proposed approach can substantially improve classification accuracy with respect to conventional approaches working on a single scale.
    Comment: International Conference on Geographic Object-Based Image Analysis (GEOBIA 2016), University of Twente in Enschede, The Netherlands

    Tree Echo State Networks

    In this paper we present the Tree Echo State Network (TreeESN) model, generalizing the paradigm of Reservoir Computing to tree-structured data. TreeESNs exploit an untrained generalized recursive reservoir, exhibiting extreme efficiency for learning in structured domains. In addition, we highlight other characteristics of the approach throughout the paper. First, we discuss the Markovian characterization of reservoir dynamics, extended to the case of tree domains, that is implied by the contractive setting of the TreeESN state transition function. Second, we study two types of state mapping functions that map the tree-structured state of a TreeESN into a fixed-size feature representation for classification or regression tasks. The critical role of the relation between the choice of the state mapping function and the Markovian characterization of the task is analyzed and experimentally investigated on both artificial and real-world tasks. Finally, experimental results on benchmark and real-world tasks show that the TreeESN approach, in spite of its efficiency, can achieve results comparable with state-of-the-art, although more complex, neural and kernel-based models for tree-structured data.
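
    The recursive reservoir and the state mapping step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the helper names, the weight scales, and the two mappings shown (the root state and the mean over all node states) are assumptions made for illustration, since the abstract does not name the mappings it studies. The reservoir weights are random and never trained; small recurrent weights are used with the intent of keeping the state transition contractive.

```python
import math
import random


class Node:
    """Hypothetical tree node: a numeric label vector plus child nodes."""

    def __init__(self, label, children=()):
        self.label = list(label)
        self.children = list(children)


def make_matrix(rows, cols, scale, rng):
    """Random matrix with entries drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(node, w_in, w_hat, states):
    """Bottom-up pass: state(n) = tanh(W_in u(n) + sum over children of W_hat state(child))."""
    pre = matvec(w_in, node.label)
    for child in node.children:
        child_state = encode(child, w_in, w_hat, states)
        pre = [p + c for p, c in zip(pre, matvec(w_hat, child_state))]
    state = [math.tanh(p) for p in pre]
    states.append(state)  # collect every node state for the mean mapping
    return state


def tree_features(root, w_in, w_hat, mapping="root"):
    """Map a whole tree to one fixed-size vector (root state or mean of all states)."""
    states = []
    root_state = encode(root, w_in, w_hat, states)
    if mapping == "root":
        return root_state
    if mapping == "mean":
        return [sum(s[i] for s in states) / len(states) for i in range(len(root_state))]
    raise ValueError("mapping must be 'root' or 'mean'")
```

    Under these assumptions, building `w_in = make_matrix(5, 2, 0.5, rng)` and `w_hat = make_matrix(5, 5, 0.1, rng)` with `rng = random.Random(0)` gives a 5-dimensional feature vector for any tree whose labels have 2 entries, regardless of the tree's size. Only a readout trained on these vectors would need fitting.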

    Machine learning methods for decoding information in RNA interactions and DNA sequences

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2020. 2. Sun Kim.

    Phenotypic differences among organisms are mainly due to differences in genetic information. As a result of changes in genetic information, an organism may evolve into a different species, and patients with the same disease may have different prognoses. This important biological information can be observed in the form of various omics data using high-throughput technologies such as sequencing instruments. However, interpreting such omics data is challenging because the data have very high dimensionality but relatively few samples. Typically, the number of dimensions is higher than the number of samples, which makes the interpretation of omics data one of the most challenging machine learning problems. My doctoral study aims to develop new bioinformatics methods for decoding information in these high-dimensional data by utilizing machine learning algorithms.

    The first study analyzes the difference in the amount of information between different regions of the DNA sequence. To this end, a rank-based k-spectrum string kernel, the RKSS kernel, is developed for comparative and evolutionary comparison of various genomic region sequences among multiple species. The RKSS kernel extends the existing k-spectrum string kernel by utilizing rank information of k-mers and landmarks, sets of k-mers that represent what species have in common. The number of k-mers grows rapidly with k, but a landmark consists of very few k-mers, so using a landmark as a reference point for comparison dramatically reduces the number of k-mers needed to calculate sequence similarities. In experiments on three different genomic regions, the RKSS kernel captured more reliable distances between species according to the genetic information content of the target region. The RKSS kernel was also able to order the regions by information content in a way consistent with biological knowledge.

    The second study aims to efficiently decode complex genetic interactions using biological networks and then to classify cancer subtypes by interpreting biological functions. To this end, a pathway-based deep learning model using a graph convolutional network and a multi-attention based ensemble (GCN+MAE) is developed for cancer subtype classification. To handle the complex relationships between genes efficiently using pathway information, GCN+MAE is designed as an explainable deep learning structure that combines a graph convolutional network with an attention mechanism. The extracted pathway-level information of cancer subtypes is then transferred back to the gene level by network propagation. In experiments on five cancer data sets, GCN+MAE showed better cancer subtype classification performance than existing models and captured subtype-specific pathways and their biological functions.

    The third study identifies sub-networks of a biological pathway. The goal is to dissect a biological pathway into multiple sub-networks, each of which is a single functional unit. To this end, a condition-specific sub-module detection method for biological networks, MIDAS (MIning Differentially Activated Subpaths), is developed. From the pathway, edge activities are measured using gene expression and network topology. Using these activities, subpaths that are differentially activated across multiple classes are identified by a statistical approach. By extending this idea to a graph convolutional network, different sub-networks are also highlighted by attention mechanisms in networks larger than a pathway. In the experiment with breast cancer data, MIDAS and the deep learning model successfully decomposed gene-level features into sub-modules of single functions.

    In summary, my doctoral study proposes new computational methods to compare genomic DNA sequences in terms of information content, to model pathway-based cancer subtype classification and regulation, and to identify condition-specific sub-modules among multiple cancer subtypes.

    Contents:
    Chapter 1 Introduction
        1.1 Biological questions with genetic information
            1.1.1 Biological Sequences
            1.1.2 Gene expression
        1.2 Formulating computational problems for the biological questions
            1.2.1 Decoding biological sequences by k-mer vectors
            1.2.2 Interpretation of complex relationships between genes
        1.3 Three computational problems for the biological questions
        1.4 Outline of the thesis
    Chapter 2 Ranked k-spectrum kernel for comparative and evolutionary comparison of DNA sequences
        2.1 Motivation
            2.1.1 String kernel for sequence comparison
            2.1.2 Approach: RKSS kernel
        2.2 Methods
            2.2.1 Mapping biological sequences to k-mer space: the k-spectrum string kernel
            2.2.2 The ranked k-spectrum string kernel with a landmark
            2.2.3 Single landmark-based reconstruction of phylogenetic tree
            2.2.4 Multiple landmark-based distance comparison of exons, introns, CpG islands
            2.2.5 Sequence Data for analysis
        2.3 Results
            2.3.1 Reconstruction of phylogenetic tree on the exons, introns, and CpG islands
            2.3.2 Landmark space captures the characteristics of three genomic regions
            2.3.3 Cross-evaluation of the landmark-based feature space
    Chapter 3 Pathway-based cancer subtype classification and interpretation by attention mechanism and network propagation
        3.1 Motivation
        3.2 Methods
            3.2.1 Encoding biological prior knowledge using Graph Convolutional Network
            3.2.2 Re-producing comprehensive biological process by Multi-Attention based Ensemble
            3.2.3 Linking pathways and transcription factors by network propagation with permutation-based normalization
        3.3 Results
            3.3.1 Pathway database and cancer data set
            3.3.2 Evaluation of individual GCN pathway models
            3.3.3 Performance of ensemble of GCN pathway models with multi-attention
            3.3.4 Identification of TFs as regulator of pathways and GO term analysis of TF target genes
    Chapter 4 Detecting sub-modules in biological networks with gene expression by statistical approach and graph convolutional network
        4.1 Motivation
            4.1.1 Pathway based analysis of transcriptome data
            4.1.2 Challenges and Summary of Approach
        4.2 Methods
            4.2.1 Convert single KEGG pathway to directed graph
            4.2.2 Calculate edge activity for each sample
            4.2.3 Mining differentially activated subpath among classes
            4.2.4 Prioritizing subpaths by the permutation test
            4.2.5 Extension: graph convolutional network and class activation map
        4.3 Results
            4.3.1 Identifying 36 subtype specific subpaths in breast cancer
            4.3.2 Subpath activities have a good discrimination power for cancer subtype classification
            4.3.3 Subpath activities have a good prognostic power for survival outcomes
            4.3.4 Comparison with an existing tool, PATHOME
            4.3.5 Extension: detection of subnetwork on PPI network
    Chapter 5 Conclusions
    Abstract in Korean
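
    As a point of reference for the thesis entry above, the plain k-spectrum string kernel that the RKSS kernel is said to extend can be sketched in a few lines: each sequence is mapped to its k-mer counts, and the kernel value is the inner product of the two count vectors. This is only the baseline kernel; the rank information and landmarks of the RKSS kernel are not reproduced here, and the function names and the cosine-style normalization are assumptions made for illustration.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous substring of length k (empty if len(seq) < k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def k_spectrum_kernel(s, t, k):
    """Inner product of the k-mer count vectors of s and t."""
    a, b = kmer_counts(s, k), kmer_counts(t, k)
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller spectrum
    return sum(count * b[kmer] for kmer, count in a.items())


def normalized_k_spectrum(s, t, k):
    """Cosine-normalized kernel in [0, 1]; returns 0.0 if either spectrum is empty."""
    denom = math.sqrt(k_spectrum_kernel(s, s, k) * k_spectrum_kernel(t, t, k))
    return k_spectrum_kernel(s, t, k) / denom if denom else 0.0
```

    For DNA, the feature space has 4^k possible k-mers, which is the growth in k that the abstract refers to; the sketch avoids materializing that space by storing only the k-mers that actually occur.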

    Efficient Methods for Computational Light Transport

    In this thesis we present contributions to different challenges of computational light transport. Light transport algorithms are present in many modern applications, from image generation for visual effects to real-time object detection. Light is a rich source of information that allows us to understand and represent our surroundings, but obtaining and processing this information presents many challenges due to its complex interactions with matter. This thesis provides advances in this subject from two different perspectives: steady-state algorithms, where the speed of light is assumed infinite, and transient-state algorithms, which deal with light as it travels not only through space but also through time. Our steady-state contributions address problems in both offline and real-time rendering. We target variance reduction in offline rendering by proposing a new efficient method for rendering participating media. In real-time rendering, we target the energy constraints of mobile devices by proposing a power-efficient rendering framework for real-time graphics applications. In the transient state, we first formalize light transport simulation in this domain, and present new efficient sampling methods and algorithms for transient rendering. Finally, we demonstrate the potential of simulated data to correct multipath interference in Time-of-Flight cameras, one of the pathological problems in transient imaging.

    Efficient Unbiased Rendering using Enlightened Local Path Sampling


    Light transport simulation connecting different spaces

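    Degree type: Doctorate (course-based). Examination committee: (chair) Masayuki Inaba, Professor, University of Tokyo; Shigeru Chiba, Professor, University of Tokyo; Takeo Igarashi, Professor, University of Tokyo; Takayasu Matsuo, Professor, University of Tokyo; Hideki Nakayama, Lecturer, University of Tokyo; Toshiya Hachisuka, Lecturer, University of Tokyo. University of Tokyo (東京大学)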

    Artistic Path Space Editing of Physically Based Light Transport

    Creating realistic images is an important goal of computer graphics, with applications in the feature film industry, architecture, and medicine, among others. Physically based rendering, which has recently gained wide acceptance across applications, relies on the numerical simulation of light transport along propagation paths prescribed by geometric optics, a model that suffices to achieve photorealism for common scenes. Overall, the computer-aided authoring of images and animations with well-designed and theoretically grounded shading has become much simpler. In practice, however, attention to details such as the structure of the output device is also important, and the subproblem of efficient physically based rendering in participating media, for example, is still far from solved. Furthermore, rendering must be seen as part of a wider context: the effective communication of ideas and information. Whether it is the form and function of a building, the medical visualization of a computed tomography scan, or the mood of a film sequence, messages in the form of digital images are omnipresent today. Unfortunately, the spread of the simulation-oriented methodology of physically based rendering has generally led to a loss of the intuitive, fine-grained, and local artistic control over the final image content that was available in earlier, less strict paradigms. The contributions of this dissertation cover different aspects of rendering. These are, first, fundamental subpixel rendering and efficient rendering methods for participating media.
    The focus of the work, however, is on approaches to an effective visual understanding of light propagation that enable local artistic intervention while achieving consistent and plausible results at the global level. The core idea is to visualize and edit light directly in the "path space" that encompasses all possible light paths. This contrasts with state-of-the-art methods, which either operate in image space or are tailored to specific, isolated lighting effects such as perfect mirror reflections, shadows, or caustics. Evaluation of the presented methods has shown that they can solve real-world image generation problems in film production.