2,136 research outputs found

    Classification and Verification of Online Handwritten Signatures with Time Causal Information Theory Quantifiers

    We present a new approach for online handwritten signature classification and verification based on descriptors stemming from Information Theory. The proposal uses the Shannon Entropy, the Statistical Complexity, and the Fisher Information evaluated over the Bandt and Pompe symbolization of the horizontal and vertical coordinates of signatures. These six features are easy and fast to compute, and they are the input to a One-Class Support Vector Machine classifier. The results surpass state-of-the-art techniques that employ higher-dimensional feature spaces, which often require specialized software and hardware. We assess the consistency of our proposal with respect to the size of the training sample, and we also use it to classify the signatures into meaningful groups. Comment: Submitted to PLOS On
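    As a rough illustration of the Bandt and Pompe symbolization mentioned in the abstract above, the sketch below maps a one-dimensional series (for example, the horizontal pen coordinate) to ordinal patterns and computes the normalized Shannon entropy of their distribution. The function names and the `order` and `delay` parameters are assumptions made for this example, not taken from the paper, and the Statistical Complexity and Fisher Information features are not covered.

```python
import math
from collections import Counter


def ordinal_patterns(series, order=3, delay=1):
    """Yield the Bandt-Pompe ordinal pattern of each embedded window.

    `order` and `delay` are hypothetical defaults chosen for illustration.
    """
    span = (order - 1) * delay
    for start in range(len(series) - span):
        window = [series[start + k * delay] for k in range(order)]
        # The pattern is the permutation of indices that sorts the window;
        # ties are broken by position because sorted() is stable.
        yield tuple(sorted(range(order), key=lambda k: window[k]))


def permutation_entropy(series, order=3, delay=1):
    """Shannon entropy of the pattern distribution, normalized to [0, 1]."""
    counts = Counter(ordinal_patterns(series, order, delay))
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(math.factorial(order))


if __name__ == "__main__":
    x_coordinates = [4, 7, 9, 10, 6, 11, 3]  # toy data, not a real signature
    print(list(ordinal_patterns(x_coordinates)))
    print(permutation_entropy(x_coordinates))
```

    Under these assumptions, applying the same function to the horizontal and the vertical coordinate series would give two of the six features the abstract describes.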

    Sparse Coding on Symmetric Positive Definite Manifolds using Bregman Divergences

    This paper introduces sparse coding and dictionary learning for Symmetric Positive Definite (SPD) matrices, which are often used in machine learning, computer vision and related areas. Unlike traditional sparse coding schemes that work in vector spaces, in this paper we discuss how SPD matrices can be described by a sparse combination of dictionary atoms, where the atoms are also SPD matrices. We propose to seek sparse coding by embedding the space of SPD matrices into Hilbert spaces through two types of Bregman matrix divergences. This not only leads to an efficient way of performing sparse coding, but also to an online and iterative scheme for dictionary learning. We apply the proposed methods to several computer vision tasks where images are represented by region covariance matrices. Our proposed algorithms outperform state-of-the-art methods on a wide range of classification tasks, including face recognition, action recognition, material classification and texture categorization.

    Machine Learning Techniques for Decoding Information in RNA Interactions and DNA Sequences

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2020. 2. Kim Sun.
    Phenotypic differences among organisms are mainly due to differences in genetic information. As a result of changes in genetic information, an organism may evolve into a different species, and patients with the same disease may have different prognoses. This important biological information can be observed in the form of various omics data using high-throughput technologies such as sequencing instruments. However, interpreting such omics data is challenging because the data have very high dimensionality but relatively few samples. Typically, the number of dimensions exceeds the number of samples, which makes the interpretation of omics data one of the most challenging machine learning problems. My doctoral study aims to develop new bioinformatics methods for decoding information in these high-dimensional data by utilizing machine learning algorithms. The first study analyzes the difference in the amount of information between different regions of the DNA sequence.
To achieve this goal, a rank-based k-spectrum string kernel, the RKSS kernel, is developed for comparative and evolutionary comparison of various genomic region sequences among multiple species. The RKSS kernel extends the existing k-spectrum string kernel by utilizing rank information of k-mers and landmarks of k-mers that represent a species. By using a landmark as a reference point for comparison, the number of k-mers needed to calculate sequence similarities is dramatically reduced. In experiments on three different genomic regions, the RKSS kernel captured more reliable distances between species according to the genetic information content of the target region. The RKSS kernel was also able to rank the regions in an order consistent with common biological knowledge. The second study aims to efficiently decode complex genetic interactions using biological networks and then to classify cancer subtypes by interpreting biological functions. To achieve this goal, a pathway-based deep learning model using a graph convolutional network and a multi-attention based ensemble (GCN+MAE) is developed for cancer subtype classification. To efficiently reduce the relationships between genes using pathway information, GCN+MAE is designed as an explainable deep learning structure using a graph convolutional network and an attention mechanism. Extracted pathway-level information on cancer subtypes is transferred back to the gene level by network propagation. In experiments on five cancer data sets, GCN+MAE showed better cancer subtype classification performance and captured subtype-specific pathways and their biological functions. The third study identifies sub-networks of a biological pathway. The goal is to dissect a biological pathway into multiple sub-networks, each of which is a single functional unit. To achieve this goal, MIDAS (MIning Differentially Activated Subpaths), a condition-specific sub-module detection method for biological networks, is developed.
From the pathway, edge activities are measured using explicit gene expression and network topology. Using these activities, differentially activated subpaths are identified by a statistical approach. Also, by extending this idea to a graph convolutional network, different sub-networks are highlighted by attention mechanisms. In the experiment with breast cancer data, MIDAS and the deep learning model successfully decomposed gene-level features into sub-modules of single functions. In summary, my doctoral study proposes new computational methods to compare genomic DNA sequences as information contents, to model pathway-based cancer subtype classifications and regulations, and to identify condition-specific sub-modules among multiple cancer subtypes.

    Chapter 1 Introduction
        1.1 Biological questions with genetic information
            1.1.1 Biological Sequences
            1.1.2 Gene expression
        1.2 Formulating computational problems for the biological questions
            1.2.1 Decoding biological sequences by k-mer vectors
            1.2.2 Interpretation of complex relationships between genes
        1.3 Three computational problems for the biological questions
        1.4 Outline of the thesis
    Chapter 2 Ranked k-spectrum kernel for comparative and evolutionary comparison of DNA sequences
        2.1 Motivation
            2.1.1 String kernel for sequence comparison
            2.1.2 Approach: RKSS kernel
        2.2 Methods
            2.2.1 Mapping biological sequences to k-mer space: the k-spectrum string kernel
            2.2.2 The ranked k-spectrum string kernel with a landmark
            2.2.3 Single landmark-based reconstruction of phylogenetic tree
            2.2.4 Multiple landmark-based distance comparison of exons, introns, CpG islands
            2.2.5 Sequence Data for analysis
        2.3 Results
            2.3.1 Reconstruction of phylogenetic tree on the exons, introns, and CpG islands
            2.3.2 Landmark space captures the characteristics of three genomic regions
            2.3.3 Cross-evaluation of the landmark-based feature space
    Chapter 3 Pathway-based cancer subtype classification and interpretation by attention mechanism and network propagation
        3.1 Motivation
        3.2 Methods
            3.2.1 Encoding biological prior knowledge using Graph Convolutional Network
            3.2.2 Re-producing comprehensive biological process by Multi-Attention based Ensemble
            3.2.3 Linking pathways and transcription factors by network propagation with permutation-based normalization
        3.3 Results
            3.3.1 Pathway database and cancer data set
            3.3.2 Evaluation of individual GCN pathway models
            3.3.3 Performance of ensemble of GCN pathway models with multi-attention
            3.3.4 Identification of TFs as regulator of pathways and GO term analysis of TF target genes
    Chapter 4 Detecting sub-modules in biological networks with gene expression by statistical approach and graph convolutional network
        4.1 Motivation
            4.1.1 Pathway based analysis of transcriptome data
            4.1.2 Challenges and Summary of Approach
        4.2 Methods
            4.2.1 Convert single KEGG pathway to directed graph
            4.2.2 Calculate edge activity for each sample
            4.2.3 Mining differentially activated subpath among classes
            4.2.4 Prioritizing subpaths by the permutation test
            4.2.5 Extension: graph convolutional network and class activation map
        4.3 Results
            4.3.1 Identifying 36 subtype specific subpaths in breast cancer
            4.3.2 Subpath activities have a good discrimination power for cancer subtype classification
            4.3.3 Subpath activities have a good prognostic power for survival outcomes
            4.3.4 Comparison with an existing tool, PATHOME
            4.3.5 Extension: detection of subnetwork on PPI network
    Chapter 5 Conclusions
    Abstract in Korean
    Docto
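    The k-spectrum string kernel that the thesis abstract above takes as its starting point can be sketched as follows. This is only the plain baseline kernel (an inner product of k-mer count vectors); the rank information and landmark mechanism of the RKSS kernel are not reproduced here, and the function names and the default `k` are assumptions for illustration.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def k_spectrum_kernel(seq_a, seq_b, k=3):
    """Inner product of the k-mer count vectors of two sequences."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Only k-mers present in both sequences contribute to the dot product.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items() if kmer in counts_b)


if __name__ == "__main__":
    # Toy sequences, not real genomic data.
    print(k_spectrum_kernel("ACGTACG", "ACGACG", k=3))
```

    Because the number of possible k-mers grows as 4^k for DNA, the full count vector becomes large quickly, which is the cost the abstract says the landmark idea is meant to reduce.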

    Review on Human Re-identification with Multiple Cameras

    Human re-identification is the core task in most surveillance systems, and it aims to match human pairs across different non-overlapping cameras. Several challenges must be overcome to achieve re-identification, such as variations in viewpoint, pose, image resolution, illumination and occlusion. In this study, we review existing works on the human re-identification task. Advantages and limitations of recent works are discussed. Finally, this paper suggests some future research directions for human re-identification.

    kernInt: A Kernel Framework for Integrating Supervised and Unsupervised Analyses in Spatio-Temporal Metagenomic Datasets

    The advent of next-generation sequencing technologies has allowed relative quantification of microbiome communities and their spatial and temporal variation. In recent years, supervised learning (i.e., prediction of a phenotype of interest) from taxonomic abundances has become increasingly common in the microbiome field. However, a gap exists between supervised and classical unsupervised analyses, which are based on computing ecological dissimilarities for visualization or clustering. Despite this, both approaches face common challenges, such as the compositional nature of next-generation sequencing data and the integration of the spatial and temporal dimensions. Here we propose a kernel framework that places unsupervised and supervised microbiome analyses on common ground, including the retrieval of microbial signatures (taxa importances). We define two compositional kernels (Aitchison-RBF and compositional linear) and discuss how to transform non-compositional beta-dissimilarity measures into kernels. Spatial data is integrated with multiple kernel learning, while longitudinal data is evaluated by specific kernels. We illustrate our framework through a single-point soil dataset, a human dataset with a spatial component, and a previously unpublished longitudinal dataset concerning pig production. The proposed framework and the case studies are freely available in the kernInt package at https://github.com/elies-ramon/kernInt
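    To make the idea of a compositional kernel in the abstract above more concrete, here is a minimal sketch. It assumes that an Aitchison-RBF kernel is an RBF kernel applied to the Aitchison distance, that is, the Euclidean distance between centered log-ratio (CLR) transformed compositions. This is not the kernInt implementation; the function names and the `gamma` parameter are hypothetical, and zero abundances would need a pseudocount that the sketch does not handle.

```python
import math


def clr(composition):
    """Centered log-ratio transform of a strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_rbf(x, y, gamma=1.0):
    """RBF kernel on the Aitchison distance between two compositions.

    `gamma` is an assumed bandwidth parameter chosen for illustration.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(clr(x), clr(y)))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    sample_a = [0.2, 0.3, 0.5]  # toy relative abundances
    sample_b = [0.1, 0.6, 0.3]
    print(aitchison_rbf(sample_a, sample_b))
    # Rescaling a sample leaves the kernel unchanged, because CLR removes
    # the total: counts and proportions of the same sample are equivalent.
    print(aitchison_rbf([2, 3, 5], sample_b))
```

    The scale invariance shown in the last lines is the reason log-ratio approaches are commonly used for relative abundance data such as sequencing counts.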