
    Up and Down: Mining Multidimensional Sequential Patterns Using Hierarchies

    Data warehouses contain large volumes of time-variant data stored to help analysis. Despite the evolution of OLAP analysis tools and methods, decision makers still cannot find data mining tools that take the specific features of the data (e.g. multidimensionality, hierarchies, time variance) into account. In this paper, we propose an original method to automatically extract sequential patterns taking hierarchies into account. This method extracts patterns that describe the inner trends by displaying patterns that go either from precise knowledge to general knowledge or from general knowledge to precise knowledge. For instance, one rule exhibited could be "data contain first many sales of coke in Paris and lemonade in London for the same date, followed by a large number of sales of soft drinks in Europe", which is said to be divergent (as precise results like coke precede general ones like soft drinks). Conversely, rules like "data contain first many sales of soft drinks in Europe and chips in London for the same date, followed by a large number of sales of coke in Paris" are said to be convergent. In this paper, we define the concepts related to this original method as well as the associated algorithms. The experiments we carried out show the interest of our proposal.
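    The convergent/divergent distinction in the abstract above can be illustrated with a small sketch. It is not the paper's algorithm: the toy hierarchy, the function names, and the simplified pairwise rule (a later item that is equal or more general on every dimension makes the step divergent, the reverse makes it convergent) are all assumptions made for illustration.

```python
# Hypothetical toy hierarchy (child -> parent); not taken from the paper.
PARENT = {
    "coke": "soft drinks",
    "lemonade": "soft drinks",
    "Paris": "Europe",
    "London": "Europe",
}


def ancestors(value):
    """Return the set of strict ancestors of value in the toy hierarchy."""
    found = set()
    while value in PARENT:
        value = PARENT[value]
        found.add(value)
    return found


def more_general(x, y):
    """True if x equals y or x is an ancestor of y."""
    return x == y or x in ancestors(y)


def relation(earlier, later):
    """Compare two items, each a tuple with one value per dimension."""
    if earlier == later:
        return None
    if all(more_general(b, a) for a, b in zip(earlier, later)):
        return "divergent"   # precise first, general afterwards
    if all(more_general(a, b) for a, b in zip(earlier, later)):
        return "convergent"  # general first, precise afterwards
    return None


def classify(sequence):
    """Label a sequence (a list of itemsets) using consecutive itemsets only."""
    labels = set()
    for first, second in zip(sequence, sequence[1:]):
        for a in first:
            for b in second:
                label = relation(a, b)
                if label:
                    labels.add(label)
    if len(labels) == 1:
        return labels.pop()
    return "mixed" if labels else "neither"


# The two example rules from the abstract, encoded as (product, place) items.
divergent_rule = [[("coke", "Paris"), ("lemonade", "London")],
                  [("soft drinks", "Europe")]]
convergent_rule = [[("soft drinks", "Europe"), ("chips", "London")],
                   [("coke", "Paris")]]
print(classify(divergent_rule), classify(convergent_rule))
```

    A real miner would also count support over many data sequences; this sketch only labels one already-extracted pattern.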

    Circular RNA CpG island hypermethylation-associated silencing in human cancer

    Noncoding RNAs (ncRNAs), such as microRNAs and long noncoding RNAs (lncRNAs), participate in cellular transformation. Work done in the last decade has also demonstrated that ncRNAs with growth-inhibitory functions can undergo promoter CpG island hypermethylation-associated silencing in tumorigenesis. Herein, we wondered whether circular RNAs (circRNAs), a type of RNA transcript that lacks 5′-3′ ends, forms closed loops, and is gaining relevance in cancer biology, are also a target of epigenetic inactivation in tumors. To tackle this issue, we have used cancer cells genetically deficient for the DNA methyltransferase enzymes in conjunction with circRNA expression microarrays. We have found that the loss of DNA methylation provokes a release of circRNA silencing. In particular, we have identified that promoter CpG island hypermethylation of the genes TUSC3 (tumor suppressor candidate 3), POMT1 (protein O-mannosyltransferase 1), ATRNL1 (attractin-like 1) and SAMD4A (sterile alpha motif domain containing 4A) is linked to the transcriptional downregulation of both the linear mRNA and the hosted circRNA. Although some circRNAs regulate the linear transcript, we did not observe changes in TUSC3 mRNA levels upon TUSC3 circ104557 overexpression. Interestingly, we found circRNA-mediated regulation of target miRNAs and an in vivo growth inhibitory effect upon TUSC3 circ104557 transduction. Data mining for 5′-end CpG island methylation of TUSC3, ATRNL1, POMT1 and SAMD4A in cancer cell lines and primary tumors showed that the epigenetic defect was commonly observed among different tumor types in association with the diminished expression of the corresponding transcript. Our findings support a role for circRNA DNA methylation-associated loss in human cancer.

    Dramatic expansion of the black widow toxin arsenal uncovered by multi-tissue transcriptomics and venom proteomics.

    Background: Animal venoms attract enormous interest given their potential for pharmacological discovery and understanding the evolution of natural chemistries. Next-generation transcriptomics and proteomics provide unparalleled, but underexploited, capabilities for venom characterization. We combined multi-tissue RNA-Seq with mass spectrometry and bioinformatic analyses to determine venom gland specific transcripts and venom proteins from the Western black widow spider (Latrodectus hesperus) and investigated their evolution.
    Results: We estimated expression of 97,217 L. hesperus transcripts in venom glands relative to silk and cephalothorax tissues. We identified 695 venom gland specific transcripts (VSTs), many of which BLAST and GO term analyses indicate may function as toxins or their delivery agents. ~38% of VSTs had BLAST hits, including latrotoxins, inhibitor cystine knot toxins, CRISPs, hyaluronidases, chitinase, and proteases, and 59% of VSTs had predicted protein domains. Latrotoxins are venom toxins that cause massive neurotransmitter release from vertebrate or invertebrate neurons. We discovered ≥ 20 divergent latrotoxin paralogs expressed in L. hesperus venom glands, significantly increasing this biomedically important family. Mass spectrometry of L. hesperus venom identified 49 proteins from VSTs, 24 of which BLAST to toxins. Phylogenetic analyses showed venom gland specific gene family expansions and shifts in tissue expression.
    Conclusions: Quantitative expression analyses comparing multiple tissues are necessary to identify venom gland specific transcripts. We present a black widow venom specific exome that uncovers a trove of diverse toxins and associated proteins, suggesting a dynamic evolutionary history. This justifies a reevaluation of the functional activities of black widow venom in light of its emerging complexity.

    Phenotypic convergence in genetically distinct lineages of a Rhinolophus species complex (Mammalia, Chiroptera)

    Phenotypes of distantly related species may converge through adaptation to similar habitats and/or because they share biological constraints that limit the phenotypic variants produced. A common theme in bats is the sympatric occurrence of cryptic species that are convergent in morphology but divergent in echolocation frequency, suggesting that echolocation may facilitate niche partitioning, reducing competition. If so, allopatric populations freed from competition could converge in both morphology and echolocation, provided they occupy similar niches or share biological constraints. We investigated the evolutionary history of a widely distributed African horseshoe bat, Rhinolophus darlingi, in the context of phenotypic convergence. We used phylogenetic inference to identify and date lineage divergence, together with phenotypic comparisons and ecological niche modelling to identify morphological and geographical correlates of those lineages. Our results indicate that R. darlingi is paraphyletic, the eastern and western parts of its distribution forming two distinct non-sister lineages that diverged ~9.7 Mya. We retain R. darlingi for the eastern lineage and argue that the western lineage, currently the subspecies R. d. damarensis, should be elevated to full species status. R. damarensis comprises two lineages that diverged ~5 Mya. Our findings concur with patterns of divergence of other co-distributed taxa, which are associated with increased regional aridification between 7 and 5 Mya, suggesting possible vicariant evolution. The morphology and echolocation calls of R. darlingi and R. damarensis are convergent even though the two species occupy different biomes. This suggests that adaptation to similar habitats is not responsible for the convergence. Furthermore, R. darlingi forms part of a clade comprising species that are bigger and echolocate at lower frequencies than R. darlingi, suggesting that biological constraints are unlikely to have influenced the convergence. Instead, the striking similarity in morphology and sensory biology is probably the result of neutral evolutionary processes, resulting in the independent evolution of similar phenotypes.

    06051 Abstracts Collection -- Kolmogorov Complexity and Applications

    From 29.01.06 to 03.02.06, the Dagstuhl Seminar 06051 "Kolmogorov Complexity and Applications" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Scrutinizing human MHC polymorphism: supertype analysis using Poisson-Boltzmann electrostatics and clustering

    Peptide-binding MHC proteins are thought to be the most variable proteins across the human population; the extreme MHC polymorphism observed is functionally important and results from constrained divergent evolution. MHCs have vital functions in immunology and homeostasis: cell surface MHC class I molecules report cell status to CD8+ T cells, NKT cells and NK cells, thus playing key roles in pathogen defence, as well as mediating smell recognition, mate choice, adverse drug reactions, and transplantation rejection. MHC peptide specificity falls into several supertypes exhibiting commonality of binding. It seems likely that other supertypes exist relevant to other functions. Since comprehensive experimental characterization is intractable, structure-based bioinformatics is the only viable solution. We modelled functional MHC proteins by homology and used calculated Poisson-Boltzmann electrostatics projected from the top surface of the MHC as multi-dimensional descriptors, analysing them using state-of-the-art dimensionality reduction techniques and clustering algorithms. We were able to recover the 3 MHC loci as separate clusters and identify clear sub-groups within them, vindicating unequivocally our choice of both data representation and clustering strategy. We expect this approach to make a profound contribution to the study of MHC polymorphism and its functional consequences, and, by extension, other burgeoning structural systems, such as GPCRs.

    Bioinformatic approaches to understanding macroevolution between different lineages within the subphylum Vertebrata

    Doctoral thesis -- Seoul National University Graduate School: College of Natural Sciences, Interdisciplinary Program in Bioinformatics, 2022. 8. Heebal Kim.
    Bioinformatics aims to improve the quality of life of mankind by decoding the molecular mechanisms of biological phenomena based on digitized sequence information from various species. It generally begins with the construction of reference genomes representing each species and moves on to downstream analyses of microevolution within species and macroevolution between species. Although short-read sequencing technologies initiated the genomics era, short-read assemblies had critical problems of low continuity and erroneous gene annotations that caused misinterpretations. Long-read sequencing technologies improved the assembly continuity that is fundamental to chromosome-level scaffolds and corrected false annotations. Following the paradigm shift from short reads to long reads, I performed a series of bioinformatic analyses, from reference genome construction to comparative genomic approaches, to understand the macroevolution of various vertebrate species. Chapter 1 summarizes the general background of this dissertation. First, it describes the paradigm shift in reference genome construction that achieved chromosome-scale scaffolds. Next, it summarizes comparative genomic approaches for specific traits. Chapter 2, as a case of constructing a reference genome, presents a chromosome-level reference genome of the giant-fin mudskipper, a species endemic to the Republic of Korea. Using the four latest genome sequencing technologies (Pacbio CLR, 10X Genomics linked reads, Bionano optical mapping, and Arima Genomics Hi-C) in international cooperation with the Vertebrate Genomes Project, the new assembly achieved about 100-fold greater continuity (Scaffold N50) than the previous genome, with a total of 25 chromosome-level scaffolds. In addition, a total of 24,744 genes were annotated with Pacbio Isoseq transcriptome data. In Chapter 3, as a case of combining a reference genome quality evaluation method with comparative genomic analyses, a method was developed to explore chromosomal evolution between vertebrate species in distant lineages, focusing on the BUSCO genes. In addition, the chapter suggests methods for detecting false loss and duplication errors that cause problems in downstream analyses of reference genomes of various vertebrate lineages, such as mammals, birds, and fishes, and reveals how those kinds of errors occurred. In Chapter 4, as a case of using existing comparative genomic approaches, the molecular mechanisms of terrestrial adaptation and limb emergence were identified by applying a series of analyses of apomorphic evolution in the monophyletic lineage of lobe-finned fishes, which includes coelacanths and humans. In Chapter 5, as a case of developing a new comparative genomic approach, a rule of amino acid convergence was proposed and candidate genes related to vocal learning were discovered through multi-omic analyses of convergent evolution between polyphyletic lineages of vocal-learning birds and control groups. Among the major findings of this study, which spans bioinformatic approaches from reference genome construction to comparative genomics, the telomere sequence distributions on chromosomes and the principles of amino acid convergence could serve as a standard for comparison in lineages beyond vertebrates. In addition, the systematized comparative genomic approaches that identified candidate genes involved in limb development and vocal learning may be used to discover new candidate genes associated with various useful traits of living things around the world.
    Chapter 1. LITERATURE REVIEW: 1.1 Paradigm shift in reference genome constructions; 1.2 Comparative genomics for specific traits
    Chapter 2. CHROMOSOME-LEVEL GENOME ASSEMBLY OF PERIOPHTHALMUS MAGNUSPINNATUS: AN INDIGENOUS MUDSKIPPER IN THE YELLOW SEA: 2.1 Abstract; 2.2 Introduction; 2.3 Materials and Methods; 2.4 Results and Discussion
    Chapter 3. COMPARATIVE GENOMIC APPROACHES TO DETECT ERRONEOUS GENES IN REFERENCE GENOMES AND TO VISUALIZE CHROMOSOME EVOLUTION ACROSS VERTEBRATES: 3.1 Abstract; 3.2 Introduction; 3.3 Materials and Methods; 3.4 Results and Discussion
    Chapter 4. COELACANTH-SPECIFIC ADAPTIVE GENES GIVE INSIGHTS INTO PRIMITIVE EVOLUTION FOR WATER-TO-LAND TRANSITION OF TETRAPODS: 4.1 Abstract; 4.2 Introduction; 4.3 Materials and Methods; 4.4 Results; 4.5 Discussion
    Chapter 5. AMINO ACID CONVERGENCES BETWEEN INDEPENDENT LINEAGES IN BIRDS GIVE EVOLUTIONARY INSIGHTS INTO AVIAN VOCAL LEARNING: 5.1 Abstract; 5.2 Introduction; 5.3 Materials and Methods; 5.4 Results; 5.5 Discussion
    GENERAL DISCUSSION
    REFERENCES
    Abstract in Korean

    Viral infection reveals hidden sharing of TCR CDR3 sequences between individuals

    The T cell receptor is generated by a process of random and imprecise somatic recombination. The number of possible T cell receptors which this process can produce is enormous, greatly exceeding the number of T cells in an individual. Thus, the likelihood of identical TCRs being observed in multiple individuals (public TCRs) might be expected to be very low. Nevertheless, such public TCRs have often been reported. In this study we explore the extent of TCR publicity in the context of acute resolving lymphocytic choriomeningitis virus (LCMV) infection in mice. We show that the repertoire of effector T cells following LCMV infection contains a population of highly shared TCR sequences. This subset of TCRs has a distribution of naive precursor frequencies, generation probabilities, and physico-chemical CDR3 properties which lie between those of classic public TCRs, which are observed in uninfected repertoires, and the dominant private TCR repertoire. We have named this set of sequences "hidden public" TCRs, since they are only revealed following infection. A similar repertoire of hidden public TCRs can be observed in humans after a first exposure to SARS-CoV-2. The presence of hidden public TCRs which rapidly expand following viral infection may therefore be a general feature of adaptive immunity, identifying an additional level of inter-individual sharing in the TCR repertoire which may form an important component of the effector and memory response.