5,575 research outputs found

    Predicting B Cell Receptor Substitution Profiles Using Public Repertoire Data

    B cells develop high affinity receptors during the course of affinity maturation, a cyclic process of mutation and selection. At the end of affinity maturation, a number of cells sharing the same ancestor (i.e. in the same "clonal family") are released from the germinal center; their amino acid frequency profile reflects the allowed and disallowed substitutions at each position. These clonal-family-specific frequency profiles, called "substitution profiles", are useful for studying the course of affinity maturation as well as for antibody engineering purposes. However, most often only a single sequence is recovered from each clonal family in a sequencing experiment, making it impossible to construct a clonal-family-specific substitution profile. Given the public release of many high-quality large B cell receptor datasets, one may ask whether it is possible to use such data in a prediction model for clonal-family-specific substitution profiles. In this paper, we present the method "Substitution Profiles Using Related Families" (SPURF), a penalized tensor regression framework that integrates information from a rich assemblage of datasets to predict the clonal-family-specific substitution profile for any single input sequence. Using this framework, we show that substitution profiles from similar clonal families can be leveraged together with simulated substitution profiles and germline gene sequence information to improve prediction. We fit this model on a large public dataset and validate the robustness of our approach on an external dataset. Furthermore, we provide a command-line tool in an open-source software package (https://github.com/krdav/SPURF) implementing these ideas and providing easy prediction using our pre-fit models. Comment: 23 pages
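As a rough illustration of what a "substitution profile" is, the sketch below computes the empirical per-position amino acid frequencies for one clonal family. It is not SPURF and not its penalized tensor regression; it only shows the quantity that such a model tries to predict when a single sequence is available. The function name `substitution_profile` and the toy sequences are assumptions made for this example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def substitution_profile(aligned_seqs):
    """Return per-position amino acid frequencies for aligned sequences.

    Hypothetical helper: assumes the sequences of one clonal family are
    already aligned to equal length. Characters outside the 20 standard
    amino acids (such as gaps) are ignored.
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    profile = []
    for pos in range(length):
        counts = Counter(s[pos] for s in aligned_seqs if s[pos] in AMINO_ACIDS)
        total = sum(counts.values())
        # One dict per position: amino acid -> observed frequency.
        profile.append(
            {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}
        )
    return profile


# Toy clonal family (invented sequences, for illustration only).
family = ["CARDY", "CARDY", "CAKDY", "CTRDY"]
profile = substitution_profile(family)
```

With only one recovered sequence per family, every position in this empirical profile collapses to a single amino acid with frequency 1, which is the situation the abstract describes as the motivation for a prediction model.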

    Prediction of Antigen Binding Reactivity by Supervised Learning-Based Analysis of Clonal Enrichment Patterns in Bio-Panning

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Medicine, Department of Biomedical Sciences, 2021.8. Junho Chung. Background: Monoclonal antibodies (mAbs) are produced by B cells and bind specifically to target antigens. Technical advances in molecular and cellular cloning have made it possible to purify recombinant mAbs on a large scale, broadening their use across research areas and their potential for clinical application. As the importance of therapeutic mAbs has grown, they have become a predominant drug class for various diseases over the past decades. During that time, immense technological advances have made the discovery and development of mAb therapeutics more efficient. Owing to advances in high-throughput methodology in genomic sequencing, phenotype screening, and computational data analysis, it is conceivable to generate a panel of antibodies with annotated characteristics without experiments. Thesis objective: This thesis aims to develop next-generation antibody discovery methods that use high-throughput antibody repertoire sequencing and bioinformatics analysis. I developed novel methods for constructing in vitro display antibody libraries and for machine-learning-based antibody discovery. In chapter 3, I describe a new method for generating immunoglobulin (Ig) gene repertoires that minimizes the amplification bias originating from the large number of primers targeting diverse Ig germline genes. A universal primer-based amplification method was used to generate the Ig gene repertoire and was then validated by high-throughput antibody repertoire sequencing with respect to clonal diversity and immune repertoire reproducibility. The results of this work are published in Journal of Immunological Methods (2021), doi: 10.1016/j.jim.2021.113089. In chapter 4, I describe a novel machine-learning-based antibody discovery method.
In the conventional colony screening approach, it is impossible to identify antigen-specific binders that have low clonal abundance or that are masked by non-specific phage particles with antigen reactivity on the p8 coat protein. To overcome these limitations, I applied a supervised learning algorithm to high-throughput sequencing data annotated with binding property and clonal frequency through bio-panning. NGS analysis was performed to generate a large number of antibody sequences annotated with their clonal frequency at each selection round of bio-panning. Using the random forest (RF) algorithm, antigen-reactive binders were predicted and then validated by in vitro screening experiments. The results of this work are published in Experimental & Molecular Medicine (2017), doi: 10.1038/emm.2017.22, and Biomolecules (2020), doi: 10.3390/biom10030421. Conclusion: Combining conventional antibody discovery techniques with high-throughput antibody repertoire sequencing improved several aspects of the previous methodology. Multi-cycle amplification with Ig germline gene-specific primers produced a high level of repertoire distortion, which could be reduced by using the universal primer-based amplification method. The RF model generated a large number of antigen-reactive antibody sequences with various clonal enrichment patterns. This result offers new insight into the clonal enrichment process: the frequency of antigen-specific binders does not increase gradually but depends on multiple selection rounds. The supervised learning-based method also provides more diverse antigen-specific clonotypes than conventional antibody discovery methods. Abstract in Korean (translated): Background: A monoclonal antibody (mAb) is a polypeptide complex that is produced by B cells and binds specifically to a target antigen.
Advances in molecular and cellular cloning technology have made it possible to produce recombinant monoclonal antibodies at large scale, and on this basis their use in various research and clinical fields is expanding. In addition, rapid progress has been made in technologies for efficiently discovering and developing therapeutic antibodies. Through advances in high-throughput methodology in gene sequencing, phenotype screening, and computational analysis, and through their application, it has become possible to produce panels of antigen-reactive antibodies by non-experimental means. Objective: This doctoral thesis aims to develop novel next-generation antibody discovery methods using high-throughput antibody repertoire sequencing and bioinformatics techniques. In this work, a new protocol for constructing in vitro display antibody libraries and a machine-learning-based antibody discovery method were developed. Chapter 3 describes a methodology for minimizing the amplification bias that arises from the use of many germline immunoglobulin gene-specific primers during amplification of the antibody repertoire. Multi-cycle amplification with universal primers was used, and the new method was validated by measuring clonal diversity and immune repertoire reproducibility with bioinformatic techniques on high-throughput antibody repertoire sequencing data. The results of this work were published in Journal of Immunological Methods (2021), doi: 10.1016/j.jim.2021.113089.
Chapter 4 describes the development of a machine-learning-based antibody discovery method. With conventional colony screening, it is not possible to discover clones of low clonal abundance or to eliminate the non-specific antigen reactivity of the p8 coat protein that arises while selective pressure is applied. To overcome these limitations, a supervised learning algorithm was applied to high-throughput antibody sequence data for which antigen binding and clonal frequency during bio-panning had been measured. Antigen-specific antibody clones were predicted with the random forest (RF) algorithm, and their antigen specificity was verified by in vitro screening. The results of this work were published in: 1) Experimental & Molecular Medicine (2017), doi: 10.1038/emm.2017.22; 2) Biomolecules (2020), doi: 10.3390/biom10030421. Conclusion: By combining conventional antibody discovery techniques with high-throughput antibody repertoire sequencing, various limitations of the existing methodology could be improved. Multi-cycle amplification with immunoglobulin germline gene-specific primers distorted clonal frequency and diversity, but amplification with universal primers was observed to reduce this repertoire distortion efficiently. The RF model yielded antigen-reactive antibody sequences with diverse clonal enrichment patterns. This showed that antigen-specific clones are not enriched stepwise but depend on multiple early and late selection rounds, offering a new interpretation of clonal enrichment in bio-panning.
In addition, the clones discovered by supervised learning showed higher sequence diversity than those from conventional colony screening. Contents: 1. Introduction (1.1. Antibody and immunoglobulin repertoire; 1.2. Antibody therapeutics; 1.3. Methodology: antibody discovery and engineering); 2. Thesis objective; 3. Establishment of minimally biased phage display library construction method for antibody discovery (3.1. Abstract; 3.2. Introduction; 3.3. Results; 3.4. Discussion; 3.5. Methods); 4. In silico identification of target specific antibodies by high-throughput antibody repertoire sequencing and machine learning (4.1. Abstract; 4.2. Introduction; 4.3. Results; 4.4. Discussion; 4.5. Methods); 5. Future perspectives; 6. References; 7. Abstract in Korean.

    Genome analysis of a highly virulent serotype 1 strain of Streptococcus pneumoniae from West Africa

    Streptococcus pneumoniae is a leading cause of pneumonia, meningitis, and bacteremia, estimated to cause 2 million deaths annually. The majority of pneumococcal mortality occurs in developing countries, with serotype 1 a leading cause in these areas. To begin to better understand the larger impact that serotype 1 strains have in developing countries, we characterized virulence and genetic content of PNI0373, a serotype 1 strain from a diseased patient in The Gambia. PNI0373 and another African serotype 1 strain showed high virulence in a mouse intraperitoneal challenge model, with 20% survival at a dose of 1 cfu. The PNI0373 genome sequence was similar in structure to other pneumococci, with the exception of a 100 kb inversion. PNI0373 showed only 15 lineage-specific CDS when compared to the pan-genome of pneumococcus. However, analysis of non-core orthologs of pneumococcal genomes showed serotype 1 strains to be closely related. Three regions were found to be serotype 1 associated and likely products of horizontal gene transfer. A detailed inventory of known virulence factors showed that some functions associated with colonization were absent, consistent with the observation that carriage of this highly virulent serotype is unusual. The African serotype 1 strains thus appear to be closely related to each other and different from other pneumococci despite similar genetic content.

    Visualization and Correction of Automated Segmentation, Tracking and Lineaging from 5-D Stem Cell Image Sequences

    Results: We present an application that enables the quantitative analysis of multichannel 5-D (x, y, z, t, channel) and large montage confocal fluorescence microscopy images. The image sequences show stem cells together with blood vessels, enabling quantification of the dynamic behaviors of stem cells in relation to their vascular niche, with applications in developmental and cancer biology. Our application automatically segments, tracks, and lineages the image sequence data and then allows the user to view and edit the results of automated algorithms in a stereoscopic 3-D window while simultaneously viewing the stem cell lineage tree in a 2-D window. Using the GPU to store and render the image sequence data enables a hybrid computational approach. An inference-based approach utilizing user-provided edits to automatically correct related mistakes executes interactively on the system CPU while the GPU handles 3-D visualization tasks. Conclusions: By exploiting commodity computer gaming hardware, we have developed an application that can be run in the laboratory to facilitate rapid iteration through biological experiments. There is a pressing need for visualization and analysis tools for 5-D live cell image data. We combine accurate unsupervised processes with an intuitive visualization of the results. Our validation interface allows for each data set to be corrected to 100% accuracy, ensuring that downstream data analysis is accurate and verifiable. Our tool is the first to combine all of these aspects, leveraging the synergies obtained by utilizing validation information from stereo visualization to improve the low level image processing tasks. Comment: BioVis 2014 conference

    Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems

    A huge amount of genetic information is available thanks to the recent advances in sequencing technologies and the larger computational capabilities, but the interpretation of such genetic data at phenotypic level remains elusive. One of the reasons is that proteins are not acting alone, but are specifically interacting with other proteins and biomolecules, forming intricate interaction networks that are essential for the majority of cell processes and pathological conditions. Thus, characterizing such interaction networks is an important step in understanding how information flows from gene to phenotype. Indeed, structural characterization of protein–protein interactions at atomic resolution has many applications in biomedicine, from diagnosis and vaccine design, to drug discovery. However, despite the advances of experimental structural determination, the number of interactions for which there is available structural data is still very small. In this context, a complementary approach is computational modeling of protein interactions by docking, which is usually composed of two major phases: (i) sampling of the possible binding modes between the interacting molecules and (ii) scoring for the identification of the correct orientations. In addition, prediction of interface and hot-spot residues is very useful in order to guide and interpret mutagenesis experiments, as well as to understand functional and mechanistic aspects of the interaction. Computational docking is already being applied to specific biomedical problems within the context of personalized medicine, for instance, helping to interpret pathological mutations involved in protein–protein interactions, or providing modeled structural data for drug discovery targeting protein–protein interactions. Spanish Ministry of Economy grant number BIO2016-79960-R; D.B.B. is supported by a predoctoral fellowship from CONACyT; M.R. is supported by an FPI fellowship from the Severo Ochoa program.
We are grateful to the Joint BSC-CRG-IRB Programme in Computational Biology. Peer Reviewed. Postprint (author's final draft)

    In silico identification of essential proteins in Corynebacterium pseudotuberculosis based on protein-protein interaction networks

    Background: Corynebacterium pseudotuberculosis (Cp) is a Gram-positive bacterium that is classified into equi and ovis serovars. The serovar ovis is the etiological agent of caseous lymphadenitis, a chronic infection affecting sheep and goats, causing economic losses due to carcass condemnation and decreased production of meat, wool, and milk. Current diagnosis or treatment protocols are not fully effective and, thus, require further research into Cp pathogenesis. Results: Here, we mapped known protein-protein interactions (PPI) from various species to nine Cp strains to reconstruct parts of the potential Cp interactome and to identify potentially essential proteins serving as putative drug targets. On average, we predict 16,669 interactions for each of the nine strains (with 15,495 interactions shared among all strains). An in silico sanity check suggests that the potential networks were not formed by spurious interactions but have a strong biological bias. With the inferred Cp networks we identify 181 essential proteins, among which 41 are non-host homologous. Conclusions: The list of candidate interactions of the Cp strains lays the basis for developing novel hypotheses and designing corresponding wet-lab studies. The non-host homologous essential proteins are attractive targets for therapeutic and diagnostic purposes. They allow for searching of small molecule inhibitors of binding interactions, enabling modern drug discovery. Overall, the predicted Cp PPI networks form a valuable and versatile tool for researchers interested in Corynebacterium pseudotuberculosis.
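The mapping step described in this abstract, transferring known interactions from other species to a target proteome, can be sketched as a simple ortholog lookup. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the protein identifiers, and the ortholog table are invented, and node degree is used only as a stand-in for essentiality, since the abstract does not say how essential proteins were called.

```python
from collections import defaultdict


def map_interactions(known_ppis, orthologs):
    """Transfer known PPIs to a target proteome via an ortholog table.

    known_ppis: iterable of (protein_a, protein_b) pairs from source species.
    orthologs:  dict mapping a source protein to its target-strain orthologs.
    Both inputs are hypothetical structures chosen for this sketch.
    """
    predicted = set()
    for a, b in known_ppis:
        for target_a in orthologs.get(a, ()):
            for target_b in orthologs.get(b, ()):
                if target_a != target_b:
                    # Store undirected edges in a canonical order.
                    predicted.add(tuple(sorted((target_a, target_b))))
    return predicted


def degree(edges):
    """Count interaction partners per protein (assumed essentiality proxy)."""
    deg = defaultdict(int)
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


# Invented identifiers, for illustration only.
known_ppis = [("ecoA", "ecoB"), ("ecoB", "ecoC"), ("mtbX", "mtbY")]
orthologs = {"ecoA": ["cp1"], "ecoB": ["cp2"], "ecoC": ["cp3"], "mtbX": ["cp2"]}

edges = map_interactions(known_ppis, orthologs)
deg = degree(edges)
```

An interaction is transferred only when both partners have an ortholog in the target strain, so the `mtbX`/`mtbY` pair contributes nothing here. Running the same mapping per strain and intersecting the edge sets would give the kind of "shared among all strains" count the abstract reports.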