1,037 research outputs found

    Deep Sequencing Data Analysis: Challenges and Solutions

    Identification of β-catenin binding regions in colon cancer cells using ChIP-Seq

    Deregulation of the Wnt/β-catenin signaling pathway is a hallmark of colon cancer. Mutations in the adenomatous polyposis coli (APC) gene occur in the vast majority of colorectal cancers and are an initiating event in cellular transformation. Cells harboring mutant APC contain elevated levels of the β-catenin transcription coactivator in the nucleus which leads to abnormal expression of genes controlled by β-catenin/T-cell factor 4 (TCF4) complexes. Here, we use chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-Seq) to identify β-catenin binding regions in HCT116 human colon cancer cells. We localized 2168 β-catenin enriched regions using a concordance approach for integrating the output from multiple peak alignment algorithms. Motif discovery algorithms found a core TCF4 motif (T/A–T/A–C–A–A–A–G), an extended TCF4 motif (A/T/G–C/G–T/A–T/A–C–A–A–A–G) and an AP-1 motif (T–G–A–C/T–T–C–A) to be significantly represented in β-catenin enriched regions. Furthermore, 417 regions contained both TCF4 and AP-1 motifs. Genes associated with TCF4 and AP-1 motifs bound β-catenin, TCF4 and c-Jun in vivo and were activated by Wnt signaling and serum growth factors. Our work provides evidence that Wnt/β-catenin and mitogen signaling pathways intersect directly to regulate a defined set of target genes
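    As a rough illustration of how the consensus motifs reported in this abstract could be located in a sequence, the sketch below writes each motif as a regular expression and scans both strands. It is only a minimal example: the abstract does not name the motif discovery algorithms that were used, and the names `MOTIFS`, `find_motifs` and `has_both_motifs` are hypothetical.

```python
import re

# Consensus motifs from the abstract, written as regular expressions.
# Alternatives such as T/A become character classes such as [TA].
MOTIFS = {
    "TCF4_core": "[TA][TA]CAAAG",
    "TCF4_extended": "[ATG][CG][TA][TA]CAAAG",
    "AP1": "TGA[CT]TCA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motifs(seq):
    """Map each motif name to a sorted list of (start, strand) hits.

    Starts are 0-based positions on the forward strand, and strand is
    "+" or "-". Overlapping matches are reported.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead lets the scan report overlapping occurrences.
        regex = re.compile(f"(?=({pattern}))")
        found = [(m.start(), "+") for m in regex.finditer(seq)]
        for m in regex.finditer(rc):
            # Convert the reverse-strand offset to a forward-strand start.
            start = len(seq) - m.start() - len(m.group(1))
            found.append((start, "-"))
        hits[name] = sorted(found)
    return hits


def has_both_motifs(seq):
    """True if the sequence has at least one core TCF4 and one AP-1 hit."""
    hits = find_motifs(seq)
    return bool(hits["TCF4_core"]) and bool(hits["AP1"])
```

    Applied to the sequence of each enriched region, `has_both_motifs` would flag regions of the kind the abstract counts as containing both TCF4 and AP-1 motifs, although an exact-match regular expression is a cruder test than the statistical motif models normally used.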

    Genome-wide CRISPR screens for the interrogation of genome integrity maintenance networks

    The genetic material (DNA) of an organism contains the information necessary for survival, growth and reproduction. Loss of this information strongly impacts the health of the organism and is one of the most common factors in aging and cancer. Almost all cells in an organism contain a copy of this genetic material (the genome) and employ several mechanisms to repair damaged sections of the genome and to copy it accurately during cell division. 
We sought to understand the cellular processes by which cells maintain genome stability by systematically inactivating individual genes to uncover their role using pooled CRISPR-Cas9 screening. We employed genome-wide CRISPR screening in human cell lines in combination with specific chemical perturbations to identify gene deletions that enhance or suppress the phenotype of the chemical treatment, thereby shedding light on the effect of the treatment or the role of these enhancer/suppressor genes. We first focused on resveratrol, a small molecule first discovered in plants that has been suggested to extend lifespan in model organisms while also inhibiting cell proliferation ex vivo. Chemical-genetic screening pinpointed a role of resveratrol in inhibition of DNA replication. When we compared the cellular effects of resveratrol to those of hydroxyurea, a known inducer of replicative stress, we found that both treatments led to slower replication fork progression and activation of signaling in response to replicative stress. Importantly, we showed that the inhibition of DNA replication by resveratrol in human cells is a primary effect on cell proliferation and is independent of the histone deacetylase Sirtuin-1, which has been implicated as the primary target in lifespan extension by resveratrol. We then studied the perturbation of a second cellular process, namely telomere maintenance. These specialized sequences at the termini of chromosomes are critical for the protection of chromosome ends, and their erosion is counteracted by the enzymatic activity of telomerase. We performed a genome-wide CRISPR screen in cells that were concomitantly treated with a specific telomerase inhibitor, BIBR1532. We uncovered a strong genetic interaction between telomerase and a previously unannotated gene, C16orf72, which we named TAPR1. We found that TAPR1-depleted cells had elevated levels of p53, a transcription factor central to the cellular response to telomeric and global DNA damage. 
We propose that TAPR1 negatively regulates p53 protein levels by promoting p53 turnover. Altogether, these studies highlight the power of CRISPR-Cas9 genetic screening to uncover novel insight into the human genome stability maintenance network.

    Epigenetics and genetics of hematopoietic stem cells heterogeneity

    Abstract In diploid eukaryotic organisms, most genes are expressed biallelically. However, there are exceptions where the expression occurs in a monoallelic pattern that results from a differential allele-specific transcription based on the different epigenetic marking of the two alleles. At the level of cells, there are three classes of monoallelic expression regulated by epigenetic mechanisms: parent-of-origin imprinting, X chromosome inactivation (XCI), and random autosomal monoallelic expression (RMAE). Biased repopulations obtained from single-cell transplantation assays revealed that the pool of hematopoietic stem cells (HSCs) is heterogeneous, reflecting the epigenetic differences of individual cells. According to a model in which the allele-specific expression patterns are established during differentiation in embryonic stem cells and are stably propagated through cell divisions, it is assumed that HSCs carry genes (and alleles) with these stable epigenetic marks. Therefore, the analysis of epigenetic states in the stem cell population at the clonal level is necessary to understand its heterogeneity and diversity. Here we evaluated for the first time the persistence of allele-specific epigenetic states in the hematopoietic system in vivo using allelic imbalance as a readout. We created a monoclonal hematopoietic system in mice by single HSC transplantation and then analyzed the emerging lymphoid progeny using a genome-wide transcriptomics approach. We revealed that in the single-HSC derived hematopoietic cells, XCI is stably maintained through extensive proliferation and differentiation, whereas the vast majority of autosomal genes lack the stable clonal patterns of random monoallelic expression. This finding shows that the recurrent parallels between XCI and RMAE are misleading, suggesting that different mechanisms underlie these two classes of monoallelic expression. 
Additionally, we show that this in vivo clonal approach, which is free of genetic manipulation, can replace the artificial strategies that have been used to study tissue-specific XCI. Finally, stable allele-specific expression patterns were found in a small number of genes (14 genes, <0.2%) in the progeny of a single HSC, indicating that these patterns were already present in the original HSC used for transplantation. However, the number of genes with stable monoallelic expression in cells that underwent differentiation steps is much lower than the numbers previously reported in studies using clonal cell lines in vitro without extensive differentiation (~2–15%). To reconcile these observations, we propose that most allele-specific expression patterns in autosomal genes are metastable and can be erased and reestablished at different differentiation stages.

    Clonal dynamics in osteosarcoma defined by RGB marking

    Osteosarcoma is a type of bone tumour characterized by considerable levels of phenotypic heterogeneity, aneuploidy, and a high mutational rate. The life expectancy of osteosarcoma patients has not changed during the last three decades and thus much remains to be learned about the disease biology. Here, we employ an RGB-based single-cell tracking system to study the clonal dynamics occurring in a de novo-induced murine osteosarcoma model. We show that osteosarcoma cells present initial polyclonal dynamics, followed by clonal dominance associated with adaptation to the microenvironment. Interestingly, the dominant clones are composed of subclones with a similar tumour generation potential when they are re-implanted in mice. Moreover, individual spontaneous metastases are clonal or oligoclonal, but they have a different cellular origin than the dominant clones present in primary tumours. In summary, we present evidence that osteosarcomagenesis can follow a neutral evolution model, in which different cancer clones coexist and propagate simultaneously. We thank ISCIII and CNIO flow cytometry and cell sorting units for their participation in our studies. We are thankful to the CCEH-Fred Hutchinson Cancer Research Center for LAM-PCR service. We acknowledge Raquel Pérez Tavarez, María Blázquez Mesa, Alicia Giménez Sánchez, Elena Calvo Cazalilla, and Monserrat Arroyo Correas for useful help on the pathology studies; and Teresa Cejalvo, Isabel Cubillo Moreno, and Miguel Angel Rodríguez-Milla for their contributions in experimental setup. We thank the visual artist Isabella Lacquaniti for her help with drawings and schematics. We are also thankful to the Fondo de Investigaciones Sanitarias (FIS: PI11/00377 and PI14CIII/00005 to J.G.-C., FIS: CP11/00206 to A.A., and RTICC: RD12/0036/0027 to J.G.-C.), the Madrid Regional Government (CellCAM; P2010/BMD-2420 to J.G.-C.), the Asociación Pablo Ugarte, and the Asociación Afanion for grant support.

    Convergent losses of decay mechanisms and rapid turnover of symbiosis genes in mycorrhizal mutualists.

    To elucidate the genetic bases of mycorrhizal lifestyle evolution, we sequenced new fungal genomes, including 13 ectomycorrhizal (ECM), orchid (ORM) and ericoid (ERM) species, and five saprotrophs, which we analyzed along with other fungal genomes. Ectomycorrhizal fungi have a reduced complement of genes encoding plant cell wall-degrading enzymes (PCWDEs), as compared to their ancestral wood decayers. Nevertheless, they have retained a unique array of PCWDEs, thus suggesting that they possess diverse abilities to decompose lignocellulose. Similar functional categories of nonorthologous genes are induced in symbiosis. Of induced genes, 7-38% are orphan genes, including genes that encode secreted effector-like proteins. Convergent evolution of the mycorrhizal habit in fungi occurred via the repeated evolution of a 'symbiosis toolkit', with reduced numbers of PCWDEs and lineage-specific suites of mycorrhiza-induced genes

    DNA fragility in the context of neural stem cell fate : a multi-method integrative exploration of genome dynamics

    Recent advances in mapping the complex genetic architecture underlying various debilitating brain disorders have enabled identification of several genetic risk variants. However, these risk variants only explain part of the heritability and vulnerability to these disorders in early development. Moreover, de novo somatic mutations have been detected in subsets of brain cells, which might account for a significant portion of the missing heritability. However, it remains unclear where these mutations come from and at what developmental stage they might occur. Genome fragility is subject to the functional activity and spatial chromatin organization characteristic of a distinct cell identity. Under physiological conditions, cells regulate their chromatin structure and organization to express necessary genes. DNA topoisomerases are key players in all of these processes and in replication. By generating transient breaks in the DNA, topoisomerases resolve topological problems and thereby enable activation of particular sections of the genome. Beyond topoisomerases, the genome is subject to perpetual challenges, with DNA double-strand breaks (DSBs) being among the most deleterious. Each cell is estimated to suffer numerous transient DSBs per day, most of which are repaired. Incorrectly repaired DSBs, however, pose a major threat to genome stability through formation of mutations or potential genomic rearrangements. Although the exact relationship of DNA damage to differentiation is still unclear, a recent investigation into neural specification demonstrated that loss of DNA repair sensors leads to centrosome amplification, thereby resulting in defective mitosis and chromosomal instability. Ensuing excessive stem cell proliferation and replication stress also happen to be a hallmark of neurodevelopmental disorders (NDDs). 
Despite the emerging evidence linking endogenous DSBs to NDDs, there has been a lack of genome-wide maps of DSBs spontaneously arising at different stages of human neurogenesis. This thesis brings together (I) a correlative genomics study describing endogenous DSBs genome-wide during neural differentiation in a cell-type specific manner, and (II) a mechanistic study into the regulatory role of Topoisomerase 1 (TOP1) in transcription and proliferation. In paper I, we mapped the genomic DSB landscape of cells at various stages of neural differentiation and correlated our maps with genomic and epigenomic features. In so doing, we provide clues on how DSB formation and their incorrect repair might contribute to the pathogenesis of NDDs. The current view is that transcription-associated DSBs seem to be the main driver of de novo mutations. Indeed, we found that DSBs preferentially form around the transcription start site (TSS) of transcriptionally active genes, as well as at chromatin loop anchors in proximity of highly transcribed genes. This follows from the accumulation of DNA torsional stress and topoisomerase activity in these regions. Interestingly, hotspots of endogenous DSBs were detected around the TSS of highly transcribed genes involved in general cellular processes and along the gene body of long, neural-specific genes whose human orthologues had been previously implicated in NDDs. Through our integrative multimethod approach we corroborate previous findings regarding DSB-fragile loci at TSSs and loop anchors, and find a unique distribution pattern for this fragility in post-mitotic neurons. We show a cell type-specific preference for DSB accumulation in specific NDD genes and begin to describe the relation of DSB fragility and chromatin conformation. 
In paper II, we investigated the role of Topoisomerase I (TOP1) in relation to transcription in the context of replication stress across mitosis and as subject of interruption of interphase chromatin conformation. In particular, we investigated different stages of the cell cycle for transcription patterns and transcriptional spiking by RNA polymerase II (RNAPII) in human colon carcinoma cells. TOP1 relieves torsional stress in actively transcribed DNA and facilitates the expression of long genes, many of which are important for neural functions. However, TOP1 also plays a direct role in transcriptional control through interaction with the RNAPII Carboxy-Terminal Domain (CTD). We investigated control cells and a knock-in (KI) clone lacking TOP1 exon 4, the phospho-CTD-binding site for RNAPII. We found that in early mitosis TOP1 clears RNAPII during transcriptional elongation. When the TOP1 CTD-binding domain is disrupted, we detected replication stress and delay in mitotic exit. In this case, chromatin becomes topologically stressed, increasing the need for TOP2A cleavage, which results in DSBs. However, we did not detect substantial changes in the DSB markers gamma-H2AX and 53BP1 when comparing WT and KI cells across different stages of the cell cycle. Therefore, we conclude that the observed delay in mitotic exit is most likely due to the deregulation of gene expression, rather than to the activation of DNA repair pathways. Acute depletion of TOP1 through the auxin-degron system resulted in absence of RNAPII spiking at the TSS. Efficient removal of RNAPII from chromosomes by TOP1 in early mitosis is a prerequisite for the timely spike of RNAPII at TSSs in mid mitosis and might also affect cellular memory. Indeed, we found that when mitotic transcription is poorly regulated, individual proliferating cells have a greater variance in transcriptional levels, which could lead to loss of cell identity. 
Concluding from these findings, we demonstrate that endogenous DSBs are distributed differentially in a cell type-specific manner. Through our integrative multi-method approach, we corroborate previous findings regarding DSB-fragile loci and discover a unique distribution pattern for DSBs in post-mitotic neurons. We show a preference for specific NDD genes and begin to describe the relation of DSB fragility and chromatin conformation in a developmental context. We assessed the role of TOP1 in a model for replication stress and found that, outside of its canonical torsional stress function, its direct interaction with RNAPII across the cell cycle is crucial in maintaining transcriptional memory and that disrupting it could feed into loss of cell identity. While not exhaustive, the findings described in these papers begin to elucidate a complex mystery of human NDDs and provide valuable datasets for further investigation of genome fragility. Taken together, these findings contribute to a better understanding of how neural genome dynamics affect high transcriptional or replicative burden during neurodevelopment.

    Microsatellite characterization and marker development from massive sequencing data of the blenny Salaria pavo

    Master's thesis. Biology (Bioinformatics and Computational Biology). Universidade de Lisboa, Faculdade de Ciências, 2011. In the blenny Salaria pavo, the reproductive behaviour of females and males is modulated by the availability of nests in the habitat. The mating system is promiscuous and parental care of the clutches is provided exclusively by the male. In rocky-substrate populations, where nests are abundant, males nest in rock cavities and actively court females, while females take a more passive role. The Ria Formosa population, by contrast, departs considerably from the pattern observed on rocky shores. In this coastal lagoon, nesting substrates are scarce and the only nests available are found in bricks used to delimit bivalve culture areas. Nest scarcity leads to sex-role reversal in this population, with females intensely courting males and competing with each other for access to nests, and to the emergence of alternative reproductive tactics, with the appearance of parasitic males. These males are smaller than nesting males and mimic both the morphology and the courtship behaviour of females in order to approach the nest and fertilize clutches during spawning episodes. These alternative reproductive tactics are sequential: males that adopt a parasitic tactic in one breeding season usually acquire a nest in the following season. To understand the evolution and maintenance of this type of system, it is necessary to estimate the number of eggs fertilized by parasitic males. Currently the most effective way to perform paternity tests is with genetic markers, which may or may not be complemented by field observations. 
Among the various markers available, the most widely used for this type of study is the microsatellite, owing to the high reproducibility of the results obtained. Microsatellites are motifs of 1 to 6 nucleotides repeated in tandem, highly polymorphic and easily amplified by PCR (polymerase chain reaction). Until recently, isolating new microsatellites for a species depended essentially on building genomic libraries enriched for particular microsatellites or on reusing microsatellites from closely related species, and both processes have high failure rates. With the rapid development of massive sequencing technologies and their availability to the scientific community, a new way to detect and isolate microsatellites emerged: in silico searching. Among the massive sequencing platforms, the most suitable for species that do not yet have reference sequences (a genome) is pyrosequencing, because it sequences fragments long enough (≈400 bp) to allow de novo assembly. The main objective of this work was to isolate the first genetic markers developed for this species and to determine their polymorphism. To this end, the transcriptome of this blenny, which has already been sequenced by pyrosequencing but is not yet available in public databases, was used together with a bioinformatic microsatellite search tool. About 640,000 sequences were aligned to obtain the 62,038 consensus sequences (unigenes with an average size of 452 bp) that were used for functional annotation of the transcriptome and for the microsatellite search. 
Functional annotation was carried out with a sequence alignment and similarity search algorithm, BLAST (Basic Local Alignment Search Tool), specifically BLASTX, since nucleotide sequences were used for similarity searches against the NCBI non-redundant protein sequence database. Using an e-value < 10^-5, only 31% of the unigenes were functionally annotated and 21% had Gene Ontology terms assigned. Although the percentage of annotated unigenes was low, the annotation is biologically meaningful, since 80% of the annotations obtained came from eighteen fish species. The in silico microsatellite search identified 4,190 microsatellites in 3,670 unigenes, using as search parameters perfect microsatellites with a minimum of 6 repeats for all microsatellite types (di-, tri-, tetra-, penta- and hexanucleotides). As in other fish species, dinucleotides are the most frequent type of microsatellite (79%), followed by trinucleotides (19%). The remaining types occur at lower frequency, corresponding to only 6.5% of all microsatellites. Whatever process is used to obtain microsatellites, it will always be necessary to test them and apply them to a set of DNA samples in order to assess their polymorphism. Since the aim of this work was to isolate polymorphic microsatellites, two microsatellite selection strategies were applied. The first strategy was based on a priori knowledge of the degree of polymorphism of the microsatellites in the population. To this end, the individual sequences that make up the unigene in the microsatellite region were manually curated in silico to assess their degree of polymorphism. 
In total, 737 microsatellites proved to be polymorphic, of which 727 were dinucleotides and 6 trinucleotides, with a maximum of 4 alleles observed in silico. Besides the degree of polymorphism, this strategy also considered whether the microsatellite was in an annotated unigene with GO terms assigned and whether a primer pair could be designed to amplify it. After filtering the microsatellites by these parameters, a list of 97 dinucleotides was obtained, of which 33 were selected for application (mean of 8.5 repeats). The second selection strategy considered only the repeat length of the microsatellite. Previous studies indicate that microsatellites with more repeats tend to be more polymorphic, and on this basis 29 microsatellites containing on average 12 repeats were selected for application. In addition to the 63 selected microsatellites, 3 other microsatellites isolated in Lipophrys pholis, which belongs to the same subfamily as the blenny Salaria pavo, were also tested on a Salaria pavo DNA sample to test their cross-species amplification. To test the feasibility of applying these microsatellites in population genetics studies, forty-one microsatellites whose primers amplified a single fragment of the expected size had their forward primer fluorescently labelled for genotyping 20 individuals from the population of Culatra Island (Portugal) and 6 individuals from the islands of Formentera (Spain) and Borovac (Croatia). After analysis of the results, 28 microsatellites isolated in Salaria pavo and 1 microsatellite isolated in Lipophrys pholis were validated for future studies. In the Culatra population, all microsatellites except 5 proved to be polymorphic. 
The number of alleles ranged from 2 to 12, and observed and expected heterozygosity ranged from 0.05 to 0.85 and from 0.05 to 0.79, respectively. The mean number of alleles and the expected heterozygosity were higher in microsatellites selected with the second strategy (6.5 and 0.62) than in those selected with the first strategy (3.54 and 0.40). Two microsatellites were in Hardy-Weinberg disequilibrium and 2 pairs of microsatellites were in linkage disequilibrium. All microsatellites amplified in the DNA samples from the Formentera and Borovac populations. Given these results, the second strategy proved more efficient for selecting microsatellites with a higher polymorphism rate. Although the monomorphic microsatellites found in the Culatra population were isolated with the first strategy, more individuals will need to be genotyped to confirm the degree of polymorphism observed in silico.
Next-generation sequencing is providing researchers with a relatively fast and affordable option for developing microsatellite loci for non-model organisms. The number of studies using this approach is growing fast, and a new focus has been given to the development of microsatellites from cDNA due to their potential in targeting candidate genes (type I markers). When microsatellite polymorphism is of interest, developing microsatellites can become time-consuming because numerous primer pairs must be tested for polymorphism by polymerase chain reaction (PCR) in the focal species. Assemblies have a potential not yet fully explored for microsatellite mining and evaluation, which can help improve the polymorphism rates obtained. Their high sequence coverage makes it possible to assess microsatellite polymorphism in silico, provided the sequenced DNA library was obtained from a pool of DNA from various individuals of the focal species. 
Therefore, in this study the transcriptome assembly obtained with pyrosequencing for the blenny Salaria pavo was mined for microsatellites and their polymorphism was manually evaluated in silico. Two strategies emerged for microsatellite selection and application in a sample of 26 individuals from the islands of Culatra, Formentera and Borovac. Microsatellites were selected based on their in silico polymorphism and annotation results (first strategy) or based only on their repeat length (second strategy). From a set of 63 microsatellite loci isolated in Salaria pavo sequences, 28 were validated, plus one microsatellite from Lipophrys pholis. All microsatellites except 5 proved to be polymorphic in the 20 individuals genotyped from Culatra Island, the focal population of the study. With the results obtained in this work, the second strategy proved more efficient in yielding polymorphic microsatellites than the first strategy (average number of alleles was 6.5 and 3.54, respectively). Nevertheless, merging these two strategies in future studies may help improve the polymorphism results and at the same time develop type I markers.
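    The in silico search described in this record (perfect microsatellites, motifs of two to six nucleotides, at least six tandem repeats) can be approximated with a short regular-expression scan. The sketch below is only an illustration under those stated parameters: the thesis does not name its search tool here, and `MIN_REPEATS`, `find_microsatellites` and `count_by_motif_length` are hypothetical names.

```python
import re
from collections import Counter

# Assumed search parameters, taken from the abstract: perfect repeats,
# motif length 2-6 nt, at least 6 tandem copies.
MIN_REPEATS = 6

# The lazy quantifier tries the shortest motif first, so (AC)n is
# reported as a dinucleotide rather than as a tetranucleotide (ACAC)n.
_SSR = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (MIN_REPEATS - 1))


def find_microsatellites(seq):
    """Return a list of (start, motif, repeats) for perfect microsatellites.

    start is the 0-based position of the first repeat unit.
    Homopolymer runs (for example AAAAAA...) are skipped, because
    mononucleotide repeats are outside the di- to hexanucleotide search.
    """
    results = []
    for match in _SSR.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer picked up as a pseudo-dinucleotide
        repeats = len(match.group(0)) // len(motif)
        results.append((match.start(), motif, repeats))
    return results


def count_by_motif_length(sequences):
    """Count microsatellites by motif length across many sequences."""
    counts = Counter()
    for seq in sequences:
        for _, motif, _ in find_microsatellites(seq):
            counts[len(motif)] += 1
    return counts
```

    Running `count_by_motif_length` over a set of unigene sequences would give the kind of breakdown by repeat type (di-, tri-, and so on) that the record reports, and the repeat count returned for each locus is the quantity the second selection strategy ranks on.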