Inferring bona fide transfrags in RNA-Seq derived-transcriptome assemblies of non-model organisms
Background: De novo transcriptome assembly of short transcribed fragments (transfrags) produced from sequencing-by-synthesis technologies often results in redundant datasets with differing levels of unassembled, partially assembled or mis-assembled transcripts. Post-assembly processing intended to reduce redundancy typically involves reassembly or clustering of assembled sequences. However, these approaches are mostly based on common word heuristics and often create clusters of biologically unrelated sequences, resulting in loss of unique
transfrag annotations and propagation of mis-assemblies.
Results: Here, we propose a structured framework, organized as a short pipeline, for Inferring Functionally Relevant Assembly-derived Transcripts (IFRAT). IFRAT combines 1) removal of identical subsequences,
2) error-tolerant CDS prediction, 3) identification of coding potential, and 4) BLAST annotation complemented with a multiple domain architecture annotation that reduces non-specific domain annotation. We demonstrate that, independent of the assembler, IFRAT selects bona fide transfrags (with CDS and coding potential) from the transcriptome assembly of a model organism without relying on post-assembly clustering or reassembly. The robustness of IFRAT is inferred on RNA-Seq data of Neurospora crassa assembled using de Bruijn graph-based assemblers, in single (Trinity and Oases-25) and multiple (Oases-Merge and additive or pooled) k-mer modes. Single k-mer assemblies contained fewer transfrags compared to the multiple k-mer assemblies. However, Trinity identified a comparable number of
predicted coding sequence and gene loci to Oases pooled assembly. IFRAT selects bona fide transfrags representing over 94% of cumulative BLAST-derived functional annotations of the unfiltered assemblies. Between 4-6% are lost when orphan transfrags are excluded and this represents only a tiny fraction of annotation derived from functional transference by sequence similarity. The median length of bona fide transfrags ranged from 1.5kb (Trinity) to 2kb (Oases), which is consistent with the average coding sequence length in fungi. The fraction of transfrags that could be associated with gene ontology terms ranged from 33-50%, which is also high for domain based annotation. We showed that unselected transfrags were mostly truncated and represent sequences from intronic, untranslated
(5′ and 3′) regions and non-coding gene loci.
Conclusions: IFRAT simplifies post-assembly processing, providing a reference transcriptome enriched with functionally relevant assembly-derived transcripts for non-model organisms.
Department of Science and Technology
National Research Foundation
South African Research Chair initiative
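The first IFRAT step listed above (removal of identical subsequences) can be illustrated with a minimal Python sketch. It assumes that an "identical subsequence" is a transfrag occurring verbatim inside a longer transfrag. The function name `remove_contained` and the toy sequences are hypothetical, reverse complements are ignored, and this is not IFRAT's actual implementation.

```python
def remove_contained(seqs):
    """Drop exact duplicates and sequences fully contained in a longer one.

    Assumption: containment is tested as a plain substring match on the
    forward strand only. Runs in quadratic time, so it suits small inputs.
    """
    kept = []
    # Visit longest sequences first so any container is already in `kept`.
    for seq in sorted(set(seqs), key=lambda s: (-len(s), s)):
        if not any(seq in longer for longer in kept):
            kept.append(seq)
    return kept


# Toy transfrags: one duplicate and one fragment contained in a longer one.
transfrags = ["ATGGCCTTA", "GGCCT", "ATGGCCTTA", "TTTTAC"]
print(remove_contained(transfrags))
```

On the toy input, the duplicate and the contained fragment `GGCCT` are dropped, leaving two sequences.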
Chromosomal-level assembly of the Asian seabass genome using long sequence reads and multi-layered scaffolding
We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a
tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical
and genetic mapping along with shared synteny from closely related fish species to
derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50
size over 25 Mb that span ~90% of the genome. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various
regions across the species’ native range. SNP analyses identified high levels of genetic
diversity and confirmed earlier indications of a population stratification comprising three
clades with signs of admixture apparent in the South-East Asian population. The quality of
the Asian seabass genome assembly far exceeds that of any other fish species, and will
serve as a new standard for fish genomics.
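The contig and scaffold N50 values quoted above follow the usual definition: the length L such that sequences of length L or longer together cover at least half of the assembly. A minimal Python sketch of that calculation follows. The function name `n50` and the toy lengths are illustrative assumptions, not data from the seabass assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; N50 is the length of
    the sequence at which the running total first reaches half the sum.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example: total length 54, so half is 27; 10 + 9 + 8 reaches 27.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints 8
```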
A more reliable PCR for detection of Mycobacterium tuberculosis in clinical samples
Diagnostic techniques based on PCR have two major problems: false-positive reactions due to contamination with DNA fragments from previous PCRs (amplicons) and false-negative reactions caused by inhibitors that interfere with the PCR. We have improved our previously reported PCR based on the amplification of a fragment of the Mycobacterium tuberculosis complex-specific insertion element IS6110 with respect to both problems. False-positive reactions caused by amplicon contamination were prevented by the use of uracil-N-glycosylase and dUTP instead of dTTP. We selected a new set of primers outside the region spanned by the formerly used primers to avoid false-positive reactions caused by dTTP-containing amplicons still present in the laboratory. With this new primer set, 16 copies of the IS6110 insertion element, the equivalent of two bacteria, could be amplified 10^10 times in 40 cycles, resulting in a mean efficiency of 77% per cycle. To detect the presence of inhibitors of the Taq polymerase, which may cause false-negative reactions, part of each sample was spiked with M. tuberculosis DNA. The DNA purification method using guanidinium thiocyanate and diatoms effectively removed most or all inhibitors of the PCR. However, this was not suitable for blood samples, for which we developed a proteinase K treatment followed by phenol-chloroform extraction. This method permitted detection of 20 M. tuberculosis bacteria per ml of whole blood. Various laboratory procedures were introduced to reduce failure or inhibition of PCR and avoid DNA cross contamination. We have tested 218 different clinical specimens obtained from patients suspected of having tuberculosis. The samples included sputum (n=145), tissue biopsy samples (n=25), cerebrospinal fluid (n=15), blood (n=14), pleural fluid (n=9), feces (n=7), fluid from fistulae (n=2), and pus from a wound (n=1). The results obtained by PCR were consistent with those obtained with culture, which is the "gold standard."
We demonstrate that PCR is a useful technique for the rapid diagnosis of tuberculosis at various sites.
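The per-cycle efficiency quoted above can be checked with a short calculation. The sketch below assumes the standard amplification model, fold = (1 + E)^n for n cycles with mean efficiency E. The function name is a hypothetical helper, not something taken from the paper.

```python
def mean_cycle_efficiency(fold_amplification, cycles):
    """Solve fold = (1 + E) ** cycles for the mean per-cycle efficiency E."""
    return fold_amplification ** (1.0 / cycles) - 1.0


# A 10^10-fold amplification in 40 cycles, as reported in the abstract.
efficiency = mean_cycle_efficiency(1e10, 40)
print(f"{efficiency:.1%}")
```

Under this model the result is close to 78%, in line with the reported mean efficiency of 77% per cycle.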
Effect of multimorbidity on utilisation and out-of-pocket expenditure in Indonesia: quantile regression analysis
Background Multimorbidity (the presence of two or more non-communicable diseases) is a major growing challenge for many low-income and middle-income countries (LMICs). Yet, its effects on health care costs and financial burden for patients have not been adequately studied. This study investigates the effect of multimorbidity across the different percentiles of healthcare utilisation and out-of-pocket expenditure (OOPE). Methods We conducted a secondary data analysis of the 2014/2015 Indonesian Family Life Survey (IFLS-5), which included 13,798 respondents aged ≥40 years. Poisson regression was used to assess the association between sociodemographic characteristics and the total number of non-communicable diseases (NCDs), while multivariate logistic regression and quantile regression analysis were used to estimate the associations between multimorbidity, health service use and OOPE. Results Overall, 20.8% of total participants had two or more NCDs in 2014/2015. The number of NCDs was associated with higher healthcare utilisation (coefficient 0.11, 95% CI 0.07–0.14 for outpatient care and coefficient 0.09, 95% CI 0.02–0.16 for inpatient care) and higher four-weekly OOPE (coefficient 27.0, 95% CI 11.4–42.7). The quantile regression results indicated that the marginal effect of having three or more NCDs on the absolute amount of four-weekly OOPE was smaller for the lower percentiles (at the 25th percentile, coefficient 1.0, 95% CI 0.5–1.5) but more pronounced for the higher percentile of out-of-pocket spending distribution (at the 90th percentile, coefficient 31.0, 95% CI 15.9–46.2). Conclusion Multimorbidity is positively correlated with health service utilisation and OOPE and has a significant effect, especially among those in the upper tail of the utilisation/costs distribution. Health financing strategies are urgently required to meet the needs of patients with multimorbidity, particularly for vulnerable groups that have a higher level of health care utilisation.
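Quantile regression, as used above, fits each percentile by minimising an asymmetric "pinball" loss rather than squared error. The following minimal Python sketch shows the idea for the simplest case, a constant-only model. The spending values are invented toy numbers, not IFLS-5 data, and the function names are hypothetical.

```python
def pinball_loss(residual, q):
    """Asymmetric loss: under-predictions weighted by q, over-predictions by 1 - q."""
    return q * residual if residual >= 0 else (q - 1) * residual


def best_constant(values, q):
    """Return the observed value that minimises total pinball loss at quantile q.

    For a constant-only model a minimiser always lies on an observed value,
    so searching the data points is sufficient.
    """
    return min(sorted(values),
               key=lambda c: sum(pinball_loss(v - c, q) for v in values))


# Toy four-weekly spending with one very high spender.
spending = [1, 2, 3, 4, 100]
for q in (0.25, 0.5, 0.9):
    print(q, best_constant(spending, q))
```

On this toy data the 25th percentile and median fits stay among the low spenders, while the 90th percentile fit is pulled to the high spender. This is why effects can differ sharply between the lower and upper tails of a skewed cost distribution.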
The long and winding road leading to the successful introgression of downy mildew resistance into onion
Downy mildew resistance originating from Allium roylei Stearn provides a complete resistance to onions and is based on one dominant gene. Since A. roylei can successfully be hybridized with onion (A. cepa L.), a breeding scheme aimed at the introgression of this gene was initiated ca. 20 years ago. Several setbacks in this programme were encountered: first, the identified molecular marker linked to the downy mildew resistance locus became increasingly difficult to use and finally lost its discriminating power; second, the final step, making homozygous introgression lines (ILs), turned out to be more difficult than was hoped. GISH analysis showed that the chromosomal region harbouring the resistance locus was the only remaining piece of A. roylei in the nuclear background of onion and it also confirmed that this region was located on the distal end of chromosome 3. It was hypothesized that some factor present in the remaining A. roylei region was lethal when homozygously present in an onion genetic background. The identification of an individual with a smaller and more distally located introgression fragment and homozygous ILs in its progeny validated this hypothesis. With the help of these nearly isogenic lines, four AFLP® markers closely linked to the resistance gene were identified, which can be used for marker-aided selection. The introduction into onion of resistance to downy mildew, caused by Peronospora destructor, is a significant step forward in the development of environmentally-friendly onion cultivars.
Diversity in fertility potential and organo-sulphur compounds among garlics from Central Asia
Extending the collection of garlic (Allium sativum L.) accessions is an important means of broadening the genetic variability of this cultivated plant, with regard to yield, quality, and tolerance to biotic and abiotic traits; it is also an important means for restoring fertility and flowering. In the framework of the EU project Garlic and Health, 120 garlic accessions were collected in Central Asia, the main centre of garlic diversity. Plants were documented and thereafter maintained in field collections in both Israel and The Netherlands. The collection was evaluated for biological and economic traits. Garlic clones vary in most vegetative characteristics (leaf number, bulb size and structure), as well as in floral scape elongation and inflorescence development. A clear distinction was made between incomplete bolting and bolting populations; most of the accessions in the latter populations produced flowers with fertile pollen and receptive stigma. Wide variations were recorded with regard to differentiation of topsets, their size, number and rapidity of development. Furthermore, significant variation in organo-sulphur compounds (alliin, isoalliin, allicin and related dipeptides) was found within garlic collections and between plants grown under differing environmental conditions. Genetic fingerprinting by means of AFLP markers revealed three distinct groups within this collection, differing also in flowering ability and organo-S content.
Unrelated Helpers in a Primitively Eusocial Wasp: Is Helping Tailored Towards Direct Fitness?
The paper wasp Polistes dominulus is unique among the social insects in that nearly one-third of co-foundresses are completely unrelated to the dominant individual whose offspring they help to rear and yet reproductive skew is high. These unrelated subordinates stand to gain direct fitness through nest inheritance, raising the question of whether their behaviour is adaptively tailored towards maximizing inheritance prospects. Unusually, in this species, a wealth of theory and empirical data allows us to predict how unrelated subordinates should behave. Based on these predictions, here we compare helping in subordinates that are unrelated or related to the dominant wasp across an extensive range of field-based behavioural contexts. We find no differences in foraging effort, defense behaviour, aggression or inheritance rank between unrelated helpers and their related counterparts. Our study provides no evidence, across a number of behavioural scenarios, that the behaviour of unrelated subordinates is adaptively modified to promote direct fitness interests.
Mapping and characterization of novel parthenocarpy QTLs in tomato
Parthenocarpy is the development of the fruit in the absence of pollination and/or fertilization. In tomato, parthenocarpy is considered an attractive trait to solve the problems of fruit setting under unfavorable conditions. We studied the genetics of parthenocarpy in two different lines, IL5-1 and IVT-line 1, both carrying Solanum habrochaites chromosome segments. Parthenocarpy in IL5-1 is under the control of two QTLs, one on chromosome 4 (pat4.1) and one on chromosome 5 (pat5.1). IVT-line 1 also contains two parthenocarpy QTLs, one on chromosome 4 (pat4.2) and one on chromosome 9 (pat9.1). In addition, we identified one stigma exsertion locus in IL5-1, located on the long arm of chromosome 5 (se5.1). It is likely that pat4.1 from IL5-1 and pat4.2 from IVT-line 1, both located near the centromere of chromosome 4, are allelic. By making use of the microsynteny between tomato and Arabidopsis in this genetic region, we identified ARF8 as a potential candidate gene for these two QTLs. ARF8 is known to act as an inhibitor of further carpel development in Arabidopsis in the absence of pollination/fertilization. Expression of an aberrant form of the Arabidopsis ARF8 gene in tomato has been found to cause parthenocarpy. This candidate gene approach may lead to the first isolation of a parthenocarpy gene in tomato and will allow further use in several crop species.