540 research outputs found

    ERDMAS: An exemplar-driven institutional research data management and analysis strategy

    Devising fit-for-purpose research data management strategies within a university is challenging. This is because the five ‘Vs’ of generated research data (its Volume, Variety, Velocity, Veracity and Value) must be constantly considered. Invariably, the combination of data Vs for any given research endeavour determines how best to manage it, appropriately addressing archiving, compliance, security, privacy, sharing, reuse and so forth. As such, institutions are faced with defining, shaping and refining strategies and practices to ensure there are consistent and adequate research data management policies and guidelines in place for their researchers. FAIR data principles are very important for embracing open data opportunities, but more broadly, research data management practices need to be established in a comprehensive way. Additionally, new ICT options have rapidly become available, and institutions can make considered choices on whether to continue to use ‘on prem’, private Cloud or public Cloud infrastructure. If a hybrid approach is adopted, then the potential impact on existing institutional research data management strategies must be continually assessed and revised accordingly. The balance between developing a relevant institutional policy on the one hand and dynamically catering for the eclectic research data management and analytics needs of researchers, and their evolving interactions with external collaborators, on the other must be continually navigated. In this manuscript, an exemplar-driven research data management and analytics conceptual framework is introduced. A key feature of this framework is that it is couched in two dimensions. On one axis is the ‘standard’ linear approach of developing the research data management policy, guidelines, procedures, audit and risk assessment and an options matrix. Importantly, a second axis comprising a researcher-driven focus is introduced where exemplar research activities are used to define ‘classes’ of research data management and analysis requirements. This exemplar-driven dimension enables an ongoing system-wide comparative review to occur in parallel that can continually inform policy and guidelines refinement.

    An Open Framework for Extensible Multi-Stage Bioinformatics Software

    In research labs, there is often a need to customise software at every step in a given bioinformatics workflow, but traditionally it has been difficult to obtain both a high degree of customisability and good performance. Performance-sensitive tools are often highly monolithic, which can make research difficult. We present a novel set of software development principles and a bioinformatics framework, Friedrich, which is currently in early development. Friedrich applications support both early stage experimentation and late stage batch processing, since they simultaneously allow for good performance and a high degree of flexibility and customisability. These benefits are obtained in large part by basing Friedrich on the multiparadigm programming language Scala. We present a case study in the form of a basic genome assembler and its extension with new functionality. Our architecture has the potential to greatly increase the overall productivity of software developers and researchers in bioinformatics. Comment: 12 pages, 1 figure, to appear in proceedings of PRIB 201

    Intestinal spirochaetes of the genus Brachyspira share a partially conserved 26 kilobase genomic region with Enterococcus faecalis and Escherichia coli

    Anaerobic intestinal spirochaetes of the genus Brachyspira include both pathogenic and commensal species. The two best-studied members are the pathogenic species B. hyodysenteriae (the aetiological agent of swine dysentery) and B. pilosicoli (a cause of intestinal spirochaetosis in humans and other species). Analysis of near-complete genome sequences of these two species identified a highly conserved 26 kilobase (kb) region that was shared, against a background of otherwise very little sequence conservation between the two species. PCR amplification was used to identify sets of contiguous genes from this region in the related Brachyspira species B. intermedia, B. innocens, B. murdochii, B. alvinipulli, and B. aalborgi, and demonstrated the presence of at least part of this region in species from throughout the genus. Comparative genomic analysis with other sequenced bacterial species revealed that none of the completely sequenced spirochaete species from different genera contained this conserved cluster of coding sequences. In contrast, Enterococcus faecalis and Escherichia coli contained high gene cluster conservation across the 26 kb region, against an expected background of little sequence conservation between these phylogenetically distinct species. The conserved region in B. hyodysenteriae contained five genes predicted to be associated with amino acid transport and metabolism, four with energy production and conversion, two with nucleotide transport and metabolism, one with ion transport and metabolism, and four with poorly characterised or uncertain function, including an ankyrin repeat unit at the 5’ end. The most likely explanation for the presence of this 26 kb region in the Brachyspira species and in two unrelated enteric bacterial species is that the region has been involved in horizontal gene transfer.

    Germin-like proteins (GLPs) in cereal genomes: gene clustering and dynamic roles in plant defence

    The recent release of the genome sequences of a number of crop and model plant species has made it possible to define the genome organisation and functional characteristics of specific genes and gene families of agronomic importance. For instance, Sorghum bicolor, maize (Zea mays) and Brachypodium distachyon genome sequences along with the model grass species rice (Oryza sativa) enable the comparative analysis of genes involved in plant defence. Germin-like proteins (GLPs) are a small, functionally and taxonomically diverse class of cupin-domain containing proteins that have recently been shown to cluster in an area of rice chromosome 8. The genomic location of this gene cluster overlaps with a disease resistance QTL that provides defence against two rice fungal pathogens (Magnaporthe oryzae and Rhizoctonia solani). Studies showing the involvement of GLPs in basal host resistance against powdery mildew (Blumeria graminis ssp.) have also been reported in barley and wheat. In this mini-review, we compare the close proximity of GLPs in publicly available cereal crop genomes and discuss the contribution that these proteins, and their genome sequence organisation, make to plant defence.

    Evidence that the 36kb plasmid of Brachyspira hyodysenteriae contributes to virulence

    Swine dysentery (SD) results from infection of the porcine large intestine with the anaerobic intestinal spirochaete Brachyspira hyodysenteriae. Recently the genome of virulent Australian B. hyodysenteriae strain WA1 was sequenced, and a 36 kilobase (kb) circular plasmid was identified. The plasmid contained 31 genes including six rfb genes that were predicted to be involved with rhamnose biosynthesis, and others associated with glycosylation. In the current study a set of PCRs was developed to amplify portions of nine of the plasmid genes. When used with DNA extracted from virulent strain B204, PCR products were generated, but no products were generated with DNA from avirulent strain A1. Analysis of the DNA using pulsed field gel electrophoresis (PFGE) identified a plasmid band in strains WA1 and B204, but not in strain A1. These results demonstrate that strain A1 does not contain the plasmid, and suggest that lack of the plasmid may explain why this strain is avirulent. To determine how commonly strains lacking plasmids occur, DNA was extracted from 264 Australian field isolates of B. hyodysenteriae and subjected to PCRs for three of the plasmid genes. Only one isolate (WA400) that lacked the plasmid was identified, and this absence was confirmed by PFGE analysis of DNA from the isolate and further PCR testing. To assess its virulence, 24 pigs were experimentally challenged with cultures of WA400, and 12 control pigs were challenged with virulent strain WA1 under the same conditions. Significantly fewer (P = 0.03) of the pigs challenged with WA400 became colonised and developed SD (13/24; 54%) compared to the pigs infected with WA1 (11/12; 92%). Gross lesions in the pigs colonised with WA400 tended to be less extensive than those in pigs colonised with WA1, although there were no obvious differences at the microscopic level. The results support the likelihood that plasmid-encoded genes of B. hyodysenteriae are involved in colonisation and/or disease expression.
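
The group comparison reported in the abstract above (13/24 versus 11/12 pigs developing SD) can be checked with a short calculation. The abstract does not name the statistical test used, so the sketch below assumes a two-sided Fisher's exact test; the function name `fisher_exact_two_sided` is hypothetical and only the Python standard library is used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumption: this test is chosen for illustration; the abstract does not
    state which test produced its reported P value.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Counts from the abstract: WA400, 13 of 24 with SD; WA1, 11 of 12 with SD.
p_value = fisher_exact_two_sided(13, 11, 11, 1)
print(round(p_value, 2))  # prints 0.03
```

Under this assumption the result is consistent with the reported P = 0.03, although the authors may have used a different test.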

    Bioinformatics Education—Perspectives and Challenges

    This article discusses the evolution of curriculum, instructional methodologies and initiatives supporting the dissemination of bioinformatics. Building on the early applications of informatics to the field of biology, bioinformatics research entails input from the diverse disciplines of mathematics and statistics, physics and chemistry and medicine and pharmacology. Training in bioinformatics remains the oldest and most important rapid introduction approach to learning bioinformatics skills. 2 page(s)

    The genetics of symbiotic nitrogen fixation: comparative genomics of 14 Rhizobia Strains by resolution of protein clusters.

    The symbiotic relationship between legumes and nitrogen fixing bacteria is critical for agriculture, as it may have profound impacts on lowering costs for farmers, on land sustainability, on soil quality, and on mitigation of greenhouse gas emissions. However, despite the importance of the symbioses to the global nitrogen cycling balance, very few rhizobial genomes have been sequenced so far, although there are some ongoing efforts in sequencing elite strains. In this study, the genomes of fourteen selected strains of the order Rhizobiales, all previously fully sequenced and annotated, were compared to assess differences between the strains and to investigate the feasibility of defining a core ‘symbiome’: the essential genes required by all rhizobia for nodulation and nitrogen fixation. Comparison of these whole genomes has revealed valuable information, such as several events of lateral gene transfer, particularly in the symbiotic plasmids and genomic islands that have contributed to a better understanding of the evolution of contrasting symbioses. Unique genes were also identified, as well as omissions of symbiotic genes that were expected to be found. Protein comparisons have also allowed the identification of a variety of similarities and differences in several groups of genes, including those involved in nodulation, nitrogen fixation, production of exopolysaccharides, Type I to Type VI secretion systems, among others, and identifying some key genes that could be related to host specificity and/or a better saprophytic ability. However, while several significant differences in the type and number of proteins were observed, the evidence presented suggests no simple core symbiome exists. A more abstract systems biology concept of nitrogen fixing symbiosis may be required. The results have also highlighted that comparative genomics represents a valuable tool for capturing specificities and generalities of each genome.

    An Overview of the Adaptive Behaviour Profile in Young Children with Angelman Syndrome: Insights from the Global Angelman Syndrome Registry

    Objectives: Angelman syndrome (AS) is a rare genetic disorder that affects the expression of the UBE3A gene within the central nervous system and profoundly impacts neurodevelopment. Individuals with AS experience significant challenges across multiple adaptive behaviour domains including communication, motor skills, and the ability to independently perform daily functions such as feeding and toileting. Furthermore, persons with AS can demonstrate specific behaviours, varying with age, that limit their ability to participate within their social environment. The aim of this paper is to explore the adaptive behaviour profile through parent report from the Global Angelman Syndrome Registry. Methods: Specific parent report data from the Global Angelman Syndrome Registry were analysed to explore the adaptive profile of 204 young children, under the age of 6 years old, with formal diagnoses of AS. Analysis of data focused on communication skills, gross and fine motor skills, daily self-care skills (feeding, toileting, and dressing), and behavioural characteristics. Several relationships were explored: (a) the age at which certain skills were first performed based on genotype; (b) abilities in motor and adaptive behaviours, according to age and genotype; and (c) the frequency at which children performed specific communication skills and the presence and frequency of challenging behaviours, across age and genotype. Results: We visually present the ages at which frequent speech, walking, and independent dressing and toileting were first mastered by children. Additionally, we provide in-depth descriptives of expressive and receptive communication skills (including the use of alternative communication forms), fine and gross motor skills, eating, dressing, toileting, anxiety, aggression, and other behavioural characteristics. Conclusions: This cross-sectional profile of adaptive skills in 204 young children with AS showcases that although many communication, motor and adaptive skills were determined by age, children with a non-deletion aetiology exhibited advantages in communication skills, which may have impacted upon subsequent adaptive skills. The use of parent report in the present study provides valuable insight into the adaptive behaviour profile of young children with AS.

    How to identify pathogenic mutations among all those variations: Variant annotation and filtration in the genome sequencing era

    High-throughput sequencing technologies have become fundamental for the identification of disease-causing mutations in human genetic diseases both in research and clinical testing contexts. The cumulative number of genes linked to rare diseases is now close to 3,500 with more than 1,000 genes identified between 2010 and 2014 because of the early adoption of Exome Sequencing technologies. However, despite these encouraging figures, the success rate of clinical exome diagnosis remains low due to several factors including incorrect variant annotation and nonoptimal filtration practices, which may lead to misinterpretation of disease-causing mutations. In this review, we describe the critical steps of variant annotation and filtration processes to highlight a handful of potential disease-causing mutations for downstream analysis. We report the key annotation elements to gather at multiple levels for each mutation, and which systems are designed to help in collecting this mandatory information. We describe the filtration options, their efficiency and their limits, provide a generic filtration workflow, and highlight potential pitfalls through a use case.