1,965 research outputs found

    Technological advances in maize breeding: past, present and future

    Maize has for many decades been both one of the most important crops worldwide and one of the primary genetic model organisms. More recently, maize breeding has been impacted by rapid technological advances in sequencing and genotyping technology, transformation including genome editing, doubled haploid technology, parallelled by progress in data sciences and the development of novel breeding approaches utilizing genomic information. Herein, we report on past, current and future developments relevant for maize breeding with regard to (1) genome analysis, (2) germplasm diversity characterization and utilization, (3) manipulation of genetic diversity by transformation and genome editing, (4) inbred line development and hybrid seed production, (5) understanding and prediction of hybrid performance, (6) breeding methodology and (7) synthesis of opportunities and challenges for future maize breeding

    SynBac: minimal synthetic baculovirus genomes


    MICRO$EQ: Cost Effective, Whole-Genome Sequencing

    While the feasibility of whole human genome sequencing was proven by the success of the Human Genome Project several years ago, the prevalence of personal genome sequencing in the medical industry is still elusive due to its unrealistic cost and time requirements. Micro$eq is a startup company with the goal of overcoming these limitations by sequencing a minimum of 12 complete human genomes per day at an error rate less than ten parts in million at a profitable market price of less than US$1000 per genome. To overcome the technology bottlenecks hindering current biotech companies from achieving these target throughput, error rate, and market price goals, Micro$eq has developed an innovative sequencing technique that uses short-read fragments with high coverage on a microfluidics platform. Short, amplified DNA fragments are generated from an input of customer saliva. 6 base pair (bp) sequence hybridization is used for sequencing each of the DNA fragments individually.
The results of these hybridization reads are then assembled via de Bruijn graph theory and the graphical reconstructions of each fragment’s sequence are then assembled to a complete genome via shotgun sequencing with an expected error rate less than 1 in 100,000 bp. Upon the completion of financial analysis, both a small-scale business model producing 72 genomes per day at US$999 per genome, and a large-scale business model producing 52.2 genomes per year at a market price of US$299 per genome were found to be profitable, yielding Micro$eq investors return margins of ~90% and 300% for the small and large scale models, respectively. With a market price Micro$eq offers personal genome sequencing at one-tenth of its nearest potential competitor’s cost. Additionally, its ability for bulk-sequencing allows it to profitably venture into the previously untapped Pharmaceutical Industry market sector, enabling the creation of large-scale genome databases which are the next step forward in the quest for truly personalized
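
The abstract above mentions assembling 6 bp hybridization reads via de Bruijn graph theory but gives no algorithmic detail. The following is a minimal textbook sketch of that general idea, not the authors' method: it assumes an error-free spectrum in which every (k-1)-mer of the fragment is unique, so the graph is a single path. The function names (`kmers`, `de_bruijn`, `eulerian_path`, `reconstruct`) and the example fragment are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k=6):
    """Return all overlapping k-mers of seq (an idealised, error-free spectrum)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def de_bruijn(kmer_list):
    """Build a de Bruijn graph: nodes are (k-1)-mers, each k-mer is an edge prefix -> suffix."""
    graph = defaultdict(list)
    for km in kmer_list:
        graph[km[:-1]].append(km[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes the graph actually contains an Eulerian path."""
    in_deg = defaultdict(int)
    for successors in graph.values():
        for v in successors:
            in_deg[v] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (u for u in graph if len(graph[u]) - in_deg[u] == 1),
        next(iter(graph)),
    )
    edges = {u: list(vs) for u, vs in graph.items()}  # copy so the input is not consumed
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if edges.get(u):
            stack.append(edges[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def reconstruct(kmer_list):
    """Spell the sequence along the Eulerian path: first node plus last base of each later node."""
    path = eulerian_path(de_bruijn(kmer_list))
    return path[0] + "".join(node[-1] for node in path[1:])


fragment = "ATGGCGTGCAATCC"            # hypothetical fragment with no repeated 5-mers
spectrum = sorted(kmers(fragment, 6))  # sorting discards positional information
print(reconstruct(spectrum))
```

With repeated (k-1)-mers the Eulerian path is no longer unique, and real hybridization data would contain errors, so a practical pipeline would need additional handling that this sketch omits.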

    CAD Tools for DNA Micro-Array Design, Manufacture and Application

    Motivation: As the Human Genome Project progresses and some microbial and eukaryotic genomes are recognized, numerous biotechnological processes have recently attracted an increasing number of biologists, bioengineers and computer scientists. Biotechnological processes profoundly involve production and analysis of high-throughput experimental data. Numerous sequence libraries of DNA and protein structures of a large number of micro-organisms and a variety of other databases related to biology and chemistry are available. For example, microarray technology, a novel biotechnology, promises to monitor the whole genome at once, so that researchers can study the whole genome on the global level and have a better picture of the expressions among millions of genes simultaneously. Today, it is widely used in many fields: disease diagnosis, gene classification, gene regulatory networks, and drug discovery. For example, designing organism-specific microarrays and analysing experimental data require combining heterogeneous computational tools that usually differ in data format, such as GeneMark for ORF extraction, Promide for DNA probe selection, Chip for probe placement on the microarray chip, BLAST to compare sequences, MEGA for phylogenetic analysis, and ClustalX for multiple alignments. Solution: Surprisingly enough, despite huge research efforts invested in DNA array applications, very few works are devoted to computer-aided optimization of DNA array design and manufacturing. Current design practices are dominated by ad-hoc heuristics incorporated in proprietary tools with unknown suboptimality. This will soon become a bottleneck for the new generation of high-density arrays, such as the ones currently being designed at Perlegen [109]. The goal of the already accomplished research was to develop highly scalable tools, with predictable runtime and quality, for cost-effective, computer-aided design and manufacturing of DNA probe arrays.
We illustrate the utility of our approach by taking a concrete example of combining the design tools of microarray technology for Herpes B virus DNA data

    GENOME ASSEMBLY AND VARIANT DETECTION USING EMERGING SEQUENCING TECHNOLOGIES AND GRAPH BASED METHODS

    The increased availability of genomic data and the increased ease and lower costs of DNA sequencing have revolutionized biomedical research. One of the critical steps in most bioinformatics analyses is the assembly of the genome sequence of an organism using the data generated from the sequencing machines. Despite the long length of sequences generated by third-generation sequencing technologies (tens of thousands of basepairs), the automated reconstruction of entire genomes continues to be a formidable computational task. Although long read technologies help in resolving highly repetitive regions, the contigs generated from long read assembly do not always span a complete chromosome or even an arm of the chromosome. Recently, new genomic technologies have been developed that can "bridge" across repeats or other genomic regions that are difficult to sequence or assemble and improve genome assemblies by "scaffolding" together large segments of the genome. The problem of scaffolding is vital in the context of both single genome assembly of large eukaryotic genomes and in metagenomics where the goal is to assemble multiple bacterial genomes in a sample simultaneously. First, we describe SALSA2, a method we developed to use interaction frequency between any two loci in the genome obtained using Hi-C technology to scaffold fragmented eukaryotic genome assemblies into chromosomes. SALSA2 can be used with either short or long read assembly to generate highly contiguous and accurate chromosome level assemblies. Hi-C data are known to introduce small inversion errors in the assembly, so we included the assembly graph in the scaffolding process and used the sequence overlap information to correct the orientation errors. Next, we present our contributions to metagenomics. We developed a scaffolding and variant detection method MetaCarvel for metagenomic datasets.
Several factors such as the presence of inter-genomic repeats, coverage ambiguities, and polymorphic regions in the genomes complicate the task of scaffolding metagenomes. Variant detection is also tricky in metagenomes because the different genomes within these complex samples are not known beforehand. We showed that MetaCarvel was able to generate accurate scaffolds and find genome-wide variations de novo in metagenomic datasets. Finally, we present EDIT, a tool for clustering millions of DNA sequence fragments originating from the highly conserved 16S rRNA gene in bacteria. We extended the classical Four Russians speedup to banded sequence alignment and showed that our method clusters highly similar sequences efficiently. This method can also be used to remove duplicates or near duplicate sequences from a dataset. With the increasing data being generated in different genomic and metagenomic studies using emerging sequencing technologies, our software tools and algorithms are well timed with the need of the community
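
The abstract above refers to banded sequence alignment for comparing highly similar sequences. As a rough illustration of the banding idea only, the sketch below computes edit distance while filling just the cells within a fixed distance of the main diagonal. It does not implement the Four Russians speedup or the EDIT tool itself; the function name `banded_edit_distance` and the `band` parameter are hypothetical.

```python
def banded_edit_distance(a, b, band):
    """Edit distance restricted to DP cells with |i - j| <= band.

    Returns None when the length difference alone exceeds the band.
    The result equals the true edit distance whenever that distance is
    at most `band`; otherwise it is only an upper bound.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None
    inf = float("inf")
    # Row 0: j insertions, but only inside the band.
    prev = [j if j <= band else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        lo, hi = max(0, i - band), min(m, i + band)
        if lo == 0:
            cur[0] = i  # i deletions, still inside the band
        for j in range(max(1, lo), hi + 1):
            substitute = prev[j - 1] + (a[i - 1] != b[j - 1])
            delete = prev[j] + 1
            insert = cur[j - 1] + 1
            cur[j] = min(substitute, delete, insert)
        prev = cur
    return prev[m]


print(banded_edit_distance("ACGTACGT", "ACGTTCGT", 2))
```

Because only about `2 * band + 1` cells per row are filled, the work is roughly proportional to `band * len(a)` rather than `len(a) * len(b)`, which is why banding suits near-identical sequences such as fragments of a conserved gene.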

    An investigation into the biosynthesis of proximicins

    PhD Thesis. The proximicins are a family of three compounds (A-C) produced by two marine Actinomycete Verrucosispora strains (V. maris AB18-032 and V. sp. str. MG37) and are characterised by the presence of 2,4-disubstituted furan rings. Proximicins demonstrate cell-arresting and antimicrobial ability, making them interesting leads for clinical drug development. Proximicin research has been largely overshadowed by other Verrucosispora strain secondary metabolites (SM), and despite the publication of the V. maris AB18-032 draft genome, the enzymatic machinery responsible for their production has not been established. It has been noted in related research into a pyrrole-containing homolog, congocidine, that, due to the structural similarity exhibited, proximicins likely utilise a similar biosynthetic route. The initial aim of this research was to confirm the presumed pathway to proximicin biosynthesis. Following the sequencing, assembly and annotation of the second proximicin producer, Verrucosispora sp. str. MG37, and genome mining of V. maris AB18-032, no common clusters mimicked that of congocidine, casting doubt on the previously assumed analogous biosynthetic routes. A putative proximicin biosynthesis (ppb) cluster was identified, containing non-ribosomal peptide synthetase (NRPS) enzymes, exhibiting some homology with congocidine. NRPS systems represent a network of interacting proteins, which act as a SM assembly line: crucially, adenylation (A)-domain enzymes act as the ‘gate-keeper’, determining which precursors are included into the elongating peptide. To elucidate the route to proximicins, activity characterisation of the four A-domains present in the ppb cluster was attempted. The A-domain Ppb120 was shown to possess novel activity, demonstrating a high promiscuity towards heterocycle-containing precursors, in addition to the absence of an apparent essential domain.
This discovery refutes previous work outlining the core residues which dictate A-domain activity, while also presenting a facile route to novel heterocycle-containing compounds. Despite extensive work, A-domains ppb195 and ppb210 were not effectively purified in the active form, which informs future work on A-domain activity characterisation. Finally, the ppb220 A-domain, which lies at the border of ppb, was inactive, suggesting over-estimation of the cluster margins. To confirm ppb220 redundancy and the ppb boundaries, CRISPR/Cas gene editing studies were done. The gene responsible for the orange pigment of Verrucosispora strains was initially targeted and successfully deleted, and ppb studies commenced. The research here refutes the previously presumed route to proximicin biosynthesis; the ppb cluster instead comprises enzymes exhibiting unique activity and structure. The findings represent the foundations for allowing exploitation of chemistry exhibited within the proximicin family. The novelty exhibited can be utilised in the search for antimicrobial clinical leads, by allowing the production of compounds containing previously inaccessible heterocycle chemistry

    The exploitation of thermophiles and their enzymes for the construction of multistep enzyme reactions from characterised enzyme parts

    Biocatalysis is a field rapidly expanding to meet a demand for green and sustainable chemical processes. As the use of enzymes for synthetic chemistry becomes more common, the construction of multistep enzyme reactions is likely to become more prominent providing excellent cost and productivity benefits. However, the design and optimisation of multistep reactions can be challenging. An enzyme toolbox of well-characterised enzyme parts is critical for the design of novel multistep reactions. Furthermore, while whole-cell biocatalysis offers an excellent platform for multistep reactions, we are limited to the use of mesophilic host organisms such as Escherichia coli. The development of a thermophilic host organism would offer a powerful tool allowing whole-cell biocatalysis at elevated temperatures. This study aimed to investigate the construction of a multistep enzyme reaction from well-characterised enzyme parts, consisting of an esterase, a carboxylic acid reductase and an alcohol dehydrogenase. A novel thermostable esterase Af-Est2 was characterised both biochemically and structurally. The enzyme shows exceptional stability making it attractive for industrial biocatalysis, and features what is likely a structural or regulatory CoA molecule tightly bound near the active site. Five carboxylic acid reductases (CARs) taken from across the known CAR family were thoroughly characterised. Kinetic analysis of these enzymes with various substrates shows they have a broad but similar substrate specificity and that electron rich acids are favoured. The characterisation of these CARs seeks to provide specifications for their use as a biocatalyst. The use of isolated enzymes was investigated as an alternative to whole-cell biocatalysis for the multistep reaction. Additional enzymes for the regeneration of cofactors and removal of by-products were included, resulting in a seven enzyme reaction. 
Using characterised enzyme parts, a mechanistic mathematical model was constructed to aid in the understanding and optimisation of the reaction, demonstrating the power of this approach. Thermus thermophilus was identified as a promising candidate for use as a thermophilic host organism for whole-cell biocatalysis. Synthetic biology parts including a BioBricks vector, custom ribosome binding sites and characterised promoters were developed for this purpose. The expression of enzymes to complete the multistep enzyme reaction in T. thermophilus was successful, but native T. thermophilus enzymes prevented the biotransformation from being completed. In summary, this work makes a number of contributions to the enzyme toolbox of well-characterised enzymes, and investigates their combination into a multistep enzyme reaction both in vitro and in vivo using a novel thermophilic host organism. BBSRC, GS

    Metagenomics-Based Tryptophan Dimer Natural Product Discovery and Development Pipeline

    Most microbial natural product discovery programs rely on the growth of bacteria in the laboratory, yet it is now well established that the vast majority of bacteria in the environment have not been cultured, particularly from the diverse soil microbiota. By extracting DNA directly from soil samples to construct large archived environmental DNA (eDNA) libraries, thousands of genomes from both cultured and as yet uncultured bacteria can be simultaneously screened for gene clusters encoding natural products of interest. Several natural products with pharmaceutically relevant biological activity arise from the dimerization of tryptophans, such as staurosporine, rebeccamycin, and violacein. To discover novel tryptophan dimers (TDs), we have designed a metagenomics-based TD natural product discovery and development pipeline that consists of seven steps: 1) soil eDNA extraction; 2) eDNA library construction; 3) homology-based screening; 4) bioinformatics analysis; 5) heterologous expression; 6) characterization of compounds and their biosynthesis; 7) target identification. Using a degenerate primer set that targets the CPA synthase gene, one of the conserved genes of tryptophan dimer biosynthesis, we screened the equivalent of ~1 million (over 1 tera base pairs) bacterial genomes from the eDNA libraries, resulting in the discovery of 14 unprecedented TD gene clusters, almost tripling the number of TD gene clusters that have previously been characterized. Using heterologous expression strategies that involve 1) shuttling of pathways into diverse bacterial hosts, 2) overexpression of a positive transcriptional regulator, 3) synthetic refactoring of complete pathways, and 4) co-expression of deficient biosynthetic genes, we successfully expressed nine of the 14 gene clusters. This led to the functional characterization of three novel TD families (i.e. indolotryptoline, carboxy-indolocarbazole, and bisindolylmaleimide), consisting of 15 novel natural products (e.g.
BE 54017s, borregomycins, erdasporines) with therapeutically relevant bioactivities (e.g. antitumor, antibacterial). Linking biologically active natural products to their cellular targets remains a challenging and critical process in the development of therapeutic agents and small-molecule probes, especially for cytotoxic agents that might serve as anticancer agents. Using multidrug resistance-suppressed (MDR-sup) fission yeast resistant mutant screening, the molecular target of the indolotryptoline family of TDs was identified and validated to be the proteolipid subunits of vacuolar H+-ATPase (V-ATPase) at a putative binding site that is distinct from that of the previously described V-ATPase inhibitors. Together, we demonstrate the utility of this pipeline in the isolation, characterization, and development of novel natural products from the soil bacterial metagenome