2,082 research outputs found
Genome assembly in the telomere-to-telomere era
De novo assembly is the process of reconstructing the genome sequence of an
organism from sequencing reads. Genome sequences are essential to biology, and
assembly has been a central problem in bioinformatics for four decades. Until
recently, genomes were typically assembled into fragments of a few megabases at
best, but technological advances in long-read sequencing now enable
near-complete chromosome-level assembly, also known as telomere-to-telomere
assembly, for many organisms. Here we review recent progress on assembly
algorithms and protocols. We focus on how to derive near telomere-to-telomere
assemblies and discuss potential future developments.
Hierarchical information clustering by means of topologically embedded graphs
We introduce a graph-theoretic approach to extract clusters and hierarchies
in complex data-sets in an unsupervised and deterministic manner, without the
use of any prior information. This is achieved by building topologically
embedded networks containing the subset of most significant links and analyzing
the network structure. For a planar embedding, this method provides both the
intra-cluster hierarchy, which describes the way clusters are composed, and the
inter-cluster hierarchy, which describes how clusters gather together. We
discuss performance, robustness and reliability of this method by first
investigating several artificial data-sets, finding that it can significantly
outperform other established approaches. Then we show that our method can
successfully differentiate meaningful clusters and hierarchies in a variety of
real data-sets. In particular, we find that the application to gene expression
patterns of lymphoma samples uncovers biologically significant groups of genes
which play key roles in diagnosis, prognosis and treatment of some of the most
relevant human lymphoid malignancies.
Comment: 33 Pages, 18 Figures, 5 Tables
Multiple bit error correcting architectures over finite fields
This thesis proposes techniques to mitigate multiple bit errors in GF arithmetic circuits. As GF arithmetic circuits such as multipliers constitute a complex and important functional unit of a crypto-processor, making them fault tolerant will improve the reliability of circuits employed in safety applications, where errors may cause catastrophe if not mitigated.
First, a thorough literature review has been carried out. The merits of efficient schemes are carefully analyzed to identify room for improvement in error correction, area and power consumption.
Proposed error correction schemes include bit-parallel ones using optimized BCH codes, which are useful in applications where power and area are not prime concerns. The scheme is also extended to a dynamically correcting scheme to reduce decoder delay. Another method, using cross parity codes, is proposed for low-power and low-area applications such as RFIDs and smart cards. The experimental evaluation shows that the proposed techniques can mitigate single and multiple bit errors with wider error coverage than existing methods, and with less area and power consumption. The proposed scheme is used to mask the errors appearing at the output of the circuit irrespective of their cause.
This thesis also investigates the error mitigation schemes in emerging technologies (QCA, CNTFET) to compare area, power and delay with the existing CMOS equivalent. Though the proposed novel multiple error correcting techniques cannot ensure 100% error mitigation, including these techniques in an actual design can improve the reliability of the circuits or increase the difficulty of hacking crypto-devices. Proposed schemes can also be extended to non-GF digital circuits.
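As a rough illustration of the parity-based idea mentioned in this abstract, the sketch below uses a plain two-dimensional (row and column) parity layout to locate and flip a single erroneous bit. It is an assumption-laden toy, not the thesis's cross parity scheme: the function names `encode` and `correct_single` are hypothetical, the parity bits themselves are assumed to be error-free, and only single-bit errors are corrected.

```python
def encode(bits, rows, cols):
    """Return (row_parity, col_parity) for bits laid out row-major in a rows x cols grid."""
    if len(bits) != rows * cols:
        raise ValueError("bits must contain rows * cols entries")
    grid = [bits[r * cols:(r + 1) * cols] for r in range(rows)]
    row_par = [sum(row) % 2 for row in grid]
    col_par = [sum(grid[r][c] for r in range(rows)) % 2 for c in range(cols)]
    return row_par, col_par


def correct_single(bits, row_par, col_par, rows, cols):
    """Flip the bit at the intersection of the one failing row and one failing column.

    Assumption: the stored parity bits are correct and at most one data bit is wrong.
    Any other error pattern is returned unchanged.
    """
    new_row, new_col = encode(bits, rows, cols)
    bad_rows = [r for r in range(rows) if new_row[r] != row_par[r]]
    bad_cols = [c for c in range(cols) if new_col[c] != col_par[c]]
    fixed = list(bits)
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        fixed[bad_rows[0] * cols + bad_cols[0]] ^= 1
    return fixed


# Hypothetical 16-bit word arranged as a 4 x 4 grid.
data = [1, 0, 1, 1,
        0, 1, 0, 0,
        1, 1, 1, 0,
        0, 0, 1, 1]
row_par, col_par = encode(data, 4, 4)

corrupted = list(data)
corrupted[6] ^= 1  # inject a single-bit error
recovered = correct_single(corrupted, row_par, col_par, 4, 4)
```

A single flipped bit changes exactly one row parity and one column parity, so the intersection identifies it. Handling multiple bit errors, as the thesis targets, needs a stronger code than this toy.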
Insights into the biology of Candidate Division OP3 LiM populations
The candidate division OP3, recently entitled candidate phylum Omnitrophica, is characterized by 16S rRNA gene sequences from a broad range of anoxic habitats with a broad phylogeny of up to 26% sequence dissimilarity. The 16S rRNA phylotype OP3 LiM had previously been detected in limonene-degrading, methanogenic enrichment cultures and represented small coccoid cells. Neither isolation experiments nor physiological experiments had provided insights into the metabolism of this bacterium within the complex methanogenic community. This doctoral thesis aimed at the characterization of populations of the phylotype OP3 LiM to discover its biology. Metagenomes usually yield draft population genomes. To obtain the complete closed OP3 LiM genome, in silico methods were explored to improve draft assemblies. Large genomes of planctomycete strains were assembled with a variety of methods. A taxonomic classification of contig sequences was used to differentiate and separate contigs of draft assemblies into taxon-specific groups. Reassemblies of reads obtained from mapping onto taxon-specific contigs yielded improved draft assemblies. This knowledge was used to obtain a closed genome of OP3 LiM from a metagenome of physically enriched OP3 LiM cells. Finishing the OP3 LiM genome required the combination of data of different sequencing technologies, a variety of assembly and mapping software, over 15 reassemblies with intensive manual quality controls by read and contig mapping and, finally, laboratory work with combinatorial PCR to solve the genome puzzle. The population genome of OP3 LiM is the first closed genome of a member of candidate phylum Omnitrophica and comprises 1,974,501 bp with a GC content of 52.9%. Its 23S rRNA contains a group I intron. The genome offers a syntrophic life on hydrogen or formate; however, the metaproteome indicated that OP3 LiM uses glycolysis together with pyruvate oxidation as its major catabolic pathway.
The metaproteome also identified high levels of proteins potentially involved in the degradation of polymers as well as in the uptake of foreign nucleic acids. The genomic information was combined with observations of cells of the methanogenic community by different visualization methods. Images of OP3 LiM required electron microscopy due to the small cell size of 0.2 to 0.3 micrometre in diameter. In situ hybridizations revealed two physiological stages, free-living OP3 LiM cells with low ribosome content and OP3 LiM cells attached to either bacteria or archaea, which showed strong signals. This observation indicated a higher metabolic activity of OP3 LiM cells during the attachment and, likewise, that the bacterium utilizes surface polysaccharides as its preferred substrate. In situ hybridizations revealed that the methanogen Methanosaeta in the enrichment culture contained cells in the filaments that lacked DNA and rRNA, suggesting that these cells lost their cellular content. We also observed faint signals of the OP3 LiM 16S rRNA in Methanosaeta cells. The presence of the intron RNA of the 23S rRNA of OP3 LiM was visualized in Methanosaeta cells devoid of DNA and rRNA. This first direct observation of an intron transfer from a bacterium to an archaeon, together with metaproteomic observations, indicates the lifestyle of an epibiotic bacterium for OP3 LiM. OP3 LiM is the first predatory bacterium that preys on Archaea. We propose to name OP3 LiM "Candidatus Vampirococcus archaeovorus".
Biophysical analysis of HTLV-1 particles reveals novel insights into particle morphology and Gag stoichiometry
Background: Human T-lymphotropic virus type 1 (HTLV-1) is an important human retrovirus that is a cause of adult T-cell leukemia/lymphoma. Although HTLV-1 is an important human pathogen, the details of the virus replication cycle, including the nature of HTLV-1 particles, remain largely unknown due to the difficulties in propagating the virus in tissue culture. In this study, we created a codon-optimized HTLV-1 Gag fused to an EYFP reporter as a model system to quantitatively analyze HTLV-1 particles released from producer cells.
Results: The codon-optimized Gag led to a dramatic and highly robust level of Gag expression as well as virus-like particle (VLP) production. The robust level of particle production overcomes previous technical difficulties with authentic particles and allowed for detailed analysis of particle architecture using two novel methodologies. We quantitatively measured the diameter and morphology of HTLV-1 VLPs in their native, hydrated state using cryo-transmission electron microscopy (cryo-TEM). Furthermore, we were able to determine HTLV-1 Gag stoichiometry as well as particle size with the novel biophysical technique of fluorescence fluctuation spectroscopy (FFS). The average HTLV-1 particle diameter determined by cryo-TEM and FFS was 71 ± 20 nm and 75 ± 4 nm, respectively. These values are significantly smaller than previous estimates made of HTLV-1 particles by negative staining TEM. Furthermore, cryo-TEM reveals that the majority of HTLV-1 VLPs lack an ordered structure of the Gag lattice, suggesting that the HTLV-1 Gag shell is very likely to be organized differently compared to that observed with HIV-1 Gag in immature particles. This conclusion is supported by our observation that the average copy number of HTLV-1 Gag per particle is estimated to be 510 based on FFS, which is significantly lower than that found for HIV-1 immature virions.
Conclusions: In summary, our studies represent the first quantitative biophysical analysis of HTLV-1-like particles and reveal novel insights into particle morphology and Gag stoichiometry.
A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
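To make the idea in this abstract concrete, here is a minimal sketch of path exploration with symbolic inputs. Everything in it is an illustrative assumption rather than anything from the survey: the toy `PROGRAM`, the helper names `explore`, `solve` and `find_inputs`, and especially the brute-force search over a small integer range, which merely stands in for a real constraint solver.

```python
from itertools import product

# Toy program as a branch tree (hypothetical): each inner node is
# (condition, then_branch, else_branch); each leaf is an outcome string.
# Inputs x and y are never fixed up front; they stay "symbolic" until solved.
PROGRAM = (
    lambda x, y: x > 10,
    (lambda x, y: y == 2 * x, "backdoor", "denied"),
    "denied",
)


def explore(node, path=()):
    """Yield (path_condition, outcome) for every execution path in the tree."""
    if isinstance(node, str):
        yield path, node
        return
    cond, then_branch, else_branch = node
    # Fork: follow both branches, recording which way the condition went.
    yield from explore(then_branch, path + ((cond, True),))
    yield from explore(else_branch, path + ((cond, False),))


def solve(path, domain=range(-50, 51)):
    """Brute-force stand-in for a constraint solver over a small integer domain."""
    for x, y in product(domain, repeat=2):
        if all(cond(x, y) == expected for cond, expected in path):
            return x, y
    return None


def find_inputs(target):
    """Return concrete inputs that drive the program to the target outcome, if any."""
    for path, outcome in explore(PROGRAM):
        if outcome == target:
            model = solve(path)
            if model is not None:
                return model
    return None


witness = find_inputs("backdoor")
all_paths = list(explore(PROGRAM))
```

Here `witness` is a concrete input pair satisfying both branch conditions on the path to the "backdoor" leaf. Random testing over the same domain would rarely hit it, which is the motivation the abstract gives for exploring paths symbolically.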
The present survey has been accepted for publication at ACM Computing
Surveys. If you are considering citing this survey, we would appreciate if you
could use the following BibTeX entry: http://goo.gl/Hf5Fvc
Comment: This is the authors' pre-print copy.
Towards Molecular Simulations that are Transparent, Reproducible, Usable By Others, and Extensible (TRUE)
Systems composed of soft matter (e.g., liquids, polymers, foams, gels,
colloids, and most biological materials) are ubiquitous in science and
engineering, but molecular simulations of such systems pose particular
computational challenges, requiring time and/or ensemble-averaged data to be
collected over long simulation trajectories for property evaluation. Performing
a molecular simulation of a soft matter system involves multiple steps, which
have traditionally been performed by researchers in a "bespoke" fashion,
resulting in many published soft matter simulations not being reproducible
based on the information provided in the publications. To address the issue of
reproducibility and to provide tools for computational screening, we have been
developing the open-source Molecular Simulation and Design Framework (MoSDeF)
software suite. In this paper, we propose a set of principles to create
Transparent, Reproducible, Usable by others, and Extensible (TRUE) molecular
simulations. MoSDeF facilitates the publication and dissemination of TRUE
simulations by automating many of the critical steps in molecular simulation,
thus enhancing their reproducibility. We provide several examples of TRUE
molecular simulations: All of the steps involved in creating, running and
extracting properties from the simulations are distributed on open-source
platforms (within MoSDeF and on GitHub), thus meeting the definition of TRUE
simulations.
Bioprocess Systems Engineering Applications in Pharmaceutical Manufacturing
Biopharmaceutical and pharmaceutical manufacturing are strongly influenced by the process analytical technology initiative (PAT) and quality by design (QbD) methodologies, which are designed to enhance the understanding of more integrated processes. The major aim of this effort can be summarized as developing a mechanistic understanding of a wide range of process steps, including the development of technologies to perform online measurements and real-time control and optimization. Furthermore, minimization of the number of empirical experiments and the model-assisted exploration of the process design space are targeted. Even if tremendous progress has been achieved so far, there is still work to be carried out in order to realize the full potential of the process systems engineering toolbox. Within this reprint, an overview of cutting-edge developments of process systems engineering for biopharmaceutical and pharmaceutical manufacturing processes is given, including model-based process design, Digital Twins, computer-aided process understanding, process development and optimization, and monitoring and control of bioprocesses. The biopharmaceutical processes addressed focus on the manufacturing of biopharmaceuticals, mainly by Chinese hamster ovary (CHO) cells, as well as adeno-associated virus production and generation of cell spheroids for cell therapies.