Bioinformatics from genetic variants to methylation
An important research topic in bioinformatics is the analysis of DNA, the
molecule that encodes the genetic information of all organisms. The basis
for this is sequencing, a procedure in which the sequence of DNA bases
is determined. In addition to the identification of variations in the base sequence
itself, advances in sequencing methods and a steady reduction in sequencing
costs open up new fields of research: the analysis of functionally
relevant non-base-related changes, so-called epigenetics. An important example
of such a mechanism is DNA methylation, a process in which methyl
groups are added to DNA without altering the sequence itself. Methylation
takes place only at specific sites, and the methylation information of human
DNA consists of approximately 30 million methylation levels between
0 and 1 in total. This thesis deals with problems and solutions for each
phase of DNA methylation analysis.
In terms of resolution, the most advanced method for detecting DNA methylation
is Whole-Genome Bisulfite Sequencing (WGBS), a technique that
modifies DNA at unmethylated sites. We describe the special in-silico treatment
required to process this altered DNA and existing concepts as well
as newly developed bioinformatic methods for efficient determination of
DNA methylation levels and their further processing with our developed
tool camel. A common downstream analysis step is the detection of differentially
methylated regions (DMRs), for which we have implemented a
modification of the widely used method BSmooth in order to deal with
today’s common data sizes.
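As a rough illustration of the in-silico treatment mentioned above, the following minimal sketch simulates bisulfite conversion on a forward-strand sequence and computes a methylation level from read counts. It is not the implementation used in camel; the function names and the counting convention (reads showing C count as methylated, reads showing T as unmethylated) are assumptions made for this example.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment on the forward strand.

    Unmethylated cytosines are read as T after conversion;
    methylated cytosines are protected and stay C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def methylation_level(n_c, n_t):
    """Fraction of reads showing C at a cytosine site (a value in [0, 1]).

    Returns None when the site has no coverage.
    """
    total = n_c + n_t
    if total == 0:
        return None
    return n_c / total


# Hypothetical toy input: only the cytosine at index 1 is methylated.
converted = bisulfite_convert("ACGTCG", {1})
level = methylation_level(n_c=3, n_t=1)
```

Because converted reads no longer match the reference directly, aligners for such data typically compare reads against a converted copy of the reference, which is the kind of special handling the abstract refers to.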
Setting up and creating new sequencing protocols, e.g., the mentioned
WGBS, is complicated and requires adjustments to several parameters. We
have developed a method based on a linear program (LP) that can predict
the duplicate rate of supersamples. This critical quality measure represents
the proportion of redundant data that in most cases needs to be removed
from any further analysis. By using our method, it becomes possible to
test, adjust and improve parameters for small test libraries only and to
estimate the duplication rate for potential full-size samples.
Once the sequencing protocol has been established, the methylation recognition
of camel can be used as part of automated workflows, such as our
mosquito workflow. This pipeline processes the generated WGBS samples
from the raw data to the degree of methylation, including all essential
intermediate steps. Such workflows are one of the central components of
bioinformatics since the calculation must be parallel, reproducible and scalable.
The distribution of the detected methylation levels, e.g., values of several
samples at a specific location, can often be described as a beta-mixture
model. The standard approach for estimating the parameters for such a
model, the EM algorithm, has problems for data points of 0 or 1, which are
very common as methylation levels. For this reason, we have developed an
alternative algorithm based on moments that overcomes this disadvantage.
It is robust for data points within the closed interval [0; 1] and can also be
applied to similar data sets in addition to methylation levels.
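To show why moment-based estimates tolerate values of exactly 0 or 1, here is a minimal sketch for a single beta distribution. It is not the mixture algorithm of the thesis, only the textbook method-of-moments step for one component; the function name beta_moments is an assumption for this example. A likelihood-based fit needs log(x) and log(1 - x), which are undefined at the boundaries, whereas the sample mean and variance are always defined.

```python
from statistics import fmean, pvariance


def beta_moments(xs):
    """Method-of-moments estimates (alpha, beta) for one beta distribution.

    Uses alpha = m * c and beta = (1 - m) * c with c = m * (1 - m) / v - 1,
    where m is the sample mean and v the (population) sample variance.
    """
    m = fmean(xs)
    v = pvariance(xs, mu=m)
    if v <= 0 or v >= m * (1 - m):
        raise ValueError("moments are not compatible with a beta distribution")
    c = m * (1 - m) / v - 1
    return m * c, (1 - m) * c


# Hypothetical methylation levels, including the boundary values 0 and 1.
alpha, beta = beta_moments([0.0, 0.25, 0.5, 0.75, 1.0])
```

For a mixture, such estimates would have to be combined with a scheme that assigns data points to components; that part is outside the scope of this sketch.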
This work deals not only with epigenetic but also with genetic variants. To
analyze these, we present a second pipeline (ape) for data from targeted
sequencing, where for example only genes are sequenced. The recognized
variants then serve as input for our graphical environment eagle, a tool
for computer scientists and geneticists to recognize possible causal genetic
variants. As the name implies: The configuration of the analysis and presentation
of the results is done via a graphical user interface. Unlike other
tools, eagle is not based on databases, but on encapsulated HDF5 files. The
use of this universal file-system-like data structure offers some advantages
and makes the system easy to use especially for non-computer scientists.
At the end of the thesis, we use all methods presented for the detection,
analysis, and characterization of interindividual DMRs between several
donors. This leads to some computational challenges because DMR
detection is usually performed on two different groups.
Our developed approach processes independent samples and calculates
key metrics such as p-values and the number of undetectable DMRs.
Through genome-wide association studies (GWAS) on more than 1000 array
data sets of methylation and variants, we show that (interindividual)
DMRs, as a form of epigenetic variation, are related to genetic variation.
Pentacene in 1,3,5-Tri(1-naphtyl)benzene: A Novel Standard for Transient EPR Spectroscopy at Room Temperature
Testing and calibrating an experimental setup with standard samples is an essential
aspect of scientific research. Single crystals of pentacene in p-terphenyl are widely
used for this purpose in transient electron paramagnetic resonance (EPR) spectroscopy.
However, this sample is not without downsides: the crystals need to be grown
and the EPR transitions only appear at particular orientations of the crystal with
respect to the external magnetic field. An alternative host for pentacene is the
glass-forming 1,3,5-tri(1-naphtyl)benzene (TNB). Due to the high glass transition point
of TNB, an amorphous glass containing randomly oriented pentacene molecules
is obtained at room temperature. Here we demonstrate that pentacene dissolved in
TNB gives a typical “powder-like” transient EPR spectrum of the triplet state
following pulsed laser excitation. From the two-dimensional data set, it is
straightforward to obtain the zero-field splitting parameters and relative populations by
spectral simulation as well as the B1 field in the microwave resonator. Due to the
simplicity of preparation, handling and stability, this system is ideal for adjusting the
laser beam with respect to the microwave resonator and for introducing students to
transient EPR spectroscopy.
X-ray crystallographic structure of a papain-leupeptin complex
The three-dimensional structure of the papain-leupeptin complex has been determined by X-ray crystallography to a resolution of 2.1 Å (overall R-factor = 19.8%). The structure indicates that: (i) leupeptin contacts the S subsites of the papain active site and not the S' subsites; (ii) the ‘carbonyl’ carbon atom of the inhibitor is covalently bound by the Cys-25 sulphur atom of papain and is tetrahedrally coordinated; (iii) the ‘carbonyl’ oxygen atom of the inhibitor faces the oxyanion hole and makes hydrogen bond contacts with Gln-19 and Cys-25.
Calculation of the elastic properties of prosthetic knee components with an iterative finite element-based modal analysis: quantitative comparison of different measuring techniques.
With the aging but still active population, research on total joint replacements relies increasingly on numerical methods, such as finite element analysis, to improve wear resistance of components. However, the validity of finite element models largely depends on the accuracy of their material behavior and geometrical representation. In particular, material properties are often based on manufacturer data or literature reports, but can alternatively be estimated by matching experimental measurements and structural predictions through modal analyses and identification of eigenfrequencies. The aim of the present study was to compare the accuracy of common setups used for estimating the eigenfrequencies of typical components often used in prosthetized joints. Eigenfrequencies of cobalt-chrome and ultra-high-molecular-weight polyethylene components were therefore measured with four different setups, and used in modal analyses of corresponding finite element models for an iterative adjustment of their material properties. Results show that for the low-damped cobalt-chromium endoprosthesis components, all common measuring setups provided accurate measurements. In the case of high-damped structures, measurements were only possible with setups including a continuous excitation system such as electrodynamic shakers. This study demonstrates that the iterative back-calculation of eigenfrequencies can be a reliable method to estimate the elastic properties for finite element models.
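The iterative back-calculation described above can be sketched as follows. This is a simplified illustration, not the procedure of the study: the finite element model is replaced by a closed-form Euler-Bernoulli cantilever formula as a stand-in, and all names and numerical values are hypothetical placeholders. The update rule relies on the fact that, for a linear elastic structure with fixed geometry and density, eigenfrequencies scale with the square root of Young's modulus.

```python
import math


def cantilever_f1(E, rho, length, width, height):
    """First bending eigenfrequency (Hz) of a rectangular cantilever beam.

    Stand-in for a finite element modal analysis (Euler-Bernoulli theory).
    """
    area = width * height
    inertia = width * height**3 / 12.0
    lam = 1.8751  # first root of the cantilever frequency equation
    return lam**2 / (2 * math.pi) * math.sqrt(E * inertia / (rho * area * length**4))


def fit_modulus(f_measured, simulate, E_start, tol=1e-6, max_iter=50):
    """Adjust Young's modulus until the simulated eigenfrequency matches."""
    E = E_start
    for _ in range(max_iter):
        f_sim = simulate(E)
        if abs(f_sim - f_measured) / f_measured < tol:
            break
        # Frequencies scale with sqrt(E), so rescale E by the squared ratio.
        E *= (f_measured / f_sim) ** 2
    return E


# Illustrative placeholder values only (SI units), not data from the study.
geometry = dict(rho=8300.0, length=0.1, width=0.01, height=0.002)
E_true = 210e9
f_meas = cantilever_f1(E_true, **geometry)  # plays the role of a measurement
E_fit = fit_modulus(f_meas, lambda E: cantilever_f1(E, **geometry), E_start=100e9)
```

With this idealized stand-in the update converges after a single step; with a real finite element model and several eigenfrequencies, more iterations and a combined error measure would be needed.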
Concomitant arginine-vasopressin and hydrocortisone therapy in severe septic shock: association with mortality
Purpose: To evaluate the association between concomitant arginine-vasopressin (AVP)/hydrocortisone therapy and mortality in severe septic shock patients. Methods: This retrospective study included severe septic shock patients treated with supplementary AVP. To test the association between concomitant AVP/hydrocortisone use and mortality, a multivariate regression and Cox model (adjusted for admission year, initial AVP dosage and the Sepsis-related Organ Failure Assessment score before AVP) as well as a propensity score-based analysis were used. In both models, intensive care unit (ICU) and 28-day mortality served as outcome variables. Results: One hundred fifty-nine patients were included. Hydrocortisone was administered to 76 (47.8%) at a median daily dosage of 300 (200-300) mg. In the multivariate logistic regression model, concomitant use of AVP and hydrocortisone was associated with a trend towards lower ICU (OR, 0.51; CI 95%, 0.24-1.08; p=0.08) and 28-day (HR, 0.69; CI 95%, 0.43-1.08; p=0.11) mortality. The probability of survival at day 28, as predicted by the regression model, was significantly higher in patients treated with concomitant AVP and hydrocortisone compared to those receiving AVP without hydrocortisone (p=0.001). In a propensity score-based analysis, ICU (45 vs. 65%; OR, 0.69; CI 95%, 0.38-1.26; p=0.23) and 28-day mortality (35.5 vs. 55%; OR, 0.59; CI 95%, 0.27-1.29; p=0.18) were not different between patients treated with (n=40) or without concomitant hydrocortisone (n=40). Conclusion: Concomitant AVP and hydrocortisone therapy may be associated with a survival benefit in septic shock. An adequately powered, randomised controlled trial appears warranted to confirm these preliminary, hypothesis-generating results.
The Infinite Index: Information Retrieval on Generative Text-To-Image Models
Conditional generative models such as DALL-E and Stable Diffusion generate
images based on a user-defined text, the prompt. Finding and refining prompts
that produce a desired image has become the art of prompt engineering.
Generative models do not provide a built-in retrieval model for a user's
information need expressed through prompts. In light of an extensive literature
review, we reframe prompt engineering for generative models as interactive
text-based retrieval on a novel kind of "infinite index". We apply these
insights for the first time in a case study on image generation for game design
with an expert. Finally, we envision how active learning may help to guide the
retrieval of generated images.
Comment: Final version for CHIIR 202
Studying Light-Harvesting Models with Superconducting Circuits
The process of photosynthesis, the main source of energy in the animate
world, converts sunlight into chemical energy. The surprisingly high efficiency
of this process is believed to be enabled by an intricate interplay between the
quantum nature of molecular structures in photosynthetic complexes and their
interaction with the environment. Investigating these effects in biological
samples is challenging due to their complex and disordered structure. Here we
experimentally demonstrate a new approach for studying photosynthetic models
based on superconducting quantum circuits. In particular, we demonstrate the
unprecedented versatility and control of our method in an engineered three-site
model of a pigment protein complex with realistic parameters scaled down in
energy by a factor of . With this system we show that the excitation
transport between quantum coherent sites disordered in energy can be enabled
through the interaction with environmental noise. We also show that the
efficiency of the process is maximized for structured noise resembling
intramolecular phononic environments found in photosynthetic complexes.
Comment: 8+12 pages, 4+12 figures