144 research outputs found
Expression data analysis and regulatory network inference by means of correlation patterns
With the advance of high-throughput techniques, the amount of available data in the bio-molecular field is rapidly growing. It is now possible to measure genome-wide aspects of an entire biological system as a whole.
Correlations that emerge due to internal dependency structures of these systems entail the formation of characteristic patterns in the corresponding data. The extraction of these patterns has become an integral part of computational biology.
By triggering perturbations and interventions it is possible to induce an alteration of patterns, which may help to derive the dependency structures present in the system. In particular, differential expression experiments may yield alternate patterns that we can use to approximate the actual interplay of regulatory proteins and genetic elements, namely, the regulatory network of a cell.
In this work, we examine the detection of correlation patterns from bio-molecular data and we evaluate their applicability in terms of protein contact prediction, experimental artifact removal, the discovery of unexpected expression patterns and genome-scale inference of regulatory networks.
Correlation patterns are not limited to expression data. Their analysis in the context of conserved interfaces among proteins is useful to estimate whether these may have co-evolved. Patterns that hint at correlated mutations would then occur in the associated protein sequences as well. We employ a conceptually simple sampling strategy to decide whether or not two pathway elements share a conserved interface and are thus likely to be in physical contact. We successfully apply our method to a system of ABC-transporters and two-component systems from the bacterial phylum Firmicutes.
For spatially resolved gene expression data like microarrays, the detection of artifacts, as opposed to noise, corresponds to the extraction of localized patterns that resemble outliers in a given region. We develop a method to detect and remove such artifacts using a sliding-window approach. Our method is very accurate and it is shown to adapt to other platforms like custom arrays as well.
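The sliding-window idea described above can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the window shape, the outlier score, or the repair rule, so the square window, the median/MAD score, and the names `flag_local_outliers`, `repair`, `half_width`, `threshold` and `min_scale` are all assumptions made for illustration.

```python
from statistics import median


def _neighbours(grid, r, c, half_width):
    """Yield (row, col) positions in the square window around (r, c), clipped at the borders."""
    n_rows, n_cols = len(grid), len(grid[0])
    for i in range(max(0, r - half_width), min(n_rows, r + half_width + 1)):
        for j in range(max(0, c - half_width), min(n_cols, c + half_width + 1)):
            if (i, j) != (r, c):
                yield i, j


def flag_local_outliers(grid, half_width=1, threshold=3.0, min_scale=1e-6):
    """Mark spots whose intensity deviates strongly from the median of their window.

    Assumed score: |value - window median| / window MAD (hypothetical choice).
    """
    flags = [[False] * len(grid[0]) for _ in grid]
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            window = [grid[i][j] for i, j in _neighbours(grid, r, c, half_width)]
            centre = median(window)
            # Median absolute deviation; min_scale guards against division by zero.
            mad = median([abs(v - centre) for v in window])
            score = abs(grid[r][c] - centre) / max(mad, min_scale)
            flags[r][c] = score > threshold
    return flags


def repair(grid, flags, half_width=1):
    """Replace flagged spots by the median of their unflagged neighbours."""
    repaired = [row[:] for row in grid]
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if flags[r][c]:
                clean = [grid[i][j]
                         for i, j in _neighbours(grid, r, c, half_width)
                         if not flags[i][j]]
                if clean:  # leave the spot unchanged if no clean neighbour exists
                    repaired[r][c] = median(clean)
    return repaired


# Toy 5x5 "array" with a single spike in the centre.
grid = [[1.0] * 5 for _ in range(5)]
grid[2][2] = 9.0
flags = flag_local_outliers(grid)
repaired = repair(grid, flags)
```

In this toy example only the central spot is flagged and it is restored to the value of its neighbours. Real arrays would need a window size and threshold tuned to the platform.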
Further, we developed Padesco as a way to reveal unexpected expression patterns. We extract frequent and recurring patterns that are conserved across many experiments. For a specific experiment, we predict whether a gene deviates from its expected behaviour. We show that Padesco is an effective approach for selecting promising candidates from differential expression experiments.
In Chapter 5, we then focus on the inference of genome-scale regulatory networks from expression data. Here, correlation patterns have proven useful for the data-driven estimation of regulatory interactions. We show that, for reliable eukaryotic network inference, the integration of prior networks is essential. We reveal that this integration leads to an overestimation of network-wide quality measures and suggest a corrective procedure, CoRe, to counterbalance this effect. CoRe drastically improves the false discovery rate of the originally predicted networks. We further suggest a consensus approach in combination with an extended set of topological features to obtain a more accurate estimate of the eukaryotic regulatory network for yeast.
In the course of this work we show how correlation patterns can be detected and how they can be applied for various problem settings in computational molecular biology. We develop and discuss competitive approaches for the prediction of protein contacts, artifact repair, differential expression analysis, and network inference and show their applicability in practical setups.
Structure of the ATP synthase from Mycobacterium smegmatis provides targets for treating tuberculosis.
The structure of the adenosine triphosphate (ATP) synthase from Mycobacterium smegmatis has been determined by electron cryomicroscopy. This analysis confirms features in a prior description of the structure of the enzyme, but it also describes other highly significant attributes not recognized before that are crucial for understanding the mechanism and regulation of the mycobacterial enzyme. First, we resolved not only the three main states in the catalytic cycle described before but also eight substates that portray structural and mechanistic changes occurring during a 360° catalytic cycle. Second, a mechanism of auto-inhibition of ATP hydrolysis involves not only the engagement of the C-terminal region of an α-subunit in a loop in the γ-subunit, as proposed before, but also a "fail-safe" mechanism involving the b'-subunit in the peripheral stalk that enhances engagement. A third unreported characteristic is that the fused bδ-subunit contains a duplicated domain in its N-terminal region where the two copies of the domain participate in similar modes of attachment of two of the three N-terminal regions of the α-subunits. The auto-inhibitory plus the associated "fail-safe" mechanisms and the modes of attachment of the α-subunits provide targets for development of innovative antitubercular drugs. The structure also provides support for an observation made in the bovine ATP synthase that the transmembrane proton-motive force that provides the energy to drive the rotary mechanism is delivered directly and tangentially to the rotor via a Grotthuss water chain in a polar L-shaped tunnel.
Feasibility Study for a Chemical Process Particle Size Characterization System for Explosive Environments Using Low Laser Power
The industrial particle sensor market lacks simple, easy-to-use, low-cost yet robust, safe and fast-response solutions. Towards development of such a sensor, for in-line use in microchannels under continuous flow conditions, this work introduces static light scattering (SLS) determination of particle diameter using a laser with an emission power of less than 5 µW together with sensitive detectors with detection times of 1 ms. The measurements for the feasibility studies are made in an angular range between 20° and 160° in 2° increments. We focus on the range between 300 and 1000 nm, for applications in the production of paints, colors, pigments and crystallites. Due to the fast response time, reaction characteristics in microchannel designs for precipitation and crystallization processes can be studied. A novel method for particle diameter characterization is developed using the positions of maxima and minima and slope distribution. The novel algorithm to classify particle diameter is specifically developed to be independent of dispersed phase concentration or concentration fluctuations like product flares or signal instability. Measurement signals are post-processed and particle diameters are validated against Mie light scattering simulations. The design of a low-cost instrument for industrial use is proposed.
A commonly used rumen-protected conjugated linoleic acid supplement marginally affects fatty acid distribution of body tissues and gene expression of mammary gland in heifers during early lactation
Background: Conjugated linoleic acids (CLA) in general, and in particular the trans-10, cis-12 (t10, c12-CLA) isomer, are potent modulators of milk fat synthesis in dairy cows. Studies in rodents, such as mice, have revealed that t10, c12-CLA is responsible for hepatic lipodystrophy and decreased adipose tissue with subsequent changes in the fatty acid distribution. The present study aimed to investigate the fatty acid distribution of lipids in several body tissues compared to their distribution in milk fat in early lactating cows in response to CLA treatment. Effects in the mammary gland were further analyzed at the gene expression level. Methods: Twenty-five Holstein heifers were fed a diet supplemented with (CLA groups) or without (CON groups) a rumen-protected CLA supplement that provided 6 g/d of c9, t11- and t10, c12-CLA. Five groups of randomly assigned cows were analyzed according to experimental design based on feeding and time of slaughter. Cows in the first group received no CLA supplement and were slaughtered one day postpartum (CON0). Milk samples were taken from the remaining cows in CON and CLA groups until slaughter at 42 (period 1) and 105 (period 2) days in milk (DIM). Immediately after slaughter, tissue samples from liver, retroperitoneal fat, mammary gland and M. longissimus (13th rib) were obtained and analyzed for fatty acid distribution. Relevant genes involved in lipid metabolism of the mammary gland were analyzed using a custom-made microarray platform. Results: Both supplemented CLA isomers increased significantly in milk fat. Furthermore, preformed fatty acids increased at the expense of de novo-synthesized fatty acids. Total and single trans-octadecenoic acids (e.g., t10-18:1 and t11-18:1) also significantly increased. Fatty acid distribution of the mammary gland showed similar changes to those in milk fat, due mainly to residual milk but without affecting gene expression. Liver fatty acids were not altered except for trans-octadecenoic acids, which were increased. Adipose tissue and M. longissimus were only marginally affected by CLA supplementation. Conclusions: Daily supplementation with CLA led to typical alterations usually observed in milk fat depression (reduction of de novo-synthesized fatty acids) but only marginally affected tissue lipids. Gene expression of the mammary gland was not influenced by CLA supplementation.
Upworkers in Finland: Survey Results
Upwork is the world’s largest online labor market platform connecting clients with freelance professionals from various disciplines ranging from administrative support to web development. This study documents the main findings of the Upworkers in Finland survey conducted in December 2017. The survey targeted all freelancers listed on the platform who (a) claimed to reside in Finland and (b) had earned at least $1 since signing up. Of the 207 such freelancers found publicly listed on Upwork on 8 December 2017, 58.9% responded to our online questionnaire. Most Upworkers in Finland are translators, followed by designers and coders. They are typically less than 30 years old, involved in higher education or training (or already have at least a college-level degree), and live in the capital region or another urban area. Approximately one-third are immigrants or other nonnative speakers. They have a strong preference for entrepreneurship/self-employment over paid/salaried employment. Independence, flexibility, and extra earnings are particular motivators for online work engagement. The respondents are both quite fond of the platform and satisfied with their current online work arrangement.
The Utility of AISA Eagle Hyperspectral Data and Random Forest Classifier for Flower Mapping
Peer reviewed
Factors influencing the efficiency of generating genetically engineered pigs by nuclear transfer: multi-factorial analysis of a large data set
Background: Somatic cell nuclear transfer (SCNT) using genetically engineered donor cells is currently the most widely used strategy to generate tailored pig models for biomedical research. Although this approach facilitates a similar spectrum of genetic modifications as in rodent models, the outcome in terms of live cloned piglets is quite variable. In this study, we aimed at a comprehensive analysis of environmental and experimental factors that substantially influence the efficiency of generating genetically engineered pigs. Based on a large data set from 274 SCNT experiments (in total 18,649 reconstructed embryos transferred into 193 recipients), performed over a period of three years, we assessed the relative contribution of season, type of genetic modification, donor cell source, number of cloning rounds, and pre-selection of cloned embryos for early development to the cloning efficiency. Results: 109 (56%) recipients became pregnant and 85 (78%) of them gave birth to offspring. Out of 318 cloned piglets, 243 (76%) were alive, but only 97 (40%) were clinically healthy and showed normal development. The proportion of stillborn piglets was 24% (75/318), and another 31% (100/318) of the cloned piglets died soon after birth. The overall cloning efficiency, defined as the number of offspring born per SCNT embryos transferred, including only recipients that delivered, was 3.95%. SCNT experiments performed during winter using fetal fibroblasts or kidney cells after additive gene transfer resulted in the highest number of live and healthy offspring, while two or more rounds of cloning and nuclear transfer experiments performed during summer decreased the number of healthy offspring.
Conclusion: Although the effects of individual factors may differ between laboratories, our results and analysis strategy will help to identify and optimize the factors that are most critical to cloning success in programs aiming at the generation of genetically engineered pig models.
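As a reading aid, the percentages reported in the Results can be re-derived from the raw counts given in the abstract. The short check below uses only those counts; the variable names and the `pct` helper are illustrative. The denominator of each percentage is inferred from the wording (for example, 40% healthy is taken relative to the 243 live piglets). The 3.95% cloning efficiency is not recomputed, because the number of embryos transferred to recipients that delivered is not stated.

```python
# Counts as reported in the abstract.
recipients = 193
pregnant = 109
delivered = 85
piglets = 318
alive = 243
healthy = 97
stillborn = 75
died_soon_after_birth = 100


def pct(part, whole):
    """Percentage of part in whole, rounded to the nearest integer."""
    return round(100 * part / whole)


rates = {
    "pregnant / recipients": pct(pregnant, recipients),
    "delivered / pregnant": pct(delivered, pregnant),
    "alive / piglets": pct(alive, piglets),
    "healthy / alive": pct(healthy, alive),  # denominator inferred from the text
    "stillborn / piglets": pct(stillborn, piglets),
    "died soon after birth / piglets": pct(died_soon_after_birth, piglets),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```

The counts are internally consistent: live and stillborn piglets sum to the 318 piglets born, and each rounded ratio matches the percentage quoted in the abstract.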