2,556 research outputs found
SMAGEXP: a galaxy tool suite for transcriptomics data meta-analysis
Bakground: With the proliferation of available microarray and high throughput
sequencing experiments in the public domain, the use of meta-analysis methods
increases. In these experiments, where the sample size is often limited,
meta-analysis offers the possibility to considerably enhance the statistical
power and give more accurate results. For those purposes, it combines either
effect sizes or results of single studies in a appropriate manner. R packages
metaMA and metaRNASeq perform meta-analysis on microarray and NGS data,
respectively. They are not interchangeable as they rely on statistical modeling
specific to each technology.
Results: SMAGEXP (Statistical Meta-Analysis for Gene EXPression) integrates
metaMA and metaRNAseq packages into Galaxy. We aim to propose a unified way to
carry out meta-analysis of gene expression data, while taking care of their
specificities. We have developed this tool suite to analyse microarray data
from Gene Expression Omnibus (GEO) database or custom data from affymetrix
microarrays. These data are then combined to carry out meta-analysis using
metaMA package. SMAGEXP also offers to combine raw read counts from Next
Generation Sequencing (NGS) experiments using DESeq2 and metaRNASeq package. In
both cases, key values, independent from the technology type, are reported to
judge the quality of the meta-analysis. These tools are available on the Galaxy
main tool shed. Source code, help and installation instructions are available
on github.
Conclusion: The use of Galaxy offers an easy-to-use gene expression
meta-analysis tool suite based on the metaMA and metaRNASeq packages
Differential meta-analysis of RNA-seq data from multiple studies
High-throughput sequencing is now regularly used for studies of the
transcriptome (RNA-seq), particularly for comparisons among experimental
conditions. For the time being, a limited number of biological replicates are
typically considered in such experiments, leading to low detection power for
differential expression. As their cost continues to decrease, it is likely that
additional follow-up studies will be conducted to re-address the same
biological question. We demonstrate how p-value combination techniques
previously used for microarray meta-analyses can be used for the differential
analysis of RNA-seq data from multiple related studies. These techniques are
compared to a negative binomial generalized linear model (GLM) including a
fixed study effect on simulated data and real data on human melanoma cell
lines. The GLM with fixed study effect performed well for low inter-study
variation and small numbers of studies, but was outperformed by the
meta-analysis methods for moderate to large inter-study variability and larger
numbers of studies. To conclude, the p-value combination techniques illustrated
here are a valuable tool to perform differential meta-analyses of RNA-seq data
by appropriately accounting for biological and technical variability within
studies as well as additional study-specific effects. An R package metaRNASeq
is available on the R Forge
One machine, one minute, three billion tetrahedra
This paper presents a new scalable parallelization scheme to generate the 3D
Delaunay triangulation of a given set of points. Our first contribution is an
efficient serial implementation of the incremental Delaunay insertion
algorithm. A simple dedicated data structure, an efficient sorting of the
points and the optimization of the insertion algorithm have permitted to
accelerate reference implementations by a factor three. Our second contribution
is a multi-threaded version of the Delaunay kernel that is able to concurrently
insert vertices. Moore curve coordinates are used to partition the point set,
avoiding heavy synchronization overheads. Conflicts are managed by modifying
the partitions with a simple rescaling of the space-filling curve. The
performances of our implementation have been measured on three different
processors, an Intel core-i7, an Intel Xeon Phi and an AMD EPYC, on which we
have been able to compute 3 billion tetrahedra in 53 seconds. This corresponds
to a generation rate of over 55 million tetrahedra per second. We finally show
how this very efficient parallel Delaunay triangulation can be integrated in a
Delaunay refinement mesh generator which takes as input the triangulated
surface boundary of the volume to mesh
Guided Machine Learning for power grid segmentation
The segmentation of large scale power grids into zones is crucial for control
room operators when managing the grid complexity near real time. In this paper
we propose a new method in two steps which is able to automatically do this
segmentation, while taking into account the real time context, in order to help
them handle shifting dynamics. Our method relies on a "guided" machine learning
approach. As a first step, we define and compute a task specific "Influence
Graph" in a guided manner. We indeed simulate on a grid state chosen
interventions, representative of our task of interest (managing active power
flows in our case). For visualization and interpretation, we then build a
higher representation of the grid relevant to this task by applying the graph
community detection algorithm \textit{Infomap} on this Influence Graph. To
illustrate our method and demonstrate its practical interest, we apply it on
commonly used systems, the IEEE-14 and IEEE-118. We show promising and original
interpretable results, especially on the previously well studied RTS-96 system
for grid segmentation. We eventually share initial investigation and results on
a large-scale system, the French power grid, whose segmentation had a
surprising resemblance with RTE's historical partitioning
Utilisation des virus oncolytiques dans le traitement des tumeurs gliales chez le chien
Les tumeurs gliales correspondent à la deuxième tumeur intracrânienne la plus fréquente chez le chien. Les traitements actuels existants présentent des résultats décevants. En effet, les récidives sont fréquentes et les médianes de survie sont faibles. De nouvelles thérapeutiques sont donc en cours d’étude, dont notamment la thérapie génique utilisant des virus oncolytiques et montrent des résultats prometteurs. Par conséquent, cette thèse a pour objectif de montrer l’intérêt d’utiliser les virus oncolytiques dans le traitement des gliomes canins. Différentes études réalisées utilisant les virus oncolytiques : in vitro sur des lignées cellulaires de gliomes humains et canins, in vivo chez les modèles murins et dans des essais cliniques chez l’homme, seront présentées
New efficient algorithms for multiple change-point detection with kernels
Several statistical approaches based on reproducing kernels have been
proposed to detect abrupt changes arising in the full distribution of the
observations and not only in the mean or variance. Some of these approaches
enjoy good statistical properties (oracle inequality, \ldots). Nonetheless,
they have a high computational cost both in terms of time and memory. This
makes their application difficult even for small and medium sample sizes (). This computational issue is addressed by first describing a new
efficient and exact algorithm for kernel multiple change-point detection with
an improved worst-case complexity that is quadratic in time and linear in
space. It allows dealing with medium size signals (up to ).
Second, a faster but approximation algorithm is described. It is based on a
low-rank approximation to the Gram matrix. It is linear in time and space. This
approximation algorithm can be applied to large-scale signals ().
These exact and approximation algorithms have been implemented in \texttt{R}
and \texttt{C} for various kernels. The computational and statistical
performances of these new algorithms have been assessed through empirical
experiments. The runtime of the new algorithms is observed to be faster than
that of other considered procedures. Finally, simulations confirmed the higher
statistical accuracy of kernel-based approaches to detect changes that are not
only in the mean. These simulations also illustrate the flexibility of
kernel-based approaches to analyze complex biological profiles made of DNA copy
number and allele B frequencies. An R package implementing the approach will be
made available on github
- …