Gene-environment and gene-gene interactions in myopia
Motivated by the release of the UK Biobank data and the lack of documented gene-environment (GxE) and gene-gene (GxG) interactions in myopia, I sought to apply various statistical tools to provide a quantitative assessment of the interplay between environmental and genetic risk factors shaping refractive error.
The comparison between the two different risk measurement scales with which GxE interactions can be identified suggested that the additive risk scale can lead to a more informative perspective about refractive error aetiology.
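The difference between the two scales can be illustrated with a small worked example. The sketch below assumes a binary outcome (for instance, myopia case status) and two binary risk factors, and uses the standard relative excess risk due to interaction (RERI) for the additive scale and the ratio of relative risks for the multiplicative scale. The function name and the four risks are hypothetical and are not taken from the thesis.

```python
def interaction_measures(r00, r10, r01, r11):
    """Return (RERI, multiplicative ratio) from four absolute risks.

    r00: risk with neither factor present
    r10: risk with the genetic factor only
    r01: risk with the environmental factor only
    r11: risk with both factors present
    """
    rr10, rr01, rr11 = r10 / r00, r01 / r00, r11 / r00
    reri = rr11 - rr10 - rr01 + 1.0   # additive scale: 0 means no interaction
    ratio = rr11 / (rr10 * rr01)      # multiplicative scale: 1 means no interaction
    return reri, ratio


# Hypothetical risks chosen so that the two scales disagree.
reri, ratio = interaction_measures(r00=0.125, r10=0.25, r01=0.375, r11=0.75)
print(reri, ratio)
```

With these made-up risks the multiplicative ratio is 1 (no interaction) while RERI is 2 (a positive additive interaction), so the same data can look interaction-free on one scale and not on the other.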
The evaluation of two indirect methods for detecting genetic variants affecting refractive error via interaction effects suggested the enrichment of GxG and GxE among the variants that display marginal SNP effects.
For genetic variants already known from prior GWAS to influence refractive error, genetic effect sizes were highly non-uniform: individuals in the tails of the refractive error distribution (i.e. high myopes and hyperopes) displayed much larger effects than individuals in the middle of the distribution (i.e. emmetropes).
Prediction of refractive error using GxE interactions indicated that although some of the variance of refractive error could be explained by a risk score constructed using interaction effects, the contribution of GxE was already accounted for by a risk score constructed using marginal SNP effects only.
Although a handful of candidate genes were identified using the multifactor dimensionality reduction technique, none displayed compelling evidence of involvement in a GxG interaction. There was, however, suggestive evidence that the candidate genes constitute a genetic interaction network regulated by the hub gene ZMAT4.
In summary, the analyses reported in this thesis provide further support for the challenging nature of definitively identifying loci involved in GxE and GxG interactions. The thesis provides several guidelines that future studies could take into account to obtain more insightful results regarding the extent of interactions in refractive error.
High-Order Epistasis Detection in High Performance Computing Systems
Official Doctoral Programme in Information Technology Research. 524V01
[Abstract]
In recent years, Genome-Wide Association Studies (GWAS) have become increasingly popular as a way to find a genetic explanation for the presence or absence of particular diseases in humans. There is consensus in these studies about the presence of genetic interactions in the expression of complex diseases, a phenomenon called epistasis. This thesis studies this phenomenon using High-Performance Computing (HPC) and from a statistical definition of the problem: the deviation of the expression of a phenotype from the sum of the individual contributions of genetic variants. For this purpose, we first developed MPI3SNP, a program that identifies interactions of three variants from an input dataset. MPI3SNP implements an exhaustive search for epistasis using an association test based on Mutual Information, and exploits the resources of clusters of CPUs or GPUs to speed up the search. Then, with the help of MPI3SNP, we evaluated the state of the art in a study that compares the performance of twenty-seven tools. The most important conclusion of this study is the inability of non-exhaustive approaches to locate epistasis in the absence of marginal effects (small association effects of the individual variants that take part in an epistatic interaction). For this reason, the thesis went on to focus on optimizing the exhaustive search. First, we improved the efficiency of the association test through a vectorized implementation. Then, we developed a distributed algorithm capable of locating epistatic interactions of any order. These two milestones were achieved in Fiuncho, a program that incorporates all the research carried out and obtains the best performance on CPU clusters among all state-of-the-art alternatives. In addition, we developed a library, called Toxo, to simulate biological scenarios with epistasis. This library allows the simulation of epistasis that follows existing interaction models for high-order interactions.
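The idea of an exhaustive three-variant search scored with Mutual Information can be sketched in a few lines. The code below is a minimal serial illustration under simple assumptions (discrete genotype codes, a binary phenotype, plain Python with no MPI or GPU use); it is not the MPI3SNP or Fiuncho implementation, and the names `mutual_information` and `best_triple` are hypothetical.

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c * n) / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def best_triple(genotypes, phenotype):
    """Exhaustively score every SNP triple and return (best MI, SNP indices).

    genotypes: list of per-SNP lists of discrete codes, one value per sample.
    phenotype: list of 0/1 case-control labels, one per sample.
    """
    best = (-1.0, None)
    for i, j, k in combinations(range(len(genotypes)), 3):
        # Treat each sample's joint genotype over the three SNPs as one symbol.
        joint = list(zip(genotypes[i], genotypes[j], genotypes[k]))
        score = mutual_information(joint, phenotype)
        if score > best[0]:
            best = (score, (i, j, k))
    return best


# Toy data (made up): the phenotype is the parity of SNPs 0, 1 and 2,
# so no single SNP carries a marginal effect; SNP 3 is pure noise.
samples = list(product([0, 1], repeat=4))
genotypes = [[s[m] for s in samples] for m in range(4)]
phenotype = [s[0] ^ s[1] ^ s[2] for s in samples]
print(best_triple(genotypes, phenotype))
```

In this toy dataset each SNP on its own shares no information with the phenotype, yet the triple (0, 1, 2) reaches 1 bit, which is the kind of interaction without marginal effects that only an exhaustive search is guaranteed to examine.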
Algorithms for regression and classification
Regression and classification are statistical techniques that may be used to extract rules and patterns out of data sets. Analyzing the involved algorithms comprises interdisciplinary research that offers interesting problems for statisticians and computer scientists alike. The focus of this thesis is on robust regression and classification in genetic association studies.
In the context of robust regression, new exact algorithms and results for robust online scale estimation with the estimators Qn and Sn and for robust linear regression in the plane with the estimator least quartile difference (LQD) are presented. Additionally, an evolutionary computation algorithm for robust regression with different estimators in higher dimensions is devised. These estimators include the widely used least median of squares (LMS) and least trimmed squares (LTS).
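For orientation, the Qn scale estimator mentioned above can be written directly from its definition as the k-th smallest pairwise absolute difference, with k = h(h-1)/2 and h = floor(n/2) + 1, multiplied by a consistency factor. The sketch below is the naive quadratic-time version, not the exact online algorithm developed in the thesis. It uses the usual asymptotic constant 2.2219 for normal data, omits finite-sample correction factors, and the function name `qn_scale` is hypothetical.

```python
from itertools import combinations

QN_CONSTANT = 2.2219  # asymptotic consistency factor for normally distributed data


def qn_scale(values):
    """Naive O(n^2) Qn scale estimate (no finite-sample correction)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    h = n // 2 + 1
    k = h * (h - 1) // 2  # rank of the order statistic to pick
    diffs = sorted(abs(a - b) for a, b in combinations(values, 2))
    return QN_CONSTANT * diffs[k - 1]


clean = [1, 2, 3, 4, 5]
contaminated = [1, 2, 3, 4, 1000]  # one gross outlier
print(qn_scale(clean), qn_scale(contaminated))
```

Both calls return the same value, because a single outlier only affects the largest pairwise differences and leaves the selected order statistic unchanged.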
For classification in genetic association studies, this thesis describes a Genetic Programming algorithm that outperforms the standard approaches on the considered data sets. It identifies interesting genetic factors not previously found in a data set on sporadic breast cancer, and it handles larger data sets than the compared methods. In addition, it can be extended to further application fields.
Fuelling the zero-emissions road freight of the future: routing of mobile fuellers
The future of zero-emissions road freight is closely tied to the sufficient availability of new and clean fuel options such as electricity and hydrogen. In goods distribution using Electric Commercial Vehicles (ECVs) and Hydrogen Fuel Cell Vehicles (HFCVs), a major challenge in the transition period is their limited autonomy, together with scarce and unevenly distributed refuelling stations. One viable way to facilitate and speed up the adoption of ECVs/HFCVs in logistics is to bring the fuel to the point where it is needed, instead of diverting delivery vehicles to refuelling stations, using "Mobile Fuellers (MFs)". These are mobile battery swapping/recharging vans or mobile hydrogen fuellers that travel to running ECVs/HFCVs, at an agreed rendezvous time and place, to provide the fuel the vehicles require to complete their delivery routes. In this presentation, new vehicle routing models will be presented for a third-party company that provides MF services. In the proposed problem variant, the MF provider receives the routing plans of multiple customer companies and has to design routes for a fleet of capacitated MFs that must synchronise with the running vehicles to deliver the required amount of fuel on the fly. The presentation will discuss and compare several mathematical models based on different business models and collaborative logistics scenarios.
Large space structures and systems in the space station era: A bibliography with indexes (supplement 04)
Bibliographies and abstracts are listed for 1211 reports, articles, and other documents introduced into the NASA scientific and technical information system between 1 Jul. and 30 Dec. 1991. The purpose of this bibliography is to provide helpful information to the researcher, manager, and designer in technology development and mission design according to system, interactive analysis and design, structural concepts and control systems, electronics, advanced materials, assembly concepts, propulsion, and solar power satellite systems.