Identification of gene-gene interactions for Alzheimer's disease using co-operative game theory
Thesis (Ph.D.)--Boston University. The multifactorial nature of Alzheimer's Disease suggests that complex gene-gene interactions are present in AD pathways. Contemporary approaches to detect such interactions in genome-wide data are mathematically and computationally challenging. We investigated gene-gene interactions for AD using a novel algorithm based on cooperative game theory in 15 genome-wide association study (GWAS) datasets comprising a total of 11,840 AD cases and 10,931 cognitively normal elderly controls from the Alzheimer Disease Genetics Consortium (ADGC). We adapted this approach, which was developed originally for solving multi-dimensional problems in economics and social sciences, to compute a Shapley value statistic to identify genetic markers that contribute most to coalitions of SNPs in predicting AD risk. Treating each GWAS dataset as an independent discovery dataset, markers were ranked according to their contribution to coalitions formed with other markers. Using a backward elimination strategy, markers with low Shapley values were eliminated and the statistic was recalculated iteratively. We tested all two-way interactions between top Shapley markers in regression models which included the two SNPs (main effects) and a term for their interaction. Models yielding a p-value<0.05 for the interaction term were evaluated in each of the other datasets and the results from all datasets were combined by meta-analysis. Statistically significant interactions were observed with multiple marker combinations in the APOE region. My analyses also revealed statistically strong interactions between markers in 6 regions: CTNNA3-ATP11A (p=4.1E-07), CSMD1-PRKCQ (p=3.5E-08), DCC-UNC5CL (p=5.9E-08), CNTNAP2-RFC3 (p=1.16E-07), AACS-TSHZ3 (p=2.64E-07) and CAMK4-MMD (p=3.3E-07). The Shapley value algorithm outperformed Chi-Square and ReliefF in detecting known interactions between APOE and GAB2 in a previously published GWAS dataset.
It was also more accurate than competing filtering methods in identifying simulated epistatic SNPs that are additive in nature, but its accuracy was low in identifying non-linear interactions. The game theory algorithm revealed strong interactions between markers in novel genes with weak main effects, which would have been overlooked if only markers with strong marginal association with AD were tested. This method will be a valuable tool for identifying gene-gene interactions for complex diseases and other traits
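The Shapley value statistic and the backward elimination step described in the abstract above can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the value function `toy_value`, the marker names `snp_a`, `snp_b` and `snp_c`, and the helper `backward_eliminate` are hypothetical, and the exact enumeration over all orderings used here is only practical for a handful of markers.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over every ordering in which the coalition can be assembled."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical value function: snp_a and snp_b only help together
    (an interaction); snp_c has a small effect on its own."""
    score = 0.0
    if "snp_a" in coalition and "snp_b" in coalition:
        score += 1.0
    if "snp_c" in coalition:
        score += 0.2
    return score


def backward_eliminate(players, value, keep):
    """Repeatedly drop the marker with the lowest Shapley value and
    recompute, until only `keep` markers remain."""
    players = list(players)
    while len(players) > keep:
        phi = shapley_values(players, value)
        players.remove(min(players, key=phi.get))
    return players


markers = ["snp_a", "snp_b", "snp_c"]
phi = shapley_values(markers, toy_value)
survivors = backward_eliminate(markers, toy_value, keep=2)
```

In this toy setting the two interacting markers receive equal credit even though neither has any effect alone, which is the property that lets a coalition-based statistic surface markers with weak main effects.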
Development of Gaussian Learning Algorithms for Early Detection of Alzheimer's Disease
Alzheimer’s disease (AD) is the most common form of dementia, affecting 10% of the population over the age of 65, and the growing costs of managing AD are estimated at $259 billion, according to data reported in 2017 by the Alzheimer's Association. Moreover, with cognitive decline, the daily lives of affected persons and their families are severely impacted. With an early diagnosis of AD or of its prodromal stage, mild cognitive impairment (MCI), early treatment may help patients preserve their quality of life and slow the progression of the disease, even though the underlying disease cannot be reversed or stopped. This research aims to develop Gaussian learning algorithms, natural language processing (NLP) techniques, and mathematical models to effectively delineate the MCI participants from the cognitively normal (CN) group, and identify the most significant brain regions and patterns of changes associated with the progression of AD. The focus will be placed on the earliest manifestations of the disease (early MCI or EMCI) to plan for effective curative/therapeutic interventions and protocols.
Multiple modalities of biomarkers have been found to be significantly sensitive in assessing the progression of AD. In this work, several novel multimodal classification frameworks based on proposed Gaussian Learning algorithms are created and applied to neuroimaging data. Classification based on the combination of structural magnetic resonance imaging (MRI), positron emission tomography (PET), and cerebrospinal fluid (CSF) biomarkers is seen as the most reliable approach for high-accuracy classification.
Additionally, changes in linguistic complexity may provide complementary information for the diagnosis and prognosis of AD. For this research endeavor, an NLP-oriented neuropsychological assessment is developed to automatically analyze the distinguishing characteristics of text data in the MCI group versus those in the CN group. Early findings suggest significant linguistic differences between CN and MCI subjects in terms of word usage, vocabulary, recall, and fragmented sentences.
In summary, the results obtained indicate a high potential for the neuroimaging-based classification and NLP-oriented assessment to be used as a practical computer-aided diagnosis system for classification and prediction of AD and its prodromal stages. Future work will ultimately focus on early signs of AD that could help in the planning of curative and therapeutic intervention to slow the progression of the disease
Identifying Multimodal Intermediate Phenotypes between Genetic Risk Factors and Disease Status in Alzheimer’s Disease
Neuroimaging genetics has attracted growing attention and interest as a
powerful strategy for examining the influence of genetic variants (e.g.,
single nucleotide polymorphisms (SNPs)) on the structure or function of the
human brain. In recent studies, univariate or multivariate
regression analysis methods are typically used to capture the effective
associations between genetic variants and quantitative traits (QTs) such as
brain imaging phenotypes. The identified imaging QTs, although associated with
certain genetic markers, may not be all disease specific. A useful, but
underexplored, scenario could be to discover only those QTs associated with both
genetic markers and disease status for revealing the chain from genotype to
phenotype to symptom. In addition, multimodal brain imaging phenotypes are
extracted from different perspectives, and imaging markers that show up
consistently across modalities may provide more insight into the mechanisms of
diseases (e.g., Alzheimer’s disease (AD)). In this work, we propose a
general framework to exploit multi-modal brain imaging phenotypes as
intermediate traits that bridge genetic risk factors and multi-class disease
status. We applied our proposed method to explore the relation between the
well-known AD risk SNP APOE rs429358 and three baseline brain
imaging modalities (i.e., structural magnetic resonance imaging (MRI),
fluorodeoxyglucose positron emission tomography (FDG-PET) and F-18 florbetapir
PET amyloid imaging (AV45)) from the Alzheimer’s Disease
Neuroimaging Initiative (ADNI) database. The empirical results demonstrate that
our proposed method not only helps improve the performance of imaging genetic
associations, but also discovers robust and consistent regions of interest
(ROIs) across modalities to guide the disease-induced interpretation
Current advances in systems and integrative biology
Systems biology has gained a tremendous amount of interest in the last few years. This is partly due to the realization that traditional approaches focusing only on a few molecules at a time cannot describe the impact of aberrant or modulated molecular environments across a whole system. Furthermore, a hypothesis-driven study aims to prove or disprove its postulations, whereas a hypothesis-free systems approach can yield an unbiased and novel testable hypothesis as an end-result. This latter approach foregoes assumptions which predict how a biological system should react to an altered microenvironment within a cellular context, across a tissue or impacting on distant organs. Additionally, re-use of existing data by systematic data mining and re-stratification, one of the cornerstones of integrative systems biology, is also gaining attention. While tremendous efforts using a systems methodology have already yielded excellent results, it is apparent that a lack of suitable analytic tools and purpose-built databases poses a major bottleneck in applying a systematic workflow. This review addresses the current approaches used in systems analysis and obstacles often encountered in large-scale data analysis and integration which tend to go unnoticed, but have a direct impact on the final outcome of a systems approach. Its wide applicability, ranging from basic research, disease descriptors, pharmacological studies, to personalized medicine, makes this emerging approach well suited to address biological and medical questions where conventional methods are not ideal
The Stylometric Processing of Sensory Open Source Data
The end goal of this research project concerns the Lone Wolf Terrorist.
The project uses an exploratory approach to the
self-radicalisation problem by creating a stylistic fingerprint
of a person's personality, or self, from subtle characteristics
hidden in a person's writing style. It separates the identity of
one person from another based on their writing style. It also
separates the writings of suicide attackers from 'normal'
bloggers using critical slowing down, a dynamical property used to
develop early warning signs of tipping points. It identifies
changes in a person's moods, or shifts from one state to another,
that might indicate a tipping point for self-radicalisation.
Research into authorship identity using personality is a
relatively new area in the field of neurolinguistics. There are
very few methods that model how an individual's cognitive
functions present themselves in writing. Here, we develop a
novel algorithm, RPAS, which draws on cognitive functions such as
aging, sensory processing, abstract or concrete thinking through
referential activity, emotional experiences, and a person's
internal gender for identity. We use well-known techniques such
as Principal Component Analysis, Linear Discriminant Analysis,
and the Vector Space Method to cluster multiple
anonymous-authored works. We also take a new approach, using
seriation with noise to separate subtle features in individuals.
We conduct time series analysis using modified variants of 1-lag
autocorrelation and the coefficient of skewness, two statistical
metrics that change near a tipping point, to track serious life
events in an individual through cognitive linguistic markers.
In our journey of discovery, we uncover secrets about the
Elizabethan playwrights hidden for over 400 years. We uncover
markers for depression and anxiety in modern-day writers and
identify linguistic cues for Alzheimer's disease much earlier
than other studies using sensory processing. Applying these
techniques to the Lone Wolf, we can distinguish the writing style
used before their attacks from other writing
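The two early-warning statistics named in the abstract above, lag-1 autocorrelation and the coefficient of skewness, can be sketched in a few lines of Python. The abstract refers to modified variants; the versions below are the standard textbook definitions, and the function names and the `rolling` helper are assumptions made for illustration only.

```python
def lag1_autocorrelation(xs):
    """Standard lag-1 autocorrelation of a sequence of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    if denom == 0:
        return 0.0  # constant series: define the statistic as zero
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return num / denom


def skewness(xs):
    """Coefficient of skewness: third central moment over variance^(3/2)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    if m2 == 0:
        return 0.0
    return m3 / m2 ** 1.5


def rolling(xs, window, stat):
    """Apply `stat` to each sliding window, giving one value per position."""
    return [stat(xs[i:i + window]) for i in range(len(xs) - window + 1)]


# Hypothetical per-document scores for one linguistic marker over time.
scores = [0.1, 0.3, 0.2, 0.4, 0.5, 0.7, 0.8, 1.1, 1.3, 1.8]
ac_trend = rolling(scores, 5, lag1_autocorrelation)
skew_trend = rolling(scores, 5, skewness)
```

A sustained rise in the windowed autocorrelation is the usual signature of critical slowing down; how such a trend should be thresholded is not stated in the abstract and is left open here.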
An introduction to time-resolved decoding analysis for M/EEG
The human brain is constantly processing and integrating information in order
to make decisions and interact with the world, for tasks from recognizing a
familiar face to playing a game of tennis. These complex cognitive processes
require communication between large populations of neurons. The non-invasive
neuroimaging methods of electroencephalography (EEG) and magnetoencephalography
(MEG) provide population measures of neural activity with millisecond precision
that allow us to study the temporal dynamics of cognitive processes. However,
multi-sensor M/EEG data is inherently high dimensional, making it difficult to
parse important signal from noise. Multivariate pattern analysis (MVPA) or
"decoding" methods offer vast potential for understanding high-dimensional
M/EEG neural data. MVPA can be used to distinguish between different conditions
and map the time courses of various neural processes, from basic sensory
processing to high-level cognitive processes. In this chapter, we discuss the
practical aspects of performing decoding analyses on M/EEG data as well as the
limitations of the method, and then we discuss some applications for
understanding representational dynamics in the human brain
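As a rough illustration of what decoding over time involves, the sketch below trains and tests a separate classifier at each time point and reports one accuracy per time point. It is a simplified stand-in and not the chapter's pipeline: the nearest-centroid classifier, the leave-one-out scheme, the function names and the tiny `epochs` array are all assumptions chosen to keep the example dependency-free.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class mean is closest to x (squared distance)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def decode_over_time(epochs, labels):
    """epochs[trial][time][sensor] -> leave-one-out accuracy per time point."""
    n_trials = len(epochs)
    n_times = len(epochs[0])
    accuracy = []
    for t in range(n_times):
        x = [epochs[i][t] for i in range(n_trials)]  # sensor pattern at time t
        correct = 0
        for i in range(n_trials):
            train_x = x[:i] + x[i + 1:]
            train_y = labels[:i] + labels[i + 1:]
            if nearest_centroid_predict(train_x, train_y, x[i]) == labels[i]:
                correct += 1
        accuracy.append(correct / n_trials)
    return accuracy


# Hypothetical data: 4 trials, 2 time points, 2 sensors. The two conditions
# differ only at the second time point.
epochs = [
    [[0.1, 0.0], [1.0, 0.0]],
    [[-0.1, 0.0], [1.2, 0.1]],
    [[-0.1, 0.0], [-1.0, 0.0]],
    [[0.1, 0.0], [-1.1, 0.2]],
]
labels = [0, 0, 1, 1]
accuracy = decode_over_time(epochs, labels)
```

Plotting `accuracy` against time gives the decoding time course; with real recordings a regularised linear classifier and k-fold cross-validation would be more typical choices.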
Optimizing Alzheimer's disease prediction using the nomadic people algorithm
The problem with using microarray technology to detect diseases is that not every gene is analytically necessary. The presence of non-essential gene data adds computational load to the detection method. Therefore, the purpose of this study is to reduce the size of the high-dimensional data by determining the most critical genes involved in Alzheimer's disease progression. The study also aims to predict patients using a subset of genes that cause Alzheimer's disease. This paper uses feature selection techniques, namely information gain (IG) and a novel metaheuristic optimization technique, a swarm algorithm derived from nomadic people’s behavior (NPO). The suggested method mirrors the way these people live, move, and search for new food sources. It is mostly based on a multi-swarm method: there are several clans, each seeking the best foraging opportunities. After the informative genes are selected, prediction is carried out with a support vector machine (SVM), which is frequently used in a variety of prediction tasks. Prediction accuracy was used to evaluate the suggested system's performance. The results indicate that the NPO algorithm with the SVM model returns high accuracy based on the gene subset from the IG and NPO methods
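The information gain (IG) filter mentioned in the abstract above can be sketched as follows. This covers only the IG ranking step, not the NPO search or the SVM; the discretised expression levels, the gene names and the helper `top_k_genes` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_genes(expression, labels, k):
    """Rank genes by information gain and keep the k most informative."""
    scores = {gene: information_gain(values, labels)
              for gene, values in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical discretised expression levels for four samples.
labels = ["AD", "AD", "control", "control"]
expression = {
    "gene_a": ["high", "high", "low", "low"],   # perfectly separates classes
    "gene_b": ["high", "low", "high", "low"],   # carries no class information
}
selected = top_k_genes(expression, labels, k=1)
```

Real microarray values are continuous, so a discretisation step would be needed before a filter of this form could be applied.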
Pattern recognition and machine learning for magnetic resonance images with kernel methods
The aim of this thesis is to apply a particular category of machine learning and
pattern recognition algorithms, namely the kernel methods, to both functional and
anatomical magnetic resonance images (MRI). This work specifically focused on
supervised learning methods. Both methodological and practical aspects are described
in this thesis.
Kernel methods have a computational advantage for high-dimensional data,
so they are ideal for imaging data. The procedures can be broadly divided into
two components: the construction of the kernels and the actual kernel algorithms
themselves. A linear or non-linear kernel can be computed from pre-processed
functional or anatomical images. We introduce both kernel regression and kernel
classification algorithms in two main categories: probabilistic methods and
non-probabilistic methods. For practical applications, kernel classification methods
were applied to decode the cognitive or sensory states of the subject from the fMRI
signal and were also applied to discriminate patients with neurological diseases from
normal people using anatomical MRI. Kernel regression methods were used to predict
the regressors in the design of fMRI experiments, and clinical ratings from the
anatomical scans
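The kernel construction step described above can be sketched in Python. The function names, the `gamma` parameter and the tiny two-value "images" are assumptions for illustration; the sketch shows a linear kernel and a Gaussian (RBF) kernel as one common non-linear choice, which may differ from the kernels used in the thesis.

```python
from math import exp


def linear_kernel(x, y):
    """Linear kernel: the dot product of two flattened images."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel, a common non-linear choice."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def gram_matrix(samples, kernel):
    """n-by-n matrix of pairwise kernel values between samples."""
    return [[kernel(a, b) for b in samples] for a in samples]


# Hypothetical images flattened to vectors (two voxels each for brevity).
images = [[1.0, 0.0], [0.0, 2.0]]
k_lin = gram_matrix(images, linear_kernel)
k_rbf = gram_matrix(images, rbf_kernel)
```

The kernel matrix has one row and column per subject regardless of how many voxels each image contains, which is where the computational advantage for high-dimensional data comes from.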
Evolutionary approaches for feature selection in biological data
Data mining techniques have been used widely in many areas such as business, science, engineering and medicine. The techniques allow a vast amount of data to be explored in order to extract useful information from the data. One of the foci in the health area is finding interesting biomarkers from biomedical data. High-throughput data generated from microarrays and mass spectrometry of biological samples are high dimensional and small in sample size. Examples include DNA microarray datasets with up to 500,000 genes and mass spectrometry data with 300,000 m/z values. While the availability of such datasets can aid in the development of techniques/drugs to improve diagnosis and treatment of diseases, a major challenge lies in analysing them to extract useful and meaningful information. The aims of this project are: 1) to investigate and develop feature selection algorithms that incorporate various evolutionary strategies, 2) to use the developed algorithms to find the “most relevant” biomarkers contained in biological datasets, and 3) to evaluate the goodness of extracted feature subsets for relevance (examined in terms of existing biomedical domain knowledge and from classification accuracy obtained using different classifiers). The project aims to generate good predictive models for classifying diseased samples from control