Apart from diseases caused by the defect of a single gene, most diseases are highly complex
and are usually caused by a combination of biological and environmental factors. In the
biological context, cellular processes are often tightly connected across molecular layers of the
central dogma of biology, so examining a single layer is often insufficient to explain
disease pathology and limits the conclusions that can be drawn. Combining biological
observations from multiple layers or angles can greatly broaden our perspective on the
disease of interest and may lead to novel discoveries that could not be deduced
from a single-omics perspective. In this thesis, we focused on three aims: method development
for single-cell transcriptomics to address the prime bias problem introduced by the new
droplet-based technologies; integrative omics discovery of genomic signatures specific to
different brain regions in normal individuals; and the use of multiple omics to identify
potential biomarkers specific to the prognosis and diagnosis of amyotrophic lateral sclerosis
(ALS).
Research has been revolutionized by the advent of single-cell omics technologies in the past
few decades, and new methods and tools have been developed to keep pace with this
acceleration. These innovations, however, pose new challenges and could introduce bias and
unforeseen complications if left unaddressed. Specifically, to
resolve the prime bias problem introduced by the currently popular droplet-based single-cell
sequencing technologies, which may lead to biased quantification, in Study I we presented a
novel transcript quantification tool for droplet-based single-cell RNA-Sequencing (scRNA-Seq)
technologies and benchmarked it against other popular transcript and gene quantification
tools. Our tool outperformed these tools in both transcript- and gene-level
quantification.
In Study II, we investigated the association of splicing variants with the genetic patterns from
different regions of the brain in normal individuals to identify quantitative trait loci (QTL)
associated with the ratios of isoform expression within genes. We carried out genome-wide
association studies (GWAS) on isoform ratios from 13 brain regions and identified isoform-ratio
QTL (irQTL) specific to each brain region, together with their associated traits, which could
have been missed by expression QTL derived from gene expression.
In Study III, we further explored the use of proteomics and genomics data in ALS to
understand disease pathology from multiple perspectives and to identify potential
protein biomarkers and protein QTL (pQTL) specific to different stages of the disease and
tissue sites. In terms of proteomics, for each tissue site, we identified potential protein
biomarkers specific to disease prognosis, survival of ALS patients, the functional decline
among ALS patients, and longitudinal changes after disease diagnosis. In terms of integrative
omics, we performed GWAS of protein expression with genotyping data and identified
tissue-site-specific pQTL signatures for ALS patients.
All in all, our studies developed a single-cell transcript quantification tool that addresses
potential bias problems with improved performance; identified novel irQTL signatures
specific to various brain regions using an integrative omics approach; and discovered
potential protein and genetic signatures for different tissue sites and pathological
stages in ALS using multiple omics. We hope our work can enhance omics research in terms of
methods development, and that the novel signatures can serve as valuable resources for
fostering further research ideas and potential experimental validations.