Data Discovery and Anomaly Detection Using Atypicality: Theory
A central question in the era of 'big data' is what to do with the enormous
amount of information. One possibility is to characterize it through
statistics, e.g., averages, or classify it using machine learning, in order to
understand the general structure of the overall data. The perspective in this
paper is the opposite, namely that most of the value in the information in some
applications is in the parts that deviate from the average, that are unusual,
atypical. We define what we mean by 'atypical' in an axiomatic way as data that
can be encoded with fewer bits in itself rather than using the code for the
typical data. We show that this definition has good theoretical properties. We
then develop an implementation based on universal source coding, and apply this
to a number of real-world data sets. Comment: 40 pages
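The atypicality criterion described in the abstract above can be illustrated with a minimal sketch for binary sequences. This is not the paper's implementation. It assumes the typical data are i.i.d. bits with a known probability `p_one`, and it approximates a universal code by an empirical-entropy code plus a `0.5 * log2(n)` parameter cost and a hypothetical fixed `overhead` for signalling that a segment is coded on its own. All function names are illustrative.

```python
import math


def typical_bits(bits, p_one):
    """Code length (in bits) of the sequence under the assumed typical model."""
    ones = sum(bits)
    zeros = len(bits) - ones
    return -(ones * math.log2(p_one) + zeros * math.log2(1 - p_one))


def self_coded_bits(bits, overhead=1.0):
    """Approximate code length when the sequence is encoded 'in itself'.

    Assumption: empirical entropy + 0.5*log2(n) for the estimated parameter
    + a fixed overhead (hypothetical) marking the segment as separately coded.
    """
    n = len(bits)
    p_hat = sum(bits) / n
    if p_hat in (0.0, 1.0):
        entropy = 0.0
    else:
        entropy = -(p_hat * math.log2(p_hat) + (1 - p_hat) * math.log2(1 - p_hat))
    return n * entropy + 0.5 * math.log2(n) + overhead


def is_atypical(bits, p_one=0.5, overhead=1.0):
    """A sequence is flagged atypical if coding it in itself needs fewer bits."""
    return self_coded_bits(bits, overhead) < typical_bits(bits, p_one)


constant_run = [1] * 64        # highly structured relative to fair coin flips
alternating = [0, 1] * 32      # empirical frequency matches the typical model
print(is_atypical(constant_run), is_atypical(alternating))
```

Under these assumptions, a long constant run is cheaper to encode in itself than with the fair-coin code and is flagged, while a sequence whose bit frequency matches the typical model is not, because the self-coding overhead outweighs any saving.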
Data Discovery and Anomaly Detection using Atypicality.
Ph.D. Thesis. University of Hawaiʻi at Mānoa 2017
An investigation into the requirements for an efficient image transmission system over an ATM network
This thesis looks into the problems arising in an image transmission system when
transmitting over an ATM network. Two main areas were investigated: (i) an
alternative coding technique to reduce the bit rate required; and (ii) concealment of
errors due to cell loss, with emphasis on processing in the transform domain of
DCT-based images. [Continues.]
Mechanism of PhosphoThreonine/Serine Recognition and Specificity for Modular Domains from All-atom Molecular Dynamics
Background: Phosphopeptide-binding domains mediate many vital cellular processes such as signal transduction and protein recognition. We studied three well-known domains important for signal transduction: BRCT repeats, the WW domain and the forkhead-associated (FHA) domain. The first two recognize both phosphothreonine (pThr) and phosphoserine (pSer) residues, but FHA has high specificity for pThr residues. Here we used molecular dynamics (MD) simulations to reveal how FHA exclusively selects pThr and how BRCT and WW recognize both pThr and pSer. The work also investigated the energetics and thermodynamics of the intermolecular interactions.
Results: The simulations included wild-type and mutated systems. Through analysis of the MD simulations, we found that the conserved His residue defines the dual-loop feature of the FHA domain, which creates a small cavity reserved for only the methyl group of pThr. These well-organized loop interactions are directly responsible for the pThr binding selectivity, whereas a single loop (the second phosphobinding site of FHA), alone or in combination with an α-helix (BRCT repeats) or a β-sheet (WW domain), fails to differentiate pThr from pSer.
Conclusions: Understanding the domain pre-organization constructed by conserved residues and the driving force of domain-phosphopeptide recognition provides structural insight into pThr-specific binding, which also helps in engineering proteins and designing peptide inhibitors.
Unravelling genetic predisposition to familial breast and ovarian cancer: new susceptibility genes and variant interpretation by in silico approaches
Doctoral Programme in Biomedicine / Thesis carried out at the Vall d'Hebron Institute of Oncology (VHIO). Patients with hereditary breast and ovarian cancer (HBOC) in whom a causative pathogenic variant is not identified after genetic analysis may not benefit from prevention, early detection, or precision treatment measures. These negative or inconclusive results are due, among other causes, to the detection of variants of uncertain significance (VUS). The main objective of this thesis is to increase the capacity for genetic diagnosis of patients with HBOC by focusing on i) optimising the interpretation of exonic and intronic variants that might affect RNA quality or quantity but remain classified as VUS and ii) the identification of new susceptibility genes for HBOC.
The article included in this thesis, Moles-Fernández et al., 2018 (DOI: 10.3389/fgene.2018.00366), describes an optimization in the identification of potentially spliceogenic variants located near splice sites and provides recommendations for analysing donor and acceptor sites. Moreover, the creation or activation of cryptic sites in deep intronic regions could alter splicing, causing the inclusion of intronic sequences in RNA. In the article Moles-Fernández et al., 2021 (DOI: 10.3390/cancers13133341), a framework for the identification of deep intronic spliceogenic variants is provided, following a performance analysis of the in silico tool SpliceAI on a dataset of spliceogenic and non-spliceogenic deep intronic variants. In addition, the importance of the balance of splicing regulatory elements in pseudoexon creation is described.
The American College of Medical Genetics and Genomics (ACMG) variant interpretation guidelines provide general recommendations for classifying variants. In the included article Feliubadalò et al., 2021 (DOI: 10.1093/clinchem/hvaa250), the ACMG guidelines were adapted to the ATM gene. We focused on in silico splicing evidence (PP3/BP4). After reclassification of variants following the adapted guidelines, a reduction in the number of VUS was obtained.
On the other hand, in patients without pathogenic variants identified in HBOC-related genes, the phenotype could be due to deleterious variants in genes not yet known to be associated with the disease. For this reason, in Moles-Fernández et al. (article in preparation), the aim was to identify candidate genes through exome and extended panel analysis and to validate their risk association by performing a case-control study. The significant identification of loss-of-function variants in the ALKBH3, BLM, CAMKK1, FANCD2, FANCM, NEIL3, PER1, RBL1, RECQL4, WRN and XRCC4 genes in patients with HBOC suggests that they might be breast/ovarian cancer susceptibility genes.
Modeling signal transduction pathways and their transcriptional response
This thesis is concerned with revealing regulation of gene expression. The basic motivation behind our work is that gene regulation can be better resolved when analyzed in a cellular context of the upstream signaling pathway and known regulatory targets. Our source of data are perturbation experiments, which are performed on pathway components and induce changes in gene expression. In such a way, they connect the signaling pathway to its downstream target genes. This chapter starts with an introduction to the cellular context considered in the thesis (section 1.1) and the principles of perturbation experiments (section 1.2). We end with a concise summary of three approaches that comprise this thesis. The approaches tackle various problems in the process of revealing context-specific regulatory networks (section 1.3). We deal with differential expression analysis of the perturbation data, enhanced with known transcription factor targets serving as examples of differential genes (chapter 2), pathway model-based planning of informative perturbation experiments (chapter 3), and finally, with deregulation analysis, i.e., comparing changes in gene regulation between two different cell populations (chapter 4).