
    Compensation methods to support cooperative applications: A case study in automated verification of schema requirements for an advanced transaction model

    Compensation plays an important role in advanced transaction models, cooperative work and workflow systems. A schema designer is typically required to supply, for each transaction, another transaction that semantically undoes its effects. Little attention has been paid, however, to verifying the desirable properties of such operations. This paper demonstrates the use of a higher-order logic theorem prover for verifying that compensating transactions return a database to its original state. It is shown how an OODB schema is translated to the language of the theorem prover so that proofs can be performed on the compensating transactions.
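    The property described above (a transaction followed by its compensating transaction leaves the database unchanged) can be illustrated with a small runtime check. This is only a sketch under stated assumptions: the paper proves the property with a higher-order logic theorem prover over an OODB schema, whereas the code below merely tests it on one example state. The dictionary "database" and the names `transfer`, `compensate_transfer` and `restores_original_state` are hypothetical.

```python
import copy

def transfer(db, src, dst, amount):
    """Hypothetical forward transaction: move `amount` from src to dst."""
    db[src] -= amount
    db[dst] += amount

def compensate_transfer(db, src, dst, amount):
    """Hypothetical compensating transaction: move the amount back."""
    transfer(db, dst, src, amount)

def restores_original_state(db, forward, compensate, *args):
    """Run forward then compensate; report whether db equals its initial state."""
    before = copy.deepcopy(db)
    forward(db, *args)
    compensate(db, *args)
    return db == before

accounts = {"alice": 100, "bob": 50}
ok = restores_original_state(accounts, transfer, compensate_transfer, "alice", "bob", 30)
```

    A check like this covers a single state and argument set; a theorem prover, as used in the paper, establishes the property for all states.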

    Advancing Biomedicine with Graph Representation Learning: Recent Progress, Challenges, and Future Directions

    Graph representation learning (GRL) has emerged as a pivotal field that has contributed significantly to breakthroughs in various fields, including biomedicine. The objective of this survey is to review the latest advancements in GRL methods and their applications in the biomedical field. We also highlight key challenges currently faced by GRL and outline potential directions for future research. Comment: Accepted by the 2023 IMIA Yearbook of Medical Informatics.

    Clustering of Cases from Different Subtypes of Breast Cancer Using a Hopfield Network Built from Multi-omic Data

    Graduation thesis (Master's in Computing), Instituto Tecnológico de Costa Rica, Escuela de Computación, 2018. Despite scientific advances, breast cancer remains a major cause of death among women worldwide. Given the great heterogeneity between cases, distinct classification schemes have emerged. The intrinsic molecular subtype classification (luminal A, luminal B, HER2-enriched and basal-like) accounts for the molecular characteristics and prognosis of tumors, which provides valuable input for choosing optimal treatment. Recent advances in molecular biology have also provided scientists with high-quality, diverse omic-like data, opening up the possibility of creating computational models for improving and validating current subtyping systems. In this study, a Hopfield Network model for breast cancer subtyping and characterization was created using data from The Cancer Genome Atlas repository. Novel aspects include the use of the network as a clustering mechanism and the integrated use of several molecular data types (gene mRNA expression, miRNA expression and copy number variation). The results showed clustering capabilities for the network, but even so, deriving a biological model from a Hopfield Network might be difficult given the mirror attractor phenomenon (every cluster might end up with an opposite). As a methodological aspect, Hopfield was compared with the k-means and OPTICS clustering algorithms. The latter, surprisingly, hints at the possibility of creating a high-precision model that differentiates between luminal, HER2-enriched and basal samples using only 10 genes. The normalization procedure of dividing gene expression values by their corresponding gene copy number appears to have contributed to the results. This opens up the possibility of exploring this kind of prediction model for implementing diagnostic tests at a lower cost.
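    The mirror attractor phenomenon mentioned above can be seen in a toy Hopfield network: with Hebbian weights and sign updates, the negation of every stored pattern is also a stable state. The following is a minimal sketch, assuming bipolar (+1/-1) states, Hebbian learning and synchronous updates; the patterns are invented for illustration and the thesis's actual network and data are not reproduced here.

```python
def train_hebbian(patterns):
    """Build Hebbian weights (zero diagonal) from bipolar patterns."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def step(w, state):
    """One synchronous update: each unit takes the sign of its local field."""
    new_state = []
    for i, row in enumerate(w):
        field = sum(wij * sj for wij, sj in zip(row, state))
        # A zero field leaves the unit unchanged.
        new_state.append(1 if field > 0 else -1 if field < 0 else state[i])
    return new_state

def is_fixed_point(w, state):
    """True when an update leaves the state unchanged (an attractor)."""
    return step(w, state) == state

# Two orthogonal toy patterns (illustrative only, not real omics data).
patterns = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train_hebbian(patterns)
mirrors = [[-x for x in p] for p in patterns]

stored_stable = all(is_fixed_point(weights, p) for p in patterns)
mirrors_stable = all(is_fixed_point(weights, m) for m in mirrors)
```

    Because the update rule is odd in the state (flipping every unit flips every local field), each stored pattern and its mirror are attractors together, which is why a cluster may come paired with an "opposite" cluster.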

    Programming Languages and Systems

    This open access book constitutes the proceedings of the 30th European Symposium on Programming, ESOP 2021, which was held from March 27 to April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The 24 papers included in this volume were carefully reviewed and selected from 79 submissions. They deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems.

    Predicting Rules for Cancer Subtype Classification using Grammar-Based Genetic Programming on various Genomic Data Types

    With the advent of high-throughput methods, more genomic data than ever has been generated during the past decade. As these technologies remain cost-intensive and not worthwhile for every research group, databases such as the TCGA and Firebrowse have emerged. While these databases enable fast and free access to massive amounts of genomic data, they also pose new challenges to the research community. This study investigates methods to obtain, normalize and process genomic data for computer-aided decision making in the field of cancer subtype discovery. A new software package, termed FirebrowseR, is introduced, allowing the direct download of genomic data sets into the R programming environment. To pre-process the obtained data, a set of methods is introduced, enabling data-type-specific normalization. As a proof of principle, the Web-TCGA software is created, enabling fast data analysis. To explore cancer subtypes, a statistical model, the EDL, is introduced. The newly developed method is designed to provide highly precise, yet interpretable models. The EDL is tested on well-established data sets, and its performance is compared to state-of-the-art machine learning algorithms. As a proof of principle, the EDL was run on a cohort of 1,000 breast cancer patients, where it reliably re-identified the known subtypes and automatically selected the corresponding marker genes by which the subtypes are defined. In addition, novel patterns of alterations in well-known marker genes could be identified to distinguish primary and mCRPC samples. The findings suggest that mCRPC is characterized by a unique amplification of the Androgen Receptor, while a significant fraction of primary samples is described by a loss of heterozygosity of TP53 and NCOR1.

    Taxometric Analysis of Negative Symptoms in An International Sample of Ten Countries

    Negative symptoms have emerged as a replicable factor of symptomatology within schizophrenia. Although rating scales provide assessments along dimensions of severity, categorization of a negative symptom subtype is typically concluded. Despite an accumulation of findings that support categorical conceptualization, the data are also consistent with a dimensional-only model where negative symptom subtypologies simply reflect an extreme on a continuum of severity. Previous studies (Blanchard et al., 2005) have used taxometric statistical methods to confirm the existence of a negative symptom subtype; however, the nature of taxometric methods requires replication (Waller & Meehl, 1998). The current investigation is a taxometric analysis of the World Health Organization Ten-Country Study of Schizophrenia. Data from a subset of 694 individuals were analyzed using the taxometric methods of maximum covariance analysis (MAXCOV) and mean above minus below a cut (MAMBAC), and a latent class with a base rate of approximately .14 to .16 was identified.
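    As the name suggests, MAMBAC orders cases by one indicator, slides a cut along that ordering, and at each cut records the mean of a second indicator above the cut minus its mean below. The sketch below is a simplified illustration under assumptions: the function name, the `min_tail` parameter and the toy data are hypothetical, and it omits the smoothing, indicator pairing and base-rate estimation used in real taxometric analyses such as the one in this study.

```python
def mambac_curve(input_scores, output_scores, min_tail=2):
    """Mean of the output indicator above each cut minus the mean below it.

    Cases are sorted by the input indicator; a cut at position k puts the
    k lowest cases below and the rest above. `min_tail` keeps at least that
    many cases on each side of every cut.
    """
    pairs = sorted(zip(input_scores, output_scores))
    ys = [y for _, y in pairs]
    n = len(ys)
    curve = []
    for k in range(min_tail, n - min_tail + 1):
        below = ys[:k]
        above = ys[k:]
        curve.append(sum(above) / len(above) - sum(below) / len(below))
    return curve

# Toy data: the output indicator jumps between the 4th and 5th sorted case.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]
curve = mambac_curve(x, y)
peak_index = curve.index(max(curve))
```

    In this toy example the curve peaks at the cut that separates the two groups; in taxometric practice, a peaked curve is read as evidence for a latent class and a flat or dish-shaped curve as evidence for a dimension.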

    Cancer Subtyping Detection using Biomarker Discovery in Multi-Omics Tensor Datasets

    This thesis begins with a thorough review of research trends from 2015 to 2022, examining the challenges and issues related to biomarker discovery in multi-omics datasets. The review covers areas of application, proposed methodologies, evaluation criteria used to assess performance, as well as limitations and drawbacks that require further investigation and improvement. This comprehensive overview serves to provide a deeper understanding of the current state of research in this field and the opportunities for future research. It will be particularly useful for those who are interested in this area of study and seeking to expand their knowledge. In the second part of this thesis, a novel methodology is proposed for the identification of significant biomarkers in a multi-omics colon cancer dataset. The integration of clinical features with biomarker discovery has the potential to facilitate the early identification of mortality risk and the development of personalized therapies for a range of diseases, including cancer and stroke. Recent advancements in “omics” technologies have opened up new avenues for researchers to identify disease biomarkers through system-level analysis. Machine learning methods, particularly those based on tensor decomposition techniques, have gained popularity due to the challenges associated with integrative analysis of multi-omics data owing to the complexity of biological systems. Despite extensive efforts towards discovering disease-associated biomolecules by analyzing data from various “omics” experiments, such as genomics, transcriptomics, and metabolomics, the poor integration of diverse forms of “omics” data has made the integrative analysis of multi-omics data a daunting task. Our research includes ANOVA simultaneous component analysis (ASCA) and Tucker3 modeling to analyze a multivariate dataset with an underlying experimental design.
By comparing the spaces spanned by different model components, we showed how the two methods can be used for confirmatory analysis and provide complementary information. We demonstrated the novel use of ASCA to analyze the residuals of Tucker3 models to find the optimum one. Increasing the model complexity to more factors removed the last remaining ASCA-detectable structure in the residuals. Bootstrap analysis of the core matrix values of the Tucker3 models was used to check that additional triads of eigenvectors were needed to describe the remaining structure in the residuals. We also developed a simple, novel strategy for aligning Tucker3 bootstrap models with the Tucker3 model of the original data so that the eigenvectors of the three modes, the order of the values in the core matrix, and their algebraic signs match the original Tucker3 model without the need for complicated bookkeeping strategies or rotational transformations. Additionally, to avoid an overparameterized Tucker3 model, we used the bootstrap method to determine 95% confidence intervals of the loadings and core values. Important variables for classification were identified by inspection of loading confidence intervals. The experimental results obtained using the colon cancer dataset demonstrate that our proposed methodology is effective in improving the performance of biomarker discovery in a multi-omics cancer dataset. Overall, our study highlights the potential of integrating multi-omics data with machine learning methods to gain deeper insights into the complex biological mechanisms underlying cancer and other diseases. The experimental results using the NIH colon cancer dataset demonstrate that the successful application of our proposed methodology in cancer subtype classification provides a foundation for further investigation into its utility in other disease areas.
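    The 95% bootstrap confidence intervals mentioned above can be illustrated with a generic percentile bootstrap. This is a sketch under stated assumptions, not the thesis's procedure: it resamples a plain list of numbers and recomputes a simple statistic, whereas the thesis refits and aligns Tucker3 models on each bootstrap sample. The function name, parameters and data are hypothetical.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def percentile_bootstrap_ci(values, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `statistic` at level 1 - alpha."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo_idx = round((alpha / 2) * n_resamples)
    hi_idx = round((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]

# Illustrative values only (not taken from any dataset in the thesis).
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
low, high = percentile_bootstrap_ci(data, mean)
```

    One common reading of such intervals, in the spirit of the abstract's variable selection step, is that a loading whose interval excludes zero is treated as important, while one whose interval spans zero is a candidate for removal.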