
    Artificial Intelligence in Oncology Drug Discovery and Development

    There exists a profound conflict at the heart of oncology drug development. The efficiency of the drug development process is falling, leading to higher costs per approved drug, while at the same time personalised medicine is shrinking the target market of each new medicine. Even as the global economic burden of cancer increases, the current paradigm in drug development is unsustainable. In this book, we discuss the development of machine learning techniques for improving the efficiency of oncology drug development and delivering cost-effective precision treatment. We consider how to structure data for drug repurposing and target identification, how to improve clinical trials, and how patients may view artificial intelligence.

    Study of the Effectiveness of Selection Criteria in the Primary Feature Selection Algorithm for Liver Pathology Classification

    Master's dissertation on "Study of the effectiveness of selection criteria in the primary feature selection algorithm for liver pathology classification", completed by Kateryna Mykolaivna Kozhara, a student of the Department of Biomedical Cybernetics, FBMI, in specialty 122 "Computer Science" under the educational and professional program "Computer Technology in Biology and Medicine". The work consists of an introduction; four chapters (subject area analysis, image reconstruction methods, problem statement, analysis of selection effectiveness); a startup project chapter; conclusions to each chapter; general conclusions; and a list of 30 references. Total length: 95 pages, 35 illustrations, 30 references. Relevance of the topic: diagnosing liver disease at early stages helps to assess the patient's condition more accurately and to choose the best possible treatment strategy. Objective of the study: finding the optimal feature selection variant for effectively performing the binary "norm vs pathology" classification task in diffuse liver disease. Object of study: ultrasound images of the liver. Subject of study: the effectiveness of selection criteria in the primary feature selection algorithm for liver pathology classification. Research methods: feature selection by the criteria of within-class variance, between-class variance, the ratio of within-class to between-class variance, and correlation-based feature selection. Research tools: Python, Anaconda, Jupyter Notebook.
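The four filter criteria named in the abstract are straightforward to compute. The sketch below is illustrative only, not the dissertation's actual code; the function name and the 1e-12 smoothing constant are my own choices:

```python
import numpy as np

def selection_criteria(X, y):
    """Score each feature of X (n_samples, n_features) for a binary
    norm/pathology label y using four simple filter criteria."""
    X, y = np.asarray(X, float), np.asarray(y)
    mu = X.mean(axis=0)
    classes = np.unique(y)
    # Within-class variance: average variance inside each class.
    within = np.mean([X[y == c].var(axis=0) for c in classes], axis=0)
    # Between-class variance: spread of the class means around the global mean.
    between = np.mean([(X[y == c].mean(axis=0) - mu) ** 2 for c in classes], axis=0)
    # Ratio of the two: larger means better class separation.
    ratio = between / (within + 1e-12)
    # Absolute Pearson correlation of each feature with the label.
    yc = y - y.mean()
    corr = np.abs((X - mu).T @ yc) / (
        np.sqrt(((X - mu) ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12)
    return {"within": within, "between": between, "ratio": ratio, "corr": corr}
```

Features would then be ranked by the chosen criterion (e.g. descending `ratio` or `corr`) and the top-ranked ones passed to the classifier.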

    An Integrated, Module-based Biomarker Discovery Framework

    Identification of biomarkers that contribute to complex human disorders is a principal and challenging task in computational biology. Prognostic biomarkers are useful for risk assessment of disease progression and patient stratification. Since treatment plans often hinge on patient stratification, better disease subtyping has the potential to significantly improve survival for patients. Additionally, a thorough understanding of the roles of biomarkers in cancer pathways facilitates insights into complex disease formation and provides potential druggable targets in the pathways. Many statistical methods have been applied toward biomarker discovery, often combining feature selection with classification methods. Traditional approaches are mainly concerned with statistical significance and fail to consider the clinical relevance of the selected biomarkers. Two additional problems impede meaningful biomarker discovery: gene multiplicity (several maximally predictive solutions exist) and instability (inconsistent gene sets from different experiments or cross-validation runs). Motivated by the need for a more biologically informed, stable biomarker discovery method, I introduce an integrated module-based biomarker discovery framework for analyzing high-throughput genomic disease data. The proposed framework addresses the aforementioned challenges in three components. First, a recursive spectral clustering algorithm specifically tailored toward high-dimensional, heterogeneous data (ReKS) is developed to partition genes into clusters that are treated as single entities for subsequent analysis. Next, the problems of gene multiplicity and instability are addressed through a group variable selection algorithm (T-ReCS) based on local causal discovery methods. Guided by the tree-like partition created by the clustering algorithm, this algorithm selects gene clusters that are predictive of a clinical outcome. We demonstrate that the group feature selection method facilitates the discovery of biologically relevant genes through their association with a statistically predictive driver. Finally, we elucidate the biological relevance of the biomarkers by leveraging available prior information to identify regulatory relationships between genes and between clusters, and deliver the information in the form of a user-friendly web server, mirConnX.
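The cluster-then-select idea can be caricatured in a few lines. The sketch below is a toy stand-in, not the ReKS or T-ReCS algorithms: it groups genes by a naive k-means on their correlation profiles and ranks each cluster by the correlation of its mean expression with the outcome; all names and defaults are hypothetical.

```python
import numpy as np

def cluster_then_select(X, y, n_clusters=5, seed=0):
    """Toy module-based selection: partition the genes (columns of X) by
    k-means on their gene-gene correlation profiles, summarise each cluster
    by its mean expression, and score clusters against the outcome y."""
    rng = np.random.default_rng(seed)
    G = np.corrcoef(X.T)  # gene-gene correlation matrix (features x features)
    # Naive k-means on the rows of the correlation matrix.
    centers = G[rng.choice(len(G), n_clusters, replace=False)]
    for _ in range(20):
        labels = np.argmin(((G[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([G[labels == k].mean(0) if (labels == k).any()
                            else centers[k] for k in range(n_clusters)])
    # Score each cluster by |corr| of its mean expression with the outcome.
    scores = np.array([abs(np.corrcoef(X[:, labels == k].mean(1), y)[0, 1])
                       if (labels == k).any() else 0.0
                       for k in range(n_clusters)])
    return labels, scores
```

The real framework replaces the k-means step with recursive spectral clustering and the correlation score with local causal discovery, but the division of labour (partition first, then select whole clusters) is the same.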

    Proceedings of the 35th International Workshop on Statistical Modelling: July 20-24, 2020, Bilbao, Basque Country, Spain

    466 p. The International Workshop on Statistical Modelling (IWSM) is a reference workshop for promoting statistical modelling and applications of statistics for researchers, academics and industrialists in a broad sense. Unfortunately, the global COVID-19 pandemic did not allow holding the 35th edition of the IWSM in Bilbao in July 2020. Despite the situation, and following the spirit of the Workshop and the Statistical Modelling Society, we are delighted to bring you this proceedings book of extended abstracts.

    Learning by Fusing Heterogeneous Data

    It has become increasingly common in science and technology to gather data about systems at different levels of granularity or from different perspectives. This often gives rise to data that are represented in totally different input spaces. A basic premise behind the study of learning from heterogeneous data is that in many such cases, there exists some correspondence among certain input dimensions of different input spaces. In our work we found that a key bottleneck that prevents us from better understanding and truly fusing heterogeneous data at large scales is identifying the kind of knowledge that can be transferred between related data views, entities and tasks. We develop interesting and accurate data fusion methods for predictive modeling, which reduce or entirely eliminate some of the basic feature engineering steps that were needed in the past when inferring prediction models from disparate data. In addition, our work has a wide range of applications of which we focus on those from molecular and systems biology: it can help us predict gene functions, forecast pharmacological actions of small chemicals, prioritize genes for further studies, mine disease associations, detect drug toxicity and regress cancer patient survival data. Another important aspect of our research is the study of latent factor models. We aim to design latent models with factorized parameters that simultaneously tackle multiple types of data heterogeneity, where data diversity spans across heterogeneous input spaces, multiple types of features, and a variety of related prediction tasks. Our algorithms are capable of retaining the relational structure of a data system during model inference, which turns out to be vital for good performance of data fusion in certain applications. Our recent work included the study of network inference from many potentially nonidentical data distributions and its application to cancer genomic data. 
We also model epistasis, an important concept in genetics, and propose algorithms to efficiently find the ordering of genes in cellular pathways. A central topic of our thesis is also the analysis of large data compendia, as predictions about certain phenomena, such as associations between diseases or the involvement of genes in a certain phenotype, are only possible when dealing with large amounts of data. Among others, we analyze 30 heterogeneous data sets to assess drug toxicity and over 40 human gene association data collections, the largest number of data sets considered by a collective latent factor model to date. We also make observations about deciding which data should be considered for fusion and develop a generic approach that can estimate the sensitivities between different data sets.
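The collective latent factor idea, in which several relation matrices share a factor so that information flows between data views, can be sketched minimally. This is a generic gradient-descent illustration under assumptions of my own (two views, squared loss, no regularisation), not the thesis's exact model:

```python
import numpy as np

def fuse_two_relations(R12, R13, rank=4, iters=200, lr=0.01, seed=0):
    """Fuse two relation matrices, e.g. R12 (genes x diseases) and
    R13 (genes x drugs), through a shared gene factor G: both views
    contribute to the gradient of G, so they inform each other."""
    rng = np.random.default_rng(seed)
    n1, n2 = R12.shape
    _, n3 = R13.shape
    G = rng.normal(scale=0.1, size=(n1, rank))  # shared gene factor
    D = rng.normal(scale=0.1, size=(n2, rank))  # disease factor
    T = rng.normal(scale=0.1, size=(n3, rank))  # drug factor
    for _ in range(iters):
        E12 = G @ D.T - R12            # residuals of each reconstruction
        E13 = G @ T.T - R13
        G -= lr * (E12 @ D + E13 @ T)  # shared factor sees both views
        D -= lr * E12.T @ G
        T -= lr * E13.T @ G
    return G, D, T
```

Because `G` is updated with gradients from both residuals, a pattern present only in the drug view can sharpen disease predictions and vice versa, which is the essence of intermediate data fusion.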

    A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium

    When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation and, in addition, this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available.
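For reference, the classical proper CAR model places a Gaussian prior on the spatial effect with precision matrix Q = tau * (D - rho * W). A minimal sketch follows; the four-site line-graph adjacency is a hypothetical example of my own:

```python
import numpy as np

def car_precision(W, rho, tau=1.0):
    """Precision matrix of a proper CAR model, Q = tau * (D - rho * W),
    where W is a symmetric 0/1 adjacency matrix of the sites and D holds
    the neighbor counts on its diagonal. For |rho| < 1, Q is positive
    definite, so it defines a valid Gaussian spatial random effect."""
    W = np.asarray(W, float)
    D = np.diag(W.sum(axis=1))
    return tau * (D - rho * W)

# Example: 4 sites on a line (1-2-3-4).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
Q = car_precision(W, rho=0.9)
```

Note that in the CAR parameterisation rho lacks the direct "average neighbor pair correlation" reading that the abstract attributes to the DAGAR model; that interpretability is precisely the DAGAR model's advantage.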