220 research outputs found

    Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal

    Machine learning liver-injuring drug interactions with non-steroidal anti-inflammatory drugs (NSAIDs) from a retrospective electronic health record (EHR) cohort

    Drug-drug interactions account for up to 30% of adverse drug reactions. Increasing prevalence of electronic health records (EHRs) offers a unique opportunity to build machine learning algorithms to identify drug-drug interactions that drive adverse events. In this study, we investigated hospitalization data to study drug interactions with non-steroidal anti-inflammatory drugs (NSAIDs) that result in drug-induced liver injury (DILI). We propose a logistic-regression-based machine learning algorithm that unearths several known interactions from an EHR dataset of about 400,000 hospitalizations. Our proposed modeling framework is successful in detecting 87.5% of the positive controls, which are defined by drugs known to interact with diclofenac causing an increased risk of DILI, and correctly ranks aggregate risk of DILI for eight commonly prescribed NSAIDs. We found that our modeling framework is particularly successful in inferring associations of drug-drug interactions from relatively small EHR datasets. Furthermore, we have identified a novel and potentially hepatotoxic interaction that might occur during concomitant use of meloxicam and esomeprazole, which are commonly prescribed together to allay NSAID-induced gastrointestinal (GI) bleeding. Empirically, we validate our approach against prior methods for signal detection on EHR datasets, in which our proposed approach outperforms all the compared methods across most metrics, such as area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC)
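The abstract above reports AUROC as one of its evaluation metrics. As a rough illustration of what that number measures, the sketch below computes AUROC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The function name `auroc` and the toy labels and scores are assumptions made for illustration; this is not the study's code or data.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 outcomes (1 = adverse event observed).
    scores: iterable of model risk scores, higher = more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four hospitalizations, two with the outcome.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
```

The quadratic pairwise loop is chosen for readability; on a dataset of the size described in the abstract, a rank-based implementation would be the practical choice.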

    Guidelines for the use of machine learning to predict student project group academic performance

    Education plays a crucial role in the growth and development of a country. However, in South Africa, there is limited capacity and increasing demand from students seeking an education. In an attempt to address this demand, universities are pressured into accepting more students to increase their throughput. This pressure leads to educators having less time to give students individual attention. This study aims to address this problem by demonstrating how machine learning can be used to predict student group academic performance so that educators may allocate more resources and attention to students and groups at risk. The study focused on data obtained from the third-year capstone project for the diploma in Information Technology at the Nelson Mandela University. Learning analytics and educational data mining and their processes were discussed with an in-depth look at the machine learning techniques involved therein. Artificial neural networks, decision trees and naïve Bayes classifiers were proposed and motivated for prediction modelling. An experiment was performed resulting in proposed guidelines, which give insight and recommendations for the use of machine learning to predict student group academic performance

    Concepts and Methods from Artificial Intelligence in Modern Information Systems – Contributions to Data-driven Decision-making and Business Processes

    Today, organizations are facing a variety of challenging, technology-driven developments, three of the most notable ones being the surge in uncertain data, the emergence of unstructured data and a complex, dynamically changing environment. These developments require organizations to transform in order to stay competitive. Artificial Intelligence, with its fields of decision-making under uncertainty, natural language processing and planning, offers valuable concepts and methods to address these developments. The dissertation at hand utilizes and furthers these contributions in three focal points to address research gaps in existing literature and to provide concrete concepts and methods for the support of organizations in the transformation and improvement of data-driven decision-making, business processes and business process management. In particular, the focal points are the assessment of data quality, the analysis of textual data and the automated planning of process models. With regard to data quality assessment, probability-based approaches for measuring consistency and identifying duplicates as well as requirements for data quality metrics are suggested. With respect to analysis of textual data, the dissertation proposes a topic modeling procedure to gain knowledge from CVs as well as a model based on sentiment analysis to explain ratings from customer reviews. Regarding automated planning of process models, concepts and algorithms for an automated construction of parallelizations in process models, an automated adaptation of process models and an automated construction of multi-actor process models are provided

    Approaches For Capturing Time-Varying Functional Network Connectivity With Application to Normative Development and Mental Illness

    Since the beginning of medical science, the human brain has remained an unsolved puzzle: an elusive organ that controls everything, from breathing to heartbeats, from emotion to anger, and more. With the power of advanced neuroimaging techniques, scientists have now started to solve this nearly impossible puzzle, piece by piece. Over the past decade, various in vivo techniques, including functional magnetic resonance imaging (fMRI), have been increasingly used to understand brain functions. fMRI is extensively being used to facilitate the identification of various neuropsychological disorders such as schizophrenia (SZ), bipolar disorder (BP) and autism spectrum disorder (ASD). These disorders are currently diagnosed based on patients’ self-reported experiences, and observed symptoms and behaviors over the course of the illnesses. Therefore, efficient identification of biologically based markers (biomarkers) can lead to early diagnosis of these mental disorders, and provide a trajectory for disease progression. By applying advanced machine learning techniques to fMRI data, significant differences in brain function among patients with mental disorders and healthy controls can be identified. Moreover, by jointly estimating information from multiple modalities, such as functional brain data and genetic factors, we can now investigate the relationship between brain function and genes. Functional connectivity (FC) has become a very common measure to characterize brain functions, where FC is defined as the temporal covariance of neural signals between multiple spatially distinct brain regions. Recently, researchers have been studying the FC among functionally specialized brain networks, which can be regarded as a higher level of FC and is termed functional network connectivity (FNC, defined as the correlation value that summarizes the overall connection between brain ‘networks’ over time).
Most functional connectivity studies have made the limiting assumption that connectivity is stationary over multiple minutes, and have neglected to identify the time-varying and reoccurring patterns of FNC among brain regions (known as time-varying FNC). In this dissertation, we demonstrate the use of time-varying FNC features as potential biomarkers to differentiate between patients with mental disorders and healthy subjects. The developmental characteristics of time-varying FNC in typically developing children and children with ASD have been extensively studied in a cross-sectional framework, and age-, sex- and disease-related FNC profiles have been proposed. Also, time-varying FNC is characterized in healthy adults and patients with severe mental disorders (SZ and BP). Moreover, an efficient classification algorithm is designed to identify patients and controls at the individual level. Finally, a new framework is proposed to jointly utilize information from the brain’s functional network connectivity and genetic features to find the associations between them. The frameworks presented here can help us understand the important role played by time-varying FNC in identifying potential biomarkers for the diagnosis of severe mental disorders
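The contrast between stationary and time-varying connectivity described above can be made concrete with a small sketch. It assumes a sliding-window Pearson correlation between two network time courses, which is one common way to estimate time-varying connectivity; the abstract does not state which estimator the dissertation uses. The function names, window length and toy signals are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def sliding_window_connectivity(ts_a, ts_b, window, step=1):
    """Correlation between two time courses within each sliding window.

    A single correlation over the full series corresponds to the stationary
    assumption; the list returned here shows how the coupling changes over time.
    """
    if len(ts_a) != len(ts_b):
        raise ValueError("time courses must have the same length")
    if window < 2 or window > len(ts_a):
        raise ValueError("window must be between 2 and the series length")
    return [
        pearson(ts_a[i:i + window], ts_b[i:i + window])
        for i in range(0, len(ts_a) - window + 1, step)
    ]


# Hypothetical toy signals: coupled in the first window, anti-coupled in the second.
network_a = [1, 2, 3, 4, 3, 2, 1]
network_b = [1, 2, 3, 4, 5, 6, 7]
static_fc = pearson(network_a, network_b)
dynamic_fc = sliding_window_connectivity(network_a, network_b, window=4, step=3)
```

In this toy case the whole-series correlation is zero, while the windowed values show strong positive and then strong negative coupling, which is the kind of structure a stationary analysis would miss.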

    Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

    On behalf of the Program Committee, a very warm welcome to the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020). This edition of the conference is held in Bologna and organised by the University of Bologna. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after six years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult due to continuously varying situations, several actors involved and a vast number of information systems in use. The aim of this study was to describe front-line physicians' satisfaction with existing information systems needed to support the day-to-day operations management in hospitals. A cross-sectional survey was used and data chosen with stratified random sampling were collected in nine hospitals. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but they do not improve access to information nor are they tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer one information system to access important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision making process. Peer reviewed

    Identifying nocuous ambiguity in natural language requirements

    This dissertation is an investigation into how ambiguity should be classified for authors and readers of text, and how this process can be automated. Usually, authors and readers disambiguate ambiguity, either consciously or unconsciously. However, disambiguation is not always appropriate. For instance, a linguistic construction may be read differently by different people, with no consensus about which reading is the intended one. This is particularly dangerous if they do not realise that other readings are possible. Misunderstandings may then occur. This is particularly serious in the field of requirements engineering. If requirements are misunderstood, systems may be built incorrectly, and this can prove very costly. Our research uses natural language processing techniques to address ambiguity in requirements. We develop a model of ambiguity, and a method of applying it, which represent a novel approach to the problem described here. Our model is based on the notion that human perception is the only valid criterion for judging ambiguity. If people perceive very differently how an ambiguity should be read, it will cause misunderstandings. Assigning a preferred reading to it is therefore unwise. In text, such ambiguities should be located and rewritten in a less ambiguous form; others need not be reformulated. We classify the former as nocuous and the latter as innocuous. We allow the dividing line between these two classifications to be adjustable. We term this the ambiguity threshold, and it represents a level of intolerance to ambiguity. A nocuous ambiguity can be an unacknowledged or an acknowledged ambiguity for a given set of readers. In the former case, they assign disparate readings to the ambiguity, but each is unaware that the others read it differently. In the latter case, they recognise that the ambiguity has more than one reading, but this fact may be unacknowledged by new readers. 
We present an automated approach to determine whether ambiguities in text are nocuous or innocuous. We use heuristics to distinguish ambiguities for which there is a strong consensus about how they should be read. These are innocuous ambiguities. The remaining nocuous ambiguities can then be rewritten at a later stage. We find consensus opinions about ambiguities by surveying human perceptions of them. Our heuristics try to predict these perceptions automatically. They utilise various types of linguistic information: generic corpus data, morphology and lexical subcategorisations are the most successful. We use coordination ambiguity as the test case for this research. This occurs where the scope of words such as "and" and "or" is unclear. Our research contributes to both the requirements engineering and the natural language processing literatures. Ambiguity is known to be a serious problem in requirements engineering, but has rarely been dealt with effectively and thoroughly. Our approach is an appropriate solution, and our flexible ambiguity threshold is a particularly useful concept. For instance, high ambiguity intolerance can be implemented when writing requirements for safety-critical systems. Coordination ambiguities are widespread and known to cause misunderstandings, but have received comparatively little attention. Our heuristics show that linguistic data can be used successfully to predict preferred readings of very diverse coordinations. Used in combination, these heuristics demonstrate that nocuous ambiguity can be distinguished from innocuous ambiguity under certain conditions. Employing appropriate ambiguity thresholds, an accuracy representing a 28% improvement on the baselines can be achieved
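The adjustable ambiguity threshold described above can be sketched as follows. This is one possible reading of the idea, not the dissertation's formal definition: it assumes each surveyed reader picks one reading, measures consensus as the share of readers choosing the most popular reading, and labels the ambiguity innocuous when that share reaches the threshold. The function name, reading labels and threshold values are hypothetical.

```python
from collections import Counter


def classify_ambiguity(judgments, threshold):
    """Label an ambiguity from surveyed reader judgments.

    judgments: list of reading labels, one per surveyed reader.
    threshold: required share (0 < threshold <= 1) of readers agreeing on one
               reading; a higher value means lower tolerance of ambiguity.
    Returns (label, most_popular_reading, consensus_share).
    """
    if not judgments:
        raise ValueError("at least one judgment is required")
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")

    top_reading, top_count = Counter(judgments).most_common(1)[0]
    consensus = top_count / len(judgments)
    label = "innocuous" if consensus >= threshold else "nocuous"
    return label, top_reading, consensus


# Hypothetical survey: ten readers judge the scope of a coordination.
survey = ["wide_scope"] * 8 + ["narrow_scope"] * 2
lenient = classify_ambiguity(survey, threshold=0.7)
strict = classify_ambiguity(survey, threshold=0.9)
```

With the same survey data, the lenient threshold accepts the ambiguity while the strict one flags it for rewriting, which mirrors the abstract's point that a stricter setting suits safety-critical requirements.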