109 research outputs found

    Coupling different methods for overcoming the class imbalance problem

    Many classification problems must deal with imbalanced datasets where one class (the majority class) outnumbers the other classes. Standard classification methods do not provide accurate predictions in this setting since classification is generally biased towards the majority class. The minority classes are often the ones of interest (e.g., when they are associated with pathological conditions in patients), so methods for handling imbalanced datasets are critical. Using several different datasets, this paper evaluates the performance of state-of-the-art classification methods for handling the imbalance problem in both binary and multi-class datasets. Different strategies are considered, including the one-class and dimension reduction approaches, as well as their fusions. Moreover, some ensembles of classifiers are tested, in addition to stand-alone classifiers, to assess the effectiveness of ensembles in the presence of imbalance. Finally, a novel ensemble of ensembles is designed specifically to tackle the problem of class imbalance: the proposed ensemble does not need to be tuned separately for each dataset and outperforms all the other tested approaches. To validate our classifiers we resort to the KEEL-dataset repository, whose data partitions (training/test) are publicly available and have already been used in the open literature: as a consequence, it is possible to report a fair comparison among different approaches in the literature. Our best approach (MATLAB code and datasets not easily accessible elsewhere) will be available at https://www.dei.unipd.it/node/2357
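
The majority-class bias described in this abstract can be illustrated with a small sketch. This is not the paper's method or its MATLAB code: the toy labels, the 95/5 class split, and the helper names (`majority_baseline`, `per_class_recall`) are hypothetical, chosen only to show why plain accuracy can look good while the minority class is never detected.

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return a classifier that always predicts the most frequent training label."""
    majority_label, _count = Counter(train_labels).most_common(1)[0]
    return lambda _sample: majority_label

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """Recall for each class: correct predictions divided by class size."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical toy data: 95 majority samples (label 0), 5 minority samples (label 1).
labels = [0] * 95 + [1] * 5
predict = majority_baseline(labels)
preds = [predict(None) for _ in labels]

recalls = per_class_recall(labels, preds)
balanced_accuracy = sum(recalls.values()) / len(recalls)

print("accuracy:", accuracy(labels, preds))
print("per-class recall:", recalls)
print("balanced accuracy:", balanced_accuracy)
```

On this toy split the baseline reaches an accuracy of 0.95 but a balanced accuracy of only 0.5, because the minority class has a recall of 0.0. Averaging recall over classes is one common way to expose this; it extends to the multi-class case without changes.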

    Dynamics and Drivers of Fecal Indicator Bacteria and Associated Bacterial Community Members in Estuarine Waters

    For over a century, specific types of bacteria have been monitored in natural water bodies as indicators of fecal pollution and increased risk of encountering human pathogens. One such type of bacteria is the fecal coliforms, a group of gram-negative, facultative anaerobes mostly from the Class Gammaproteobacteria and the Family Enterobacteriaceae, which are commonly found in the gut of warm-blooded animals. In the Chesapeake Bay, routine monitoring of coliform bacteria has been conducted since the 1920s to assess the likelihood of sewage pollution in shellfish harvest areas. The research for this dissertation examined the dynamics and drivers of fecal coliforms and potential pathogen groups in Maryland waters. First, the impacts of climate variability on densities of fecal coliforms in surface waters were examined, finding that annual precipitation and air temperature levels correlate well with the proportion of stations with fecal coliforms in excess of the established regulatory criteria. A dominant climate pattern was identified for years with extreme precipitation and fecal coliform levels. Secondly, the validity of using precipitation totals as indicators of fecal coliform densities exceeding the regulatory criteria was examined. Precipitation levels over the previous two days were related to fecal coliforms in excess of the criteria for particular watersheds, depending on the percent of open water; non-tidal, forested wetlands; and soil types. The level of precipitation required to cause fecal coliform densities to exceed the FDA criterion varied between watersheds. Thirdly, high-throughput sequencing of 16S rRNA genes was used to study the community of bacteria at a long-term monitoring station in order to characterize community members over the course of 5 months.
Water temperature and turbidity were found to be related to changes in community composition at the scale of Genera, while precipitation was a key driver for the presence of allochthonous bacteria such as fecal coliforms. The co-occurrence of some bacteria groups at the Class level of phylogeny was largely defined by the arrival of allochthonous groups into the autochthonous community. Further, a novel approach for estimating densities of bacteria from 16S rRNA amplicon pools was explored

    Technological Advances in the Diagnosis and Management of Pigmented Fundus Tumours

    Choroidal naevi are the most common intraocular tumour. They can be pigmented or non-pigmented and have a predilection for the posterior uvea. The majority remain undetected and cause no harm but are increasingly found on routine community optometry examinations. Rarely does a naevus demonstrate growth or the onset of suspicious features to fulfil the criteria for a malignant melanoma. Because of this very small risk, optometrists commonly refer these patients to hospital eye units for a second opinion, triggering specialist examination and investigation, causing significant anxiety to patients and stretching medical resources. This PhD thesis introduces the MOLES acronym and scoring system that has been devised to categorise the risk of malignancy in choroidal melanocytic tumours according to Mushroom tumour shape, Orange pigment, Large tumour size, Enlarging tumour and Subretinal fluid. This is a simplified system that can be used without sophisticated imaging, and hence its main utility lies in the screening of patients with choroidal pigmented lesions in the community and general ophthalmology clinics. Under this system, lesions were categorised by a scoring system as ‘common naevus’, ‘low-risk naevus’, ‘high-risk naevus’ and ‘probable melanoma.’ According to the sum total of the scores, the MOLES system correlates well with ocular oncologists’ final diagnosis. The PhD thesis also describes a model of managing such lesions in a virtual pathway, showing that remote evaluation of images of choroidal naevi by masked non-medical graders or masked ophthalmologists, using a decision-making algorithm, is safe. This work prospectively validates a virtual naevus clinic model focusing on patient safety as the primary consideration.
The idea of a virtual naevus clinic as a fast, one-stop, streamlined and comprehensive service is attractive for patients and healthcare systems, including an optimised patient experience with reduced delays and inconvenience from repeated visits. A safe, standardised model ensures homogeneous management of cases, appropriate and prompt return of care closer to home to community-based optometrists. This research work and strategies, such as the MOLES scoring system for triage, could empower community-based providers to deliver management of benign choroidal naevi without referral to specialist units. Based on the positive outcome of this prospective study and the MOLES studies, a ‘Virtual Naevus Clinic’ has been designed and adapted at Moorfields Eye Hospital (MEH) to prove its feasibility as a response to the COVID-19 pandemic, and with the purpose of reducing in-hospital patient journey times and increasing the capacity of the naevus clinics, while providing safe and efficient clinical care for patients. This PhD chapter describes the design, pathways, and operating procedures for the digitally enabled naevus clinics in Moorfields Eye Hospital, including what this service provides and how it will be delivered and supported. The author will share the current experience and future plan. Finally, the PhD thesis will cover a chapter that discusses the potential role of artificial intelligence (AI) in differentiating benign choroidal naevus from choroidal melanoma. The published clinical and imaging risk factors for malignant transformation of choroidal naevus will be reviewed in the context of how AI applied to existing ophthalmic imaging systems might be able to determine features on medical images in an automated way. The thesis will include current knowledge to date and describe potential benefits, limitations and key issues that could arise with this technology in the ophthalmic field. 
Regulatory concerns will be addressed with possible solutions on how AI could be implemented in clinical practice and embedded into existing imaging technology with the potential to improve patient care and the diagnostic process. The PhD will also explore the feasibility of developed automated deep learning models and investigate the performance of these models in diagnosing choroidal naevomelanocytic lesions based on medical imaging, including colour fundus and autofluorescence fundus photographs. This research aimed to determine the sensitivity and specificity of an automated deep learning algorithm used for binary classification to differentiate choroidal melanomas from choroidal naevi and prove that a differentiation concept utilising a machine learning algorithm is feasible
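
The sensitivity and specificity mentioned at the end of this abstract can be computed from a binary confusion matrix, as in the sketch below. It is only an illustration of the two metrics, not the thesis's evaluation code: the ten labels and predictions are invented, and the function names are hypothetical.

```python
import math

def confusion_counts(y_true, y_pred, positive):
    """Count true positives, false negatives, true negatives and false positives."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp

def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity

# Invented example: 4 melanomas (positive class) and 6 naevi.
y_true = ["melanoma"] * 4 + ["naevus"] * 6
y_pred = ["melanoma", "melanoma", "melanoma", "naevus"] + ["naevus"] * 5 + ["melanoma"]

counts = confusion_counts(y_true, y_pred, positive="melanoma")
sens, spec = sensitivity_specificity(*counts)
print("TP, FN, TN, FP:", counts)
print("sensitivity:", sens, "specificity:", spec)
```

Here sensitivity is the share of melanomas that are flagged and specificity is the share of naevi that are correctly left unflagged; treating melanoma as the positive class is an assumption made for the example.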

    Computational Approaches to Drug Profiling and Drug-Protein Interactions

    Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour. High attrition rates led to a long period of stagnation in drug approvals. Due to the extreme costs associated with introducing a drug to the market, locating and understanding the reasons for clinical failure is key to future productivity. As part of this PhD, three main contributions were made in this respect. First, the web platform LigNFam enables users to interactively explore similarity relationships between ‘drug like’ molecules and the proteins they bind. Secondly, two deep-learning-based binding site comparison tools were developed, competing with the state-of-the-art over benchmark datasets. The models have the ability to predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold relationships and has already been used in multiple projects, including integration into a virtual screening pipeline to increase the tractability of ultra-large screening experiments. Together, and with existing tools, the contributions made will aid in the understanding of drug-protein relationships, particularly in the fields of off-target prediction and drug repurposing, helping to design better drugs faster

    Quantitative imaging in radiation oncology

    Artificially intelligent eyes, built on machine and deep learning technologies, can strengthen our ability to analyse patients’ images. By revealing information invisible to our eyes, we can build decision aids that help clinicians provide more effective treatment while reducing side effects. The power of these decision aids lies in being based on the biologically unique properties of each patient’s tumour, referred to as biomarkers. To fully translate this technology into the clinic we need to overcome barriers related to the reliability of image-derived biomarkers, trust in AI algorithms and privacy-related issues that hamper the validation of the biomarkers. This thesis developed methodologies to address these issues, defining a road map for the responsible use of quantitative imaging in the clinic as a decision support system for better patient care

    Systems Toxicology: Beyond Animal Models

    Toxicology – much like the rest of biology – is undergoing a profound change as new technologies begin to offer a more systems-oriented view of cellular physiology. For toxicology in particular, this means moving away from black-box animal models that provide limited information about mechanisms of toxicity towards the use of in vitro approaches which can expedite hazard assessment while at the same time providing a more data-rich insight into toxic effects at the molecular level. One motivator of this shift is Green Toxicology, which seeks to support the Green Chemistry movement. In order for this approach to succeed, it will require two separate but parallel efforts. The first is an Integrated Testing Strategy which seeks to use machine learning and data mining techniques to combine QSARs and in vitro tests in the most efficient way possible to accurately estimate hazard, which is discussed both theoretically and demonstrated practically with the example of skin sensitization. Secondly, toxicology will require new approaches that exploit the insights of network biology to look at toxic mechanisms from a systems perspective. The theoretical concept of a Pathway of Toxicity is outlined, and an example of how to extract a suggested Pathway of Toxicity is given, using a Weighted Gene Correlation Network Analysis of a small microarray study of MPTP toxicity combined with text-mining and other high-throughput data to suggest novel candidate transcription factors and proteins. In conclusion, it discusses some of the current limitations of another promising -omics technology, metabolomics

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units located in Portugal is established using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, and thus to investigate the main causes of inefficiency. Several suggestions for improving efficiency are offered for each hotel studied.
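
For readers unfamiliar with Stochastic Frontier Analysis, the separation of measurement error from inefficiency mentioned above is usually written with a composed error term. The form below is the generic textbook production frontier, given as a sketch only: the abstract does not state the paper's functional form, its inputs and outputs, or the distribution assumed for the inefficiency term, so the symbols and the half-normal choice are assumptions.

```latex
% Generic stochastic production frontier for unit i (assumed form, not taken from the paper)
\[
  \ln y_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + v_i - u_i,
  \qquad v_i \sim \mathcal{N}(0, \sigma_v^2),
  \qquad u_i \ge 0 \ \text{(e.g. half-normal)}
\]
% Technical efficiency of unit i, with values in (0, 1]
\[
  \mathrm{TE}_i = \exp(-u_i)
\]
```

Here $v_i$ is the symmetric noise term (measurement error), $u_i$ is the one-sided inefficiency term, and ranking units by $\mathrm{TE}_i$ gives an efficiency ranking of the kind the abstract describes.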

    The feasibility of using electronic health records to inform clinical decision making for community-onset urinary tract infection in England

    Urinary tract infections (UTIs) are a major source of morbidity, yet differentiating UTI from other conditions and choosing the right treatment remains challenging. Using case studies from English primary and secondary care, this thesis investigates the potential use of electronic health records (EHR) - i.e., data recorded as part of routine care - to aid the diagnosis and management of community-onset UTI. I start by introducing sources of uncertainty in diagnosing UTI (Chapter 1) and review how EHRs have previously been used to study UTIs (Chapter 2). In Chapter 3, I discuss EHR sources available to study UTIs in England. In Chapter 4, I explore how EHRs from primary care can be used to guide antibiotic prescribing for UTI, by evaluating harms of delaying treatment in key patient groups. In Chapters 5 and 6, I explore the use of EHR data as a diagnostic tool to guide antibiotic de-escalation in patients with suspected UTI in the emergency department (ED). Cases of community-onset UTI could be identified in both primary and secondary care data but case definitions relied heavily on coarse diagnostic codes. A lack of information on patients' acute health status, clinical observations (e.g., urine dipstick tests), and reasons for antibiotic prescribing resulted in heterogeneous study cohorts, which likely confounded estimated effects of antibiotic treatment in primary care. In secondary care, early prediction of bacteriuria to guide antibiotic prescribing decisions in the ED proved promising, but model performance varied greatly by patient mix and variable definitions. Better recording of clinical information and a combination of retrospective EHR analysis with prospective cohorts and qualitative approaches will be required to derive actionable insights on UTI. Results based solely on currently available EHR data need to be interpreted carefully

    Pacific Symposium on Biocomputing 2023

    The Pacific Symposium on Biocomputing (PSB) 2023 is an international, multidisciplinary conference for the presentation and discussion of current research in the theory and application of computational methods in problems of biological significance. Presentations are rigorously peer reviewed and are published in an archival proceedings volume. PSB 2023 will be held on January 3-7, 2023 in Kohala Coast, Hawaii. Tutorials and workshops will be offered prior to the start of the conference. PSB 2023 will bring together top researchers from the US, the Asian Pacific nations, and around the world to exchange research results and address open issues in all aspects of computational biology. It is a forum for the presentation of work in databases, algorithms, interfaces, visualization, modeling, and other computational methods, as applied to biological problems, with emphasis on applications in data-rich areas of molecular biology. The PSB has been designed to be responsive to the need for critical mass in sub-disciplines within biocomputing. For that reason, it is the only meeting whose sessions are defined dynamically each year in response to specific proposals. PSB sessions are organized by leaders of research in biocomputing's 'hot topics.' In this way, the meeting provides an early forum for serious examination of emerging methods and approaches in this rapidly changing field