68 research outputs found

    An ICT infrastructure to integrate clinical and molecular data in oncology research

    Background: The Onco-i2b2 platform is a bioinformatics tool designed to integrate clinical and research data and to support translational research in oncology. It is implemented by the University of Pavia and the IRCCS Fondazione Maugeri hospital (FSM), and is built on the software developed by the Informatics for Integrating Biology and the Bedside (i2b2) research center. i2b2 has delivered an open source suite based on a data warehouse, which can be queried efficiently through a query tool interface to find sets of patients of interest.
    Methods: Onco-i2b2 integrates data coming from multiple sources and allows users to query them jointly. The i2b2 data are stored in a data warehouse, where facts are hierarchically structured as ontologies. Onco-i2b2 gathers data from the FSM pathology unit (PU) database and from the hospital biobank, and merges them with the clinical information from the hospital information system. Our main effort was to provide a robust integrated research environment, with particular emphasis on the integration process, which posed the following challenges: privacy and anonymization of biospecimen samples; synchronization of the biobank database with the i2b2 data warehouse through a series of Extract, Transform, Load (ETL) operations; and development and integration of a Natural Language Processing (NLP) module to retrieve coded information, such as SNOMED terms and classification of malignant tumors (TNM) codes, as well as clinical test results, from unstructured medical records. Furthermore, we have developed an internal SNOMED ontology based on the NCBO BioPortal web services.
    Results: Onco-i2b2 manages data on more than 6,500 patients with a breast cancer diagnosis collected between 2001 and 2011 (over 390 of them have at least one biological sample in the cancer biobank), covering more than 47,000 visits and 96,000 observations over 960 medical concepts.
    Conclusions: Onco-i2b2 is a concrete example of how an integrated Information and Communication Technology architecture can be implemented to support translational research. The next steps of our project will extend its capabilities by implementing new plug-ins devoted to bioinformatics data analysis, as well as a temporal query module.

    Big Data as a Driver for Clinical Decision Support Systems: A Learning Health Systems Perspective

    Big data technologies now provide health care with powerful instruments to gather and analyze large volumes of heterogeneous data collected for different purposes, including clinical care, administration, and research. This makes it possible to design IT infrastructures that favor the implementation of the so-called "Learning Healthcare System Cycle," where healthcare practice and research are part of a single, synergistic process. In this paper we highlight how "Big Data enabled" integrated data collections may support clinical decision-making together with biomedical research. Two effective implementations are reported, concerning decision support in Diabetes and in Inherited Arrhythmogenic Diseases.

    Artificial intelligence-based prediction of overall survival in metastatic renal cell carcinoma

    Background and objectives: Investigations of prognosis are vital for better patient management and decision-making in patients with advanced metastatic renal cell carcinoma (mRCC). The purpose of this study is to evaluate the capacity of emerging Artificial Intelligence (AI) technologies to predict three- and five-year overall survival (OS) for mRCC patients starting their first line of systemic treatment.
    Patients and methods: The retrospective study included 322 Italian patients with mRCC who underwent systemic treatment between 2004 and 2019. Statistical analysis included univariate and multivariate Cox proportional-hazards models and Kaplan-Meier analysis to investigate the prognostic factors. The patients were split into a training cohort to establish the predictive models and a hold-out cohort to validate the results. The models were evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. We assessed the clinical benefit of the models using decision curve analysis (DCA). The proposed AI models were then compared with well-known pre-existing prognostic systems.
    Results: The median age of patients in the study was 56.7 years at RCC diagnosis and 78% of participants were male. The median survival time from the start of systemic treatment was 29.2 months; 95% of the patients died during follow-up, which ended at the end of 2019. The proposed predictive model, constructed as an ensemble of three individual predictive models, outperformed all the well-known prognostic models to which it was compared. It also demonstrated better usability in supporting clinical decisions for 3- and 5-year OS. The model achieved AUCs of 0.786 and 0.771 and specificities of 0.675 and 0.558 at a sensitivity of 0.90 for 3- and 5-year OS, respectively. We also applied explainability methods to identify the important clinical features, which partially matched the prognostic factors identified in the Kaplan-Meier and Cox analyses.
    Conclusions: Our AI models provide better predictive accuracy and clinical net benefit than well-known prognostic models. As a result, they can potentially be used in clinical practice to provide better management for mRCC patients starting their first line of systemic treatment. Larger studies would be needed to validate the developed model.
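
    The "specificity at sensitivity 0.90" figure reported above can be illustrated with a minimal sketch. The abstract does not say how the operating point was chosen, so the rule used here (take the highest score threshold whose sensitivity reaches the target) is an assumption, and the function name, labels, and scores are hypothetical toy values rather than study data.

```python
def specificity_at_sensitivity(y_true, scores, target_sensitivity=0.90):
    """Return (threshold, sensitivity, specificity) at the highest threshold
    whose sensitivity is at least `target_sensitivity`.

    A case is predicted positive when its score is >= threshold.
    This selection rule is an illustrative assumption, not the study's method.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")

    # Lowering the threshold can only raise sensitivity and lower specificity,
    # so the first threshold that reaches the target has the best specificity.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
        sensitivity = tp / positives
        if sensitivity >= target_sensitivity:
            return threshold, sensitivity, tn / negatives
    raise ValueError("target sensitivity must be at most 1.0")


# Toy data: 10 positive cases followed by 5 negative cases (hypothetical).
toy_labels = [1] * 10 + [0] * 5
toy_scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.20,
              0.58, 0.40, 0.30, 0.10, 0.05]

threshold, sensitivity, specificity = specificity_at_sensitivity(toy_labels, toy_scores)
print(threshold, sensitivity, specificity)
```

    Fixing sensitivity first and then reading off specificity reflects a setting where missing a positive case is the costlier error; the trade-off is that specificity is whatever remains at that operating point.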

    Careflow Mining Techniques to Explore Type 2 Diabetes Evolution

    In this work we describe the application of a careflow mining algorithm to detect the most frequent patterns of care in a cohort of type 2 diabetes patients. The applied method enriches the detected patterns with clinical data to define temporal phenotypes across the studied population. Novel phenotypes are discovered from heterogeneous data on 424 Italian patients and compared in terms of metabolic control and complications. Results show that careflow mining can help to summarize the complex evolution of the disease into meaningful patterns, which are also significant from a clinical point of view.

    Prognostic Potential of Immune Inflammatory Biomarkers in Breast Cancer Patients Treated with Neoadjuvant Chemotherapy

    Immune inflammatory biomarkers are easily obtained and inexpensive blood-based parameters that have recently shown prognostic and predictive value in many solid tumors. In this study, we aimed to investigate the role of these biomarkers in predicting distant relapse in breast cancer patients treated with neoadjuvant chemotherapy (NACT). All breast cancer patients who were referred to our Breast Unit and underwent NACT were retrospectively reviewed. The pre-treatment neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), monocyte-to-lymphocyte ratio (MLR), and pan-immune-inflammation value (PIV) were calculated from complete blood counts. The primary outcome was 5-year distant-metastasis-free survival (DMFS). In receiver operating characteristic analyses, the optimal cutoff values for the NLR, PLR, MLR, and PIV were determined to be 2.25, 152.46, 0.25, and 438.68, respectively. High levels of the MLR, but not the NLR, PLR, or PIV, were associated with improved 5-year DMFS in the study population in both univariate (HR 0.52, p = 0.03) and multivariate (HR 0.44, p = 0.02) analyses. Our study showed that the MLR was a significant independent parameter affecting DMFS in breast cancer patients undergoing NACT. Prospective studies are required to confirm this finding and to define reliable cutoff values, thus leading the way for the clinical application of this biomarker.
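
    The four indices named above are simple ratios of complete blood count values, which a short sketch makes concrete. It uses the commonly cited definitions (PIV as neutrophils × platelets × monocytes divided by lymphocytes) and the cutoffs reported in the abstract. The class and function names, the assumption that all counts share one unit (10^9 cells/L), and the rule that "high" means strictly above the cutoff are illustrative assumptions, and the sample values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BloodCount:
    """Pre-treatment complete blood count; all values assumed in 10^9 cells/L."""
    neutrophils: float
    lymphocytes: float
    monocytes: float
    platelets: float


# Cutoffs as reported in the abstract above.
CUTOFFS = {"NLR": 2.25, "PLR": 152.46, "MLR": 0.25, "PIV": 438.68}


def inflammatory_indices(cbc):
    """Compute NLR, PLR, MLR and PIV from a complete blood count."""
    if cbc.lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return {
        "NLR": cbc.neutrophils / cbc.lymphocytes,
        "PLR": cbc.platelets / cbc.lymphocytes,
        "MLR": cbc.monocytes / cbc.lymphocytes,
        "PIV": cbc.neutrophils * cbc.platelets * cbc.monocytes / cbc.lymphocytes,
    }


def classify(indices):
    """Label each index 'high' or 'low'; strict '>' is an assumption."""
    return {name: "high" if value > CUTOFFS[name] else "low"
            for name, value in indices.items()}


# Invented example values, not patient data.
example = BloodCount(neutrophils=4.0, lymphocytes=2.0, monocytes=0.4, platelets=250.0)
indices = inflammatory_indices(example)
print(indices)
print(classify(indices))
```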

    Information extraction from Italian medical reports: An ontology-driven approach

    Objective: In this work, we propose an ontology-driven approach to identify events and their attributes from episodes of care included in medical reports written in Italian. For this language, shared resources for clinical information extraction are not easily accessible.
    Materials and methods: The corpus considered in this work includes 5432 non-annotated medical reports belonging to patients with rare arrhythmias. To guide the information extraction process, we built a domain-specific ontology that includes the events and the attributes to be extracted, with related regular expressions. The ontology and the annotation system were constructed on a development set, while the performance was evaluated on an independent test set. As a gold standard, we considered a manually curated hospital database named TRIAD, which stores most of the information written in reports.
    Results: The proposed approach performs well on the considered Italian medical corpus, with a percentage of correct annotations above 90% for most considered clinical events. We also assessed the possibility of adapting the system to the analysis of another language (i.e., English), with promising results.
    Discussion and conclusion: Our annotation system relies on a domain ontology to extract and link information in clinical text. We developed an ontology that can be easily enriched and translated, and the system performs well on the considered task. In the future, it could be successfully used to automatically populate the TRIAD database.
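
    The approach described above, in which ontology entries carry the regular expressions used to find events and their attributes, can be sketched as follows. This is a toy illustration only: the two event classes, their patterns, the date attribute, and the sentence-splitting rule are hypothetical and are not taken from the authors' ontology or annotation system.

```python
import re

# Hypothetical mini-ontology: each event class has a trigger pattern and
# patterns for the attributes to be linked to it.
DATE_PATTERN = r"\b(\d{2}/\d{2}/\d{4})\b"
ONTOLOGY = {
    "Syncope": {
        "pattern": r"\bsincop[ei]\b",
        "attributes": {"date": DATE_PATTERN},
    },
    "ICDImplant": {
        "pattern": r"\bimpianto di (?:ICD|defibrillatore)\b",
        "attributes": {"date": DATE_PATTERN},
    },
}


def extract_events(report):
    """Return the events found in a report, each with its linked attributes.

    Attributes are searched only in the sentence that mentions the event,
    which is a simplifying assumption of this sketch.
    """
    events = []
    for sentence in re.split(r"(?<=[.;])\s+", report.strip()):
        for event_name, spec in ONTOLOGY.items():
            if not re.search(spec["pattern"], sentence, flags=re.IGNORECASE):
                continue
            attributes = {}
            for attribute, attribute_pattern in spec["attributes"].items():
                match = re.search(attribute_pattern, sentence)
                if match:
                    attributes[attribute] = match.group(1)
            events.append({"event": event_name, "attributes": attributes})
    return events


# Invented example sentence pair in Italian.
toy_report = ("Il paziente riferisce sincope il 12/03/2015. "
              "Impianto di ICD eseguito il 20/04/2015.")
for event in extract_events(toy_report):
    print(event)
```

    Keeping the patterns inside the ontology entries means that adapting the extractor to another language would mostly involve translating the patterns rather than changing the extraction code, which is in line with the adaptability the abstract mentions.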