121 research outputs found

    Recall and bias of retrieving gene expression microarray datasets through PubMed identifiers

    Background: The ability to locate publicly available gene expression microarray datasets effectively and efficiently facilitates the reuse of these potentially valuable resources. Centralized biomedical databases allow users to query dataset metadata descriptions, but these annotations are often too sparse and diverse to allow complex and accurate queries. In this study we examined the ability of PubMed article identifiers to locate publicly available gene expression microarray datasets, and investigated whether the retrieved datasets were representative of publicly available datasets found through statements of data sharing in the associated research articles.
    Results: In a recent article, Ochsner and colleagues identified 397 studies that had generated gene expression microarray data. Their search of the full text of each publication for statements of data sharing revealed 203 publicly available datasets, including 179 in the Gene Expression Omnibus (GEO) or ArrayExpress databases. Our scripted search of GEO and ArrayExpress for PubMed identifiers of the same 397 studies returned 160 datasets, including six not found by the original search for data sharing statements. As a proportion of datasets found by either method, the search for data sharing statements identified 91.4% of the 209 publicly available datasets, compared to only 76.6% found by our search carried out using PubMed identifiers. Searching GEO or ArrayExpress alone retrieved 63.2% and 46.9% of all available datasets, respectively. There was no difference in the type of datasets found by PubMed identifier searches in terms of research theme or the technology used. However, the studies identified were more likely to have larger sample sizes, were more frequently cited, and were published in higher impact journals.
    Conclusions: Searching database entries using PubMed identifiers can identify the majority of publicly available datasets, but caution is required when this method is used to collect data for policy evaluation, since studies in low impact journals are disproportionately excluded. We urge authors of all datasets to complete the citation fields for their dataset submissions once publication details are known, thereby ensuring their work has maximum visibility and can contribute to subsequent studies.
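
The recall figures in this abstract are proportions of the datasets found by either search method. A minimal sketch of that calculation follows, assuming each method's results are available as a set of dataset identifiers. The function name and the toy identifiers are hypothetical and are not taken from the study; only the 160 and 209 counts come from the abstract.

```python
def recall_against_union(found_by_method, *other_methods):
    """Share of the datasets found by any method that this method found."""
    found = set(found_by_method)
    union = found.union(*other_methods)
    if not union:
        raise ValueError("no datasets found by any method")
    return len(found) / len(union)


# Hypothetical dataset identifiers, for illustration only.
statement_search = {"D1", "D2", "D3", "D4"}
pmid_search = {"D3", "D4", "D5"}

recall_statements = recall_against_union(statement_search, pmid_search)
recall_pmid = recall_against_union(pmid_search, statement_search)

# Counts reported in the abstract: 160 of 209 datasets were found
# through PubMed identifiers.
reported_pmid_recall = round(100 * 160 / 209, 1)
```

With the toy sets, the union holds five datasets, so the two methods score 4/5 and 3/5. The last line reproduces the 76.6% reported for the PubMed identifier search.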

    Mandates and the Contributions of Open Genomic Data

    This research examines changing patterns of raw data availability and their correlation with the implementation of open mandate policies. Using a list of 13,785 journal articles whose authors archived datasets in a popular biomedical data repository after the articles were published, it applies regression analysis to test the correlation between data contributions and mandate implementation. It finds that both funder-based and publisher-based mandates have a strong impact on scholars’ likelihood of contributing to open data repositories. Evidence also suggests that such policies have changed authors’ habits in selecting publishing venues: authors whose projects are sponsored by federal government agencies appear to prefer open access journals, and these journals are also highly ranked in the biomedical fields. Various stakeholders, particularly institutional administrators and open access professionals, may find these results helpful for adjusting data management policies to increase the number of high-quality free datasets and enhance data usability. The data-sharing example in biomedical studies is a good case for showing the importance of policy-making in reshaping scholarly communication.

    Precision medicine and future of cancer treatment

    Over the last few decades, there has been a deluge of large-scale biological data, mainly due to advances in high-throughput technology. This initiated a paradigm shift in the focus of medical research. The ability to investigate molecular changes across the whole genome provided a unique opportunity for translational research. It also gave rise to the concept of precision medicine, which offers strong hope for the development of better diagnostic and therapeutic tools. This is especially relevant to cancer, as its incidence is increasing throughout the world. The purpose of this article is to review tools and applications of precision medicine in cancer.

    Network-driven strategies to integrate and exploit biomedical data

    In the quest to understand complex biological systems, the scientific community has been delving into protein, chemical and disease biology, populating biomedical databases with a wealth of data and knowledge. The field of biomedicine has now entered a Big Data era, in which computationally driven research can benefit greatly from existing knowledge to better understand and characterize biological and chemical entities. And yet, the heterogeneity and complexity of biomedical data call for proper integration and representation of this knowledge, so that it can be effectively and efficiently exploited. In this thesis, we aim to develop new strategies to leverage current biomedical knowledge, so that meaningful information can be extracted and fused into downstream applications. To this end, we have capitalized on network analysis algorithms to integrate and exploit biomedical data in a wide variety of scenarios, providing a better understanding of pharmacoomics experiments while helping to accelerate the drug discovery process. More specifically, we have (i) devised an approach to identify functional gene sets associated with drug response mechanisms of action, (ii) created a resource of biomedical descriptors able to anticipate cellular drug response and identify new drug repurposing opportunities, (iii) designed a tool to annotate biomedical support for a given set of experimental observations, and (iv) reviewed different chemical and biological descriptors relevant for drug discovery, illustrating how they can be used to provide solutions to current challenges in biomedicine.

    A review on machine learning approaches and trends in drug discovery

    Drug discovery aims at finding new compounds with specific chemical properties for the treatment of diseases. In recent years, computer science has become an important component of this search, with the rapid rise of machine learning techniques driven by their democratization. Given the objectives set by the Precision Medicine initiative and the new challenges they generate, it is necessary to establish robust, standard and reproducible computational methodologies to achieve them. Currently, predictive models based on machine learning have gained great importance in the step prior to preclinical studies. This stage drastically reduces the cost and duration of research in the discovery of new drugs. This review article focuses on how these new methodologies have been used in recent years of research. Analyzing the state of the art in this field gives an idea of where cheminformatics will develop in the short term, the limitations it presents and the positive results it has achieved. The review concentrates mainly on the methods used to model molecular data, as well as the biological problems addressed and the machine learning algorithms used for drug discovery in recent years.
    Funding: Instituto de Salud Carlos III (PI17/01826, PI17/01561); Xunta de Galicia (Ref. ED431D 2017/16, Ref. ED431D 2017/23, Ref. ED431C 2018/4).

    Biomedical informatics and translational medicine

    Biomedical informatics involves a core set of methodologies that can provide a foundation for crossing the "translational barriers" associated with translational medicine. To this end, the fundamental aspects of biomedical informatics (e.g., bioinformatics, imaging informatics, clinical informatics, and public health informatics) may be essential in helping improve the ability to bring basic research findings to the bedside, evaluate the efficacy of interventions across communities, and enable the assessment of the eventual impact of translational medicine innovations on health policies. Here, a brief description is provided for a selection of key biomedical informatics topics (Decision Support, Natural Language Processing, Standards, Information Retrieval, and Electronic Health Records) and their relevance to translational medicine. Based on contributions and advancements in each of these topic areas, the article proposes that biomedical informatics practitioners ("biomedical informaticians") can be essential members of translational medicine teams.

    CASSANDRA: drug gene association prediction via text mining and ontologies

    The amount of biomedical literature has been increasing rapidly during the last decade. Text mining techniques can harness this large-scale data, shed light on complex drug mechanisms, and extract relation information that can support computational polypharmacology. In this work, we introduce CASSANDRA, a fully corpus-based and unsupervised algorithm which uses MEDLINE-indexed titles and abstracts to infer drug gene associations and assist drug repositioning. CASSANDRA measures the Pointwise Mutual Information (PMI) between biomedical terms derived from Gene Ontology (GO) and Medical Subject Headings (MeSH). Based on the PMI scores, drug and gene profiles are generated, and candidate drug gene associations are inferred by computing the relatedness of their profiles. Results show that an Area Under the Curve (AUC) of up to 0.88 can be achieved. The algorithm can successfully identify direct drug gene associations with high precision and prioritize them over indirect drug gene associations. Validation shows that the profiles statistically derived from the literature perform as well as (and at times better than) the manually curated profiles. In addition, we examine CASSANDRA’s potential for drug repositioning. For all FDA-approved drugs repositioned over the last 5 years, we generate profiles from publications before 2009 and show that the new indications rank high in these profiles. In summary, co-occurrence-based profiles derived from the biomedical literature can accurately predict drug gene associations and provide insights into potential repositioning cases.
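
The abstract names Pointwise Mutual Information as the scoring step but gives no formula. The sketch below uses the standard definition, PMI(x, y) = log(p(x, y) / (p(x) p(y))), estimated from document counts. The function name, the count-based estimate, the base-2 logarithm and the example numbers are all assumptions made for illustration; this is not CASSANDRA's actual implementation.

```python
import math


def pmi(n_xy, n_x, n_y, n_docs):
    """Pointwise mutual information of terms x and y from document counts.

    n_xy   -- documents mentioning both terms
    n_x    -- documents mentioning x
    n_y    -- documents mentioning y
    n_docs -- total documents in the corpus
    """
    if n_x <= 0 or n_y <= 0 or n_docs <= 0:
        raise ValueError("term and corpus counts must be positive")
    if n_xy == 0:
        return float("-inf")  # the terms never co-occur
    # p(x, y) / (p(x) * p(y)) simplifies to n_xy * n_docs / (n_x * n_y).
    return math.log2(n_xy * n_docs / (n_x * n_y))


# Hypothetical counts: a drug term in 100 abstracts, a gene term in 50,
# both together in 20, out of 1000 abstracts in total.
score_associated = pmi(20, 100, 50, 1000)
# Co-occurrence at exactly the rate expected by chance gives a score of zero.
score_independent = pmi(5, 100, 50, 1000)
```

Here the pair co-occurs four times more often than chance would predict, so the score is log2(4) = 2. A profile in the abstract's sense could then be a vector of such scores for one drug or gene against many ontology terms.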

    A PERSONAL GENOMIC INFORMATION ANALYSIS AND MANAGEMENT SYSTEM FOR HEALTHCARE PURPOSES

    Thanks to improvements in DNA sequencing technologies, a large amount of personal genomic data can now be generated at an affordable price in a short period of time. Abundant research results on genetic diseases have been published in recent years. It is therefore becoming possible to integrate multiple types of information and apply them to genomics-based personalized healthcare. However, this is still a very challenging task for healthcare professionals, because the desired information is hidden in highly complex and heterogeneous genomic data sets and spread across various databases, which were typically created for researchers. In this research project, a personal genomic information management and analysis system is created for healthcare professionals, especially physicians. To design such a system properly, an exploratory survey was conducted to identify how physicians currently use genomics in their clinical practice and to collect their expectations about the features of a patient genomic information system. The results of this survey indicated that physicians have sufficient knowledge of genomics and are interested in incorporating genomics into their clinical practice. The results also indicated that a well-designed patient genomic information system with the desired features can help physicians do so. Based on the survey findings, a personal genomic information system was created for managing and analyzing patient genomic data. In this system, we first created an integrated database, and then developed data analysis algorithms to extract clinical information from patient genetic variation data, including disease-associated genetic variations and pharmacogenomic associations. Physicians can conveniently identify the genetic reasons for diseases and determine personalized treatment options based on the information provided by the system.
    A usability study was conducted to obtain physicians’ feedback about the system after they had used it to complete tasks such as searching the genetic variations of one patient, determining the patient’s risk of certain diseases, and identifying the corresponding pharmacogenomic results. The results of this study indicated that physicians could easily find the patient information they need and that the information can be directly applied in their clinical practice.

    PHARMACOGENOMICS IN THE EMIRATI POPULATION: APPLICATIONS IN CARDIOVASCULAR DISEASES AND ONCOLOGY

    Pharmacogenetic variations contribute to interindividual differences in drug response. Advances in molecular techniques have provided insights into interpopulation pharmacogenomic variations. Only a limited number of pharmacogenetic studies have been conducted in the UAE population. The current study aims to explore the variation landscape in important pharmacogenes in Emiratis. Furthermore, it investigates the association between VKORC1 variants and warfarin dose in cardiovascular patients. Finally, this study explores the germline pharmacogenetic tests applied or needed in oncology in the UAE. In 100 healthy Emiratis, variants and star alleles in 100 relevant pharmacogenes were defined by next-generation sequencing. Of the detected variants, 63% were rare, 30% were novel, and 141 were novel and damaging. Filtering variants by clinical annotation yielded 99 clinically actionable variants, of which 44 are highly significant alleles. Comparing the results against the Clinical Pharmacogenetics Implementation Consortium guidelines showed that 93% of participants have at least one actionable variant with a dosing recommendation. The effect of VKORC1 on warfarin dose was explored in 90 patients. A model built from two VKORC1 variants, rs9923231 and rs61742245, together with age, significantly predicted warfarin dose. High incidence rates of adverse chemotherapy effects were reported in 66 pediatric acute lymphoblastic leukemia patients, which indicates the plausibility of pharmacogenetic research to investigate toxicity biomarkers. Few cases had a clinical pharmacogenetic test of TPMT and NUDT15 before starting oral 6-mercaptopurine. Patients who received pharmacogenetic-guided doses suffered fewer adverse effects. The exploration of adverse drug effects in a group of 77 breast cancer patients was hampered by deficiencies in adverse effects reporting. The reported adverse events suggested suitable candidates for future pharmacogenetic research.
    This research highlighted population-specific variants, unexplored adverse drug events, and possible pharmacogenomics applications in the UAE. Various research opportunities were illustrated for the scientific community.

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult because situations vary continuously, several actors are involved, and a vast number of information systems are in use. The aim of this study was to describe front-line physicians' satisfaction with the existing information systems needed to support day-to-day operations management in hospitals. A cross-sectional survey was used, and data were collected in nine hospitals using stratified random sampling. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but that they neither improve access to information nor are tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer a single information system for accessing important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision-making process.
    Peer reviewed.