Classifying publications from the clinical and translational science award program along the translational research spectrum: a machine learning approach
BACKGROUND:
Translational research is a key area of focus of the National Institutes of Health (NIH), as demonstrated by the substantial investment in the Clinical and Translational Science Award (CTSA) program. The goal of the CTSA program is to accelerate the translation of discoveries from the bench to the bedside and into communities. Different classification systems have been used to capture the spectrum of basic to clinical to population health research, with substantial differences in the number of categories and their definitions. Evaluation of the effectiveness of the CTSA program and of translational research in general is hampered by the lack of rigor in these definitions and their application. This study adds rigor to the classification process by creating a checklist to evaluate publications across the translational spectrum and operationalizes these classifications by building machine learning-based text classifiers to categorize these publications.
METHODS:
Based on collaboratively developed definitions, we created a detailed checklist for categories along the translational spectrum from T0 to T4. We applied the checklist to CTSA-linked publications to construct a set of coded publications for use in training machine learning-based text classifiers to classify publications within these categories. The training sets combined T1/T2 and T3/T4 categories due to low frequency of these publication types compared to the frequency of T0 publications. We then compared classifier performance across different algorithms and feature sets and applied the classifiers to all publications in PubMed indexed to CTSA grants. To validate the algorithm, we manually classified the articles with the top 100 scores from each classifier.
RESULTS:
The definitions and checklist facilitated classification and resulted in good inter-rater reliability for coding publications for the training set. The classifiers performed very well as measured by the area under the receiver operating characteristic curve (AUC), with an AUC of 0.94 for the T0 classifier, 0.84 for T1/T2, and 0.92 for T3/T4.
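As an illustration of the AUC metric reported above, the following is a minimal sketch, not the study's code. It computes AUC as the probability that a randomly chosen positive example receives a higher classifier score than a randomly chosen negative one, with ties counted as half. The function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Return the AUC for binary labels (1 = positive, 0 = negative).

    Uses the pairwise definition: the fraction of (positive, negative)
    pairs in which the positive has the higher score; ties count half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = "T0", 0 = "not T0".
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of examples, which is fine for small validation sets; a rank-based formulation would be preferable for large ones.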
CONCLUSIONS:
The combination of definitions agreed upon by five CTSA hubs, a checklist that facilitates more uniform definition interpretation, and algorithms that perform well in classifying publications along the translational spectrum provides a basis for establishing and applying uniform definitions of translational research categories. The classification algorithms allow publication analyses that would not be feasible with manual classification, such as assessing the distribution and trends of publications across the CTSA network and comparing the categories of publications and their citations to assess knowledge transfer across the translational research spectrum.
PIDapaloozaSoNHandout.pdf
A handout describing a plan for ORCID adoption at the School of Nursing at the University of Wisconsin-Madison.
Dinner and Data Management: Engaging undergraduates in research data management topics outside of the curriculum
Researchers are faced with unprecedented challenges due to the size and complexity of data, and libraries are stepping in to help by providing guidance on research data management, primarily to graduate students and faculty. Currently, many universities are encouraging an undergraduate research experience where students engage in research projects in the classroom and in research labs, yet research data management is often not included as part of these opportunities. At UW-Madison, we piloted researchERS (Emerging Research Scholars), a program for undergraduates from all disciplines to learn data management skills. Focusing on core concepts as well as data ethics, reproducibility, and research workflows, the format of the program included seven evening workshops, two networking events, and one field trip. Each workshop provided food for attendees and invited campus and community speakers relevant to the workshop’s theme as a way to introduce the students to the network of available resources and data expertise. The workshops also built in customized activities to show students how to incorporate best practices into their work. Local businesses provided a tour of their facilities as well as a talk on how they leverage data. This paper will describe this program as well as the benefits and drawbacks of tailoring a research data management program toward undergraduates.
Tableau Data Curation Primer
This work was created as part of the Data Curation Network “Specialized Data Curation” Workshop #2 held at Johns Hopkins University April 17-18, 2019. Tableau Software is a proprietary suite of products for data exploration, analysis, and visualization with an initial concentration in business intelligence. This primer focuses on the Tableau workbook files – .twb and .twbx – produced using Tableau Desktop. Like Microsoft Excel, Tableau Desktop uses a workbook and sheet file structure. Workbooks can contain worksheets, dashboards, and stories. Funding: Institute of Museum and Library Services RE-85-18-0040-18.
E-Science Expo-What you need to know about data curation, data management, data preservation
The e-science Fellows are graduate students in the School of Information Studies. They are the initial cohort in an NSF-funded program seeking to develop e-science/data intensive librarianship and improve data management/preservation practices, especially for federally-funded research. They will be reporting on their research projects, including: Data Management Sharing and Dissemination Policies; Data Management Surveys; Institutional Data Policy. For more information, see http://eslib.ischool.syr.edu/
MOESM5 of Classifying publications from the clinical and translational science award program along the translational research spectrum: a machine learning approach
Additional file 5. Threshold validation: For 50 articles randomly chosen from each decile of the scores returned by the T0 classifier, this file lists the PMID, the classifier score, and a manual classification of either “T0” or “not T0”.
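The decile-based sampling described for this file could be sketched as follows. This is an illustrative assumption rather than the authors' procedure: it interprets "decile" as ten equal-sized groups by score rank, and the function name `sample_by_decile`, the fixed seed, and the synthetic PMIDs and scores are all hypothetical.

```python
import random


def sample_by_decile(scored, per_decile=50, seed=0):
    """Randomly sample articles from each decile of classifier scores.

    scored: list of (pmid, score) tuples.
    Returns a dict mapping decile index (0 = lowest scores) to the
    sampled (pmid, score) tuples for that decile.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    ranked = sorted(scored, key=lambda item: item[1])
    n = len(ranked)
    samples = {}
    for d in range(10):
        # Equal-sized rank groups (an assumed reading of "decile").
        group = ranked[d * n // 10:(d + 1) * n // 10]
        k = min(per_decile, len(group))
        samples[d] = rng.sample(group, k)
    return samples


# Hypothetical scores for 1,000 articles, evenly spread over [0, 1).
scored = [(f"PMID{i}", i / 1000) for i in range(1000)]
samples = sample_by_decile(scored, per_decile=50)
```

Each sampled article would then be classified manually as “T0” or “not T0” and compared against its score to choose a decision threshold.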
known as Love Your Data in 2016 - 2017
All pages from the Love Data Week event website are archived here in PDF. Love Data Week was established in 2016 as Love Your Data week. Originally created in the USA, it quickly grew into an international event in which a wide range of institutions, organizations, scholars, students, and other data lovers could celebrate their data. Coordinated by Heather Coates, the planning committee developed themes, wrote and curated content, and developed activities, all to celebrate data in all its forms, promote good research data management strategies, ask hard questions about the role of data in our lives, and share data success and horror stories. Though the website is defunct, the event lives on, driven by the community.