691 research outputs found

    Identifying Data Sharing in Biomedical Literature

    Many policies and projects now encourage investigators to share their raw research data with other scientists. Unfortunately, it is difficult to measure the effectiveness of these initiatives because data can be shared through so many mechanisms and in so many locations. We propose a novel approach to finding shared datasets: using NLP techniques to identify declarations of dataset sharing within the full text of primary research articles. Using regular expression patterns and machine learning algorithms on open access biomedical literature, our system was able to identify 61% of articles with shared datasets with 80% precision. A simpler version of our classifier achieved higher recall (86%), though lower precision (49%). We believe our results demonstrate the feasibility of this approach and hope to inspire further study of dataset retrieval techniques and policy evaluation.
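
To make the regular-expression idea in this abstract concrete, here is a minimal sketch in Python. The abstract does not list the authors' actual patterns or classifier, so the cue patterns below (`SHARING_PATTERNS`) and the helper names are hypothetical illustrations only. The second function shows how precision and recall, the two measures quoted above, could be computed against a hand-labelled reference set.

```python
import re

# Hypothetical cue patterns; the abstract does not give the authors' expressions.
SHARING_PATTERNS = [
    # A deposit-style verb followed, in the same sentence, by a repository name.
    re.compile(
        r"\b(deposited|submitted|available)\b[^.]*"
        r"\b(GEO|Gene Expression Omnibus|ArrayExpress)\b",
        re.IGNORECASE,
    ),
    # Something shaped like a GEO series accession, e.g. GSE1234.
    re.compile(r"\bGSE\d+\b"),
    # Something shaped like an ArrayExpress accession, e.g. E-MEXP-123.
    re.compile(r"\bE-[A-Z]{4}-\d+\b"),
]


def declares_sharing(text):
    """Return True if any cue pattern matches the article text."""
    return any(pattern.search(text) for pattern in SHARING_PATTERNS)


def precision_recall(predicted, actual):
    """Compute (precision, recall) from parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A pattern-only filter of this kind tends to trade precision for recall, which is in line with the simpler classifier's reported figures, although the abstract does not say how that simpler version was built.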

    A review of journal policies for sharing research data

    *Background:* Sharing data is a tenet of science, yet commonplace in only a few subdisciplines. Recognizing that a data sharing culture is unlikely to be achieved without policy guidance, some funders and journals have begun to request and require that investigators share their primary datasets with other researchers. The purpose of this study is to understand the current state of data sharing policies within journals, the features of journals which are associated with the strength of their data sharing policies, and whether the strength of data sharing policies affects the observed prevalence of data sharing.

*Methods:* We investigated these relationships with respect to gene expression microarray data in the journals that most often publish studies about this type of data. We measured data sharing prevalence as the proportion of papers with submission links from NCBI's Gene Expression Omnibus (GEO) database. We conducted univariate and linear multivariate regressions to understand the relationship between the strength of data sharing policy and journal impact factor, journal subdiscipline, journal publisher (academic societies vs. commercial), and publishing model (open vs. closed access).

*Results:* Of the 70 journal policies, 18 (26%) made no mention of sharing publication-related data within their Instruction to Author statements. Of the 42 (60%) policies with a data sharing policy applicable to microarrays, we classified 18 (26% of 70) as moderately strong and 24 (34% of 70) as strong.
Existence of a data sharing policy was associated with the type of journal publisher: half of all commercial publishers had a policy, compared to 82% of journals published by an academic society. All four of the open-access journals had a data sharing policy. Policy strength was associated with impact factor: the journals with no data sharing policy, a weak policy, and a strong policy had respective median impact factors of 3.6, 4.5, and 6.0. Policy strength was positively associated with measured data sharing submission into the GEO database: the journals with no data sharing policy, a weak policy, and a strong policy had median data sharing prevalence of 11%, 19%, and 29% respectively.

*Conclusion:* This review and analysis begins to quantify the relationship between journal policies and data sharing outcomes and thereby contributes to assessing the incentives and initiatives designed to facilitate widespread, responsible, effective data sharing. 


    Using open access literature to guide full-text query formulation

    *Background*
Much scientific knowledge is contained in the details of the full-text biomedical literature. Most research in automated retrieval presupposes that the target literature can be downloaded and preprocessed prior to query. Unfortunately, this is not a practical or maintainable option for most users due to licensing restrictions, website terms of use, and sheer volume. Scientific article full-text is increasingly queryable through portals such as PubMed Central, Highwire Press, Scirus, and Google Scholar. However, because these portals only support very basic Boolean queries and full text is so expressive, formulating an effective query is a difficult task for users. We propose improving the formulation of full-text queries by using the open access literature as a proxy for the literature to be searched. We evaluated the feasibility of this approach by building a high-precision query for identifying studies that perform gene expression microarray experiments.

*Methodology and Results*
We built decision rules from unigram and bigram features of the open access literature. Minor syntax modifications were needed to translate the decision rules into the query languages of PubMed Central, Highwire Press, and Google Scholar. We mapped all retrieval results to PubMed identifiers and considered our query results as the union of retrieved articles across all portals. Compared to our reference standard, the derived full-text query found 56% (95% confidence interval, 52% to 61%) of intended studies, and 90% (86% to 93%) of studies identified by the full-text search met the reference standard criteria. Due to this relatively high precision, the derived query was better suited to the intended application than alternative baseline MeSH queries.

*Significance*
Using open access literature to develop queries for full-text portals is an open, flexible, and effective method for retrieval of biomedical literature articles based on article full-text. We hope our approach will raise awareness of the constraints and opportunities in mainstream full-text information retrieval and provide a useful tool for today’s researchers.
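
The translation step described under Methodology and Results can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: it assumes each decision rule is a conjunction of unigram or bigram terms and that an article matching any one rule is retrieved. The example terms are hypothetical, and real portals each need their own minor syntax adjustments, as the abstract notes.

```python
def rule_to_clause(terms):
    """AND together the terms of one rule, quoting multi-word (bigram) terms."""
    quoted = ['"%s"' % term if " " in term else term for term in terms]
    return "(" + " AND ".join(quoted) + ")"


def rules_to_query(rules):
    """OR together the rule clauses: matching any single rule retrieves the article."""
    return " OR ".join(rule_to_clause(rule) for rule in rules)


def merge_portal_results(results_by_portal):
    """Union of PubMed identifiers retrieved across all portals."""
    merged = set()
    for pmids in results_by_portal.values():
        merged |= set(pmids)
    return merged


# Hypothetical rules, not the ones derived in the study.
example_rules = [
    ["microarray", "gene expression"],
    ["affymetrix", "hybridized"],
]
query = rules_to_query(example_rules)
print(query)
# (microarray AND "gene expression") OR (affymetrix AND hybridized)
```

Keeping the rules as plain lists of terms means the same rule set can be rendered once per portal by swapping only the rendering function.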

    Prevalence and Patterns of Microarray Data Sharing

    Sharing research data is a cornerstone of science. Although many tools and policies exist to encourage data sharing, the prevalence with which datasets are shared is not well understood. We report our preliminary results on patterns of sharing microarray data in public databases.

The most comprehensive method for measuring occurrences of public data sharing is manual curation of research reports, since data sharing plans are usually communicated in free text within the body of an article. Our early findings from manual curation of 100 papers suggest that 30% of investigators publicly share their full microarray datasets. Of these, 70% of the datasets are deposited at NCBI's Gene Expression Omnibus (GEO) database, 20% at EBI's ArrayExpress, and 10% in smaller databases or lab or publisher websites.

Next, we supplemented this manual process with a rough automated estimate of data sharing prevalence. Using PubMed, we identified research articles with MeSH terms for both "Gene Expression Profiling" and "Oligonucleotide Array Sequence Analysis" and published in 2006. We then searched GEO and ArrayExpress for links to these PubMed IDs to determine which of the articles had been credited as an originating data source.

Of the 2503 articles, 440 (18%) articles had links from either GEO or ArrayExpress. Of these 440 articles, 70% had links from GEO and 30% from ArrayExpress, with an overlapping 12% from both GEO and ArrayExpress.

Interestingly, studies with free full text at PubMed were twice as likely (Odds Ratio=2.1; 95% confidence interval: [1.7 to 2.5]) to be linked as a data source within GEO or ArrayExpress as those without free full text. Studies with human data were less likely to have a link (OR=0.8 [0.6 to 0.9]) than studies with only non-human data. The proportion of articles with a link within these two databases has increased over time: the odds of a data-source link were 2.5 [2.0 to 3.1] times greater for studies published in 2006 than for those published in 2002.

As might be expected, studies with the fewest funding sources had the fewest data-sharing links: only 28 (6%) of the 433 studies with no funding source were listed within GEO or ArrayExpress. In contrast, studies funded by the NIH, the US government, or a non-US government source had data-sharing links in 282 of 1556 cases (18%), while studies funded by two or more of these mechanisms were listed in the databases in 130 out of 514 cases (25%).

In summary, our initial manual approach for identifying studies which shared their data was comprehensive but time-consuming; natural language processing techniques could be helpful. Our subsequent automated approach yielded conservative estimates for total data sharing prevalence, nonetheless revealing several promising hypotheses for data sharing behavior.

We hope these preliminary results will inspire additional investigations into data sharing behavior, and in turn the development of effective policies and tools to facilitate this important aspect of scientific research.
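
The funding-source counts reported in this abstract can be checked with a few lines of Python. The sketch below recomputes the quoted percentages and shows one common way to obtain an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 table. The abstract does not say how its own odds ratios were estimated, so the interval method and the group labels here are assumptions made for illustration.

```python
import math

# Counts from the abstract: (articles with a GEO/ArrayExpress link, total articles).
funding_groups = {
    "no funding source": (28, 433),
    "NIH, US government, or non-US government": (282, 1556),
    "two or more funding mechanisms": (130, 514),
}


def odds_ratio_with_ci(linked_a, total_a, linked_b, total_b, z=1.96):
    """Unadjusted odds ratio of group A versus group B with a Wald interval."""
    a, b = linked_a, total_a - linked_a
    c, d = linked_b, total_b - linked_b
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se)
    high = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, low, high


for label, (linked, total) in funding_groups.items():
    print("%s: %d of %d (%d%%)" % (label, linked, total, round(100 * linked / total)))

# Compare the multiply funded studies with the studies listing no funding source.
result = odds_ratio_with_ci(130, 514, 28, 433)
print("odds ratio %.1f (95%% CI %.1f to %.1f)" % result)
```

The three groups sum to the 2503 articles and 440 linked articles given earlier in the abstract, which is a useful consistency check on the reported breakdown.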

    Love: A Biological, Psychological and Philosophical Study

    The concept of love has been an eternally elusive subject. It is a definition and meaning that philosophers, psychologists, and biologists have been seeking since the beginning of time. Wars have been waged and fought over it, while friendships have been initiated and have ended because of this idea. But what exactly is love, and why is it important to define this enigma? In order to help define this idea of love, several books and numerous research articles were consulted, and interviews were conducted with faculty of The University of Rhode Island. Dr. Nasser Zawia was interviewed in order to help understand the role of neurobiology in the process of falling in love. Dr. Zawia explained the importance of neurotransmitters and brain activity when a person is in love. Dr. Dianne Kinsey was consulted in order to help clarify the importance of the psychology of love. Finally, an interview with Dr. William Krieger revealed the importance of the study of philosophy and how it relates to the concept of love. Research has concluded that the disciplines of biology, psychology, and philosophy are all important in analyzing love; however, more research needs to be done in order to define what love actually is, and how we can apply this knowledge in our everyday lives. With divorce rates increasing, and the idea of marriage changing in today’s society, the importance of studying the concept of love cannot be overlooked. It is in this research that we, as a community, will be able to understand love, and its importance to the survival of the human race.

    THE STAKEHOLDER GAP LENS: TEACHER AND PARENTAL PERCEPTIONS OF THE ACHIEVEMENT GAP IN KENTUCKY'S PUBLIC SCHOOLS

    The research around the achievement gap is extensive. However, even though the term “achievement gap” is so widely used in academia today, there is often confusion surrounding what the achievement gap is. This study seeks to answer three research questions: (1) To what extent does an achievement gap exist among different subgroups of students in Kentucky’s K-12 public schools? (2) How do the perceptions of parents and teachers interact with decision-making? (3) How do the ideas of parents and teachers regarding closing the achievement gap compare? This research examines perceptions of the existence of an achievement gap in Kentucky’s public schools from the perspectives of two groups of stakeholders: parents and teachers. This study aims to identify trends in thinking about the existence of an achievement gap, how information is communicated, and how stakeholders think gaps can be closed. The results of this study indicate that stakeholders have a general understanding of the achievement gap; however, methods of communication with parents need strengthening. Findings show that Kentucky schools with gaps tend to have multiple subgroups, rather than a single group, performing lower than their peers, but stakeholders have mixed ideas on closing these gaps.

    Recall and bias of retrieving gene expression microarray datasets through PubMed identifiers

    Background: The ability to locate publicly available gene expression microarray datasets effectively and efficiently facilitates the reuse of these potentially valuable resources. Centralized biomedical databases allow users to query dataset metadata descriptions, but these annotations are often too sparse and diverse to allow complex and accurate queries. In this study we examined the ability of PubMed article identifiers to locate publicly available gene expression microarray datasets, and investigated whether the retrieved datasets were representative of publicly available datasets found through statements of data sharing in the associated research articles. Results: In a recent article, Ochsner and colleagues identified 397 studies that had generated gene expression microarray data. Their search of the full text of each publication for statements of data sharing revealed 203 publicly available datasets, including 179 in the Gene Expression Omnibus (GEO) or ArrayExpress databases. Our scripted search of GEO and ArrayExpress for PubMed identifiers of the same 397 studies returned 160 datasets, including six not found by the original search for data sharing statements. As a proportion of datasets found by either method, the search for data sharing statements identified 91.4% of the 209 publicly available datasets, compared to only 76.6% found by our search carried out using PubMed identifiers. Searching GEO or ArrayExpress alone retrieved 63.2% and 46.9% of all available datasets, respectively. There was no difference in the type of datasets found by PubMed identifier searches in terms of research theme or the technology used. However, the studies identified were more likely to have larger sample sizes, were more frequently cited, and were published in higher impact journals.
Conclusions: Searching database entries using PubMed identifiers can identify the majority of publicly available datasets, but caution is required when this method is used to collect data for policy evaluation since studies in low impact journals are disproportionately excluded. We urge authors of all datasets to complete the citation fields for their dataset submissions once publication details are known, thereby ensuring their work has maximum visibility and can contribute to subsequent studies.

    Predicting Gambling Situations: The Roles of Impulsivity, Substance Use, and Post-Traumatic Stress

    Gambling disorder and symptoms of post-traumatic stress are highly comorbid. Numerous studies suggest that the presence of one (either disordered gambling or post-traumatic stress) substantially increases the odds of later developing the other. However, little is known about the etiological links between these two domains or the nuances of the comorbidity. Past research has suggested that symptoms of post-traumatic stress might be related to unique motivations for and beliefs about gambling. The present work sought to examine whether or not symptoms of post-traumatic stress might also be related to specific situational vulnerabilities to gambling behaviors. Using a large cross-sectional sample of Internet-using adults in the United States who were primarily recreational gamblers (N = 743; 46% men, mean age = 36.0, SD = 11.1), as well as an inpatient sample of US Armed Forces veterans seeking treatment for gambling disorder (N = 332; 80% men, mean age = 53.5, SD = 11.5), the present work tested whether or not symptoms of post-traumatic stress were uniquely related to a variety of gambling situations. Results in both samples revealed that even when controlling for potentially confounding variables (e.g., substance use and trait impulsivity), symptoms of post-traumatic stress were uniquely related to gambling in response to negative affect, gambling in response to social pressure, and gambling due to a need for excitement. These findings are consistent with recent work suggesting that individuals with post-traumatic stress symptoms are more likely to engage in gambling behaviors for unique reasons that differ from gamblers without such symptoms.

    A Review of Health Literacy and Its Relationship to Nutrition Education

    Health literacy has emerged as a focus of increasing research in the medical literature, yet it has received little attention in the nutrition literature. Because nutrition practice is an important sector of the health care environment and reduced health literacy confers known health consequences, dietitians should be equipped with an understanding of how health literacy extends to nutritional care. Identification instruments that are available fail to provide an understanding of nutrition literacy. Nutrition literacy may include knowledge of nutrition principles and nutrition skills. Additional research into the development of appropriate nutrition literacy tools and their application is needed.