65 research outputs found
Sexual and Gender Minority Status and Suicide Mortality: An Explainable Artificial Intelligence Analysis
Objectives: Suicide risk is elevated in lesbian, gay, bisexual, and transgender (LGBT) individuals. Limited data on LGBT status in healthcare systems hinder our understanding of this risk. This study used natural language processing to extract LGBT status and a deep neural network (DNN) to examine suicidal death risk factors among US Veterans.Methods: Data on 8.8 million veterans with visits between 2010 and 2017 was used. A case-control study was performed, and suicide death risk was analyzed by a DNN. Feature impacts and interactions on the outcome were evaluated.Results: The crude suicide mortality rate was higher in LGBT patients. However, after adjusting for over 200 risk and protective factors, known LGBT status was associated with reduced risk compared to LGBT-Unknown status. Among LGBT patients, black, female, married, and older Veterans have a higher risk, while Veterans of various religions have a lower risk.Conclusion: Our results suggest that disclosed LGBT status is not directly associated with an increase suicide death risk, however, other factors (e.g., depression and anxiety caused by stigma) are associated with suicide death risks
Dynamic summarization of bibliographic-based data
<p>Abstract</p> <p>Background</p> <p>Traditional information retrieval techniques typically return excessive output when directed at large bibliographic databases. Natural Language Processing applications strive to extract salient content from the excessive data. Semantic MEDLINE, a National Library of Medicine (NLM) natural language processing application, highlights relevant information in PubMed data. However, Semantic MEDLINE implements manually coded schemas, accommodating few information needs. Currently, there are only five such schemas, while many more would be needed to realistically accommodate all potential users. The aim of this project was to develop and evaluate a statistical algorithm that automatically identifies relevant bibliographic data; the new algorithm could be incorporated into a dynamic schema to accommodate various information needs in Semantic MEDLINE, and eliminate the need for multiple schemas.</p> <p>Methods</p> <p>We developed a flexible algorithm named Combo that combines three statistical metrics, the Kullback-Leibler Divergence (KLD), Riloff's RlogF metric (RlogF), and a new metric called PredScal, to automatically identify salient data in bibliographic text. We downloaded citations from a PubMed search query addressing the genetic etiology of bladder cancer. The citations were processed with SemRep, an NLM rule-based application that produces semantic predications. SemRep output was processed by Combo, in addition to the standard Semantic MEDLINE genetics schema and independently by the two individual KLD and RlogF metrics. We evaluated each summarization method using an existing reference standard within the task-based context of genetic database curation.</p> <p>Results</p> <p>Combo asserted 74 genetic entities implicated in bladder cancer development, whereas the traditional schema asserted 10 genetic entities; the KLD and RlogF metrics individually asserted 77 and 69 genetic entities, respectively. Combo achieved 61% recall and 81% precision, with an F-score of 0.69. The traditional schema achieved 23% recall and 100% precision, with an F-score of 0.37. The KLD metric achieved 61% recall, 70% precision, with an F-score of 0.65. The RlogF metric achieved 61% recall, 72% precision, with an F-score of 0.66.</p> <p>Conclusions</p> <p>Semantic MEDLINE summarization using the new Combo algorithm outperformed a conventional summarization schema in a genetic database curation task. It potentially could streamline information acquisition for other needs without having to hand-build multiple saliency schemas.</p
The James Webb Space Telescope Mission
Twenty-six years ago a small committee report, building on earlier studies,
expounded a compelling and poetic vision for the future of astronomy, calling
for an infrared-optimized space telescope with an aperture of at least .
With the support of their governments in the US, Europe, and Canada, 20,000
people realized that vision as the James Webb Space Telescope. A
generation of astronomers will celebrate their accomplishments for the life of
the mission, potentially as long as 20 years, and beyond. This report and the
scientific discoveries that follow are extended thank-you notes to the 20,000
team members. The telescope is working perfectly, with much better image
quality than expected. In this and accompanying papers, we give a brief
history, describe the observatory, outline its objectives and current observing
program, and discuss the inventions and people who made it possible. We cite
detailed reports on the design and the measured performance on orbit.Comment: Accepted by PASP for the special issue on The James Webb Space
Telescope Overview, 29 pages, 4 figure
Evaluating the Effects of SARS-CoV-2 Spike Mutation D614G on Transmissibility and Pathogenicity.
Global dispersal and increasing frequency of the SARS-CoV-2 spike protein variant D614G are suggestive of a selective advantage but may also be due to a random founder effect. We investigate the hypothesis for positive selection of spike D614G in the United Kingdom using more than 25,000 whole genome SARS-CoV-2 sequences. Despite the availability of a large dataset, well represented by both spike 614 variants, not all approaches showed a conclusive signal of positive selection. Population genetic analysis indicates that 614G increases in frequency relative to 614D in a manner consistent with a selective advantage. We do not find any indication that patients infected with the spike 614G variant have higher COVID-19 mortality or clinical severity, but 614G is associated with higher viral load and younger age of patients. Significant differences in growth and size of 614G phylogenetic clusters indicate a need for continued study of this variant
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.Peer reviewe