PubMed and beyond: a survey of web tools for searching biomedical literature
The past decade has witnessed modern advances in high-throughput technology and rapid growth of research capacity in producing large-scale biological data, both of which were concomitant with an exponential growth of biomedical literature. This wealth of scholarly knowledge is of significant importance for researchers in making scientific discoveries and for healthcare professionals in managing health-related matters. However, the acquisition of such information is becoming increasingly difficult due to its large volume and rapid growth. In response, the National Center for Biotechnology Information (NCBI) is continuously making changes to its PubMed Web service for improvement. Meanwhile, different entities have devoted themselves to developing Web tools that help users quickly and efficiently search and retrieve relevant publications. These practices, together with maturity in the field of text mining, have led to an increase in the number and quality of Web tools that provide literature search services comparable to PubMed. In this study, we review 28 such tools, highlight their respective innovations, compare them to the PubMed system and to one another, and discuss directions for future development. Furthermore, we have built a website dedicated to tracking existing systems and future advances in the field of biomedical literature search. Taken together, our work serves information seekers in choosing tools for their needs, and service providers and developers in keeping current in the field.
Analysis of Abstractive and Extractive Summarization Methods
This paper explains the existing approaches employed for (automatic) text summarization. Text summarization is part of the natural language processing (NLP) field and is applied to a source document to produce a compact version that preserves its aggregate meaning and key concepts. On a broader scale, approaches for text-based summarization are categorized into two groups: abstractive and extractive. In abstractive summarization, the main contents of the input text are paraphrased, possibly using vocabulary that is not present in the source document, while in extractive summarization, the output summary is a subset of the input text and is generated by using sentence-ranking techniques. In this paper, the main ideas behind the existing methods used for abstractive and extractive summarization are discussed broadly. A comparative study of these methods is also highlighted.
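The extractive, sentence-ranking approach described above can be sketched with a simple frequency-based ranker. This is a minimal illustration only; production extractive systems typically use TF-IDF weights, sentence position, and other features.

```python
# Minimal extractive summarizer: score each sentence by the mean corpus
# frequency of its words, then keep the top-k sentences in original order.
import re
from collections import Counter

def extractive_summary(text, k=1):
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z]+", text.lower()))
    def score(s):
        toks = re.findall(r"[a-z]+", s.lower())
        return sum(freq[t] for t in toks) / (len(toks) or 1)
    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]), reverse=True)[:k]
    return " ".join(sentences[i] for i in sorted(top))

doc = "Cats are great. Cats chase mice. Dogs bark loudly."
summary = extractive_summary(doc, 2)  # drops the off-topic "Dogs" sentence
```

Because the summary is a subset of the input sentences, no new vocabulary is introduced, which is exactly the extractive/abstractive distinction drawn above.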
Query Based Sampling and Multi-Layered Semantic Analysis to find Robust Network of Drug-Disease Associations
This thesis presents the design and implementation of a system, called DDNet, to discover semantically related networks of drug-disease associations from the medical literature. A fully functional DDNet can be transformative in the identification of drug targets and may open new avenues for drug repositioning in clinical and translational research. In particular, a Local Latent Semantic Analysis (LLSA) was introduced to implement a system that is efficient, scalable and relatively free from systemic bias. In addition, query-based sampling was introduced to find representative samples from the ocean of data, building a model that is relatively free from the garbage-in, garbage-out syndrome. Also, the concept of mapping ontologies was adopted to determine the relevant results, and reverse ontology mapping was used to create a network of associations. In addition, a web service application was developed to query the system and visualize the computed network of associations in a form that is easy to interact with. A pilot study was conducted to evaluate the performance of the system using both subjective and objective measures. PharmGKB was used as the gold standard, and the precision-recall (PR) curve was obtained from a large number of queries at different recall points. Empirical analyses suggest that DDNet is robust, relatively stable and scalable compared with the traditional global LSA model.
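The LSA component underlying DDNet can be illustrated with a toy term-document decomposition. The sketch below uses power iteration to recover the leading singular direction of a small term-by-document matrix; the matrix, terms, and document labels are illustrative, and the thesis's LLSA additionally localizes the analysis to query-sampled subsets.

```python
# Toy LSA: find the leading right-singular vector of a term-document
# matrix A via power iteration on A^T A. Documents that load on the
# same latent direction are treated as semantically related.
import math

def top_component(A, iters=200):
    n = len(A[0])                      # number of documents (columns)
    v = [1.0] * n
    for _ in range(iters):
        # one application of A^T A, then renormalize
        w = [sum(row[j] * v[j] for j in range(n)) for row in A]
        v = [sum(A[i][j] * w[i] for i in range(len(A))) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        v = [x / norm for x in v]
    return v

# Rows = terms, columns = documents; doc0 and doc1 share drug-related
# terms, doc2 is about something else entirely (illustrative data).
A = [[2, 1, 0],   # "aspirin"
     [1, 2, 0],   # "headache"
     [0, 0, 2]]   # "protein"
v = top_component(A)   # doc0 and doc1 load equally; doc2 near zero
```

In the latent space, doc0 and doc1 project onto the same direction while doc2 does not, which is the kind of association signal DDNet aggregates into a network.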
Towards Automatic Extraction of Social Networks of Organizations in PubMed Abstracts
Social Network Analysis (SNA) of organizations can attract great interest from government agencies and scientists for its ability to boost translational research and accelerate the process of converting research to care. For SNA of a particular disease area, we need to identify the key research groups in that area by mining the affiliation information from PubMed. This involves not only recognizing the organization names in the affiliation string, but also resolving ambiguities to associate each article with a unique organization. We present here a process of normalization that involves clustering based on local sequence alignment metrics and local learning based on finding connected components. We demonstrate the application of the method by analyzing organizations involved in angiogenesis treatment, and demonstrate the utility of the results for researchers in the pharmaceutical and biotechnology industries or national funding agencies.
Comment: This paper has been withdrawn; First International Workshop on Graph Techniques for Biomedical Networks in Conjunction with IEEE International Conference on Bioinformatics and Biomedicine, Washington D.C., USA, Nov. 1-4, 2009; http://www.public.asu.edu/~sjonnal3/home/papers/IEEE%20BIBM%202009.pd
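The normalization step, pairwise string similarity followed by connected-component grouping, can be sketched as follows. Here `difflib`'s ratio stands in for the paper's local sequence alignment metric, union-find collects the components, and the names and threshold are illustrative.

```python
# Cluster affiliation strings: link pairs whose similarity exceeds a
# threshold, then take each connected component as one organization.
from difflib import SequenceMatcher

def normalize(names, threshold=0.8):
    parent = list(range(len(names)))
    def find(i):                       # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            sim = SequenceMatcher(None, names[i].lower(),
                                  names[j].lower()).ratio()
            if sim >= threshold:
                parent[find(i)] = find(j)   # union: same organization
    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return list(groups.values())

names = ["Arizona State University", "Arizona State Univ.", "Mayo Clinic"]
clusters = normalize(names)   # the two ASU variants merge into one group
```

Linking by threshold and then taking connected components is what makes the clustering transitive: two spellings never compared directly still merge if a third variant links them both.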
Using computational modeling to assess the impact of clinical decision support on cancer screening improvement strategies within the community health centers
Our conceptual model demonstrates our goal to investigate the impact of clinical decision support (CDS) utilization on cancer screening improvement strategies in the community health care (CHC) setting. We employed a dual modeling technique using both statistical and computational modeling to evaluate impact. Our statistical model used Spearman's Rho test to evaluate the strength of the relationship between our proximal outcome measures (CDS utilization) and our distal outcome measure (provider self-reported cancer screening improvement). Our computational model relied on network evolution theory and made use of a tool called Construct-TM to model the use of CDS as measured by the rate of organizational learning. We made use of previously collected survey data from the community health centers' Cancer Health Disparities Collaborative (HDCC). Our intent is to demonstrate the added value gained by using a computational modeling tool in conjunction with a statistical analysis when evaluating the impact of a health information technology, in the form of CDS, on health care quality process outcomes such as facility-level screening improvement. Significant simulated disparities in organizational learning over time were observed between community health centers beginning the simulation with high and low clinical decision support capability.
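The statistical arm's Spearman's Rho can be computed from ranks alone. A minimal tie-free implementation (variable names are illustrative, not the study's actual measures):

```python
# Spearman's rho via the rank-difference formula:
# rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), assuming no tied values.
def spearman_rho(x, y):
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Perfectly monotone relationships give rho = +1 or -1.
rho_up = spearman_rho([1, 2, 3, 4], [10, 20, 30, 40])
rho_down = spearman_rho([1, 2, 3, 4], [40, 30, 20, 10])
```

Because the statistic depends only on ranks, it captures monotone association between the proximal and distal measures without assuming linearity; with tied values one would instead average ranks and use the Pearson correlation of the ranks.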
eFindSite: Improved prediction of ligand binding sites in protein models using meta-threading, machine learning and auxiliary ligands
Molecular structures and functions of the majority of proteins across different species are yet to be identified. Much-needed functional annotation of these gene products often benefits from knowledge of protein-ligand interactions. Towards this goal, we developed eFindSite, an improved version of FINDSITE, designed to more efficiently identify ligand binding sites and residues using only weakly homologous templates. It employs a collection of effective algorithms, including highly sensitive meta-threading approaches, improved clustering techniques, advanced machine learning methods and reliable confidence estimation systems. Depending on the quality of target protein structures, eFindSite outperforms geometric pocket detection algorithms by 15-40% in binding site detection and by 5-35% in binding residue prediction. Moreover, compared to FINDSITE, it identifies 14% more binding residues in the most difficult cases. When multiple putative binding pockets are identified, the ranking accuracy is 75-78%, which can be further improved by 3-4% by including auxiliary information on binding ligands extracted from the biomedical literature. As a first across-genome application, we describe structure modeling and binding site prediction for the entire proteome of Escherichia coli. Carefully calibrated confidence estimates strongly indicate that highly reliable ligand binding predictions are made for the majority of gene products; thus, eFindSite holds significant promise for large-scale genome annotation and drug development projects. eFindSite is freely available to the academic community at http://www.brylinski.org/efindsite. © 2013 Springer Science+Business Media Dordrecht
Utilization of global ranking information in graph-based biomedical literature clustering
Paper presented at the 9th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2007, Regensburg, Germany.
In this paper, we explore how a global ranking method, in conjunction with a local density method, helps identify meaningful term clusters from an ontology-enriched graph representation of a biomedical literature corpus. One big problem with document clustering is how to discount the effects of class-unspecific general terms and strengthen the effects of class-specific core terms. We claim that running a global ranking method on a well-constructed term graph can identify class-specific core terms. In detail, PageRank and HITS are applied to a directed abstract-title graph to target class-specific core terms. Then k dense term clusters (subgraphs) are identified from these terms. Finally, each document is assigned to the closest term cluster. A series of experiments was conducted on a document corpus collected from PubMed. Experimental results show that our approach is very effective at identifying class-specific core terms and thus helps document clustering.
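The global-ranking step can be sketched with PageRank on a toy directed term graph. The graph, terms, and damping factor below are illustrative; the paper's abstract-title graph is ontology-enriched and far larger.

```python
# Plain PageRank over a directed term graph. Class-specific core terms
# accumulate score from their in-links; general terms with no
# class-specific in-links keep only the teleport share.
def pagerank(graph, d=0.85, iters=100):
    nodes = list(graph)
    n = len(nodes)
    pr = {node: 1.0 / n for node in nodes}
    for _ in range(iters):
        new = {node: (1 - d) / n for node in nodes}   # teleport share
        for node, outs in graph.items():
            if outs:
                share = pr[node] / len(outs)
                for m in outs:
                    new[m] += d * share
            else:                                     # dangling node:
                for m in nodes:                       # spread uniformly
                    new[m] += d * pr[node] / n
        pr = new
    return pr

graph = {
    "angiogenesis": ["tumor", "vegf"],
    "vegf": ["angiogenesis"],
    "tumor": ["angiogenesis"],
    "the": [],          # general stop-word-like term, no in-links
}
pr = pagerank(graph)    # "angiogenesis" outranks the general term
```

This is the discounting effect the paper relies on: a well-constructed graph routes rank mass toward class-specific core terms, so thresholding on the score separates them from class-unspecific general terms before the dense-cluster step.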