9 research outputs found
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed
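To illustrate one of the baseline methods named in the abstract above, the following is a minimal sketch of Okapi BM25 ranking of candidate articles against a seed article. It is not the consortium's implementation: the whitespace tokenizer, the function name `bm25_scores`, the parameter values `k1=1.5` and `b=0.75`, and the non-negative IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenizer, for illustration only.
    return text.lower().split()


def bm25_scores(seed, candidates, k1=1.5, b=0.75):
    """Score each candidate text against the seed text with Okapi BM25.

    k1 and b are commonly used defaults, not values taken from the abstract.
    """
    docs = [tokenize(c) for c in candidates]
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: number of candidates containing each term.
    doc_freq = Counter()
    for d in docs:
        doc_freq.update(set(d))

    seed_terms = set(tokenize(seed))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in seed_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant (an assumption; other variants exist).
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical titles, used only to exercise the function.
candidates = [
    "protein solubility microfluidic screening",
    "antibody nonspecific binding in solution",
    "organic plant breeding interview",
]
scores = bm25_scores("microfluidic protein solubility", candidates)
ranked = sorted(zip(scores, candidates), reverse=True)
```

Candidates that share no terms with the seed score zero, which is one reason purely lexical baselines can each miss different relevant articles.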
Nerd of the niche
Interview with organic plant breeder Anders Borgen about the work and the sector.
Nonspecificity fingerprints for clinical-stage antibodies in solution.
Monoclonal antibodies (mAbs) have successfully been developed for the treatment of a wide range of diseases. The clinical success of mAbs does not rely solely on optimal potency and safety but also requires good biophysical properties to ensure a high developability potential. In particular, nonspecific interactions are a key developability parameter to monitor during discovery and development. Despite an increased focus on the detection of nonspecific interactions, their underlying physicochemical origins remain poorly understood. Here, we employ solution-based microfluidic technologies to characterize a set of clinical-stage mAbs and their interactions with commonly used nonspecificity ligands to generate nonspecificity fingerprints, providing quantitative data on the underlying physical chemistry. Furthermore, the solution-based analysis enables us to measure binding affinities directly, and we evaluate the contribution of avidity in nonspecific binding by mAbs. We find that avidity can increase the apparent affinity by two orders of magnitude. Notably, we find that a subset of these highly developed mAbs show nonspecific electrostatic interactions, even at physiological pH and ionic strength, and that they can form microscale particles with charge-complementary polymers. The mAb constructs flagged here for nonspecificity are among the worst performers in independent reports of surface and column-based screens. The solution measurements improve on the state of the art by providing a stand-alone result for individual mAbs without the need to benchmark against cohort data. Based on our findings, we propose a quantitative solution-based nonspecificity score, which can be integrated in the development workflow for biological therapeutics and more widely in protein engineering.
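To give a sense of what a two-orders-of-magnitude gain in apparent affinity means in practice, here is a small worked sketch. It assumes a simple 1:1 equilibrium binding model with the ligand in excess; the dissociation constant and the ligand concentration are hypothetical numbers, not measurements from the study.

```python
def fraction_bound(ligand_conc, k_d):
    """Equilibrium occupancy for simple 1:1 binding, ligand in excess (assumed model)."""
    return ligand_conc / (ligand_conc + k_d)


# Hypothetical values for illustration only (molar units).
kd_monovalent = 1e-6                          # assumed monovalent K_D of 1 uM
avidity_gain = 100                            # two orders of magnitude, as in the abstract
kd_apparent = kd_monovalent / avidity_gain    # apparent K_D of 10 nM
ligand = 1e-7                                 # assumed ligand concentration of 100 nM

occupancy_monovalent = fraction_bound(ligand, kd_monovalent)  # about 9% bound
occupancy_avid = fraction_bound(ligand, kd_apparent)          # about 91% bound
```

Under these assumptions, the same ligand concentration moves from mostly unbound to mostly bound, which is why avidity matters when interpreting nonspecific binding.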
Multidimensional Protein Solubility Optimization with an Ultrahigh-Throughput Microfluidic Platform.
Protein-based biologics are highly suitable for drug development as they exhibit low toxicity and high specificity for their targets. However, for therapeutic applications, biologics must often be formulated to elevated concentrations, making insufficient solubility a critical bottleneck in the drug development pipeline. Here, we report an ultrahigh-throughput microfluidic platform for protein solubility screening. In comparison with previous methods, this microfluidic platform can make, incubate, and measure samples in a few minutes, uses just 20 μg of protein (>10-fold improvement), and yields 10,000 data points (1000-fold improvement). This allows quantitative comparison of formulation excipients, such as sodium chloride, polysorbate, histidine, arginine, and sucrose. Additionally, we can measure how solubility is affected by the combinatorial effect of multiple additives, find a suitable pH for the formulation, and measure the impact of mutations on solubility, thus enabling the screening of large libraries. By reducing material and time costs, this approach makes detailed multidimensional solubility optimization experiments possible, streamlining drug development and increasing our understanding of biotherapeutic solubility and the effects of excipients.
Funders: Novo Nordisk; Frances and Augustus Newman Foundation; China Scholarship Council; University of Cambridge; AstraZeneca; the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant MicroREvolution; a Royal Society University Research Fellowship
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical science. © The Author(s) 2019. Published by Oxford University Press