Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
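Two of the baseline methods named in the abstract, TF-IDF and Okapi BM25, can be illustrated with a minimal sketch. The tokenizer, the parameter values `k1=1.5` and `b=0.75`, the particular IDF variants and the helper names are assumptions chosen for illustration; they are not taken from the paper. The `jaccard` helper shows one simple way to quantify how much two methods' top-ranked collections overlap.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; real systems use richer preprocessing).
    return text.lower().split()


def document_frequencies(tokenized_docs):
    # Number of documents in which each term occurs at least once.
    return Counter(t for toks in tokenized_docs for t in set(toks))


def tfidf_scores(query, docs):
    # Unnormalized score: sum over query terms of tf * log(N / df).
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = document_frequencies(tokenized)
    terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        scores.append(sum(tf[t] * math.log(n / df[t]) for t in terms if df[t] > 0))
    return scores


def bm25_scores(query, docs, k1=1.5, b=0.75):
    # Okapi BM25 with a commonly used non-negative IDF variant.
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avgdl = sum(len(toks) for toks in tokenized) / n
    df = document_frequencies(tokenized)
    terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for t in terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[t] * (k1 + 1) / (tf[t] + length_norm)
        scores.append(score)
    return scores


def top_k(scores, k):
    # Indices of the k highest-scoring documents.
    return set(sorted(range(len(scores)), key=lambda i: -scores[i])[:k])


def jaccard(a, b):
    # Overlap between two recommendation sets (1.0 means identical).
    return len(a & b) / len(a | b) if (a | b) else 1.0
```

A low Jaccard overlap between `top_k(tfidf_scores(...), k)` and `top_k(bm25_scores(...), k)` for the same seed text would correspond to the "distinct collections" observation, although the paper's own comparison procedure may differ.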
A Swiss cheese error detection method for real-time EPID-based quality assurance and error prevention.
PURPOSE
To develop a robust and efficient process that detects relevant dose errors (dose errors of ≥5%) in external beam radiation therapy and directly indicates the origin of the error. The process is illustrated in the context of electronic portal imaging device (EPID)-based angle-resolved volumetric modulated arc therapy (VMAT) quality assurance (QA), particularly as would be implemented in a real-time monitoring program.
METHODS
A Swiss cheese error detection (SCED) method was created as a paradigm for cine EPID-based during-treatment QA. For VMAT, the method compares a treatment-plan-based reference set of EPID images with images acquired over each 2° gantry angle interval. The process utilizes a sequence of independent, consecutively executed error detection tests: an aperture check that verifies in-field radiation delivery and ensures no out-of-field radiation; output normalization checks at two different stages; a global image alignment check to examine whether rotation, scaling and translation are within tolerances; and a pixel intensity check containing the standard gamma evaluation (3%, 3 mm) and pixel intensity deviation checks including and excluding high dose gradient regions. Tolerances for each check were determined. To test the SCED method, 12 different types of errors were selected to modify the original plan. A series of angle-resolved predicted EPID images was artificially generated for each test case, resulting in a sequence of pre-calculated frames for each modified treatment plan. The SCED method was applied multiple times for each test case to assess its ability to detect the introduced plan variations. To compare the performance of the SCED process with that of a standard gamma analysis, both error detection methods were applied to the generated test cases with realistic noise variations.
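The idea of consecutively executed checks, ending in a gamma evaluation, can be sketched in a simplified one-dimensional form. This is an illustration under stated assumptions, not the authors' implementation: the profiles are 1D rather than EPID images, the gamma index uses global normalization to the reference maximum, and the check names, the 5% output tolerance and the 95% gamma pass-rate threshold in `EXAMPLE_CHECKS` are hypothetical values.

```python
import math


def gamma_1d(ref, ev, spacing_mm, dose_tol=0.03, dta_mm=3.0):
    # Gamma index at each reference point for two 1D profiles on the same grid.
    # Dose differences are normalized to dose_tol * max(ref) (global gamma, an assumption).
    norm = dose_tol * max(ref)
    gammas = []
    for i, d_ref in enumerate(ref):
        best = math.inf
        for j, d_ev in enumerate(ev):
            dist = (j - i) * spacing_mm
            best = min(best, math.hypot(dist / dta_mm, (d_ev - d_ref) / norm))
        gammas.append(best)
    return gammas


def pass_rate(gammas):
    # Fraction of points with gamma <= 1.
    return sum(g <= 1.0 for g in gammas) / len(gammas)


def run_checks(frame, checks):
    # Run the checks in order and return the name of the first one that fails.
    # The failing check's name hints at the error source; None means all passed.
    for name, check in checks:
        if not check(frame):
            return name
    return None


# Hypothetical checks and tolerances, for illustration only.
EXAMPLE_CHECKS = [
    ("output", lambda f: abs(sum(f["meas"]) / sum(f["ref"]) - 1.0) <= 0.05),
    ("gamma", lambda f: pass_rate(gamma_1d(f["ref"], f["meas"], f["spacing_mm"])) >= 0.95),
]
```

Calling `run_checks({"ref": ref, "meas": meas, "spacing_mm": 1.0}, EXAMPLE_CHECKS)` for each acquired frame returns either `None` or the label of the first failed layer, which mirrors how the type of check that fires can indicate where an error originated.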
RESULTS
Averaged over ten test runs, 95.1% of all plan variations that resulted in relevant patient dose errors were detected within 2° and 100% within 14° (<4% of patient dose delivery). Including cases that led to slightly modified but clinically equivalent plans, 89.1% were detected by the SCED method within 2°. Based on the type of check that detected the error, determination of error sources was achieved. With noise ranging from no random noise to four times the established noise value, the averaged relevant dose error detection rate of the SCED method was between 94.0% and 95.8% and that of gamma between 82.8% and 89.8%.
CONCLUSIONS
An EPID-frame-based error detection process for VMAT deliveries was successfully designed and tested via simulations. The SCED method was inspected for robustness with realistic noise variations, demonstrating that it has the potential to detect a large majority of relevant dose errors. Compared to a typical (3%, 3 mm) gamma analysis, the SCED method produced a higher detection rate for all introduced dose errors, identified errors in an earlier stage, displayed a higher robustness to noise variations and indicated the error source.
Monte Carlo–based dosimetry of head-and-neck patients treated with SIB-IMRT
Purpose: To evaluate the accuracy of previously reported superposition/convolution (SC) dosimetric results by comparing with Monte Carlo (MC) dose calculations for head-and-neck intensity-modulated radiation therapy (IMRT) patients treated with the simultaneous integrated boost technique.
Methods and Materials: Thirty-one plans from 24 patients previously treated on a phase I/II head-and-neck squamous cell carcinoma simultaneous integrated boost IMRT protocol were used. Clinical dose distributions, computed with an SC algorithm, were recomputed using an EGS4-based MC algorithm. Phantom-based dosimetry quantified the fluence prediction accuracy of each algorithm. Dose–volume indices were used to compare patient dose distributions.
Results and Discussion: The MC algorithm predicts flat-phantom measurements better than the SC algorithm. Average patient dose indices agreed within 2.5% of the local dose for targets; 5.0% for parotids; and 1.9% for cord and brainstem. However, only 1 of 31 plans agreed within 3% for all indices; 4 of 31 agreed within 5%. In terms of the prescription dose, 4 of 31 plans agreed within 3% for all indices, whereas 28 of 31 agreed within 5%.
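The difference between agreement "of the local dose" and agreement "in terms of the prescription dose" comes down to the denominator used for the percentage. A minimal sketch, assuming that the local dose is the MC value of each index and using made-up dose values rather than data from the study:

```python
def percent_diff(sc_dose, mc_dose, reference_dose):
    # Signed SC-minus-MC difference as a percentage of the chosen reference dose.
    return 100.0 * (sc_dose - mc_dose) / reference_dose


def plan_agrees(indices, tol_pct, prescription=None):
    # indices: list of (sc_dose, mc_dose) pairs, one per dose-volume index.
    # With prescription=None, each difference is normalized to the local (MC) dose,
    # which is an assumption about what "local dose" means here.
    for sc, mc in indices:
        ref = prescription if prescription is not None else mc
        if abs(percent_diff(sc, mc, ref)) > tol_pct:
            return False
    return True


# Hypothetical example: a low-dose organ index in a plan prescribed to 70 Gy.
example_indices = [(26.0, 24.5)]
local_ok = plan_agrees(example_indices, 5.0)
prescription_ok = plan_agrees(example_indices, 5.0, prescription=70.0)
```

In this made-up case the 1.5 Gy difference is about 6.1% of the local dose but about 2.1% of the prescription dose, so the same plan fails a 5% local criterion while passing the prescription-based one. This is consistent with more plans agreeing under prescription-dose normalization.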
Conclusions: Average SC-computed doses agreed with MC results in the patient geometry; however, deviations >5% were common. The fluence modulation prediction is likely the major source of the dose discrepancy. The observed dose deviations can impact dose escalation protocols, because they would result in shifting patients to higher dose levels.