25 research outputs found

    Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols

    Background: The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text Retrieval Evaluation Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications.
    Results: We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for queries expected to have potentially many relevant documents per query (high-recall searches). Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Track and observe significant correlations between performance rankings generated by our approach and TREC. Spearman's correlation coefficients in the range of 0.79–0.92 are observed comparing bpref measured with NT Evaluation or with TREC evaluations. For comparison, coefficients in the range 0.86–0.94 can be observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and the data fusion evaluation protocols introduced recently.
    Conclusion: Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine if NT Evaluation or variants of these protocols can fully substitute for human evaluations.

    Rational design of an orthogonal tryptophanyl nonsense suppressor tRNA

    While a number of aminoacyl tRNA synthetase (aaRS):tRNA pairs have been engineered to alter or expand the genetic code, only the Methanococcus jannaschii tyrosyl tRNA synthetase and tRNA have been used extensively in bacteria, limiting the types and numbers of unnatural amino acids that can be utilized at any one time to expand the genetic code. In order to expand the number and type of aaRS/tRNA pairs available for engineering bacterial genetic codes, we have developed an orthogonal tryptophanyl tRNA synthetase and tRNA pair, derived from Saccharomyces cerevisiae. In the process of developing an amber suppressor tRNA, we discovered that the Escherichia coli lysyl tRNA synthetase was responsible for misacylating the initial amber suppressor version of the yeast tryptophanyl tRNA. It was discovered that modification of the G:C content of the anticodon stem and therefore reducing the structural flexibility of this stem eliminated misacylation by the E. coli lysyl tRNA synthetase, and led to the development of a functional, orthogonal suppressor pair that should prove useful for the incorporation of bulky, unnatural amino acids into the genetic code. Our results provide insight into the role of tRNA flexibility in molecular recognition and the engineering and evolution of tRNA specificity

    Information retrieval: implementing and evaluating search engines

    No full text
    Information retrieval is the foundation for modern search engines. This textbook offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus, a multi-user open-source information retrieval system developed by one of the authors and available online, provides model implementations and a basis for student work. The modular structure of the book allows instructors to use it in a variety of graduate-level courses, including courses taught from a database systems implementation perspective, traditional information retrieval courses with a focus on IR theory, and courses covering the basics of Web retrieval. Additionally, professionals in computer science, computer engineering, and software engineering will find Information Retrieval a valuable reference. After an introduction to the basics of information retrieval, the text covers three major topic areas (indexing, retrieval, and evaluation) in self-contained parts. The final part of the book draws on and extends the general material in the earlier parts, treating specific application areas, including parallel search engines, link analysis, crawling, and information retrieval over collections of XML documents. End-of-chapter references point to further reading; end-of-chapter exercises range from pencil and paper problems to substantial programming projects

    Document Prioritization for Scalable Query Processing

    No full text

    Faster and smaller inverted indices with treaps

    No full text