TREC Genomics Track Overview

Abstract

The first year of the TREC Genomics Track featured two tasks: ad hoc retrieval and information extraction. Both tasks centered around the Gene Reference into Function (GeneRIF) resource of the National Library of Medicine, which was used both as pseudorelevance judgments for ad hoc document retrieval and as target text for information extraction. The track attracted 29 groups that participated in one or both tasks.

The growing amount of scientific discovery in genomics and related biomedical disciplines has led to a corresponding growth in the amount of on-line data and information. A growing challenge for biomedical researchers is how to access and manage this ever-increasing quantity of information. This situation presents opportunities and challenges for the information retrieval (IR) field. IR has historically focused on document retrieval, but the field has expanded in recent years with the growth of new information needs (e.g., question-answering, cross-lingual), data types (e.g., video) and platforms (e.g., the Web). This paper describes the events leading up to the first year of the TREC Genomics Track, the first year’s results, and future directions for subsequent years.

Genomics and Information Resources

The field of genomics is concerned with the genome, which is usually defined as the genetic material of living organisms. Research in the field focuses on the central dogma of biology: deoxyribonucleic acid (DNA) is transcribed into ribonucleic acid (RNA), which serves to translate the nucleotide sequences of DNA into proteins. Proteins are responsible for functions in living organisms, and the collection of all proteins in an organism is increasingly called the proteome. With the advent of new technologies for sequencing the genome and proteome, along with other tools for identifying the expression o
