
    Experiments with document archive size detection

    The size of a document archive is an important parameter for resource selection in distributed information retrieval systems. In this paper, we present a method for automatically detecting the size (i.e. the number of documents) of a document archive when the archive itself does not provide this information. We also present a method for detecting incremental changes in archive size, which can help decide whether a resource description has become obsolete and needs to be regenerated. An experimental evaluation shows that both methods provide quite accurate information.

    Current status of the International Halley Watch infrared net archive

    The primary purposes of the International Halley Watch (IHW) have been to promote Halley observations, to coordinate and standardize the observing where useful, and to archive the results in a database readily accessible to cometary scientists. The intention of the IHW is to store the observations themselves, along with any information necessary to allow users to understand and use the data, but to exclude interpretations of these data. Each of the archives produced by the IHW will appear in two versions: a printed archive and a digital archive on CD-ROMs. The archive is expected to have a very long lifetime. The IHW has already produced an archive for P/Crommelin, consisting of one printed volume and two 1600 bpi tapes. The Halley archive will contain at least twenty gigabytes of information.

    Workshop Review: Timescapes Secondary Analysis Workshop

    The Timescapes Workshops were offered as three one-day events held around the UK for researchers and practitioners to learn about and interact with the Timescapes Archive. This archive forms an integral part of a five-year ESRC qualitative longitudinal study which explores and documents the changing nature of personal and family relationships. The workshop provided a forum in which to explore the purpose and value of archiving qualitative data sets for future (secondary) use. Issues of ownership and consent were central to many of the discussions that took place throughout the day. In addition, the practical ‘hands-on’ session with the archive raised issues about the skill of archiving for future use as well as the optimal functionality and usability of an archive for secondary analysis. The workshop was a useful addition to the training increasingly required by qualitative researchers, for whom archiving for secondary use is now an important consideration within the design and dissemination phases of research.

    Establishing a Central Archive for Transit Passenger Data

    This report describes the rationale, background, establishing organization, and future steps of CATPAD, the Central Archive for Transit Passenger Data. The Central Archive for Transit Passenger Data is a repository that collects, indexes, archives, and makes available online the transit survey instruments, data, and reports collected across the country. This resource is unique in its focus on the disaggregated information of individual transit users – information that is critical for a range of transportation planning analyses. In addition, where available, CATPAD contains aggregated information, such as station boardings and service and fare schedules, to provide key context for the disaggregate person-level data. The Central Archive for Transit Passenger Data seeks to overcome the current impediments to accessing transit survey data by providing a single, searchable, internet archive to store and disseminate this valuable information. The Central Archive for Transit Passenger Data explicitly aims to expand the public return on the considerable investment made to gather transit passenger data. The resource is designed from the start to serve the needs of a range of use cases from transportation planners and policy makers to researchers and community advocates. The goal of CATPAD is to make useful data available to inform transit decision making at all levels and to foster ongoing refinement of the nation’s transit network

    SOUSA: the Swift Optical/Ultraviolet Supernova Archive

    The Ultra-Violet Optical Telescope on the Swift spacecraft has observed hundreds of supernovae, covering all major types and most subtypes. Here we introduce the Swift Optical/Ultraviolet Supernova Archive (SOUSA), which will contain all of the supernova images and photometry. We describe the observation and reduction procedures and how they impact the final data. We show photometry from well-observed examples of most supernova classes, whose absolute magnitudes and colors may be used to infer supernova types in the absence of a spectrum. A full understanding of the variety within classes and a robust photometric separation of the groups require a larger sample, which will be provided by the final archive. The data from the existing Swift supernovae are also useful for planning future observations with Swift as well as future UV observatories. Comment: Accepted for publication in the UV issue of Astrophysics and Space Science. 10 pages, 6 figures. SOUSA is an archive in progress, with data being posted to the Swift SN website: http://swift.gsfc.nasa.gov/docs/swift/sne/swift_sn.htm

    Archiving scientific data

    We present an archiving technique for hierarchical data with key structure. Our approach is based on the notion of timestamps, whereby an element appearing in multiple versions of the database is stored only once, along with a compact description of the versions in which it appears. The basic idea of timestamping was discovered by Driscoll et al. in the context of persistent data structures, where one wishes to track the sequence of changes made to a data structure. We extend this idea to develop an archiving tool for XML data that is capable of providing meaningful change descriptions and can also efficiently support a variety of basic functions concerning the evolution of data, such as retrieval of any specific version from the archive and querying the temporal history of any element. This is in contrast to diff-based approaches, where such operations may require undoing a large number of changes or significant reasoning with the deltas. Surprisingly, our archiving technique does not incur any significant space overhead when contrasted with other approaches. Our experimental results support this and also show that the compacted archive file interacts well with other compression techniques. Finally, another useful property of our approach is that the resulting archive is also in XML and hence can directly leverage existing XML tools.
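
The timestamping idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' tool: it assumes flat keyed elements with hashable values rather than hierarchical XML, it records versions as plain lists rather than a compact interval encoding, and the class and method names (`TimestampArchive`, `add_version`, `retrieve`, `history`) are hypothetical.

```python
from collections import defaultdict


class TimestampArchive:
    """Stores each (key, value) pair once, tagged with the versions containing it.

    Illustrative assumption: elements are flat key -> value pairs with
    hashable values; a real archive would handle nested, keyed XML.
    """

    def __init__(self):
        # key -> value -> list of version numbers in which that value appears
        self._store = defaultdict(dict)
        self.latest = 0

    def add_version(self, snapshot):
        """Merge a snapshot (dict of key -> value) into the archive as the next version."""
        self.latest += 1
        for key, value in snapshot.items():
            self._store[key].setdefault(value, []).append(self.latest)
        return self.latest

    def retrieve(self, version):
        """Rebuild one version by filtering on timestamps; no deltas are replayed."""
        return {
            key: value
            for key, values in self._store.items()
            for value, versions in values.items()
            if version in versions
        }

    def history(self, key):
        """Return (value, versions) pairs for one element, ordered by first appearance."""
        entries = self._store.get(key, {})
        return sorted(entries.items(), key=lambda item: item[1][0])


archive = TimestampArchive()
archive.add_version({"gene/1": "A", "gene/2": "B"})
archive.add_version({"gene/1": "A", "gene/2": "C"})
archive.add_version({"gene/1": "A"})

print(archive.retrieve(2))
print(archive.history("gene/2"))
```

In this sketch the unchanged element `gene/1` is stored once with the version list `[1, 2, 3]`, whereas a diff-based archive would have to apply or undo deltas to reconstruct version 2.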

    Language ecology and photographic sound in the McWorld

    The unique sounds of the world’s small-scale languages are being extinguished at an alarming rate. This article explores links between acoustic ecology and language ecology and outlines an approach to the creation of archive material as both a source for and a useful by-product of sound art practice and research. Drawing on my work with endangered click languages in the Kalahari Desert, it considers the boundaries between language and music and discusses the use of flat speaker technology to explore new relations between sound and image, portrait and soundscape in a cross-cultural context.

    MEDIN Feasibility Study : archiving oil and gas industry site survey data

    This report was commissioned by the Marine Environmental Data and Information Network (MEDIN) to investigate the feasibility of collecting oil and gas industry site surveys conducted on the UKCS (UK Continental Shelf) for archive in the MEDIN DAC (Data Archive Centre) network. The archive of three principal data types is explored: information about legacy site surveys; catalogues of information about data products associated with site surveys; and actual site survey data, which may include a survey report and enclosures and/or a selection of data, e.g. side-scan or multibeam, sample descriptions and seismic profiles. The merits of collecting these data types are explored alongside the cost implications, from both an oil and gas industry contractor’s and a marine geoscientist’s perspective, thereby enabling MEDIN to better understand and decide which data to concentrate on. The principles and proposed procedures for collecting these data types are outlined; however, the practical details will require agreement should any decision be made to proceed. At that stage a further, thoroughly detailed scope will be required in order to formulate procedures, quantify numbers, define activities, identify resources and plan timescales. The time period for the collection of legacy site surveys will require consideration, i.e. how far back it is feasible to collect this information, and whether requests should be phased to include surveys acquired within predetermined time intervals. The size of the actual site survey data holdings, the storage capacity required to archive them and the amount of work involved in processing the data into usable and useful formats will require review. Some of these issues may need to be considered on a case-by-case basis. The procedures themselves will require regular review depending on the response, i.e. the volume, types and condition of data received.