Interim research assessment 2003-2005 - Computer Science
This report primarily serves as a source of information for the 2007 Interim Research Assessment Committee for Computer Science at the three technical universities in the Netherlands. The report also provides information for others interested in our research activities.
An investigation of computerized information storage and retrieval methods, in a film library organized according to the universal decimal classification
The operation of the BBC Film Library was studied with the intention of defining those areas likely to benefit from computerization. The state of the art of computerized information retrieval was assessed by means of the literature, and those techniques likely to be of use at the Film Library were isolated.
Computer programs were written to provide an information storage and retrieval system paralleling the manual system currently used at the Film Library, organized according to the Universal Decimal Classification (UDC). These programs were operated by the film librarians in situ.
A computerized system able to "learn" from enquiries was built and tested, and document clustering was also investigated as a method of subject classification.
A modular approach to retrieval system design was developed within the framework of a relational database system, so that the various retrieval methods examined in the course of the study could be combined into one concerted retrieval system.
Analytical study and computational modeling of statistical methods for data mining
Today, there is a tremendous increase in the information available in electronic form, and it grows massively day by day. This leaves ample opportunity for research into retrieving knowledge from the data contained in this information. Data mining and app
Information resources management, 1984-1989: A bibliography with indexes
This bibliography contains 768 annotated references to reports and journal articles entered into the NASA scientific and technical information database from 1984 to 1989.
Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy
The proceedings of the MAVEBA Workshop, which is held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, and biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and classification of vocal pathologies. The Workshop is sponsored by Ente Cassa Risparmio di Firenze, COST Action 2103, the Biomedical Signal Processing and Control journal (Elsevier), and the IEEE Biomedical Engineering Soc. Special issues of international journals collecting selected papers from the conference have been, and will be, published.
Temporal search in web archives
Web archives include both archives of contents originally published on the Web (e.g., the Internet Archive) and archives of contents published long ago that are now accessible on the Web (e.g., the archive of The Times). Thanks to the increased awareness that web-born contents are worth preserving and to improved digitization techniques, web archives have grown in number and size. To unfold their full potential, search techniques are needed that consider their inherent special characteristics. This work addresses three important problems toward this objective and makes the following contributions:
- We present the Time-Travel Inverted indeX (TTIX) as an efficient solution to time-travel text search in web archives, allowing users to search only the parts of the web archive that existed at a user's time of interest.
- To counter negative effects that terminology evolution has on the quality of search results in web archives, we propose a novel query-reformulation technique, so that old but highly relevant documents are retrieved in response to today's queries.
- For temporal information needs, for which the user is best satisfied by documents that refer to particular times, we describe a retrieval model that integrates temporal expressions (e.g., "in the 1990s") seamlessly into a language modelling approach.
Experiments for each of the proposed methods show their efficiency and effectiveness, respectively, and demonstrate the viability of our approach to search in web archives.
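To make the idea of time-travel text search concrete, the following is a minimal sketch, not the TTIX described in the abstract. It assumes that each posting of an inverted index carries the validity interval of the document version it came from, and that a query is answered by keeping only postings valid at the user's time of interest. The names `Posting`, `TimeTravelIndex`, `add_version`, and `search` are hypothetical, and the linear scan over postings ignores the efficiency concerns the work addresses.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    """One occurrence of a term in one version of a document (hypothetical layout)."""
    doc_id: str
    begin: float  # time the version became valid (inclusive)
    end: float    # time the version was superseded (exclusive); inf if still current


class TimeTravelIndex:
    """Naive inverted index whose postings carry validity intervals."""

    def __init__(self):
        self._postings = defaultdict(list)

    def add_version(self, doc_id, text, begin, end=math.inf):
        # Index every distinct term of this version with the version's interval.
        for term in set(text.lower().split()):
            self._postings[term].append(Posting(doc_id, begin, end))

    def search(self, query, at):
        """Return ids of documents whose version valid at time `at` contains all query terms.

        Assumes the versions of one document have non-overlapping intervals.
        """
        result = None
        for term in query.lower().split():
            matches = {
                p.doc_id
                for p in self._postings.get(term, [])
                if p.begin <= at < p.end
            }
            result = matches if result is None else result & matches
        return result or set()


index = TimeTravelIndex()
index.add_version("page1", "olympic games athens", begin=2004, end=2008)
index.add_version("page1", "olympic games beijing", begin=2008)

athens_2005 = index.search("olympic athens", at=2005)    # version valid in 2005 matches
athens_2010 = index.search("olympic athens", at=2010)    # that version no longer exists
beijing_2010 = index.search("olympic beijing", at=2010)  # current version matches
```

In this sketch, filtering happens at query time, so every posting of a term is read regardless of the requested time; a practical index would organize postings so that only those overlapping the time of interest need to be touched.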