28 research outputs found

    Inter-relação das técnicas Term Extration e Query Expansion aplicadas na recuperação de documentos textuais

    Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro Tecnológico. Programa de Pós-graduação em Engenharia e Gestão do Conhecimento. According to Sighal (2006), people recognize the importance of storing and searching for information and, with the advent of computers, it became possible to store large quantities of it in databases. Consequently, cataloguing the information in these databases became indispensable. In this context, the field of Information Retrieval emerged in the 1950s with the aim of promoting the construction of computational tools that would allow users to make more efficient use of these databases. The main objective of the present research is to develop a Computational Model that enables the retrieval of textual documents ranked by semantic similarity, based on the intersection of the Term Extraction and Query Expansion techniques.

    Trends in Australian Manufacturing

    Australian manufacturing is a picture of diversity and contrasts. This is the main finding of this paper, which examines trends in the Australian manufacturing sector over the last two decades. Manufacturing output has quadrupled since the mid-1950s. The fastest growing activities have been those with links to Australia's natural endowments and products that are more differentiated, with higher skill levels and R&D intensities.
    Australia; Research; Reports; Employment; Employment opportunities; Industrial disputes; Industrial relations; Industry; Industry policy; Labour market; Manufacturing; Part-time employment; Permanent employment; Productivity; Skilled labour; Structural adjustment; Temporary employment; Trade unions

    TREC 14 Enterprise Track at CSIRO and ANU

    Introduction: The primary goals of the CSIRO and ANU team's participation in the enterprise track were two-fold: 1) to investigate how well our search engine PADRE responds to the new collection and the new tasks, and 2) to explore whether document structure specific to an email collection can be used to improve system performance. By the submission deadline, we had completed two tasks: known-item search and discussion search. For both tasks, we used the PADRE retrieval system [1], in which the Okapi BM25 relevance function was implemented. Each message in the collection was treated as an independent document, so both topic distillation scoring and the same-site suppression mechanism were turned off (i.e. -nocool and --SSS0 respectively). During indexing, stemming and stopword elimination were not applied, and sequences of letters and/or digits were considered as indexable words. We parsed the HTML pages in the original collection into an XML format (the DTD is shown in the appendix).
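
The indexing and scoring choices described in the abstract above can be illustrated with a small sketch. It is not PADRE's implementation: the function names, the lowercasing of tokens, the parameter defaults (k1 = 1.2, b = 0.75) and the smoothed IDF variant are all assumptions chosen for illustration. It only mirrors what the abstract states, namely that each message is one document, that words are runs of letters and/or digits, and that no stemming or stopword removal is applied.

```python
import math
import re
from collections import Counter

# Indexable words: runs of ASCII letters and/or digits (no stemming, no stopword list).
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def tokenize(text):
    # Lowercasing is an assumption; the abstract does not mention case handling.
    return [t.lower() for t in TOKEN_RE.findall(text)]


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with a textbook Okapi BM25 formula."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in tokenized:
        df.update(set(d))

    query_terms = tokenize(query)
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative (one common variant).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalisation.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


if __name__ == "__main__":
    messages = [
        "meeting agenda for the w3c group",
        "css style sheet question",
        "agenda agenda agenda",
    ]
    for message, score in zip(messages, bm25_scores("agenda", messages)):
        print(f"{score:.3f}  {message}")
```

Because no stopword list is used, a word such as "the" is indexed like any other term; BM25's IDF component is what keeps very common words from dominating the score.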

    Ontology Ranking: Finding the Right Ontologies on the Web

    Ontology search, which is the process of finding ontologies or ontological terms for user-defined queries from an ontology collection, is an important task to facilitate ontology reuse in ontology engineering. Ontology reuse is desired to avoid the tedious process of building an ontology from scratch and to limit the design of several competing ontologies that represent similar knowledge. Since many organisations in both the private and public sectors are publishing their data in RDF, they increasingly need to find or design ontologies for data annotation and/or integration. In general, there exist multiple ontologies representing a domain; therefore, finding the best matching ontologies or their terms is required to facilitate manual or dynamic ontology selection for both ontology design and data annotation. Ranking is a crucial component of the ontology retrieval process, which aims at listing the ‘relevant’ ontologies or their terms as high as possible in the search results to reduce human intervention. Most existing ontology ranking techniques inherit one or more information retrieval ranking parameter(s). They linearly combine the values of these parameters for each ontology to compute the relevance score against a user query and rank the results in descending order of the relevance score. A significant aspect of achieving an effective ontology ranking model is to develop novel metrics and dynamic techniques that can optimise the relevance score of the most relevant ontology for a user query. In this thesis, we present extensive research in ontology retrieval and ranking, where several research gaps in the existing literature are identified and addressed. First, we begin the thesis with a review of the literature and propose a taxonomy of Semantic Web data (i.e., ontologies and linked data) retrieval approaches. That allows us to identify potential research directions in the field.
    In the remainder of the thesis, we address several of the identified shortcomings in the ontology retrieval domain. We develop a framework for the empirical and comparative evaluation of different ontology ranking solutions, which has not been studied in the literature so far. Second, we propose an effective relationship-based concept retrieval framework and a concept ranking model through the use of a learning-to-rank approach, which addresses the limitation of the existing linear ranking models. Third, we propose RecOn, a framework that helps users find the best matching ontologies for a multi-keyword query. There, the relevance score of an ontology to the query is computed by formulating and solving the ontology recommendation problem as a linear and an optimisation problem. Finally, the thesis also reports on an extensive comparative evaluation of our proposed solutions with several other state-of-the-art techniques using real-world ontologies. This thesis will be useful for researchers and practitioners interested in ontology search, for its methods and performance benchmarks on ranking approaches to ontology search.

    Diversity of brain size in fishes: preliminary analysis of a database including 1174 species in 45 orders

    Absolute and relative values of brain weight are now available for 1174 species of fishes, representing 45 taxonomic orders. The original FishBase "Brains" data was assembled by the research team of Bauchot and colleagues, to which the present report adds data for species representing several additional major taxonomic groups. This database is part of the FishBase 97 package, which provides researchers with a tool to explore the functional meaning of absolute and relative brain size diversity, in comparison with phylogenetic position, life history mode, locomotion, habitat, and other behavioral parameters. Several results are provided as an example of the use of these data. Galeomorph sharks and batoid rays possess the largest brains among fishes, and elongate forms with anguilliform locomotion (e.g., hagfishes, lampreys, true eels, carapids, zoarcids) possess the smallest relative brain sizes. Among teleost fishes, Osteoglossomorphs possess the largest relative brain sizes. Brain size correlations with oxygen consumption suggest that larger brains consume proportionately more oxygen, or that active fish with higher metabolic rates have larger brains.

    Status of the freshwater fishes of the Philippines


    From social tagging to polyrepresentation: a study of expert annotating behavior of moving images

    International Mention in the doctoral degree. This thesis investigates “nichesourcing” (De Boer, Hildebrand, et al., 2012), an emergent initiative of cultural heritage crowdsourcing in which niches of experts are involved in the annotating tasks. This initiative is studied in relation to moving image annotation, and in the context of audiovisual heritage, more specifically, within the sector of film archives. The work presents a case study of film and media scholars to investigate the types of annotations and attribute descriptions that they could eventually contribute, as well as the information needs and the seeking and searching behaviors of this group, in order to determine what role the different types of annotations would play in supporting their expert tasks. The thesis comprises three independent but interconnected studies using a mixed methodology and an interpretive approach. It uses concepts from the information behavior discipline, and the "Integrated Information Seeking and Retrieval Framework" (IS&R) (Ingwersen and Järvelin, 2005), as guidance for the investigation. The findings show that there are several types of annotations that moving image experts could contribute to a nichesourcing initiative, of which time-based tags are only one of the possibilities. The findings also indicate that, for the different foci in film and media research, in-depth indexing at the content level is only needed for supporting a specific research focus, for supporting research in other domains, or for engaging broader audiences. The main implications at the level of information infrastructure are the requirement for more varied annotating support, more interoperability among existing metadata standards and frameworks, and the need for guidelines about crowdsourcing and nichesourcing implementation in the audiovisual heritage sector.
    This research presents contributions to the studies of social tagging applied to moving images; to the discipline of information behavior, by proposing new concepts related to the area of use behavior; and to the concept of “polyrepresentation” (Ingwersen, 1992, 1996) applied to the humanities domain.
    Programa Oficial de Doctorado en Documentación: Archivos y Bibliotecas en el Entorno Digital. Chair: Peter Emil Rerup Ingwersen. Secretary: Antonio Hernández Pérez. Member: Nils Phar