28 research outputs found
Interrelation of the Term Extraction and Query Expansion techniques applied to the retrieval of textual documents
Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro Tecnológico. Programa de Pós-graduação em Engenharia e Gestão do Conhecimento. According to Sighal (2006), people recognize the importance of storing and searching for information, and the advent of computers made it possible to store large amounts of it in databases. Consequently, cataloguing the information in these databases became indispensable. In this context, the field of Information Retrieval emerged in the 1950s with the aim of promoting the construction of computational tools that would allow users to make more efficient use of these databases. The main objective of this research is to develop a Computational Model that enables the retrieval of textual documents ordered by semantic similarity, based on the intersection of the Term Extraction and Query Expansion techniques.
Trends in Australian Manufacturing
Australian manufacturing is a picture of diversity and contrasts. This is the main finding of this paper, which examines trends in the Australian manufacturing sector over the last two decades. Manufacturing output has quadrupled since the mid-1950s. The fastest growing activities have been those with links to Australia's natural endowments and products that are more differentiated, with higher skill levels and R&D intensities. Australia; Research; Reports; Employment; Employment opportunities; Industrial disputes; Industrial relations; Industry; Industry policy; Labour market; Manufacturing; Part-time employment; Permanent employment; Productivity; Skilled labour; Structural adjustment; Temporary employment; Trade unions;
TREC 14 Enterprise Track at CSIRO and ANU
Introduction. The primary goals of the CSIRO and ANU team's participation in the enterprise track were two-fold: 1) to investigate how well our search engine PADRE responds to the new collection and the new tasks, and 2) to explore whether document structure specific to an email collection can be used to improve system performance. By the submission deadline, we had completed two tasks: known-item search and discussion search. For both tasks, we used the PADRE retrieval system [1], in which the Okapi BM25 relevance function was implemented. Each message in the collection was treated as an independent document, so both the topic distillation scoring and the same-site suppression mechanism were turned off (i.e. -nocool and --SSS0 respectively). During indexing, stemming and stopword elimination were not applied, and sequences of letters and/or digits were considered indexable words. We parsed the HTML pages in the original collection into an XML format (the DTD is shown in the appendix).
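The indexing and scoring choices described in this abstract can be illustrated with a minimal sketch. This is not PADRE's implementation: it uses a common textbook variant of BM25, and the parameter values `k1=1.2` and `b=0.75`, the smoothed IDF, the lowercasing step, and the sample messages are all assumptions made for illustration. Only the tokenization rule (runs of letters and/or digits, no stemming, no stopword removal) and the one-message-per-document treatment come from the text.

```python
import math
import re
from collections import Counter

# Indexable words: maximal runs of ASCII letters and/or digits.
# No stemming and no stopword removal. Lowercasing is an assumption.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def tokenize(text):
    return [t.lower() for t in TOKEN_RE.findall(text)]


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with a textbook BM25 variant.

    k1 and b are conventional defaults, not values reported in the source.
    """
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in tokenized for term in set(d))
    query_terms = tokenize(query)
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" keeps the value non-negative (assumption).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical messages, each treated as an independent document.
messages = [
    "Re: XML schema draft 2 review",
    "Meeting agenda for the working group",
    "XML schema validation errors in draft 2",
]
scores = bm25_scores("xml schema errors", messages)
ranking = sorted(range(len(messages)), key=lambda i: scores[i], reverse=True)
```

Here `ranking` lists message indices from most to least relevant. The third message matches all three query terms and ranks first, and the second matches none and scores zero.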
Ontology Ranking: Finding the Right Ontologies on the Web
Ontology search, which is the process of finding ontologies or
ontological terms for user-defined queries from an ontology
collection, is an important task to facilitate ontology reuse in
ontology engineering. Ontology reuse is desired to avoid the
tedious process of building an ontology from scratch and to limit
the design of several competing ontologies that represent similar
knowledge. Since many organisations in both the private and
public sectors are publishing their data in RDF, they
increasingly need to find or design ontologies for data
annotation and/or integration. In general, there exist multiple
ontologies representing a domain; therefore, finding the best
matching ontologies or their terms is required to facilitate
manual or dynamic ontology selection for both ontology design and
data annotation.
Ranking is a crucial component of the ontology retrieval
process, which aims to list the ‘relevant’ ontologies or
their terms as high as possible in the search results to reduce
human intervention. Most existing ontology ranking techniques
inherit one or more information retrieval ranking parameter(s).
They linearly combine the values of these parameters for each
ontology to compute the relevance score against a user query and
rank the results in descending order of the relevance score. A
significant aspect of achieving an effective ontology ranking
model is to develop novel metrics and dynamic techniques that can
optimise the relevance score of the most relevant ontology for a
user query.
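The linear combination described above can be sketched as follows. This is a generic illustration rather than any specific published ranking model: the parameter names (`text_match`, `popularity`), their values, the weights, and the ontology names are all hypothetical.

```python
def linear_relevance(features, weights):
    """Relevance score as a weighted sum of ranking-parameter values."""
    return sum(weights[name] * value for name, value in features.items())


def rank_ontologies(candidates, weights):
    """Return (ontology, score) pairs in descending order of relevance score."""
    scored = [
        (name, linear_relevance(features, weights))
        for name, features in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical parameter values in [0, 1] for three made-up ontologies,
# as they might be computed for a single user query.
candidates = {
    "onto_a": {"text_match": 0.9, "popularity": 0.2},
    "onto_b": {"text_match": 0.6, "popularity": 0.8},
    "onto_c": {"text_match": 0.3, "popularity": 0.1},
}
# Hypothetical fixed weights; a linear model applies the same weights to every query.
weights = {"text_match": 0.7, "popularity": 0.3}

ranked = rank_ontologies(candidates, weights)
```

Because the weights are fixed in advance, a model of this kind cannot adapt how the parameters interact from one query to another.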
In this thesis, we present extensive research in ontology
retrieval and ranking, where several research gaps in the
existing literature are identified and addressed. First, we begin
the thesis with a review of the literature and propose a taxonomy
of Semantic Web data (i.e., ontologies and linked data) retrieval
approaches. That allows us to identify potential research
directions in the field. In the remainder of the thesis, we
address several of the identified shortcomings in the ontology
retrieval domain. We develop a framework for the empirical and
comparative evaluation of different ontology ranking solutions,
which has not been studied in the literature so far. Second, we
propose an effective relationship-based concept retrieval
framework and a concept ranking model through the use of a learning
to rank approach, which addresses the limitations of the existing
linear ranking models. Third, we propose RecOn, a framework that
helps users in finding the best matching ontologies to a
multi-keyword query. There, the relevance score of an ontology to
the query is computed by formulating and solving the ontology
recommendation problem as a linear and an optimisation problem.
Finally, the thesis also reports on an extensive comparative
evaluation of our proposed solutions with several other
state-of-the-art techniques using real-world ontologies. This
thesis will be useful for researchers and practitioners
interested in ontology search, both for its methods and for its
performance benchmarks of ranking approaches to ontology search.
Diversity of brain size in fishes: preliminary analysis of a database including 1174 species in 45 orders
Absolute and relative values of brain weight are now available for 1174 species of fishes, representing 45 taxonomic orders. The original FishBase "Brains" data was assembled by the research team of Bauchot and colleagues, to which the present report adds data for species representing several additional major taxonomic groups. This database is part of the FishBase 97 package, which provides researchers with a tool to explore the functional meaning of absolute and relative brain size diversity, in comparison with phylogenetic position, life history mode, locomotion, habitat, and other behavioral parameters. Several results are provided as an example of the use of these data. Galeomorph sharks and batoid rays possess the largest brains among fishes, and elongate forms with anguilliform locomotion (e.g., hagfishes, lampreys, true eels, carapids, zoarcids) possess the smallest relative brain sizes. Among teleost fishes, Osteoglossomorphs possess the largest relative brain sizes. Brain size correlations with oxygen consumption suggest that larger brains consume proportionately more oxygen, or that active fish with higher metabolic rates have larger brain
From social tagging to polyrepresentation: a study of expert annotating behavior of moving images
International Mention in the doctoral degree. This thesis investigates “nichesourcing” (De Boer, Hildebrand, et al., 2012), an emergent initiative of cultural heritage crowdsourcing in which niches of experts are involved in the annotating tasks. This initiative is studied in relation to moving image annotation, and in the context of audiovisual heritage, more specifically, within the sector of film archives. The work presents a case study of film and media scholars to investigate the types of annotations and attribute descriptions that they could eventually contribute, as well as the information needs and the seeking and searching behaviors of this group, in order to determine what role the different types of annotations would play in supporting their expert tasks. The study is composed of three independent but interconnected studies using a mixed methodology and an interpretive approach. It uses concepts from the information behavior discipline, and the "Integrated Information Seeking and Retrieval Framework" (IS&R) (Ingwersen and Järvelin, 2005), as guidance for the investigation. The findings show that there are several types of annotations that moving image experts could contribute to a nichesourcing initiative, of which time-based tags are only one of the possibilities. The findings also indicate that for the different foci in film and media research, in-depth indexing at the content level is only needed for supporting a specific research focus, for supporting research in other domains, or for engaging broader audiences. The main implications at the level of information infrastructure are the requirement for more varied annotating support, more interoperability among existing metadata standards and frameworks, and the need for guidelines about crowdsourcing and nichesourcing implementation in the audiovisual heritage sector.
This research presents contributions to the studies of social tagging applied to moving images, to the discipline of information behavior, by proposing new concepts related to the area of use behavior, and to the concept of “polyrepresentation” (Ingwersen, 1992, 1996) applied to the humanities domain.
This thesis investigates the nichesourcing initiative (De Boer, Hildebrand, et al., 2012) as a form of crowdsourcing in the cultural heritage sector, in which groups of experts take part in annotating collections. The scope of application is the annotation of moving images in the context of audiovisual heritage, more specifically in film archives. The work presents a case study applied to a specific domain of experts in the audiovisual field: film and media scholars. The analysis focuses on two specific aspects of the problem: the types of annotations and description attributes that could be obtained from this niche of experts, and the information needs and information behavior of this group, in order to determine the role of the different types of annotations in their research tasks. The thesis is composed of three independent but interconnected studies and uses a mixed, interpretive methodology. The theoretical framework draws on concepts from the field of information behavior studies and on the "Integrated Information Seeking and Retrieval Framework" (IS&R) proposed by Ingwersen and Järvelin (2005), which guide the investigation. The findings indicate that there are various forms of moving image annotation that could be generated from expert contributions, of which shot-level tags are only one of the possibilities.
Likewise, several research foci were identified in the academic area of film and media. Detailed content indexing is required only by one of those groups and by researchers from other disciplines, or as a way of engaging broader audiences. The most relevant implications, at the level of information infrastructure, concern the requirement to support more varied forms of annotation, the requirement for greater interoperability among metadata standards and frameworks, and the need to publish good-practice guidelines on how to implement crowdsourcing or nichesourcing initiatives in the audiovisual heritage sector. This work contributes to research on social tagging applied to moving images, to the discipline of information behavior studies, for which new concepts related to the area of information use are proposed, and to the concept of “polyrepresentation” (Ingwersen, 1992, 1996) in the humanities disciplines.
Programa Oficial de Doctorado en Documentación: Archivos y Bibliotecas en el Entorno Digital. Chair: Peter Emil Rerup Ingwersen. Secretary: Antonio Hernández Pérez. Member: Nils Phar