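Report on the development of models for automatically obtaining the content of health-related websites (Relato sobre o desenvolvimento de modelos para obtenção automática do conteúdo de sites sobre saúde)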
This technical report describes the development of models, techniques, and prototypes for locating, standardizing, and automatically extracting the content of health-related websites and web pages, with the aim of estimating the quality of the extracted sites and pages. The techniques and proposals described here were developed during the first half of 2009 by students of the course CMP112 - Distributed Information Systems, in the Graduate Program of the Institute of Informatics of the Federal University of Rio Grande do Sul, taught by Professor Dr. José Palazzo Moreira de Oliveira. Each task described in this report used different techniques and technologies and produced results of a different nature, such as tables, prototypes, and models. However, all tasks pursued the same objective: the automatic extraction of content from websites and pages on the subject "Alzheimer's Disease". At the end of the work, we obtained a set of results that will be used to estimate the quality of the extracted websites and pages according to defined quality metrics.
Neotropical freshwater fishes: A dataset of occurrence and abundance of freshwater fishes in the Neotropics
The Neotropical region hosts 4225 freshwater fish species, ranking first among the world's most diverse regions for freshwater fishes. Our NEOTROPICAL FRESHWATER FISHES data set is the first to produce a large-scale Neotropical freshwater fish inventory, covering the entire Neotropical region from Mexico and the Caribbean in the north to the southern limits in Argentina, Paraguay, Chile, and Uruguay. We compiled 185,787 distribution records, with unique georeferenced coordinates, for the 4225 species, represented by occurrence and abundance data. The numbers of species in the most species-rich orders are as follows: Characiformes (1289), Siluriformes (1384), Cichliformes (354), Cyprinodontiformes (245), and Gymnotiformes (135). The most recorded species was the characid Astyanax fasciatus (4696 records). We registered 116,802 distribution records for native species, compared to 1802 distribution records for nonnative species. The main aim of the NEOTROPICAL FRESHWATER FISHES data set is to make these occurrence and abundance data accessible to international researchers for ecological and macroecological studies, from local to regional scales, focusing on particular fish species, families, or orders. We anticipate that the NEOTROPICAL FRESHWATER FISHES data set will be valuable for studies on a wide range of ecological processes, such as trophic cascades, fishery pressure, the effects of habitat loss and fragmentation, and the impacts of species invasion and climate change. There are no copyright restrictions on the data; please cite this data paper when using the data in publications.