87 research outputs found
Evaluation of automatic concept extraction tools within a digital library environment
The rapid advance of technology has led to a proliferation of digital information sources. This evolution in computing has prompted the creation of digital libraries, which have gradually become a major pillar for the dissemination of knowledge. However, the information held in digital libraries is not yet fully described, and it remains underexploited. It has recently been shown that describing information with “metadata” can be essential for improving information retrieval within a digital library. Our approach is based on creating and introducing new “metadata” capable of describing, in our case, the doctoral theses of a digital library. These “metadata” correspond to the most important concepts of each thesis. At present, the manual identification of concepts is a lengthy process carried out by a specialist in the field. It is therefore important to use tools capable of extracting concepts automatically. In this article we analyze four NLP (Natural Language Processing) tools capable of automatically extracting the key concepts of a corpus. These tools are: (1) TerminologyExtractor from Chamblon Systems Inc., (2) Xerox Terminology Suite from Xerox, (3) Nomino from Nomino Technologies, and (4) Copernic Summarizer from NRC. The article also presents a prototype annotation tool developed to automatically insert concepts into digital theses.
Evaluación de herramientas de extracción automática de conceptos dentro de un ambiente de biblioteca digital
The rapid advance of technology has led to a proliferation of digital information sources. This evolution in computing has prompted the creation of digital libraries, which have gradually become a major pillar for the dissemination of knowledge. However, the information held in digital libraries is not yet fully described, and it remains underexploited. It has recently been shown that describing information with “metadata” can be essential for improving information retrieval within a digital library. Keywords: digital library, metadata, Natural Language Processing, information extraction, annotation, information retrieval
Revista Colombiana de Computación. Volumen 6 Número 1 Junio de 2005
In this edition we have an international selection, including articles from countries such as France, Argentina, Spain, England and, of course, Colombia.
Using scientific documents for distance learning
In scientific digital libraries, many documents such as publications, technical reports, and theses could serve as basic teaching material for distance learning in universities. However, the native structure of these documents is generally not directly suited to e-learning. In this paper we present a project that studies how to modify and manage such documents for better use in distance learning. In the first part, we define a format for scientific documents suited to e-learning and then propose using the features of the XML language to encode and manage these documents. In the next part, we present the design of the documentary system based on the new document structure, and we finish with a description of the prototype that has been built.
A Web-Based Interface to Design Information Visualization
Information Visualization is a challenging field that enables better use of the human visual and cognitive systems to make sense of very large datasets. This paper aims to improve the current Information Visualization design workflow by enabling better cooperation among programmers, designers and users, in both a one-to-one and a community-oriented fashion. Our contribution is a web-based interface for creating visualization flows that can be edited and shared between actors within communities. We detail a real case study in which programmers, designers and users successfully worked together to quickly design and improve an interactive image visualization interface based on image similarities.
Architecture d'un Serveur Multimédia pour les Sciences de l'Ingénieur
In this paper, we present the architecture of a multimedia Internet server specialized in the field of engineering science, called SEMUSDI. The original feature of this server is the joint use of an object-oriented database, a set of CGI scripts, and Javascript libraries. This combination yields an Internet server that is homogeneous, fast, and truly interactive. In the first part, we recall the goals of the project. Then, we describe the server functions and the general software architecture. In the last part, we detail the analysis and the technical solutions chosen for the system, and we finish with the prospects for the project.
Physical Document Adaptation to user's context and user's profile
Modern technology promises mobile users Internet connectivity anytime, anywhere, and on any device. However, given the constrained capabilities of mobile devices, the limited bandwidth of wireless networks and varying personal preferences, effective information access requires the development of new computational patterns. The variety of mobile devices available today makes device-specific authoring of web content an expensive approach. The problem is further compounded by the heterogeneous nature of the supporting devices and by users' behaviour. This research investigates the challenges posed by these problems and proposes a context-aware adaptation framework to bridge the gap between existing Internet content and today's heterogeneous computing environment.
