183 research outputs found
A walk on Python-igraph
Brief tutorial on the igraph library, a tool for programmers
Retrieval of bilingual Spanish-English information by means of a standard automatic translation system
This paper describes our participation in bilingual retrieval (queries in Spanish on documents in English), using an information retrieval system based on the vector space model. The queries, formulated in Spanish, were translated into English with a commercial machine translation system; the terms extracted from the resulting translations were filtered to remove stop words and then normalised by stemming. Results are slightly more than 15% poorer than those obtained through monolingual retrieval with the original queries in English.
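The pipeline described in this abstract (stop-word filtering, stemming, vector space ranking) can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the stop list, the `crude_stem` suffix stripper (a stand-in for a real stemmer such as Porter's), the TF-IDF weighting, and all function names are assumptions, and the query is assumed to have already been translated into English.

```python
import math
from collections import Counter

# Tiny illustrative stop list (assumption); a real system would use a full one.
STOP_WORDS = {"the", "of", "in", "a", "an", "and", "to", "for", "on", "by"}


def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalise(text):
    """Lowercase, keep alphabetic tokens, drop stop words, stem the rest."""
    tokens = [t for t in text.lower().split() if t.isalpha()]
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one TF-IDF weighted term vector (a dict) per document, plus the IDF table."""
    term_lists = [normalise(d) for d in docs]
    n = len(docs)
    df = Counter(term for terms in term_lists for term in set(terms))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: tf * idf[t] for t, tf in Counter(terms).items()} for terms in term_lists]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs):
    """Return document indices ordered by decreasing similarity to the query."""
    doc_vecs, idf = tfidf_vectors(docs)
    q_vec = {t: tf * idf.get(t, 0.0) for t, tf in Counter(normalise(query)).items()}
    scores = [(cosine(q_vec, dv), i) for i, dv in enumerate(doc_vecs)]
    return [i for _, i in sorted(scores, key=lambda p: (-p[0], p[1]))]
```

Calling `rank("translation of the queries", docs)` on a small list of English documents returns their indices, best match first.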
Automatic Classification of Documents. A Case Study
Classifying documents takes a great deal of work and may become impractical when the
number of documents is large. When documents are in digital format, automatic
classification techniques can be applied. Supervised automatic classification systems
are able to identify the appropriate class or category for a given document after a
training phase, during which the system learns the features that define each category.
We describe some of the most widely used techniques, such as Bayesian classifiers, as
well as the adjustments that can be made to improve their effectiveness. We also
describe an application of these techniques to a real case, analyze the implementation
details, and discuss the results.
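As an illustration of the supervised approach mentioned in this abstract, the following is a minimal sketch of a multinomial naive Bayes text classifier with add-one (Laplace) smoothing. The class name, whitespace tokenisation, and smoothing parameter `alpha` are assumptions made for the example; the paper's actual implementation and tuning are not given here.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial naive Bayes over whitespace tokens (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing constant (assumption)
        self.class_doc_counts = Counter()        # documents seen per class
        self.term_counts = defaultdict(Counter)  # term frequencies per class
        self.vocabulary = set()

    def train(self, labelled_docs):
        """Training phase: learn term statistics from (text, label) pairs."""
        for text, label in labelled_docs:
            tokens = text.lower().split()
            self.class_doc_counts[label] += 1
            self.term_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def predict(self, text):
        """Return the class with the highest log posterior for the text."""
        # Terms never seen in training are ignored.
        tokens = [t for t in text.lower().split() if t in self.vocabulary]
        total_docs = sum(self.class_doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_doc_counts):
            score = math.log(self.class_doc_counts[label] / total_docs)  # log prior
            total_terms = sum(self.term_counts[label].values())
            for t in tokens:
                numerator = self.term_counts[label][t] + self.alpha
                denominator = total_terms + self.alpha * vocab_size
                score += math.log(numerator / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Working in log space avoids numerical underflow when documents contain many terms; the smoothing constant keeps a term that never occurred in a class from forcing that class's probability to zero.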
The implications of Wikipedia for contemporary science education: Using Social Network Analysis Techniques for Automatic Organisation of Knowledge
Wikipedia is an open content resource built by a community of users and widely used in educational contexts by both students and teachers. Wikipedia articles are connected by hyperlinks, so Wikipedia can be represented as a network in which the nodes are the articles and the edges are the hyperlinks. In this paper we analyze a complete copy of the Spanish Wikipedia. We apply social network analysis techniques and, more precisely, community detection techniques, in order to identify clusters of articles with similar content. As the number of clusters is relatively small, we use manual analysis to detect science articles. In addition, we identify the most representative scientific fields and their main features. We conclude that science articles account for about 11.66% of Spanish Wikipedia articles and that the most important clusters of scientific articles do not always coincide with the classical scientific disciplines. This kind of analysis contributes to a better understanding of Wikipedia as an educational tool.
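To make the idea of community detection on an article network concrete, here is a small sketch that treats hyperlinks as undirected edges and greedily merges communities while modularity increases. The abstract does not say which algorithm the authors used, so this naive greedy agglomeration is a swapped-in example; the function names are assumptions, and it is only practical for toy graphs.

```python
from itertools import combinations


def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    member = {node: i for i, com in enumerate(communities) for node in com}
    internal = [0] * len(communities)  # edges with both ends inside the community
    degree = [0] * len(communities)    # sum of node degrees per community
    for u, v in edges:
        degree[member[u]] += 1
        degree[member[v]] += 1
        if member[u] == member[v]:
            internal[member[u]] += 1
    return sum(internal[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))


def greedy_communities(edges):
    """Start from singletons; repeatedly apply the merge that most increases Q."""
    nodes = sorted({n for e in edges for n in e})
    communities = [frozenset([n]) for n in nodes]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        best_merge = None
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            q = modularity(edges, merged)
            if q > best_q + 1e-12:  # require a real improvement
                best_q, best_merge = q, merged
        if best_merge is None:      # no merge improves modularity: stop
            break
        communities = best_merge
    return [set(c) for c in communities], best_q
```

On a toy graph made of two tightly linked groups of articles joined by a single hyperlink, the procedure separates the two groups. For a network the size of a full Wikipedia copy, a dedicated graph library would be needed instead of this quadratic search.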
Web Page Retrieval by Combining Evidence
The participation of the REINA Research Group in WebCLEF 2005 focused on the monolingual mixed task. Queries or topics are of two types: named pages and home pages. For both, we first perform a search by thematic content; for the same query, we also search several information elements of every page (title, some meta tags, anchor text) and then combine the results. For queries about home pages, we try to detect them using a method based on some keywords and their patterns of use. Afterwards, the results of the thematic content retrieval are re-ranked based on PageRank and centrality coefficients.
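The two steps in this abstract, combining per-field scores and re-ranking with a link-based measure, might look roughly like the sketch below. The weighted sum, the `link_weight` parameter, and the function names are assumptions for illustration; the abstract does not specify how the group combined the evidence. PageRank is computed here by plain power iteration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank; `links` maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[p] / n
        rank = new_rank
    return rank


def combine_scores(field_scores, weights):
    """Weighted sum of per-field retrieval scores (title, meta tags, anchor text...)."""
    combined = {}
    for field, scores in field_scores.items():
        for page, score in scores.items():
            combined[page] = combined.get(page, 0.0) + weights.get(field, 0.0) * score
    return combined


def rerank(content_scores, link_scores, link_weight=0.5):
    """Order pages by content score plus a weighted link-based score."""
    final = {p: s + link_weight * link_scores.get(p, 0.0)
             for p, s in content_scores.items()}
    return sorted(final, key=lambda p: (-final[p], p))
```

A linear combination is only one option; rank-based fusion or learned weights would fit the same interface.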
Análisis cibermétrico y visual de Twitter (Cybermetric and visual analysis of Twitter)
This paper addresses the need to collect the profile, followers, and followed accounts of a Twitter user via the API. We develop a crawler application using the Python-Twitter library, with the aim of analyzing and visualizing the network of Twitter users.
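A crawler of this kind is essentially a breadth-first traversal of the follower graph. The sketch below shows that traversal only; `fetch_followers` and `fetch_followed` are hypothetical callables supplied by the caller (for example, thin wrappers around whatever API client is in use), not functions of the Python-Twitter library, and `max_users` is an assumed safety limit.

```python
from collections import deque


def crawl_network(seed_user, fetch_followers, fetch_followed, max_users=100):
    """Breadth-first crawl of a follower network starting from one user.

    fetch_followers(user) and fetch_followed(user) are hypothetical callables
    that return lists of user names. Returns the set of visited users and a
    set of directed edges (follower, followed).
    """
    seen = {seed_user}
    edges = set()
    queue = deque([seed_user])
    while queue:
        user = queue.popleft()
        followers = list(fetch_followers(user))
        followed = list(fetch_followed(user))
        for other in followers:
            edges.add((other, user))   # other follows user
        for other in followed:
            edges.add((user, other))   # user follows other
        for other in followers + followed:
            # Stop expanding once the limit is reached; edges to users
            # beyond the limit are still recorded.
            if other not in seen and len(seen) < max_users:
                seen.add(other)
                queue.append(other)
    return seen, edges
```

The resulting edge set can be loaded into a graph library for analysis and visualization. A real crawler would also need to handle API rate limits and errors, which are omitted here.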
La cibermetría en la recuperación de información en el Web (Cybermetrics in information retrieval on the Web)
The exponential growth of the web and the characteristics of its data (distributed, highly volatile, unstructured, redundant, and highly heterogeneous) have introduced new problems in information retrieval processes. It is therefore necessary to open new avenues of research that allow us to obtain good levels of accuracy. Work based on exploiting the hypertext features of the web is gaining prominence. Cybermetrics provides many options for working with links, and many of its techniques may be useful in information retrieval on the web.
Science and Technology in Social Networks: Twitter
The use of the Internet as the main source for finding scientific information is reinforced by the use of social networks. This situation calls for the study and standardization of the content obtained in this way. By studying the Twitter profiles that disseminate scientific information, it is possible to identify the main topics and the quantity and quality of the scientific information shared.
- …