161,698 research outputs found

    Intelligent Search and Automatic Document Classification and Cataloging Based on Ontology Approach

    This paper presents an approach to developing an intelligent search system and automatic document classification and cataloging tools for a metadata-based CASE system. The described method combines the advantages of the ontology approach with those of the traditional keyword-based approach. The method provides powerful intelligent capabilities and can be integrated with existing document search systems.
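    The combination of ontology-based and keyword-based search described above can be sketched as query expansion over a concept ontology followed by plain keyword scoring. The ontology, documents, and weights below are invented for illustration; the paper's actual CASE-system metadata is not shown.

```python
# Illustrative sketch only: an invented toy ontology mapping each
# concept to related terms. Real ontologies are far richer.
ONTOLOGY = {
    "car": {"automobile", "vehicle"},
    "doc": {"document", "file", "record"},
}

def expand_query(terms):
    """Expand query terms with ontology-related terms at a reduced weight."""
    weights = {t: 1.0 for t in terms}
    for t in terms:
        for related in ONTOLOGY.get(t, ()):
            weights.setdefault(related, 0.5)  # expansion terms count half
    return weights

def score(doc_tokens, weights):
    """Keyword score: sum of weights of query terms present in the document."""
    tokens = set(doc_tokens)
    return sum(w for term, w in weights.items() if term in tokens)

def search(query_terms, corpus):
    """Rank documents by the combined ontology + keyword score."""
    weights = expand_query(query_terms)
    ranked = sorted(corpus.items(), key=lambda kv: score(kv[1], weights),
                    reverse=True)
    return [name for name, _ in ranked]
```

    A document mentioning only "automobile" is still found for the query "car", which is the advantage the abstract claims over pure keyword matching.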

    An Intelligent System For Arabic Text Categorization

    Text categorization (classification) is the process of classifying documents into a predefined set of categories based on their content. In this paper, an intelligent Arabic text categorization system is presented. Machine learning algorithms are used in this system, and several algorithms for stemming and feature selection are tried. Moreover, the document is represented using several term weighting schemes, and finally the k-nearest neighbor and Rocchio classifiers are used for the classification process. Experiments are performed over a self-collected data corpus, and the results show that the suggested hybrid of statistical and light stemmers is the most suitable stemming algorithm for Arabic. The results also show that a hybrid of document frequency and information gain is the preferable feature selection criterion, and normalized tf-idf is the best weighting scheme. Finally, the Rocchio classifier outperforms the k-nearest neighbor classifier in the classification process. The experimental results illustrate that the proposed model is efficient and achieves a generalization accuracy of about 98%.
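    The core of the winning configuration — normalized tf-idf vectors classified by Rocchio (class centroids) — can be sketched as follows. The tiny English corpus is invented for illustration; the paper works on an Arabic corpus with stemming and feature selection, which are omitted here.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Length-normalized tf-idf vector (as a dict) for each token list,
    plus a closure to vectorize new documents with the same idf table."""
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(d))
    idf = {t: math.log(n / df[t]) for t in df}

    def vectorize(tokens):
        tf = Counter(tokens)
        v = {t: tf[t] * idf.get(t, 0.0) for t in tf}
        norm = math.sqrt(sum(w * w for w in v.values())) or 1.0
        return {t: w / norm for t, w in v.items()}

    return [vectorize(d) for d in docs], vectorize

def cosine(a, b):
    """Dot product of two sparse (dict) vectors; inputs are unit-normalized."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def rocchio_train(vectors, labels):
    """Rocchio: the centroid of the tf-idf vectors of each class."""
    sums, counts = {}, Counter(labels)
    for v, y in zip(vectors, labels):
        c = sums.setdefault(y, Counter())
        for t, w in v.items():
            c[t] += w
    return {y: {t: w / counts[y] for t, w in c.items()} for y, c in sums.items()}

def rocchio_predict(vector, centroids):
    """Assign the class whose centroid is most cosine-similar."""
    return max(centroids, key=lambda y: cosine(vector, centroids[y]))
```

    A k-nearest-neighbor classifier would instead compare the query vector against every training vector, which is what Rocchio's single centroid per class avoids.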

    Multi One-Class Incremental SVM for Document Stream Digitization

    Inside the DIGIDOC project (ANR-10-CORD-0020), CONTenus et INTeractions (CONTINT), our approach was applied to several scenarios of classification of image streams which can correspond to real cases in digitization projects. Most of the time, the processing of documents is considered a well-defined task: the classes (also called concepts) are defined and known before the processing starts. But in real industrial document-processing workflows, the concepts may frequently change over time. In the context of document stream processing, the information and content included in the digitized pages can evolve over time, as can the user's judgment of what he wants to do with the resulting classification. The goal of this application is to create a learning module for stream-based document image classification (especially dedicated to a digitization process with a huge volume of data) that adapts to different situations in intelligent scanning tasks: adding, extending, contracting, splitting, or merging classes in an online mode of streaming data processing.
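    The "multi one-class" idea — one incremental one-class model per known concept, with the ability to add a class mid-stream and to reject pages that match no known class — can be sketched as below. A running centroid with a distance threshold stands in for the incremental one-class SVM; class names, vectors, and the threshold are invented for the example.

```python
import math

class OneClassCentroid:
    """Incrementally updated mean of the feature vectors seen for one class."""
    def __init__(self):
        self.n = 0
        self.mean = None

    def partial_fit(self, vec):
        if self.mean is None:
            self.mean = list(vec)
        else:
            for i, x in enumerate(vec):
                self.mean[i] += (x - self.mean[i]) / (self.n + 1)
        self.n += 1

    def distance(self, vec):
        return math.dist(self.mean, vec)

class StreamClassifier:
    """Pool of one-class models; pages too far from every class are rejected,
    so the operator can spawn a new class online."""
    def __init__(self, threshold):
        self.threshold = threshold
        self.models = {}

    def partial_fit(self, label, vec):
        # Adding a previously unseen label creates a new class on the fly.
        self.models.setdefault(label, OneClassCentroid()).partial_fit(vec)

    def predict(self, vec):
        """Best matching known class, or None for an unknown (new) concept."""
        scored = [(m.distance(vec), y) for y, m in self.models.items()]
        if not scored:
            return None
        d, y = min(scored)
        return y if d <= self.threshold else None
```

    The rejection path is what distinguishes this from an ordinary multi-class classifier: a conventional classifier must assign one of the known classes, whereas a pool of one-class models can report "none of the above" and trigger class creation.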

    Detecting and Monitoring Hate Speech in Twitter

    Social media are sensors in the real world that can be used to measure the pulse of societies. However, the massive and unfiltered feed of messages posted on social media is a phenomenon that nowadays raises social alarms, especially when these messages contain hate speech targeted at a specific individual or group. In this context, governments and non-governmental organizations (NGOs) are concerned about the possible negative impact that these messages can have on individuals or on society. In this paper, we present HaterNet, an intelligent system currently being used by the Spanish National Office Against Hate Crimes of the Spanish State Secretariat for Security that identifies and monitors the evolution of hate speech on Twitter. The contributions of this research are fourfold: (1) It introduces the first intelligent system that monitors and visualizes, using social network analysis techniques, hate speech in social media. (2) It introduces a novel public dataset on hate speech in Spanish consisting of 6,000 expert-labeled tweets. (3) It compares several classification approaches based on different document representation strategies and text classification models. (4) The best approach consists of a combination of an LSTM+MLP neural network that takes as input the tweet's word, emoji, and expression-token embeddings enriched by tf-idf, and obtains an area under the curve (AUC) of 0.828 on our dataset, outperforming previous methods presented in the literature.

    The work by Quijano-Sanchez was supported by the Spanish Ministry of Science and Innovation grant FJCI-2016-28855. The research of Liberatore was supported by the Government of Spain, grant MTM2015-65803-R, and by the European Union's Horizon 2020 Research and Innovation Programme, under the Marie Sklodowska-Curie grant agreement No. 691161 (GEOSAFE). All the financial support is gratefully acknowledged.
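    The "embeddings enriched by tf-idf" input representation can be illustrated as a tf-idf-weighted average of token embeddings. The two-dimensional embeddings and idf values below are invented for the example; the actual system feeds such representations into an LSTM+MLP network, which is omitted here.

```python
from collections import Counter

# Toy word vectors and idf values, invented for illustration only.
EMBEDDINGS = {"odio": [1.0, 0.0], "amor": [0.0, 1.0]}
IDF = {"odio": 1.0, "amor": 2.0}

def tweet_vector(tokens):
    """tf-idf-weighted average of the token embeddings of one tweet."""
    tf = Counter(tokens)
    total = 0.0
    acc = [0.0, 0.0]
    for tok, count in tf.items():
        if tok not in EMBEDDINGS:
            continue  # out-of-vocabulary tokens are skipped in this sketch
        w = count * IDF[tok]  # tf-idf weight of this token
        total += w
        for i, x in enumerate(EMBEDDINGS[tok]):
            acc[i] += w * x
    return [a / total for a in acc] if total else acc
```

    Weighting by tf-idf lets rare, discriminative tokens dominate the tweet representation instead of frequent function words.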

    A knowledge engineering framework for intelligent retrieval of legal case studies

    Juris-Data is one of the largest case-study bases in France. The case studies are indexed by a legal classification elaborated by the Juris-Data Group. Knowledge engineering was used to design an intelligent interface for information retrieval based on this classification. The aim of the system is to help users find the case study which is most relevant to their own. The approach is potentially very useful, but to standardize it for other legal document bases, it is necessary to extract a legal classification of the primary documents. Thus, a methodology for the construction of these classifications was designed, together with a framework for index construction. The project led to the implementation of a legal case-study system based on the accumulated experimentation and the methodologies designed. It consists of a set of computerized tools which support the life cycle of legal documents, from their processing by legal experts to their consultation by clients.

    Deep Learning Architectures for Novel Problems

    With convolutional neural networks revolutionizing the computer vision field, it is important to extend the capabilities of neural-based systems to dynamic and unrestricted data such as graphs. Doing so not only expands the applications of such systems but also provides more insight into improving neural-based systems. Currently, most implementations of graph neural networks are based on vertex filtering on fixed adjacency matrices. Although important for many applications, vertex filtering restricts the applications to vertex-focused graphs and cannot be efficiently extended to edge-focused graphs such as social networks. Applications of current systems are mostly limited to images and document references. Beyond the graph applications, this work also explores the use of convolutional neural networks for intelligent character recognition in a novel way. Most systems define Intelligent Character Recognition as either a recurrent classification problem or image classification. This achieves great performance in a limited environment but does not generalize well to real-world applications. This work defines Intelligent Character Recognition as a segmentation problem, which we show provides many benefits. The goal of this work was to explore alternatives to current graph neural network implementations as well as new applications of such systems. This work also focused on improving Intelligent Character Recognition techniques on isolated words using deep learning techniques. Due to the contrast between these two contributions, this document is divided into Part I, focusing on the graph work, and Part II, focusing on the intelligent character recognition work.
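    The "vertex filtering on a fixed adjacency matrix" that the abstract describes can be sketched as a single GCN-style propagation step, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), shown here on a toy three-node chain graph. The graph, features, and weights are invented for the example.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution (vertex-filtering) step with symmetric
    normalization over the self-loop-augmented adjacency matrix."""
    A_hat = A + np.eye(A.shape[0])             # add self-loops: A + I
    d = A_hat.sum(axis=1)                      # degrees of A_hat
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))     # D^(-1/2)
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)  # ReLU
```

    The filter is tied to one fixed adjacency matrix A, which is exactly the restriction the text raises: the learned operation does not transfer to graphs whose structure, particularly edge-focused structure, differs from A.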

    Incident detection using data from social media

    This is an accepted manuscript of an article published by IEEE in the 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC) on 15/03/2018, available online: https://ieeexplore.ieee.org/document/8317967/citations#citations. The accepted version of the publication may differ from the final published version. © 2017 IEEE. Due to the rapid growth of population in the last 20 years, an increased number of instances of heavy recurrent traffic congestion has been observed in cities around the world. This rise in traffic has led to greater numbers of traffic incidents and subsequent growth of non-recurrent congestion. Existing incident detection techniques are limited to the use of sensors in the transportation network. In this paper, we analyze the potential of Twitter for supporting real-time incident detection in the United Kingdom (UK). We present a methodology for retrieving, processing, and classifying public tweets by combining Natural Language Processing (NLP) techniques with a Support Vector Machine (SVM) algorithm for text classification. Our approach can detect traffic-related tweets with an accuracy of 88.27%.
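    The classification step — text features fed to a linear SVM — can be sketched with bag-of-words features and hinge-loss subgradient descent, a minimal stand-in for an off-the-shelf SVM implementation. The vocabulary and tweets below are invented, and the paper's NLP preprocessing is omitted.

```python
# Invented toy vocabulary for the example.
VOCAB = ["crash", "delay", "road", "happy", "sunny"]

def featurize(tokens):
    """Bag-of-words count vector over the fixed vocabulary."""
    return [tokens.count(w) for w in VOCAB]

def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Minimize hinge loss + L2 penalty with plain SGD (labels in {+1, -1})."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:    # misclassified or inside the margin
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:             # only the regularizer acts
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, tokens):
    """+1 = incident-related tweet, -1 = not."""
    s = sum(wj * xj for wj, xj in zip(w, featurize(tokens))) + b
    return 1 if s >= 0 else -1
```

    In practice the features would come from the NLP pipeline (tokenization, normalization, tf-idf) rather than raw counts over a five-word vocabulary.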

    Document Classification in Support of Automated Metadata Extraction from Heterogeneous Collections

    A number of federal agencies, universities, laboratories, and companies are placing their documents online and making them searchable via metadata fields such as author, title, and publishing organization. To enable this, every document in the collection must be catalogued using the metadata fields. Though time-consuming, the task of identifying metadata fields by inspecting the document is easy for a human. The visual cues in the formatting of the document, along with accumulated knowledge and intelligence, make it easy for a human to identify various metadata fields. Even with the best possible automated procedures, numerous sources of error exist, including some that cannot be controlled, such as scanned documents with text obscured by smudges, signatures, or stamps. A commercially viable process for metadata extraction must remain robust in the presence of these external sources of error as well as in the face of the uncertainty that accompanies any attempt to automate intelligent behavior. While extraction accuracy and completeness must be the primary goal of an extraction system, the ability to detect and report questionable results is equally important for a production-quality system, since it promotes confidence in the system. We have developed and demonstrated a novel system for extracting metadata. First, a document is examined in an attempt to recognize it as an instance of a known document layout. Then a template, a scripted description of how to associate blocks of text in the layout with metadata fields, is applied to the document to extract the metadata. The extraction is validated after post-processing to evaluate the quality of the extraction and, if necessary, to flag untrusted extractions for human review. The success or failure of the template approach is directly tied to document classification, which is the ability to match the document to the proper template correctly and consistently.
Document classification in our system is implemented as a module which applies every template available in the system to a document to find candidate templates that extract any data at all. The candidate templates are evaluated by a validation module to select the best-performing template. This method is called post hoc classification. Post hoc classification is not only effective at selecting the correct class, but it also excels at minimizing false positives. It is, however, very sensitive to changes in the template collection and to poorly written templates. While this dissertation examines the evolution and all the major components of an automated metadata extraction system, the primary focus is on the problem of document classification. The main thrust of my research has been investigating alternative methods of document classification to replace or supplement post hoc classification. I experimented with machine learning techniques as an additional input factor for the post hoc classification script or the final validation script.
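The post hoc classification loop described above can be sketched as follows: every template is applied to the document, and a validation score (here, simply the number of fields extracted) selects the best candidate. The regex-based templates and sample text are invented for illustration; the dissertation's real templates are scripted layout descriptions, and its validation module is far richer than a field count.

```python
import re

# Invented toy templates: each maps metadata field names to line patterns.
TEMPLATES = {
    "tech_report": {
        "title":  r"^Title:\s*(.+)$",
        "author": r"^Author:\s*(.+)$",
        "org":    r"^Organization:\s*(.+)$",
    },
    "memo": {
        "to":      r"^To:\s*(.+)$",
        "subject": r"^Subject:\s*(.+)$",
    },
}

def apply_template(template, text):
    """Extract whichever metadata fields the template's patterns match."""
    fields = {}
    for name, pattern in template.items():
        m = re.search(pattern, text, re.MULTILINE)
        if m:
            fields[name] = m.group(1).strip()
    return fields

def post_hoc_classify(text):
    """Try every template; keep the one whose extraction validates best."""
    best_name, best_fields = None, {}
    for name, template in TEMPLATES.items():
        fields = apply_template(template, text)
        if len(fields) > len(best_fields):   # crude stand-in for validation
            best_name, best_fields = name, fields
    return best_name, best_fields
```

This also makes the stated weakness concrete: a new, overly permissive template added to `TEMPLATES` could out-score the correct one, which is why the dissertation investigates machine-learning classifiers to supplement the validation step.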