The Learning Grid and E-Assessment using Latent Semantic Analysis

Author: Haley Debra
Lefrere Paul
Nuseibeh Bashar
Taylor Josie
Thomas Pete
Publication venue: 'IOS Press'
Publication date: 01/11/2005

E-assessment is an important component of e-learning and e-qualification. Formative and summative assessment serve different purposes, and both types of evaluation are critical to the pedagogical process. While students are studying, practicing, working, or revising, formative assessment provides direction, focus, and guidance. Summative assessment provides the means to evaluate a learner's achievement and communicate that achievement to interested parties. Latent Semantic Analysis (LSA) is a statistical method for inferring meaning from a text. Applications based on LSA exist that provide both summative and formative assessment of a learner's work. However, the huge computational needs are a major problem with this promising technique. This paper explains how LSA works, describes the breadth of existing applications using LSA, explains how LSA is particularly suited to e-assessment, and proposes research to exploit the potential computational power of the Grid to overcome one of LSA's drawbacks.

Open Research Online (The Open University)

Part-of-Speech Enhanced Context Recognition

Author: Hansen Lars Kai
Madsen Rasmus Elsborg
Publication date: 01/01/2004

Online Research Database In Technology

Part-of-Speech Enhanced Context Recognition

Author: Hansen Lars Kai
Larsen Jan
Madsen Rasmus Elsborg
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

Crossref

Online Research Database In Technology

Vocabulary Pruning for Improved Context Recognition

Author: Hansen Lars Kai
Larsen Jan
Madsen Rasmus Elsborg
Sigurdsson Sigurdur
Publication venue: IEEE Press
Publication date: 01/01/2004

Online Research Database In Technology

A document management methodology based on similarity contents

Author: Meziane F
Rezgui Y
Publication venue: 'Elsevier BV'
Publication date: 01/01/2004

The advent of the WWW and distributed information systems has made it possible to share documents between different users and organisations. However, this has created many problems related to the security, accessibility, rights and, most importantly, the consistency of documents. It is important that the people involved in the document management process have access to the most up-to-date version of documents, retrieve the correct documents, and are able to update the document repository in such a way that their documents are known to others. In this paper we propose a method for organising, storing and retrieving documents based on similarity of content. The method uses techniques based on information retrieval, document indexation, and term extraction and indexing. This methodology is developed for the E-Cognos project, which aims at developing tools for the management and sharing of documents in the construction domain.

University of Salford Institutional Repository

Probabilistic Latent Semantic Analyses (PLSA) in Bibliometric Analysis for Technology Forecasting

Author: Chan K.C.
Liu Jinlan
Tsim Y.C.
Yeung W.S.
Zan Wang
Publication venue: Facultad de Economía y Negocios, Universidad Alberto Hurtado
Publication date: 01/03/2007

Due to the availability of internet-based abstract services and patent databases, bibliometric analysis has become one of the key technology forecasting approaches. Recently, latent semantic analysis (LSA) has been applied to improve the accuracy of document clustering. In this paper, a new LSA method, probabilistic latent semantic analysis (PLSA), which uses probabilistic methods and algebra to search the latent space in the corpus, is further applied to document clustering. The results show that PLSA is more accurate than LSA and that the improved iteration method proposed by the authors can simplify the computing process and improve computing efficiency.

Directory of Open Access Journals

Journal of Technology Management & Innovation

Pruning the vocabulary for better context recognition

Author: Hansen Lars Kai
Larsen Jan
Madsen Rasmus Elsborg
Sigurdsson Sigurdur
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

Language independent 'bag-of-words' representations are surprisingly effective for text classification. The representation is high dimensional though, containing many non-consistent words for text categorization. These non-consistent words result in reduced generalization performance of subsequent classifiers, e.g., from ill-posed principal component transformations. In this communication our aim is to study the effect of removing the least relevant words from the bag-of-words representation. We consider a new approach, using neural network based sensitivity maps and information gain for determination of term relevancy, when pruning the vocabularies. With reduced vocabularies, documents are classified using a latent semantic indexing representation and a probabilistic neural network classifier. Reducing the bag-of-words vocabularies by 90%-98%, we find consistent classification improvement using two mid-size data sets. We also study the applicability of information gain and sensitivity maps for automated keyword generation.

CiteSeerX

Crossref

Online Research Database In Technology

The singular values and vectors of low rank perturbations of large rectangular random matrices

Author: Benaych-Georges Florent
Nadakuditi Raj Rao
Publication date: 01/01/2011

In this paper, we consider the singular values and singular vectors of finite, low rank perturbations of large rectangular random matrices. Specifically, we prove almost sure convergence of the extreme singular values and appropriate projections of the corresponding singular vectors of the perturbed matrix. As in the prequel, where we considered the eigenvalue aspect of the problem, the non-random limiting value is shown to depend explicitly on the limiting singular value distribution of the unperturbed matrix via an integral transform that linearizes rectangular additive convolution in free probability theory. The large matrix limit of the extreme singular values of the perturbed matrix differs from that of the original matrix if and only if the singular values of the perturbing matrix are above a certain critical threshold which depends on this same aforementioned integral transform. We examine the consequence of this singular value phase transition on the associated left and right singular eigenvectors and discuss the finite n fluctuations above these non-random limits. Comment: 22 pages, presentation of the main results and of the hypotheses slightly modified

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

HAL-Polytechnique

Automatic image captioning

Author: Duygulu P.
Faloutsos C.
Pan J.-Y.
Yang H.-J.
Publication date: 01/01/2004

In this paper, we examine the problem of automatic image captioning. Given a training set of captioned images, we want to discover correlations between image features and keywords, so that we can automatically find good keywords for a new image. We experiment thoroughly with multiple design alternatives on large datasets of various content styles, and our proposed methods achieve up to a 45% relative improvement in captioning accuracy over the state of the art.

Bilkent University Institutional Repository