4,116 research outputs found
Variation of word frequencies across genre classification tasks
This paper examines automated genre classification of text documents and its role in enabling the effective management of digital documents by digital libraries and other repositories. Genre classification, which narrows down the possible structure of a document, is a valuable step in realising the general automatic extraction of semantic metadata essential to the efficient management and use of digital objects. In this report, we analyse word frequencies in different genre classes in an effort to understand the distinction between independent classification tasks. In particular, we examine automated experiments on thirty-one genre classes to determine the relationship between word frequency metrics and the degree of their significance in carrying out classification in varying environments.
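As a rough illustration of the kind of per-genre word frequency statistic discussed above, the following minimal sketch computes relative word frequencies for each genre class. The function name `relative_frequencies`, the whitespace tokenisation, and the toy documents are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def relative_frequencies(docs_by_genre):
    """Map each genre to {word: relative frequency} over all its documents.

    Tokenisation is a plain lowercase whitespace split (an assumption).
    """
    result = {}
    for genre, docs in docs_by_genre.items():
        counts = Counter(word for doc in docs for word in doc.lower().split())
        total = sum(counts.values())
        result[genre] = {w: c / total for w, c in counts.items()} if total else {}
    return result


# Toy, made-up documents; real experiments would use a labelled genre corpus.
toy_corpus = {
    "faq": ["what is the fee", "what is the deadline"],
    "abstract": ["this paper examines the method"],
}
freqs = relative_frequencies(toy_corpus)
print(freqs["faq"]["what"])
```

Comparing such frequency profiles across genre classes is one simple way to see which words separate the classes in a given task.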
HITS and misses: combining BM25 with HITS for expert search
This paper describes the participation of Dublin City University in the CriES (Cross-Lingual Expert Search) pilot challenge. To realize expert search, we combine traditional information retrieval (IR) using the BM25 model with reranking of results using the HITS algorithm. The experiments were performed on two indexes, one containing all questions and one containing all answers. Two runs were submitted. The first one contains the combination of results from IR on the questions with authority values from HITS; the second contains the reranked results from IR on answers with authority values. To investigate the impact of multilinguality, additional experiments were conducted on the English topic subset and on all topics translated into English with Google Translate. The overall performance is moderate and leaves much room for improvement. However, reranking results with authority values from HITS typically improved results and more than doubled the number of relevant and retrieved results and precision at 10 documents in many experiments.
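The idea of reranking retrieval results with HITS authority values can be sketched as follows. This is a minimal sketch under stated assumptions, not the method used in the paper: the graph of (asker, answerer) edges, the linear interpolation in `rerank`, the `weight` parameter, and the toy BM25 scores are all hypothetical, since the abstract does not say how the scores were combined.

```python
import math


def hits(edges, iterations=50):
    """Compute HITS hub and authority scores by power iteration.

    `edges` is a list of (source, target) pairs in a directed graph.
    """
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of hub scores of nodes pointing to it.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub score of a node: sum of authority scores of nodes it points to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return hub, auth


def rerank(bm25_scores, authority, weight=0.5):
    """Rerank candidates by interpolating normalised BM25 and authority.

    The linear interpolation is an illustrative assumption.
    """
    top = max(bm25_scores.values(), default=0.0) or 1.0
    combined = {
        cand: (1 - weight) * (score / top) + weight * authority.get(cand, 0.0)
        for cand, score in bm25_scores.items()
    }
    return sorted(combined, key=combined.get, reverse=True)


# Toy graph: each edge points from an asker to the user who answered.
edges = [("u1", "e1"), ("u2", "e1"), ("u3", "e2")]
hub, auth = hits(edges)
# Hypothetical BM25 scores for two candidate experts.
ranking = rerank({"e1": 2.0, "e2": 2.5}, auth)
print(ranking)
```

In this toy example the candidate with the lower BM25 score moves to the top because more askers point to it, which is the kind of effect reranking with authority values is meant to capture.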
Building a Document Genre Corpus: a Profile of the KRYS I Corpus
This paper describes the KRYS I corpus (http://www.krys-corpus.eu/Info.html), consisting of documents classified into 70 genre classes. It has been constructed as part of an effort to automate document genre classification as distinct from topic detection. Previously there has been very little work on building corpora of texts which have been classified using a non-topical genre palette. This is partly because genre, as a concept, is rooted in philosophy, rhetoric and literature, and is highly complex and domain dependent in its interpretation ([11]). The usefulness of genre in everyday information search is only now starting to be recognised, and there is no genre classification schema that has been consolidated to have applicable value in this direction. By presenting here our experiences in constructing the KRYS I corpus, we hope to shed light on information gathering and seeking behaviour and the role of genre in these activities, as well as on a way forward for creating a better corpus for testing automated genre classification tasks and the application of these tasks to other domains.
SWA-KMDLS: An Enhanced e-Learning Management System Using Semantic Web and Knowledge Management Technology
In this era of the knowledge economy, in which knowledge has become the most precious
resource, surveys have shown that the use of e-Learning is increasing in various
organizations, including educational and corporate ones. e-Learning is used
not only to acquire knowledge but also to maintain competitiveness and advantages
for individuals or organizations. However, the early promise of e-Learning has yet to be
fully realized, as it has been no more than handouts published online, coupled with
simple multiple-choice quizzes. The emergence of e-Learning 2.0, empowered by
Web 2.0 technology, has hardly overcome common problems such as information
overload and poor content aggregation among the rapidly increasing number of learning objects
in an e-Learning Management System (LMS) environment.
The aim of this research study is to exploit Semantic Web (SW) and Knowledge
Management (KM) technology, two emerging and promising technologies, to enhance
the existing LMS. The proposed system is named the Semantic Web Aware-Knowledge
Management Driven e-Learning System (SWA-KMDLS). An Ontology approach, which is
the backbone of SW and KM, is introduced for managing knowledge, especially from
learning objects, and for developing an automated question answering system (Aquas) with
an expert locator in SWA-KMDLS. The METHONTOLOGY methodology is selected to
develop the Ontology in this research work.
This research identifies the potential of SW and KM technology, which will
benefit e-Learning developers building e-Learning systems, especially those with a social
constructivist pedagogical approach, from the point of view of the KM framework and SW
environment. The (semi-)automatic ontological knowledge base construction system
(SAOKBCS) has contributed to semi-automatic knowledge extraction from learning objects,
whilst the Aquas with expert locator has facilitated knowledge retrieval
that encourages knowledge sharing in the e-Learning environment.
The experiment conducted has shown that the SAOKBCS can extract concepts, the
main component of an Ontology, from text learning objects with a precision of 86.67%, thus
saving the expert the time and effort of building the Ontology manually. Additionally, the
experiment on Aquas has shown that more than 80% of users are satisfied with the answers
provided by the system. The expert locator framework can also improve the performance
of Aquas in future usage.
Keywords: Semantic Web Aware-Knowledge Management Driven e-Learning System (SWA-KMDLS),
semi-automatic ontological knowledge base construction system (SAOKBCS),
automated question answering system (Aquas), Ontology, expert locator
Evaluating SMS parsing using automated testing software
Mobile phones are ubiquitous, with millions of users acquiring them every day for personal, business and social use or communication.
Their enormous pervasiveness makes them an advantageous technological tool for overcoming the challenges of
disseminating information on burning issues, advertisements, and health-related matters. Short message services (SMS), an integral
function of cell phones, can be turned into a major tool for accessing databases of information on HIV/AIDS, as an appreciable
percentage of young people embrace the technology. Common features of the distinctive language used by SMS users are ungrammatical
structure, convenience spellings, homophony of words and alphanumeric mixing in the arrangement of words. This makes it
difficult for an SMS to serve as a query in a search engine architecture. In this work, SMS queries were used to access information in a Frequently
Asked Questions (FAQ) system under a specified medical domain. Finally, when the developed system was measured in terms of proximity
to the answer retrieved, remarkable results were observed.
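One common way to make SMS text usable as a search query is to normalise it against a lookup table before retrieval. The sketch below illustrates that general idea only; the abstract does not describe the parsing method actually used, and the `SMS_LEXICON` table and `normalize_sms` function are hypothetical names introduced for illustration.

```python
import re

# Hypothetical lookup table mapping SMS shorthand to standard words.
# A real system would need a much larger, domain-specific lexicon.
SMS_LEXICON = {
    "wat": "what",
    "r": "are",
    "d": "the",
    "u": "you",
    "b4": "before",
    "gr8": "great",
}


def normalize_sms(query, lexicon=SMS_LEXICON):
    """Lowercase the query, split it into alphanumeric tokens, and
    replace each token found in the lexicon with its standard form."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    return " ".join(lexicon.get(tok, tok) for tok in tokens)


print(normalize_sms("Wat r d symptoms b4 treatment?"))
```

The normalised string could then be passed to an ordinary keyword search over the FAQ questions.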