    Classifying Amharic News Text Using Self-Organizing Maps

    The paper addresses the use of artificial neural networks (ANNs) to classify Amharic news items. Amharic is the language for countrywide communication in Ethiopia and has its own writing system containing extensive systematic redundancy. It is quite dialectally diversified and probably representative of the languages of a continent that so far has received little attention within the language processing field. The experiments investigated document clustering around user queries using Self-Organizing Maps, an unsupervised learning neural network strategy. The best ANN model showed a precision of 60.0% when trying to cluster unseen data, and a precision of 69.5% when trying to classify it.

    Automatic Construction of Multi-faceted User Profiles using Text Clustering and its Application to Expert Recommendation and Filtering Problems

    In the information age we are living in today, we are interested not only in accessing multimedia objects such as documents and videos, but also in searching for professional experts, people or celebrities, possibly for professional needs or just for fun. Information access systems need to be able to extract and exploit various sources of information (usually in text format) about such individuals, and to represent them in a suitable way, usually in the form of a profile. In this article, we tackle the problems of profile-based expert recommendation and document filtering from a machine learning perspective by clustering expert textual sources to build profiles and capture the different hidden topics in which the experts are interested. The experts will then be represented by means of multi-faceted profiles. Our experiments show that this is a valid technique to improve the performance of expert finding and document filtering.

    Natural language processing

    Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.

    Searching for Non-English Web Content: An Empirical Study of the Spanish Business Intelligence Portal

    As non-English-speaking online populations grow rapidly, there is an increasing need to support searching for non-English Web content. Prior research has assumed English to be the primary language for Web searching, but this is not the case for many non-English-speaking regions. For example, Latin America will have the fastest growing population in the coming decades, but existing Spanish search engines lack search, browse, and analysis capabilities. In this paper, we propose a language-independent approach to supporting non-English Web searching. Based on the approach, we have developed the Spanish Business Intelligence Portal (SBizPort) to support searching, browsing, summarization, categorization, and visualization of Web information. Results from an empirical study involving Spanish subjects show that the portal achieved significantly better user ratings on information quality, cross-regional search capability, and overall satisfaction than the benchmark search portal. This study thus contributes to human-computer interaction research on non-English Web searching.