4,111 research outputs found

    Classifying Amharic News Text Using Self-Organizing Maps

    Get PDF
    The paper addresses using artificial neural networks for classification of Amharic news items. Amharic is the language for countrywide communication in Ethiopia and has its own writing system containing extensive systematic redundancy. It is quite dialectally diversified and probably representative of the languages of a continent that so far has received little attention within the language processing field. The experiments investigated document clustering around user queries using Self-Organizing Maps, an unsupervised learning neural network strategy. The best ANN model showed a precision of 60.0% when trying to cluster unseen data, and a 69.5% precision when trying to classify it

    CHORUS Deliverable 4.3: Report from CHORUS workshops on national initiatives and metadata

    Get PDF
    Minutes of the following Workshops: • National Initiatives on Multimedia Content Description and Retrieval, Geneva, October 10th, 2007. • Metadata in Audio-Visual/Multimedia production and archiving, Munich, IRT, 21st – 22nd November 2007 Workshop in Geneva 10/10/2007 This highly successful workshop was organised in cooperation with the European Commission. The event brought together the technical, administrative and financial representatives of the various national initiatives, which have been established recently in some European countries to support research and technical development in the area of audio-visual content processing, indexing and searching for the next generation Internet using semantic technologies, and which may lead to an internet-based knowledge infrastructure. The objective of this workshop was to provide a platform for mutual information and exchange between these initiatives, the European Commission and the participants. Top speakers were present from each of the national initiatives. There was time for discussions with the audience and amongst the European National Initiatives. The challenges, communalities, difficulties, targeted/expected impact, success criteria, etc. were tackled. This workshop addressed how these national initiatives could work together and benefit from each other. Workshop in Munich 11/21-22/2007 Numerous EU and national research projects are working on the automatic or semi-automatic generation of descriptive and functional metadata derived from analysing audio-visual content. The owners of AV archives and production facilities are eagerly awaiting such methods which would help them to better exploit their assets.Hand in hand with the digitization of analogue archives and the archiving of digital AV material, metadatashould be generated on an as high semantic level as possible, preferably fully automatically. All users of metadata rely on a certain metadata model. All AV/multimedia search engines, developed or under current development, would have to respect some compatibility or compliance with the metadata models in use. The purpose of this workshop is to draw attention to the specific problem of metadata models in the context of (semi)-automatic multimedia search

    Mobile Phone Text Processing and Question-Answering

    Get PDF
    Mobile phone text messaging between mobile users and information services is a growing area of Information Systems. Users may require the service to provide an answer to queries, or may, in wikistyle, want to contribute to the service by texting in some information within the service’s domain of discourse. Given the volume of such messaging it is essential to do the processing through an automated service. Further, in the case of repeated use of the service, the quality of such a response has the potential to benefit from a dynamic user profile that the service can build up from previous texts of the same user. This project will investigate the potential for creating such intelligent mobile phone services and aims to produce a computational model to enable their efficient implementation. To make the project feasible, the scope of the automated service is considered to lie within a limited domain of, for example, information about entertainment within a specific town centre. The project will assume the existence of a model of objects within the domain of discourse, hence allowing the analysis of texts within the context of a user model and a domain model. Hence, the project will involve the subject areas of natural language processing, language engineering, machine learning, knowledge extraction, and ontological engineering

    Target Apps Selection: Towards a Unified Search Framework for Mobile Devices

    Full text link
    With the recent growth of conversational systems and intelligent assistants such as Apple Siri and Google Assistant, mobile devices are becoming even more pervasive in our lives. As a consequence, users are getting engaged with the mobile apps and frequently search for an information need in their apps. However, users cannot search within their apps through their intelligent assistants. This requires a unified mobile search framework that identifies the target app(s) for the user's query, submits the query to the app(s), and presents the results to the user. In this paper, we take the first step forward towards developing unified mobile search. In more detail, we introduce and study the task of target apps selection, which has various potential real-world applications. To this aim, we analyze attributes of search queries as well as user behaviors, while searching with different mobile apps. The analyses are done based on thousands of queries that we collected through crowdsourcing. We finally study the performance of state-of-the-art retrieval models for this task and propose two simple yet effective neural models that significantly outperform the baselines. Our neural approaches are based on learning high-dimensional representations for mobile apps. Our analyses and experiments suggest specific future directions in this research area.Comment: To appear at SIGIR 201

    Spoken content retrieval: A survey of techniques and technologies

    Get PDF
    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR
    corecore