
    Search and Abstracting in an Electronic Document Management System (Пошук і реферування в системі електронного документообігу)

    This work addresses the problem of searching a document collection by attributes and by full-text search. It presents a modified rubrication method and a rubrication-based abstracting method. The advantages of this approach are demonstrated on the electronic document management system SmartBase.SEDO.

    Developing Adaptive and Personalized Mobile Applications: A Framework and Design Issues

    The rapid growth of mobile technology has expedited ubiquitous information access via handheld devices. However, mobile information systems differ fundamentally from desktop applications in purpose of use, device features, communication networks, and working environments. This poses various challenges for mobile information systems in delivering and presenting multimedia content effectively and adaptively. One of the major challenges is to deliver personalized information to the right person in a preferred format as the environment changes. This paper proposes a framework for developing mobile applications that deliver personalized, context-aware, and adaptive content to mobile users. The framework consists of four major components: information selection, content analysis, media transcoding, and customized presentation. It can be applied to a variety of mobile applications such as mobile web, news alert services, and mobile commerce.

    In Situ Text Summarisation for Museum Visitors

    THE METHOD FOR DETECTING PLAGIARISM IN A COLLECTION OF DOCUMENTS

    This article describes the development of an intelligent plagiarism detection system that combines two fuzzy-duplicate detection algorithms. Combining them yields high computational efficiency. Another advantage of the algorithm is its high efficiency when small documents are compared. In practice, the algorithm makes it possible to improve the quality of plagiarism detection. It can also be used in various text search systems.

    Towards Personalized and Human-in-the-Loop Document Summarization

    The ubiquitous availability of computing devices and the widespread use of the internet continuously generate large amounts of data. As a result, the amount of information available on any given topic far exceeds humans' capacity to process it properly, causing what is known as information overload. To cope efficiently with large amounts of information and generate content of significant value to users, we need to identify, merge and summarise information. Data summaries can help gather related information and collect it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. We propose novel approaches to tackle these challenges by (i) enabling automatic intelligent feature engineering, (ii) enabling flexible and interactive summarisation, and (iii) utilising intelligent and personalised summarisation approaches. The experimental results demonstrate the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data. Comment: PhD thesis

    Design implications for mobile user interfaces of Internet services

    Internet services are becoming essential in people's daily lives. In addition to being accessed on a PC, Internet services offer functionality and content that are also relevant for mobile use. At the same time, today's mobile devices are technologically sophisticated, enabling online access anytime, anywhere. The remaining challenge is to utilize the capabilities of a mobile device in a way that offers people a positive user experience when they use Internet services on the go. This thesis belongs to the area of Human-Computer Interaction and focuses on the use of Internet services on a mobile device. It considers the limitations of a mobile device in terms of user interface design, and its goal is to define design implications that assist in designing mobile user interfaces for Internet services. The design implications mainly aim to give guidance on how to design a mobile Web browser, but they are complemented by research findings on designing a mobile client application for an Internet service. The research was carried out through user needs studies, user interface design, and user evaluations. The studies focused on two approaches that support the use of Internet services on mobile devices, represented by the Minimap Web browser and the Image Exchange mobile client application. The resulting design implications suggest that the following aspects should be considered when designing mobile user interfaces for Internet services: content optimization, utilization of desktop and mobile usage patterns, full exploitation of device capabilities, compensation for device resources, and content updating. The possible differences in characteristics between a mobile Web browser and a mobile client application are also examined. Finally, this thesis discusses the latest developments that enable alternative ways to support Internet services on mobile devices in the future.

    Automatic Text Summarization (Sumarização automática de texto)

    The act of summarizing, that is, making the description of an idea or concept more succinct, is a fairly commonplace activity. People constantly produce this kind of succinct representation of whatever they want to describe or communicate, and written summaries are a very common form of synthesis. Traditionally, such summaries are produced manually by people who analyze texts and try to identify the main concepts in them. So-called information overload, greatly amplified by the explosion of the Internet, has brought about an ever larger volume of available information, which makes this manual work very difficult, if not impossible. Several efforts have been made to solve this problem by developing techniques that extract the most relevant content of documents in condensed form, without altering its original meaning and with minimal human intervention. The work carried out in this dissertation explored several approaches to extractive text summarization by implementing computational methods based on textual statistics and graph theory. A further method was implemented that fuses these approaches with other features, such as keyword search and the position of sentences in the text; it is referred to as the hybrid method. The summarization is purely extractive: the summary is composed by scoring the sentences of the original text and then selecting the subset of most informative sentences so as to satisfy a given compression rate. In the purely statistical approach, a method was developed that assesses the relevance of terms in the text from their frequencies in the source text and in a corpus. The graph-theoretic approach was used for two distinct tasks: scoring sentences by their centrality, and extracting keywords. The hybrid approach uses the features described above in a linear combination, governed by a set of weights associated with the individual components. The performance of the approaches is evaluated on collections of news texts from the Document Understanding Conferences (DUC). The ROUGE tool was used to assess the quality of the summaries produced. The proposed methods were then compared with one another through an intrinsic, automatic evaluation of the information content of the extracts produced. The results show that the hybrid method achieves the best ROUGE scores, a tendency due essentially to a positional heuristic that gives greater weight to sentences near the top of the text, a model that suits the textual structure of news articles particularly well.
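
    The hybrid scoring idea in this abstract (a weighted linear combination of sentence features) can be illustrated with a minimal sketch. It is not the dissertation's implementation: it uses only two features (in-document term frequency and sentence position), omits the corpus statistics, graph centrality, and keyword features, and the function names and default weights `w_freq` and `w_pos` are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphabetic tokens only; a stand-in for real preprocessing.
    return re.findall(r"[a-z]+", sentence.lower())


def hybrid_summary(text, n_sentences=2, w_freq=0.5, w_pos=0.5):
    """Pick the n highest-scoring sentences; weights are illustrative."""
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    freq = Counter(t for toks in tokens for t in toks)
    max_freq = max(freq.values()) if freq else 1

    scores = []
    for i, toks in enumerate(tokens):
        # Frequency feature: mean normalised term frequency of the sentence.
        f = sum(freq[t] for t in toks) / (len(toks) * max_freq) if toks else 0.0
        # Positional feature: 1.0 for the first sentence, decaying linearly.
        p = 1.0 - i / len(sentences)
        # Linear combination of the features.
        scores.append(w_freq * f + w_pos * p)

    # Select the top-scoring sentences, then restore document order.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(top[:n_sentences])]
```

    Setting `w_freq=0.0` and `w_pos=1.0` reduces the sketch to a pure lead-sentence baseline, which makes it easy to see how much a positional heuristic alone contributes on news-style text.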

    Facilitating Reading through a Theme-Driven Approach

    Readers often need to explore a document for only a specific point of interest. We call the phenomenon of approaching a narrative not in its entirety but for a thread on a particular topic thematic reading. Present reading tools and information retrieval techniques provide only limited assistance to readers in such a situation. Our research centers on this phenomenon. We conducted investigations on both human behavior and machine automation, with the goal of better meeting the requirements of thematic reading. To observe readers' behavior and understand their expectations, we implemented a reader's interface with designs targeting the predicted needs of thematic readers. We conducted user studies using both the system and Microsoft Word. We showed that thematic reading can achieve the goal of understanding a specific topic, at least to a degree that suffices for topic-wise tasks. We also derived guidelines for designing future reading platforms in major aspects such as view, navigation, and contextual awareness. As for machine automation, we investigated the potential to automatically locate thematically relevant excerpts. This investigation was inspired by the editorial compilation of a textbook index. To increase search performance, we proposed a two-step methodology that first expands the query and then filters the intermediate results by checking term-occurrence proximity. For query expansion, we compared expansion with WordNet, with morphological inflections, and with both together. Our results show that, in the context of our study, WordNet made almost no contribution to recall, while expansion with inflectional variants turned out to be a successful and essential scheme. For the refinement step, the results show that the proximity check on the alternative phrases formed after inflectional expansion can effectively increase the precision of the previously returned results. We further tested a different scheme, using a sliding window, for defining target and verification units in the methodology. Our findings show that structural delimitations (sentences and chapters) outperformed sliding windows. The first scheme achieved consistently desirable results, while the results from the second were inconclusive.
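
    The two-step methodology described in this abstract (query expansion followed by a proximity filter) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' system: `inflect` is a hypothetical, crude suffix-based stand-in for a real morphological generator, and the verification unit is assumed to be the sentence.

```python
import re


def inflect(term):
    # Hypothetical stand-in for morphological expansion: a few English suffixes.
    return {term, term + "s", term + "es", term + "ed", term + "ing"}


def find_excerpts(text, query_terms):
    """Return sentences in which every query term occurs in some inflected form."""
    # Step 1: expand each query term into its set of inflectional variants.
    expanded = [inflect(t.lower()) for t in query_terms]
    any_variant = set().union(*expanded) if expanded else set()

    # Naive sentence splitter (assumption): break after ., ! or ? plus whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    hits = []
    for s in sentences:
        words = set(re.findall(r"[a-z]+", s.lower()))
        # Intermediate result: the sentence matches at least one variant.
        if not (words & any_variant):
            continue
        # Step 2: proximity check, here "all terms co-occur in one sentence".
        if all(variants & words for variants in expanded):
            hits.append(s)
    return hits
```

    For example, with the query terms `["whale", "hunt"]`, a sentence such as "Sailors hunted whales at dawn." passes both steps, while sentences mentioning only one of the two terms are dropped by the proximity check.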