7 research outputs found

    Search and Abstracting in an Electronic Document Management System

    This work addresses the problem of searching a document collection by attributes and by full-text search. It presents a modified rubrication method and an abstracting method based on rubrication. The advantages of this approach are demonstrated using the electronic document management system SmartBase.SEDO.

    THE METHOD FOR DETECTING PLAGIARISM IN A COLLECTION OF DOCUMENTS

    This article considers the development of an intelligent plagiarism-detection system that combines two fuzzy-duplicate search algorithms. Combining them yields high computational efficiency. Another advantage of the algorithm is its high efficiency when comparing small documents. In practical use, the algorithm improves the quality of plagiarism detection. It can also be used in various text search systems.

    Facilitating Reading through a Theme-Driven Approach

    Readers often need to explore a document only for a specific point of interest. We call the phenomenon of approaching a narrative not in its entirety, but for a thread on a particular topic, thematic reading. Present reading tools and information retrieval techniques provide only limited assistance to readers in such a situation. Our research centers on this phenomenon. We investigated both human behavior and machine automation, with the goal of better meeting the requirements of thematic reading. To observe readers' behavior and understand their expectations, we implemented a reader's interface with designs targeting the predicted needs of thematic readers. We conducted user studies using both the system and Microsoft Word. We showed that thematic reading is capable of achieving the goal of understanding a specific topic, at least to a degree that succeeds in topic-wise tasks. We also derived guidelines for designing future reading platforms in major aspects such as view, navigation, and contextual awareness. As for machine automation, we investigated the potential to automatically locate thematically relevant excerpts. This investigation was inspired by the editorial compilation of a textbook index. To increase search performance, we proposed a two-step methodology that first expands the query and then filters the intermediate results by checking term-occurrence proximity. For query expansion, we compared expansion with WordNet, with morphological inflections, and with both together. Our results show that, in the context of our study, WordNet made almost no contribution to recall, while expansion with inflectional variants turned out to be a successful and essential scheme.
For the refinement step, the results show that the proximity check on the alternative phrases formed after inflectional expansion can effectively increase the precision of the previously returned results. We further tested a different scheme, using a sliding window, for defining target and verification units in the methodology. Our findings show that structural delimitations (sentences and chapters) outperformed sliding windows. The first scheme achieved consistently desirable results, while the results from the second were inconclusive.
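
The two-step methodology described in this abstract (inflectional query expansion followed by a proximity check within a structural unit) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `inflect`, `split_sentences`, and `thematic_search`, the naive suffix rules, and the sample text are all assumptions made for illustration, and the sentence is used as the only verification unit.

```python
import re


def inflect(term):
    """Naive English inflectional variants (illustrative rules, an assumption)."""
    variants = {term, term + "s", term + "ed", term + "ing"}
    if term.endswith("e"):
        variants |= {term + "d", term[:-1] + "ing"}
    return variants


def split_sentences(text):
    """Split on sentence-ending punctuation; a crude structural delimiter."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def thematic_search(text, query):
    """Return sentences in which every query term, or a variant of it, occurs."""
    # Step 1: expand each query term with its inflectional variants.
    expanded = [inflect(t.lower()) for t in query.split()]
    hits = []
    for sentence in split_sentences(text):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        # Step 2: proximity check, all terms must co-occur within one sentence.
        if all(words & variants for variants in expanded):
            hits.append(sentence)
    return hits


# Made-up sample text, not taken from the study.
sample = (
    "The whale hunted near the ship. "
    "Sailors feared the storm. "
    "Ahab was hunting the white whale."
)
print(thematic_search(sample, "hunt whale"))
```

A sliding-window variant would replace `split_sentences` with fixed-size word windows; the abstract reports that structural units such as sentences and chapters performed better in the study.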

    Fractal summarization for mobile devices to access large documents on the web
