Automatically generated summaries of sports videos based on semantic content

Author: Miguel André Almeida Tomás Ferreira de Barros
Publication date: 18/07/2019

Sport has been part of our lives since the beginning of time, whether we are spectators or participants. The spread and growth of multimedia platforms have made this content available to everyone. Sports videos appeal to a large population all around the world and have become an important form of multimedia content streamed over the Internet and television networks. Moreover, sports content creators want to provide users with relevant information, such as live commentary and summaries of games in text or video form, using automatic tools. As a result, MOG-Technologies wants to create a tool capable of summarizing football matches based on semantic content, and this problem was explored in the scope of this Dissertation. The main objective is to convert the television football commentator's speech into text using Google's Speech-to-Text tool. Several machine learning models were then tested to classify sentences into important events. For model training, a dataset was created by combining transcriptions of 43 games from different television channels with 72 games of timeline commentary provided by Google Search; the combined dataset contains 3260 sentences. To validate the proposed solution, the accuracy and F1 score were computed for each machine learning model. The results show that the developed tool is capable of predicting events in live games with a low error rate. Also, combining multiple sources, not only the sports commentator's speech, will help to increase the performance of the tool. It is important to note that the dataset created during this Dissertation will allow MOG-Technologies to expand and perfect the concept discussed in this project.

Repositório Aberto da Universidade do Porto
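
The abstract above reports accuracy and F1 score for each sentence classifier. As a minimal sketch of how those two metrics are computed, the snippet below uses invented labels; the class names (`goal`, `foul`, `other`) and the sample predictions are assumptions for illustration and are not taken from the dissertation's dataset or models.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for six commentary sentences (illustrative only).
y_true = ["goal", "other", "foul", "goal", "other", "other"]
y_pred = ["goal", "other", "other", "goal", "goal", "other"]

print("accuracy:", accuracy(y_true, y_pred))
print("F1 (goal):", f1_score(y_true, y_pred, "goal"))
```

Per-class F1 is shown here because event classes in commentary are likely to be imbalanced, in which case accuracy alone can look good while rare events are missed; whether the dissertation averaged F1 across classes is not stated in the abstract.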

Analyzing Session Laws of the State of North Carolina: An Automated Approach Using Machine Learning and Natural Language Processing

Author: Dalwadi Rucha Hareshkumar
Publication date: 01/01/2020

This exploratory study aims to automatically find the common themes and topics in North Carolina’s 3,01,328 session laws over a period of more than a hundred years, from 1867 to 1968, and to group the laws within the identified topics. I specifically address four research questions: (1) identifying the topics in the entire corpus; (2) finding the difference between the topics in private and public laws; (3) finding the difference between topics over time; and (4) discovering topics that may denote racially based legislation. To address the research questions and discover topics in session laws, I adapt Latent Dirichlet Allocation, an unsupervised machine learning technique for topic modelling. I find that the entire corpus can be grouped into 28 different topics, which vary in proportion between sections and decades. I also find that some topics were similar to topics identified in racially based laws covered in the literature. Master of Science in Information Science

Carolina Digital Repository

Bridging Cross-Modal Alignment for OCR-Free Content Retrieval in Scanned Historical Documents

Author: Molina Rodríguez Adrià
Universitat Autònoma de Barcelona. Departament de Ciències de la Computació
Universitat Autònoma de Barcelona. Escola d'Enginyeria
Publication date: 01/01/2023

In this work, we address the limitations of current approaches to document retrieval by incorporating vision-based topic extraction. While previous methods have primarily focused on visual elements or relied on optical character recognition (OCR) for text extraction, we propose a paradigm shift by directly incorporating vision into the topic space. We demonstrate that recognizing all visual elements within a document is unnecessary for identifying its underlying topic. Visual cues such as icons, writing style, and font can serve as sufficient indicators. By leveraging ranking loss functions and convolutional neural networks (CNNs), we learn complex topological representations that mimic the behavior of text representations. Our approach aims to eliminate the need for OCR and its associated challenges, including efficiency, performance, data-hunger, and expensive annotation. Furthermore, we highlight the significance of incorporating vision in historical documentation, where visually antiquated documents contain valuable cues. Our research contributes to the understanding of topic extraction from a vision perspective and offers insights into annotation-cheap document retrieval systems.

Diposit Digital de Documents de la UAB

Information Retrieval with Finnish Case Law Embeddings

Author: Sarsa Sami
Publication venue: Helsingfors universitet
Publication date: 01/01/2019

In this work, the capability of five text vectorisation models to embed Finnish case law texts in vector space for inter-textual similarity computation is studied. The embeddings and their computed similarities are used to create a Finnish case law retrieval system that allows effective querying with full documents. A working web application is presented as a part of the work. The case law data for the work is provided by the Finnish Ministry of Justice, and the studied models are: TF-IDF, LDA, Word2Vec, Doc2Vec and Doc2vecC.

Helsingin yliopiston digitaalinen arkisto
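
Of the five models listed in the abstract above, TF-IDF is the simplest to illustrate. The following is a minimal sketch, not the thesis implementation, of how full documents can be embedded as TF-IDF vectors and ranked by cosine similarity against a query document. The three sample texts, the whitespace tokenisation, and the smoothed IDF formula are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenised document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumed variant) so no term gets a zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (count / len(doc)) * idf[t] for t, count in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_index, vectors):
    """Indices of all other documents, most similar to the query first."""
    q = vectors[query_index]
    others = [i for i in range(len(vectors)) if i != query_index]
    return sorted(others, key=lambda i: cosine(q, vectors[i]), reverse=True)


# Invented English stand-ins for case law texts (illustrative only).
docs = [
    "the court dismissed the appeal on contract damages".split(),
    "appeal concerning damages for breach of contract".split(),
    "sentence for aggravated assault was reduced".split(),
]
vectors = tfidf_vectors(docs)
ranking = rank(0, vectors)
print(ranking)
```

Querying with a whole document rather than a few keywords, as the abstract describes, amounts to using that document's own vector as the query, which is what `rank` does here.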

Rerunning OCR: A Machine Learning Approach to Quality Assessment and Enhancement Prediction

Author: Maurer Yves
Schneider Pit
Publication date: 31/03/2022

Iterating with new and improved OCR solutions requires decisions about which candidates to target for reprocessing. This especially applies when the underlying data collection is of considerable size and rather diverse in terms of fonts, languages, periods of publication and, consequently, OCR quality. This article captures the efforts of the National Library of Luxembourg to support those targeting decisions. They are crucial in order to guarantee low computational overhead and reduced quality degradation risks, combined with a more quantifiable OCR improvement. In particular, this work explains the methodology of the library with respect to text block level quality assessment. Through extension of this technique, a regression model that is able to take into account the enhancement potential of a new OCR engine is also presented. Both are promising approaches, especially for cultural institutions dealing with historical data of lower quality. Comment: Journal of Data Mining and Digital Humanities; Major revision

arXiv.org e-Print Archive

Episciences.org

Directory of Open Access Journals