
    Handwritten Stenography Recognition and the LION Dataset

    Purpose: In this paper, we establish a baseline for handwritten stenography recognition using the novel LION dataset and investigate the impact of including selected aspects of stenographic theory in the recognition process. We make the LION dataset publicly available with the aim of encouraging future research in handwritten stenography recognition. Methods: A state-of-the-art text recognition model is trained to establish a baseline. Stenographic domain knowledge is integrated by applying four different encoding methods that transform the target sequence into representations approximating selected aspects of the writing system. Results are further improved by a pre-training scheme based on synthetic data. Results: The baseline model achieves an average test character error rate (CER) of 29.81% and a word error rate (WER) of 55.14%. Test error rates are reduced significantly by combining stenography-specific target sequence encodings with pre-training and fine-tuning, yielding CERs in the range of 24.5% to 26% and WERs of 44.8% to 48.2%. Conclusion: The obtained results demonstrate the challenging nature of stenography recognition. Integrating stenography-specific knowledge, in conjunction with pre-training and fine-tuning on synthetic data, yields considerable improvements. Together with our precursor study on the subject, this is the first work to apply modern handwritten text recognition to stenography. The dataset and our code are publicly available via Zenodo.
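
    The CER and WER figures above are the standard edit-distance metrics for text recognition, not something specific to this paper. As a minimal sketch, assuming the usual definitions (Levenshtein distance normalised by the reference length, over characters for CER and over words for WER), they can be computed as follows; the function names and example strings are illustrative only.

```python
# Minimal sketch of the CER/WER metrics reported above, assuming the
# standard definitions. Example strings are illustrative, not from LION.

def levenshtein(ref, hyp):
    """Edit distance (insertions, deletions, substitutions) between sequences."""
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion from the reference
                curr[j - 1] + 1,         # insertion into the reference
                prev[j - 1] + (r != h),  # substitution (free on a match)
            ))
        prev = curr
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character edit distance / reference length."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    return levenshtein(ref_words, hyp_words) / max(len(ref_words), 1)

print(cer("stenography", "stenografy"))             # 2 edits / 11 chars ~ 0.18
print(wer("the quick brown fox", "the quick fox"))  # 1 edit / 4 words = 0.25
```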

    Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy

    In recent years, deep learning has infiltrated every field it has touched, reducing the need for specialist knowledge and automating the process of knowledge discovery from data. This review argues that astronomy is no different, and that we are currently in the midst of a deep learning revolution that is transforming the way we do astronomy. We trace the history of astronomical connectionism from the early days of multilayer perceptrons, through the second wave of convolutional and recurrent neural networks, to the current third wave of self-supervised and unsupervised deep learning. We then predict that we will soon enter a fourth wave of astronomical connectionism, in which fine-tuned versions of an all-encompassing 'foundation' model will replace expertly crafted deep learning models. We argue that such a model can only be brought about through a symbiotic relationship between astronomy and connectionism, whereby astronomy provides high-quality multimodal data to train the foundation model, and in turn the foundation model is used to advance astronomical research. Comment: 60 pages, 269 references, 29 figures. Review submitted to Royal Society Open Science. Comments and feedback welcome.

    Automatic movie analysis and summarisation

    Automatic movie analysis is the task of applying Machine Learning methods to screenplays, movie scripts, and motion pictures to facilitate or enable various tasks throughout the entirety of a movie’s life-cycle. From helping with making informed decisions about a new movie script with respect to aspects such as its originality, similarity to other movies, or even commercial viability, all the way to offering consumers new and interesting ways of viewing the final movie, many stages in the life-cycle of a movie stand to benefit from Machine Learning techniques that promise to reduce human effort, time, or both. Within this field of automatic movie analysis, this thesis addresses the task of summarising the content of screenplays, enabling users at any stage to gain a broad understanding of a movie from greatly reduced data. The contributions of this thesis are four-fold: (i) We introduce ScriptBase, a new large-scale data set of original movie scripts, annotated with additional meta-information such as genre and plot tags, cast information, and log- and tag-lines. To our knowledge, ScriptBase is the largest data set of its kind, containing scripts and information for almost 1,000 Hollywood movies. (ii) We present a dynamic summarisation model for the screenplay domain, which allows for extraction of highly informative and important scenes from movie scripts. The extracted summaries keep the content of the original script largely intact and provide the user with its important parts, while greatly reducing the script-reading time. (iii) We extend our summarisation model to capture additional modalities beyond the screenplay text. The model is rendered multi-modal by introducing visual information obtained from the actual movie and by extracting scenes from the movie, allowing users to generate visual summaries of motion pictures. (iv) We devise a novel end-to-end neural network model for generating natural language screenplay overviews. This model enables the user to generate short descriptive and informative texts that capture certain aspects of a movie script, such as its genres, approximate content, or style, allowing them to gain a fast, high-level understanding of the screenplay. Multiple automatic and human evaluations were carried out to assess the performance of our models, demonstrating that they are well-suited for the tasks set out in this thesis and outperform strong baselines. Furthermore, the ScriptBase data set has started to gain traction and is currently used by a number of other researchers in the field to tackle various tasks relating to screenplays and their analysis.
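
    The thesis's actual summarisation model is not reproduced here. As a rough, generic illustration of extractive screenplay summarisation like that described in contribution (ii), the sketch below scores each scene by its TF-IDF similarity to the script as a whole and keeps the top-k scenes; all function and parameter names are assumptions for illustration, and the centroid-similarity score is a deliberately simple stand-in for the informativeness scoring the abstract describes.

```python
# Illustrative sketch only: generic extractive summarisation over scenes,
# not the thesis's actual model. Scenes are scored by TF-IDF cosine
# similarity to the script-wide centroid, and the top-k are kept.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def summarise_script(scenes: list[str], k: int = 5) -> list[int]:
    """Return the indices of the k scenes most representative of the script."""
    vectorizer = TfidfVectorizer(stop_words="english")
    scene_vecs = vectorizer.fit_transform(scenes)      # one row per scene
    centroid = np.asarray(scene_vecs.mean(axis=0))     # whole-script profile
    scores = cosine_similarity(centroid, scene_vecs).ravel()
    top_k = np.argsort(scores)[::-1][:k]
    return sorted(top_k.tolist())                      # preserve scene order
```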

    Artistic research into distraction, agency, and the internet

    This practical study is concerned with flows of attention and distraction that are associated with experiences of the internet. Taking the term ‘internet’ to stand for a range of networked social, media-consumption, and data practices carried out on devices such as smartphones, this study sets out to explore how distraction might arise, how it might be conceptualised, and the potential consequences for agency of the conditions of its emergence. The study is led by the production and analysis of artworks, using practical approaches that engage critically with aspects of the experience of the internet. This thesis begins by exploring conceptions of the ‘attention economy’ articulated by Goldhaber (1997), Beller (2006), and Citton (2017), developing an understanding that counters mainstream deterministic positions regarding the impact of digital technologies on the capacity for focused attention. Distraction is considered as an experience that may be sought out by individuals but can be captured and extended by third parties such as social media platforms. The importance of the data generated by habitual or compulsive engagement with internet-enabled devices and services (Zuboff, 2015) is considered against a backdrop of quantification and managerialism that extends beyond experiences of the internet. The study reviews existing artworks made in response to these concerns, focusing on expressions of the ‘attention economy’ prevalent in ‘postinternet’ art. Works by Vierkant (2010), Roth (2015), and others that interrogate infrastructure, data-gathering, or networked methods of distribution are identified as relevant, and a position is developed from which the consequences of metricised display platforms for an artistic ‘attention economy’ can be explored. Prototype artworks made during the study are appraised using an artistic research methodology that foregrounds the role of the researcher as both producer and reader of the artwork. Works that actively create distraction, that gather and visualise data, and that emphasise calm self-interrogation are discussed and evaluated. The practical aspects of the research contribute to knowledge by extending understanding of the spatial, infrastructural, and algorithmic dimensions of the relationship between distraction and agency.

    Fundamentals

    Volume 1 establishes the foundations of this new field. It goes through all the steps, from data collection and its summarisation and clustering to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to their resource requirements and to how scalability can be enhanced on diverse computing architectures, ranging from embedded systems to large computing clusters.

    Image Annotation and Topic Extraction Using Super-Word Latent Dirichlet Allocation

    This research presents a multi-domain solution that uses text and images to iteratively improve automated information extraction. Stage I uses local text surrounding an embedded image to provide clues that help rank-order possible image annotations. These annotations are forwarded to Stage II, where the image annotations from Stage I are used as highly-relevant super-words to improve extraction of topics. The model probabilities from the super-words in Stage II are forwarded to Stage III, where they are used to refine the automated image annotation developed in Stage I. All stages demonstrate improvement over existing equivalent algorithms in the literature.
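
    The staged pipeline above lends itself to a short sketch of the Stage II "super-word" idea: treating the Stage I image annotations as up-weighted tokens mixed into the surrounding text before topic extraction. The repetition-based weighting scheme, names, and parameters below are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of Stage II: image annotations from Stage I are injected
# into each document as heavily weighted "super-words" before LDA topic
# extraction. The repetition-based weighting is an assumption, not the
# paper's exact mechanism.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def topics_with_superwords(docs, annotations, weight=5, n_topics=10):
    """Run LDA where each document's image annotations count `weight` times."""
    boosted = [
        doc + " " + " ".join(annotations[i] * weight)  # repeat super-word tokens
        for i, doc in enumerate(docs)
    ]
    counts = CountVectorizer(stop_words="english").fit_transform(boosted)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    return lda.fit_transform(counts)  # per-document topic proportions
```

    In the full pipeline, the per-document topic probabilities returned here would then be passed to Stage III to refine the Stage I annotations.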

    The Machine as Art/ The Machine as Artist

    The articles collected in this volume from the two companion Arts Special Issues, “The Machine as Art (in the 20th Century)” and “The Machine as Artist (in the 21st Century)”, represent a unique scholarly resource: analyses by artists, scientists, and engineers, as well as art historians, covering not only the current (and astounding) rapprochement between art and technology but also the vital post-World War II period that led up to it. The collection is further distinguished by the fact that several of its contributors are prominent individuals within their own fields, or artists who have actually participated in the still-unfolding events with which it is concerned.

    AI for Everyone?

    We are entering a new era of technological determinism and solutionism in which governments and business actors are seeking data-driven change, assuming that Artificial Intelligence is now inevitable and ubiquitous. But we have not even started asking the right questions, let alone developed an understanding of the consequences. Urgently needed is a debate that asks and answers fundamental questions about power. This book brings together critical interrogations of what constitutes AI, its impact, and its inequalities in order to offer an analysis of what it means for AI to deliver benefits for everyone. The book is structured in three parts: Part 1, AI: Humans vs. Machines, presents critical perspectives on human-machine dualism. Part 2, Discourses and Myths About AI, excavates metaphors and policies to ask normative questions about what is ‘desirable’ AI and what conditions make this possible. Part 3, AI Power and Inequalities, discusses how the implementation of AI creates important challenges that urgently need to be addressed. Bringing together scholars from diverse disciplinary backgrounds and regional contexts, this book offers a vital intervention on one of the most hyped concepts of our times.

    Towards data justice unionism? A labour perspective on AI governance

    We are entering a new era of technological determinism and solutionism in which governments and business actors are seeking data-driven change, assuming that Artificial Intelligence is now inevitable and ubiquitous. But we have not even started asking the right questions, let alone developed an understanding of the consequences. Urgently needed is a debate that asks and answers fundamental questions about power. This book brings together critical interrogations of what constitutes AI, its impact, and its inequalities in order to offer an analysis of what it means for AI to deliver benefits for everyone. The book is structured in three parts: Part 1, AI: Humans vs. Machines, presents critical perspectives on human-machine dualism. Part 2, Discourses and Myths About AI, excavates metaphors and policies to ask normative questions about what is ‘desirable’ AI and what conditions make this possible. Part 3, AI Power and Inequalities, discusses how the implementation of AI creates important challenges that urgently need to be addressed. Bringing together scholars from diverse disciplinary backgrounds and regional contexts, this book offers a vital intervention on one of the most hyped concepts of our times.