53 research outputs found
Handwritten Stenography Recognition and the LION Dataset
Purpose: In this paper, we establish a baseline for handwritten stenography
recognition, using the novel LION dataset, and investigate the impact of
incorporating selected aspects of stenographic theory into the recognition process.
We make the LION dataset publicly available with the aim of encouraging future
research in handwritten stenography recognition.
Methods: A state-of-the-art text recognition model is trained to establish a
baseline. Stenographic domain knowledge is integrated by applying four
different encoding methods that transform the target sequence into
representations, which approximate selected aspects of the writing system.
Results are further improved by integrating a pre-training scheme, based on
synthetic data.
Results: The baseline model achieves an average test character error rate
(CER) of 29.81% and a word error rate (WER) of 55.14%. Test error rates are
reduced significantly by combining stenography-specific target sequence
encodings with pre-training and fine-tuning, yielding CERs in the range of
24.5% - 26% and WERs of 44.8% - 48.2%.
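Both metrics quoted above are normalised edit distances. The following is a minimal sketch of the standard CER/WER definitions, not the paper's own evaluation code:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution (0 if match)
        prev = curr
    return prev[-1]

def cer(ref, hyp):
    """Character error rate: character edits per reference character."""
    return edit_distance(ref, hyp) / len(ref)

def wer(ref, hyp):
    """Word error rate: the same edit distance computed over word tokens."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())
```

A CER of 29.81% therefore means that, on average, roughly three in ten reference characters require an edit to match the model's prediction.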
Conclusion: The obtained results demonstrate the challenging nature of
stenography recognition. Integrating stenography-specific knowledge, in
conjunction with pre-training and fine-tuning on synthetic data, yields
considerable improvements. Together with our precursor study on the subject,
this is the first work to apply modern handwritten text recognition to
stenography. The dataset and our code are publicly available via Zenodo.
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
In recent years, deep learning has infiltrated every field it has touched,
reducing the need for specialist knowledge and automating the process of
knowledge discovery from data. This review argues that astronomy is no
different, and that we are currently in the midst of a deep learning revolution
that is transforming the way we do astronomy. We trace the history of
astronomical connectionism from the early days of multilayer perceptrons,
through the second wave of convolutional and recurrent neural networks, to the
current third wave of self-supervised and unsupervised deep learning. We then
predict that we will soon enter a fourth wave of astronomical connectionism, in
which finetuned versions of an all-encompassing 'foundation' model will replace
expertly crafted deep learning models. We argue that such a model can only be
brought about through a symbiotic relationship between astronomy and
connectionism, whereby astronomy provides high quality multimodal data to train
the foundation model, and in turn the foundation model is used to advance
astronomical research.
Comment: 60 pages, 269 references, 29 figures. Review submitted to Royal
Society Open Science. Comments and feedback welcome.
Automatic movie analysis and summarisation
Automatic movie analysis is the task of applying Machine Learning methods to
screenplays, movie scripts, and motion pictures to facilitate or enable various
tasks throughout the entirety of a movie’s life-cycle. From helping with making
informed decisions about a new movie script with respect to aspects such as its originality,
similarity to other movies, or even commercial viability, all the way to offering
consumers new and interesting ways of viewing the final movie, many stages in the
life-cycle of a movie stand to benefit from Machine Learning techniques that promise
to reduce human effort, time, or both. Within this field of automatic movie analysis,
this thesis addresses the task of summarising the content of screenplays, enabling users
at any stage to gain a broad understanding of a movie from greatly reduced data. The
contributions of this thesis are four-fold: (i) We introduce ScriptBase, a new large-scale
data set of original movie scripts, annotated with additional meta-information such as
genre and plot tags, cast information, and log- and tag-lines. To our knowledge,
ScriptBase is the largest data set of its kind, containing scripts and information for
almost 1,000 Hollywood movies. (ii) We present a dynamic summarisation model for the
screenplay domain, which allows for extraction of highly informative and important
scenes from movie scripts. The extracted summaries allow for the content of the original
script to stay largely intact and provide the user with its important parts, while
greatly reducing the script-reading time. (iii) We extend our summarisation model
to capture additional modalities beyond the screenplay text. The model is rendered
multi-modal by introducing visual information obtained from the actual movie and by
extracting scenes from the movie, allowing users to generate visual summaries of motion
pictures. (iv) We devise a novel end-to-end neural network model for generating
natural language screenplay overviews. This model enables the user to generate short
descriptive and informative texts that capture certain aspects of a movie script, such as
its genres, approximate content, or style, allowing them to gain a fast, high-level understanding
of the screenplay. Multiple automatic and human evaluations were carried
out to assess the performance of our models, demonstrating that they are well-suited
for the tasks set out in this thesis, outperforming strong baselines. Furthermore, the
ScriptBase data set has started to gain traction, and is currently used by a number of
other researchers in the field to tackle various tasks relating to screenplays and their
analysis.
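The thesis's dynamic summarisation model is not reproduced here; as a minimal illustration of extractive scene selection, one might score each scene by the corpus frequency of its words and keep the highest-scoring scenes in order. All names and the scoring heuristic below are illustrative assumptions, not the thesis's method:

```python
from collections import Counter

def select_scenes(scenes, k=2):
    """Toy extractive summariser: score each scene by the length-normalised
    corpus frequency of its words, then return the top-k scenes in their
    original (chronological) order."""
    freq = Counter(w.lower() for s in scenes for w in s.split())
    def score(scene):
        words = scene.lower().split()
        return sum(freq[w] for w in words) / max(len(words), 1)
    ranked = sorted(range(len(scenes)), key=lambda i: score(scenes[i]),
                    reverse=True)
    return [scenes[i] for i in sorted(ranked[:k])]
```

Preserving the original scene order in the output is what keeps the extracted summary readable as a condensed version of the script rather than a ranked list.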
Artistic research into distraction, agency, and the internet
This practical study is concerned with flows of attention and distraction
that are associated with experiences of the internet. Taking the term ‘internet’ to
stand for a range of networked social, media-consumption, and data practices
carried out on devices such as smartphones, this study sets out to explore how
distraction might arise, how it might be conceptualised, and the potential
consequences for agency of the conditions of its emergence. The study is led
by the production and analysis of artworks, using practical approaches that
engage critically with aspects of the experience of the internet.
This thesis begins by exploring conceptions of the ‘attention economy’
articulated by Goldhaber (1997), Beller (2006), and Citton (2017), developing an
understanding that counters mainstream deterministic positions regarding the
impact of digital technologies on the capacity for focused attention. Distraction
is considered as an experience that may be sought out by individuals but can
be captured and extended by third parties such as social media platforms. The
importance of the data generated by habitual or compulsive engagement with
internet-enabled devices and services (Zuboff, 2015) is considered against a
backdrop of quantification and managerialism that extends beyond experiences
of the internet.
The study reviews existing artworks made in response to these
concerns, focusing on expressions of the ‘attention economy’ prevalent in
‘post-internet’ art. Works by Vierkant (2010), Roth (2015) and others that interrogate
infrastructure, data-gathering, or networked methods of distribution are
identified as relevant, and a position is developed from which the consequences
of metricised display platforms for an artistic ‘attention economy’ can be
explored. Prototype artworks made during the study are appraised using an
artistic research methodology that foregrounds the role of the researcher as
both producer and reader of the artwork. Works that actively create distraction,
that gather and visualise data, and that emphasise calm self-interrogation, are
discussed and evaluated. The practical aspects of the research contribute to
knowledge by extending understanding of the spatial, infrastructural, and
algorithmic dimensions of the relationship between distraction and agency.
Fundamentals
Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, through summarisation and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters.
Image Annotation and Topic Extraction Using Super-Word Latent Dirichlet Allocation
This research presents a multi-domain solution that uses text and images to iteratively improve automated information extraction. Stage I uses local text surrounding an embedded image to provide clues that help rank-order possible image annotations. These annotations are forwarded to Stage II, where the image annotations from Stage I are used as highly-relevant super-words to improve extraction of topics. The model probabilities from the super-words in Stage II are forwarded to Stage III, where they are used to refine the automated image annotation developed in Stage I. All stages demonstrate improvement over existing equivalent algorithms in the literature.
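The core of the Stage II "super-word" idea, giving Stage I's image annotations extra weight when estimating a document's topics, can be sketched in miniature. The function name and boost weight below are illustrative assumptions; the actual model is LDA-based rather than a raw weighted count:

```python
from collections import Counter

def weighted_word_counts(doc_words, super_words, boost=3):
    """Toy sketch of super-word weighting: words that arrived as image
    annotations (Stage I output) contribute `boost` counts instead of 1
    when building the word distribution used for topic estimation."""
    counts = Counter()
    for w in doc_words:
        counts[w] += boost if w in super_words else 1
    return counts
```

In the full model these boosted counts would feed an LDA-style topic inference step, whose topic probabilities then flow back to Stage III to re-rank the image annotations.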
The Machine as Art/ The Machine as Artist
The articles collected in this volume from the two companion Arts Special Issues, “The Machine as Art (in the 20th Century)” and “The Machine as Artist (in the 21st Century)”, represent a unique scholarly resource: analyses by artists, scientists, and engineers, as well as art historians, covering not only the current (and astounding) rapprochement between art and technology but also the vital post-World War II period that has led up to it. This collection is also distinguished by the fact that several of the contributors are prominent individuals within their own fields, or artists who have actually participated in the still unfolding events with which it is concerned.
AI for Everyone?
We are entering a new era of technological determinism and solutionism in which governments and business actors are seeking data-driven change, assuming that Artificial Intelligence is now inevitable and ubiquitous. But we have not even started asking the right questions, let alone developed an understanding of the consequences. Urgently needed is debate that asks and answers fundamental questions about power. This book brings together critical interrogations of what constitutes AI, its impact and its inequalities in order to offer an analysis of what it means for AI to deliver benefits for everyone. The book is structured in three parts: Part 1, AI: Humans vs. Machines, presents critical perspectives on human-machine dualism. Part 2, Discourses and Myths About AI, excavates metaphors and policies to ask normative questions about what is ‘desirable’ AI and what conditions make this possible. Part 3, AI Power and Inequalities, discusses how the implementation of AI creates important challenges that urgently need to be addressed. Bringing together scholars from diverse disciplinary backgrounds and regional contexts, this book offers a vital intervention on one of the most hyped concepts of our times
Towards data justice unionism? A labour perspective on AI governance