4,681 research outputs found

The Computational Attitude in Music Theory

Author: Bell Eamonn Patrick
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2019

Music studies’s turn to computation during the twentieth century has engendered particular habits of thought about music, habits that remain in operation long after the music scholar has stepped away from the computer. The computational attitude is a way of thinking about music that is learned at the computer but can be applied away from it. It may be manifest in actual computer use, or in invocations of computationalism, a theory of mind whose influence on twentieth-century music theory is palpable. It may also be manifest in more informal discussions about music, which make liberal use of computational metaphors. In Chapter 1, I describe this attitude, the stakes for considering the computer as one of its instruments, and the kinds of historical sources and methodologies we might draw on to chart its ascendance. The remainder of this dissertation considers distinct and varied cases from the mid-twentieth century in which computers or computationalist musical ideas were used to pursue new musical objects, to quantify and classify musical scores as data, and to instantiate a generally music-structuralist mode of analysis. I present an account of the decades-long effort to prepare an exhaustive and accurate catalog of the all-interval twelve-tone series (Chapter 2). This problem was first posed in the 1920s but was not solved until 1959, when the composer Hanns Jelinek collaborated with the computer engineer Heinz Zemanek to jointly develop and run a computer program. Recognizing the transformation wrought on modern statistics and communications technology by information theory, I revisit Abraham Moles’s book Information Theory and Esthetic Perception (orig. 1958) and use its vocabulary to contextualize contemporary information-theoretic work on music that variously evokes the computational mind by John R. Pierce and Mary Shannon, Wilhelm Fucks, and Henry Quastler (Chapter 3).
I conclude with a detailed look into a score-segmentation algorithm of the influential American music theorist Allen Forte (Chapter 4). Forte was a skilled programmer who spent several years at MIT in the 1960s, with cutting-edge computers and the company of first-rank figures in the nascent fields of computer science and artificial intelligence. Each one of the researchers whose work is treated in these case studies, at some stage in their relationship with music, adopted what I call the computational attitude to music, to varying degrees and for diverse ends. Among the many questions this dissertation seeks to answer: what was gained by adopting such an attitude? What was lost? Having understood these past explorations of the computational attitude to music, we are better suited to ask of ourselves the same questions today.

Columbia University Academic Commons

Handwritten text generation and strikethrough characters augmentation

Author: Chertok A.V., Dimitrov D.V., Karachev D.K., Novopoltsev M.Y., Potanin M.S., Shonenkov A.V.
Publication venue: 'Samara State National Research University'
Publication date: 01/06/2022

We introduce two data augmentation techniques, which, used with a Resnet-BiLSTM-CTC network, significantly reduce Word Error Rate and Character Error Rate beyond best-reported results on handwriting text recognition tasks. We apply a novel augmentation that simulates strikethrough text (HandWritten Blots) and a handwritten text generation method based on printed text (StackMix), which proved to be very effective in handwriting text recognition tasks. StackMix uses a weakly-supervised framework to get character boundaries. Because these data augmentation techniques are independent of the network used, they could also be applied to enhance the performance of other networks and approaches to handwriting text recognition. Extensive experiments on ten handwritten text datasets show that HandWritten Blots augmentation and StackMix significantly improve the quality of handwriting text recognition models.

Samara University

On Quantitative Aspects of Musical Meaning : A model of emergent signification in time-ordered data sequence

Author: Tiits Kalev
Publication venue: Helsingfors universitet
Publication date: 01/08/2002

Helsingin yliopiston digitaalinen arkisto

Character Recognition

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Character recognition is one of the pattern recognition technologies that are most widely used in practical applications. This book presents recent advances that are relevant to character recognition, from technical topics such as image processing, feature extraction or classification, to new applications including human-computer interfaces. The goal of this book is to provide a reference source for academic research and for professionals working in the character recognition field.

Directory of Open Access Books (DOAB)

Advanced document data extraction techniques to improve supply chain performance

Author: Sharma Vikash
Publication date: 01/07/2021

In this thesis, a novel machine learning technique to extract text-based information from scanned images has been developed. This information extraction is performed in the context of scanned invoices and bills used in financial transactions. These financial transactions contain a considerable amount of data that must be extracted, refined, and stored digitally before it can be used for analysis. Converting this data into a digital format is often a time-consuming process. Automation and data optimisation show promise as methods for reducing the time required and the cost of Supply Chain Management (SCM) processes, especially Supplier Invoice Management (SIM), Financial Supply Chain Management (FSCM) and Supply Chain procurement processes. This thesis uses a cross-disciplinary approach involving Computer Science and Operational Management to explore the benefit of automated invoice data extraction in business and its impact on SCM. The study adopts a multimethod approach based on empirical research, surveys, and interviews performed on selected companies. The expert system developed in this thesis focuses on two distinct areas of research: Text/Object Detection and Text Extraction. For Text/Object Detection, the Faster R-CNN model was analysed. While this model yields outstanding results in terms of object detection, it is limited by poor performance when image quality is low. The Generative Adversarial Network (GAN) model is proposed in response to this limitation. The GAN model is a generator network that is implemented with the help of the Faster R-CNN model and a discriminator that relies on PatchGAN. The output of the GAN model is text data with bounding boxes.
For text extraction from the bounding box, a novel data extraction framework was designed, consisting of various processes including XML processing in the case of an existing OCR engine, bounding box pre-processing, text clean up, OCR error correction, spell check, type check, pattern-based matching, and finally, a learning mechanism for automatizing future data extraction. Whichever fields the system can extract successfully are provided in key-value format. The efficiency of the proposed system was validated using existing datasets such as SROIE and VATI. Real-time data was validated using invoices that were collected by two companies that provide invoice automation services in various countries. Currently, these scanned invoices are sent to an OCR system such as OmniPage, Tesseract, or ABBYY FRE to extract text blocks and later, a rule-based engine is used to extract relevant data. While the system’s methodology is robust, the companies surveyed were not satisfied with its accuracy. Thus, they sought out new, optimized solutions. To confirm the results, the engines were used to return XML-based files with text and metadata identified. The output XML data was then fed into this new system for information extraction. This system uses the existing OCR engine and a novel, self-adaptive, learning-based OCR engine. This new engine is based on the GAN model for better text identification. Experiments were conducted on various invoice formats to further test and refine its extraction capabilities. For cost optimisation and the analysis of spend classification, additional data were provided by another company in London that holds expertise in reducing their clients' procurement costs. This data was fed into our system to get a deeper level of spend classification and categorisation.
This helped the company to reduce its reliance on human effort and allowed for greater efficiency in comparison with the process of performing similar tasks manually using Excel sheets and Business Intelligence (BI) tools. The intention behind the development of this novel methodology was twofold. First, to test and develop a novel solution that does not depend on any specific OCR technology. Second, to increase the information extraction accuracy factor over that of existing methodologies. Finally, it evaluates the real-world need for the system and the impact it would have on SCM. This newly developed method is generic and can extract text from any given invoice, making it a valuable tool for optimizing SCM. In addition, the system uses a template-matching approach to ensure the quality of the extracted information.

Repository@Hull - Worktribe

Stylistic analysis and recognition of piano sonatas of four composers -- Mozart, Chopin, Debussy, Anton Webern

Author: Lin-Jeng Emily Feng-Hwa
Publication venue: RIT Scholar Works
Publication date: 26/03/1987

This thesis describes a system that incorporates techniques developed by musicologists to do stylistic analysis of music, an important applied field in music theory analysis. To do the analysis requires the knowledge of many musicological analysis methods and pattern recognition algorithms that are central issues to this project. In addition, AI techniques of learning were used to improve the whole system's skills. The conclusions reached as a result of this project were that computers can perform musical tasks usually associated exclusively with naturally intelligent musicologists, and that learning techniques can expand and enrich the behavior of musically intelligent systems.

RIT Scholar Works

Perceptual fail: Female power, mobile technologies and images of self

Author: Leishman Donna
Publication date: 10/07/2019

Like a biological species, images of self have descended and modified throughout their journey down the ages, interweaving and recharging their viability with the necessary interjections from culture, tools and technology. Part of this journey has seen images of self also become an intrinsic function within the narratives about female power; consider Helen of Troy, “a face that launched a thousand ships” (Marlowe, 1604), or Kim Kardashian (KUWTK), who heralded in the mass mediated ‘selfie’ as a social practice. The interweaving process itself sees the image oscillate between naturalized ‘icon’ and idealized ‘symbol’ of what the person looked like and/or aspired to become. These public images can confirm or constitute beauty ideals as well as influence (via imitation) behaviour and mannerisms, and as such the viewer's belief in the veracity of the representative image also becomes intrinsically political, manipulating the associated narratives and fostering prejudice (Dobson 2015, Korsmeyer 2004, Pollock 2003). The selfie is arguably ‘a sui generis’: whilst it is a mediated photographic image of self, it contains its own codes of communication and decorum that fostered the formation of numerous new digital communities and influenced new media aesthetics. For example, the selfie is both of nature (it is still a time based piece of documentation) and known to be perceptually untrue (filtered, modified and full of artifice). The paper will seek to demonstrate how selfie culture is infused both by considerable levels of perceptual failings that are now central to contemporary celebrity culture and its notion of glamour, which in turn is intrinsically linked (but not solely defined) by the province of feminine desire for reinvention, transformation or “self-sexualisation” (Hall, West and McIntyre, 2012). The subject, like the Kardashians or selfies, is divisive.
In conclusion this paper will explore the paradox of the perceptual failings at play within selfie culture more broadly: like ‘Reality TV’, selfies are infamously fake yet seem to provide Debord’s (1967) illusory cultural opiate whilst fulfilling a cultural longing. Questions then emerge when considering the narrative impact of these trends on engendered power structures and the traditional status of illusion and narrative fiction.

Northumbria Research Link

Bridging the Domain-Gap in Computer Vision Tasks

Author: Brouns J.G.C.
Publication date: 19/01/2020

Pure OAI Repository

De-identification of medical images using object-detection models, generative adversarial networks and perceptual loss

Author: Aasen Malik, Mathisen Fredrik Fidjestøl
Publication venue: The University of Bergen
Publication date: 01/01/2021

Medical images play an essential role in the process of diagnostics and detection of a variety of diseases. Whether the subject is anatomical features or molecular cells, medical imaging helps visualize and gain insight into the human body. These images are a crucial aid in the process of diagnosing patients. While these images are informative, they can also be quite difficult to interpret, necessitating highly trained medical professionals to read the images. The amount of medical images produced is enormous compared to the number of professionals whose task it is to interpret them. The diagnosis can also vary based on the medical professional who inspects the image. The recent rise of a new generation of Computer Aided Detection (CAD) systems based on machine learning has become more and more important to battle this problem. These systems aid the medical professional in the diagnostic process. This can lead to more consistent and accurate interpretations of medical images by removing some human bias. In addition, such systems can be used to decrease the workload by either filtering out images deemed as belonging to healthy subjects, to be otherwise not of interest, or marking images as indicating a risk. When creating CAD systems utilizing machine learning you are very dependent on data. Since the systems will typically be placed in very delicate, high-risk situations, the quality of the data is always a priority. A common problem in medical imaging research is not getting sufficient data. Not that there is a shortage of images, but to be used in research, they typically have to be de-identified or anonymized. This process has to be verified manually and is therefore time-consuming. With the impressive advancement of machine learning in recent decades, it seems natural to attempt de-identification using machine learning, especially because several powerful models are being applied to similar tasks in other fields.
One key reason for the success of machine learning is its ability to detect and generate patterns. Currently, there are several applications that perform de-identification by placing black boxes on top of information detected as being sensitive [1, 2]. However, the black boxes can end up hiding other parts of the image as well, whereas ideally all non-sensitive features in the image should be preserved. In this thesis we investigate the effect of using image-to-image deep learning to automate 2D medical image de-identification by detecting the sensitive information, and removing it without the use of black boxes. Our results indicate that de-identification models based on machine learning can result in viable and powerful solutions. The deep learning models manage to accurately detect and remove text, without large negative impact on the original image. Master's thesis in Software Development, in collaboration with HVL (PROG399, MAMN-PRO).

University of Bergen

NORA - Norwegian Open Research Archives

Representing musical knowledge

Author: Horowitz Damon Matthew
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1995

Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1995. Includes bibliographical references (leaves 66-69). By Damon Matthew Horowitz. M.S.

DSpace@MIT