Memory-Based Shallow Parsing
We present memory-based learning approaches to shallow parsing and apply
these to five tasks: base noun phrase identification, arbitrary base phrase
recognition, clause detection, noun phrase parsing and full parsing. We use
feature selection techniques and system combination methods for improving the
performance of the memory-based learner. Our approach is evaluated on standard
data sets and the results are compared with those of other systems. This reveals
that our approach works well for base phrase identification, while its
application to recognizing embedded structures leaves some room for
improvement.
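Memory-based learning here means a nearest-neighbour classifier: all training instances are stored, and each new token is tagged like its closest stored neighbours. A minimal sketch of base-NP chunking in this style, using scikit-learn in place of a TiMBL-style learner and invented toy tokens rather than a standard chunking corpus:

```python
# Memory-based (k-nearest-neighbour) base-NP chunking sketch.
# The toy feature windows and IOB tags below are illustrative assumptions.
from sklearn.feature_extraction import DictVectorizer
from sklearn.neighbors import KNeighborsClassifier

# Each token is described by a small feature window (word plus POS tags)
# and labelled with an IOB chunk tag: B begins a base NP, I is inside, O outside.
train = [
    ({"w": "the", "pos": "DT", "prev_pos": "-"},  "B"),
    ({"w": "cat", "pos": "NN", "prev_pos": "DT"}, "I"),
    ({"w": "sat", "pos": "VB", "prev_pos": "NN"}, "O"),
    ({"w": "a",   "pos": "DT", "prev_pos": "VB"}, "B"),
    ({"w": "mat", "pos": "NN", "prev_pos": "DT"}, "I"),
]

vec = DictVectorizer()
X = vec.fit_transform([f for f, _ in train])
y = [t for _, t in train]

# "Memory-based" learning: store every training instance and classify
# new tokens by their nearest stored neighbour (k=1 here).
knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)

test_tok = {"w": "dog", "pos": "NN", "prev_pos": "DT"}
print(knn.predict(vec.transform([test_tok]))[0])   # "I": nearest stored token is a noun inside an NP
```

Feature selection and system combination, as described above, would then operate on which window features are kept and on how several such learners vote.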
Source separation and music transcription with deep learning
The goal of this work is to obtain a MIDI transcription from an MP3 file of polyphonic melodies (several notes sounding at the same time) with several instruments playing at once, using deep learning. To do so, each instrument is first separated using the Demucs library, and then a model is trained to detect complete notes (frames). Deep learning is a tool that has advanced considerably in recent years thanks to progress in technology.
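The frame-detection step can be sketched independently of the network itself: given a pitch-by-time activation matrix (which, in the pipeline described, a trained model would produce for each Demucs-separated stem), threshold it into a binary piano roll and merge consecutive active frames into notes. The threshold and the toy matrix below are illustrative assumptions, not the thesis's actual model output.

```python
import numpy as np

# Frame-level note detection sketch: threshold a (pitches x frames)
# activation matrix into a binary piano roll, then merge consecutive
# active frames into (pitch, start_frame, end_frame) notes.

def frames_to_notes(activations, threshold=0.5):
    roll = activations >= threshold              # binary piano roll
    notes = []
    for pitch, row in enumerate(roll):
        start = None
        for t, on in enumerate(row):
            if on and start is None:
                start = t                        # note onset
            elif not on and start is not None:
                notes.append((pitch, start, t))  # note offset
                start = None
        if start is not None:                    # note still sounding at the end
            notes.append((pitch, start, len(row)))
    return notes

# Two hypothetical pitches over six frames.
acts = np.array([
    [0.9, 0.8, 0.1, 0.0, 0.0, 0.0],   # pitch 0 sounds in frames 0-1
    [0.0, 0.0, 0.7, 0.9, 0.6, 0.0],   # pitch 1 sounds in frames 2-4
])
print(frames_to_notes(acts))   # [(0, 0, 2), (1, 2, 5)]
```

Converting these (pitch, start, end) triples to MIDI is then a matter of mapping frame indices to times and writing note-on/note-off events.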
A feasibility study on compiling reactive problem solution methods for an AI domain
This paper investigates the feasibility of compiling the functionality of a decision-theoretic problem-solving engine into a set of rules or a functionally similar construct. The decision-theoretic engine runs in exponential time, while the rule set runs in linear time at worst. The main question determining feasibility is whether the size of the rule set is small enough to be of practical use. Based on the tests run, size does not appear to be a limiting factor in compiling rule sets.
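The trade described above, paying an expensive offline computation once to get cheap online behaviour, can be sketched as follows. The toy utility function and tiny state space are invented for illustration; the paper's engine is decision-theoretic and far more elaborate.

```python
from itertools import product

def utility(state, action):
    # Stand-in utility: acting pays off in "large" states, waiting is a safe default.
    x, y = state
    return (x + y) if action == "act" else 1

def slow_best_action(state):
    # Stand-in for the exponential-time engine: evaluate every action.
    return max(["wait", "act"], key=lambda a: utility(state, a))

# Offline "compilation": enumerate the states once (the expensive step)
# and record the engine's choice for each as a rule table.
rules = {s: slow_best_action(s) for s in product(range(3), range(3))}

# Online use: a constant-time table lookup replaces the engine.
print(rules[(2, 2)])   # "act"
print(len(rules))      # 9 rules: the size of this table is the feasibility question
```

The paper's feasibility question is exactly whether the analogue of `rules` stays small enough to be practical as the real state space grows.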
Evolving visual routines
Thesis (M.S.), Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1995. Includes bibliographical references (leaves 114-117). By Michael Patrick Johnson.
Criminal data analysis based on low rank sparse representation
Finding effective clustering methods for high-dimensional datasets is challenging due to the curse of dimensionality. These challenges usually cause the most basic common algorithms to fail in high-dimensional spaces when tackling problems such as large numbers of groups and overlapping clusters. Most domains use some parameters to describe the appearance, geometry and dynamics of a scene. This has motivated the development of several techniques for finding a low-dimensional space embedded in high-dimensional data. Many proposed methods fail to overcome these challenges, especially when the input data are high-dimensional and the clusters have a complex structure.
Frequently in high-dimensional data, many of the dimensions are irrelevant and may hide the existing clusters in noise. High-dimensional data often reside on low-dimensional subspaces. The task of subspace clustering algorithms is to uncover how objects related in one subset of dimensions are related in different subsets of dimensions. The state-of-the-art methods for subspace segmentation include Low-Rank Representation (LRR) and Sparse Representation (SR). The former seeks the globally lowest-rank representation but restrictively assumes independence among subspaces, whereas the latter seeks to cluster disjoint or overlapping subspaces through a locality measure, which, however, fails in the presence of large noise.
This thesis aims to identify the key problems and obstacles that have challenged researchers in recent years in clustering high-dimensional data, and then to implement an effective subspace clustering method for high-dimensional crime data, covering both real events and synthetic data with a complex structure spanning 168 different offence types, as well as to overcome the disadvantages of existing subspace clustering techniques. To this end, a Low-Rank Sparse Representation (LRSR) theory, referred to hereafter as Criminal Data Analysis Based on LRSR, is examined and used to recover and segment embedded subspaces. The results of these methods are discussed and compared with previous approaches such as k-means and PCA followed by k-means segmentation; these earlier approaches helped us choose the right subspace clustering method. The proposed method is based on a subspace segmentation method named Low-Rank Sparse Representation (LRSR), which not only recovers the low-rank subspaces but also obtains a relatively sparse segmentation with respect to disjoint or even overlapping subspaces.
Both the UCI Machine Learning Repository and a crime database are used to find and compare the subspace clustering algorithms best suited to high-dimensional data. We used several open-source machine learning frameworks and tools for our machine learning tasks, including preparing, transforming, clustering and visualizing the high-dimensional crime dataset. In particular, we used the scikit-learn library for the Python programming language, and we used R and MATLAB in earlier experiments.
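The sparse-representation half of the approach can be sketched with scikit-learn: express each data point as a sparse combination of the other points, then spectrally cluster the resulting affinity graph. The toy data below are two clean 1-D subspaces, not the crime dataset, and the Lasso-based self-representation stands in for the full LRSR optimisation, which also carries a low-rank term.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)

# Two independent 1-D subspaces (lines) in 5-D, 10 points on each.
b1, b2 = rng.normal(size=(5, 1)), rng.normal(size=(5, 1))
X = np.hstack([b1 @ rng.normal(size=(1, 10)), b2 @ rng.normal(size=(1, 10))])

n = X.shape[1]
C = np.zeros((n, n))
for i in range(n):
    # Sparse self-representation: write point i as a combination of the others.
    others = np.delete(X, i, axis=1)
    coef = Lasso(alpha=0.01, max_iter=5000).fit(others, X[:, i]).coef_
    C[np.arange(n) != i, i] = coef

# Points tend to pick representers from their own subspace, so the
# symmetrised coefficient magnitudes form a block-structured affinity.
W = np.abs(C) + np.abs(C).T
labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(W)
print(labels)
```

With noisy, overlapping subspaces this plain SR step degrades, which is precisely the gap the combined low-rank plus sparse (LRSR) formulation described above is meant to close.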
3-D Content-Based Retrieval and Classification with Applications to Museum Data
There is an increasing number of multimedia collections arising in areas once only the domain of text and 2-D images. Richer types of multimedia such as audio, video and 3-D objects are becoming more and more commonplace. However, current retrieval techniques in these areas are not as sophisticated as textual and 2-D image techniques and in many cases rely upon textual searching through associated keywords. This thesis is concerned with the retrieval of 3-D objects and with the application of these techniques to the problem of 3-D object annotation. The majority of the work in this thesis has been driven by the European project, SCULPTEUR. This thesis provides an in-depth analysis of a range of 3-D shape descriptors for their suitability for general-purpose and specific retrieval tasks using a publicly available data set, the Princeton Shape Benchmark, and using real-world museum objects evaluated using a variety of performance metrics. This thesis also investigates the use of 3-D shape descriptors as inputs to popular classification algorithms, and a novel classifier agent for use with the SCULPTEUR system is designed and developed and its performance analysed. Several techniques are investigated to improve individual classifier performance. One set of techniques combines several classifiers, whereas the other set of techniques aims to find the optimal training parameters for a classifier. The final chapter of this thesis explores a possible application of these techniques to the problem of 3-D object annotation.
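To make the idea of a 3-D shape descriptor concrete, here is a sketch of one classic descriptor of the kind such evaluations compare: the D2 shape distribution, a histogram of distances between random pairs of surface points. The random point clouds below stand in for sampled mesh surfaces; whether this particular descriptor is among those the thesis analyses is not stated here.

```python
import numpy as np

rng = np.random.default_rng(1)

def d2_descriptor(points, n_pairs=20000, bins=32, max_dist=2.0):
    # Histogram of distances between randomly chosen point pairs:
    # a compact, rotation-invariant signature of the shape.
    i = rng.integers(0, len(points), n_pairs)
    j = rng.integers(0, len(points), n_pairs)
    d = np.linalg.norm(points[i] - points[j], axis=1)
    hist, _ = np.histogram(d, bins=bins, range=(0, max_dist), density=True)
    return hist

sphere = rng.normal(size=(2000, 3))
sphere /= np.linalg.norm(sphere, axis=1, keepdims=True)   # points on a unit sphere
cube = rng.uniform(-0.5, 0.5, size=(2000, 3))             # points in a solid cube

# Retrieval then ranks objects by a simple distance between descriptors.
dist = np.abs(d2_descriptor(sphere) - d2_descriptor(cube)).sum()
print(round(dist, 2))
```

The same fixed-length descriptor vectors are what would be fed to the classification algorithms mentioned above.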
Genetic programming for cephalometric landmark detection
The domain of medical imaging analysis has burgeoned in recent years due to the availability and affordability of digital radiographic imaging equipment and associated algorithms and, as such, there has been significant activity in the automation of the medical diagnostic process. One such process, cephalometric analysis, is manually intensive and it can take an experienced orthodontist thirty minutes to analyse one radiology image. This thesis describes an approach, based on genetic programming, neural networks and machine learning, to automate this process. A cephalometric analysis involves locating a number of points in an X-ray and determining the linear and angular relationships between them. If the points can be located accurately enough, the rest of the analysis is straightforward. The investigative steps undertaken were as follows: Firstly, a previously published method, which was claimed to be domain independent, was implemented and tested on a selection of landmarks, ranging from easy to very difficult. These included the menton, upper lip, incisal upper incisor, nose tip and sella landmarks. The method used pixel values, and pixel statistics (mean and standard deviation) of pre-determined regions as inputs to a genetic programming detector. This approach proved unsatisfactory and the second part of the investigation focused on alternative handcrafted feature sets and fitness measures. This proved to be much more successful and the third part of the investigation involved using pulse coupled neural networks to replace the handcrafted features with learned ones. The fourth and final stage involved an analysis of the evolved programs to determine whether reasonable algorithms had been evolved and not just random artefacts learnt from the training images.
A significant finding from the investigative steps was that the new domain-independent approach, using pulse coupled neural networks and genetic programming to evolve programs, was as good as or even better than one using the handcrafted features. The advantage of this finding is that little domain knowledge is required, thus obviating the requirement to manually generate handcrafted features. The investigation revealed that some of the easy landmarks could be found with 100% accuracy, while the accuracy of finding the most difficult ones was around 78%. An extensive analysis of evolved programs revealed underlying regularities that were captured during the evolutionary process. Even though the evolutionary process took different routes and a diverse range of programs was evolved, many of the programs with an acceptable detection rate implemented algorithms with similar characteristics. The major outcome of this work is that the method described in this thesis could be used as the basis of an automated system. The orthodontist would be required to manually correct a few errors before completing the analysis.
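The accuracy figures quoted above come from the standard evaluation used for landmark detectors: a landmark counts as found when the prediction lies within a fixed distance of the annotated point (2 mm is the usual cephalometric criterion; the coordinates below are invented for illustration).

```python
import numpy as np

def detection_rate(pred, truth, tol_mm=2.0):
    # Fraction of landmarks whose predicted position falls within
    # tol_mm of the ground-truth annotation.
    errors = np.linalg.norm(pred - truth, axis=1)
    return float((errors <= tol_mm).mean())

# Four hypothetical landmarks (x, y in mm): annotations and detector output.
truth = np.array([[10.0, 20.0], [35.0, 42.0], [60.0, 15.0], [80.0, 55.0]])
pred  = np.array([[10.5, 20.5], [34.0, 41.0], [66.0, 15.0], [80.2, 54.8]])

print(detection_rate(pred, truth))   # 0.75: three of four landmarks within 2 mm
```

A per-landmark version of this rate is what distinguishes the "easy" landmarks found at 100% from the difficult ones found around 78% of the time.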
Active learning with committees: an approach to efficient learning in text categorization using linear threshold algorithms
We developed and investigated machine learning methods that require
minimal preprocessing of the input data, use few training examples, run fast, and
still obtain high levels of accuracy.
Most approaches to designing machine learning programs are based on the
supervised learning paradigm – training examples are chosen randomly and given
to the learner. We explore the "active learning" paradigm – the learner
automatically selects the more informative training examples. Our domain of
interest is text categorization, but most of the methods developed are quite general.
The purpose of text categorization is to assign each document in a collection
to appropriate categories. Most existing text categorization methods require large
amounts of time to prepare the documents for learning and large numbers of
examples for training. Humans must assign correct categories to documents before
they can be used for training; this costs time and money. Our goal is to develop
machine learning methods that, when compared to other methods currently available, are more efficient in time and space, use fewer training documents, and
are as accurate.
We developed the Active Learning with Committees (ALC) framework –
inspired by the Query by Committee approach of Freund, Seung, et al. A
"committee" is a group of learners that jointly participate in learning and in
predicting the classes of new examples. We perform minimal preprocessing of the
documents and thus the domain is noisy, high dimensional, and has large numbers
of irrelevant attributes. We use linear threshold learning algorithms to obtain
computational efficiency with respect to these large numbers of attributes, with
specific algorithms being chosen because they also generalize well when large
numbers of attributes are irrelevant.
We developed and analyzed several ALC systems. Our results show that it is
possible to design active learning systems that scale up to large numbers of features
and obtain accuracies comparable to the supervised learning methods while using
an order of magnitude fewer examples and an order of magnitude less time. The
ALC methods developed have run times on the order of seconds, typically use only
5-7% of the training documents, and are as accurate as their supervised
counterparts.
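The selection step at the heart of ALC can be sketched as query-by-committee: a committee of linear threshold learners is trained on variants of the labelled set, and the pool example they disagree on most is the one worth labelling next. The toy 2-D data and perceptron committee below are illustrative assumptions standing in for the high-dimensional text features and the specific algorithms the thesis uses.

```python
import numpy as np
from sklearn.linear_model import Perceptron

rng = np.random.default_rng(0)

X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # toy labelling rule

# Small labelled seed set containing both classes; the rest is the unlabelled pool.
labeled = np.concatenate([np.where(y == 0)[0][:5], np.where(y == 1)[0][:5]])
pool = np.setdiff1d(np.arange(200), labeled)

committee = []
for seed in range(5):
    # Bootstrap the labelled set, forcing one example of each class
    # so every linear threshold learner can be trained.
    idx = np.concatenate([labeled[:1], labeled[-1:],
                          rng.choice(labeled, size=8, replace=True)])
    committee.append(Perceptron(random_state=seed).fit(X[idx], y[idx]))

# Disagreement = variance of the members' 0/1 votes on each pool example.
votes = np.array([m.predict(X[pool]) for m in committee])
disagreement = votes.var(axis=0)
query = pool[disagreement.argmax()]   # the most informative example to label next
print(query)
```

Repeating the loop (label the query, retrain, re-query) is what lets such systems reach supervised-level accuracy from a small fraction of the training documents.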
Pharmacovigilance Decision Support: The value of Disproportionality Analysis Signal Detection Methods, the development and testing of Covariability Techniques, and the importance of Ontology
The cost of adverse drug reactions to society, in the form of deaths, chronic illness, foetal malformation, and many other effects, is quite significant. For example, in the United States of America, adverse reactions to prescribed drugs are around the fourth leading cause of death. In Australia, the reporting of adverse drug reactions is spontaneous and voluntary. Many methods have been used for the analysis of adverse drug reaction data, mostly using a statistical approach as a basis for clinical analysis in drug safety surveillance decision support. This thesis examines new approaches that may be used in the analysis of drug safety data. These methods differ significantly from the statistical methods in that they use covariability methods of association to define drug-reaction relationships. Covariability algorithms were developed in collaboration with Musa Mammadov to discover drugs associated with adverse reactions and possible drug-drug interactions. This method uses the system organ class (SOC) classification in the Australian Adverse Drug Reaction Advisory Committee (ADRAC) data to stratify reactions. The text categorization algorithm BoosTexter was found to work with the same drug safety data, and its performance and modus operandi were compared to our algorithms. These alternative methods were compared to standard disproportionality analysis methods for signal detection in drug safety data, including the Bayesian multi-item gamma Poisson shrinker (MGPS), which was found to have problems with similar reaction terms in a report and with innocent-bystander drugs. A classification of drug terms was made using the anatomical-therapeutic-chemical (ATC) codes, reducing the number of drug variables from 5081 drug terms to 14 main drug classes. The ATC classification is structured as a hierarchy of five levels.
Exploitation of the ATC hierarchy allows the drug safety data to be stratified in such a way as to make them accessible to powerful existing tools. A data mining method that uses association rules, grouping them on the basis of content, was used as a basis for applying the ATC and SOC ontologies to the ADRAC data. This allows different views of these associations (even very rare ones). A signal detection method was developed using these association rules, which also incorporates critical reaction terms.
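The simplest of the disproportionality statistics such comparisons start from is the proportional reporting ratio (PRR), computed from a 2x2 table of report counts. The counts below are invented for illustration; real signal detection would compute this over every drug-reaction pair in the database.

```python
# Proportional reporting ratio (PRR) sketch, the baseline
# disproportionality statistic for spontaneous-report data.

def prr(a, b, c, d):
    """a: reports with the drug and the reaction,
       b: reports with the drug, without the reaction,
       c: reports with the reaction, without the drug,
       d: reports with neither."""
    return (a / (a + b)) / (c / (c + d))

# Hypothetical counts: 20 reports pair the drug with the reaction and 380
# mention the drug without it; among other reports, 80 mention the
# reaction and 9520 mention neither.
signal = prr(20, 380, 80, 9520)
print(round(signal, 1))   # 6.0; a PRR above 2 is a common signalling threshold
```

Methods such as MGPS refine this same ratio with Bayesian shrinkage for rare counts, while the covariability methods described above replace the ratio itself with association measures over the SOC-stratified data.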