
    CWI-evaluation - Progress Report 1993-1998


    Part-of-speech Tagging: A Machine Learning Approach based on Decision Trees

    The study and application of general Machine Learning (ML) algorithms to the classical ambiguity problems in the area of Natural Language Processing (NLP) is currently a very active area of research. This trend is sometimes called Natural Language Learning. Within this framework, the present work explores the application of a concrete machine learning technique, namely decision-tree induction, to a very basic NLP problem, namely part-of-speech disambiguation (POS tagging). Its main contributions fall in the NLP field, while the topics addressed are approached from the artificial intelligence perspective rather than from a linguistic point of view.
    A relevant property of the system we propose is the clear separation between the acquisition of the language model and its application within a concrete disambiguation algorithm, with the aim of constructing two components which are as independent as possible. Such an approach has many advantages: for instance, the language models obtained can easily be adapted to previously existing tagging formalisms, and the two modules can be improved and extended separately.
    As a first step, we have experimentally shown that decision trees (DTs) provide a flexible (by allowing a rich feature representation), efficient and compact way of acquiring, representing and accessing the information about POS ambiguities. In addition, DTs provide proper estimates of the conditional probabilities of tags and words in their particular contexts. Additional machine learning techniques, based on the combination of classifiers, have been applied to address some particular weaknesses of our tree-based approach and to further improve accuracy in the most difficult cases.
    As a second step, the acquired models have been used to construct simple, accurate and effective taggers based on different paradigms. In particular, we present three different taggers that include the tree-based models: RTT, STT and RELAX, which have shown different properties regarding speed, flexibility, accuracy, etc. The idea is that the particular user needs and environment will determine the most appropriate tagger in each situation. Although we have observed slight differences, the accuracy results for the three taggers, tested on the WSJ benchmark corpus, are uniformly very high and at least as good as, if not better than, those of a number of current taggers based on automatic acquisition (a qualitative comparison with the most relevant current work is also reported).
    Additionally, our approach has been adapted to annotate a general Spanish corpus, with the particular limitation of learning from small training sets. A new technique, based on tagger combination and bootstrapping, has been proposed to address this problem and to improve accuracy. Experimental results showed that very high accuracy is possible for Spanish tagging with relatively low manual effort. The success of this real application has also confirmed the validity of our approach, and of the previously presented portability argument in favour of automatically acquired taggers.
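    The decision-tree acquisition described above can be sketched in miniature. The following is an illustrative ID3-style induction over toy context features for one ambiguity class; the features, tagset, data and back-off scheme are assumptions for illustration only, not the thesis's actual RTT/STT/RELAX components.

```python
# Minimal sketch of decision-tree induction for POS disambiguation.
import math
from collections import Counter

def entropy(tags):
    n = len(tags)
    return -sum(c / n * math.log2(c / n) for c in Counter(tags).values())

def build_tree(examples, features):
    tags = [t for _, t in examples]
    if len(set(tags)) == 1 or not features:
        # Leaf: conditional tag distribution for this context.
        n = len(tags)
        return {t: c / n for t, c in Counter(tags).items()}
    def gain(f):  # information gain of splitting on feature f
        split = {}
        for feats, tag in examples:
            split.setdefault(feats[f], []).append(tag)
        return entropy(tags) - sum(
            len(s) / len(tags) * entropy(s) for s in split.values())
    best = max(features, key=gain)
    groups = {}
    for ex in examples:
        groups.setdefault(ex[0][best], []).append(ex)
    rest = [f for f in features if f != best]
    return {"feature": best,
            "children": {v: build_tree(g, rest) for v, g in groups.items()}}

def classify(tree, feats):
    while "feature" in tree:
        child = tree["children"].get(feats[tree["feature"]])
        if child is None:          # unseen value: stop and back off
            break
        tree = child
    if "feature" in tree:          # crude back-off: pool all leaf tags
        leaves, stack = Counter(), [tree]
        while stack:
            t = stack.pop()
            if "feature" in t:
                stack.extend(t["children"].values())
            else:
                leaves.update(t)
        total = sum(leaves.values())
        return {k: v / total for k, v in leaves.items()}
    return tree

# Toy ambiguity class for the word "can": modal (MD) vs. noun (NN).
data = [
    ({"prev": "PRP", "next": "swim"}, "MD"),
    ({"prev": "DT", "next": "of"}, "NN"),
    ({"prev": "NNS", "next": "fly"}, "MD"),
    ({"prev": "DT", "next": "opener"}, "NN"),
]
tree = build_tree(data, ["prev", "next"])
print(classify(tree, {"prev": "PRP", "next": "go"}))
```

    Note that the leaves store full conditional tag distributions rather than single tags, mirroring the point above that decision trees provide probability estimates usable by different tagging algorithms.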

    Enhancing solids deposit prediction in gully pots with explainable hybrid models: A review

    Urban flooding has made it necessary to better understand how gully pots perform when overwhelmed by solids deposition driven by various climatic and anthropogenic variables. This study investigates solids deposition in gully pots through a review of eight models: four deterministic models, two hybrid models, a statistical model and a conceptual model, representing a wide spectrum of solids depositional processes. Traditional models capture and manage the impact of climatic and anthropogenic variables on solids deposition, but they are prone to uncertainty due to inadequate handling of complex, non-linear variables, restricted applicability, inflexibility and data bias. Hybrid models, which integrate traditional models with data-driven approaches, have been shown to improve predictions and support the development of models that are robust to uncertainty. Despite their effectiveness, hybrid models lack explainability. Hence, this study presents the significance of eXplainable Artificial Intelligence (XAI) tools in addressing the challenges associated with hybrid models. Finally, crossovers between the various models and a representative workflow for solids deposition modelling in gully pots are suggested. The paper concludes that explainable hybrid modelling can serve as a valuable tool for gully pot management, as it addresses key limitations of existing models.
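    The hybrid idea reviewed above (a traditional model corrected by a data-driven component) can be sketched as follows. The physical formula, coefficients and observations below are all invented for illustration; a real gully pot model would be far richer.

```python
# Sketch of a hybrid model: a toy deterministic deposition estimate plus a
# least-squares residual correction learned from (synthetic) observations.

def deterministic_deposit(conc, flow, k=0.8):
    # Toy first-order settling estimate: more solids settle at low flow.
    return k * conc / (1.0 + flow)

# Hypothetical observations: (concentration, flow, measured deposit).
obs = [(10, 1.0, 4.6), (20, 1.0, 8.9), (10, 2.0, 3.2), (20, 2.0, 6.1)]

# Data-driven part: fit residual = a*flow + b by ordinary least squares.
xs = [flow for _, flow, _ in obs]
rs = [d - deterministic_deposit(c, f) for c, f, d in obs]
n = len(obs)
mx, mr = sum(xs) / n, sum(rs) / n
a = sum((x - mx) * (r - mr) for x, r in zip(xs, rs)) \
    / sum((x - mx) ** 2 for x in xs)
b = mr - a * mx

def hybrid_deposit(conc, flow):
    # Deterministic prediction corrected by the learned residual term.
    return deterministic_deposit(conc, flow) + a * flow + b
```

    The split mirrors the review's point: the deterministic component encodes process knowledge, while the data-driven term absorbs the systematic error the physical formula cannot express.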

    Doctor of Philosophy

    Neuroscientists are developing new imaging techniques and generating large volumes of data in an effort to understand the complex structure of the nervous system. The complexity and size of this data makes human interpretation a labor-intensive task. To aid in the analysis, new segmentation techniques for identifying neurons in these feature-rich datasets are required. However, the extremely anisotropic resolution of the data makes segmentation and tracking across slices difficult. Furthermore, the thickness of the slices can make the membranes of the neurons hard to identify. Similarly, structures can change significantly from one section to the next due to slice thickness, which makes tracking difficult. This thesis presents a complete method for segmenting many neurons at once in two-dimensional (2D) electron microscopy images and reconstructing and visualizing them in three dimensions (3D). First, we present an advanced method for identifying neuron membranes in 2D, necessary for whole-neuron segmentation, using a machine learning approach. The method uses a series of artificial neural networks (ANNs) combined with a feature vector composed of image and context intensities sampled over a stencil neighborhood. Several ANNs are applied in series, allowing each ANN to use the classification context provided by the previous network to improve detection accuracy. To further improve membrane detection, we use information from a nonlinear alignment of sequential learned membrane images in a final ANN that refines the detection in each section. The final output, the detected membranes, is used to obtain 2D segmentations of all the neurons in an image.
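    The serial-classifier framework described above can be sketched in miniature: each stage sees the raw stencil plus the previous stage's classifications over the same stencil. Tiny perceptrons stand in for the ANNs here, and the one-dimensional "image" and labels are invented for illustration only.

```python
# Sketch of cascaded classifiers with stencil features: stage 2 consumes
# stage 1's outputs as classification context.

def stencil(values, i, r=1):
    # Neighborhood of radius r, clamped at the borders.
    return [values[max(0, min(len(values) - 1, i + d))] for d in range(-r, r + 1)]

def train_perceptron(X, y, epochs=200, lr=0.1):
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for feats, target in zip(X, y):
            pred = 1 if sum(wi * f for wi, f in zip(w, feats)) + b > 0 else 0
            err = target - pred
            w = [wi + lr * err * f for wi, f in zip(w, feats)]
            b += lr * err
    return w, b

def predict(w, b, feats):
    return 1 if sum(wi * f for wi, f in zip(w, feats)) + b > 0 else 0

# Toy 1-D "image": low intensity = membrane (label 1).
image = [0.9, 0.8, 0.1, 0.9, 0.85, 0.15, 0.9, 0.9, 0.2, 0.8]
labels = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]

# Stage 1: raw intensity stencil only.
X1 = [stencil(image, i) for i in range(len(image))]
w1, b1 = train_perceptron(X1, labels)
out1 = [predict(w1, b1, f) for f in X1]

# Stage 2: raw stencil plus stage-1 classification context.
X2 = [stencil(image, i) + stencil(out1, i) for i in range(len(image))]
w2, b2 = train_perceptron(X2, labels)
out2 = [predict(w2, b2, f) for f in X2]
```

    The point of the cascade is that stage 2's feature vector is twice as long: half raw intensities, half the previous stage's decisions over the same neighborhood.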
We also present a method that constructs 3D neuron representations by formulating the problem of finding paths through sets of sections as an optimal-path computation: a cost function scores the identification of a cell from one section to the next, and the resulting optimization problem is solved with Dijkstra's algorithm. This formulation accounts for variability and inconsistencies between sections and prioritizes cells based on the evidence of their connectivity. Finally, we present a tool that combines these techniques with a visual user interface that enables users to quickly segment whole neurons in large volumes.
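    The optimal-path formulation can be sketched as follows: candidate 2D segments are nodes in a layered graph, edge costs score possible links between adjacent sections, and Dijkstra's algorithm finds the cheapest path. The centroid-based cost and the candidate data below are assumptions for illustration, not the thesis's actual cost function.

```python
# Sketch of linking 2-D segments across sections via Dijkstra's algorithm.
import heapq

# Hypothetical candidate cells per section: name -> 2-D centroid.
sections = [
    {"a0": (10, 10), "b0": (40, 40)},
    {"a1": (12, 11), "b1": (41, 38)},
    {"a2": (13, 13), "b2": (60, 60)},
]

def link_cost(p, q):
    # Toy cost: squared centroid displacement between sections.
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def best_path(start):
    # Dijkstra over the layered graph, from `start` in section 0 to any
    # candidate in the last section.
    dist = {(0, start): 0.0}
    prev = {}
    heap = [(0.0, 0, start)]
    while heap:
        d, layer, cell = heapq.heappop(heap)
        if d > dist.get((layer, cell), float("inf")):
            continue  # stale heap entry
        if layer == len(sections) - 1:
            path = [cell]          # reconstruct the settled optimal path
            while (layer, cell) in prev:
                layer, cell = prev[(layer, cell)]
                path.append(cell)
            return list(reversed(path)), d
        for nxt, centroid in sections[layer + 1].items():
            nd = d + link_cost(sections[layer][cell], centroid)
            if nd < dist.get((layer + 1, nxt), float("inf")):
                dist[(layer + 1, nxt)] = nd
                prev[(layer + 1, nxt)] = (layer, cell)
                heapq.heappush(heap, (nd, layer + 1, nxt))
    return None, float("inf")

path, cost = best_path("a0")
```

    Because the first final-layer node popped from the heap is already settled, the search can return immediately, which is what makes this formulation efficient over many candidate cells.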

    Assessment of a multi-measure functional connectivity approach

    Efforts to find differences in the brain activity patterns of subjects with neurological and psychiatric disorders that could help in their diagnosis and prognosis have been increasing in recent years, and promise to revolutionise clinical practice and our understanding of such illnesses in the future. Resting-state functional magnetic resonance imaging (rs-fMRI) data have been increasingly used to evaluate this activity and to characterize the connectivity between distinct brain regions, commonly organized in functional connectivity (FC) matrices. Here, machine learning methods were used to assess the extent to which multiple FC matrices, each determined with a different statistical method, could change classification performance relative to when only one matrix is used, as is common practice. The statistical methods used include correlation, coherence, mutual information, transfer entropy and non-linear correlation, as implemented in the MULAN toolbox. Classification was performed with random forest and support vector machine (SVM) classifiers. Besides this main objective, the study had three other goals: to investigate which of these statistical methods individually yields the best classification performance, to confirm the importance of the blood-oxygen-level-dependent (BOLD) signal in the 0.009-0.08 Hz frequency range for FC-based classification, and to assess the impact of feature selection in SVM classifiers. Publicly available rs-fMRI data from the Addiction Connectome Preprocessed Initiative (ACPI) and ADHD-200 databases were used to classify controls vs subjects with Attention-Deficit/Hyperactivity Disorder (ADHD). Maximum accuracy and macro-averaged f-measure values of 0.744 and 0.677 were achieved on the ACPI dataset, and of 0.678 and 0.648 on the ADHD-200 dataset.
Results show that combining matrices can significantly improve classification accuracy and macro-averaged f-measure, provided feature selection is applied. The results also suggest that mutual information methods may play an important role in FC-based classification, at least when classifying subjects with ADHD.
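    The multi-measure setup can be sketched as follows: each subject's FC matrices (one per statistical method) are flattened and concatenated, a simple group-difference criterion selects features, and a classifier is applied. A nearest-centroid classifier on synthetic data stands in for the study's SVM/random-forest pipeline; everything below is invented for illustration and, unlike the study, is evaluated on its own training data.

```python
# Sketch of combining several FC matrices into one feature vector with
# feature selection, on synthetic data.
import random

random.seed(0)
N_REGIONS, MEASURES = 4, ["correlation", "coherence", "mutual_info"]

def synth_subject(adhd):
    # One flattened upper triangle per connectivity measure, concatenated.
    feats = []
    for m in range(len(MEASURES)):
        for i in range(N_REGIONS):
            for j in range(i + 1, N_REGIONS):
                # Synthetic group effect confined to the mutual-info block.
                base = 0.5 + (0.2 if adhd and m == 2 else 0.0)
                feats.append(base + random.gauss(0, 0.05))
    return feats

subjects = [(synth_subject(False), 0) for _ in range(20)] + \
           [(synth_subject(True), 1) for _ in range(20)]

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

c0 = centroid([f for f, y in subjects if y == 0])
c1 = centroid([f for f, y in subjects if y == 1])

# Feature selection: keep the k features with the largest group difference.
k = 5
keep = sorted(range(len(c0)), key=lambda i: -abs(c1[i] - c0[i]))[:k]

def classify(feats):
    d0 = sum((feats[i] - c0[i]) ** 2 for i in keep)
    d1 = sum((feats[i] - c1[i]) ** 2 for i in keep)
    return 0 if d0 < d1 else 1

acc = sum(classify(f) == y for f, y in subjects) / len(subjects)
```

    With the discriminative signal planted in the mutual-information block, the selection step recovers exactly those features, echoing the study's observation about mutual information.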

    New Paradigms for Active Learning

    In traditional active learning, learning algorithms (or learners) mainly focus on the performance of the final model built and on the total number of queries needed to learn a good model. However, in many real-world applications, active learners have to focus on the learning process itself in order to achieve finer goals, such as minimizing the number of mistakes made when predicting unlabeled examples. Such learning goals are common and important in real-world applications. For example, in direct marketing, a sales agent (the learner) has to focus on the process of selecting customers to approach, and tries to make correct predictions (i.e., fewer mistakes) about the customers who will buy the product. Traditional active learning algorithms cannot achieve these finer goals because their focus is different. In this thesis, we study how to control the learning process in active learning so that such goals can be accomplished. According to the various learning tasks and goals, we introduce four new active learning paradigms, as follows. The first paradigm is learning actively and conservatively. Under this paradigm, the learner iteratively selects and predicts the most certain example (thus, conservatively) during the learning process. The goal is to minimize the number of mistakes in predicting unlabeled examples during active learning. Intuitively, the conservative strategy is less likely to make mistakes, i.e., more likely to achieve the learning goal. We apply this strategy in educational software, as well as in direct marketing. The second paradigm is learning actively and aggressively. Under this paradigm, unlabeled examples and multiple oracles are available. The learner actively selects the best set of oracles to label the most uncertain example (thus, aggressively) during the learning process. The learning goal is to learn a good model with guaranteed label quality.
The third paradigm is learning actively with a conservative-aggressive tradeoff. Under this paradigm, firstly, unlabeled examples are available and learners are allowed to select examples actively to learn from. Secondly, to obtain the labels, two actions can be considered: querying oracles and making predictions. Lastly, a cost has to be paid for querying oracles or for making wrong predictions. Trading off between the two actions is necessary for achieving the learning goal: minimizing the total cost of obtaining the labels. The last paradigm is learning actively with minimal/maximal effort. Under this paradigm, the labels of all examples are provided and learners are allowed to select examples actively to learn from. The goal is to control the learning process by selecting examples actively such that learning can be accomplished with minimal effort, or a good model can be built quickly with maximal effort. For each of the four learning paradigms, we propose effective learning algorithms and demonstrate empirically that related learning problems in real applications can be solved well and the learning goals can be accomplished. In summary, this thesis focuses on controlling the learning process to achieve finer goals in active learning. Motivated by various real application tasks, we propose four novel learning paradigms, and for each paradigm we propose efficient learning algorithms to solve the corresponding learning problems. The experimental results show that our learning algorithms outperform other state-of-the-art learning algorithms.
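    The first paradigm (learning actively and conservatively) can be sketched in miniature: at each step the learner predicts the unlabeled example it is currently most certain about, counts its mistakes, and feeds its own prediction back into the model. The 1-nearest-neighbour "model", the margin-based certainty score and the one-dimensional data are assumptions for illustration, not the thesis's algorithms.

```python
# Sketch of conservative active learning: always predict the most certain
# unlabeled example next, so early mistakes are unlikely.

# Labeled seed set and unlabeled pool: (feature, true label).
labeled = [(0.0, 0), (10.0, 1)]
pool = [(1.0, 0), (9.0, 1), (4.9, 0), (5.1, 1), (2.0, 0), (8.0, 1)]

def predict_with_margin(x):
    # 1-NN prediction; margin = gap between distances to the two classes.
    d0 = min(abs(x - f) for f, y in labeled if y == 0)
    d1 = min(abs(x - f) for f, y in labeled if y == 1)
    return (0 if d0 < d1 else 1), abs(d0 - d1)

mistakes = 0
while pool:
    # Conservative choice: largest margin = most certain example.
    x, true_y = max(pool, key=lambda ex: predict_with_margin(ex[0])[1])
    pool.remove((x, true_y))
    pred, _ = predict_with_margin(x)
    mistakes += pred != true_y
    labeled.append((x, pred))  # self-labeled; informs later predictions

print(mistakes)
```

    Easy examples far from the decision boundary are consumed first, so by the time the ambiguous middle points are reached the model has seen most of the data, which is exactly the intuition behind minimizing prediction mistakes during learning.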

    Learning categorial grammars

    In 1967 E. M. Gold published a paper in which the language classes from the Chomsky hierarchy were analyzed in terms of learnability, in the technical sense of identification in the limit. His results were mostly negative, and perhaps because of this his work had little impact on linguistics. In the early eighties there was renewed interest in the paradigm, mainly because of work by Angluin and Wright. Around the same time, Arikawa and his co-workers refined the paradigm by applying it to so-called Elementary Formal Systems. Using this approach, Takeshi Shinohara was able to come up with an impressive result: any class of context-sensitive grammars with a bound on the number of rules is learnable. Some linguistically motivated work on learnability also appeared from this point on, most notably Wexler & Culicover 1980 and Kanazawa 1994. The latter investigates the learnability of various classes of categorial grammar, inspired by work by Buszkowski and Penn, and raises some interesting questions. We follow up on this work by exploring complexity issues relevant to learning these classes, answering an open question from Kanazawa 1994, and applying the same kind of approach to obtain (non)learnable classes of Combinatory Categorial Grammars, Tree Adjoining Grammars, Minimalist Grammars, Generalized Quantifiers, and some variants of Lambek Grammars. We also discuss work on learning tree languages and its application to learning Dependency Grammars. Our main conclusions are: - formal learning theory is relevant to linguistics, - identification in the limit is feasible for non-trivial classes, - the `Shinohara approach' - i.e., placing a numerical bound on the complexity of a grammar - can lead to a learnable class, but this completely depends on the specific nature of the formalism and the notion of complexity. We give examples of natural classes of commonly used linguistic formalisms that resist this kind of approach, - learning is hard work.
Our results indicate that learning even `simple' classes of languages requires a lot of computational effort, - dealing with structure (derivation, dependency) languages instead of string languages offers a useful and promising approach to learnability in a linguistic context.
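    The notion of identification in the limit can be illustrated with the classic learnable class of finite languages: a learner that conjectures exactly the set of strings seen so far converges, on any presentation that eventually shows every string of a finite target language, to the target and never changes again. The target language and presentation below are invented for illustration.

```python
# Sketch of Gold-style identification in the limit for finite languages.

def learner(text):
    # `text` is a finite prefix of a presentation (strings, with repeats).
    # The learner's conjecture after each datum is the set of strings seen.
    seen = set()
    conjectures = []
    for s in text:
        seen.add(s)
        conjectures.append(frozenset(seen))
    return conjectures

target = {"a", "ab", "abb"}
presentation = ["a", "ab", "a", "abb", "ab", "a", "abb"]
conjectures = learner(presentation)
```

    Once every string of the target has appeared, the conjecture stabilizes: this is exactly what "identification in the limit" requires, and it is why the class of all finite languages is learnable while richer Chomsky classes, by Gold's results, are not.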