52 research outputs found

    3-D Content-Based Retrieval and Classification with Applications to Museum Data

    There is an increasing number of multimedia collections arising in areas once only the domain of text and 2-D images. Richer types of multimedia such as audio, video and 3-D objects are becoming more and more commonplace. However, current retrieval techniques in these areas are not as sophisticated as textual and 2-D image techniques and in many cases rely upon textual searching through associated keywords. This thesis is concerned with the retrieval of 3-D objects and with the application of these techniques to the problem of 3-D object annotation. The majority of the work in this thesis has been driven by the European project, SCULPTEUR. This thesis provides an in-depth analysis of a range of 3-D shape descriptors for their suitability for general purpose and specific retrieval tasks using a publicly available data set, the Princeton Shape Benchmark, and using real world museum objects evaluated using a variety of performance metrics. This thesis also investigates the use of 3-D shape descriptors as inputs to popular classification algorithms, and a novel classifier agent for use with the SCULPTEUR system is designed and developed and its performance analysed. Several techniques are investigated to improve individual classifier performance. One set of techniques combines several classifiers whereas the other set aims to find the optimal training parameters for a classifier. The final chapter of this thesis explores a possible application of these techniques to the problem of 3-D object annotation.

    Keywords at Work: Investigating Keyword Extraction in Social Media Applications

    This dissertation examines a long-standing problem in Natural Language Processing (NLP) -- keyword extraction -- from a new angle. We investigate how keyword extraction can be formulated on social media data, such as emails, product reviews, student discussions, and student statements of purpose. We design novel graph-based features for supervised and unsupervised keyword extraction from emails, and use the resulting system with success to uncover patterns in a new dataset -- student statements of purpose. Furthermore, the system is used with new features on the problem of usage expression extraction from product reviews, where we obtain interesting insights. When used on student discussions, the system uncovers new and exciting patterns. While each of the above problems is conceptually distinct, they share two key common elements -- keywords and social data. Social data can be messy, hard to interpret, and not easily amenable to existing NLP resources. We show that our system is robust enough in the face of such challenges to discover useful and important patterns. We also show that the problem definition of keyword extraction itself can be expanded to accommodate new and challenging research questions and datasets. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/145929/1/lahiri_1.pd

    Artificial ontogenesis: a connectionist model of development

    This thesis suggests that ontogenetic adaptive processes are important for generating intelligent behaviour. It is thus proposed that such processes, as they occur in nature, need to be modelled and that such a model could be used for generating artificial intelligence, and specifically robotic intelligence. Hence, this thesis focuses on how mechanisms of intelligence are specified. A major problem in robotics is the need to predefine the behaviour to be followed by the robot. This makes design intractable for all but the simplest tasks and results in controllers that are specific to that particular task and are brittle when faced with unforeseen circumstances. These problems can be resolved by providing the robot with the ability to adapt the rules it follows and to autonomously create new rules for controlling behaviour. This solution thus depends on the predefinition of how rules to control behaviour are to be learnt rather than the predefinition of rules for behaviour themselves. Learning new rules for behaviour occurs during the developmental process in biology. Changes in the structure of the cerebral cortex underlie behavioural and cognitive development throughout infancy and beyond. The uniformity of the neocortex suggests that there is significant computational uniformity across the cortex resulting from uniform mechanisms of development, and holds out the possibility of a general model of development. Development is an interactive process between genetic predefinition and environmental influences. This interactive process is constructive: qualitatively new behaviours are learnt by using simple abilities as a basis for learning more complex ones. The progressive increase in competence, provided by development, may be essential to make tractable the process of acquiring higher-level abilities. While simple behaviours can be triggered by direct sensory cues, more complex behaviours require the use of more abstract representations.
There is thus a need to find representations at the correct level of abstraction appropriate to controlling each ability. In addition, finding the correct level of abstraction makes tractable the task of associating sensory representations with motor actions. Hence, finding appropriate representations is important both for learning behaviours and for controlling behaviours. Representations can be found by recording regularities in the world or by discovering reoccurring patterns through repeated sensory-motor interactions. By recording regularities within the representations thus formed, more abstract representations can be found. Simple, non-abstract, representations thus provide the basis for learning more complex, abstract, representations. A modular neural network architecture is presented as a basis for a model of development. The pattern of activity of the neurons in an individual network constitutes a representation of the input to that network. This representation is formed through a novel, unsupervised, learning algorithm which adjusts the synaptic weights to improve the representation of the input data. Representations are formed by neurons learning to respond to correlated sets of inputs. Neurons thus become feature detectors or pattern recognisers. Because the nodes respond to patterns of inputs they encode more abstract features of the input than are explicitly encoded in the input data itself. In this way simple representations provide the basis for learning more complex representations.
The algorithm allows both more abstract representations to be formed by associating correlated, coincident, features together, and invariant representations to be formed by associating correlated, sequential, features together. The algorithm robustly learns accurate and stable representations, in a format most appropriate to the structure of the input data received: it can represent both single and multiple input features in both the discrete and continuous domains, using either topologically or non-topologically organised nodes. The output of one neural network is used to provide inputs for other networks. The robustness of the algorithm enables each neural network to be implemented using an identical algorithm. This allows a modular 'assembly' of neural networks to be used for learning more complex abilities: the output activations of a network can be used as the input to other networks which can then find representations of more abstract information within the same input data; and, by defining the output activations of neurons in certain networks to have behavioural consequences it is possible to learn sensory-motor associations, to enable sensory representations to be used to control behaviour.

    Development and Application of Chemometric Methods for Modelling Metabolic Spectral Profiles

    The interpretation of metabolic information is crucial to understanding the functioning of a biological system. Latent information about the metabolic state of a sample can be acquired using analytical chemistry methods, which generate spectroscopic profiles. Thus, nuclear magnetic resonance spectroscopy and mass spectrometry techniques can be employed to generate vast amounts of highly complex data on the metabolic content of biofluids and tissue, and this thesis discusses ways to process, analyse and interpret these data successfully. The evaluation of J-resolved spectroscopy in magnetic resonance profiling and the statistical techniques required to extract maximum information from the projections of these spectra are studied. In particular, data processing is evaluated, and correlation and regression methods are investigated with respect to enhanced model interpretation and biomarker identification. Additionally, it is shown that non-linearities in metabonomic data can be effectively modelled with kernel-based orthogonal partial least squares, for which an automated optimisation of the kernel parameter with nested cross-validation is implemented. The interpretation of orthogonal variation and predictive ability enabled by this approach are demonstrated in regression and classification models for applications in toxicology and parasitology. Finally, the vast amount of data generated with mass spectrometry imaging is investigated in terms of data processing, and the benefits of applying multivariate techniques to these data are illustrated, especially in terms of interpretation and visualisation using colour-coding of images. The advantages of methods such as principal component analysis, self-organising maps and manifold learning over univariate analysis are highlighted.
This body of work therefore demonstrates new means of increasing the amount of biochemical information that can be obtained from a given set of samples in biological applications using spectral profiling. Various analytical and statistical methods are investigated and illustrated with applications drawn from diverse biomedical areas

    Energy efficient enabling technologies for semantic video processing on mobile devices

    Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object-based paradigm has many undeniable benefits, numerous technical challenges remain before the applications become pervasive, particularly on computationally constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery-powered mobile computing devices, the additional algorithmic complexity of semantic object-based processing compared to conventional video processing is highly undesirable both from a real-time operation and battery life perspective. This thesis attempts to tackle these issues by firstly constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which from the outset was designed to be amenable to offloading from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object-based shape encoding, a novel energy efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourably with the relevant prior art.
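
    The abstract above names modified Booth recoding as one ingredient of the arithmetic circuitry without describing it. As a rough software illustration only, the sketch below shows the standard radix-4 form of the technique, which halves the number of partial products in a multiplier. The function names `booth_radix4_digits` and `booth_multiply` are hypothetical, and nothing here is taken from the hardware design in the thesis.

```python
def booth_radix4_digits(multiplier: int, width: int) -> list[int]:
    """Recode a signed two's-complement multiplier into radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, 1, 2}; digit i carries weight 4**i.
    `width` is the (even) bit width assumed for the multiplier.
    """
    if width <= 0 or width % 2:
        raise ValueError("width must be a positive even number")
    if not -(1 << (width - 1)) <= multiplier < (1 << (width - 1)):
        raise ValueError("multiplier does not fit in the given width")

    bits = multiplier & ((1 << width) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, width, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        # Overlapping 3-bit window (b1, b0, prev) maps to one signed digit.
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, width: int) -> int:
    """Multiply by summing one partial product per Booth digit."""
    digits = booth_radix4_digits(multiplier, width)
    return sum(d * multiplicand * (4 ** i) for i, d in enumerate(digits))
```

    With an 8-bit multiplier this produces four partial products instead of eight, which is the property that makes the recoding attractive for compact multiplier circuits.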

    Representation of statistical sound properties in human auditory cortex

    The work carried out in this doctoral thesis investigated the representation of statistical sound properties in human auditory cortex. It addressed four key aspects in auditory neuroscience: the representation of different analysis time windows in auditory cortex; mechanisms for the analysis and segregation of auditory objects; information-theoretic constraints on pitch sequence processing; and the analysis of local and global pitch patterns. The majority of the studies employed a parametric design in which the statistical properties of a single acoustic parameter were altered along a continuum, while keeping other sound properties fixed. The thesis is divided into four parts. Part I (Chapter 1) examines principles of anatomical and functional organisation that constrain the problems addressed. Part II (Chapter 2) introduces approaches to digital stimulus design, principles of functional magnetic resonance imaging (fMRI), and the analysis of fMRI data. Part III (Chapters 3-6) reports five experimental studies. Study 1 controlled the spectrotemporal correlation in complex acoustic spectra and showed that activity in auditory association cortex increases as a function of spectrotemporal correlation. Study 2 demonstrated a functional hierarchy of the representation of auditory object boundaries and object salience. Studies 3 and 4 investigated cortical mechanisms for encoding entropy in pitch sequences and showed that the planum temporale acts as a computational hub, requiring more computational resources for sequences with high entropy than for those with high redundancy. Study 5 provided evidence for a hierarchical organisation of local and global pitch pattern processing in neurologically normal participants. Finally, Part IV (Chapter 7) concludes with a general discussion of the results and future perspectives
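
    Studies 3 and 4 above contrast pitch sequences with high entropy against those with high redundancy. A minimal Python sketch of that contrast follows, assuming entropy is measured as the Shannon entropy of the empirical distribution of pitches in a sequence; the thesis may define entropy over the generating process instead. The function name `shannon_entropy` and the use of MIDI note numbers as pitch labels are illustrative assumptions.

```python
import math
from collections import Counter


def shannon_entropy(sequence):
    """Empirical Shannon entropy of a sequence, in bits per symbol."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = len(sequence)
    counts = Counter(sequence)
    # H = sum over symbols of p * log2(1 / p), with p the relative frequency.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A fully redundant sequence (one repeated pitch) versus a sequence that
# uses four pitches equally often.
repeated = [60] * 8
varied = [60, 62, 64, 65] * 2
print(shannon_entropy(repeated), shannon_entropy(varied))
```

    Under this assumption the repeated-pitch sequence has zero entropy and the four-pitch sequence has two bits per symbol, which is the kind of continuum a parametric design can vary.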

    Workshop Proceedings of the 12th edition of the KONVENS conference

    The 2014 issue of KONVENS is even more a forum for exchange: its main topic is the interaction between Computational Linguistics and Information Science, and the synergies such interaction, cooperation and integrated views can produce. This topic at the crossroads of different research traditions which deal with natural language as a container of knowledge, and with methods to extract and manage knowledge that is linguistically represented is close to the heart of many researchers at the Institut für Informationswissenschaft und Sprachtechnologie of Universität Hildesheim: it has long been one of the institute’s research topics, and it has received even more attention over the last few years

    Forecasting: theory and practice

    Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts. We do not claim that this review is an exhaustive list of methods and applications. However, we wish that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear. We offer cross-references to allow the readers to navigate through the various topics. We complement the theoretical concepts and applications covered by large lists of free or open-source software implementations and publicly-available databases.

    Adaptive Algorithms For Classification On High-Frequency Data Streams: Application To Finance

    International Mention in the doctoral degree. In recent years, the problem of concept drift has gained importance in the financial domain. The succession of manias, panics and crashes has stressed the nonstationary nature and the likelihood of drastic structural changes in financial markets. The most recent literature suggests the use of conventional machine learning and statistical approaches for this. However, these techniques are unable or slow to adapt to non-stationarities and may require re-training over time, which is computationally expensive and brings financial risks. This thesis proposes a set of adaptive algorithms to deal with high-frequency data streams and applies these to the financial domain. We present approaches to handle different types of concept drifts and perform predictions using up-to-date models. These mechanisms are designed to provide fast reaction times and are thus applicable to high-frequency data. The core experiments of this thesis are based on the prediction of the price movement direction at different intraday resolutions in the SPDR S&P 500 exchange-traded fund. The proposed algorithms are benchmarked against other popular methods from the data stream mining literature and achieve competitive results. We believe that this thesis opens good research prospects for financial forecasting during market instability and structural breaks. Results have shown that our proposed methods can improve prediction accuracy in many of these scenarios. Indeed, the results obtained are compatible with ideas against the efficient market hypothesis. However, we cannot claim that we can consistently beat buy and hold; therefore, we cannot reject it. Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. Chair: Gustavo Recio Isasi. Secretary: Pedro Isasi Viñuela. Member: Sandra García Rodrígue