
    Genetic programming for manufacturing optimisation.

    A considerable number of optimisation techniques have been proposed for solving problems associated with the manufacturing process. Evolutionary computation methods, a group of non-deterministic search algorithms that use the concept of the Darwinian struggle for survival to guide the search for optimal solutions, have been used extensively for this purpose. Genetic programming is an evolutionary algorithm that evolves variable-length solution representations in the form of computer programs. While genetic programming has produced successful applications in a variety of optimisation fields, genetic programming methodologies for the solution of manufacturing optimisation problems have rarely been reported. This thesis investigates the applicability of genetic programming to manufacturing optimisation, using three well-known problems: the one-machine total tardiness problem, the cell-formation problem and the multiobjective process planning selection problem. The main contribution of the thesis is the introduction of novel genetic programming frameworks for the solution of these problems. For the one-machine total tardiness problem, genetic programming employed combinations of dispatching rules as an indirect representation of job schedules. The hybridisation of genetic programming with alternative search algorithms was proposed for more difficult problem instances, and genetic programming was also used to evolve new dispatching rules that challenged the efficiency of man-made dispatching rules for this problem. An integrated genetic programming and hierarchical clustering approach was proposed for the solution of simple and advanced formulations of the cell-formation problem; the proposed framework produced results competitive with alternative methodologies proposed for the same problem. The evolution of similarity coefficients that can be used in combination with clustering techniques for the solution of cell-formation problems was also investigated. Finally, genetic programming was combined with a number of evolutionary multiobjective techniques for the solution of the multiobjective process planning selection problem. Results on test problems illustrated the ability of the proposed methodology to provide a wealth of potential solutions to the decision-maker.
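
    As a hedged illustration of the indirect, dispatching-rule-based representation described above, the sketch below decodes a sequence of dispatching rules into a schedule for a tiny one-machine total tardiness instance and scores it. The job data, the rule set (SPT, EDD, MDD) and the decoding scheme are invented for illustration and are not the thesis's actual framework; a GP layer would search over such rule combinations (or over evolved rule expressions).

```python
# Illustrative sketch (not the thesis's framework): a schedule for the
# one-machine total tardiness problem is represented indirectly as a
# sequence of dispatching rules, one per scheduling decision. Job data
# below are made up.

JOBS = [  # (processing_time, due_date)
    (4, 9), (7, 12), (2, 5), (5, 22), (3, 8),
]

# Dispatching rules: given the current time and the set of unscheduled
# jobs, return the index of the job to schedule next.
def spt(t, pending):   # Shortest Processing Time
    return min(pending, key=lambda j: JOBS[j][0])

def edd(t, pending):   # Earliest Due Date
    return min(pending, key=lambda j: JOBS[j][1])

def mdd(t, pending):   # Modified Due Date
    return min(pending, key=lambda j: max(JOBS[j][1], t + JOBS[j][0]))

def total_tardiness(rule_sequence):
    """Decode a rule sequence into a schedule and return its total tardiness."""
    t, tardiness = 0, 0
    pending = set(range(len(JOBS)))
    for rule in rule_sequence:       # one dispatching decision per rule
        j = rule(t, pending)
        pending.remove(j)
        t += JOBS[j][0]              # the chosen job completes at time t
        tardiness += max(0, t - JOBS[j][1])
    return tardiness

# One candidate individual: a rule per decision point. A GP/GA layer would
# search over such sequences to minimise total tardiness.
candidate = [edd, mdd, spt, edd, spt]
print(total_tardiness(candidate))
```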

    Speech Recognition

    Chapters in the first part of the book cover the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals, methods for speech feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems, and in applications able to operate in real-world environments, such as mobile communication services and smart homes.

    Content-based retrieval in large collections of heterogeneous images (Recuperação por conteúdo em grandes coleções de imagens heterogêneas)

    Advisor: Alexandre Xavier Falcão. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica. Content-based image retrieval (CBIR) is an area that has been receiving increasing attention from the scientific community because of the exponential growth in the number of available images, mainly on the WWW. As the volume of stored images grows, so does the interest in systems able to efficiently retrieve these images according to their visual content. Our work focused on techniques suitable for large collections of heterogeneous images (broad image domains). In a broad image domain, no a priori knowledge about the visual and/or semantic content of the images can be assumed, and the cost of semi-automatic analysis techniques (with human intervention) is prohibitive because of the heterogeneity and the volume of images that must be analyzed. We concentrated on the color information present in the images and addressed the three main issues involved in color-based image retrieval: (1) how to analyze and extract color information from images automatically and efficiently; (2) how to represent this information in a compact and effective way; and (3) how to efficiently compare the visual features that describe two images. The main contributions of this work are two algorithms for the automatic analysis of the visual content of images (CBC and BIC), two distance functions for comparing the features extracted from images (MiCRoM and dLog), and an alternative representation for CBIR approaches that decompose and represent images according to a grid of equal-sized cells (CCH).
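
    Since the abstract only names the BIC analysis algorithm and the dLog distance without defining them, the following sketch illustrates the general flavour of that family of techniques under stated assumptions: a border/interior style pixel classification with uniform colour quantisation, and a simple log-compressed L1 histogram distance. None of these details are taken from the thesis itself.

```python
# Hedged sketch of a border/interior (BIC-style) colour descriptor and a
# log-compressed L1 histogram distance (dLog-style). The quantisation, the
# border test and the log compression are simplifications, not the thesis's
# exact definitions.
import numpy as np

def quantise(img, levels_per_channel=4):
    """Uniformly quantise an RGB image (H x W x 3, uint8) into colour codes."""
    step = 256 // levels_per_channel
    q = img // step
    return q[..., 0] * levels_per_channel**2 + q[..., 1] * levels_per_channel + q[..., 2]

def border_interior_histograms(codes, n_colours):
    """Mark a pixel as border if any 4-neighbour has a different colour code,
    then build one colour histogram per class (border / interior)."""
    border = np.zeros_like(codes, dtype=bool)
    border[1:, :]  |= codes[1:, :]  != codes[:-1, :]
    border[:-1, :] |= codes[:-1, :] != codes[1:, :]
    border[:, 1:]  |= codes[:, 1:]  != codes[:, :-1]
    border[:, :-1] |= codes[:, :-1] != codes[:, 1:]
    h_border = np.bincount(codes[border].ravel(), minlength=n_colours)
    h_interior = np.bincount(codes[~border].ravel(), minlength=n_colours)
    return h_border, h_interior

def dlog_like(h1, h2):
    """L1 distance between log-compressed histograms, reducing the dominance
    of very frequent colours (a stand-in for the dLog idea, not its exact form)."""
    f = lambda h: np.log2(h.astype(float) + 1.0)
    return np.abs(f(h1) - f(h2)).sum()

# Usage with random "images"; a real system would compare the concatenated
# border/interior histograms of a query against a database of images.
rng = np.random.default_rng(0)
img_a = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)
img_b = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)
n = 4**3
ha = np.concatenate(border_interior_histograms(quantise(img_a), n))
hb = np.concatenate(border_interior_histograms(quantise(img_b), n))
print(dlog_like(ha, hb))
```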

    Parameter-free agglomerative hierarchical clustering to model learners' activity in online discussion forums

    The analysis of learners' activity in online discussion forums leads to a highly context-dependent modelling problem, which can be approached both theoretically and empirically. When this problem is tackled from the data mining field, a clustering-based perspective is usually adopted, giving rise to a clustering scenario in which the real number of clusters is a priori unknown. This approach therefore exposes an underlying issue, one of the best-known problems of the clustering paradigm: estimating the number of clusters, which is usually selected by the user according to some subjective criterion that can easily introduce undesired biases into the resulting models. With the aim of avoiding any user intervention in the cluster analysis stage, two new cluster merging criteria are proposed in this thesis, which in turn allow a novel parameter-free agglomerative hierarchical clustering algorithm to be implemented. A comprehensive set of experiments indicates that the new clustering algorithm provides optimal clustering solutions across a wide variety of clustering scenarios, both dealing with different kinds of data and outperforming the clustering algorithms most widely used in practice. Finally, a two-stage analysis strategy based on the subspace clustering paradigm is proposed to properly tackle the problem of modelling learners' participation in asynchronous discussions. In combination with the new clustering algorithm, the proposed strategy proves able to limit the user's subjective intervention to the interpretation stages of the analysis process and to yield a complete model of the activity performed by learners in online discussion forums.
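
    The thesis's own merging criteria are not described in the abstract, so the sketch below only illustrates the general idea of parameter-free agglomerative clustering: the full merge tree is built and the number of clusters is chosen automatically, here by a simple largest-gap heuristic on the merge distances rather than by the criteria proposed in the thesis.

```python
# Minimal sketch of agglomerative clustering with an automatic choice of the
# number of clusters (largest jump in merge distances). Assumes SciPy.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def auto_agglomerative(X, method="ward"):
    Z = linkage(X, method=method)       # full merge tree
    heights = Z[:, 2]                   # distance at which each merge happens
    gaps = np.diff(heights)             # jumps between consecutive merges
    # Cut just below the largest jump: merges above it are treated as "forced".
    k = len(X) - (np.argmax(gaps) + 1)  # clusters remaining at that point
    return fcluster(Z, t=k, criterion="maxclust"), k

# Usage on two well-separated blobs; the heuristic should recover k = 2.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
labels, k = auto_agglomerative(X)
print(k, np.bincount(labels))
```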

    Semantic Domains in Akkadian Text

    The article examines the possibilities offered by language technology for analyzing semantic fields in Akkadian. The data for our research group comes from an existing electronic corpus, the Open Richly Annotated Cuneiform Corpus (ORACC). In addition to more traditional Assyriological methods, the article explores two language-technology methods: pointwise mutual information (PMI) and Word2vec.
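
    As a small, hedged illustration of the PMI side of the analysis (the toy documents below are invented English tokens, not Akkadian data from ORACC), the sketch computes document-level co-occurrence PMI; the Word2vec part of such a study would instead be trained on the tokenised corpus with an embedding library such as gensim.

```python
# Toy PMI sketch: PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) ),
# with probabilities estimated from co-occurrence within a document.
import math
from collections import Counter
from itertools import combinations

docs = [
    ["king", "palace", "gold", "tribute"],
    ["king", "army", "battle", "tribute"],
    ["temple", "god", "offering", "gold"],
    ["temple", "god", "priest", "offering"],
]

word_counts = Counter(w for d in docs for w in set(d))
pair_counts = Counter(frozenset(p) for d in docs for p in combinations(set(d), 2))
n_docs = len(docs)

def pmi(w1, w2):
    """Pointwise mutual information of two words co-occurring in a document."""
    p_xy = pair_counts[frozenset((w1, w2))] / n_docs
    p_x, p_y = word_counts[w1] / n_docs, word_counts[w2] / n_docs
    return float("-inf") if p_xy == 0 else math.log2(p_xy / (p_x * p_y))

print(pmi("temple", "god"), pmi("king", "god"))
```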

    CyberResearch on the Ancient Near East and Eastern Mediterranean

    CyberResearch on the Ancient Near East and Neighboring Regions provides case studies on archaeology, objects, cuneiform texts, and on online publishing, digital archiving, and preservation. Eleven chapters present a rich array of material, spanning the fifth through the first millennium BCE, from Anatolia, the Levant, Mesopotamia, and Iran. Customized cyber- and general glossaries support readers who lack either a technical background or familiarity with the ancient cultures. Edited by Vanessa Bigot Juloux, Amy Rebecca Gansell, and Alessandro Di Ludovico, this volume is dedicated to broadening the understanding and accessibility of digital humanities tools, methodologies, and results to Ancient Near Eastern Studies. Ultimately, this book provides a model for introducing cyber-studies to the mainstream of humanities research.

    Projection methods for clustering and semi-supervised classification

    This thesis focuses on data projection methods for the purposes of clustering and semi-supervised classification, with a primary focus on clustering. A number of contributions are presented which address this problem in a principled manner, using projection pursuit formulations to identify subspaces which contain useful information for the clustering task. Projection methods are extremely useful in high dimensional applications and in situations where the data contain irrelevant dimensions which can be counter-informative for the clustering task. The final contribution addresses high dimensionality in the context of a data stream; data streams and high dimensionality have been identified as two of the key challenges in data clustering. The first piece of work is motivated by identifying the minimum density hyperplane separator in the finite sample setting. This objective is directly related to the problem of discovering clusters defined as connected regions of high data density, a widely adopted definition in non-parametric statistics and machine learning. A thorough investigation of the theoretical aspects of this method is presented, together with the practical task of solving the associated optimisation problem efficiently. The proposed methodology is applied to both clustering and semi-supervised classification problems, and is shown to reliably find low density hyperplane separators in both contexts. The second and third contributions focus on a different approach to clustering based on graph cuts. The minimum normalised graph cut objective has gained considerable attention as relaxations of the objective have been developed which make it solvable for reasonably sized problems; this approach has been adopted by the highly popular spectral clustering methods. The second piece of work focuses on identifying the optimal subspace in which to perform spectral clustering, by minimising the second eigenvalue of the graph Laplacian for a graph defined over the data within that subspace. A rigorous treatment of this objective is presented, and an algorithm is proposed for its optimisation. An approximation method is proposed which allows this approach to be applied to much larger problems than would otherwise be possible. An extension of this work deals with the spectral projection pursuit method for semi-supervised classification. The third body of work looks at minimising the normalised graph cut using hyperplane separators. This formulation allows the exact normalised cut to be computed, rather than the spectral relaxation, and admits a computationally efficient optimisation method. The asymptotic properties of the normalised cut based on a hyperplane separator are investigated and shown to have similarities with the clustering objective based on low density separation. In fact, the methods in the second and third works are shown to be connected with the first, in that all three have the same solution asymptotically as their relative scaling parameters are reduced to zero. The final body of work addresses both problems of high dimensionality and incremental clustering in a data stream context. A principled statistical framework is adopted, in which clustering by low density separation again becomes the focal objective. A divisive hierarchical clustering model is proposed, using a collection of low density hyperplanes. The adopted framework provides a well-founded methodology for determining the number of clusters automatically, and also for identifying changes in the data stream which are relevant to the clustering objective; no existing methods can make both of these claims.
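
    The sketch below illustrates only the low-density-hyperplane intuition behind the first contribution, under simplifying assumptions: for a fixed projection direction, the hyperplane offset is placed where the estimated one-dimensional density of the projected data is lowest. The thesis's actual objective, constraints and optimisation procedure are not reproduced here.

```python
# Hedged sketch: score a hyperplane {x : v.x = b} by the estimated density of
# the projected data at b, and place b where that density is lowest.
import numpy as np
from scipy.stats import gaussian_kde

def best_split_on_direction(X, v, n_grid=200, coverage=0.9):
    v = v / np.linalg.norm(v)
    proj = X @ v                                  # 1-D projection of the data
    kde = gaussian_kde(proj)                      # kernel density estimate
    lo, hi = np.quantile(proj, [(1 - coverage) / 2, (1 + coverage) / 2])
    grid = np.linspace(lo, hi, n_grid)            # avoid trivial splits in the tails
    dens = kde(grid)
    b = grid[np.argmin(dens)]                     # lowest-density crossing point
    return b, dens.min()

# Two Gaussian clusters: a direction through both finds a low-density split
# near the midpoint; an outer search over v would minimise dens.min().
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-2, 0.5, (100, 2)), rng.normal(2, 0.5, (100, 2))])
b, score = best_split_on_direction(X, np.array([1.0, 1.0]))
print(round(b, 2), round(score, 4))
```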

    Computational methods for the analysis of mass spectrometry imaging data

    A powerful enhancement to MS-based detection is the addition of spatial information to the chemical data, an approach called mass spectrometry imaging (MSI). MSI enables two- and three-dimensional overviews of hundreds of molecular species over a wide mass range in complex biological samples. In this work, we present two computational methods and a workflow that address three different aspects of MSI data analysis: correction of mass shifts, unsupervised exploration of the data, and the importance of preprocessing and chemometrics for extracting meaningful information from the data. We introduce a new lock-mass-free recalibration procedure that significantly reduces mass shift effects in MSI data. Our method exploits similarities among pairs of peaklists and takes advantage of the spatial context in three different ways to perform mass correction in an iterative manner. As an extension of this work, we also present a Java-based tool, MSICorrect, that implements our recalibration approach and also allows data visualization. In the next part, an unsupervised approach for ranking ion intensity maps based on the abundance of their spatial patterns is presented. Our method assigns each ion intensity map a score reflecting the amount of spatial pattern it contains and ranks all maps by this score. To identify which masses exhibit similar spatial distributions, the method uses spatial-similarity-based grouping to provide lists of masses with similar distribution patterns. In the last part, we demonstrate the application of a data preprocessing and multivariate analysis pipeline to a real-world biological dataset: a high-resolution MSI dataset acquired from the leaf surface of black cottonwood (Populus trichocarpa). Applying the full pipeline helped to highlight and visualize the chemical specificity of the leaf surface.
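
    The exact spatial-pattern score used in the thesis is not given in the abstract, so the following sketch substitutes a simple spatial-autocorrelation measure to show how ion intensity maps could be scored and ranked so that structured images rise above noise-like ones; the m/z labels and images below are synthetic.

```python
# Hedged sketch of ranking ion intensity maps by spatial structure, using a
# simple autocorrelation score (a stand-in, not the thesis's actual measure).
import numpy as np

def spatial_score(img):
    """Mean correlation between the image and its one-pixel shifts."""
    img = (img - img.mean()) / (img.std() + 1e-12)
    shifts = [np.roll(img, 1, axis=0), np.roll(img, 1, axis=1)]
    return float(np.mean([(img * s).mean() for s in shifts]))

def rank_ion_maps(maps):
    """Return (mass, score) pairs sorted from most to least spatially structured."""
    scored = [(mz, spatial_score(img)) for mz, img in maps.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)

# Synthetic example: one smooth, localised "ion image" and one pure-noise map.
rng = np.random.default_rng(3)
yy, xx = np.mgrid[0:64, 0:64]
maps = {
    "m/z 301.1": np.exp(-((xx - 20) ** 2 + (yy - 40) ** 2) / 200.0),  # blob
    "m/z 415.3": rng.random((64, 64)),                                 # noise
}
for mz, score in rank_ion_maps(maps):
    print(mz, round(score, 3))
```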

    A survey of the application of soft computing to investment and financial trading


    Recommender Systems for Scientific and Technical Information Providers

    Providers of scientific and technical information are a promising application area for recommender systems because of the high search costs for their goods and the general problem of assessing the quality of information products. Nevertheless, the use of recommendation services in this market is still in its infancy. This book presents economic concepts, statistical methods and algorithms, technical architectures, and experiences from case studies on how recommender systems can be integrated.