
    Optimizing Expected Losses in Perturbation-Based Probabilistic Models via Multidimensional Parametric Min-Cut

    Master's thesis (M.S.) -- Seoul National University Graduate School, Department of Electrical and Computer Engineering, August 2015. Advisor: 정교민.
    We consider the problem of learning perturbation-based probabilistic models by computing and differentiating expected losses. This is a challenging computational problem that has traditionally been tackled using Monte Carlo-based methods. In this work, we show how a generalization of parametric min-cuts can be used to address the same problem, achieving high accuracy faster than a sampling-based baseline. Utilizing our proposed Skeleton Method, we show that we can learn the perturbation model so as to directly minimize expected losses. Experimental results show that this approach offers promise as a new way of training structured prediction models under complex loss functions.
    Contents:
    Chapter 1 Introduction
    Chapter 2 Background: Perturbations, Expected Losses
    Chapter 3 Algorithm: Skeleton Method
      3.1 Initialization
      3.2 Finding a New Facet
      3.3 Updating the Skeleton GY
      3.4 Calculating Expected Loss R
      3.5 Example: Two Parameters
    Chapter 4 Learning
      4.1 Computing Gradients: Slicing
      4.2 Training
      4.3 Exploiting the Skeleton Method
    Chapter 5 Experiments and Discussion
      5.1 Data and Setup
      5.2 Calculating Expected Losses
      5.3 Calculating Gradients
      5.4 Model Learning
        5.4.1 Learning
        5.4.2 Other Loss Functions
      5.5 Expected Segmentations
    Chapter 6 Conclusion
    Bibliography
    초록 (Abstract in Korean)
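    For orientation, the sketch below estimates the expected-loss objective above with the kind of Monte Carlo baseline the thesis compares against: sample perturbations of the unary potentials of a small binary MRF, decode each sample with an s-t min-cut, and average the losses. The toy chain model and all names are illustrative assumptions, not the thesis code; the Skeleton Method replaces this sampling loop with an exact computation over the parametric min-cut solutions.

```python
# A Monte Carlo sketch of the expected-loss objective; the Skeleton Method
# computes this quantity exactly instead of sampling. The toy chain MRF and
# all names here are illustrative assumptions.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

def map_binary_mrf(unary, pairwise):
    """MAP labeling of a 1-D binary MRF via an s-t min-cut.

    unary[i] > 0 favors label 1 at node i; pairwise >= 0 penalizes
    disagreeing neighbors (submodular, so the min-cut MAP is exact).
    """
    n = len(unary)
    G = nx.DiGraph()
    for i, u in enumerate(unary):
        G.add_edge("s", i, capacity=max(-u, 0.0))  # cost of label 1
        G.add_edge(i, "t", capacity=max(u, 0.0))   # cost of label 0
    for i in range(n - 1):
        # Both directions, so exactly one is paid per disagreeing pair.
        G.add_edge(i, i + 1, capacity=pairwise)
        G.add_edge(i + 1, i, capacity=pairwise)
    _, (source_side, sink_side) = nx.minimum_cut(G, "s", "t")
    return np.array([1 if i in sink_side else 0 for i in range(n)])

def expected_hamming_loss(unary, pairwise, y_true, n_samples=500):
    """Estimate E[loss(MAP(unary + gamma), y_true)] over perturbations gamma."""
    losses = []
    for _ in range(n_samples):
        # Difference of two Gumbels = logistic perturbation of the margin.
        gamma = rng.gumbel(size=len(unary)) - rng.gumbel(size=len(unary))
        y_hat = map_binary_mrf(unary + gamma, pairwise)
        losses.append(np.mean(y_hat != y_true))
    return float(np.mean(losses))

y_true = np.array([0, 0, 1, 1, 1])
unary = np.array([-1.0, -0.5, 0.3, 0.8, 1.2])  # model scores for label 1
print(expected_hamming_loss(unary, pairwise=0.5, y_true=y_true))
```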

    Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

    Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Provably optimal recovery by the algorithm is shown analytically for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat and diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed. Comment: 13 figures, 35 references.
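    As a rough sketch of the two ingredients named above (Laplacian eigenspace plus finite mixture modeling), the following illustration embeds graph nodes with low-order eigenvectors of the normalized Laplacian and fits a Gaussian mixture to obtain fuzzy memberships. The synthetic planted-partition graph, number of eigenvectors, and mixture size are illustrative assumptions, not the paper's exact algorithm.

```python
# Illustrative two-step pipeline: Laplacian eigenspace embedding, then a
# finite mixture for soft (fuzzy) memberships. Parameters are assumptions.
import numpy as np
import networkx as nx
from scipy.linalg import eigh
from sklearn.mixture import GaussianMixture

G = nx.planted_partition_graph(3, 30, p_in=0.3, p_out=0.02, seed=1)
L = nx.normalized_laplacian_matrix(G).toarray()

# The smallest nontrivial eigenvectors of the normalized Laplacian span the
# slowly mixing diffusion modes that separate regions of influence.
eigvals, eigvecs = eigh(L)
embedding = eigvecs[:, 1:4]  # skip the trivial constant mode

# A Gaussian mixture in eigenspace yields probabilistic memberships,
# i.e. a fuzzy domain decomposition rather than a hard partition.
gmm = GaussianMixture(n_components=3, random_state=0).fit(embedding)
memberships = gmm.predict_proba(embedding)  # shape (n_nodes, 3)
print(memberships[:5].round(2))
```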

    International Conference on Continuous Optimization (ICCOPT) 2019 Conference Book

    The Sixth International Conference on Continuous Optimization took place on the campus of the Technical University of Berlin, August 3-8, 2019. ICCOPT is a flagship conference of the Mathematical Optimization Society (MOS), organized every three years. ICCOPT 2019 was hosted by the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) Berlin. It included a summer school and a conference with a series of plenary and semi-plenary talks, organized and contributed sessions, and poster sessions. This book comprises the full conference program: the scientific program both in overview and in full detail, together with information on the social program, the venue, special meetings, and more.

    Multimodel Operability Framework for Design of Modular and Intensified Energy Systems

    In this dissertation, a novel operability framework is introduced for the process design of modular and intensified energy systems that are challenged by complexity and highly constrained environments. Previously developed process operability approaches are reviewed and further developed in terms of theory, application, and software infrastructure. An optimization-based multilayer operability framework is introduced for the process design of nonlinear energy systems. In the first layer of this framework, a mixed-integer linear programming (MILP)-based iterative algorithm minimizes footprint while meeting process intensification targets. In the second layer, an operability analysis is performed to incorporate key features of optimality and feasibility, accounting for system achievability and flexibility. The outcome of this framework is a set of modular designs that accounts for both size and process operability. Throughout this dissertation, the nonlinear system is represented by multiple linearized models, which results in lower computational expense and more efficient quantification of operability regions. A systematic techno-economic analysis framework is also proposed for costing intensified modular systems. Conventional costing techniques are extended to allow estimation of capital and operating costs of modular units. Economy of learning concepts are included to consider the effect of experience curves on purchase costs. Profitability measures are scaled with respect to production of a chemical of interest for comparison with plants of traditional scale. Scenarios in which the modular technology breaks even with, or undercuts, the cost of the traditional process are identified as a result. A framework for the development of process operability algorithms is provided as a software infrastructure outcome. Code generated from the developed approaches is included in an open-source platform that gives researchers in academia and industry access to the algorithms, with the purpose of disseminating and further improving process operability methods. To show the versatility and efficacy of the developed approaches, a variety of applications is considered: a membrane reactor for direct methane aromatization to hydrogen and benzene (DMA-MR); the classical shower problem in process operability; a power plant cycling application for power generation under penetration of renewable energy sources; and a newly developed modular hydrogen unit. Applications to DMA-MR subsystems demonstrate how the multilayer framework finds a region of modular design candidates, which are then ranked according to an operability index. The most operable design is determined and contrasted with the footprint-minimizing (process intensification) optimum, showing that optimality at fixed nominal operations does not necessarily ensure the best system operability. For the modular hydrogen unit application, the developed process operability framework provides guidelines for obtaining modular designs that are highly integrated and flexible with respect to disturbances in inlet natural gas composition. The modular hydrogen unit is also used to demonstrate the proposed techno-economic analysis framework.
    A comparison with a benchmark conventional steam methane reforming plant shows that the modular hydrogen unit can benefit from the economy of learning. An assembled modular steam methane reforming plant is used to map the decrease in natural gas price needed for the plant to break even with traditional technologies. When the natural gas price is low, break-even is possible for both individual hydrogen units and the assembled modular plant: for prices under 0.02 US$/Sm³, a learning-driven capital cost reduction of 40% or less suffices. This result suggests that the synthesized modular hydrogen process has the potential to be economically feasible under these conditions. The developed tools can be used to accelerate the deployment and manufacturing of standardized modular energy systems.
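    As a rough numerical companion to the economy-of-learning step above, the sketch below applies the standard experience-curve model, in which unit capital cost falls by a fixed fraction with every doubling of cumulative production. The 85% progress ratio, first-unit cost, and plant size are illustrative assumptions, not values from the dissertation.

```python
# Experience-curve costing sketch: C(n) = C(1) * n**log2(progress_ratio).
# All numbers below are illustrative assumptions.
import numpy as np

def unit_capital_cost(n_units, first_unit_cost, progress_ratio=0.85):
    """Cost of the n-th modular unit under an experience curve.

    progress_ratio = 0.85 means each doubling of cumulative production
    cuts the unit cost to 85% of its previous value.
    """
    b = np.log2(progress_ratio)  # negative exponent
    return first_unit_cost * n_units ** b

# Cumulative capital cost of an assembled plant of N identical units.
N = 16
costs = unit_capital_cost(np.arange(1, N + 1), first_unit_cost=1.0e6)
print(f"unit 1: ${costs[0]:,.0f}, unit {N}: ${costs[-1]:,.0f}")
print(f"plant total: ${costs.sum():,.0f} vs. no learning: ${1.0e6 * N:,.0f}")
```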

    Similarity measures for clustering sequences and sets of data

    The main objective of this PhD thesis is the definition of new similarity measures for data sequences, with the final purpose of clustering those sequences. Clustering consists in the partitioning of a dataset into isolated subsets or clusters. Data within a given cluster should be similar to each other, and at the same time different from data in other clusters. The relevance of clustering data sequences is ever-increasing, due to the abundance of this kind of data (multimedia sequences, movement analysis, stock market evolution, etc.) and the usefulness of clustering as an unsupervised exploratory analysis method. It is this lack of supervision that makes similarity measures extremely important for clustering, since they are the only guide of the learning process. The first part of the thesis focuses on the development of similarity measures leveraging dynamical models, which can capture relationships between the elements of a given sequence. Following this idea, two lines are explored:
    • Likelihood-based measures: Based on the popular framework of likelihood-matrix-based similarity measures, we present a novel method based on a re-interpretation of such a matrix. That interpretation stems from the assumption of a latent model space, so the models used to build the likelihood matrix are seen as samples from that space. The method is extremely flexible, since it allows the use of any probabilistic model for representing the individual sequences.
    • State-space-trajectory-based measures: We introduce a new way of defining affinities between sequences, addressing the main drawbacks of the likelihood-based methods. Working with state-space models makes it possible to identify sequences with the trajectories that they induce in the state space. This way, comparing sequences amounts to comparing the corresponding trajectories. Using a common hidden Markov model for all the sequences in the dataset makes those comparisons extremely simple, since trajectories can be identified with transition matrices. This new paradigm improves the scalability of the affinity measures with respect to dataset size, as well as their performance when the sequences are short.
    The second part of the thesis deals with the case where the dynamics of the sequences are discarded, so the sequences become sets of vectors or points. This step can be taken, without harming the learning process, when the static features (probability densities) of the different sets are informative enough for the task at hand, which is true for many real scenarios. Work along this line can be further subdivided into two areas:
    • Sets-of-vectors clustering based on the support of the distributions in a feature space: We propose clustering the sets using a notion of similarity related to the intersection of the supports of their underlying distributions in a Hilbert space. Such a clustering can be carried out efficiently in a hierarchical fashion, in spite of the potentially infinite dimensionality of the feature space. To this end, we propose an algorithm based on simple geometrical arguments. Support estimation is inherently a simpler problem than density estimation, which is the usual starting step for obtaining similarities between probability distributions.
    • Classifier-based affinity and divergence measures: It is quite natural to link the notion of similarity between sets with the separability between those sets. That separability can be quantified using binary classifiers.
This intuitive idea is then extended via generalizations of the family of f-divergences, which originally contains many of the best-known divergences in statistics and machine learning. The proposed generalizations present interesting theoretical properties, and at the same time they have promising practical applications, such as the development of new estimators for standard divergences.
    -----
    The objective of this PhD thesis is the definition of new similarity measures for sequences and sets of data, with the aim of serving as input to a clustering algorithm [Xu and Wunsch-II, 2005]. Clustering is one of the most common tasks in machine learning [Mitchell, 1997]. It consists in partitioning a dataset into isolated subsets (clusters), such that data assigned to the same subset are similar to each other and different from data belonging to other subsets. One of its main particularities is that it is an unsupervised task, meaning that it does not require a set of labeled examples. This reduces the human interaction needed for learning, making clustering an ideal tool for the exploratory analysis of complex data. On the other hand, it is precisely this lack of supervision that makes an adequate similarity measure between elements fundamental, since it is the only guide during the learning process. Sequence clustering is an increasingly important task due to the recent surge of this kind of data. The multimedia domain stands out, where many contents have sequential characteristics: speech, audio, and video signals, etc. This is not an isolated example, since similar situations arise in many other domains, from stock market and financial data to the problem of motion analysis. In most of these cases, the complexity of the input data is compounded by the difficulty and high cost of manually labeling that data. It is precisely in this kind of scenario that clustering is especially useful, because it does not require prior labeling. In many cases it is possible to discard the dynamics of the sequences without harming the learning process, namely when the static characteristics of the input data are sufficiently discriminative. By ignoring the dynamics, the sequences become sets of data, interpreted as (not necessarily independent) samples from certain underlying probability distributions. Practical examples of domains that work with sets of data include speaker clustering [Campbell, 1997], bag-of-words models for text/images [Dance et al., 2004], etc. In this thesis we propose methods and, above all, innovative points of view for defining similarities between sequences or sets of data. All the proposed methods have been analyzed from both a theoretical and an empirical point of view.
    From the experimental perspective, we have tried to work with as much real data as possible, with special emphasis on the tasks of speaker clustering and music genre recognition.
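    As a minimal sketch of the classifier-based divergence idea above, the following example trains a logistic-regression classifier to separate two samples and converts its output into a density-ratio estimate, which plugs into the KL divergence (a member of the f-divergence family). The Gaussian toy data and the choice of logistic regression are illustrative assumptions, not the estimators proposed in the thesis.

```python
# Classifier-based divergence estimation sketch: separability between two
# sets, quantified by a binary classifier, yields a density-ratio estimate.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_p = rng.normal(0.0, 1.0, size=(2000, 2))  # sample from P
X_q = rng.normal(1.0, 1.0, size=(2000, 2))  # sample from Q

X = np.vstack([X_p, X_q])
y = np.concatenate([np.ones(len(X_p)), np.zeros(len(X_q))])
clf = LogisticRegression().fit(X, y)

# With equal sample sizes, p(y=1 | x) / p(y=0 | x) estimates the density
# ratio p(x)/q(x), and KL(P||Q) = E_P[log p(x)/q(x)].
proba = clf.predict_proba(X_p)[:, 1].clip(1e-6, 1 - 1e-6)
kl_estimate = np.mean(np.log(proba / (1 - proba)))
print(f"estimated KL(P||Q) = {kl_estimate:.3f}  (true value 1.0 here)")
```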

    Search for Diboson Resonances in the Full-Hadronic Final State with the CMS Detector at √s = 13 TeV

    A search for new resonances decaying to WW, WZ, or ZZ in the all-hadronic final state, using 77.3 fb⁻¹ of data taken with the CMS experiment at the CERN LHC at a center-of-mass energy of 13 TeV, is presented. The search focuses on potential new particles with a mass at the TeV scale, which result in a high transverse momentum of the produced vector bosons. The decay products of each vector boson are therefore highly collimated and are reconstructed as a single large-radius jet, which is further classified using jet substructure methods. The analysis utilizes a new data-driven background modeling technique based on a template fit in a three-dimensional space spanned by the dijet invariant mass and the corrected jet masses of the two final-state jets. This method allows the full available signal yield to be used while simultaneously constraining the background processes by including the mass sideband regions in the fit, and it can easily be extended to include VH, HH, or more exotic signals with different messenger particles in the future. No significant excess is observed above the estimated standard model background, and limits are set at 95% confidence level on the cross section times branching fraction of a new particle, which are interpreted in terms of various models that predict spin-2 gravitons or spin-1 vector bosons. In a heavy vector triplet model, spin-1 Z' and W' resonances with masses below 3.5 and 3.8 TeV, respectively, are excluded at 95% confidence level. In a narrow-width bulk graviton model, upper limits on the cross section times branching fraction are set between 27 and 0.2 fb for resonance masses between 1.2 and 5 TeV, respectively. The limits presented in this thesis are the best to date in the dijet final state.
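    For reference, the dijet invariant mass that spans one axis of the three-dimensional fit space above is computed from the four-momenta of the two large-radius jets; a minimal sketch with made-up example kinematics (not values from the analysis):

```python
# Dijet invariant mass from two jet four-vectors (E, px, py, pz).
# The jet momenta below are illustrative assumptions, not analysis data.
import numpy as np

def invariant_mass(p4_a, p4_b):
    """m_jj = sqrt((E1 + E2)^2 - |p1 + p2|^2)."""
    e = p4_a[0] + p4_b[0]
    p = np.add(p4_a[1:], p4_b[1:])
    return np.sqrt(max(e**2 - np.dot(p, p), 0.0))

# Two roughly back-to-back TeV-scale jets (units: GeV).
jet1 = (1250.0,  1200.0,  150.0,  250.0)
jet2 = (1300.0, -1210.0, -160.0, -300.0)
print(f"m_jj = {invariant_mass(jet1, jet2):.0f} GeV")
```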

    Deep Neural Networks and Data for Automated Driving

    This open access book brings together the latest developments from industry and research on automated driving and artificial intelligence. Environment perception for highly automated driving heavily employs deep neural networks, facing many challenges: How much data do we need for training and testing? How can synthetic data save labeling costs for training? How do we increase robustness and decrease memory usage? For inevitably poor conditions: how do we know that the network is uncertain about its decisions? Can we understand a bit more about what actually happens inside neural networks? This leads to a very practical problem, particularly for DNNs employed in automated driving: what are useful validation techniques, and how about safety? This book unites the views of academia and industry, where computer vision and machine learning meet environment perception for highly automated driving. Naturally, aspects of data, robustness, uncertainty quantification, and, last but not least, safety are at its core. The book is unique: its first part provides an extended survey of all the relevant aspects, and the second part contains the detailed technical elaboration of the various questions mentioned above.
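    As a small illustration of the uncertainty question raised above, the sketch below uses Monte Carlo dropout, one common way to expose a network's uncertainty about its decisions: dropout is kept active at inference time, and the spread of repeated predictions flags uncertain inputs. The tiny classifier and all parameters are illustrative assumptions, not taken from the book.

```python
# Monte Carlo dropout sketch: sample the network with dropout enabled and
# read prediction spread as an uncertainty signal. Model is an assumption.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(64, 3),  # e.g. 3 semantic classes
)

def mc_dropout_predict(model, x, n_samples=50):
    """Mean softmax prediction and per-class std across dropout samples."""
    model.train()  # keep dropout stochastic at inference time
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    return probs.mean(0), probs.std(0)

x = torch.randn(1, 16)
mean, std = mc_dropout_predict(model, x)
print("prediction:", mean.numpy().round(3), "uncertainty:", std.numpy().round(3))
```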