6 research outputs found

    How Fast Can We Play Tetris Greedily With Rectangular Pieces?

    Consider a variant of Tetris played on a board of width $w$ and infinite height, where the pieces are axis-aligned rectangles of arbitrary integer dimensions, the pieces can only be moved before letting them drop, and a row does not disappear once it is full. Suppose we want to follow a greedy strategy: let each rectangle fall where it will end up the lowest given the current state of the board. To do so, we want a data structure which can always suggest a greedy move. In other words, we want a data structure which maintains a set of $O(n)$ rectangles, supports queries which return where to drop the rectangle, and updates which insert a rectangle dropped at a certain position and return the height of the highest point in the updated set of rectangles. We show via a reduction to the Multiphase problem [Pătraşcu, 2010] that on a board of width $w=\Theta(n)$, if the OMv conjecture [Henzinger et al., 2015] is true, then both operations cannot be supported in time $O(n^{1/2-\epsilon})$ simultaneously. The reduction also implies polynomial bounds from the 3-SUM conjecture and the APSP conjecture. On the other hand, we show that there is a data structure supporting both operations in $O(n^{1/2}\log^{3/2}n)$ time on boards of width $n^{O(1)}$, matching the lower bound up to an $n^{o(1)}$ factor. Comment: Correction of typos and other minor correction
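
    The two operations described in the abstract above can be illustrated with a naive baseline. This is a minimal sketch, not the paper's data structure: it keeps a plain array of column heights and scans it, so a query costs on the order of $w$ times the rectangle width rather than the $O(n^{1/2}\log^{3/2}n)$ bound. The names `NaiveGreedyBoard`, `query`, and `insert`, and the leftmost tie-breaking rule, are assumptions made for illustration.

```python
from typing import List, Tuple


class NaiveGreedyBoard:
    """Skyline of column heights on a board of fixed width (illustrative only)."""

    def __init__(self, width: int) -> None:
        self.heights: List[int] = [0] * width

    def query(self, rect_width: int) -> Tuple[int, int]:
        """Return (x, resting_height) of the leftmost lowest drop position."""
        if not 1 <= rect_width <= len(self.heights):
            raise ValueError("rectangle does not fit on the board")
        best_x = 0
        best_h = max(self.heights[:rect_width])
        for x in range(1, len(self.heights) - rect_width + 1):
            # A dropped rectangle rests on the tallest column it covers.
            h = max(self.heights[x:x + rect_width])
            if h < best_h:
                best_x, best_h = x, h
        return best_x, best_h

    def insert(self, x: int, rect_width: int, rect_height: int) -> int:
        """Drop a rectangle with its left edge at x; return the highest point."""
        if x < 0 or rect_width < 1 or x + rect_width > len(self.heights):
            raise ValueError("rectangle does not fit on the board")
        top = max(self.heights[x:x + rect_width]) + rect_height
        for i in range(x, x + rect_width):
            self.heights[i] = top
        return max(self.heights)


board = NaiveGreedyBoard(width=4)
highest = board.insert(0, 2, 3)   # occupy columns 0-1 up to height 3
greedy_x, greedy_h = board.query(2)  # lowest spot for a width-2 rectangle
```

    A greedy player would call `query` for each incoming rectangle and pass the returned position to `insert`; the lower bound in the abstract says that, under the OMv conjecture, no structure can make both calls polynomially faster than roughly $n^{1/2}$.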

    Contrastive Bayesian Analysis for Deep Metric Learning

    Recent methods for deep metric learning have focused on designing different contrastive loss functions between positive and negative pairs of samples, so that the learned feature embedding pulls positive samples of the same class closer and pushes negative samples from different classes away from each other. In this work, we recognize that there is a significant semantic gap between features at the intermediate feature layer and class labels at the final output layer. To bridge this gap, we develop a contrastive Bayesian analysis to characterize and model the posterior probabilities of image labels conditioned on their feature similarity in a contrastive learning setting. This contrastive Bayesian analysis leads to a new loss function for deep metric learning. To improve the generalization capability of the proposed method to new classes, we further extend the contrastive Bayesian loss with a metric variance constraint. Our experimental results and ablation studies demonstrate that the proposed contrastive Bayesian metric learning method significantly improves the performance of deep metric learning in both supervised and pseudo-supervised scenarios, outperforming existing methods by a large margin. Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence

    Generative Models Based on the Bounded Asymmetric Student’s t-Distribution

    Gaussian mixture models (GMMs) are a useful and widely popular approach for clustering, but they have several limitations, such as low tolerance to outliers and the assumption of data normality. Another problem with finite mixture models in general is the inference of an optimal number of mixture components. An established approach to this problem is model selection, the process of choosing the number of mixture components that ensures the best clustering performance. In this thesis, we attempt to tackle both aforementioned issues: we propose using minimum message length (MML) as a model selection criterion for the multivariate bounded asymmetric Student’s t-mixture model (BASMM). BASMM is chosen as an alternative that addresses the GMM’s limitations, as it provides a better fit for real-world data irregularities. We formulate the definition of MML and the BASMM, and we test their performance through multiple experiments with different problem settings. Hidden Markov models (HMMs) are popular methods for continuous sequential data modeling and classification tasks. In such applications, the observation emission densities of the HMM hidden states are typically modeled by elliptically contoured distributions, namely Gaussians or Student’s t-distributions. In this context, this thesis proposes BAMMHMM: a novel HMM with Bounded Asymmetric Student’s t-Mixture Model (BASMM) emissions. This HMM is designed to adequately fit skewed and outlier-heavy observations, which are typical in many fields, such as financial or signal processing-related datasets. We demonstrate the improved robustness of our model by presenting the results of different real-world applications.

    Mixture-Based Clustering and Hidden Markov Models for Energy Management and Human Activity Recognition: Novel Approaches and Explainable Applications

    In recent times, the rapid growth of data in various fields of life has created an immense need for powerful tools to extract useful information from data. This has motivated researchers to explore and devise new ideas and methods in the field of machine learning. Mixture models have gained substantial attention due to their ability to handle high-dimensional data efficiently and effectively. However, when adopting mixture models in such spaces, several crucial issues must be addressed, including the selection of probability density functions, estimation of mixture parameters, automatic determination of the number of components, identification of features that best discriminate the different components, and incorporation of temporal information. The primary objective of this thesis is to propose a unified model that addresses these interrelated problems. Moreover, this thesis proposes a novel approach that incorporates explainability. This thesis presents innovative mixture-based modelling approaches tailored for diverse applications, such as household energy consumption characterization, energy demand management, fault detection and diagnosis, and human activity recognition. The primary contributions of this thesis encompass the following aspects: Initially, we propose an unsupervised feature selection approach embedded within a finite bounded asymmetric generalized Gaussian mixture model. This model is adept at handling synthetic and real-life smart meter data, utilizing three distinct feature extraction methods. By employing the expectation-maximization algorithm in conjunction with the minimum message length criterion, we are able to concurrently estimate the model parameters, perform model selection, and execute feature selection. This unified optimization process facilitates the identification of household electricity consumption profiles along with the optimal subset of attributes defining each profile.
Furthermore, we investigate the impact of household characteristics on electricity usage patterns to pinpoint households that are ideal candidates for demand reduction initiatives. Subsequently, we introduce a semi-supervised learning approach for the mixture of mixtures of bounded asymmetric generalized Gaussian and uniform distributions. The integration of the uniform distribution within the inner mixture bolsters the model's resilience to outliers. In the unsupervised learning approach, the minimum message length criterion is utilized to ascertain the optimal number of mixture components. The proposed models are validated through a range of applications, including chiller fault detection and diagnosis, occupancy estimation, and energy consumption characterization. Additionally, we incorporate explainability into our models and establish a moderate trade-off between prediction accuracy and interpretability. Finally, we devise four novel models for human activity recognition (HAR): bounded asymmetric generalized Gaussian mixture-based hidden Markov model with feature selection (BAGGM-FSHMM), bounded asymmetric generalized Gaussian mixture-based hidden Markov model (BAGGM-HMM), asymmetric generalized Gaussian mixture-based hidden Markov model with feature selection (AGGM-FSHMM), and asymmetric generalized Gaussian mixture-based hidden Markov model (AGGM-HMM). We develop an innovative method for simultaneous estimation of feature saliencies and model parameters in BAGGM-FSHMM and AGGM-FSHMM, while integrating the bounded support asymmetric generalized Gaussian distribution (BAGGD) and the asymmetric generalized Gaussian distribution (AGGD) in the BAGGM-HMM and AGGM-HMM, respectively. The aforementioned proposed models are validated using video-based and sensor-based HAR applications, showcasing their superiority over several mixture-based hidden Markov models (HMMs) across various performance metrics.
We demonstrate that the independent incorporation of feature selection and bounded support distribution in a HAR system yields benefits; combining both concepts results in the most effective model among the proposed models.

    Online learning on the programmable dataplane

    This thesis makes the case for managing computer networks with data-driven methods (automated statistical inference and control based on measurement data and runtime observations) and argues for their tight integration with programmable dataplane hardware to make management decisions faster and from more precise data. Optimisation, defence, and measurement of networked infrastructure are each challenging tasks in their own right, which are currently dominated by the use of hand-crafted heuristic methods. These become harder to reason about and deploy as networks scale in rates and number of forwarding elements, but their design requires expert knowledge and care around unexpected protocol interactions. This makes tailored, per-deployment or per-workload solutions infeasible to develop. Recent advances in machine learning offer capable function approximation and closed-loop control which suit many of these tasks. New, programmable dataplane hardware enables more agility in the network: runtime reprogrammability, precise traffic measurement, and low-latency on-path processing. The synthesis of these two developments allows complex decisions to be made on previously unusable state, and made quicker by offloading inference to the network. To justify this argument, I advance the state of the art in data-driven defence of networks, novel dataplane-friendly online reinforcement learning algorithms, and in-network data reduction to allow classification of switch-scale data. Each requires co-design aware of the network, and of the failure modes of systems and carried traffic. To make online learning possible in the dataplane, I use fixed-point arithmetic and modify classical (non-neural) approaches to take advantage of the SmartNIC compute model and make use of rich device-local state.
I show that data-driven solutions still require great care to correctly design, but with the right domain expertise they can improve on pathological cases in DDoS defence, such as protecting legitimate UDP traffic. In-network aggregation to histograms is shown to enable accurate classification from fine temporal effects, and allows hosts to scale such classification to far larger flow counts and traffic volume. Moving reinforcement learning to the dataplane is shown to offer substantial benefits to state-action latency and online learning throughput versus host machines, allowing policies to react faster to fine-grained network events. The dataplane environment is key in making reactive online learning feasible; to port further algorithms and learnt functions, I collate and analyse the strengths of current and future hardware designs, as well as individual algorithms.
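
    To make the fixed-point idea in the abstract above concrete, here is a minimal sketch of a classical temporal-difference value update carried out entirely in integer arithmetic. It is not the thesis's algorithm: the Q16.16 format, the tabular update rule, and the helper names (`to_fixed`, `fx_mul`, `td_update`) are all assumptions chosen for illustration.

```python
FRAC_BITS = 16            # assumed Q16.16 format: 16 fractional bits
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to the nearest fixed-point integer."""
    return int(round(x * ONE))


def to_float(q: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / ONE


def fx_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values; the shift rescales the product."""
    return (a * b) >> FRAC_BITS


def td_update(q_sa: int, reward: int, q_next_max: int,
              alpha: int, gamma: int) -> int:
    """One update: Q <- Q + alpha * (r + gamma * max Q' - Q), integers only."""
    target = reward + fx_mul(gamma, q_next_max)
    return q_sa + fx_mul(alpha, target - q_sa)


new_q = td_update(
    q_sa=to_fixed(0.0),
    reward=to_fixed(1.0),
    q_next_max=to_fixed(0.0),
    alpha=to_fixed(0.5),
    gamma=to_fixed(0.9),
)
```

    Only integer additions, multiplications, and shifts appear in `td_update`, which is the property that makes this style of update a plausible fit for hardware without floating-point units.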