13,420 research outputs found

    OPML: A One-Pass Closed-Form Solution for Online Metric Learning

    To achieve a low computational cost when performing online metric learning on large-scale data, we present a one-pass closed-form solution, named OPML, in this paper. OPML first adopts a one-pass triplet construction strategy, which aims to use only a very small number of triplets to approximate the representation ability of the full set of triplets obtained by batch methods. Then, OPML employs a closed-form solution to update the metric for newly arriving samples, which leads to low space (i.e., O(d)) and time (i.e., O(d^2)) complexity, where d is the feature dimensionality. In addition, an extension of OPML (named COPML) is proposed to enhance robustness when, in practice, the first several samples come from the same class (i.e., the cold start problem). In the experiments, we systematically evaluate our methods (OPML and COPML) on three typical tasks, namely UCI data classification, face verification, and abnormal event detection in videos, in order to cover different sample sizes, different feature dimensionalities, and different feature extraction methods (i.e., hand-crafted and deeply learned). The results show that OPML and COPML achieve promising performance at a very low computational cost. The effectiveness of COPML under the cold start setting is also experimentally verified. Comment: 12 pages
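
The one-pass triplet construction described above can be illustrated with a minimal Python sketch. It assumes (the abstract does not specify this) that only the most recent sample of each class is stored and that each new sample forms at most one triplet from those stored samples. The function name `one_pass_triplets` and the nearest-negative rule are hypothetical choices, and the closed-form metric update itself is not reproduced here.

```python
import numpy as np

def one_pass_triplets(stream):
    """Yield (anchor, positive, negative) triplets in a single pass.

    Assumption (not from the abstract): keep only the most recent sample
    of each class, so memory is one d-dimensional vector per class.
    """
    latest = {}  # class label -> most recent sample of that class
    for x, y in stream:
        x = np.asarray(x, dtype=float)
        positive = latest.get(y)
        negatives = [v for label, v in latest.items() if label != y]
        if positive is not None and negatives:
            # Pick the stored sample of another class closest to the anchor.
            negative = min(negatives, key=lambda v: np.sum((x - v) ** 2))
            yield x, positive, negative
        latest[y] = x  # overwrite: older samples are never revisited

# Toy stream with two classes (illustrative values only).
stream = [([0.0, 0.0], "a"), ([1.0, 1.0], "b"),
          ([0.1, 0.0], "a"), ([0.9, 1.0], "b")]
triplets = list(one_pass_triplets(stream))

# A stream whose first samples all share one class yields no triplets.
cold = list(one_pass_triplets([([0.0, 0.0], "a"), ([0.2, 0.1], "a")]))
```

In the second stream every sample carries the same label, so no triplet can be formed; this is the cold start situation that the abstract says COPML is designed to handle.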

    Regression on fixed-rank positive semidefinite matrices: a Riemannian approach

    The paper addresses the problem of learning a regression model parameterized by a fixed-rank positive semidefinite matrix. The focus is on the nonlinear nature of the search space and on scalability to high-dimensional problems. The mathematical developments rely on the theory of gradient descent algorithms adapted to the Riemannian geometry that underlies the set of fixed-rank positive semidefinite matrices. In contrast with previous contributions in the literature, no restrictions are imposed on the range space of the learned matrix. The resulting algorithms maintain a linear complexity in the problem size and enjoy important invariance properties. We apply the proposed algorithms to the problem of learning a distance function parameterized by a positive semidefinite matrix. Good performance is observed on classical benchmarks.
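
To make the last application concrete, the following Python sketch evaluates a distance parameterized by a rank-r positive semidefinite matrix W. It assumes the factorization W = G G^T with G of size d x r, which is one standard way to represent such a matrix; the abstract does not say which parameterization the paper uses, and the name `lowrank_distance` is hypothetical. Working with G instead of W keeps the cost of one distance evaluation linear in d.

```python
import numpy as np

def lowrank_distance(G, x, y):
    """Squared distance (x - y)^T W (x - y) with W = G @ G.T, G of shape (d, r)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    z = G.T @ diff  # project onto r dimensions: O(d * r) work
    return float(z @ z)

rng = np.random.default_rng(0)
d, r = 50, 3
G = rng.standard_normal((d, r))
x, y = rng.standard_normal(d), rng.standard_normal(d)

W = G @ G.T  # explicit d x d matrix, built here only for comparison
full = float((x - y) @ W @ (x - y))
fast = lowrank_distance(G, x, y)
```

Both expressions give the same value up to rounding, but the factored form never stores or multiplies the full d x d matrix.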

    Steered mixture-of-experts for light field images and video : representation and coding

    Research in light field (LF) processing has increased heavily over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model thus consists of a set of kernels that define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application to 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In the case of 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate for the same quality. At least equally important is the fact that our method inherently offers desired functionality for LF rendering that is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
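
The idea of kernels defining a continuous approximation can be sketched in Python as a soft-gated mixture. The sketch assumes Gaussian kernels with normalized (softmax-style) gating and constant-valued experts; these are simplifications chosen for illustration, not details taken from the abstract, and the function name `smoe_reconstruct` is hypothetical.

```python
import numpy as np

def smoe_reconstruct(coords, centers, covs, values, priors=None):
    """Evaluate sum_k w_k(x) * values[k] at every row x of coords.

    w_k(x) is a Gaussian kernel normalized over all kernels (soft gating).
    Each expert is a constant here, a simplification made for this sketch.
    """
    K = len(centers)
    priors = np.full(K, 1.0 / K) if priors is None else np.asarray(priors, float)
    log_w = np.empty((coords.shape[0], K))
    for k in range(K):
        diff = coords - centers[k]
        prec = np.linalg.inv(covs[k])
        maha = np.einsum("ni,ij,nj->n", diff, prec, diff)
        _, logdet = np.linalg.slogdet(covs[k])
        log_w[:, k] = np.log(priors[k]) - 0.5 * (maha + logdet)
    log_w -= log_w.max(axis=1, keepdims=True)  # stabilize before exponentiating
    w = np.exp(log_w)
    w /= w.sum(axis=1, keepdims=True)  # gates sum to one at each coordinate
    return w @ np.asarray(values, dtype=float)

# Two kernels on a 2-D image domain (illustrative values only).
centers = np.array([[0.25, 0.5], [0.75, 0.5]])
covs = np.array([np.eye(2) * 0.01, np.eye(2) * 0.01])
values = [0.0, 1.0]
pts = np.array([[0.25, 0.5], [0.5, 0.5], [0.75, 0.5]])
out = smoe_reconstruct(pts, centers, covs, values)
```

Because the model is a function of continuous coordinates, each output point is computed independently of the others and can be queried at positions that lie off any pixel grid.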

    Artificial intelligence in steam cracking modeling : a deep learning algorithm for detailed effluent prediction

    Chemical processes can benefit tremendously from fast and accurate effluent composition prediction for plant design, control, and optimization. The Industry 4.0 revolution claims that by introducing machine learning into these fields, substantial economic and environmental gains can be achieved. The bottleneck for high-frequency optimization and process control is often the time necessary to perform the required detailed analyses of, for example, feed and product. To resolve these issues, a framework of four deep learning artificial neural networks (DL ANNs) has been developed for the largest chemicals production process, steam cracking. The proposed methodology allows both a detailed characterization of a naphtha feedstock and a detailed composition of the steam cracker effluent to be determined, based on a limited number of commercial naphtha indices and rapidly accessible process characteristics. The detailed characterization of a naphtha is predicted from three points on the boiling curve and the paraffins, iso-paraffins, olefins, naphthenes, and aromatics (PIONA) characterization. If unavailable, the boiling points are also estimated. Even with estimated boiling points, the developed DL ANN outperforms several established methods such as maximization of Shannon entropy and traditional ANNs. For feedstock reconstruction, a mean absolute error (MAE) of 0.3 wt% is achieved on the test set, while the MAE of the effluent prediction is 0.1 wt%. When all networks are combined, using the output of the previous network as input to the next, the effluent MAE increases to 0.19 wt%. In addition to the high accuracy of the networks, a major benefit is the negligible computational cost required to obtain the predictions. On a standard Intel i7 processor, predictions are made on the order of milliseconds. Commercial software such as COILSIM1D performs slightly better in terms of accuracy, but the required central processing unit time per reaction is on the order of seconds. This tremendous speed-up and minimal accuracy loss make the presented framework highly suitable for the continuous monitoring of difficult-to-access process parameters and for the envisioned high-frequency real-time optimization (RTO) strategy or process control. Nevertheless, the lack of a fundamental basis implies that fundamental understanding is almost completely lost, which is not always well accepted by the engineering community. In addition, the performance of the developed networks drops significantly for naphthas that are highly dissimilar to those in the training set. (C) 2019 THE AUTHORS. Published by Elsevier LTD on behalf of Chinese Academy of Engineering and Higher Education Press Limited Company.
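
The MAE figures quoted above are averages of absolute differences between predicted and measured weight fractions. The short Python sketch below shows that calculation; the numbers are toy values for illustration, not data from the paper, the helper name `mae_wt_percent` is hypothetical, and averaging uniformly over all components is an assumption about how the metric is aggregated.

```python
import numpy as np

def mae_wt_percent(predicted, measured):
    """Mean absolute error between composition vectors given in wt%."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.mean(np.abs(predicted - measured)))

# Toy three-component composition in wt% (illustrative only).
measured = [30.0, 15.0, 55.0]
predicted = [30.2, 14.9, 54.9]
error = mae_wt_percent(predicted, measured)  # (0.2 + 0.1 + 0.1) / 3
```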