28,308 research outputs found

    Detection, tracking and event localization of jet stream features in 4-D atmospheric data

    Get PDF
    We introduce a novel algorithm for the efficient detection and tracking of features in spatiotemporal atmospheric data, as well as for the precise localization of the occurring genesis, lysis, merging and splitting events. The algorithm works on data given on a four-dimensional structured grid. Feature selection and clustering are based on adjustable local and global criteria, feature tracking is predominantly based on spatial overlaps of the feature's full volumes. The resulting 3-D features and the identified correspondences between features of consecutive time steps are represented as the nodes and edges of a directed acyclic graph, the event graph. Merging and splitting events appear in the event graph as nodes with multiple incoming or outgoing edges, respectively. The precise localization of the splitting events is based on a search for all grid points inside the initial 3-D feature that have a similar distance to two successive 3-D features of the next time step. The merging event is localized analogously, operating backward in time. As a first application of our method we present a climatology of upper-tropospheric jet streams and their events, based on four-dimensional wind speed data from European Centre for Medium-Range Weather Forecasts (ECMWF) analyses. We compare our results with a climatology from a previous study, investigate the statistical distribution of the merging and splitting events, and illustrate the meteorological significance of the jet splitting events with a case study. A brief outlook is given on additional potential applications of the 4-D data segmentation technique

    Localized Lasso for High-Dimensional Regression

    Get PDF
    We introduce the localized Lasso, which is suited for learning models that are both interpretable and have a high predictive power in problems with high dimensionality dd and small sample size nn. More specifically, we consider a function defined by local sparse models, one at each data point. We introduce sample-wise network regularization to borrow strength across the models, and sample-wise exclusive group sparsity (a.k.a., â„“1,2\ell_{1,2} norm) to introduce diversity into the choice of feature sets in the local models. The local models are interpretable in terms of similarity of their sparsity patterns. The cost function is convex, and thus has a globally optimal solution. Moreover, we propose a simple yet efficient iterative least-squares based optimization procedure for the localized Lasso, which does not need a tuning parameter, and is guaranteed to converge to a globally optimal solution. The solution is empirically shown to outperform alternatives for both simulated and genomic personalized medicine data

    Spectral Embedding Norm: Looking Deep into the Spectrum of the Graph Laplacian

    Full text link
    The extraction of clusters from a dataset which includes multiple clusters and a significant background component is a non-trivial task of practical importance. In image analysis this manifests for example in anomaly detection and target detection. The traditional spectral clustering algorithm, which relies on the leading KK eigenvectors to detect KK clusters, fails in such cases. In this paper we propose the {\it spectral embedding norm} which sums the squared values of the first II normalized eigenvectors, where II can be significantly larger than KK. We prove that this quantity can be used to separate clusters from the background in unbalanced settings, including extreme cases such as outlier detection. The performance of the algorithm is not sensitive to the choice of II, and we demonstrate its application on synthetic and real-world remote sensing and neuroimaging datasets

    Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

    Full text link
    Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Provable optimal recovery using the algorithm is analytically shown for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat or diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed.Comment: 13 figures, 35 reference
    • …
    corecore