1,316 research outputs found

    Euclidean co-embedding of ordinal data for multi-type visualization

    Get PDF

    Learning multiple maps from conditional ordinal triplets

    Get PDF
    Singapore National Research Foundatio

    Adversarial Unsupervised Representation Learning for Activity Time-Series

    Full text link
    Sufficient physical activity and restful sleep play a major role in the prevention and cure of many chronic conditions. Being able to proactively screen and monitor such chronic conditions would be a big step forward for overall health. The rapid increase in the popularity of wearable devices provides a significant new source, making it possible to track the user's lifestyle real-time. In this paper, we propose a novel unsupervised representation learning technique called activity2vec that learns and "summarizes" the discrete-valued activity time-series. It learns the representations with three components: (i) the co-occurrence and magnitude of the activity levels in a time-segment, (ii) neighboring context of the time-segment, and (iii) promoting subject-invariance with adversarial training. We evaluate our method on four disorder prediction tasks using linear classifiers. Empirical evaluation demonstrates that our proposed method scales and performs better than many strong baselines. The adversarial regime helps improve the generalizability of our representations by promoting subject invariant features. We also show that using the representations at the level of a day works the best since human activity is structured in terms of daily routinesComment: Accepted at AAAI'19. arXiv admin note: text overlap with arXiv:1712.0952

    Euclidean distance geometry and applications

    Full text link
    Euclidean distance geometry is the study of Euclidean geometry based on the concept of distance. This is useful in several applications where the input data consists of an incomplete set of distances, and the output is a set of points in Euclidean space that realizes the given distances. We survey some of the theory of Euclidean distance geometry and some of the most important applications: molecular conformation, localization of sensor networks and statics.Comment: 64 pages, 21 figure

    Greedy routing and virtual coordinates for future networks

    Get PDF
    At the core of the Internet, routers are continuously struggling with ever-growing routing and forwarding tables. Although hardware advances do accommodate such a growth, we anticipate new requirements e.g. in data-oriented networking where each content piece has to be referenced instead of hosts, such that current approaches relying on global information will not be viable anymore, no matter the hardware progress. In this thesis, we investigate greedy routing methods that can achieve similar routing performance as today but use much less resources and which rely on local information only. To this end, we add specially crafted name spaces to the network in which virtual coordinates represent the addressable entities. Our scheme enables participating routers to make forwarding decisions using only neighbourhood information, as the overarching pseudo-geometric name space structure already organizes and incorporates "vicinity" at a global level. A first challenge to the application of greedy routing on virtual coordinates to future networks is that of "routing dead-ends" that are local minima due to the difficulty of consistent coordinates attribution. In this context, we propose a routing recovery scheme based on a multi-resolution embedding of the network in low-dimensional Euclidean spaces. The recovery is performed by routing greedily on a blurrier view of the network. The different network detail-levels are obtained though the embedding of clustering-levels of the graph. When compared with higher-dimensional embeddings of a given network, our method shows a significant diminution of routing failures for similar header and control-state sizes. A second challenge to the application of virtual coordinates and greedy routing to future networks is the support of "customer-provider" as well as "peering" relationships between participants, resulting in a differentiated services environment. Although an application of greedy routing within such a setting would combine two very common fields of today's networking literature, such a scenario has, surprisingly, not been studied so far. In this context we propose two approaches to address this scenario. In a first approach we implement a path-vector protocol similar to that of BGP on top of a greedy embedding of the network. This allows each node to build a spatial map associated with each of its neighbours indicating the accessible regions. Routing is then performed through the use of a decision-tree classifier taking the destination coordinates as input. When applied on a real-world dataset (the CAIDA 2004 AS graph) we demonstrate an up to 40% compression ratio of the routing control information at the network's core as well as a computationally efficient decision process comparable to methods such as binary trees and tries. In a second approach, we take inspiration from consensus-finding in social sciences and transform the three-dimensional distance data structure (where the third dimension encodes the service differentiation) into a two-dimensional matrix on which classical embedding tools can be used. This transformation is achieved by agreeing on a set of constraints on the inter-node distances guaranteeing an administratively-correct greedy routing. The computed distances are also enhanced to encode multipath support. We demonstrate a good greedy routing performance as well as an above 90% satisfaction of multipath constraints when relying on the non-embedded obtained distances on synthetic datasets. As various embeddings of the consensus distances do not fully exploit their multipath potential, the use of compression techniques such as transform coding to approximate the obtained distance allows for better routing performances

    Rigid Transformations for Stabilized Lower Dimensional Space to Support Subsurface Uncertainty Quantification and Interpretation

    Full text link
    Subsurface datasets inherently possess big data characteristics such as vast volume, diverse features, and high sampling speeds, further compounded by the curse of dimensionality from various physical, engineering, and geological inputs. Among the existing dimensionality reduction (DR) methods, nonlinear dimensionality reduction (NDR) methods, especially Metric-multidimensional scaling (MDS), are preferred for subsurface datasets due to their inherent complexity. While MDS retains intrinsic data structure and quantifies uncertainty, its limitations include unstabilized unique solutions invariant to Euclidean transformations and an absence of out-of-sample points (OOSP) extension. To enhance subsurface inferential and machine learning workflows, datasets must be transformed into stable, reduced-dimension representations that accommodate OOSP. Our solution employs rigid transformations for a stabilized Euclidean invariant representation for LDS. By computing an MDS input dissimilarity matrix, and applying rigid transformations on multiple realizations, we ensure transformation invariance and integrate OOSP. This process leverages a convex hull algorithm and incorporates loss function and normalized stress for distortion quantification. We validate our approach with synthetic data, varying distance metrics, and real-world wells from the Duvernay Formation. Results confirm our method's efficacy in achieving consistent LDS representations. Furthermore, our proposed "stress ratio" (SR) metric provides insight into uncertainty, beneficial for model adjustments and inferential analysis. Consequently, our workflow promises enhanced repeatability and comparability in NDR for subsurface energy resource engineering and associated big data workflows.Comment: 30 pages, 17 figures, Submitted to Computational Geosciences Journa
    • …
    corecore