5 research outputs found

    Sparse p-Adic Data Coding for Computationally Efficient and Effective Big Data Analytics

    We develop the theory and practical implementation of p-adic sparse coding of data. Rather than the standard sparsifying criterion that uses the L0 pseudo-norm, we use the p-adic norm. We require that the hierarchy or tree be node-ranked, as is standard practice in agglomerative and other hierarchical clustering, but not necessarily with decision trees. In order to structure the data, all computational processing operations are direct readings of the data, or are bounded by a constant number of direct readings of the data, implying linear computational time. Through p-adic sparse data coding, efficient storage results, and for bounded p-adic norm stored data, search and retrieval are constant time operations. Examples show the effectiveness of this new approach to content-driven encoding and displaying of data.
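    For readers unfamiliar with the p-adic norm mentioned in this abstract, the following is a minimal sketch of the standard definition on rational numbers: |x|_p = p^(-v), where v is the exponent of the prime p in x. It does not reproduce the authors' coding scheme, which the abstract does not specify; the function names `p_adic_valuation` and `p_adic_norm` are illustrative assumptions.

```python
from fractions import Fraction


def p_adic_valuation(x, p):
    """Return v_p(x): the exponent of the prime p in the nonzero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("v_p(0) is +infinity by convention")
    num, den = x.numerator, x.denominator
    v = 0
    while num % p == 0:  # factors of p in the numerator raise the valuation
        num //= p
        v += 1
    while den % p == 0:  # factors of p in the denominator lower it
        den //= p
        v -= 1
    return v


def p_adic_norm(x, p):
    """Return |x|_p = p**(-v_p(x)) as an exact Fraction, with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-p_adic_valuation(x, p))
```

    Numbers divisible by high powers of p have a small p-adic norm, so `p_adic_norm(8, 2)` is 1/8 while `p_adic_norm(Fraction(3, 4), 2)` is 4. The norm also satisfies the strong triangle (ultrametric) inequality |x + y|_p <= max(|x|_p, |y|_p), which is what ties it to hierarchies and trees.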

    CUDA and OpenMp Implementation of Boolean Matrix Product with Applications in Visual SLAM

    In this paper, the concept of ultrametric structure is intertwined with the SLAM procedure. A set of pre-existing transformations has been used to create a new simultaneous localization and mapping (SLAM) algorithm. We have developed two new parallel algorithms that implement the time-consuming Boolean transformations of the space dissimilarity matrix. The resulting matrix is an important input to the vector quantization (VQ) step in SLAM processes. These algorithms, written in Compute Unified Device Architecture (CUDA) and Open Multi-Processing (OpenMP) pseudocode, make the Boolean transformation computationally feasible on a real-world-size dataset. We expect our newly introduced SLAM algorithm, ultrametric Fast Appearance Based Mapping (FABMAP), to outperform regular FABMAP2 since ultrametric spaces are more clusterable than regular Euclidean spaces. Another aim of the presented research is the development of a novel measure of ultrametricity, along with the creation of an Ultrametric-PAM clustering algorithm. Since current measures have computational time complexity of order O(n^3), a new measure with lower time complexity, O(n^2), is potentially significant.
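    Two ideas in this abstract can be made concrete with a short serial sketch: the Boolean matrix product, and a naive triplet-based ultrametricity check that shows where an O(n^3) cost comes from. Neither function is the paper's CUDA or OpenMP algorithm or its proposed O(n^2) measure, which the abstract does not describe; the names `boolean_matrix_product` and `ultrametric_triplet_fraction` are assumptions made for illustration.

```python
from itertools import combinations


def boolean_matrix_product(a, b):
    """Boolean product: c[i][j] is True if a[i][k] and b[k][j] for some k."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [any(a[i][k] and b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


def ultrametric_triplet_fraction(d, tol=1e-12):
    """Fraction of point triplets whose two largest pairwise distances agree.

    In an ultrametric space every triangle is isosceles with a small base,
    so the fraction is 1.0. Visiting all triplets costs O(n^3).
    """
    total = ok = 0
    for i, j, k in combinations(range(len(d)), 3):
        sides = sorted((d[i][j], d[i][k], d[j][k]))
        total += 1
        if sides[2] - sides[1] <= tol:
            ok += 1
    return ok / total if total else 1.0
```

    The innermost loop of the Boolean product is independent for each output cell (i, j), which is why this operation is a natural fit for the kind of parallel implementation the abstract mentions.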

    Random projection ensemble classification

    We introduce a very general method for high-dimensional classification, based on careful combination of the results of applying an arbitrary base classifier to random projections of the feature vectors into a lower-dimensional space. In one special case that we study in detail, the random projections are divided into disjoint groups, and within each group we select the projection yielding the smallest estimate of the test error. Our random projection ensemble classifier then aggregates the results of applying the base classifier on the selected projections, with a data-driven voting threshold to determine the final assignment. Our theoretical results elucidate the effect on performance of increasing the number of projections. Moreover, under a boundary condition implied by the sufficient dimension reduction assumption, we show that the test excess risk of the random projection ensemble classifier can be controlled by terms that do not depend on the original data dimension and a term that becomes negligible as the number of projections increases. The classifier is also compared empirically with several other popular high-dimensional classifiers via an extensive simulation study, which reveals its excellent finite-sample performance. Both authors are supported by an Engineering and Physical Sciences Research Council Fellowship EP/J017213/1; the second author is also supported by a Philip Leverhulme prize.
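    The group-select-and-vote procedure described in this abstract can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the base classifier is a nearest-centroid rule for labels 0/1, projections are Gaussian matrices, the test error is estimated on a random held-out half of the training data, and the voting threshold is a fixed parameter rather than data-driven. All function and parameter names are hypothetical.

```python
import random


def project(A, x):
    """Apply the d-by-p matrix A (a list of rows) to the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def fit_centroids(X, y):
    """Nearest-centroid base classifier: one mean vector per class (0 and 1)."""
    cents = {}
    for label in (0, 1):
        pts = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(col) / len(pts) for col in zip(*pts)]
    return cents


def predict_centroid(cents, x):
    """Return the label whose centroid is closest to x."""
    dist = {c: sum((a - b) ** 2 for a, b in zip(m, x)) for c, m in cents.items()}
    return min(dist, key=dist.get)


def rp_ensemble_fit(X, y, d=2, n_groups=10, group_size=5, seed=0):
    """Keep, from each group of random projections, the one with lowest held-out error."""
    rng = random.Random(seed)
    p = len(X[0])
    idx = list(range(len(X)))
    rng.shuffle(idx)
    half = len(idx) // 2
    fit_idx, val_idx = idx[:half], idx[half:]  # assumption: simple holdout split
    selected = []
    for _ in range(n_groups):
        best = None
        for _ in range(group_size):
            A = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(d)]
            Z = [project(A, x) for x in X]
            cents = fit_centroids([Z[i] for i in fit_idx], [y[i] for i in fit_idx])
            err = sum(predict_centroid(cents, Z[i]) != y[i] for i in val_idx) / len(val_idx)
            if best is None or err < best[0]:
                best = (err, A, cents)
        selected.append(best[1:])  # store (projection, fitted base classifier)
    return selected


def rp_ensemble_predict(model, x, threshold=0.5):
    """Assign class 1 when the share of class-1 votes reaches the threshold."""
    votes = [predict_centroid(cents, project(A, x)) for A, cents in model]
    return int(sum(votes) / len(votes) >= threshold)
```

    A call such as `model = rp_ensemble_fit(X, y)` followed by `rp_ensemble_predict(model, x_new)` returns a 0/1 label. Lowering `threshold` below 0.5 favours class 1, which is the knob the abstract says is chosen from the data.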