6,487 research outputs found

    Multivariate Analysis of Orthogonal Range Searching and Graph Distances

    We show that the eccentricities, diameter, radius, and Wiener index of an undirected n-vertex graph with nonnegative edge lengths can be computed in time O(n * binom{k+ceil[log n]}{k} * 2^k k^2 log n), where k is the treewidth of the graph. For every epsilon>0, this bound is n^{1+epsilon}exp O(k), which matches a hardness result of Abboud, Vassilevska Williams, and Wang (SODA 2015) and closes an open problem in the multivariate analysis of polynomial-time computation. To this end, we show that the analysis of an algorithm of Cabello and Knauer (Comp. Geom., 2009) in the regime of non-constant treewidth can be improved by revisiting the analysis of orthogonal range searching, improving bounds of the form log^d n to binom{d+ceil[log n]}{d}, as originally observed by Monier (J. Alg. 1980). We also investigate the parameterization by vertex cover number

    Multivariate Analysis of Orthogonal Range Searching and Graph Distances Parameterized by Treewidth

    We show that the eccentricities, diameter, radius, and Wiener index of an undirected $n$-vertex graph with nonnegative edge lengths can be computed in time $O(n\cdot \binom{k+\lceil\log n\rceil}{k} \cdot 2^k k^2 \log n)$, where $k$ is the treewidth of the graph. For every $\epsilon>0$, this bound is $n^{1+\epsilon}\exp O(k)$, which matches a hardness result of Abboud, Vassilevska Williams, and Wang (SODA 2015) and closes an open problem in the multivariate analysis of polynomial-time computation. To this end, we show that the analysis of an algorithm of Cabello and Knauer (Comp. Geom., 2009) in the regime of non-constant treewidth can be improved by revisiting the analysis of orthogonal range searching, improving bounds of the form $\log^d n$ to $\binom{d+\lceil\log n\rceil}{d}$, as originally observed by Monier (J. Alg. 1980). We also investigate the parameterization by vertex cover number
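To get a feel for the improvement from $\log^d n$ to $\binom{d+\lceil\log n\rceil}{d}$ mentioned in the abstract above, the two quantities can be evaluated side by side. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, and base-2 logarithms are an assumption made for illustration.

```python
from math import comb

def ceil_log2(n: int) -> int:
    """Exact ceil(log2(n)) for a positive integer n (base 2 is an assumption)."""
    return (n - 1).bit_length()

def binomial_bound(n: int, d: int) -> int:
    """The refined factor binom(d + ceil(log n), d)."""
    return comb(d + ceil_log2(n), d)

def log_power_bound(n: int, d: int) -> int:
    """The classical factor ceil(log n)^d."""
    return ceil_log2(n) ** d

if __name__ == "__main__":
    n = 2 ** 20
    for d in (3, 10, 20):
        print(d, binomial_bound(n, d), log_power_bound(n, d))
```

For small $d$ the two factors are of similar size, while for $d$ comparable to $\log n$ the binomial factor is far smaller, which is the regime of non-constant treewidth the abstract refers to.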

    Analysis of approximate nearest neighbor searching with clustered point sets

    We present an empirical analysis of data structures for approximate nearest neighbor searching. We compare the well-known optimized kd-tree splitting method against two alternative splitting methods. The first, called the sliding-midpoint method, attempts to balance the goals of producing subdivision cells of bounded aspect ratio and not producing any empty cells. The second, called the minimum-ambiguity method, is a query-based approach. In addition to the data points, it is also given a training set of query points for preprocessing. It employs a simple greedy algorithm to select the splitting plane that minimizes the average amount of ambiguity in the choice of the nearest neighbor for the training points. We provide an empirical analysis comparing these two methods against the optimized kd-tree construction for a number of synthetically generated data and query sets. We demonstrate that for clustered data and query sets, these algorithms can provide significant improvements over the standard kd-tree construction for approximate nearest neighbor searching. Comment: 20 pages, 8 figures. Presented at ALENEX '99, Baltimore, MD, Jan 15-16, 199
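The abstract above does not spell out the sliding-midpoint rule, so the following is only a simplified sketch of how such a split is commonly described: cut the cell at the midpoint of its longest side, and if every point falls on one side, slide the cut toward the points until it touches the nearest one. The function name, the tie-breaking rule, and the tuple-based point representation are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]

def sliding_midpoint_split(
    points: List[Point], lo: Sequence[float], hi: Sequence[float]
) -> Tuple[int, float, List[Point], List[Point]]:
    """Split the cell [lo, hi] containing `points` (assumed non-empty).

    Returns (dimension, cut value, left points, right points).
    """
    # Cut orthogonally to the longest side of the cell to keep aspect ratios bounded.
    dim = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    cut = (lo[dim] + hi[dim]) / 2
    coords = [p[dim] for p in points]

    if min(coords) >= cut:
        # Every point is on the upper side: slide the cut up to the nearest point,
        # which then goes to the lower cell so that cell is not empty.
        cut = min(coords)
        left = [p for p in points if p[dim] <= cut]
        right = [p for p in points if p[dim] > cut]
    elif max(coords) < cut:
        # Every point is on the lower side: slide the cut down to the nearest point,
        # which then goes to the upper cell.
        cut = max(coords)
        left = [p for p in points if p[dim] < cut]
        right = [p for p in points if p[dim] >= cut]
    else:
        # The midpoint already separates the points.
        left = [p for p in points if p[dim] < cut]
        right = [p for p in points if p[dim] >= cut]
    # Note: if all points share the same coordinate in `dim`, `right` is still empty.
    return dim, cut, left, right

if __name__ == "__main__":
    pts = [(0.9, 0.1), (0.8, 0.2), (0.95, 0.15)]
    print(sliding_midpoint_split(pts, (0.0, 0.0), (1.0, 0.5)))
```

In the example, the midpoint cut at 0.5 would leave the lower cell empty, so the cut slides to 0.8 and the point at that coordinate is assigned to the lower cell.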

    On the Procrustean analogue of individual differences scaling (INDSCAL)

    In this paper, individual differences scaling (INDSCAL) is revisited, considering INDSCAL as being embedded within a hierarchy of individual difference scaling models. We explore the members of this family, distinguishing (i) models, (ii) the role of identification and substantive constraints, (iii) criteria for fitting models and (iv) algorithms to optimise the criteria. Model formulations may be based either on data that are in the form of proximities or on configurational matrices. In its configurational version, individual difference scaling may be formulated as a form of generalized Procrustes analysis. Algorithms are introduced for fitting the new models. An application from sensory evaluation illustrates the performance of the methods and their solutions

    Revisiting Guerry's data: Introducing spatial constraints in multivariate analysis

    Standard multivariate analysis methods aim to identify and summarize the main structures in large data sets containing the description of a number of observations by several variables. In many cases, spatial information is also available for each observation, so that a map can be associated with the multivariate data set. Two main objectives are relevant in the analysis of spatial multivariate data: summarizing covariation structures and identifying spatial patterns. In practice, achieving both goals simultaneously is a statistical challenge, and a range of methods have been developed that offer trade-offs between these two objectives. In an applied context, this methodological question has been and remains a major issue in community ecology, where species assemblages (i.e., covariation between species abundances) are often driven by spatial processes (and thus exhibit spatial patterns). In this paper we review a variety of methods developed in community ecology to investigate multivariate spatial patterns. We present different ways of incorporating spatial constraints in multivariate analysis and illustrate these different approaches using the famous data set on moral statistics in France published by André-Michel Guerry in 1833. We discuss and compare the properties of these different approaches from both a practical and a theoretical viewpoint. Comment: Published at http://dx.doi.org/10.1214/10-AOAS356 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Descriptive methods of data analysis for marketing data – theoretical and practical considerations

    The main objective of marketing is to guide a firm's activities according to consumers' current and future needs. This necessarily assumes the existence of a suitable information system, as well as knowledge of modern methods for analyzing, processing and interpreting the complex information found in the field of marketing. The descriptive methods of data analysis are strong and effective multidimensional analysis tools from which important information can be obtained for market research. The paper presents and compares some of these methods, namely factor analysis, principal component analysis, correspondence analysis and canonical analysis. Keywords: factor analysis, marketing, descriptive methods.

    Multivariate Approaches to Classification in Extragalactic Astronomy

    Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is no exception and is now facing a deluge of data. For galaxies, the century-old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We emphasize the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies. Comment: Open Access paper. http://www.frontiersin.org/milky_way_and_galaxies/10.3389/fspas.2015.00003/abstract. 10.3389/fspas.2015.00003

    P?=NP as minimization of degree 4 polynomial, integration or Grassmann number problem, and new graph isomorphism problem approaches

    While the P vs NP problem is mainly approached from the point of view of discrete mathematics, this paper proposes reformulations into the fields of abstract algebra, geometry, Fourier analysis and continuous global optimization, whose advanced tools might bring new perspectives and approaches to this question. The first one is the equivalence of satisfying a 3-SAT problem with the question of reaching zero of a nonnegative degree 4 multivariate polynomial (sum of squares), which could be tested from the perspective of algebra by using the discriminant. It could also be approached as a continuous global optimization problem inside $[0,1]^n$, for example in physical realizations like adiabatic quantum computers. However, the number of local minima usually grows exponentially. Reducing to a degree 2 polynomial plus constraints of being in $\{0,1\}^n$, we get geometric formulations as the question of whether a plane or sphere intersects with $\{0,1\}^n$. Some non-standard perspectives on Subset-Sum are also presented, such as through convergence of a series, or zeroing of the Fourier-type integral $\int_0^{2\pi} \prod_i \cos(\varphi k_i)\, d\varphi$ for some natural $k_i$. The last discussed approach uses anti-commuting Grassmann numbers $\theta_i$, making $(A \cdot \textrm{diag}(\theta_i))^n$ nonzero only if $A$ has a Hamilton cycle. Hence, the P$\ne$NP assumption implies exponential growth of the matrix representation of Grassmann numbers. A promising-looking algebraic/geometric approach to the graph isomorphism problem is also discussed; it has been tested to successfully distinguish strongly regular graphs with up to 29 vertices. Comment: 19 pages, 8 figure
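The Fourier-type integral in the abstract above can be checked numerically on small inputs. Expanding each cosine as $(e^{i\varphi k_i} + e^{-i\varphi k_i})/2$ gives $\int_0^{2\pi} \prod_{i=1}^{n} \cos(\varphi k_i)\, d\varphi = 2\pi \cdot 2^{-n} \cdot \#\{s \in \{\pm 1\}^n : \sum_i s_i k_i = 0\}$, so the integral vanishes exactly when no signed sum of the $k_i$ is zero. The sketch below is an illustration under assumptions: the function names are hypothetical, and reading the criterion as a signed-sum (partition-style) variant of Subset-Sum is our interpretation, not a statement from the paper.

```python
from itertools import product
from math import cos, pi
from typing import Sequence

def count_zero_sign_sums(ks: Sequence[int]) -> int:
    """Brute-force count of sign choices s with sum(s_i * k_i) == 0."""
    return sum(
        1
        for signs in product((1, -1), repeat=len(ks))
        if sum(s * k for s, k in zip(signs, ks)) == 0
    )

def cosine_integral(ks: Sequence[int]) -> float:
    """Integral of prod_i cos(phi * k_i) over [0, 2*pi).

    A uniform Riemann sum with more sample points than sum(ks) is exact
    (up to rounding) for this trigonometric polynomial.
    """
    samples = 2 * sum(ks) + 1
    h = 2 * pi / samples
    total = 0.0
    for j in range(samples):
        term = 1.0
        for k in ks:
            term *= cos(j * h * k)
        total += term
    return total * h

if __name__ == "__main__":
    for ks in ([1, 2, 3], [1, 2, 4]):
        predicted = 2 * pi * count_zero_sign_sums(ks) / 2 ** len(ks)
        print(ks, cosine_integral(ks), predicted)
```

The brute-force count is exponential in the number of terms and serves only as a cross-check of the identity on toy instances.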