1,750 research outputs found
On Finding Maximum Cardinality Subset of Vectors with a Constraint on Normalized Squared Length of Vectors Sum
In this paper, we consider the problem of finding a maximum cardinality
subset of vectors, given a constraint on the normalized squared length of
vectors sum. This problem is closely related to Problem 1 from (Eremeev,
Kel'manov, Pyatkin, 2016). The main difference consists in swapping the
constraint with the optimization criterion.
We prove that the problem is NP-hard even in terms of finding a feasible
solution. An exact algorithm for solving this problem is proposed. The
algorithm has a pseudo-polynomial time complexity in the special case of the
problem, where the dimension of the space is bounded from above by a constant
and the input data are integer. A computational experiment is carried out,
where the proposed algorithm is compared to COINBONMIN solver, applied to a
mixed integer quadratic programming formulation of the problem. The results of
the experiment indicate superiority of the proposed algorithm when the
dimension of Euclidean space is low, while the COINBONMIN has an advantage for
larger dimensions.Comment: To appear in Proceedings of the 6th International Conference on
Analysis of Images, Social Networks, and Texts (AIST'2017
On Characterizing the Data Movement Complexity of Computational DAGs for Parallel Execution
Technology trends are making the cost of data movement increasingly dominant,
both in terms of energy and time, over the cost of performing arithmetic
operations in computer systems. The fundamental ratio of aggregate data
movement bandwidth to the total computational power (also referred to the
machine balance parameter) in parallel computer systems is decreasing. It is
there- fore of considerable importance to characterize the inherent data
movement requirements of parallel algorithms, so that the minimal architectural
balance parameters required to support it on future systems can be well
understood. In this paper, we develop an extension of the well-known red-blue
pebble game to develop lower bounds on the data movement complexity for the
parallel execution of computational directed acyclic graphs (CDAGs) on parallel
systems. We model multi-node multi-core parallel systems, with the total
physical memory distributed across the nodes (that are connected through some
interconnection network) and in a multi-level shared cache hierarchy for
processors within a node. We also develop new techniques for lower bound
characterization of non-homogeneous CDAGs. We demonstrate the use of the
methodology by analyzing the CDAGs of several numerical algorithms, to develop
lower bounds on data movement for their parallel execution
Recommended from our members
A Haystack Heuristic for Autoimmune Disease Biomarker Discovery Using Next-Gen Immune Repertoire Sequencing Data.
Large-scale DNA sequencing of immunological repertoires offers an opportunity for the discovery of novel biomarkers for autoimmune disease. Available bioinformatics techniques however, are not adequately suited for elucidating possible biomarker candidates from within large immunosequencing datasets due to unsatisfactory scalability and sensitivity. Here, we present the Haystack Heuristic, an algorithm customized to computationally extract disease-associated motifs from next-generation-sequenced repertoires by contrasting disease and healthy subjects. This technique employs a local-search graph-theory approach to discover novel motifs in patient data. We apply the Haystack Heuristic to nine million B-cell receptor sequences obtained from nearly 100 individuals in order to elucidate a new motif that is significantly associated with multiple sclerosis. Our results demonstrate the effectiveness of the Haystack Heuristic in computing possible biomarker candidates from high throughput sequencing data and could be generalized to other datasets
Building Clusters with Lower-Bounded Sizes
Classical clustering problems search for a partition of objects into a fixed number of clusters. In many scenarios however the number of clusters is not known or necessarily fixed. Further, clusters are sometimes only considered to be of significance if they have a certain size. We discuss clustering into sets of minimum cardinality k without a fixed number of sets and present a general model for these types of problems. This general framework allows the comparison of different measures to assess the quality of a clustering. We specifically consider nine quality-measures and classify the complexity of the resulting problems with respect to k. Further, we derive some polynomial-time solvable cases for k = 2 with connections to matching-type problems which, among other graph problems, then are used to compute approximations for larger values of k
Robust Branch-Cut-and-Price for the Capacitated Minimum Spanning Tree Problem over a Large Extended Formulation
This paper presents a robust branch-cut-and-price algorithm for the Capacitated Minimum Spanning Tree Problem (CMST). The variables are associated to q-arbs, a structure that arises from a relaxation of the capacitated prize-collecting arbores- cence problem in order to make it solvable in pseudo-polynomial time. Traditional inequalities over the arc formulation, like Capacity Cuts, are also used. Moreover, a novel feature is introduced in such kind of algorithms. Powerful new cuts expressed over a very large set of variables could be added, without increasing the complexity of the pricing subproblem or the size of the LPs that are actually solved. Computational results on benchmark instances from the OR-Library show very signi¯cant improvements over previous algorithms. Several open instances could be solved to optimalityNo keywords;
- …