Non-monotone Submodular Maximization with Nearly Optimal Adaptivity and Query Complexity
Submodular maximization is a general optimization problem with a wide range
of applications in machine learning (e.g., active learning, clustering, and
feature selection). In large-scale optimization, the parallel running time of
an algorithm is governed by its adaptivity, which measures the number of
sequential rounds needed if the algorithm can execute polynomially-many
independent oracle queries in parallel. While low adaptivity is ideal, it is
not sufficient for an algorithm to be efficient in practice---there are many
applications of distributed submodular optimization where the number of
function evaluations becomes prohibitively expensive. Motivated by these
applications, we study the adaptivity and query complexity of submodular
maximization. In this paper, we give the first constant-factor approximation
algorithm for maximizing a non-monotone submodular function subject to a
cardinality constraint that runs in $O(\log n)$ adaptive rounds and makes
$O(n \log k)$ oracle queries in expectation. In our empirical study, we use
three real-world applications to compare our algorithm with several benchmarks
for non-monotone submodular maximization. The results demonstrate that our
algorithm finds competitive solutions using significantly fewer rounds and
queries.
Comment: 12 pages, 8 figures
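To make the notions of adaptive rounds and oracle queries concrete, here is a minimal sketch of a sequential baseline in the spirit of random greedy. It is not the algorithm from the paper above. The cut function, the names `cut_value` and `random_greedy`, and the counters are assumptions introduced for illustration. Each iteration issues marginal-gain queries that depend only on the current solution, so it counts as one adaptive round; this baseline therefore uses k rounds and on the order of nk queries.

```python
import random


def cut_value(edges, subset):
    """Graph cut: number of edges with exactly one endpoint in `subset`.

    Cut functions are a standard example of non-monotone submodular functions.
    """
    chosen = set(subset)
    return sum(1 for u, v in edges if (u in chosen) != (v in chosen))


def random_greedy(ground, f, k, seed=0):
    """Pick at most k elements; return (solution, value, rounds, queries)."""
    rng = random.Random(seed)
    solution = []
    current = f(solution)
    rounds, queries = 0, 1  # one query already spent on the empty set
    for _ in range(k):
        candidates = [e for e in ground if e not in solution]
        # One adaptive round: every query below depends only on `solution`,
        # so all of them could be evaluated in parallel.
        gains = [(f(solution + [e]) - current, e) for e in candidates]
        queries += len(candidates)
        rounds += 1
        # Keep the k largest positive gains and pad with zero-gain dummies.
        best = sorted(gains, key=lambda g: g[0], reverse=True)[:k]
        pool = [(g, e) for g, e in best if g > 0]
        pool += [(0, None)] * (k - len(pool))
        gain, element = rng.choice(pool)
        if element is not None:
            solution.append(element)
            current += gain
    return solution, current, rounds, queries


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    print(random_greedy(range(5), lambda s: cut_value(cycle, s), k=2))
```

The counters make the cost visible: lowering the number of rounds below k requires committing to many elements per round, which is the regime the abstract addresses.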
Algorithms for the NJIT turbonet parallel computer
Element selection for arrays, array merging, and sorting are very frequent operations in many of today's important applications. These operations are of interest to scientific applications, as well as to other applications where high-speed database search, merge, and sort operations are necessary and frequent. Therefore, their efficient implementation on parallel computers is a worthwhile objective. Parallel algorithms are presented in this thesis for the implementation of these operations on the NJIT TurboNet system, an in-house built experimental parallel computer with TMS320C40 Digital Signal Processors interconnected in a 3-D hypercube structure. The first algorithm considered is selection. It involves finding the k-th smallest element in an unsorted sequence of n elements, where 1≤k≤n. The second algorithm involves the merging of two sequences sorted in nondecreasing order to form a third sequence, also sorted in nondecreasing order. The third parallel algorithm is sorting. For a given unsorted sequence S of size n, we want to sort the sequence such that s_i' ≤ s_{i+1}' for all 1≤i<n. Performance results show that the robust structure of TurboNet results in significant speedups.
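The selection and merging operations described above can be sketched sequentially as follows. This is only an illustration of the problem definitions, assuming a randomized quickselect for selection and a two-pointer merge; it is not the hypercube-parallel algorithms of the thesis, and the function names `kth_smallest` and `merge_sorted` are hypothetical.

```python
import random


def kth_smallest(seq, k, rng=random):
    """Return the k-th smallest element of seq (1 <= k <= len(seq))."""
    if not 1 <= k <= len(seq):
        raise ValueError("k must satisfy 1 <= k <= n")
    items = list(seq)
    while True:
        pivot = rng.choice(items)
        lower = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k <= len(lower):
            items = lower  # the answer lies among the smaller elements
        elif k <= len(lower) + len(equal):
            return pivot  # the pivot itself is the k-th smallest
        else:
            k -= len(lower) + len(equal)
            items = [x for x in items if x > pivot]


def merge_sorted(a, b):
    """Merge two nondecreasing sequences into one nondecreasing list."""
    merged, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])  # at most one of these two tails is non-empty
    merged.extend(b[j:])
    return merged
```

Sorting can then be expressed as repeated merging of sorted runs, which is one common reason these three operations are studied together.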
Complex Query Operators on Modern Parallel Architectures
Identifying interesting objects from a large data collection is a fundamental problem for multi-criteria decision making applications. In Relational Database Management Systems (RDBMS), the most popular complex query operators used to solve this type of problem are the Top-K selection operator and the Skyline operator. Top-K selection is tasked with retrieving the k highest-ranking tuples from a given relation, as determined by a user-defined aggregation function. Skyline selection retrieves those tuples with attributes offering (Pareto) optimal trade-offs in a given relation. Efficient Top-K query processing entails minimizing tuple evaluations by utilizing elaborate processing schemes combined with sophisticated data structures that enable early termination. Skyline query evaluation involves supporting processing strategies which are geared towards early termination and incomparable tuple pruning. The rapid increase in memory capacity and decreasing costs have been the main drivers behind the development of main-memory database systems. Although the act of migrating query processing in-memory has created many opportunities to improve the associated query latency, attaining such improvements has been very challenging due to the growing gap between processor and main memory speeds. Addressing this limitation has been made easier by the rapid proliferation of multi-core and many-core architectures. However, their utilization in real systems has been hindered by the lack of suitable parallel algorithms that focus on algorithmic efficiency. In this thesis, we study in depth the Top-K and Skyline selection operators, in the context of emerging parallel architectures. Our ultimate goal is to provide practical guidelines for developing work-efficient algorithms suitable for parallel main memory processing. We concentrate on multi-core (CPU), many-core (GPU), and processing-in-memory (PIM) architectures, developing solutions optimized for high throughput and low latency. The first part of this thesis focuses on Top-K selection, presenting the specific details of early termination algorithms that we developed specifically for parallel architectures and various types of accelerators (i.e., GPU and PIM). The second part of this thesis concentrates on Skyline selection and the development of a massively parallel load-balanced algorithm for PIM architectures. Our work consolidates performance results across different parallel architectures using synthetic and real data on variable query parameters and distributions for both of the aforementioned problems. The experimental results demonstrate several orders of magnitude better throughput and query latency, thus validating the effectiveness of our proposed solutions for the Top-K and Skyline selection operators.
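As a reference point for the two operators defined above, the following is a naive sequential sketch. It assumes tuples of numeric attributes where larger values are better, and the names `top_k`, `dominates`, and `skyline` are hypothetical. It performs no early termination or pruning, which is exactly the work the thesis aims to avoid.

```python
import heapq


def top_k(tuples, k, score):
    """Return the k tuples with the highest value of the aggregation function `score`."""
    return heapq.nlargest(k, tuples, key=score)


def dominates(a, b):
    """True if a is at least as good as b in every attribute and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(tuples):
    """Return the tuples that no other tuple dominates (quadratic-time baseline)."""
    return [t for t in tuples if not any(dominates(o, t) for o in tuples)]


if __name__ == "__main__":
    relation = [(1, 5), (3, 3), (5, 1), (2, 2), (4, 0)]
    print(top_k(relation, 2, score=lambda t: 2 * t[0] + t[1]))
    print(skyline(relation))
```

Top-K depends on a single user-supplied scoring function, whereas the skyline is independent of any weighting and keeps every Pareto-optimal tuple.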
Maximum Volume Subset Selection for Anchored Boxes
Let B be a set of n axis-parallel boxes in d dimensions such that each box has a corner at the origin and the other corner in the positive quadrant, and let k be a positive integer. We study the problem of selecting k boxes in B that maximize the volume of the union of the selected boxes. The research is motivated by applications in skyline queries for databases and in multicriteria optimization, where the problem is known as the hypervolume subset selection problem. It is known that the problem can be solved in polynomial time in the plane, while the best known algorithms in any dimension d>2 enumerate all size-k subsets. We show that:
* The problem is NP-hard already in 3 dimensions.
* In 3 dimensions, we break the enumeration of all size-k subsets by providing an n^O(sqrt(k)) algorithm.
* For any constant dimension d, we give an efficient polynomial-time approximation scheme.
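The enumeration baseline mentioned in the abstract can be sketched as below. This is a brute-force illustration only, assuming each box is given by its corner opposite the origin; the names `union_volume` and `best_subset` are hypothetical. The union volume uses inclusion-exclusion, relying on the fact that the intersection of anchored boxes is again an anchored box whose corner is the coordinate-wise minimum.

```python
from itertools import combinations
from math import prod


def union_volume(boxes):
    """Volume of the union of anchored boxes via inclusion-exclusion (2^m terms)."""
    total = 0
    for r in range(1, len(boxes) + 1):
        sign = 1 if r % 2 else -1
        for group in combinations(boxes, r):
            # Intersection of anchored boxes: take the minimum in each coordinate.
            corner = [min(coords) for coords in zip(*group)]
            total += sign * prod(corner)
    return total


def best_subset(boxes, k):
    """Enumerate all size-k subsets and return one with maximum union volume."""
    return max(combinations(boxes, k), key=union_volume)


if __name__ == "__main__":
    boxes = [(1, 4), (2, 2), (4, 1)]
    chosen = best_subset(boxes, 2)
    print(chosen, union_volume(chosen))
```

The enumeration visits every one of the binomial(n, k) subsets, which is the cost the results listed above improve on in 3 dimensions.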
Maximum Volume Subset Selection for Anchored Boxes
Let $B$ be a set of $n$ axis-parallel boxes in $\mathbb{R}^d$ such that each box has a corner at the origin and the other corner in the positive quadrant of $\mathbb{R}^d$, and let $k$ be a positive integer. We study the problem of selecting $k$ boxes in $B$ that maximize the volume of the union of the selected boxes. This research is motivated by applications in skyline queries for databases and in multicriteria optimization, where the problem is known as the hypervolume subset selection problem. It is known that the problem can be solved in polynomial time in the plane, while the best known running time in any dimension $d \ge 3$ is $\Omega\big(\binom{n}{k}\big)$. We show that: - The problem is NP-hard already in 3 dimensions. - In 3 dimensions, we break the bound $\Omega\big(\binom{n}{k}\big)$, by providing an $n^{O(\sqrt{k})}$ algorithm. - For any constant dimension $d$, we present an efficient polynomial-time approximation scheme.
- …