VIPER: a 25-MHz, 100-MIPS peak VLIW microprocessor
This paper describes the design and implementation of a very long instruction word (VLIW) microprocessor. The VIPER (VLIW integer processor) contains four pipelined functional units and can achieve 100 MIPS peak performance at 25 MHz. The processor is capable of performing multiway branch operations, two load/store operations, and up to four ALU operations in each clock cycle, with full register-file access for each functional unit. VIPER is the first known VLIW microprocessor to achieve this level of performance. Designed in twelve months, the processor is integrated with an instruction cache controller and a data cache, requiring 450,000 transistors and a die size of 12.9 by 9.1 mm in a 1.2 µm technology.
High-Quality Hypergraph Partitioning
This dissertation focuses on computing high-quality solutions for the NP-hard balanced hypergraph partitioning problem: Given a hypergraph and an integer k, partition its vertex set into k disjoint blocks of bounded size, while minimizing an objective function over the hyperedges. Here, we consider the two most commonly used objectives: the cut-net metric and the connectivity metric.
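For concreteness, both objectives can be stated in a few lines of Python. The representation below (a list of pin lists plus a vertex-to-block map) is an illustrative assumption for this sketch, not the data structure used in the thesis.

```python
# Minimal sketch: cut-net and connectivity (lambda - 1) objectives for a
# given k-way partition of a hypergraph. Unit net weights are assumed.

def cut_net(hyperedges, block_of):
    """Number of hyperedges spanning more than one block."""
    return sum(1 for net in hyperedges
               if len({block_of[v] for v in net}) > 1)

def connectivity(hyperedges, block_of):
    """Sum over hyperedges of (number of blocks spanned - 1)."""
    return sum(len({block_of[v] for v in net}) - 1
               for net in hyperedges)

# Tiny example: 4 vertices in k = 2 blocks, 3 hyperedges.
block_of = {0: 0, 1: 0, 2: 1, 3: 1}
hyperedges = [[0, 1], [1, 2], [0, 2, 3]]
print(cut_net(hyperedges, block_of))       # 2 nets are cut
print(connectivity(hyperedges, block_of))  # 0 + 1 + 1 = 2
```

Under the cut-net metric every cut net costs the same regardless of how many blocks it touches; the connectivity metric additionally penalizes nets that are fragmented across many blocks.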
Since the problem is computationally intractable, heuristics are used in practice - the most prominent being the three-phase multi-level paradigm: During coarsening, the hypergraph is successively contracted to obtain a hierarchy of smaller instances. After applying an initial partitioning algorithm to the smallest hypergraph, contraction is undone and, at each level, refinement algorithms try to improve the current solution.
With this work, we give a brief overview of the field and present several algorithmic improvements to the multi-level paradigm. Instead of using a logarithmic number of levels like traditional algorithms, we present two coarsening algorithms that create a hierarchy of (nearly) n levels, where n is the number of vertices. This makes consecutive levels as similar as possible and provides many opportunities for refinement algorithms to improve the partition. This approach is made feasible in practice by tailoring all algorithms and data structures to the n-level paradigm, and developing lazy-evaluation techniques, caching mechanisms and early stopping criteria to speed up the partitioning process. Furthermore, we propose a sparsification algorithm based on locality-sensitive hashing that improves the running time for hypergraphs with large hyperedges, and show that incorporating global information about the community structure into the coarsening process improves quality. Moreover, we present a portfolio-based initial partitioning approach, and propose three refinement algorithms. Two are based on the Fiduccia-Mattheyses (FM) heuristic, but perform a highly localized search at each level. While one is designed for two-way partitioning, the other is the first FM-style algorithm that can be efficiently employed in the multi-level setting to directly improve k-way partitions. The third algorithm uses max-flow computations on pairs of blocks to refine k-way partitions. Finally, we present the first memetic multi-level hypergraph partitioning algorithm for an extensive exploration of the global solution space.
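To illustrate the locality-sensitive hashing idea behind the sparsification step, the following minimal sketch groups vertices whose sets of incident nets have identical minhash signatures; the signature length and grouping rule are arbitrary choices for this example, not the parameters used in KaHyPar.

```python
import random

# Hedged sketch: vertices whose incident-net sets hash to the same
# minhash signature are candidates for merging during sparsification.

def minhash_signature(net_ids, hash_seeds):
    return tuple(min((seed ^ hash(n)) & 0xFFFFFFFF for n in net_ids)
                 for seed in hash_seeds)

def lsh_buckets(incident_nets, num_hashes=4, seed=42):
    rng = random.Random(seed)
    seeds = [rng.getrandbits(32) for _ in range(num_hashes)]
    buckets = {}
    for v, nets in incident_nets.items():
        buckets.setdefault(minhash_signature(nets, seeds), []).append(v)
    return buckets

# Vertices 0 and 1 share the same incident nets, so they share a bucket.
incident = {0: {10, 11}, 1: {10, 11}, 2: {12}}
print(list(lsh_buckets(incident).values()))  # [[0, 1], [2]]
```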
All contributions are made available through our open-source framework KaHyPar. In a comprehensive experimental study, we compare KaHyPar with hMETIS, PaToH, Mondriaan, Zoltan-AlgD, and HYPE on a wide range of hypergraphs from several application areas. Our results indicate that KaHyPar, already without the memetic component, computes better solutions than all competing algorithms for both the cut-net and the connectivity metric, while being faster than Zoltan-AlgD and about as fast as hMETIS. Moreover, KaHyPar compares favorably with the current best graph partitioning system KaFFPa, both in terms of solution quality and running time.
Scalable High-Quality Graph and Hypergraph Partitioning
The balanced hypergraph partitioning problem (HGP) asks for a partition of the node set
of a hypergraph into blocks of roughly equal size, such that an objective function defined
on the hyperedges is minimized. In this work, we optimize the connectivity metric,
which is the most prominent objective function for HGP.
The hypergraph partitioning problem is NP-hard and there exists no constant factor approximation.
Thus, heuristic algorithms are used in practice, with the multilevel scheme being
the most successful approach to solving the problem: First, the input hypergraph is coarsened to
obtain a hierarchy of successively smaller and structurally similar approximations.
The smallest hypergraph is then initially partitioned into blocks, and subsequently,
the contractions are reverted level-by-level, and, on each level, local search algorithms are used
to improve the partition (refinement phase).
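To make the three phases concrete, here is a deliberately simplified, runnable toy bipartitioner on a plain graph (an adjacency dict). The matching-based coarsening, the arbitrary base-case split, and the single label-propagation pass are didactic stand-ins for the far more sophisticated components discussed below; this is a sketch of the scheme, not of Mt-KaHyPar.

```python
# Toy multilevel bipartitioner illustrating coarsen / initial partition /
# uncoarsen-and-refine. All thresholds are arbitrary.

def coarsen_once(adj):
    """Contract a matching; return the coarse graph and a node mapping."""
    mapping, used, coarse_id = {}, set(), 0
    for u in adj:
        if u in used:
            continue
        partner = next((v for v in adj[u] if v not in used and v != u), None)
        used.add(u)
        mapping[u] = coarse_id
        if partner is not None:
            used.add(partner)
            mapping[partner] = coarse_id
        coarse_id += 1
    coarse = {c: set() for c in range(coarse_id)}
    for u, nbrs in adj.items():
        for v in nbrs:
            if mapping[u] != mapping[v]:
                coarse[mapping[u]].add(mapping[v])
    return coarse, mapping

def refine(adj, block_of):
    """One label-propagation pass: flip a node if that reduces the cut."""
    for u in adj:
        same = sum(1 for v in adj[u] if block_of[v] == block_of[u])
        if len(adj[u]) - same > same:
            block_of[u] ^= 1
    return block_of

def split_in_half(adj):
    nodes = sorted(adj)
    return {u: int(i >= len(nodes) // 2) for i, u in enumerate(nodes)}

def multilevel_bipartition(adj):
    if len(adj) <= 4:                       # small enough: initial partition
        return split_in_half(adj)
    coarse, mapping = coarsen_once(adj)
    if len(coarse) == len(adj):             # no contraction possible
        return split_in_half(adj)
    coarse_part = multilevel_bipartition(coarse)
    block_of = {u: coarse_part[mapping[u]] for u in adj}  # project back
    return refine(adj, block_of)

# Two triangles joined by a single edge: the cut separates the triangles.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(multilevel_bipartition(adj))  # e.g. {0: 0, 1: 0, 2: 0, 3: 1, ...}
```

Note that the toy refinement pass does not even enforce the balance constraint; the refinement algorithms described next do considerably more work.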
In recent years, several new techniques were developed for sequential multilevel partitioning
that substantially improved solution quality at the cost of an increased running time.
These developments divide the landscape of existing partitioning algorithms into systems that
aim either for speed or for high solution quality, with the former often being more than an order of magnitude faster
than the latter. Due to the high running times of the best sequential algorithms, it is currently not
feasible to partition the largest real-world hypergraphs with the highest possible quality.
Thus, it becomes increasingly important to parallelize the techniques used in these algorithms.
However, existing state-of-the-art parallel partitioners currently do not achieve the same solution
quality as their sequential counterparts because they use comparatively weak components that are easier to parallelize.
Moreover, there has been a recent trend toward simpler methods for partitioning large hypergraphs
that even omit the multilevel scheme.
In contrast to this development, we present two shared-memory multilevel hypergraph partitioners
with parallel implementations of techniques used by the highest-quality sequential systems.
Our first multilevel algorithm uses a parallel clustering-based coarsening scheme,
which requires substantially fewer locking mechanisms than previous approaches.
The contraction decisions are guided by the community structure of the input hypergraph
obtained via a parallel community detection algorithm.
For initial partitioning, we implement parallel multilevel recursive bipartitioning with a
novel work-stealing approach and a portfolio of initial bipartitioning techniques to
compute an initial solution. In the refinement phase, we use three different parallel improvement
algorithms: label propagation refinement, a highly-localized direct k-way
FM algorithm, and a novel parallelization of flow-based refinement.
These algorithms build on our highly-engineered partition data structure, for which we propose
several novel techniques to compute accurate gain values of node moves in the parallel setting.
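The sequential gain formula for the connectivity objective is easy to state; the pin-count bookkeeping below is an illustrative layout, and the thread-safe variants developed in the thesis are considerably more involved.

```python
# Hedged sketch: gain of moving node u to block `target` under the
# connectivity (lambda - 1) objective. pin_count[net][b] holds the
# number of pins of `net` currently placed in block b.

def move_gain(u, target, block_of, incident_nets, pin_count, net_weight):
    source, gain = block_of[u], 0
    for net in incident_nets[u]:
        # Net leaves the source block entirely: connectivity drops by 1.
        if pin_count[net].get(source, 0) == 1:
            gain += net_weight[net]
        # Net enters the target block for the first time: connectivity +1.
        if pin_count[net].get(target, 0) == 0:
            gain -= net_weight[net]
    return gain

# Net 7 = {u, v} with u in block 0 and v in block 1: moving u to block 1
# makes the net internal, so the gain is +1.
block_of = {'u': 0, 'v': 1}
incident_nets = {'u': [7]}
pin_count = {7: {0: 1, 1: 1}}
net_weight = {7: 1}
print(move_gain('u', 1, block_of, incident_nets, pin_count, net_weight))  # 1
```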
Our second multilevel algorithm parallelizes the n-level partitioning scheme used in
the highest-quality sequential partitioner KaHyPar. Here, only a single node
is contracted on each level, leading to a hierarchy with approximately n levels, where
n is the number of nodes. Correspondingly, in each refinement step, only a single node is uncontracted, allowing
a highly-localized search for improvements.
We show that this approach, which seems inherently sequential, can be parallelized efficiently without compromises in solution quality.
To this end, we design a forest-based representation of contractions from which we derive a feasible parallel
schedule of the contraction operations that we apply on a novel dynamic hypergraph data structure on-the-fly.
In the uncoarsening phase, we decompose the contraction forest into batches, each containing
a fixed number of nodes. We then uncontract each batch in parallel and use highly-localized
versions of our refinement algorithms to improve the partition around the uncontracted nodes.
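A minimal sketch of the batch idea follows, assuming contractions are recorded as a child-to-parent forest; the breadth-first ordering and the fixed batch size stand in for the more careful scheduling constraints developed in the thesis.

```python
from collections import defaultdict, deque

# Hedged sketch: decompose a contraction forest (child -> parent, roots
# map to None) into fixed-size uncontraction batches such that a node is
# only uncontracted after its representative.

def uncontraction_batches(parent, batch_size=2):
    children, roots = defaultdict(list), []
    for node, p in parent.items():
        if p is None:
            roots.append(node)
        else:
            children[p].append(node)
    order, queue = [], deque(roots)
    while queue:                      # BFS: representatives before children
        u = queue.popleft()
        for c in children[u]:
            order.append(c)
            queue.append(c)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

# Nodes 1 and 2 were contracted onto 0, and node 3 onto 1.
parent = {0: None, 1: 0, 2: 0, 3: 1}
print(uncontraction_batches(parent))  # [[1, 2], [3]]
```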
We further show that existing sequential partitioning algorithms struggle considerably to find balanced partitions
for weighted real-world hypergraphs. To address this, we present a technique that enables partitioners based on recursive
bipartitioning to reliably compute balanced solutions. The idea is to preassign a small portion of the
heaviest nodes to one of the two blocks of each bipartition and to optimize the objective function on the
remaining nodes. We integrated the approach into the sequential hypergraph partitioner KaHyPar
and show that it computes balanced solutions for all tested instances without negatively affecting the solution
quality and running time of KaHyPar.
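A minimal sketch of the preassignment step, assuming node weights are given as a dict; the fraction in the example and the rule of always placing the next heavy node on the currently lighter block are illustrative assumptions rather than the exact strategy used in KaHyPar.

```python
# Hedged sketch: fix a small portion of the heaviest nodes before
# bipartitioning, so the bipartitioning only has to balance the rest.

def preassign_heavy_nodes(weights, fraction=0.05):
    by_weight = sorted(weights, key=weights.get, reverse=True)
    cutoff = max(1, int(len(by_weight) * fraction))
    fixed, load = {}, [0, 0]
    for v in by_weight[:cutoff]:
        b = 0 if load[0] <= load[1] else 1   # lighter block gets the node
        fixed[v] = b
        load[b] += weights[v]
    return fixed  # the remaining nodes go through the usual bipartitioning

weights = {'a': 90, 'b': 70, 'c': 5, 'd': 4, 'e': 3, 'f': 2}
print(preassign_heavy_nodes(weights, fraction=0.4))  # {'a': 0, 'b': 1}
```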
In our experimental evaluation, we compare our new shared-memory (hyper)graph partitioner Mt-KaHyPar
to different graph and hypergraph partitioners on a wide range of (hyper)graphs with up to two billion edges/pins.
The results indicate that already our fastest configuration outperforms almost all existing
hypergraph partitioners with regard to both solution quality and running time. Our highest-quality configuration
(n-level with flow-based refinement) achieves the same solution quality as the currently best
sequential partitioner KaHyPar, while being almost an order of magnitude faster with ten threads.
In addition, we optimize our data structures for graph partitioning, which improves the running times of both multilevel partitioners by
almost a factor of two for graphs. As a result, Mt-KaHyPar also outperforms most of the existing
graph partitioning algorithms. While the shared-memory graph partitioner KaMinPar is still faster than
Mt-KaHyPar, the solutions it produces are worse in the median. The best sequential graph
partitioner KaFFPa-StrongS computes slightly better partitions than Mt-KaHyPar in the median,
but is more than an order of magnitude slower on average.
Data-driven modeling and optimization of sequential batch-continuous process
Driven by the need to lower capital expenditures and operating costs, as well as by competitive pressure to increase product quality and consistency, modern chemical processes have become increasingly complex. These trends are manifest, on the one hand, in complex equipment configurations and, on the other hand, in a broad array of sensors (and control systems), which generate large quantities of operating data. Of particular interest is the combination of two traditional routes of chemical processing: batch and continuous. Batch-to-continuous (B2C) processes, which constitute the topic of this dissertation, comprise a batch section, which is responsible for preparing the materials that are then processed in the continuous section. In addition to merging the modeling, control and optimization approaches related to the batch and continuous operating paradigms, which are radically different in many respects, challenges related to analyzing the operation of such processes arise from the multi-phase flow. In particular, we consider the case where a particulate solid is suspended in a liquid "carrier" in the batch stage, and the two-phase mixture is conveyed through the continuous stage. Our explicit goal is to provide a complete operating solution for such processes, starting with the development of meaningful and computationally efficient mathematical models, continuing with a control and fault detection solution, and ending with a production scheduling concept. Owing to process complexity, we reject out of hand the use of first-principles models, which are inevitably high dimensional and computationally expensive, and focus on data-driven approaches instead.

Raw data obtained from the chemical industry are subject to noise, equipment malfunction and communication failures and, as such, data recorded in process historian databases may contain outliers and measurement noise. Without proper pretreatment, the accuracy and performance of a model derived from such data may be inadequate. In the next chapter of this dissertation, we address this issue and evaluate several outlier removal techniques and filtering methods using actual production data from an industrial B2C system. We also address a specific challenge of B2C systems, namely, synchronizing the timing of the batch data with the data collected from the continuous section of the process. Variable-wise unfolded data (a typical approach for batch processes) exhibit measurement gaps between the batches; this type of behavior does not occur in the subsequent continuous section. These data gaps have an impact on data analysis and, in order to address this issue, we provide a method for filling in the missing values: the batch characteristic values are assigned in the gaps to match the data length with the continuous process, a procedure that preserves meaningful process correlations.

Data-driven modeling techniques such as principal component analysis (PCA) and partial least squares (PLS) regression are well established for modeling batch or continuous processes. In this thesis, we consider them from the perspective of the B2C systems under consideration. Specific challenges that arise when modeling these systems are related to nonlinearity, which, in turn, is due to multiple operating modes associated with different product types and product grades. In order to deal with this, we propose partitioning the gap-filled data set into subsets using k-means clustering.
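As an illustration of this step, the sketch below clusters a gap-filled data matrix into candidate operating modes with scikit-learn's k-means, so that a separate linear model (e.g., PCA or PLS) could then be fit per cluster; the number of clusters and the synthetic data are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hedged sketch: split operating data into k candidate operating modes.
rng = np.random.default_rng(0)
# Three synthetic "operating modes" with different mean operating points.
X = np.vstack([rng.normal(loc=m, scale=0.5, size=(100, 4))
               for m in (0.0, 3.0, 6.0)])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
subsets = [X[labels == c] for c in range(3)]
print([s.shape for s in subsets])  # roughly 100 samples per mode
```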
Using the clustering method, a large data set that reflects multiple operating modes and the associated nonlinearity can be broken down into subsets in which the system exhibits potentially linear behavior. Also, in order to further increase the model accuracy, the inputs to the model need to be refined. Unrelated variables may corrupt the resulting model by introducing unnecessary noise and irrelevant information; by properly eliminating uninformative variables, the model performance can be improved along with the interpretability. We use variable selection methods that investigate the model coefficients or variable importance in projection (VIP) values to determine the variables to retain in the model.

Developing a model to estimate the final product quality poses different challenges. Measuring and quantifying the final product quality online can be limited by physical and economic constraints. Physically, some quantities cannot be measured due to sensor sizes or surrounding environments. Economically, the offline "lab" measurements may destroy the sample used for testing. These constraints lead to multiple sampling rates: the process measurements are stored and available continuously in real time, but the quality measurements have a much lower sampling rate. In order to account for this discrepancy, the online process measurements are down-sampled to match the sampling frequency of the lab measurements, and subsequently, soft sensors can be developed to estimate the final product quality.

With the soft sensor in place, the process needs to be optimized to maximize plant efficiency. Using real-time optimization, the optimal sequence of manipulated inputs that minimizes off-spec production is calculated. In addition, the optimal sequences of setpoints can be calculated by carrying out the scheduling calculation with the process model. Traditionally, the scheduling calculation is carried out without taking the process dynamics into account, which could result in off-spec products if a disturbance is introduced. Incorporating the process dynamics into the scheduling layer poses many numerical challenges. The proposed time-scale-bridging model (SBM) is able to capture the input-output behavior of the process while greatly reducing the computational complexity and time.
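Returning to the multi-rate soft-sensor idea described above, the following sketch down-samples fast process measurements to a slow lab grid, fits a PLS model on the matched pairs, and then predicts quality at the fast rate; the sampling rates, dimensions, and synthetic relationship are all assumptions made for the example.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(1)
X_fast = rng.normal(size=(1000, 6))   # process measurements, fast rate
y_fast = X_fast @ np.array([1.0, -0.5, 0.2, 0.0, 0.0, 0.0]) \
    + 0.1 * rng.normal(size=1000)     # "true" quality, mostly unobserved

lab_idx = np.arange(0, 1000, 50)      # lab measurement every 50 steps
pls = PLSRegression(n_components=3)
pls.fit(X_fast[lab_idx], y_fast[lab_idx])   # train on down-sampled pairs

y_hat = pls.predict(X_fast).ravel()   # soft sensor output at the fast rate
print(round(float(np.corrcoef(y_hat, y_fast)[0, 1]), 3))  # close to 1
```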
Models and algorithms for decomposition problems
This thesis deals with decomposition both as a solution method and as a problem in itself. A decomposition approach can be very effective for mathematical problems presenting a specific structure, in which the associated coefficient matrix is sparse and can be brought into block-diagonal form. However, this kind of structure may not be evident from the most natural formulation of the problem, so the coefficient matrix may be preprocessed by solving a structure-detection problem in order to understand whether a decomposition method can be applied successfully. Accordingly, this thesis deals with the k-Vertex Cut problem, that is, the problem of finding a minimum subset of nodes whose removal disconnects a graph into at least k components; it models relevant applications in matrix decomposition for solving systems of equations by parallel computing. The capacitated k-Vertex Separator problem, instead, asks for a subset of vertices of minimum cardinality whose deletion disconnects a given graph into at most k shores, where the size of each shore must not be larger than a given capacity value. This problem, too, is of great importance for matrix decomposition algorithms.
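For the connectivity special case (disconnecting into at least two components), a minimum vertex cut can be computed directly with networkx, as sketched below; the general k-Vertex Cut and capacitated k-Vertex Separator problems studied in the thesis are NP-hard and are not solved by this call.

```python
import networkx as nx

# Hedged illustration of a vertex cut: minimum_node_cut returns a smallest
# node set whose removal disconnects the graph (the k = 2 special case).
G = nx.Graph([(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (2, 5), (5, 6)])
cut = nx.minimum_node_cut(G)
print(cut)  # a single articulation node, e.g. {5} or {2}

H = G.copy()
H.remove_nodes_from(cut)
print(nx.number_connected_components(H) >= 2)  # True
```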
This thesis also addresses the Chance-Constrained Mathematical Program, a significant example to which decomposition techniques can be successfully applied. This is a class of stochastic optimization problems in which the feasible region depends on the realization of a random variable, and the solution must optimize a given objective function while belonging to the feasible region with probability above a given threshold. In this thesis, a decomposition approach for this problem is introduced.
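In standard notation (which may differ from the thesis's), such a program takes the form

```latex
\min_{x \in X} \; c^{\top} x
\quad \text{s.t.} \quad
\mathbb{P}\bigl( G(x, \xi) \le 0 \bigr) \ge 1 - \varepsilon ,
```

where \xi is the random variable, G collects the constraints affected by it, and \varepsilon is the admissible violation probability.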
The thesis also addresses the Fractional Knapsack Problem with Penalties, a variant of the knapsack problem in which items can be split at the expense of a penalty that depends on the fractional quantity.
Security of electric power systems: cascading outage analysis, interdiction model and resilience to natural disasters
Secure electric power system operation is key to social welfare. However, recent years have seen numerous natural disasters and terrorist attacks that threaten grid security. This dissertation summarizes the efforts to develop a model to analyze cascading outages, an interdiction model to analyze worst-case attacks on power grids, and research on grid resilience to natural disasters. The developed cascading outage analysis model uses outage checkers to systematically simulate the system behavior after an initial disturbance and to calculate the potential cascading outage path and electric load shedding. The new interdiction model combines the previously developed medium-term attack-defense model with the short-term cascading outage analysis model to find worst-case terrorist attacks. The dissertation also reviews the research on power grid resilience to natural disasters and develops a framework to simulate the impacts of hurricanes.
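The outage-checker loop can be caricatured in a few lines; the proportional flow redistribution below is a gross simplification of real power-flow physics, used only to show the simulate-check-trip structure, and all numbers are invented for the example.

```python
# Hedged toy cascade: an initial line outage redistributes flow onto the
# surviving lines; any line pushed past its limit trips in the next round.

def simulate_cascade(flows, limits, initial_outage):
    tripped = {initial_outage}
    while True:
        lost = sum(flows[l] for l in tripped)
        alive = [l for l in flows if l not in tripped]
        if not alive:
            return tripped            # total collapse
        total = sum(flows[l] for l in alive)
        # Each surviving line picks up a proportional share of lost flow.
        new_flow = {l: flows[l] + lost * flows[l] / total for l in alive}
        overloads = {l for l in alive if new_flow[l] > limits[l]}
        if not overloads:             # checkers find no new outage: stop
            return tripped
        tripped |= overloads

flows = {'A': 50, 'B': 40, 'C': 30}
limits = {'A': 80, 'B': 45, 'C': 60}
print(simulate_cascade(flows, limits, 'A'))  # losing A cascades to B and C
```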
Combining shape and color. A bottom-up approach to evaluate object similarities
The objective of the present work is to develop a bottom-up approach to estimate the similarity between two unknown objects. Given a set of digital images, we want to identify the main objects and to determine whether they are similar or not. In the last decades, many object recognition and classification strategies, driven by higher-level activities, have been successfully developed. The peculiarity of this work, instead, is the attempt to work without any training phase or a priori knowledge about the objects or their context. Indeed, if we suppose to be in an unstructured and completely unknown environment, we usually have to deal with novel objects never seen before; under these hypotheses, it would be very useful to define some kind of similarity among the instances under analysis (even if we do not know which category they belong to).
To obtain this result, we start by observing that human beings use a great deal of information and analyze very different aspects to achieve object recognition: shape, position, color and so on. Hence we try to reproduce part of this process, combining different methodologies (each working on a specific characteristic) to obtain a more meaningful notion of similarity. Mainly inspired by the human conception of representation, we identify two main characteristics, which we call the implicit and explicit models. The term "explicit" is used to account for the main traits of what, in human representation, connotes a principal source of information regarding a category, a sort of visual synecdoche (corresponding to the shape); the term "implicit", on the other hand, accounts for the object rendered by shadows and lights, colors and volumetric impression, a sort of visual metonymy (corresponding to the chromatic characteristics).
During this work, we had to face several problems and define specific solutions. In particular, our contributions concern:
- defining a bottom-up approach for image segmentation (which does not rely on any a priori knowledge);
- combining different features to evaluate object similarity (particularly focusing on shape and color);
- defining a generic distance (similarity) measure between objects, without any attempt to identify the category they belong to (see the sketch after this list);
- analyzing the consequences of using the number of modes as an estimate of the number of mixture components (in the Expectation-Maximization algorithm).
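A minimal sketch of the combined measure mentioned above: one distance from a shape cue and one from a color cue, merged by a weighted sum. The descriptor choices (a generic shape vector, a chi-square distance between color histograms) and the weight w = 0.5 are illustrative assumptions, not the thesis's implicit/explicit models.

```python
import numpy as np

def color_distance(hist_a, hist_b):
    """Chi-square distance between normalized color histograms."""
    a, b = np.asarray(hist_a, float), np.asarray(hist_b, float)
    denom = a + b
    mask = denom > 0
    return 0.5 * float(np.sum((a[mask] - b[mask]) ** 2 / denom[mask]))

def shape_distance(desc_a, desc_b):
    """Euclidean distance between shape descriptor vectors."""
    return float(np.linalg.norm(np.asarray(desc_a) - np.asarray(desc_b)))

def object_distance(shape_a, hist_a, shape_b, hist_b, w=0.5):
    """Weighted combination of the shape and color cues."""
    return w * shape_distance(shape_a, shape_b) \
        + (1 - w) * color_distance(hist_a, hist_b)

# Two objects with nearly identical shape but different dominant color.
print(object_distance([0.2, 0.1], [1, 0, 0, 0],
                      [0.21, 0.1], [0, 0, 1, 0]))  # ~0.505
```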
The graduate school of the University of New Hampshire 1965-1966
Includes Graduate School catalog; title varies.