33 research outputs found
An Augmented Index-based Efficient Community Search for Large Directed Graphs
Given a graph G and a query vertex q, the problem of community search (CS),
which aims to retrieve a dense subgraph of G containing q, has attracted much
attention. Most existing works focus on undirected graphs, overlooking the
rich information carried by edge directions. Recently, the problem of
community search over directed graphs (or CSD problem) has been studied; it
finds a connected subgraph containing q, where the in-degree and out-degree of
each vertex within the subgraph are at least k and l, respectively. However,
existing solutions are inefficient, especially on large graphs. To tackle this
issue, in this paper, we propose a novel index called D-Forest, which allows a
CSD query to be completed within the optimal time cost. We further propose
efficient index construction methods. Extensive experiments on six real large
graphs show that our index-based query algorithm is up to two orders of
magnitude faster than existing solutions. Comment: Full version of our IJCAI20 paper.
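The CSD definition above (every vertex in the answer has in-degree at least k and out-degree at least l) can be computed without any index by iterative peeling; the sketch below, assuming adjacency sets as input, shows that baseline. The full CSD answer is then the connected component of this core containing q; the paper's D-Forest index exists precisely to avoid repeating this whole-graph scan per query.

```python
from collections import deque

def kl_core(adj_out, adj_in, k, l):
    """Peel vertices whose in-degree < k or out-degree < l until none remain.

    adj_out/adj_in: dicts mapping vertex -> set of out-/in-neighbours.
    Returns the vertex set of the maximal subgraph in which every vertex
    has in-degree >= k and out-degree >= l.
    """
    alive = set(adj_out) | set(adj_in)
    indeg = {v: len(adj_in.get(v, set())) for v in alive}
    outdeg = {v: len(adj_out.get(v, set())) for v in alive}
    queue = deque(v for v in alive if indeg[v] < k or outdeg[v] < l)
    while queue:
        v = queue.popleft()
        if v not in alive:
            continue
        alive.discard(v)
        for u in adj_out.get(v, set()):   # edge v -> u: u loses one in-degree
            if u in alive:
                indeg[u] -= 1
                if indeg[u] < k:
                    queue.append(u)
        for u in adj_in.get(v, set()):    # edge u -> v: u loses one out-degree
            if u in alive:
                outdeg[u] -= 1
                if outdeg[u] < l:
                    queue.append(u)
    return alive
```

This peeling runs in time linear in the graph size, which is why repeating it per query is expensive on large graphs and an index pays off.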
A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets
Decision trees are essential yet NP-complete to train, prompting the
widespread use of heuristic methods such as CART, which suffers from
sub-optimal performance due to its greedy nature. Recently, breakthroughs in
finding optimal decision trees have emerged; however, these methods still face
significant computational costs and struggle with continuous features in
large-scale datasets and deep trees. To address these limitations, we introduce
a moving-horizon differential evolution algorithm for classification trees with
continuous features (MH-DEOCT). Our approach consists of a discrete tree
decoding method that eliminates duplicated searches between adjacent samples, a
GPU-accelerated implementation that significantly reduces running time, and a
moving-horizon strategy that iteratively trains shallow subtrees at each node
to balance look-ahead depth against optimizer capability. Comprehensive studies on 68 UCI
datasets demonstrate that our approach outperforms the heuristic method CART on
training and testing accuracy by an average of 3.44% and 1.71%, respectively.
Moreover, these numerical studies empirically demonstrate that MH-DEOCT
achieves near-optimal performance (only 0.38% and 0.06% worse than the global
optimal method on training and testing, respectively), while it offers
remarkable scalability for deep trees (e.g., depth=8) and large-scale datasets
(e.g., ten million samples). Comment: 36 pages (13 pages for the main body, 23 pages for the appendix), 7 figures.
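MH-DEOCT layers its tree decoding, GPU batching, and moving-horizon loop on top of a differential-evolution search. As a hedged illustration of that underlying search only (not the paper's algorithm), here is a minimal DE/rand/1/bin minimizer; all parameter values are illustrative defaults:

```python
import random

def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9,
                           iters=200, seed=0):
    """Minimal DE/rand/1/bin minimiser over box constraints.

    f: objective, list of floats -> float.  bounds: list of (lo, hi).
    Returns (best_point, best_value).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(iters):
        for i in range(pop_size):
            # three distinct donors, none equal to the target index i
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            jrand = rng.randrange(dim)       # force at least one mutated gene
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < CR or j == jrand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    trial.append(min(max(v, lo), hi))  # clip to bounds
                else:
                    trial.append(pop[i][j])
            ft = f(trial)
            if ft <= fit[i]:                 # greedy one-to-one selection
                pop[i], fit[i] = trial, ft
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]
```

In MH-DEOCT the decision variables would encode split thresholds of a shallow subtree rather than a generic vector, and fitness evaluation is the part batched on the GPU.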
Plug-and-Play Document Modules for Pre-trained Models
Large-scale pre-trained models (PTMs) have been widely used in
document-oriented NLP tasks, such as question answering. However, the
encoding-task coupling requirement results in the repeated encoding of the same
documents for different tasks and queries, which is highly computationally
inefficient. To this end, we aim to decouple document encoding from
downstream tasks and propose representing each document as a plug-and-play
document module, i.e., a document plugin, for PTMs (PlugD). By inserting
document plugins into the backbone PTM for downstream tasks, we can encode a
document once to handle multiple tasks, which is more efficient than
conventional encoding-task coupling methods that simultaneously encode
documents and input queries using task-specific encoders. Extensive experiments
on 8 datasets of 4 typical NLP tasks show that PlugD enables models to encode
documents once and for all across different scenarios. In particular, PlugD can
save computational costs while achieving comparable performance to
state-of-the-art encoding-task coupling methods. Additionally, we show that
PlugD can serve as an effective post-processing method for injecting knowledge into
task-specific models, improving model performance without any additional model
training. Comment: Accepted by ACL 202
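The encode-once-reuse-everywhere idea can be caricatured as a cache keyed by document. The class and function names below are purely illustrative, not the PlugD API; the real plugin is a learned representation inserted into the PTM, not an arbitrary Python object.

```python
class DocumentPluginCache:
    """Toy sketch of decoupled document encoding.

    encode_document: doc text -> plugin representation (encoded once).
    run_task: (plugin, task, query) -> answer (cheap, reuses the plugin).
    A coupled baseline would instead re-encode the document for every
    (task, query) pair.
    """

    def __init__(self, encode_document, run_task):
        self.encode_document = encode_document
        self.run_task = run_task
        self._plugins = {}
        self.encode_calls = 0     # lets us check encoding happens once

    def query(self, doc_id, doc_text, task, query):
        if doc_id not in self._plugins:        # encode once per document
            self._plugins[doc_id] = self.encode_document(doc_text)
            self.encode_calls += 1
        return self.run_task(self._plugins[doc_id], task, query)
```

With K tasks and Q queries over one document, the coupled baseline pays K*Q encodings while this pattern pays one, which is the computational saving the abstract describes.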
Data-driven capacity estimation of commercial lithium-ion batteries from voltage relaxation
Accurate capacity estimation is crucial for the reliable and safe operation of lithium-ion batteries. In particular, exploiting the relaxation voltage curve features could enable battery capacity estimation without additional cycling information. Here, we report a study of three datasets comprising 130 commercial lithium-ion cells cycled under various conditions to evaluate the capacity estimation approach. One dataset is collected for model building from batteries with LiNiCoAlO-based positive electrodes. The other two datasets, used for validation, are obtained from batteries with LiNiCoMnO-based positive electrodes and batteries with the blend of Li(NiCoMn)O - Li(NiCoAl)O positive electrodes. Base models that use machine learning methods are employed to estimate the battery capacity using features derived from the relaxation voltage profiles. The best model achieves a root-mean-square error of 1.1% for the dataset used for the model building. A transfer learning model is then developed by adding a featured linear transformation to the base model. This extended model achieves a root-mean-square error of less than 1.7% on the datasets used for the model validation, indicating the successful applicability of the capacity estimation approach utilizing cell voltage relaxation.
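The pipeline is: summarize each relaxation-voltage curve into a few scalar features, then map features to capacity with a trained regressor. The sketch below uses maximum, minimum, and mean as stand-in features and a 1-nearest-neighbour lookup as a stand-in regressor; the paper's actual features and machine-learning base models differ, so treat every choice here as an assumption.

```python
import statistics

def relaxation_features(voltages):
    """Summary statistics of one post-charge relaxation-voltage curve.

    Illustrative stand-in features: (max, min, mean).  The published
    approach derives its own feature set from the relaxation profile.
    """
    return (max(voltages), min(voltages), statistics.fmean(voltages))

def estimate_capacity(train_set, voltages):
    """1-nearest-neighbour capacity estimate in feature space.

    train_set: list of (feature_tuple, capacity_Ah) pairs built with
    relaxation_features on labelled cells.  A stand-in for the paper's
    trained ML base models.
    """
    q = relaxation_features(voltages)
    def sq_dist(f):
        return sum((a - b) ** 2 for a, b in zip(f, q))
    _, capacity = min(train_set, key=lambda pair: sq_dist(pair[0]))
    return capacity
```

The transfer-learning step in the abstract corresponds to learning an additional linear transformation of these features on a small amount of data from the new chemistry, rather than retraining the base model from scratch.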
Real-time Monitoring for the Next Core-Collapse Supernova in JUNO
A core-collapse supernova (CCSN) is one of the most energetic astrophysical
events in the Universe. The early and prompt detection of neutrinos before
(pre-SN) and during the SN burst is a unique opportunity to realize the
multi-messenger observation of the CCSN events. In this work, we describe the
monitoring concept and present the sensitivity of the system to the pre-SN and
SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), which is
a 20 kton liquid scintillator detector under construction in South China. The
real-time monitoring system is designed with both prompt monitors on the
electronic boards and online monitors at the data acquisition stage, in order to
ensure both the alert speed and alert coverage of progenitor stars. By assuming
a false alert rate of 1 per year, this monitoring system can be sensitive to
the pre-SN neutrinos up to a distance of about 1.6 (0.9) kpc and SN neutrinos
up to about 370 (360) kpc for a progenitor mass of 30 solar masses in the case
of normal (inverted) mass ordering. The pointing ability for the CCSN is
evaluated by using the accumulated event anisotropy of the inverse beta decay
interactions from pre-SN or SN neutrinos, which, along with the early alert,
can play an important role in the follow-up multi-messenger observations of the
next Galactic or nearby extragalactic CCSN. Comment: 24 pages, 9 figures.
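A false-alert-rate budget like "1 per year" translates into a counting threshold: in each sliding window, fire an alert only if the event count exceeds what background fluctuations reach that often. The sketch below does that arithmetic with the Poisson survival function; the rates and window length are illustrative numbers, not JUNO's actual trigger settings.

```python
import math

def poisson_sf(n, mu):
    """P[X >= n] for X ~ Poisson(mu)."""
    return 1.0 - sum(math.exp(-mu) * mu ** k / math.factorial(k)
                     for k in range(n))

def alert_threshold(bkg_rate_per_s, window_s, false_alerts_per_year=1.0):
    """Smallest count threshold n such that background alone exceeds n
    in a window at most `false_alerts_per_year` times per year.

    Treats consecutive windows as independent -- a simplification.
    """
    mu = bkg_rate_per_s * window_s               # expected background count
    windows_per_year = 365.25 * 24 * 3600 / window_s
    n = 1
    while poisson_sf(n, mu) * windows_per_year > false_alerts_per_year:
        n += 1
    return n
```

Tightening the false-alert budget raises the threshold, which in turn shortens the distance out to which a faint pre-SN signal clears it; that trade-off is what the quoted 1.6 (0.9) kpc sensitivity quantifies.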
Parallel algorithms for nonlinear programming and applications in pharmaceutical manufacturing
Effective manufacturing of pharmaceuticals presents a number of challenging optimization problems due to complex distributed, time-dependent models and the need to handle uncertainty. These challenges are multiplied when real-time solutions are required. The demand for fast solution of nonlinear optimization problems, coupled with the emergence of new concurrent computing architectures, drives the need for parallel algorithms to solve challenging NLP problems. The goal of this work is the development of parallel algorithms for nonlinear programming (NLP) problems on different computing architectures, and the application of large-scale nonlinear programming to challenging problems in pharmaceutical manufacturing. The focus of this dissertation is our completed work on an augmented Lagrangian algorithm for the parallel solution of general NLP problems on graphics processing units and a clustering-based preconditioning strategy for stochastic programs within an interior-point framework on distributed-memory machines. Our augmented Lagrangian interior-point approach for general NLP problems is iterative at three levels. The first level replaces the original problem by a sequence of bound-constrained optimization problems. Each of these bound-constrained problems is solved using a nonlinear interior-point method. Inside the interior-point method, the barrier subproblems are solved using a variation of Newton's method, where the linear system is solved using a preconditioned conjugate gradient (PCG) method. The primary advantage of this algorithm is that it allows use of the PCG method, which can be implemented efficiently on a GPU in parallel. This algorithm shows an order-of-magnitude speedup on certain problems. We also present a clustering-based preconditioning strategy for stochastic programs. The key idea is to perform adaptive clustering of scenarios inside the solver based on their influence on the problem.
We derive spectral and error properties for the preconditioner and demonstrate that scenario compression rates of up to 94% can be obtained, leading to drastic computational savings. A speedup factor of 42 is obtained with our parallel implementation on a stochastic market-clearing problem for the entire Illinois power grid system. In addition, we discuss two applications of nonlinear programming in pharmaceutical manufacturing. The first application is to analyze the effect of different policies to deal with drug shortages and the role of emergency supply in pharmaceutical manufacturing. Simulation results indicate that the availability of an emergency production facility can significantly reduce expected government spending even if the unit price of emergency supply is high and the capacity is limited. The second application is the control of pharmaceutical manufacturing processes. First, we focus on the development of real-time feasible, multi-objective-optimization-based NMPC-MHE formulations for batch crystallization processes to control the crystal size and shape distribution. At each sampling instance, based on a nonlinear DAE model, an estimation problem estimates unknown states and parameters, and an optimal control problem determines the optimal input profiles. Both DAE-constrained optimization problems are solved by discretizing the system using Radau collocation and optimizing the resulting algebraic nonlinear problem using IPOPT. NMPC-MHE is shown to provide better setpoint tracking than the open-loop optimal control strategy under setpoint changes, system noise, and model/plant mismatch. Second, to deal with the parameter uncertainties in the crystallization model, we also develop a real-time feasible robust NMPC formulation. The size of the optimization problems arising from the robust NMPC becomes too large to be solved by a serial solver.
Therefore, we use a parallel algorithm to ensure real-time feasibility.
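The innermost level of the three-level scheme above is a preconditioned conjugate gradient solve, and its appeal is that it reduces to matrix-vector products and vector updates, operations that map well onto a GPU. A dense, pure-Python toy version, assuming a symmetric positive-definite system and a caller-supplied preconditioner:

```python
def pcg(A, b, precond, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for symmetric positive-definite A.

    A: list of row lists; b: right-hand side; precond(r) applies M^{-1}
    (e.g. Jacobi: r[i] / A[i][i]).  Returns the solution vector.
    A GPU implementation would replace the loops below with batched
    linear-algebra kernels; the algorithm's structure is identical.
    """
    n = len(b)
    x = [0.0] * n
    r = b[:]                                   # residual for x = 0
    z = precond(r)
    p = z[:]
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            break                              # converged
        z = precond(r)
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

Because every step is a reduction or an element-wise update, the per-iteration work parallelizes cleanly, which is the property the dissertation exploits.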
Parallel Solution of Robust Nonlinear Model Predictive Control Problems in Batch Crystallization
Representing the uncertainties with a set of scenarios, the optimization problem resulting from a robust nonlinear model predictive control (NMPC) strategy at each sampling instance can be viewed as a large-scale stochastic program. This paper solves these optimization problems using the parallel Schur complement method developed to solve stochastic programs on distributed- and shared-memory machines. The control strategy is illustrated with a case study of a multidimensional unseeded batch crystallization process. For this application, a robust NMPC based on min–max optimization guarantees satisfaction of all state and input constraints for a set of uncertainty realizations, and also provides better robust performance compared with open-loop optimal control, nominal NMPC, and robust NMPC minimizing the expected performance at each sampling instance. The performance of robust NMPC can be improved by generating optimization scenarios using Bayesian inference. With the efficient parallel solver, the solution time of one optimization problem is reduced from 6.7 min to 0.5 min, allowing for real-time application.
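The min–max structure over a finite scenario set can be shown in miniature: choose the input whose worst-case cost over all scenarios is smallest. The toy below does this by exhaustive search over candidate inputs; the paper instead solves one large scenario-coupled NLP with a parallel Schur complement method, and all names and the toy model here are illustrative.

```python
def robust_input(candidates, scenarios, simulate, cost):
    """Min-max scenario selection in miniature.

    candidates: finite set of admissible inputs.
    scenarios:  finite set of uncertainty realizations.
    simulate(u, s) -> predicted state; cost(state) -> scalar.
    Returns the candidate minimising the worst-case cost.
    """
    def worst_case(u):
        return max(cost(simulate(u, s)) for s in scenarios)
    return min(candidates, key=worst_case)

# Toy model: state x = a * u with uncertain gain a; target x = 1.
scenarios = [0.8, 1.0, 1.2]
best_u = robust_input([0.8, 0.9, 1.0, 1.1], scenarios,
                      simulate=lambda u, a: a * u,
                      cost=lambda x: (x - 1.0) ** 2)
```

In the real formulation the "candidates" are continuous input trajectories and each scenario contributes its own copy of the dynamics, which is exactly what makes the problem large and the scenario-parallel decomposition worthwhile.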
plasmo-dev/Plasmo.jl: v0.5.0
Plasmo v0.5.0
<p><a href="https://github.com/plasmo-dev/Plasmo.jl/compare/v0.4.4...v0.5.0">Diff since v0.4.4</a></p>
<p><strong>Merged pull requests:</strong></p>
<ul>
<li>More efficient OptiGraph backend (#52) (@jalving)</li>
<li>Improve coverage (#53) (@jalving)</li>
<li>Update documentation and API (#55) (@jalving)</li>
<li>Update README.md (#56) (@jalving)</li>
<li>CompatHelper: add new compat entry for MathOptInterface at version 1, (keep existing compat) (#59) (@github-actions[bot])</li>
<li>additional documentation updates (#60) (@jalving)</li>
<li>update url (#61) (@jalving)</li>
</ul>