ABSTRACT
INTRODUCTION
Synthesis of electronic circuits is the process of configuring an assembly of electronic components to achieve a desired circuit behavior. Synthesizing a circuit is a challenging task because it involves both circuit topology and sizing. Circuit topology is determined by component selection and the connections between components. The sizing problem is to identify the circuit component parameter values that optimize system performance. A number of methodologies have been proposed for circuit synthesis. Domain knowledge is often used [1] [2] [3], but it requires significant expertise to apply successfully and has limited effectiveness for complicated circuits. Methods that require minimal initial design knowledge are desirable in that novel topologies can be generated without demanding human expertise [1] [2] [3] [4] [5]. For example, evolutionary algorithms [2] [4] [5] [6] [7] [8] [9] and simulated annealing [10] have been applied to the circuit synthesis problem. Grimbleby used genetic algorithms to synthesize novel and effective circuits that satisfy both frequency- and time-domain specifications [4]. Das and Vemuri developed an automated circuit synthesis framework for an RLC circuit design problem [5].
Copyright © 2018 by ASME
Despite these achievements, evolutionary approaches are limited because 1) a number of parameters need to be tuned, including crossover probability and initial population size; 2) a poor-quality initial population may cause convergence and robustness issues [5]; and 3) direct genotype representations are often unsuccessful in making progress toward improved designs for larger-scale systems. Global solutions are not guaranteed with evolutionary algorithms. Indirect generative design representations, such as grammar-rule based approaches, may be used to map design representations to circuit topologies, but these approaches so far are not generalizable, as they can only be applied successfully when an existing generative design algorithm is congruent with the system topologies to be generated, or when expert intuition can be used to define the generation rules. Enumeration-based synthesis methodologies generate and test all possible circuit topologies under certain specifications, so global optimality is guaranteed. Enumeration-based approaches have a long history and have been used widely in electronic circuits [11, 12], hybrid powertrains [13], gear trains [14], and enzyme network topologies [15]. Naïve enumeration is simple, but impractical when the number of possible candidates is large. Recent advancements by Herber et al. in efficient system architecture enumeration theory and algorithms, based on perfect matching theory and intelligent brute-force search methods, have made enumeration practical for generating all unique and feasible topologies [16].
The majority of designs produced via naïve enumeration are either isomorphic or violate network structure constraints (NSCs), but the generation of these non-value-added designs is minimized using the aforementioned strategies, making possible the enumeration of topologies for architecture design problems that are much larger than previously thought.
Efficient enumeration has been applied to circuit synthesis [17] , and was shown to identify not only the topologies found earlier using GAs, but a much richer set of design data, including many previously unknown, non-dominated designs [16] . This study supported fair comparisons between unique, feasible topologies by using rigorous evaluation of the candidates [16, 17] . A dynamic system model is automatically generated for each candidate circuit topology, and then is used in solving a component sizing optimization problem [17] . Efficient enumeration strategies can be extended to other system architecture design problems, such as active vehicle suspensions [18] .
While these recent advances have improved solution capabilities for circuit synthesis, they are still limited in scale to moderately-sized component catalogs that define the architecture space of interest. Here we seek to use circuit synthesis design data (obtained via efficient enumeration) along with machine learning techniques to enable approximate topology optimization of larger-scale circuit synthesis problems. Consider two possible cases where complete enumeration and evaluation is impractical [19]. Case 1) All unique, feasible topologies can be enumerated in a practical amount of time, but evaluation (e.g., size optimization) is too computationally expensive to perform for all topologies. Case 2) A catalog is too large to enumerate all topologies in a practical amount of time. Here we focus on Case 1. Ongoing work addresses Case 2.
In the circuit synthesis problem, for the catalogs considered here, both enumeration and evaluation can be performed.
Circuit evaluation involves straightforward generation of single-input single-output transfer functions [17], followed by the solution of a nonlinear fitting problem. Having a complete set of data (topologies and evaluation metrics) allows us to study various limited-sampling strategies and to compare against known globally optimal designs. A small extension of this test problem (e.g., inclusion of nonlinear circuit elements) would increase evaluation expense, resulting in a Case 1 problem. Further extension involving a larger catalog would result in a Case 2 problem. Here we present an initial analysis of methods appropriate for the approximate solution of Case 1 problems. Methods for Case 2 problems would have different requirements and are outside the scope of this article.
Here machine learning is applied by constructing a predictive model that approximates the mapping from topological design descriptions to performance metrics (real-valued outputs). This model is then used to identify potentially desirable topologies to evaluate. The new data are then used to enhance the predictive model. This iterative strategy is known as active learning, a semi-supervised machine learning technique that aims to achieve good accuracy with fewer training samples by interactively sampling the data from which it learns [20]. In this situation, the dataset contains a number of unlabeled (unevaluated) data points, and labeling (classification) or evaluation (regression) is relatively expensive. The learning algorithm can actively choose which samples to label or evaluate, with the goal of reducing the amount of labeling or evaluation. This strategy is distinct from conventional supervised learning, also referred to as "passive learning". A number of research efforts have focused on active learning, including text mining [21] [22] [23] [24] [25], speech recognition [26], and computational biology [27]. Very little research in this context has focused on circuit synthesis or similar tasks, which motivates this investigation into the gap between active learning and circuit synthesis design.
Here the random forest algorithm is used to construct the predictive model in the iterative process, where model validation and query selection are performed. As an ensemble method, the random forest can prevent overfitting and reduce the variance by training on bootstrap samples of the data. Active learning, in general, considers three scenarios: membership queries, stream-based selective sampling, and pool-based sampling [20]. The pool-based sampling scheme is used here [28]. To decide whether to query or discard instances, several query strategies are examined. We tested the strategies using two real data sets from circuit synthesis problems, and identified a promising active learning strategy for circuit synthesis. This article makes three primary contributions: 1) we introduce active learning as a powerful tool to address the challenges in circuit synthesis: by comparing several query strategies, we were able to build a predictive model with fewer training examples, and thus further reduce true solution cost for circuit synthesis problems; 2) through this analysis, we generate new insights, such as the importance of prediction ranking, and identify a query strategy suited to circuit synthesis; 3) we discuss how the proposed framework can be extended to other architecture design problems, such as active vehicle suspensions [17, 18] and fluid-based thermal management systems [29]. In addition, the relationship between active learning and adaptive surrogate modeling is clarified in this article. The remainder of the article is organized as follows. In the next section, we describe the active learning framework for circuit synthesis. Section 3 describes the frequency response synthesis problem from which the dataset was obtained. Quantitative results are presented in Section 4. Discussion and conclusion are presented in Sections 5 and 6, respectively.
METHODOLOGY
In this section, the proposed active learning strategy for circuit synthesis is described. Figure 1 illustrates the overall active learning framework. The strategy assumes that all $N$ possible circuit topologies $X = \{x_i\}$, for $i = 1, 2, \ldots, N$ with $x_i \in \mathbb{R}^d$, are relatively easy to obtain, but that evaluating each circuit's performance metric $Y = \{y_i\}$, for $i = 1, 2, \ldots, N$, is relatively expensive. The mapping $f : x_i \mapsto y_i$ is the true response function.
Initial Sampling
All possible circuit topologies are obtained using the efficient enumeration-based methodology introduced by Herber [16, 17]. Each circuit topology $x_i$ is represented by an adjacency matrix; the value of each matrix element indicates whether the two corresponding circuit components are connected (1) or not connected (0). A subset of $N_l$ topologies is sampled randomly and denoted as the initial training set $X_l = \{x_i\}$ for $i = 1, 2, \ldots, N_l$. The training set elements are then evaluated using the true response function to obtain the performance $Y_l = \{y_i\}$, where $y_i = f(x_i)$ for $i = 1, 2, \ldots, N_l$.
Predictive Model Construction
To approximate the performance $y_i$, a regression model is considered. Radial basis functions (RBFs) are often used, where the process can be seen as an artificial neural network that approximates the true unknown function [30]. Kriging is an approach for interpolation, where values are modeled using a Gaussian process [31] [32] [33]. Other methods include response surface methodology (RSM) [34, 35], moving least-squares (MLS) [36] [37] [38], support vector regression (SVR) [39], adaptive regression splines [40], and inductive learning [41]. These approaches have been well studied in surrogate-based optimization [42, 43], an engineering method used when a high-fidelity model cannot be used directly with optimization due to evaluation expense, so approximation models are constructed using samples from the original model. Optimization is then performed using the computationally inexpensive surrogate model to make solution of an approximate problem tractable. However, established surrogate modeling methods may not be applicable here because the design variables in surrogate models are often continuous, whereas the circuit synthesis variables are discrete and represented in terms of the adjacency matrix.
Here we choose random forests as the model for approximating the response of the true circuit topology performance function. The random forest is an ensemble learning algorithm for regression and classification that can handle both continuous and categorical variables [44] [45] [46]. The random forest constructs a number of decision trees in the training phase, and the final prediction (regression) or label (classification) is based on the results given by the individual trees. Individual decision trees tend to have a low bias but a high variance, resulting in overfitting to the training set. The random forest averages the output of multiple decision trees that were trained on different subsets of the training set. This strategy helps to reduce the overall variance and prevent overfitting [47]. The random forest uses bootstrap aggregating (also known as bagging) to improve the stability and accuracy of the learning algorithm [48]. For the given training set $T = \{(x_i, y_i)\}$, $i = 1, 2, \ldots, N_l$, bagging draws $B$ bootstrap samples with replacement and fits a decision tree $h_b$ to each. For a new instance $x$, the prediction $\hat{y}$ takes the average of the $B$ individual decision trees [48]:
$$\hat{y}(x) = \frac{1}{B}\sum_{b=1}^{B} h_b(x) \tag{1}$$
The estimated standard deviation $s$ reflects uncertainty of the prediction:
$$s(x) = \sqrt{\frac{\sum_{b=1}^{B}\left(h_b(x) - \hat{y}(x)\right)^2}{B - 1}} \tag{2}$$
The random forest algorithm has several tuning parameters, including the number of trees, depth of the tree, and so on. These hyperparameters can be determined through a Bayesian optimization during the training phase [49] [50] [51] .
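The ensemble mean and spread used throughout this section can be illustrated with a minimal sketch. It assumes that the per-tree predictions for a single instance are already available as a list, and that the spread is the sample standard deviation across trees (dividing by $B - 1$); the function names `bootstrap_indices` and `ensemble_mean_std` are hypothetical and are not taken from any library.

```python
import math
import random


def bootstrap_indices(n, rng):
    """Draw n training indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]


def ensemble_mean_std(tree_predictions):
    """Return (mean, sample std) of the per-tree predictions for one instance.

    Assumption: the uncertainty estimate is the sample standard deviation
    across the B trees, i.e. the sum of squares is divided by B - 1.
    """
    B = len(tree_predictions)
    if B < 2:
        raise ValueError("at least two trees are needed for a spread estimate")
    y_hat = sum(tree_predictions) / B
    s = math.sqrt(sum((p - y_hat) ** 2 for p in tree_predictions) / (B - 1))
    return y_hat, s


# Three hypothetical trees predicting the (log-scaled) performance of one topology.
y_hat, s = ensemble_mean_std([1.0, 2.0, 3.0])
sample = bootstrap_indices(5, random.Random(0))
```

In practice the per-tree predictions would come from whichever random forest implementation is used; the sketch only shows how the two summary quantities are formed from them.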
Query Synthesis
Query synthesis is a critical element of active learning. To determine whether a new instance $x$ should be queried, one needs to define a measure that characterizes $x$. In other words, we actively search for data that satisfy a certain criterion, so that the training set can be updated iteratively using the query data. The choice of search criterion involves a balance between exploration and exploitation over the topology space. One way of balancing exploitation of the prediction against exploration using the estimated standard deviation is to find the instances with the smallest statistical lower bound (LB) [43]:
$$\mathrm{LB}(x) = \hat{y}(x) - A\,s(x) \tag{3}$$
where $\hat{y}(x)$ and $s(x)$ are the prediction and estimated standard deviation, respectively, at $x$, as quantified by Eqs. (1) and (2). Here the parameter $A$ controls the balance between exploitation and exploration; specifically, $A \to 0$ corresponds to pure exploitation, and $A \to \infty$ indicates pure exploration. Because $s(x)$ reflects the uncertainty of the prediction, it can be used on its own as a utility measure for query selection, which is referred to as uncertainty sampling [20]. Expected improvement (EI) is another such quantity; it computes the amount of improvement to be expected given the mean $\hat{y}(x)$ and estimated standard deviation $s(x)$:
$$\mathrm{EI}(x) = \left(y_{\min} - \hat{y}(x)\right)\Phi\!\left(\frac{y_{\min} - \hat{y}(x)}{s(x)}\right) + s(x)\,\phi\!\left(\frac{y_{\min} - \hat{y}(x)}{s(x)}\right) \tag{4}$$
where $\Phi(\cdot)$ and $\phi(\cdot)$ are the CDF and PDF of the standard normal distribution, and $y_{\min}$ is the smallest observed value in the training set. Leave-one-out (LOO) cross-validation is an error-based approach that measures the leave-one-out prediction error using the training samples [52]. Given a new instance $x$ and training sample $(x_i, y_i) \in T$, the infill function is defined as [53, 54]:
where
is the distance between $x$ and its closest training point $x_c$. The error estimate function is
is the leave-one-out prediction at $x_c$. The query points are the ones with the largest values of $v(\cdot)$. Here the distance term accounts for exploration, while the error function corresponds to exploitation. This measure is likely to query points with significant prediction uncertainty, helping to improve model accuracy [54]. Similarly, density-weighted heuristics take into account both content information and input space region density for the new instance $x$ [20]:
where $|X_u|$ is the cardinality of the current set of unevaluated samples. The first term uses uncertainty sampling measured by the estimated standard deviation; the second term is the information weight that is calculated by averaging the distances to all the instances in $X_u$; $\beta$ is a hyperparameter that balances the relative importance of both terms. The distance metric can be chosen as Euclidean distance or cosine similarity, depending on the problem.
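The query criteria described in this section can be sketched as small scoring functions. This is a minimal illustration under stated assumptions: the lower bound is taken as the prediction minus $A$ times the estimated standard deviation, expected improvement uses its standard closed form for minimization, and the density-weighted score is assumed to be the estimated standard deviation multiplied by the mean Euclidean distance to the unevaluated pool raised to the power $\beta$. All function names are hypothetical, and the LOO-based criterion is omitted.

```python
import math


def normal_pdf(u):
    """Standard normal probability density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_cdf(u):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def lower_bound(y_hat, s, A=2.0):
    """Statistical lower bound; smaller values are queried first."""
    return y_hat - A * s


def expected_improvement(y_hat, s, y_min):
    """Expected improvement over the best observed value y_min (minimization)."""
    if s <= 0.0:
        # No predictive spread: improvement is deterministic.
        return max(y_min - y_hat, 0.0)
    u = (y_min - y_hat) / s
    return (y_min - y_hat) * normal_cdf(u) + s * normal_pdf(u)


def density_weighted(s, x, pool, beta=1.0):
    """Assumed form: uncertainty times (mean distance to the pool) ** beta."""
    mean_dist = sum(math.dist(x, x_u) for x_u in pool) / len(pool)
    return s * mean_dist ** beta


def select_queries(scores, k, largest=True):
    """Return the indices of the k best-scoring pool instances."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=largest)
    return order[:k]
```

For uncertainty sampling, expected improvement, and the density-weighted score, `select_queries` would be called with `largest=True`; for the lower bound it would be called with `largest=False`, since the instances with the smallest bound are of interest.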
Model Evaluation
The predictive model should be evaluated before updating the training set. Cross-validation is a model validation method for parameter and accuracy estimation. It partitions the samples into two subsets: the analysis is performed on one subset (training set), and the analysis is validated using the other subset (validation set). Cross-validation measures how well the model will generalize to an independent set and helps prevent overfitting. Because each tree in a random forest is trained on a bootstrap sample, the out-of-bag (OOB) error is used instead. The OOB prediction for an instance $(x_i, y_i)$ is the average prediction of those trees whose bootstrap samples did not contain it [47, 55]. The OOB error can be used to perform parameter estimation in the Bayesian optimization. The measure for model accuracy here is the root mean square error (RMSE) [56]:
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2} \tag{7}$$
where $m$ is the number of validation points. Other metrics such as maximum absolute error and $R^2$ may also be considered [42]. The true RMSE for the random forest model is approximated using a test set, composed of additional circuit topologies and the corresponding true performances. The error between the true outputs and the predictions given by the random forest approximates the true prediction error. In other words, the test set provides an unbiased evaluation of the final random forest model fit on the entire training set.
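The accuracy measure can be computed directly from paired true and predicted values. The short sketch below assumes both are given as equal-length sequences of floats; the function name `rmse` is chosen here for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error over m paired validation points."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    m = len(y_true)
    return math.sqrt(sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / m)
```

The same function applies whether the pairs come from OOB predictions on the training set or from a held-out test set.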
Training Data Set Update
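Once the query data $(X_q, Y_q)$ have been determined in the previous step, the training set and unevaluated set can be updated respectively:
$$X_l \leftarrow X_l \cup X_q, \qquad Y_l \leftarrow Y_l \cup Y_q \tag{8}$$
$$X_u \leftarrow X_u \setminus X_q \tag{9}$$
The stopping condition is either 1) the RMSE falls below $\epsilon$, or 2) the active learning process reaches a predefined number of iterations $n_{\mathrm{iter}}$.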
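The overall pool-based procedure (initial sampling, model construction, query selection, set update, and stopping test) can be summarized in a short sketch. It is a generic skeleton, not the implementation used in this work: the callables `evaluate`, `fit`, `score`, and `validate` are hypothetical placeholders for the true response function, the predictive model training routine, the query criterion (larger is better), and the error estimate.

```python
import random


def active_learning_loop(pool, evaluate, fit, score, validate,
                         n_init, batch, n_iter, eps, seed=0):
    """Pool-based active learning with batch queries.

    pool     : list of candidate instances (all enumerated topologies)
    evaluate : instance -> true response (expensive)
    fit      : (X, Y) -> model
    score    : (model, instance) -> query utility, larger is queried first
    validate : model -> error estimate (e.g., an RMSE)
    """
    rng = random.Random(seed)
    unlabeled = list(range(len(pool)))
    rng.shuffle(unlabeled)
    labeled, unlabeled = unlabeled[:n_init], unlabeled[n_init:]
    y = {i: evaluate(pool[i]) for i in labeled}
    model = fit([pool[i] for i in labeled], [y[i] for i in labeled])

    for _ in range(n_iter):
        # Stop early if the model is accurate enough or the pool is exhausted.
        if validate(model) < eps or not unlabeled:
            break
        ranked = sorted(unlabeled, key=lambda i: score(model, pool[i]),
                        reverse=True)
        query = ranked[:batch]
        for i in query:
            y[i] = evaluate(pool[i])          # expensive true evaluations
        labeled.extend(query)                 # training set grows
        chosen = set(query)
        unlabeled = [i for i in unlabeled if i not in chosen]  # pool shrinks
        model = fit([pool[i] for i in labeled], [y[i] for i in labeled])
    return model, labeled, unlabeled
```

Replacing `score` with a function that returns a random number recovers the random sampling benchmark, which makes the comparison between query strategies a one-argument change.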
DESIGN PROBLEM
Problem Statement
Here we present the case study for circuit design synthesis. The set of circuit elements shown in Fig. 2, with replicates, can be combined to construct a circuit such as the one shown in Fig. 3. Resistors (R), capacitors (C), and inductors (L) are 2-port components, meaning that they must have connections to two other components. The required 1-port components include the input (I), the output (O), and the ground (G). Various multiport (>2) common-voltage nodes are also available (e.g., the 4-port common-voltage node N4). Other component types are possible, but many circuits of practical importance, such as analog filters, can be constructed using this type of component catalog. Example circuit topologies are shown in Figs. 3 and 4. Figure 4a incorporates three 1-port nodes (I, O, G), one 7-port voltage node (N7), two 4-port voltage nodes (N4), and twelve 2-port nodes (R, C).
In the circuit synthesis problem, the circuit components (I, O, G) are fixed. A model may be constructed for a complete circuit by identifying the transfer function G between the input and output. We would like to synthesize practical circuits that satisfy the following target frequency response [4] :
where the frequency range of interest is:
evaluated over 500 logarithmically-spaced points. Herber successfully enumerated and evaluated circuits for this problem with up to 6 general impedance elements and only (R, C) 2-port components [17]. All circuit topologies represented as labeled graphs were enumerated under a set of specifications using the enumeration algorithms in Ref. [16]. To evaluate the performance given a desired circuit topology, we consider the following minimization problem [17]:
$$\min_{z} \; \sum_{k} r_k^2(z) \tag{12a}$$
$$\text{subject to:} \quad l \le z \le u \tag{12b}$$
where $z$ is the vector of optimization variables, representing the coefficients for the 2-port elements (R, C); the individual residual $r_k = g(\omega_k, z) - f(\omega_k)$ is the difference between the transfer function magnitude $g(\omega_k, z)$ and the desired circuit magnitude response $f(\omega_k)$ specified in Eqn. (10); and $(l, u)$ are the lower and upper bounds for $z$.
Data
To test the active learning framework, we first obtained two sets of the circuit synthesis data by specifying different simple bounds on the resistors and capacitors [17] :
The circuit structure space was predefined by a collection of vectors, including the distinct component types, the number of ports for each component type, and the lower and upper bounds on the number of replicates for each component type. Network structure constraints (NSCs) were used to define the feasibility of the graphs. A collection of 43,249 circuit topologies (denoted as $X$) with unique transfer functions were enumerated, so that the corresponding circuit performance could be evaluated. For more details see Refs. [16, 17].
Data Preprocessing
Each circuit topology $x_i$ can be represented as a labeled-vertex graph and has a corresponding adjacency matrix. For a given catalog, the number of chosen components may vary across topology candidates. As a result, the adjacency matrix dimension can vary as well. Figures 4a and 4b share the same complexity number (12, computed from the number of 2-port components), but the total number of components is different. The adjacency matrices have dimensions 18 × 18 and 16 × 16, respectively. In addition, the adjacency matrix representation is not as compact as other possible representations. To address this, a preprocessing step was performed. We represented all possible components using a fixed-dimension adjacency matrix (large enough for the maximum number of each of the component types). However, this results in a fairly sparse matrix representation (feature space). Feature hashing, an efficient way of vectorizing features, is used to overcome this issue, turning the adjacency matrix feature space into a compact vector representation [57]. Feature hashing has been used widely in document classification tasks, where features are mapped to their hash values. In other words, the hashing trick transforms the high-dimensional vector into a lower-dimensional feature space [58].
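The hashing trick can be illustrated on a small labeled graph. The sketch below is one possible encoding and is an assumption, not the preprocessing used here: each connected pair of component labels is turned into a token, and the token is hashed to an index (and a sign) in a fixed-length vector. The function name `hash_adjacency`, the token format, and the vector length are hypothetical.

```python
import hashlib


def hash_adjacency(labels, adjacency, n_features=64):
    """Map a labeled undirected graph to a fixed-length signed count vector.

    labels    : component label for each vertex, e.g. ["I", "R", "O"]
    adjacency : square 0/1 matrix (list of lists), assumed symmetric
    """
    vec = [0] * n_features
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):            # upper triangle only
            if adjacency[i][j]:
                # Sort the two labels so that edge direction does not matter.
                token = "-".join(sorted((labels[i], labels[j])))
                digest = hashlib.md5(token.encode("utf-8")).digest()
                h = int.from_bytes(digest[:8], "big")
                index = h % n_features       # bucket from the hash value
                sign = 1 if (h >> 63) == 0 else -1   # sign from the top bit
                vec[index] += sign
    return vec


# Input - resistor - output chain, listed in two different vertex orders.
chain = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
v1 = hash_adjacency(["I", "R", "O"], chain)
v2 = hash_adjacency(["O", "R", "I"], chain)
```

Because the vector length is fixed in advance, topologies with different numbers of components map to features of the same dimension, at the cost of occasional hash collisions between distinct tokens.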
RESULTS

Comparison of Query Strategies
We randomly selected an initial training set of 10,000 circuits that were then evaluated to obtain the true response. Because the true response ranges from $10^{-5}$ to $10^{2}$, the natural logarithm was applied to scale the performance values. In each iteration, 500 data points are queried and added to the training set according to the criteria specified in Eqs. (3)-(6). The learning process ends after $N_{\mathrm{iter}} = 20$ iterations are completed. We then compare the different query strategies against a benchmark random sampling strategy. In random sampling, the 500 data points are sampled randomly and added to the training set during each iteration. For the lower bound strategy, $A = 2$ was chosen so that the criterion minimizes the lower bound of the 95% confidence interval, and for the density-weighted method, $\beta = 1$ was used. Figure 5a presents the learning curve for Set 1. In the figure, the blue solid line corresponds to the random sampling benchmark. The RMSEs for two of the query strategies under consideration, lower bound and expected improvement, are almost identical to that of random sampling after 20 iterations. Uncertainty sampling and density-weighted sampling need about 15,000 training samples to reach RMSE = 2.55 (at Iteration 10), whereas random sampling may require 17,000 samples (at Iteration 14). Using one of these two more successful query methods can therefore reduce the number of true function evaluations by 2,000 to achieve the same desired RMSE. It is observed that the uncertainty and density-weighted sampling methods are better than the others. The statistical lower bound and EI rely on both the prediction and the variance, but perform worse than uncertainty sampling, which uses the variance only. This may indicate that the prediction itself is less critical to the circuit synthesis problem. The space region information in the density-weighted method may be useful in improving the model accuracy. Figure 5b, for Set 2, exhibits different behavior.
Only uncertainty sampling outperformed random sampling. The other strategies, lower bound and expected improvement, perform worse than random sampling for Set 2. Because both sets were obtained by specifying different bounds on the components, these results also indicate that query strategy performance depends on data set properties.
Ranking Distance
While RMSE is a frequently used measure of the difference between the predicted and true observed values, the ordering or ranking of the predictions is also important. In the circuit synthesis problem, a correct ranking of the predicted values would facilitate the process of identifying high-performance circuit topologies. Here we specifically investigated the rank distance between the predicted values and observed values.
Kendall tau rank distance $K(\tau_1, \tau_2)$ is a metric that measures the number of pairwise disagreements, or dissimilarity, between two ranking lists $(\tau_1, \tau_2)$ of size $n$ [59]. If two lists are identical, then $K = 0$; and if two lists are opposite to each other, then $K = n(n-1)/2$. Often $K$ is normalized by dividing by $n(n-1)/2$ so the normalized distance $\bar{K}$ lies in the interval $[0, 1]$. A value of $\bar{K} = 0.5$ implies that the ordering of one of the lists was completely randomized.
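The rank distance can be computed by counting discordant pairs. The following minimal sketch takes two equal-length sequences of scores (for example, true and predicted performance for the same circuits) and assumes there are no ties; the function name `kendall_tau_distance` is chosen here for illustration.

```python
from itertools import combinations


def kendall_tau_distance(a, b, normalize=True):
    """Count pairs of items ordered differently by the two score lists.

    a, b : scores (or ranks) for the same n items, in the same item order.
    With normalize=True the count is divided by n(n-1)/2.
    """
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    discordant = sum(
        1
        for i, j in combinations(range(n), 2)
        if (a[i] - a[j]) * (b[i] - b[j]) < 0   # opposite relative order
    )
    if not normalize:
        return discordant
    return discordant / (n * (n - 1) / 2)
```

This direct pairwise count is quadratic in $n$, so for the full set of 43,249 designs an $O(n \log n)$ merge-sort-based count or a library routine would be preferable.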
Figures 6 and 7 report the normalized Kendall tau distances for both data sets with uncertainty sampling applied in the active learning strategy. Distances are reported for the training set, the test set, and the all set. The training set contains the samples after 20 iterations, and the remainder of the data set is termed the test set. We are also interested in how well the predictive model performs on the ordering of all 43,249 available circuit designs (termed the all set); it is useful for designers to be able to select the best designs, which is possible if the rankings are preserved. In Set 1, a large portion of the ranking is preserved, with Kendall tau distances between 0.17077 and 0.24021 for the different sets. Similar observations can be made for Set 2 in Fig. 7. The Kendall tau distances through the iterations were also investigated in Figs. 8a and 8b. In both sets, the training process becomes stable at Iteration 15, indicating convergence. One may further improve the training process by increasing the number of iterations, but in practice this will require more true circuit evaluations during the model construction. A cost-effectiveness analysis should be performed to assess the potential benefit.
DISCUSSION
The numerical results show that, for some query strategies, active learning can reduce evaluation cost compared to random sampling for the circuit synthesis problem. Among the tested query methods, uncertainty sampling performs the best. Here the estimated standard deviation given by the random forest algorithm captures the information needed to select the query points. A number of other query strategies could be investigated with the potential for further evaluation cost reduction. For instance, query by disagreement (QBD) [60, 61] and query by committee (QBC) [62, 63] may be appropriate for the ensemble method. A clustering-based method could also be considered [64, 65]. It might be useful to cluster the unevaluated data $X_u$ before applying either uncertainty sampling or the density-weighted approach to each cluster. These methods have in the past primarily been used for classification problems in text mining. It is still worth exploring whether they are applicable to certain engineering design problems, such as circuit synthesis.
One drawback to consider is the additional computational cost required by active learning during the query phase; this cost may not be worth the gains. For instance, the LOO error in Eq. (5) is computationally expensive to evaluate (a topic of ongoing work), because an additional $N_l$ random forest models must be retrained to obtain the leave-one-out predictions in each loop. This is similar to expected error reduction, where the future error for each query must be estimated, and a new model has to be retrained for every possible query over the entire unevaluated pool [66]. Variance reduction, often known as optimal experimental design, is also limited [67, 68]. It can only be applied to certain types of models, such as linear/nonlinear and logistic regression, and hence is not generalizable. Whether or not the variance reduction method can be extended to tree- or nearest-neighbor-based machine learning algorithms is still an open question [20]. Moreover, the variance reduction method involves inversion and manipulation of the Fisher information, and it turns out to be slow and inefficient when a large number of parameters are to be estimated. While the normalized Kendall tau distance indicates that a large portion of the ordering has been preserved, improvements may be expected, as the Kendall tau distances for both training sets are only around 0.17. As with the RMSEs, more query strategies or learners could be studied to further analyze the appropriateness of different methods for learning in engineering synthesis problems. For example, gradient boosting is another ensemble method with the goal of reducing bias [69] [70] [71]; other learners such as a radial basis function network may be applicable [30]. These studies and analyses are left as future work.
It is observed here that active learning has some similarities with surrogate (or meta) modeling methods for design optimization. Wang and Shan summarized three classes of metamodel-based design optimization (MBDO) strategies [42]. The MBDO techniques also require sampling and construction of approximation models. In particular, adaptive MBDO and direct sampling approaches use an iterative mechanism to build, validate, and optimize metamodels. These MBDO strategies have been extended to multi-objective surrogate modeling [72] [73] [74]. However, there are also some distinctions. First, the vector of design variables $x$ in MBDO is often assumed to be continuous, and it serves as the solution to the global optimization problem. In active learning, an instance $x$ is an observation drawn from a certain probability distribution, and the feature space could be either numerical or categorical. Second, in machine learning or active learning, it is often assumed that observations $(X, Y)$ are available (i.e., they exist in terms of data, rather than design variables), or at least that $X$ is easy to obtain but $Y$ involves a high computational expense. An initial set of training samples can be drawn from available data. Surrogate modeling often generates initial samples via space-filling design methods, such as Latin hypercube sampling [33, 75], and uses them as the training samples for approximate model construction. Finally, query selection in active learning samples points from the unevaluated data pool based on the predictions. Adaptive MBDO performs the optimization and validates the model in the loop to determine re-sampling (or obtain additional samples) and update the metamodels. The direct sampling approach samples toward the optimal solution given by the metamodel.
Surrogate modeling utilizes the optimization in the loop to aid the re-sampling scheme and construction of the metamodel, whereas active learning updates the approximation model iteratively using information from the unlabeled or unevaluated data in the pool.
As part of ongoing work to address the Case 2 synthesis problems discussed in Section 1, we are investigating techniques for generating circuit topologies that implicitly satisfy NSCs. This eliminates the need to enumerate all unique, feasible topologies, supporting approximate solution of synthesis problems where the catalogs are impractically large for enumeration. This approach narrows the search space through an intelligent mapping. A generative model is a probabilistic model capable of producing both the observations and targets in the data set. Restricted Boltzmann machines [76], variational auto-encoders (VAEs) [77], and generative adversarial networks (GANs) [78] are generative models used for modeling observations and targets drawn from a certain joint probability distribution. Guo et al. developed an indirect design representation for topology optimization in heat conduction design using VAEs [79]. For the circuit synthesis problem, GANs could be a promising option for generating circuit topologies by sampling.
CONCLUSION
In this article, we presented an active learning strategy for reducing topology evaluation cost for a circuit synthesis problem. We aimed to address the problem where it is possible to enumerate all unique and feasible topologies, but exhaustive evaluation is impractical (Case 1 synthesis problem). Here we constructed a predictive model using a random forest to approximate true circuit topology performance. The active learning strategy interactively queried informative samples from the unevaluated pool to construct iteratively improved approximations, while reducing the number of training samples required. A number of query strategies were tested and compared. The active learning strategy helps reduce the evaluation cost for circuit synthesis because 1) we can avoid using the true evaluation function for each circuit topology; and 2) the predictive model can make more accurate predictions for a given number of training samples. The numerical experiments indicate that the uncertainty sampling query strategy is most effective among the tested methods for circuit synthesis. Through the active learning experiments, we found that uncertainty and topology structure may play critical roles in improving the approximation model accuracy and make a significant contribution to reducing the system evaluation costs.
Future work should involve the exploration of more robust and efficient query criteria. While a number of query strategies have been presented, not all of the strategies outperform the random sampling benchmark. Recent studies indicate that no query selection strategy is consistently or clearly the best [20], because performance relies heavily on the learners and applications. For example, Jin et al. compared different metamodeling techniques and observed that the robustness and accuracy of various surrogate models depend on the nonlinearity of the problems [80]. As a result, it is worth further investigating the selection of the learners, as well as understanding circuit synthesis and other heterogeneous topology optimization problems more deeply with respect to active learning solution methods. Even though only a specific test problem was investigated, the active learning strategy could be extended to other architecture system design tasks. For instance, the active learning strategy is applicable to the low-pass filter problem [17]. The active vehicle suspension design can also be posed as an architecture design problem using labeled graphs, and this active learning strategy may be suited for evaluation cost reduction [18]. In addition, other well-established methods, such as Bayesian inference and MBDO, have contributed to important advancements within and outside the design research community. A comprehensive comparison may be conducted where these methods are applied to the same circuit synthesis problem. Active learning strategies for circuit synthesis and other similar engineering problems have not yet been well studied, and we hope the work presented here will serve as a basis for productive future research.
ACKNOWLEDGMENTS
This work was supported by the National Science Foundation Engineering Research Center for Power Optimization of Electro-Thermal Systems (POETS) under Grant EEC-1449548.
