Abstract
When performing time-intensive optimization tasks, such as those in topology or shape optimization, researchers have turned to machine-learned inverse design (ID) methods—i.e., predicting the optimized geometry from input conditions—to replace or warm start traditional optimizers. Such methods are often optimized to reduce the mean squared error (MSE) or binary cross entropy between the output and a training dataset of optimized designs. While convenient, we show that this choice may be myopic. Specifically, we compare two methods of optimizing the hyperparameters of easily reproducible machine learning models (random forests, k-nearest neighbors, and deconvolutional neural networks) on three topology optimization problems. We show that, both under direct inverse design and when warm starting further topology optimization, using MSE metrics to tune hyperparameters produces less performant models than directly evaluating the objective function, though both produce designs that are almost one order of magnitude better than the common uniform initialization. We also illustrate how warm starting impacts the convergence time, the types of solutions obtained during optimization, and the final designs. Overall, our initial results suggest that researchers may need to revisit common choices for evaluating ID methods that subtly trade off factors in how an ID method will actually be used. We hope our open-source dataset and evaluation environment will spur additional research in these directions.
1 Introduction
Design optimization, such as topology optimization (TO) or shape optimization, frequently requires expensive (in both time and computing resources) iterations to converge. For example, following the implementation of the governing equations and required parameters in a computational environment, TO problems typically require the iterative solving of these equations followed by updates to the problem conditions after each iteration. These computational expenses become significant, and sometimes prohibitive, in cases that require large numbers of calculations per iteration, large numbers of iterations, where computational resources are limited, or where a good solution is needed in a short duration of time. In response, researchers have tried to circumvent this iterative process via inverse design (ID)—training a machine learning (ML) model to directly output an optimal design for a new problem, given a dataset of past (typically expensive) physics-based optimizations [1,2]. In cases where such a dataset is available and one needs to evaluate many new input conditions or requirements quickly, ID methods can often provide significant time savings compared to optimizing a design for each bespoke input condition [1,2].
Inverse design models are typically trained to minimize the pointwise mean squared error (PMSE) between the ID model's predicted geometry and the optimized geometry for a given input condition. This standard choice results from formulating ID as a supervised learning problem—input conditions in and optimized designs out—and measuring the output’s discrepancy with training or test samples. Researchers typically optimize any hyperparameters of such models in a similar fashion.
However, an ID method’s ultimate goal can differ from the above aim. Are we using the predictions to capture, as accurately as possible, the geometry or design itself? Or do we care just about outputting high-performance designs, irrespective of how closely they match the training set? More importantly, are we using the predicted designs as-is, or using them to accelerate further optimization (i.e., warm starting)? Does warm starting with ID methods actually help, and if so, how and when? Is mean squared error (MSE) always the best thing to optimize? This paper addresses some of these questions.
Specifically, we use different and easily reproducible machine learning models (random forests, k-nearest neighbors, and deconvolutional neural networks) to demonstrate how ID predictions impact the topology optimization problems we considered for this paper.2 These problems include a classical structural compliance topology optimization problem [3] and both 2D and 3D conduction problems governed by the Poisson equation [4]. We examine multiple measures of ID performance, how the ID predictions modify the optimization process compared to a standard benchmark, and what, if any, effects altering the hyperparameter tuning method has on our results. The overall contributions of this paper are as follows:
We formulate three inverse design problems: the design of two-dimensional and three-dimensional heat conduction structures based on the problem described in Ref. [4] and shown in Fig. 1, and a classical cantilever beam problem described in Ref. [3] and shown in Fig. 2. This results in datasets and ID evaluation environments that we make available to the research community, along with performance diagnostics that shed light on how optimizers are affected by the warm start predictions provided by ID methods.
We compare the performance of k-nearest neighbors, random forest, and deconvolutional neural network models on these inverse design problems across multiple metrics, including MSE and objective function value, both for the initial prediction and as a warm start to an adjoint optimizer. We provide both aggregated results (Figs. 3, 5, and 7) as well as illustrative examples (Fig. 9) that shed light on how adjoint optimizers adjust to warm starting by ID methods.
We compare two methods for optimizing the hyperparameters of those ID models: one that minimizes the PMSE and another that minimizes the objective function value of the ID predictions at iteration 0, which we call the prediction objective function minimization method (POFMM). We show in Figs. 4, 6, and 8 and Tables 2–4 that models optimized with POFMM outperform those trained using PMSEM.

The physical layout of the topology optimization problem of a cantilever beam: (left) design domain of a cantilever beam with design parameters of force location (h) and direction () on the free side of the beam, (right) topology-optimized beam

The normalized initial optimality gap of PMSEM- and POFMM-optimized KNN, RF, and DeCNN models for the 2D heat conduction problem. The lines in the middle of each box plot represent the median, and the dashed line represents the optimal value achieved under uniform initialization (the control condition).


The normalized initial optimality gap of PMSEM- and POFMM-optimized KNN, RF, and DeCNN models for the 3D heat conduction problem. The lines in the middle of each box plot represent the median, and the dashed line represents the optimal value achieved under uniform initialization (the control condition).


(Left) Median normalized optimization trajectories for the tested initialization techniques and (right) a detailed view of initial ID models’ warm start trajectories

The normalized initial optimality gap of PMSEM- and POFMM-optimized KNN, RF, and DeCNN models for the 2D cantilever beam problem. The lines in the middle of each box plot represent the median, and the dashed line represents the optimal value achieved under uniform initialization (the control condition).


The evolution of 2D heat conduction designs over the course of the optimization process. Here, the trajectory referenced as “KNN” refers to the trajectory initialized with the prediction of a KNN model optimized using PMSEM. The control trajectory is initialized with a constant distribution set to the volume limit. Note that the mass distributions shown here are plotted in the function form used by the optimizer, and hence display interpolation between points.

2 Background and Related Work
In this section, we provide background on ID problems in general, the specific ID methods we use, and the needed background on the physical problems the paper addresses.
2.1 Inverse Design.
Typical machine learning approaches to accelerating optimization or automating design under new input conditions (e.g., new boundary conditions or altered constraints) involve training a surrogate model that takes a set of design variables as input (e.g., an airfoil mesh) and produces an estimate of the physical behavior of a device as an output (e.g., a resultant flow field). Optimization can then be accelerated by using the cheaper or faster surrogate to perform gradient-based or gradient-free optimization. While widely used, these approaches can be expensive to train well on high-dimensional spaces, since they require learning a function that maps all possible design inputs to the (possibly multiple) objective functions.
In contrast, a different approach, which we refer to as ID throughout this paper, attempts to directly predict the optimal design (e.g., an optimized airfoil mesh) given target input conditions (e.g., a desired Reynolds or Mach number) [6]. This bypasses the typical need to learn or approximate an entire partial differential equation-based solution field or performance quantity, and instead focuses only on learning the mapping from the input conditions to the final optimized design. These predicted designs can either be used in place of an existing optimizer, or they can warm start or accelerate an existing optimizer by providing a high-quality initial guess [6]. While it may seem at first glance more difficult to learn this function compared to traditional surrogate models, note that if one has an existing dataset of optimized designs along with their input conditions, and the space of input conditions is smaller (e.g., lower dimensional) than the space of design variables, then the sample complexity needed to learn this mapping can be much smaller than that needed to cover the space of design inputs required by surrogate models.
Currently, in the field of mechanical engineering, a large amount of work has been done investigating the utility of ID in efficiently designing and characterizing materials (particularly nanomaterials and metamaterials) and microstructures [1,6–9]. The area of design for electromagnetic wave manipulation (e.g., nanophotonics) is particularly active [10–14]. ID methods have also been applied to problems in areas including molecular discovery [15], additive manufacturing [16], airfoil design [6,17], and imaging [18]. Due to their typically greater sample efficiency relative to surrogate modeling techniques, ID methods have also been explored as a means of accelerating the optimization process, often by providing good initialization points [6,12,19–21].
Inverse design has emerged as a prominent approach to tackle the limitations associated with classical TO, such as costly iterations and susceptibility to local optima due to their gradient-based nature. Recent advancements in the field have showcased innovative methodologies combining optimization and machine learning techniques. For instance, Nie et al. introduced an end-to-end TO framework named TopologyGAN, which exploits various physical fields computed on the original, unoptimized material domain to predict optimal topology domains [22]. In parallel, Wang et al. demonstrated the use of U-net, achieving a substantial reduction in computation cost with minimal impact on the performance of design solutions [23]. Noteworthy contributions include the utilization of the diffusion model by researchers for high-performance inverse design in TO problems [24,25]. Regenwetter et al. conducted a comprehensive review and analysis of deep generative machine learning models in engineering design, specifically focusing on TO problems, providing valuable insights for interested readers [26]. Additionally, our group conducted a study that quantitatively addresses the question of when it is worthwhile to incorporate inverse design compared to optimizing designs without machine learning assistance [27].
Beyond applications to inverse design, prior work across a range of fields has studied formal or automated approaches to model selection. This includes work in what is referred to as automatic machine learning, for which Ref. [28] provides a recent overview relevant to engineering design for interested readers. In addition, the surrogate modeling community has pursued methods for model selection in predicting the forward performance model of a design. This includes, as one exemplar, recent work in concurrent surrogate model selection [29], which uses criteria other than the mean squared error to enable more robust model selection. For example, using outlier-insensitive measures of location, such as the median or mode, produced more robust surrogate models than typical MSE-based measures. In contrast to past work that has focused primarily on automating model search for high-accuracy surrogate models, this paper focuses on predicting design variables directly (inverse design), and specifically on understanding the causal effect of optimizing design-centered error measures (MSE) versus optimizing performance measures (the optimality gap).
While researchers have studied inverse design using a wide variety of predictive models, in this paper, we chose three model families of non-linear supervised learning approaches ranging from fairly simple prototype-based methods, such as k-nearest neighbors (KNN), to ensembles, such as random forests (RF), and to adaptive basis function methods, such as deconvolutional neural networks (DeCNN). We chose these models since they can provide a readily reproducible benchmark for future research in this area while allowing us to rigorously study the fundamental questions of interest regarding the effects of ID metrics and warm start behavior. We also attempted kernel-based methods, such as Gaussian processes, but these methods could not ultimately scale to the larger problems we show later in the paper, and thus we exclude them here. With this in mind, we will now provide some brief background on the KNN, RF, and DeCNN models, which have been applied to inverse design problems, and in particular, topology optimization problems [30–32]. While length constraints prevent us from providing a thorough description of each algorithm and its myriad variants, we provide for each pointers to additional literature for interested readers who wish to learn more.
2.1.1 K-Nearest Neighbors.
K-nearest neighbors is an algorithm that classifies (or, in the case of regression problems, assigns a value to) an unknown data point by combining (typically through averaging) the values of the training points closest to it. The Euclidean distance is often used to assess proximity between points, although other distance metrics are frequently used. The selection of the number of neighbors (k) can have a large effect on model performance [33]. It is one of the simplest non-linear ML methods, which makes it an appropriate baseline benchmark, even if it can struggle to extrapolate well beyond nearby training data. Readers interested in further details are directed to Refs. [34,35].
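As a minimal illustration of how a KNN regressor can act as an ID model in our setting, one can regress a flattened density field directly on the design inputs. This is a sketch only; the array shapes, grid resolution, and data here are placeholders rather than values from the paper's code.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Toy stand-ins: each row of X is (volume fraction, adiabatic length),
# and each row of Y is a flattened optimized density field.
rng = np.random.default_rng(0)
X_train = rng.uniform([0.3, 0.0], [0.6, 1.0], size=(361, 2))
Y_train = rng.random((361, 41 * 41))  # a 41x41 output grid is an assumption

knn = KNeighborsRegressor(n_neighbors=5, weights="distance")
knn.fit(X_train, Y_train)

# Predict an "optimized" topology for a new input condition.
x_new = np.array([[0.36, 0.45]])
rho_pred = knn.predict(x_new).reshape(41, 41)
```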
2.1.2 Random Forests.
The random forest technique is an ensemble method that employs several decision trees to classify (or, in the case of regression, assign a value to) a data point [36]. This method generates multiple decision trees by selecting random features to use for each tree and then constructing the decision tree by computing the optimal “split point” via the Gini-index cost [33]. After constructing several trees, the “forest” makes a prediction by aggregating decisions across those trees. It is one of the most widely used and implemented ensemble methods, and interested readers can learn more in the original paper [36], or via Ref. [34] or Ref. [35].
2.1.3 Deconvolutional Neural Network.
Deconvolutional neural networks, also known as transposed convolutional neural networks, have emerged as a powerful tool in the field of inverse design. These models exhibit a structure similar to that of CNNs, but with reversed operations. These models have gained significant popularity in the realm of inverse design, particularly in the domains of topological and shape optimization problems [27,37]. In this paper, we explore the potential and limitations of DeCNNs in tackling complex pattern learning and achieving superior performance compared to simpler models. The concept of DeCNNs was first introduced by Zeiler et al., who proposed an innovative approach to learn feature maps in a reverse manner [38]. Their work laid the foundation for the application of DeCNNs in various tasks, such as image reconstruction and visualization [39,40]. DeCNNs have demonstrated their effectiveness in handling a wide range of functions, allowing them to learn complex patterns and achieve lower test mean squared errors compared to simpler models. However, it is important to acknowledge certain limitations inherent to DeCNNs, including the need for optimizing numerous hyperparameters and requiring a substantial amount of training data to attain desirable results, which may not make them the most cost-effective option [27].
2.2 Background on Chosen Topology Optimization Problems.
One type of problem that lends itself to inverse design approaches is topology optimization. In topology optimization, one seeks to determine the distribution of material in a space that best satisfies certain performance criteria [41].
3 Methodology
To address the contributions mentioned in the Introduction, our methodology has the following main steps: (1) defining the heat conduction and cantilever beam topology optimization problems and generating the corresponding datasets, (2) training and optimizing our specific ID methods, and (3) measuring and evaluating the results from the ID methods.
3.1 Dataset Creation Via 2D and 3D Heat Conduction Topology Optimization.
To create a dataset of realistic yet manageable benchmark problems for our iterative design experiments, we built upon a classical thermal compliance example with Poisson equation constraints from Ref. [5], which we further describe in the following.
The optimization problems minimize the thermal compliance of given geometries while adhering to constraints on the volume of highly conductive material used and the presence of an adiabatic region. The adiabatic region refers to a specified length on the bottom side of the 2D problem space or a prescribed square area on the bottom surface of the 3D problem space (for further details, refer to the above background section and Fig. 1).
To generate our dataset for the 2D/3D problem set, we explored how the optimal design changed as a function of two input parameters: the upper limit of material volume within the unit square/cube (referred to as the volume fraction) and the length/area of the adiabatic region. We selected values for the volume limit ranging from 0.3 to 0.6, as our chosen interior point solver (IPOPT) produced reliable results within that range, and values within that range exhibited sufficient topological variability. As for the adiabatic region length/area, we chose values between 0 (representing the absence of an adiabatic region) and 1 (corresponding to an entire side of the unit square/cube being adiabatic). The adiabatic region for the 3D problem is defined as a square area on the bottom side of the cube with a symmetric distance from the edges. We divided each design input range into 20 segments, resulting in 21 values of interest for each parameter. Consequently, the volume bounds were sampled at 0.3, 0.315, 0.33, …, 0.6, while the lengths/areas were sampled at 0, 0.05, 0.1, …, 1.0. By combining each volume limit value with each adiabatic region length/area value, we generated 441 optimized topologies using the interior point solver, iterating until convergence.
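For concreteness, the grid of input combinations described above can be generated as follows (a sketch with our own variable names, not the paper's code):

```python
import numpy as np
from itertools import product

# 21 evenly spaced values per design input, as described above.
volume_fractions = np.linspace(0.3, 0.6, 21)   # 0.3, 0.315, ..., 0.6
adiabatic_lengths = np.linspace(0.0, 1.0, 21)  # 0.0, 0.05, ..., 1.0

# 441 (volume limit, adiabatic length/area) pairs, each of which is
# passed to the interior point solver to produce one optimized topology.
design_inputs = list(product(volume_fractions, adiabatic_lengths))
assert len(design_inputs) == 441
```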
For the 2D problem, we employed a mesh, while for the 3D problem, we used a mesh as the design domain. These mesh sizes were chosen to provide fine details within the design space, typical of topologically optimized solutions to these problems (as shown in Fig. 1), without unnecessarily increasing the computational running time for the solver.
For each combination of design parameters, we conducted 100 iterations of optimization, with IPOPT terminating when the tolerance of was satisfied. To perform finite element optimization, we used Dolfin-Adjoint [45,46] in conjunction with IPOPT [43].
Upon completion of the optimization run for a particular set of parameters, the resulting distribution was discretized to transform the data into a format suitable for the regression models used in this study. To achieve this, we divided the unit square into a grid, extracting the mass function value at each intersection point. This process yielded a total of data points per topology in the 2D case. Similarly, for the 3D case, we divided the unit cube into a grid, capturing the mass function value at each intersection point. This grid-based approach enabled us to capture all the pertinent information present in the output of the optimizer, resulting in a dataset of data points per topology.
3.2 Dataset Creation Via Cantilever Beam Topology Optimization.
To test our claim across multiple benchmark problems, we also created a dataset via a classical structure compliance topology optimization problem using the code provided by Andreassen et al. [3], which we detail in the following.
This optimization problem minimizes structural compliance of a cantilever beam (a rectangular domain with a 2:1 ratio) while satisfying the constraints on the volume of material used and the boundary conditions given force location and direction. We used a material volume fraction constraint of and a force magnitude of . The physical layout of this problem is shown in Fig. 2. The parameters and represent the design parameters including the force location on the free side and force direction (for further details, refer to the background section and Fig. 2).
To generate a dataset of optimized topologies needed for our experiment, we explored an input space including the force location and direction. We selected values for the force location ranging from 0 to 1 and force direction ranging from to . We divided each parameter range into 20 segments, resulting in 21 values for each parameter. For example, the force locations were sampled at 0.0, 0.05, 0.1, …, 0.9, 0.95, and the force directions were sampled at , , , …, ,, and . By combining each force location value with each direction value, we generated optimized topologies by running the interior point solver for 1000 iterations or until the optimizer satisfied a tolerance of 0.01 for the displacement.
Upon completion of the optimization run for a particular set of parameters, the resulting distribution was discretized to transform the data into a format suitable for the regression models used in this study. For this problem, we employed an mesh as the design domain. This mesh size was chosen to provide fine details within the design space without unnecessarily increasing the computational running time for the solver. Table 1 summarizes the TO problems we studied.
Summary of TO problems information
Problem name | Inverse design inputs | Design input range | Mesh size |
---|---|---|---|
2D heat conduction | Volume fraction | 0.3–0.6 | |
| Adiabatic length | 0–1.0 | |
3D heat conduction | Volume fraction | 0.3–0.6 | |
| Adiabatic area | 0–1.0 | |
2D cantilever beam | Force location | 0.0–1.0 | |
| Force direction | | |
3.3 Model Training and Cross-Validation.
Since we discretized each possible input into 21 values, we had to assign subsets of these values for training, validation, and testing. To do this, we randomly selected two of the 21 values from each input for each problem to hold out for either validation or testing; the remaining values were used for training. We then took all input combinations in which at least one input takes a held-out value and split these combinations 50/50 between validation and testing. This generates mutually exclusive training, validation, and testing datasets. The validation set is used for hyperparameter selection during the training phase. The testing set, on the other hand, is entirely excluded from training and hyperparameter selection and is used only to evaluate the final, trained models’ performance. Since all three problems have two input parameters, this corresponds to training, validation, and testing sizes of 361, 40, and 40, respectively.
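A minimal sketch of this split for the 2D heat conduction inputs is shown below; the variable names and the exact sampling grid are illustrative assumptions rather than the paper's code.

```python
import itertools
import random

random.seed(0)
# Each input parameter takes 21 evenly spaced values.
param_values = {
    "volume_fraction": [round(0.3 + 0.015 * i, 3) for i in range(21)],
    "adiabatic_length": [round(0.05 * i, 2) for i in range(21)],
}

# Hold out two values per input; the rest are training values.
held_out = {k: random.sample(v, 2) for k, v in param_values.items()}
train_vals = {k: [x for x in v if x not in held_out[k]]
              for k, v in param_values.items()}

# Training set: all combinations of the 19 remaining values per input.
train = set(itertools.product(*train_vals.values()))       # 361 combos
# Held-out set: combinations where at least one input is held out.
all_combos = itertools.product(*param_values.values())     # 441 combos
held = [c for c in all_combos if c not in train]            # 80 combos
random.shuffle(held)
validation, test = held[:40], held[40:]                     # 40 / 40
```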
To implement our models, we used the KNN and RF implementations available in the Scikit-Learn library [47]. For the KNN model, hyperparameter optimization involved finding the optimal settings for both the weighting scheme and the number of neighbors. Similarly, in the RF model, we optimized the number of estimators and the minimum number of samples required in newly created leaves. To implement the DeCNN model, we used Tensorflow [48]. Hyperparameters for the DeCNN included the optimal learning rate and batch size for each problem. Throughout this work, we used a DeCNN architecture that begins with four dense layers of expanding capacity (32, 128, 1024, and 4096 nodes, respectively), each followed by batch normalization and a leaky ReLU activation function (with a negative slope of 0.2). These are followed by five transposed convolutional layers with filter counts decreasing from 128 to 64, 32, 16, and eventually 1. A consistent kernel size of (4, 4) and strides of (2, 2) are used in each transposed convolutional layer to upsample the feature map to the desired output channels. Finally, two fully connected layers with batch normalization and a sigmoid activation function resize the output to the desired resolution.
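The following Keras sketch mirrors that description. The reshape between the dense and transposed convolutional stages and the final output resolution are not specified in the text, so the values used here are assumptions; this is an illustrative reconstruction rather than the exact architecture used in our experiments.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_decnn(input_dim=2, output_shape=(41, 41)):
    """Sketch of a DeCNN matching the description above (reshape size
    and output resolution are assumed)."""
    inputs = layers.Input(shape=(input_dim,))
    x = inputs
    # Four dense layers of expanding capacity, each followed by
    # batch normalization and LeakyReLU (negative slope 0.2).
    for units in (32, 128, 1024, 4096):
        x = layers.Dense(units)(x)
        x = layers.BatchNormalization()(x)
        x = layers.LeakyReLU(0.2)(x)
    # Assumed reshape of the 4096-unit vector into a small feature map
    # so that the transposed convolutions can upsample it.
    x = layers.Reshape((2, 2, 1024))(x)
    # Five transposed convolutional layers with decreasing filter counts,
    # kernel size (4, 4), and strides (2, 2).
    for filters in (128, 64, 32, 16, 1):
        x = layers.Conv2DTranspose(filters, kernel_size=(4, 4),
                                   strides=(2, 2), padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.LeakyReLU(0.2)(x)
    # Two fully connected layers resize the output to the target
    # resolution; the sigmoid keeps predicted densities in [0, 1].
    x = layers.Flatten()(x)
    x = layers.Dense(1024)(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU(0.2)(x)
    x = layers.Dense(output_shape[0] * output_shape[1],
                     activation="sigmoid")(x)
    outputs = layers.Reshape(output_shape)(x)
    return tf.keras.Model(inputs, outputs)
```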
To perform the hyperparameter optimization, we employed the Bayesian optimization package in Ref. [49], which approximates the objective function using a Gaussian process. We used gp_hedge as the acquisition function, which probabilistically chooses one of the lower confidence bound, negative expected improvement, and negative probability of improvement acquisition functions at every iteration. We ran each hyperparameter Bayesian optimization until the acquisition function converged. Our aim was to cover all plausible settings by carefully selecting the hyperparameter ranges for Bayesian optimization. For the KNN model, we considered the number of neighbors, which ranged from 1 to 100 as an integer value, and the weighting, which could be either “uniform” or “distance” as a categorical variable. Similarly, for RF models, we focused on the number of estimators and the minimum samples per leaf, both ranging from 1 to 30 as integer values. In the case of the DeCNN model, we explored the learning rate within the range of 0 to 0.1, and the batch size as an integer between 1 and 362 (except for the 3D heat conduction problem, where the batch size was limited to between 1 and 100 due to computational power limitations).
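A sketch of this loop for the KNN hyperparameters, assuming the scikit-optimize package (the toy data, output resolution, and fixed call budget are our assumptions; the paper instead runs until the acquisition function converges):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from skopt import gp_minimize
from skopt.space import Integer, Categorical

# Toy stand-ins for the real training/validation splits described above.
rng = np.random.default_rng(0)
X_train, Y_train = rng.random((361, 2)), rng.random((361, 41 * 41))
X_val, Y_val = rng.random((40, 2)), rng.random((40, 41 * 41))

def knn_validation_pmse(params):
    """PMSEM-style score: validation pointwise MSE of a KNN model."""
    n_neighbors, weights = params
    model = KNeighborsRegressor(n_neighbors=int(n_neighbors), weights=weights)
    model.fit(X_train, Y_train)
    return float(np.mean((model.predict(X_val) - Y_val) ** 2))

search_space = [
    Integer(1, 100, name="n_neighbors"),                  # range from the text
    Categorical(["uniform", "distance"], name="weights"),
]

result = gp_minimize(
    knn_validation_pmse,
    search_space,
    acq_func="gp_hedge",   # probabilistically mixes LCB, EI, and PI
    n_calls=30,            # fixed budget here for illustration
    random_state=0,
)
print(result.x, result.fun)
```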
To select the final optimal hyperparameters for each KNN, RF, or DeCNN model, we had to select which metric to optimize over the cross-validation cases. Herein lies a major difference between the standard way of selecting ID models—picking the model that minimizes the pointwise MSE—and one that evaluates the model directly on the objective function of interest—what we refer to later in the paper as prediction objective function minimization. We will describe each approach in turn, and then show in the results section how they impact ID performance. In addition to the MSE, we also separately tested KNN, RF, and DeCNN models optimized using the log-loss (i.e., binary cross entropy), but its results were similar to those obtained using the MSE, and thus we do not include them in the paper for space reasons.
3.3.1 Hyperparameter Optimization: PMSEM.
Our pointwise mean squared error method (PMSEM) for model performance evaluation uses the mean squared error between a model’s predictions and the corresponding points of the topologies in the validation set.
Physically, PMSEM compares the similarity of the mass distributions predicted by a model to the ground truth distributions.
3.3.2 Hyperparameter Optimization: POFMM.
In contrast to PMSEM, our POFMM uses an alternative approach to model hyperparameter optimization. Since one intention of using a model in an inverse design problem is to produce a prediction that is as close to an optimal design as possible, and is therefore a good initialization point for future iterations, it is desirable to find hyperparameter values that enable the model to yield predictions with good objective function values at iteration 0. In our problems, this means minimizing the objective function value (thermal compliance for the 2D and 3D heat conduction problems and structural compliance for the cantilever beam) of the model’s predictions at iteration 0. Rather than comparing the predictions to the corresponding mass distributions in the validation set, the model hyperparameters H are optimized solely with respect to this objective function value F. Note that a model optimized in this way may not produce designs that are as close in the design space (i.e., have similar geometry) to the training set as those selected via PMSEM, yet it should in principle still produce results with high performance. In practice, because the primary parameters of the model are trained via MSE, these differences only affect model choice at the hyperparameter level.
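The two selection criteria can be summarized by the following sketch, where evaluate_objective is a hypothetical callable wrapping the finite element solver (it is not a function from the paper's code):

```python
import numpy as np

def pmsem_score(model, X_val, Y_val):
    """PMSEM: pointwise MSE between predicted and ground-truth
    mass distributions over the validation set."""
    Y_pred = model.predict(X_val)
    return float(np.mean((Y_pred - Y_val) ** 2))

def pofmm_score(model, X_val, evaluate_objective):
    """POFMM: mean objective value (thermal or structural compliance)
    of the model's predictions at iteration 0, before any further
    optimization. `evaluate_objective(inputs, design)` is hypothetical."""
    Y_pred = model.predict(X_val)
    return float(np.mean([evaluate_objective(x, y)
                          for x, y in zip(X_val, Y_pred)]))
```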
3.4 Evaluating the Model Predictions for Warmstart Optimization.
For a given hyperparameter setting, we can now train the corresponding models via the train-test split scheme described above. We then use the optimized KNN, RF, and DeCNN models to generate predictions for each tested combination of design inputs.
We initialized the solvers (IPOPT for the 2D and 3D conduction problems, and CVXOPT for the 2D structural problem) with the corresponding predicted designs from the KNN, RF, and DeCNN models. The IPOPT optimizer exited each run if it reached a tolerance of or iterations. The CVXOPT solver exited each run if it reached iterations or a displacement tolerance of less than .
As a control condition for the 2D and 3D heat conduction problems, we used a uniform initialization with a constant mass distribution equal to the volume fraction, since this is the most common initialization for SIMP-based density TO methods. Similarly, for the cantilever beam topology optimization problem, we used a constant uniform distribution equal to the volume fraction constraint ().
3.4.1 Data Post-Processing.
Following the conclusion of the optimized model evaluation process, the thermal and structural compliance trajectory results were normalized and the median values calculated. This was done to prevent any individual combination of design inputs from having a disproportionately large or small effect on the aggregated compliance trajectories. Specifically, the post-processing procedure is as follows (see the sketch after this list):
Normalize each value in each objective function trajectory with respect to the optimal (minimum) value obtained in said trajectory.
Find the median of these normalized values at each iteration number for the PMSEM-optimized KNN models, the PMSEM-optimized RF models, PMSEM-optimized DeCNN models, POFMM-optimized KNN models, the POFMM-optimized RF models, and the POFMM-optimized DeCNN models. We chose to report the median value for the optimization trajectories since it is less sensitive to outliers.
Render all trajectories uniform in the number of iterations considered by extending runs that terminate before the maximum number of iterations attained among any runs. This extension is achieved by conservatively extrapolating the final value reached in each run over the remaining iterations.
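A minimal numpy sketch of these steps is given below (our own helper, not the paper's code); note that shorter runs are extended before the median is taken so that the median is defined at every iteration.

```python
import numpy as np

def aggregate_trajectories(trajectories):
    """Normalize, length-extend, and take the per-iteration median of a
    list of 1D objective-value trajectories of possibly different lengths."""
    # 1. Normalize each trajectory by its own optimal (minimum) value.
    normalized = [np.asarray(t, dtype=float) / np.min(t) for t in trajectories]
    # 2. Extend shorter runs by conservatively repeating their final value
    #    so that all trajectories span the same number of iterations.
    max_len = max(len(t) for t in normalized)
    padded = np.stack([np.concatenate([t, np.full(max_len - len(t), t[-1])])
                       for t in normalized])
    # 3. Report the (outlier-insensitive) median at each iteration.
    return np.median(padded, axis=0)
```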
4 Results
Following the above methodology, this section first reviews the optimal models that we found for our specific inverse design problems and then presents the main quantitative results on the impact of the initialization methods on ID performance. Following these, we provide qualitative comparisons of the final designs produced under each method, and an example trajectory that helps shed light on how the ID method influences the warm start behavior of further topology optimization.
Note that in all subsequent plots, shaded regions depict the 95% empirical confidence intervals of the median for their corresponding plotted functions. For example, in the case of Fig. 3, the shading represents the 95% bootstrapped empirical confidence interval on the median at each iteration for each combination of model type and hyperparameter optimization method (with 100 bootstrap resamples).
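For reference, such an interval can be computed as in the following sketch (our own helper, matching the 100-resample, 95% setting described above):

```python
import numpy as np

def median_ci(values, n_boot=100, alpha=0.05, seed=0):
    """Bootstrapped empirical confidence interval on the median."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    medians = [np.median(rng.choice(values, size=len(values), replace=True))
               for _ in range(n_boot)]
    return np.quantile(medians, [alpha / 2, 1 - alpha / 2])
```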
4.1 Optimal Model Hyperparameters.
We employed Bayesian optimization to tune the hyperparameters using the PMSEM and POFMM methods. The resulting optimal hyperparameters are reported in the following paragraphs.
4.1.1 PMSEM.
Using PMSEM for the 2D heat conduction problem, we discovered that the KNN models performed best with 31 neighbors and a “distance” weighting scheme. Similarly, RF models performed best with 23 estimators and a minimum of 22 samples per leaf. Lastly, the DeCNN performed best with a learning rate of and a batch size of 350.
For the 3D heat conduction problem, the KNN models performed best with two neighbors and a “uniform” weighting approach. The RF models performed best with 23 estimators and a minimum of two samples per leaf. Lastly, the DeCNN performed best with a learning rate of and a batch size of 100.
For the 2D cantilever beam problem, the KNN models performed best with two neighbors and a “distance” weighting strategy. The RF models performed best with 23 estimators and a minimum of 13 samples per leaf. Lastly, the DeCNN performed best with a learning rate of and a batch size of 361.
4.1.2 POFMM.
Using the POFMM metric for the 2D heat conduction problem, we discovered that the KNN models performed best with one neighbor and a “distance” weighting scheme. Similarly, the RF models showed the best results with 30 estimators and a minimum of one sample per leaf. Furthermore, the DeCNN achieved its best performance with a learning rate of 0.0523 and a batch size of 114.
For the 3D heat conduction problem, the KNN models achieved the best outcomes with one neighbor and a “uniform” weighting approach. Similarly, the RF models displayed optimal performance with one estimator and a minimum of one sample per leaf. Additionally, for the DeCNN, we found that a learning rate of 0.0158 and a batch size of 30 yielded the most favorable results.
For the case of the 2D cantilever beam problem, the KNN models performed best with one neighbor and a “uniform” weighting strategy. Likewise, the RF models demonstrated their highest performance with 27 estimators and a minimum of one sample per leaf. Moreover, the DeCNN performed best with a learning rate of and a batch size of 150.
4.2 Impact of Different Initialization Methods on Prediction Performance and Trajectory Acceleration for 2D Heat Conduction Problem.
Using these optimized models, we can now compare how they perform at both predicting the optimal geometry as well as how they act as a warm start to further topology optimization, compared to a control (uniform initialization).
We found that, on average, all model types tested with either hyperparameter optimization method (POFMM or PMSEM) produced predictions with thermal compliance values significantly less than that of the control (Fig. 3). We also found that initializing the IPOPT optimization process with these predictions, on average, accelerated convergence at low iteration counts, despite the fact that the optimizer increases the thermal compliance in the early iterations of warm starting (Fig. 3)—we show why this occurs later in the paper. Beyond around 20 iterations, the control begins to reach comparable performance to that of the warm-started optimizers.
Specifically, the median predictions of the KNN models optimized using POFMM exceeded the minimum thermal compliance reached in the corresponding control run by only 12.3% at iteration 0, whereas predictions generated using KNN models optimized using PMSEM exceeded it by 76.9%. For the RF models, these gaps were 3.6% and 57.8% using POFMM and PMSEM, respectively; for the DeCNN models, they were 5.8% and 54.2%. For comparison, a constant mass distribution set to the volume limit (the control for this experiment) had a median gap of 602% (Table 2), almost an order of magnitude larger than the ID methods.
Median normalized thermal compliance (MNTC) at the zeroth iteration for tested initialization schemes (2D heat conduction problem)
Model type | MNTC at iteration 0 |
---|---|
KNN PMSEM | 1.769 |
RF PMSEM | 1.578 |
DeCNN PMSEM | 1.542 |
KNN POFMM | 1.123 |
RF POFMM | 1.036 |
DeCNN POFMM | 1.058 |
Control | 7.020 |
To further illustrate the impact of using ML models for predicting the final design, we have depicted the relationship between the model used and the normalized initial optimality gap (IOG) in Fig. 4. The normalized initial optimality gap measures how closely the initial prediction of the ID model matches the performance of the control (TO) solution before undergoing further optimization.
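Concretely, one definition consistent with the values reported in Tables 2–4 (our reading, since the defining equation is not reproduced here) is

$$\mathrm{IOG} = \frac{f(\mathbf{x}_{0}) - f^{*}_{\mathrm{control}}}{f^{*}_{\mathrm{control}}},$$

where $f(\mathbf{x}_{0})$ is the objective value of the ID prediction at iteration 0 and $f^{*}_{\mathrm{control}}$ is the minimum objective value reached by the corresponding control run; the median normalized compliance values in the tables then correspond to $f(\mathbf{x}_{0})/f^{*}_{\mathrm{control}}$.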
4.3 Impact of Different Initialization Methods on Prediction Performance and Trajectory Acceleration for 3D Heat Conduction Problem.
As with the above, for the 3D heat conduction problem, we evaluated how each model’s warm start performance was affected by the choice of hyper-parameter optimization metric. As before, all models consistently yielded designs with lower final thermal compliance values compared to those obtained through the control method (uniform initialization), regardless of how we optimized the hyperparameters (refer to Fig. 5).
To compare the models in terms of their initial design predictions (prior to warm starting), Fig. 6 plots how the normalized IOG changes across each model and hyperparameter optimization strategy. Our findings reveal that the median predictions of the KNN models optimized using POFMM and PMSEM share the same hyperparameters, and thus perform identically, exceeding the minimum thermal compliance achieved in the corresponding control run by 138.6% at iteration 0. The RF models exceed it by 185.4% and 185.6% using POFMM and PMSEM, respectively, while the DeCNN models exceed it by 60.6% and 107.3%. In comparison, the control method, represented by a constant mass distribution, exhibits a significantly higher median gap of 690.5% (Table 3), several times larger than that of the ML-based methods.
MNTC at the zeroth iteration for tested initialization schemes (3D heat conduction problem)
Model type | MNTC at iteration 0 |
---|---|
KNN PMSEM-POFMM | 2.386 |
RF PMSEM | 2.856 |
DeCNN PMSEM | 2.073 |
RF POFMM | 2.854 |
DeCNN POFMM | 1.606 |
Control | 7.905 |
After warmstarting, we can also observe that hyperparameters optimized through the POFMM method tend to yield lower thermal compliance values (e.g., see the DeCNN POFMM model).
4.4 Impact of Different Initialization Methods on Prediction Performance and Trajectory Acceleration for the Cantilever Beam Problem.
For the cantilever beam problem, we again see in both Figs. 7 and 8 that, irrespective of whether the POFMM or PMSEM hyperparameter optimization method was employed, the ML models consistently yield initial structural compliance values significantly lower than those obtained through the control method (refer to Fig. 7). We show in Fig. 8 that the median predictions of the KNN models optimized using POFMM exceeded the minimum structural compliance achieved in the corresponding control run by only 8.4% at iteration 0, whereas KNN models optimized using PMSEM had a median gap of 52.2%. Similarly, the RF models achieve gaps of 11.4% and 56.3% with POFMM and PMSEM, respectively, and the DeCNN models achieve 22.8% and 33.6%. In contrast, a constant mass distribution, which served as the control for this experiment, exhibited a median gap of roughly 3200%, nearly two orders of magnitude larger than the ID methods (as shown in Table 4).
Median normalized structural compliance (MNSC) at the zeroth iteration for tested initialization schemes (2D cantilever beam problem)
Model type | MNSC at iteration 0 |
---|---|
KNN PMSEM | 1.522 |
RF PMSEM | 1.563 |
DeCNN PMSEM | 1.336 |
KNN POFMM | 1.084 |
RF POFMM | 1.114 |
DeCNN POFMM | 1.228 |
Control | 33.878 |
When we use the ML predictions to warm start the optimization, the ID models notably accelerate convergence, particularly during the early iterations.
4.5 Why Does Thermal Compliance Increase After Warmstarting?.
A typical normalized optimization trajectory for the warmstarted optimizations of the 2D heat sink problem is shown in Fig. 9. In particular, Fig. 9 displays the trajectory taken by the optimization of a mass distribution subjected to a volume limit of 0.36, an adiabatic region length of 0.45, and an initialization produced by a KNN model optimized using PMSEM.
In this case, for the KNN-initialized trajectory, the increase in thermal compliance that peaks at iteration 4 can be attributed to a reduction in the specificity of the predicted material distribution. This reduction occurs when the optimizer, at and around this iteration, adjusts the design variable (density) values towards intermediate levels, deviating from the strict bounds of the SIMP formulation in Eq. (2). In other words, the optimizer is subtracting some of the material predicted by the model.
In contrast, the control trajectory is significantly smoother and lacks this degree of loss of definition in its mass distribution. In our other study, we compared the IPOPT and SciPy solvers, the latter utilizing the sequential least squares programming method [27]. The outcomes of this comparison can be found in the Supplemental Materials on the ASME Digital Collection. We observed that the SciPy solver exhibited a gradual and monotonic decrease in compliance. In contrast, the warm start optimization with IPOPT displayed an intriguing behavior, initially showcasing an increase in compliance. We hypothesize that IPOPT’s solution method, which involves numerical approximation of the Hessian matrix during the initial iterations of the solver, may be responsible for this behavior. It is plausible that these early sub-optimal steps taken by IPOPT, stemming from the approximation process, contribute to the initial “bump” witnessed in the convergence trajectory.
4.6 Impact of Different Initializations on Final Optimized Designs.
We also observed qualitative differences between the 2D heat conduction designs produced from optimization processes run with different initializations. This is expected due to the non-convexity of the problem, since differing initializations often lead a numerical optimizer to arrive at different local minima. Nevertheless, we saw that the final designs in each case had similar final objective function values, including the cases presented in Fig. 10.

There are numerous structural differences between optimized designs that used different initializations. Note that the distributions shown here are direct plots of the mass distribution function inputs and outputs to the initialization process, and hence display interpolation between points.
Figure 10 displays the results of optimization runs using different initializations. All runs used a volume fraction of 0.36 and an adiabatic region length of 0.45. Note that the optimizer exited upon the satisfaction of the tolerance of . There are small differences in the structures of the designs that are visible upon inspection. The designs nevertheless appear to retain their dendritic character in this case and share a substantial resemblance to each other.
4.7 Tradeoff Between the Objective Value and Constraints.
One important consideration in constrained inverse design is how well predictions conserve the constraint. To gain insight into the impact of using ML models optimized through either POFMM or PMSEM on the deviation from the constraint volume in the 2D heat conduction problem, we have plotted the deviation of mass used for the ML models’ predictions in this study (refer to Fig. 11). Figure 11 illustrates the disparity in mass used for each model prediction. Positive values imply that there is remaining mass before exceeding the constraint limit, while negative values indicate that mass would need to be removed to meet the constraint limit. Upon observation, Fig. 11 makes evident that, on average, ML models optimized using the POFMM method better preserve the constraint limits compared to PMSEM models. In both cases, however, there is no guarantee that any ID model will exactly satisfy the constraints, and empirically we see that the effect on the volume fraction constraint can vary.
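One way to compute the plotted deviation, consistent with the description above (the exact scaling used in the figure is our assumption), is

$$\Delta V = V_{\max} - \frac{1}{N}\sum_{i=1}^{N} \rho_{i},$$

where $V_{\max}$ is the volume fraction limit, $\rho_{i}$ are the predicted densities at the $N$ grid points, and positive $\Delta V$ indicates the prediction uses less material than allowed.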

The initial volume deviation of PMSEM- and POFMM-optimized KNN, RF, and DeCNN models for the 2D heat conduction problem. The lines in the middle of each box plot represent the median, and the dashed line represents the value at which the mass used for the design is equal to the constraint volume of the problem.

Nevertheless, this outcome raises intriguing questions for future research into ID models. Namely, to what extent can ID models both predict high-performance designs while simultaneously adhering to constraints?
5 Discussion, Limitations, and Future Work
While the specific ID models we tested were simple, they nevertheless highlight interesting phenomena that may generalize to other problems or ID methods. Here we review possible limitations or areas of future work that may affect our stated contributions.
5.1 Alternative Training and Testing Metrics.
Changing how we optimized the hyper-parameters of the ID models (from MSE-driven to objective function-driven) affected not only quantitative convergence measures (per Fig. 3), but also qualitative predictions (per Fig. 10), even though the underlying primary training measures were identical (i.e., the KNN, RF, and DeCNN models minimize the design’s MSE reconstruction error during training).
This raises questions not only about how we evaluate ID methods but also points toward alternative training procedures, for example, training over joint losses that include both reconstruction-error terms and objective-function-derived terms. Moreover, if we know that the goal of an ID method is to act as a warm start for additional TO, then perhaps predicting the “peak” distribution seen in Fig. 9 may be faster than either the PMSEM or the POFMM approach used in this paper. In other words, the best prediction might not be the one that performs best initially, but the one that is most helpful to downstream optimization. Understanding under what conditions and in which ID applications different methods excel would be a fruitful avenue for future work.
5.2 Variations in ID Methods and Dataset Size and Problems.
Lastly, using more advanced ID methods may alter some of our observations, for example, by eliminating the need for an adjoint optimizer to redo portions of the ID prediction as shown in Fig. 3 if the ID predictions lie sufficiently close to the global optima.
We are currently exploring whether more advanced ID methods—such as diffusion models—might yield further improved performance or change the behavior noted in the trajectory figures (refer to Figs. 3, 5, and 7). That said, the DeCNN model that we tested had a sufficiently large model capacity to be competitive with state-of-the-art methods, so it is not clear whether more advanced models would significantly affect these results.
Likewise, we did not investigate here how ID performance is modulated by the size of the training dataset; uncovering transition points where ID methods become performant remains an open question worthy of future study in general, and could be studied using techniques introduced in Ref. [27]. Furthermore, while we obtained better performance using the POFMM method than the PMSEM method (refer to Figs. 3, 5, and 7), this does not come without a cost. One notable drawback is the computational cost associated with the objective evaluation for each model. Because the POFMM algorithm computes an objective function evaluation for each training design, the computational time and resources required to evaluate each model can be significant. The return on investment of employing the POFMM method should therefore be carefully considered. While improved performance is desirable, it is crucial to assess whether the benefits outweigh the computational cost. Factors such as the specific application, the available computational resources, and the desired level of accuracy need to be taken into account, and recent work provides some guidance on how to address these tradeoffs [27].
6 Conclusions
We compared several inverse design models (KNNs, RFs, and DeCNNs) and two approaches for model hyperparameter optimization against the standard uniform initialization of SIMP-based TO on three topology optimization problems. We described a benchmark set of environments and datasets and showed how those models affected both the initial predictions from the ID methods and the downstream acceleration when warm starting optimization.
Our findings indicate that both methods of hyperparameter optimization yield KNN, RF, and DeCNN models that can substantially accelerate the optimization process when their predictions initialize an interior point solver. These predictions also tend to have objective values close to the minimum obtained in a corresponding control optimization run, and tend to significantly outperform initialization with a uniform mass distribution—a common TO initialization method. Furthermore, our study challenges the conventional approach of optimizing ID methods solely based on the MSE in the reconstruction of test set designs. We demonstrate that optimizing for models that produce lower objective function values can outperform standard MSE-derived hyperparameter optimization methods.
Although we investigated specific physical problems, two model hyperparameter optimization methods, and three ID model types (KNN, RF, and DeCNN) there remains a large space for future work in both different physical problems and in different computational approaches to modeling and model optimization. Overall, our results highlight the nuances in evaluating ID methods—that your end goal in inverse design, whether that be direct prediction, distribution matching, or warm starting an optimizer, can affect both your evaluation approach and how you optimize your models.
Footnotes
Code to reproduce the results in this paper is located at: https://github.com/IDEALLab/JMD_MSE_ID.
See Note 2.
Acknowledgment
This research was supported in part by funding from the U.S. Department of Energy’s Advanced Research Projects Agency-Energy (ARPA-E) DIFFERENTIATE funding opportunity through award DE-AR0001216 and from the National Science Foundation through award #1943699.
Conflict of Interest
There are no conflicts of interest.
Data Availability Statement
The data and information that support the findings of this article are freely available online.3