In order to obtain a “good” model from a set of acquired or existing data, three subproblems must be solved: data selection, feature and topology search, and parameter estimation.
These subproblems are addressed by applying a model design framework consisting of two existing tools. The first, denoted ApproxHull, performs data selection from the data available for design. Feature and topology search are handled by the evolutionary part of MOGA (Multi-Objective Genetic Algorithm), while parameter estimation is performed by its gradient part.
2.1.1. Data selection
In order to design a data-driven model such as an RBF (Radial Basis Function) network, the training set must contain samples covering the entire input-output range over which the underlying process is expected to operate. To identify such samples, called convex hull (CH) points, in the entire data set, a convex hull algorithm can be applied. The distance of a point x to a hyperplane H is given by:
$$ds\left(x,H\right)=\frac{{a}_{1}{x}_{1}+\cdots +{a}_{d}{x}_{d}+b}{\sqrt{{a}_{1}^{2}+\cdots +{a}_{d}^{2}}}$$
where $a={\left[{a}_{1},\dots ,{a}_{d}\right]}^{T}$ and $b$ are the normal vector and the offset of H, respectively. The distance of a point x to the convex hull of X, conv(X), can be obtained by solving the quadratic programming problem:
$$\begin{array}{l}\underset{a}{\mathrm{min}}\left(\frac{{a}^{T}Qa}{2}-{c}^{T}a\right)\\ s.t.\hspace{1em}{e}^{T}a=1,\hspace{1em}a\ge 0\end{array}$$
where $e={\left[1,\dots ,1\right]}^{T}$, $Q={X}^{T}X$, and $c={X}^{T}x$. Assuming that the optimal solution of Equation (2) is ${a}^{*}$, the distance of point x to conv(X) is given by:
$$dc\left(x,conv(X)\right)=\sqrt{{x}^{T}x-2{c}^{T}{a}^{*}+{a}^{*T}Q{a}^{*}}$$
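As an illustration, the distance (3) can be obtained by solving the quadratic program (2) with an off-the-shelf solver. The sketch below uses SciPy's SLSQP method and is not part of ApproxHull itself; the column convention for X (one sample per column) is an assumption consistent with $Q={X}^{T}X$.

```python
import numpy as np
from scipy.optimize import minimize

def dist_to_convex_hull(x, X):
    """Distance from point x (shape (d,)) to conv(X), with X of shape (d, N),
    by solving the QP of Equation (2) and evaluating Equation (3)."""
    N = X.shape[1]
    Q = X.T @ X
    c = X.T @ x
    obj = lambda a: 0.5 * a @ Q @ a - c @ a            # objective of (2)
    cons = ({'type': 'eq', 'fun': lambda a: a.sum() - 1.0},)  # e^T a = 1
    bounds = [(0.0, None)] * N                          # a >= 0
    a0 = np.full(N, 1.0 / N)
    res = minimize(obj, a0, method='SLSQP', bounds=bounds, constraints=cons)
    a = res.x
    # Equation (3); clip tiny negative values caused by round-off
    return np.sqrt(max(x @ x - 2 * c @ a + a @ Q @ a, 0.0))

# Unit square as the hull; one point outside it, one inside
X = np.array([[0., 1., 0., 1.],
              [0., 0., 1., 1.]])
print(dist_to_convex_hull(np.array([2.0, 0.5]), X))  # approx. 1.0
print(dist_to_convex_hull(np.array([0.5, 0.5]), X))  # approx. 0.0
```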
ApproxHull consists of five main steps. In Step 1, each dimension of the input data set is scaled to the range [−1, 1]. In Step 2, the largest and smallest samples of each dimension are identified and taken as the vertices of the initial convex hull. In Step 3, a population of k faces based on the current vertices of the convex hull is generated. In Step 4, Equation (1) is used to identify the farthest point to each face in the current population; if such points have not been detected before, they are treated as new vertices of the convex hull. Finally, in Step 5, the current convex hull is updated by adding the newly found vertices to the current vertex set. Steps 3 to 5 are performed iteratively until either no new vertex is found in Step 4, or the newly found vertices are so close to the current convex hull, within a user-defined threshold on the convex hull distance (3), that they carry no useful information.
Before determining the CH points, ApproxHull eliminates duplicated samples/features, as well as samples/features that are linear combinations of others. After determining the CH points, ApproxHull generates the training, test, and validation sets for MOGA according to user specifications, forcing the CH points into the training set.
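The partitioning step can be sketched as follows. The split fractions, function name, and index-based interface are illustrative assumptions, not ApproxHull's actual API; the only property taken from the text is that CH points are forced into the training set.

```python
import numpy as np

def split_with_ch(n_samples, ch_idx, frac_train=0.6, frac_test=0.2, seed=0):
    """Split sample indices into training/test/validation sets, forcing the
    convex hull points (ch_idx) into the training set.
    Fractions are hypothetical user specifications."""
    rng = np.random.default_rng(seed)
    ch = set(ch_idx)
    rest = np.array([i for i in range(n_samples) if i not in ch])
    rng.shuffle(rest)
    # Fill the training set up to its target size with non-CH samples
    n_train = max(int(frac_train * n_samples) - len(ch), 0)
    n_test = int(frac_test * n_samples)
    train = np.concatenate([np.asarray(list(ch)), rest[:n_train]])
    test = rest[n_train:n_train + n_test]
    val = rest[n_train + n_test:]          # remainder goes to validation
    return train, test, val

train, test, val = split_with_ch(100, ch_idx=[0, 7, 42])
print(set([0, 7, 42]) <= set(train.tolist()))  # True: CH points are in training
```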
2.1.2. Parameter separability
The output of an RBF model with n neurons, for an input sample ${x}_{k}$, can be expressed as:
$$\widehat{y}\left({x}_{k},w\right)={u}_{0}+{\displaystyle \underset{i=1}{\overset{n}{\Sigma}}{u}_{i}{\varphi}_{i}\left({x}_{k},{v}_{i}\right)}=\varphi \left({x}_{k},v\right)u$$
The training criterion employed is:
$$\Omega \left(X,w\right)=\frac{{\Vert y-\widehat{y}\left(X,w\right)\Vert}_{2}^{2}}{2}$$
where ${\Vert \cdot \Vert}_{2}$ denotes the Euclidean norm. Substituting (4) into (6), we obtain:
$$\Omega \left(X,w\right)=\frac{{\Vert y-\Gamma \left(X,v\right)u\Vert}_{2}^{2}}{2}$$
where $\Gamma \left(X,v\right)$ is the matrix of the outputs of the basis functions for all samples in X. As the model output is linear in u, the optimal linear parameters are given by the least-squares solution $\widehat{u}={\Gamma}^{+}\left(X,v\right)y$, where ${\Gamma}^{+}$ denotes the pseudo-inverse of $\Gamma$. Replacing u by $\widehat{u}$ in (7) yields a criterion that depends only on the nonlinear parameters v:
$$\Psi \left(X,v\right)=\frac{{\Vert y-\Gamma \left(X,v\right){\Gamma}^{+}\left(X,v\right)y\Vert}_{2}^{2}}{2}$$
The advantages of using (9) instead of (7) are threefold:

It lowers the problem dimensionality, as the number of model parameters to determine is reduced;

The initial value of $\mathsf{\Psi}$ is much smaller than that of $\mathsf{\Omega}$;

Typically, the rate of convergence of gradient algorithms using (9) is faster than using Equation (7).
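The relation between the two criteria can be verified numerically. The sketch below assumes Gaussian basis functions and arbitrary test data; since (9) evaluates (7) at the optimal linear parameters, $\Psi$ is never larger than $\Omega$ computed with any other choice of u.

```python
import numpy as np

rng = np.random.default_rng(0)

def gamma_matrix(X, centres, sigma=0.5):
    """Output matrix of the basis functions: a bias column plus one
    Gaussian per neuron (Gaussian form is an assumption for illustration)."""
    G = np.exp(-(X[:, None] - centres[None, :])**2 / (2 * sigma**2))
    return np.hstack([np.ones((X.shape[0], 1)), G])

# Toy 1-D data and three fixed centres (the nonlinear parameters v)
X = np.linspace(-1, 1, 50)
y = np.sin(np.pi * X)
centres = np.array([-0.5, 0.0, 0.5])

G = gamma_matrix(X, centres)
u_rand = rng.standard_normal(G.shape[1])                 # arbitrary linear weights
omega = 0.5 * np.sum((y - G @ u_rand)**2)                # criterion (7)
psi = 0.5 * np.sum((y - G @ np.linalg.pinv(G) @ y)**2)   # criterion (9)
print(psi <= omega)  # True: (9) uses the optimal linear weights
```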
2.1.5. MOGA
$${\widehat{y}}_{k+1\mid k}=F\left({z}_{k}\right)=F\left({y}_{k-{d}_{{o}_{1}}},\dots ,{y}_{k-{d}_{{o}_{n}}},{x}_{k-{d}_{{i}_{1}}},\dots ,{x}_{k-{d}_{{i}_{m}}}\right)$$
where ${\widehat{y}}_{k+1\mid k}$ represents the prediction for time step k + 1 given the measured data at time k, and ${d}_{{i}_{j}}$ is the j-th delay of variable i. This represents a one-step-ahead prediction. When (17) is iterated over the prediction horizon (PH), some or all of the indices on the right-hand side become greater than k, which means that the corresponding predictions must be used instead of measured values. What was said for the NARX model also applies to the NAR model (which has no exogenous inputs).
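The iteration over the horizon can be sketched for the simpler NAR case. The function F below is a hypothetical one-step model; the point illustrated is the feedback mechanism: once the required index exceeds k, earlier predictions replace measurements in the regressor.

```python
def iterate_nar(F, y_meas, delays, PH):
    """Multi-step prediction with a one-step NAR model F over horizon PH.
    y_meas: measured outputs up to time k; delays: output delays (>= 1).
    Predictions are appended to the buffer, so later regressors are built
    from predicted rather than measured values."""
    buf = list(y_meas)                    # measurements up to time k
    preds = []
    for _ in range(PH):
        z = [buf[-d] for d in delays]     # delayed outputs for this step
        y_next = F(z)                     # one-step-ahead prediction
        buf.append(y_next)                # prediction fed back as data
        preds.append(y_next)
    return preds

# Hypothetical linear one-step model: y_{k+1} = 0.5 * y_k
print(iterate_nar(lambda z: 0.5 * z[0], [8.0], delays=[1], PH=3))
# [4.0, 2.0, 1.0]
```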
The first component corresponds to the number of neurons. The next ${d}_{m}$ entries represent the minimum number of features, while the last ones encode a variable number of additional inputs, up to the predefined maximum number. The ${\lambda}_{j}$ values correspond to the indices of the features ${f}_{j}$ in the columns of F.
$$E\left(sim,PH\right)=\left[\begin{array}{cccc}e\left[1,1\right]& e\left[1,2\right]& \cdots & e\left[1,PH\right]\\ e\left[2,1\right]& e\left[2,2\right]& \cdots & e\left[2,PH\right]\\ \vdots & \vdots & \ddots & \vdots \\ e\left[sim,1\right]& e\left[sim,2\right]& \cdots & e\left[sim,PH\right]\end{array}\right],$$
where e[i, j] is the model prediction error of the i-th simulation at step j within the prediction horizon. Denoting the RMS of the i-th column of matrix E by $r\left(E\left(sim,PH\right),i\right)$, the prediction performance criterion is the sum of the RMS values of each column of E:
$${r}_{sim}\left(PH\right)={\displaystyle \underset{i=1}{\overset{PH}{\Sigma}}r\left(E\left(sim,PH\right),i\right)}$$
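This criterion is straightforward to compute from the error matrix; the sketch below assumes E is stored with one row per simulation and one column per horizon step, as in the matrix above.

```python
import numpy as np

def prediction_criterion(E):
    """E[i, j]: prediction error of simulation i at step j of the horizon.
    Returns the sum of the RMS of each column of E."""
    rms_per_step = np.sqrt(np.mean(E**2, axis=0))   # RMS over simulations
    return rms_per_step.sum()

# Two simulations over a horizon of two steps
E = np.array([[1.0, 2.0],
              [1.0, 2.0]])
print(prediction_criterion(E))  # 1.0 + 2.0 = 3.0
```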
Note that, in the MOGA formulation, each performance criterion can either be minimized or constrained to a given limit.
After formulating the optimization problem and setting the other hyperparameters, such as the number of elements in the population (${n}_{pop}$), the total number of iterations (${n}_{it}$), and the genetic algorithm parameters (random immigration proportion, selection pressure, crossover rate, and survival rate), the hybrid evolutionary-gradient method is executed.
The performance of the MOGA models is evaluated on either the non-dominated model set or the preferential model set. If a single solution is sought, it is selected based on the objective values of these model sets, the performance criteria applied to the validation set, and possibly other criteria.
The problem definition should be modified when the analysis of the solutions provided by MOGA requires the process to be repeated. In this case, two main operations can be performed: redefining the input space by removing or adding one or more features (variables and lagged inputs in the modeling problem), and improving the trade-offs by changing the objectives or redefining the goal levels. This process can be advantageous because, typically, the output of a single run allows the number of input terms (and possibly the variables used to model the problem) to be reduced, by eliminating terms that are not present in the resulting population. Furthermore, in view of the results obtained in a single run, it is often possible to narrow down the range of the number of neurons. This results in a smaller search space in subsequent runs of MOGA, potentially achieving faster convergence and a better Pareto front approximation.
Typically, for a specific problem, an initial MOGA execution is performed minimizing all objectives. Then a second execution is run, typically setting some of the objectives as restrictions.