As already mentioned, non-parametric data analysis tools such as ACE and MARS can profitably be used to aid DACS-GP in searching efficiently for models. Both tools are employed here for data pre-processing. The ACE method (Breiman and Friedman, 1985) models a smooth function of the response variable as a sum of smoothed functions of the independent variables. The ACE "model" is:
f(y) = Σ(i=1 to p) ti(xi) + e        (5.15)
where f(y) is a smooth function of the output, xi (i = 1 to p) are the independent predictor variables, ti(xi) denotes the smooth transformation of xi, and e is the noise term. The method is non-parametric because the mathematical forms of the functions are never determined. The results of the ACE program are graphs of the functions f and ti; the user may inspect these graphs and propose suitable mathematical equations for them. The standard deviation and the range (maximum value − minimum value) of each transformed variable ti(xi) can be used to gauge the "importance" of the predictor variable xi. The ACE method can therefore be used both to suggest model forms and to identify the important predictor variables from a large pool. To make full use of the power of this method, the data set must be fairly large and of good quality. It must be emphasized again that powerful mathematical and statistical tools cannot compensate for poor data.
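The alternating (backfitting) idea behind ACE can be illustrated with a short sketch. This is not the DACS-GP implementation: the running-mean smoother stands in for ACE's "supersmoother", and the function names and toy data are illustrative assumptions.

```python
import numpy as np

def smooth(x, z, window=15):
    """Crude running-mean smoother: average z over a window of
    neighbours taken in x-order (a stand-in for ACE's supersmoother)."""
    order = np.argsort(x)
    kernel = np.ones(window) / window
    sm = np.convolve(z[order], kernel, mode="same")
    out = np.empty_like(sm)
    out[order] = sm
    return out

def ace(X, y, n_iter=20):
    """Minimal ACE backfitting loop: alternately update each t_i(x_i)
    against the partial residual, then re-estimate f(y)."""
    n, p = X.shape
    theta = (y - y.mean()) / y.std()     # f(y): start from standardized y
    t = np.zeros((n, p))                 # the transformations t_i(x_i)
    for _ in range(n_iter):
        for j in range(p):
            partial = theta - t.sum(axis=1) + t[:, j]
            t[:, j] = smooth(X[:, j], partial)
            t[:, j] -= t[:, j].mean()    # keep each transform centred
        theta = smooth(y, t.sum(axis=1))  # estimate of E[sum t_i | y]
        theta = (theta - theta.mean()) / theta.std()
    return theta, t

# toy additive example: y depends on x1 and x2 only; x3 is a nuisance variable
rng = np.random.default_rng(0)
X = rng.uniform(size=(400, 3))
y = X[:, 0] ** 2 + np.sin(3 * X[:, 1])
theta, t = ace(X, y)
importance = t.std(axis=0)   # screen variables by spread of the transforms
```

As the text suggests, the standard deviation of each transform serves as an importance measure: here the nuisance variable's transform comes out with the smallest spread.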
MARS (Friedman, 1991) is a more involved procedure than ACE. Its main advantage over ACE is the ability to consider automatically second- and higher-order interactions among variables. While ACE generates only curves (transformations of the x variables and of y), MARS can generate curves, surfaces and higher-dimensional objects. Again, the outputs of MARS are graphs and measures, such as the relative importance of the variables, from which the user can benefit greatly. With MARS, the requirements on the quality and quantity of the data set are bound to be even higher. The use of MARS is not considered in the case studies here; it has, however, been integrated into DACS-GP.
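A flavour of the MARS forward pass can be sketched as follows. This is an assumption-laden toy, not Friedman's full algorithm: it greedily adds mirrored pairs of hinge basis functions over candidate knots, but omits products of hinges (the interaction terms) and the backward pruning/GCV step.

```python
import numpy as np

def hinge(x, knot, sign):
    # MARS basis function: max(0, +/-(x - knot))
    return np.maximum(0.0, sign * (x - knot))

def mars_forward(X, y, n_pairs=4):
    """Greedy forward pass only: at each step add the mirrored hinge
    pair (over all variables and candidate knots) that most reduces
    the training squared error."""
    n, p = X.shape
    basis = [np.ones(n)]                      # start with the intercept
    for _ in range(n_pairs):
        best = None
        for j in range(p):
            for knot in np.quantile(X[:, j], np.linspace(0.1, 0.9, 9)):
                cols = basis + [hinge(X[:, j], knot, +1),
                                hinge(X[:, j], knot, -1)]
                A = np.column_stack(cols)
                coef, *_ = np.linalg.lstsq(A, y, rcond=None)
                sse = float(((y - A @ coef) ** 2).sum())
                if best is None or sse < best[0]:
                    best = (sse, j, knot)
        _, j, knot = best
        basis += [hinge(X[:, j], knot, +1), hinge(X[:, j], knot, -1)]
    A = np.column_stack(basis)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

# piecewise-linear target, recoverable almost exactly by hinge functions
rng = np.random.default_rng(1)
X = rng.uniform(size=(300, 2))
y = np.abs(X[:, 0] - 0.5) + 2.0 * X[:, 1]
fit = mars_forward(X, y)
rmse = float(np.sqrt(np.mean((y - fit) ** 2)))
```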
Figure 5.23 Graphical user interface of ACE

Case Study 12: Simulated data
This case study is based on a problem described in McKay (1997). The objective is to check whether there is an advantage in using ACE as a pre-processing tool prior to running the GP. The data consisted of one hundred samples of five input variables generated randomly within the range [0, 1], and one output variable (y) generated using the following (assumed unknown) relationship:
y(u1, u2, u3, u4, u5) = 1000 u1 exp(−5.0/u2) + 3 u3        (5.16)
Note that the output is only a function of three of the five input variables. The two remaining inputs do not affect the output and are “nuisance variables” for modeling.
Such variables must not figure in the final model.
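Such a data set could be generated as in the sketch below; the relationship of Eq. (5.16) is read here as y = 1000 u1 exp(−5.0/u2) + 3 u3 (a reconstruction from the partly lost typesetting), and the seed and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
U = rng.uniform(0.0, 1.0, size=(n, 5))   # u1..u5, each sampled on [0, 1]
u1, u2, u3 = U[:, 0], U[:, 1], U[:, 2]   # u4 and u5 never enter y
y = 1000.0 * u1 * np.exp(-5.0 / u2) + 3.0 * u3   # Eq. (5.16), as read above
```

Under this reading, y is bounded above by 1000 exp(−5) + 3 ≈ 9.74, consistent with the 0–8 scale of the plotted data in Figure 5.24.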
Table 5.24 GP Configuration for runs without “ACE advice”
Parameter                                                  Value
Terminal set                                               1, u1, u2, u3, u4, u5
Functional set                                             +, −, ÷, ×, exp, ln
Number of generations                                      30
Population size                                            64
Probabilities of genetic operators
(Mutation, Crossover, Reproduction, Permutation)           [0.26 0.52 0.087 0.043 0.043 0.043]
Constants used in computing fitness measure (k1 : k2 : k3) 0.5 : 0.5 : 0.5
Initialization tree depth                                  8
Maximum tree depth                                         36
Maximum number of parameters                               6
Optimizer                                                  Random; Quasi-Newton
Evolution policy                                           lumpsum_partimes; arg_de_parminus
Use algebraic simplification                               Yes
Using the ACE feature in DACS-GP gave the result shown in Figure 5.23. From the standard deviations and ranges of the transformed variables, it was seen that the variables u4 and u5 are of relatively lower significance than the other three. Even u3 is not that important, but it may have some effect on y and can therefore be considered for inclusion in the model. This shows that ACE can help to identify and discard the unimportant input variables.
Table 5.25 Result of GP models without “ACE advice”
Fit      VRMSE      Model
-118.77  0.265555   exp((2.0785*(((u3+(0.52125*((u1+u3)*u2)))-u1)-((-0.86472*((u1+1)-u2))+(0.72538*((u3-u2)*(u2+(u2*(u2*(u1*u2))))))+(1.1709*1)+(-0.37293*u2)))))
-258.82  0.0654576  ((-3.5509*(u2*u2))+(3.0133*u3)+(0.010943*1)+(-0.20696*(u1*((u2+u1)*exp(((u2+u1)*u2)))))+(-1.0728*u2)+(1.9627*(u2*exp(((u2+u1)*u2)))))
-232.32  0.0914177  ((0.61032*((u2+u1)*((((u2+u1)*((((u2+u1)*u2)+u2)*u2))*u2)*u2)))+(3.0448*u3)+(-1.1823*u2)+u2)
-121.55  0.264302   ((-1.6709*((u1-0)-0))+u1+(0.01081*exp((6.3719*u2)))+(3.4593*((u2-u3)-(u2*(u2-u1))))+(3.2758*(u3-(u2-u3))))
-92.07   0.346826   ((-5.9934*u2)+(2.8524e-006*exp(exp(exp(u2))))+(3.0652*u3)+(4.418*exp(u2))+exp(exp((0.20051*u1)))+(-41.682*0.1747))
-241.45  0.0796857  ((1.2371*(((u2+u1)*((((1+(((u2+u1)*u2)*u2)+u1)*u2)*u2)+u3+(-0.89922*u3)+(-0.59257*((u2+1)*u2))))*u2))+u3+(2.1767*u3)+(-0.14363*((u2+1)*u3)))
-325.80  0.0334999  (0.37594*((u3+(0.0969*u2)+(0.034774*u1)+(7.0719*u3)+(0.089835*u2)+(u2*u2)+(6.7472*(((u2*u2)*u2)*(((u2*u1)*exp(u2))*u2))))-u2))
-135.61  0.224395   ((-3.0775*(u1-u3))+(0.76986*exp(u1))+(0.020456*exp((5.6928*u2)))+(-5.5405*exp((2.2502*((u2-1)-u1))))+u1)
-192.13  0.130487   (exp(log((5.8368*exp((1.9544*((log(u2)*exp(exp(exp((u3-exp(exp(log(exp((-1.5595*(log(u2)*exp(u1))))))))))))*(1/u1)))))))+(2.9259*u3)+(0.25547*u2))
-271.45  0.0576895  ((-0.022154*(u5*u4))+(-0.06594*((u2+u1+1+(-0.16738*exp((6.5096*u2))))*u1))+(3.0114*u3)+(0.01343*(u1*u4)))
Table 5.26 Result of GP models with “ACE advice”
Fit      VRMSE      Model
-132.55  0.247917   ((2.9422*u3)+(0.27497*((exp((1.3184*(u2*exp(((u2-0)*u1)))))-1)-u1)))
-305.21  0.0441034  ((3.0332*u3)+(0.07077*((u2+exp((u2+u2+exp(u2))))*(u2*u1)))+(0.94659*(u2*u1)))
-228.85  0.0968511  ((0.18493*(u2*((u1*(u2*exp(((u2*exp(u2))+u2))))+u2)))+(2.9553*u3))
-328.62  0.0348978  ((-1.0041*u2)+u2+(7.0952*((u2*(u2*u2))*((u2*(u1*u2))*u2)))+(2.9977*u3))
-376.97  0.0215183  ((7.7647*(u1*(u2*(u2*((u2*u2)*u2)))))+(3.0192*u3)+(0.95916*(u1*(u2*u2))))
-219.58  0.101471   (-3.0239*(((u1/((-0.68539*u2)+1))-u3)-(u1/((0.90937*(1-u2))+(0.1741*1)))))
-127.53  0.248948   ((2.0998*u3)+u3+(-0.15247*(exp((-3.3883*(u1-u2)))-u2))+(0.0035212*((exp((7.3807*u2))-u1)-(u3-u2))))
-399.13  0.0164655  ((0.50405*(u1*u2))+(16.416*((u2+(-0.61743*1))*((u1*(u2*u2))*u2)))+(0.0076678*u2)+(2.9942*u3))
-328.62  0.0348978  ((-1.0041*u2)+(7.0952*((((u2*(u2*u1))*(u2*u2))*u2)*u2))+u2+(2.9977*u3))
-244.2   0.07933    (0.23192*(((1.9955*(((((exp(exp(u2))*u1)*u2)+(0.32828*u3))*u2)*u2))+(12.942*u3))-u2))
The GP optimization is now run in two configurations for 60 generations and a population size of 25: one with "ACE advice" (i.e., with u1, u2 and u3 only as input variables) and another without "ACE advice" (i.e., using all five input variables to model y). The configuration for the GP runs without "ACE advice" is shown in Table 5.24. A similar configuration file was used for the runs with "ACE advice", the only change being to the terminal set (u4 and u5 are removed). The success rate of the GP program is then examined. A successful run is defined as one with a root mean squared error (RMSE) below a specified threshold.
It is found that the success rate with "ACE advice" was 4 out of 10 runs, while the success rate without "ACE advice" was 1 out of 10 runs. It is therefore concluded that ACE is able to improve the success rate of GP modeling for algebraic systems. The best-fit model obtained with "ACE advice" had an RMSE of 0.0164655, compared with an RMSE of 0.0334999 for the best-fit model obtained without "ACE advice". The "quality" of the model is also better, indicating that the removal of spurious variables, using ACE, MARS or otherwise, is a necessary step in the modeling effort.
The modeling results from these runs are summarized in Tables 5.25 and 5.26. The best models are shown in bold script. The model predictions from the best model (obtained with “ACE advice”) are compared with the “true” data in Figure 5.24.
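As a sanity check on the tabulated result, the best GP model from Table 5.26 can be evaluated against data generated from Eq. (5.16), read here as y = 1000 u1 exp(−5.0/u2) + 3 u3 (a reconstruction); the error comes out of the same order as the reported VRMSE.

```python
import numpy as np

# generate test data from the (assumed) true relationship of Eq. (5.16)
rng = np.random.default_rng(3)
U = rng.uniform(size=(1000, 5))
u1, u2, u3 = U[:, 0], U[:, 1], U[:, 2]
y_true = 1000.0 * u1 * np.exp(-5.0 / u2) + 3.0 * u3

# best GP model from Table 5.26, transcribed verbatim
y_model = ((0.50405 * (u1 * u2))
           + (16.416 * ((u2 + (-0.61743 * 1)) * ((u1 * (u2 * u2)) * u2)))
           + (0.0076678 * u2)
           + (2.9942 * u3))

rmse = float(np.sqrt(np.mean((y_true - y_model) ** 2)))
```

Note that the GP model involves only u1, u2 and u3, exactly the variables ACE retained.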
[Figure: model predictions vs. data over the 100 samples (x-axis: sample, 0–100; y-axis: 0–8); legend: Data, Model = ((0.50405*(u1*u2))+(16.416*((u2+(-0.61743*1))*((u1*(u2*u2))*u2)))+(0.0076678*u2)+(2.9942*u3))]
Figure 5.24 Data vs. model prediction for Case Study 12