Identification of discrete dynamic algebraic system

The identification of a dynamic system can often be cast as the identification of a static system (steady-state modeling) if lagged (past) values of the process variables are employed.

This is the case if the noise is assumed to be Gaussian white noise. For modeling such dynamic systems, a variable is split into several genes, each with a different delay. For example, a variable u1 may be split into four variables: u1(t-1), u1(t-2), u1(t-3) and u1(t-4). We consider these four variables as independent as far as the modeling is concerned.

In a tree representation, genetic operations treat these four variables essentially separately. In our representation, however, a genetic operation can recognize their common ancestor, since we use the 3rd layer to handle delays. The adaptive operator manipulates the 3rd layer only, so the model structure is not altered.
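The splitting of one variable into lagged copies described above can be sketched as follows. This is a minimal illustration and not DACS-GP code: the function name `lagged_regressors` and the sample values are hypothetical.

```python
def lagged_regressors(series, max_lag):
    """Return one row [x(k-1), ..., x(k-max_lag)] for each k with a full history."""
    return [[series[k - d] for d in range(1, max_lag + 1)]
            for k in range(max_lag, len(series))]

# Hypothetical input sequence u1(0), ..., u1(5).
u1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

# Each row holds u1(k-1), u1(k-2), u1(k-3), u1(k-4); rows exist for k = 4 and k = 5.
rows = lagged_regressors(u1, 4)
print(rows)
```

Each column of `rows` then plays the role of one independent terminal in a static regression problem.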

Case study 4: Simulated nonlinear dynamical system

For this case study, the data was generated using the system model:

$$
y(k) = 3\,y(k-1) + 0.35\,y(k-1)^2 - 0.4\,y(k-1)\,u(k-1) - 2.5\,u(k-1) + \text{noise} \qquad (5.1)
$$

The middle terms introduce significant nonlinearity and make the system identification procedure challenging. The data set consisted of 500 samples. Table 5.7 contains the details of the parameters used in the configuration file, and the results are shown in Table 5.8. Expanding the GP model from the first run yields

$$
y(k) = -0.3988\,u(k-1)\,y(k-1) + 0.3490\,y(k-1)^2 + 3\,y(k-1) - 2.5\,u(k-1)
$$

which recovers the system model; the coefficients 0.3490 and 0.3988 are close to the true values 0.35 and 0.4. The number of successful runs is expected to increase with a larger population size or more generations.

Table 5.7 GP configuration script file for Case Study 4

| Parameter | Value |
|---|---|
| Terminal set | 1, u(k-1), u(k-2), u(k-3), u(k-4), y(k-1), y(k-2), y(k-3), y(k-4) |
| Functional set | +, −, ÷, ×, ^2, ^3 |
| Number of generations | 20 |
| Population size | 200 |
| Probability of genetic operator (Mutation, Crossover, Reproduction, Permutation) | 0.3:0.5:0.1:0.1 |
| Constants used in computing fitness measure, k1:k2:k3 | 0.45:0.45:0.1 |

Table 5.8 Results of GP runs for Case Study 4

| Fitness | RMSE | Model |
|---|---|---|
| 1630.949 | 4.191e-008 | (((((-1.067)*u1B1)+(1.027*yB1)))^2-(((((-1.067)*u1B1)+(0.8401*yB1)))^2-((3*yB1)-(2.5*u1B1)))) |
| 1630.949 | 4.191e-008 | (((((-1.067)*u1B1)+(1.027*yB1)))^2-(((((-1.067)*u1B1)+(0.8401*yB1)))^2-((3*yB1)-(2.5*u1B1)))) |
| 400.143 | 1.033e-002 | (((((-0.3231)*u1B1)+(0.5927*yB1)))^2-(((-3.009)*yB1)-((-2.416)*u1B1))) |
| 1630.949 | 4.191e-008 | (((((-1.067)*u1B1)+(1.027*yB1)))^2-(((((-1.067)*u1B1)+(0.8401*yB1)))^2-((3*yB1)-(2.5*u1B1)))) |
| 1630.949 | 4.191e-008 | (((((-0.1388)*u1B1)+(0.8418*yB1)))^2-(((((-0.1388)*u1B1)-(0.5988*yB1)))^2-((3*yB1)-(2.5*u1B1)))) |
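The expansion of the first model in Table 5.8 can be checked by hand. That model has the form A^2 - (B^2 - (3*yB1 - 2.5*u1B1)), with A and B linear in u1B1 and yB1, so only the difference of squares needs expanding. The sketch below does this with plain arithmetic; the helper name is hypothetical.

```python
def expand_difference_of_squares(a_u, a_y, b_u, b_y):
    """Coefficients on u^2, u*y and y^2 of (a_u*u + a_y*y)^2 - (b_u*u + b_y*y)^2."""
    return (a_u ** 2 - b_u ** 2,
            2 * (a_u * a_y - b_u * b_y),
            a_y ** 2 - b_y ** 2)

# Coefficients of A and B taken from the first row of Table 5.8.
c_uu, c_uy, c_yy = expand_difference_of_squares(-1.067, 1.027, -1.067, 0.8401)
print(c_uu, c_uy, c_yy)
```

The u^2 terms cancel, the cross term comes out near -0.3988 and the y^2 term near 0.3490. Together with the remaining 3*yB1 - 2.5*u1B1, this matches the expanded model quoted in the text.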

Case study 5: Experimental heat exchanger system

In this case study, we identified a dynamical model using data obtained from an experimental heat exchanger system. The data set was provided by Dr. Eskinat and is also considered by Eskinat et al. (1991) in the context of identifying a block-oriented nonlinear model (a Hammerstein model, which is characterized by a nonlinear static element followed by a linear dynamic element). The experimental setup and the details of the hardware are available in that reference. The nonlinearity in the system is caused by the presence of two distinct operating regions corresponding to the high and low process water flow rates. A data set containing 334 input-output samples was available. Sixty-five percent of this data set was used to obtain the model using DACS-GP, and the remaining data was used for model validation.

Table 5.9 shows the configuration details and Table 5.10 shows the results from several GP runs. Figure 5.6 shows the model fit (first 200 samples) and the model validation (last 134 samples) for this data set. It is seen that there is no degradation of prediction quality as we move from the training data set to the validation set.

It must be pointed out that this same data set was analyzed by Lakshminarayanan et al. (1995) using multivariate statistical tools to arrive at a Hammerstein model. The Hammerstein model, with an RMSE of 0.01943, was determined as:

$$
y = 0.80072\,\mathrm{yB1} - 6.4343\,\mathrm{u2B1} + 9.1072\,(\mathrm{u2B1})^2 + 7.1592\,(\mathrm{u2B1})^3 \qquad (5.2)
$$

The best GP models were found to provide marginally better predictions than the Hammerstein model. The model selected as “best” from among those obtained using GP is quite similar to the Hammerstein model, except that the GP model employs the term u2B4 instead of (u2B1)^3.
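The similarity is easier to see once the squared term of the “best” GP model (first row of Table 5.10) is expanded. A short worked step, using only the coefficients reported in that table and the fact that 2.7483^2 ≈ 7.5532:

```latex
\begin{aligned}
y &= \mathrm{u2B4} + (2.7483\,\mathrm{u2B1})^2 - 6.1655\,\mathrm{u2B1} + 0.8356\,\mathrm{yB1} \\
  &= 0.8356\,\mathrm{yB1} - 6.1655\,\mathrm{u2B1} + 7.5532\,(\mathrm{u2B1})^2 + \mathrm{u2B4}
\end{aligned}
```

In this form the GP model has a linear term in yB1 and linear and quadratic terms in u2B1, as the Hammerstein model does, and differs in its last term. The same expanded expression appears directly in other rows of Table 5.10.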

Table 5.9 GP configuration for Case Study 5

| Parameter | Value |
|---|---|
| Terminal set | 1, u(k-1), u(k-2), u(k-3), u(k-4), y(k-1), y(k-2), y(k-3), y(k-4) |
| Functional set | +, −, ÷, ×, ^2, ^3 |
| Number of generations | 20 |
| Population size | 150 |
| Probability of genetic operator [Mutation, Crossover, Reproduction, Permutation, Adaptation, Super-crossover] | [0.15 0.5 0.1 0.05 0.15 0.05] |
| Constants used in computing fitness measure, k1:k2:k3 | 0.5:0.5:0.5 |
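One plausible reading of the operator probabilities in Table 5.9 is that, each time a new individual is to be produced, a single genetic operator is drawn at random with these weights. The sketch below illustrates that reading only; how DACS-GP actually applies the probabilities is not stated here, and the names `OPERATORS` and `pick_operator` are hypothetical.

```python
import random

OPERATORS = ["mutation", "crossover", "reproduction",
             "permutation", "adaptation", "super-crossover"]
PROBABILITIES = [0.15, 0.5, 0.1, 0.05, 0.15, 0.05]  # values from Table 5.9

def pick_operator(rng):
    """Draw one operator name, weighted by the configured probabilities."""
    return rng.choices(OPERATORS, weights=PROBABILITIES, k=1)[0]

# Fixed seed so that the illustration is repeatable.
rng = random.Random(0)
counts = {name: 0 for name in OPERATORS}
for _ in range(10000):
    counts[pick_operator(rng)] += 1
print(counts)
```

Because `random.choices` normalizes its weights, the same code also works when the listed probabilities do not sum exactly to one.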

Table 5.10 Results of GP runs for Case Study 5

| Fitness | RMSE | Model |
|---|---|---|
| -181.641 | 0.01888 | (u2B4+((2.7483*u2B1)^2)+(-6.1655*u2B1)+(0.8356*yB1)) |
| -182.881 | 0.01910 | (u2B4+(0.83703*yB1)+u2B1+(-7.1662*(u2B1-(u2B1^2)))) |
| -181.692 | 0.01765 | ((1.995*((u2B2+u2B1)^2))+u2B4+(-6.1899*u2B1)+(0.83346*yB1)) |
| -183.008 | 0.01737 | (0.82799*((u2B2^2)+yB1+u2B4+((2.9253*u2B1)^2)+(-7.4522*u2B1))) |
| -184.825 | 0.01810 | ((-6.1737*u2B1)+((2.5823*u2B1)^2)+(u2B2^2)+u2B4+(0.83482*yB1)) |
| -186.402 | 0.01801 | ((u2B4*u2B2)+(7.1767*((u2B1-1)*u2B1))+(0.83323*yB1)+u2B4+u2B1) |
| -180.891 | 0.19539 | ((0.8356*yB1)+(7.5532*(u2B1*u2B1))+(-6.1655*u2B1)+u2B4) |
| -181.641 | 0.19539 | ((0.8356*yB1)+u2B4+(-6.1655*u2B1)+(7.5532*(u2B1^2))) |
| -181.881 | 0.01910 | (u2B4+(0.83703*yB1)+u2B1+(7.1662*((u2B1-1)*u2B1))) |
| -181.641 | 0.01887 | ((0.8356*yB1)+u2B4+(-6.1655*u2B1)+(7.5532*(u2B1^2))) |

Figure 5.6 Model fit and prediction of the “best” GP model for Case Study 5 (x-axis: sample; legend: measured values, GP model prediction; model shown: u2B4+((2.7483*u2B1)^2)+(-6.1656*u2B1)+(0.8356*yB1))
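The model strings in Table 5.10 use a compact notation in which, for example, u2B4 denotes u2 delayed by four samples and yB1 denotes y delayed by one sample. The sketch below shows one way such a string could be evaluated for one-step-ahead prediction. It is an illustration, not DACS-GP code: the function names and the toy data are hypothetical, and only lagged terminals of the form shown are handled.

```python
import math
import re

# Matches lagged terminals such as u2B4 or yB1: variable name, the letter B, then the lag.
LAGGED = re.compile(r"\b([uy]\d*)B(\d+)\b")

def predict(model, data, k):
    """Evaluate a model string at sample k, reading 'u2B4' as data['u2'][k - 4].

    k must be at least the largest lag in the model.
    """
    def lookup(match):
        name, lag = match.group(1), int(match.group(2))
        # Parenthesize the value so that a negative sample is squared correctly.
        return "(" + repr(float(data[name][k - lag])) + ")"
    expression = LAGGED.sub(lookup, model).replace("^", "**")
    # Only arithmetic remains after substitution, so builtins are disabled.
    return eval(expression, {"__builtins__": {}}, {})

def rmse(model, data, start):
    """RMSE of the one-step-ahead predictions for samples start to the end."""
    errors = [predict(model, data, k) - data["y"][k]
              for k in range(start, len(data["y"]))]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

best = "(u2B4+((2.7483*u2B1)^2)+(-6.1655*u2B1)+(0.8356*yB1))"
# Made-up numbers for illustration only; not the heat exchanger data.
toy = {"u2": [0.1, 0.2, 0.3, 0.4, 0.5], "y": [1.0, 1.0, 1.0, 1.0, 1.0]}
print(predict(best, toy, 4), rmse(best, toy, 4))
```

Measured past outputs are used for yB1 here, which corresponds to one-step-ahead prediction; a free-run simulation would feed the model's own past predictions back instead.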

Case study 6: Modeling of an acid-base neutralization system

Modeling and control of a pH process (neutralization of an acid with a base) has long been employed as a benchmark problem due to the highly nonlinear behavior of this system. First principles modeling of the neutralization process results in highly nonlinear equations that involve equilibrium constants that are often unavailable. In such a situation, black-box modeling is often the only choice.

In this example, we consider an acid-base neutralization process performed in a single tank. The system description, the nonlinear process model and the operating conditions can be found in Henson and Seborg (1994). Identification of the same system using multivariate statistical tools can also be found in Lakshminarayanan et al. (1995).

The process consists of an acid (HNO3) stream, buffer (NaHCO3) stream and base (NaOH) stream that are mixed in a stirred tank. The chemical equilibria are modeled by introducing two reaction invariants for each inlet stream:

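$$
W_{ai} = [\mathrm{H}^+]_i - [\mathrm{OH}^-]_i - [\mathrm{HCO}_3^-]_i - 2\,[\mathrm{CO}_3^{2-}]_i \qquad (5.3)
$$

$$
W_{bi} = [\mathrm{H_2CO_3}]_i + [\mathrm{HCO}_3^-]_i + [\mathrm{CO}_3^{2-}]_i \qquad (5.4)
$$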

where i = 1 for the acid stream, i = 2 for the buffer stream and i = 3 for the base stream.

By combining mass balances on each of the ionic species in the system, the following differential equations for the effluent reaction invariants can be derived (Hall and Seborg, 1989).

$$
\frac{dW_{a4}}{dt} = \frac{1}{Ah}(W_{a1} - W_{a4})\,q_1 + \frac{1}{Ah}(W_{a2} - W_{a4})\,q_2 + \frac{1}{Ah}(W_{a3} - W_{a4})\,q_3 \qquad (5.5)
$$

$$
\frac{dW_{b4}}{dt} = \frac{1}{Ah}(W_{b1} - W_{b4})\,q_1 + \frac{1}{Ah}(W_{b2} - W_{b4})\,q_2 + \frac{1}{Ah}(W_{b3} - W_{b4})\,q_3 \qquad (5.6)
$$

In the above equations, q1, q2 and q3 are the volumetric flow rates of the acid, buffer and base streams respectively, A is the cross-sectional area of the mixing tank and h is the liquid level.
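A quick consistency check on the balances (5.5) and (5.6): assuming each inlet contributes a term of the form q_i (W_i - W_4) / (A h), setting the time derivatives to zero gives the effluent invariants as flow-weighted averages of the inlet invariants. The sketch below evaluates this with the nominal values listed in Table 5.11; the function name is hypothetical.

```python
def steady_state_invariant(flows, inlet_values):
    """Flow-weighted average of the inlet invariants (zero of the balance's right-hand side)."""
    return sum(q * w for q, w in zip(flows, inlet_values)) / sum(flows)

flows = [16.6, 0.55, 15.6]        # q1, q2, q3 in ml/s (Table 5.11)
wa_in = [0.003, -0.03, -3.05e-3]  # Wa1, Wa2, Wa3 in M (Table 5.11)
wb_in = [0.0, 0.03, 5e-5]         # Wb1, Wb2, Wb3 in M (Table 5.11)

wa4 = steady_state_invariant(flows, wa_in)
wb4 = steady_state_invariant(flows, wb_in)
print(wa4, wb4)
```

The computed values agree with the tabulated Wa4 and Wb4 to within about 1%, which is consistent with rounding in the listed inputs.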

The effluent pH is determined from Wa4 and Wb4 using the following relation:

$$
W_{a4} + 10^{\mathrm{pH}-14} + W_{b4}\,\frac{1 + 2 \times 10^{\mathrm{pH}-pK_2}}{1 + 10^{pK_1-\mathrm{pH}} + 10^{\mathrm{pH}-pK_2}} - 10^{-\mathrm{pH}} = 0 \qquad (5.7)
$$

where pK1 and pK2 are the negative base-10 logarithms of the equilibrium constants associated with H2CO3 and HCO3- dissociation respectively (e.g. pK1 = -log10 K1).
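Relation (5.7) is implicit in pH, so it has to be solved numerically for given Wa4 and Wb4. The sketch below assumes the relation has the form given by Henson and Seborg (1994), namely Wa4 + 10^(pH-14) + Wb4 (1 + 2·10^(pH-pK2)) / (1 + 10^(pK1-pH) + 10^(pH-pK2)) - 10^(-pH) = 0, and uses simple bisection. The function names are hypothetical, and bisection is one possible root finder, not necessarily the one used in the original work.

```python
import math

def ph_residual(ph, wa4, wb4, pk1, pk2):
    """Left-hand side of the implicit pH relation; zero at the effluent pH."""
    carbonate = (1 + 2 * 10 ** (ph - pk2)) / (1 + 10 ** (pk1 - ph) + 10 ** (ph - pk2))
    return wa4 + 10 ** (ph - 14) + wb4 * carbonate - 10 ** (-ph)

def solve_ph(wa4, wb4, pk1, pk2, lo=0.0, hi=14.0, tol=1e-10):
    """Bisection on [lo, hi]; assumes the residual changes sign once on the interval."""
    f_lo = ph_residual(lo, wa4, wb4, pk1, pk2)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = ph_residual(mid, wa4, wb4, pk1, pk2)
        if (f_mid < 0) == (f_lo < 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

# Equilibrium constants and nominal invariants as listed in Table 5.11.
pk1 = -math.log10(4.47e-7)
pk2 = -math.log10(5.62e-11)
ph = solve_ph(-4.32e-4, 5.28e-4, pk1, pk2)
print(ph)
```

With the nominal invariants, the root lies close to the nominal pH of 7 listed in Table 5.11.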

Because the pH probe is located downstream from the mixing tank, there is a time delay of approximately 10 seconds associated with the pH measurement.

Figure 5.7 Schematic of the acid-base neutralization system (labels: acid stream, buffer stream, base stream, level, y1 (pH))

The liquid level is modeled as:

$$
\frac{dh}{dt} = \frac{1}{A}\left[\,q_1 + q_2 + q_3 - C_v h^{n}\,\right] \qquad (5.8)
$$

where Cv is the valve coefficient and n is the valve exponent (the valve is on the exit line).
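The level equation (5.8), read as dh/dt = (q1 + q2 + q3 - Cv h^n) / A, can be explored with a simple explicit Euler integration. The sketch below makes one loud assumption: a value for Cv is not given in the values available here, so it is back-calculated from the requirement that dh/dt = 0 at the nominal operating point of Table 5.11. The step in q3, the step size and the function names are all hypothetical.

```python
A = 207.0                        # tank cross-sectional area, cm^2 (Table 5.11)
N_EXP = 0.607                    # valve exponent n (Table 5.11)
H_NOMINAL = 14.0                 # nominal liquid level, cm (Table 5.11)
Q_NOMINAL = (16.6, 0.55, 15.6)   # q1, q2, q3 in ml/s (Table 5.11)

# Assumption: Cv is inferred so that the nominal point is a steady state.
CV = sum(Q_NOMINAL) / H_NOMINAL ** N_EXP

def level_rate(h, q1, q2, q3):
    """Right-hand side of the level equation: net inflow divided by tank area."""
    return (q1 + q2 + q3 - CV * h ** N_EXP) / A

def simulate_level(h0, q1, q2, q3, dt=1.0, steps=600):
    """Explicit Euler integration of the level over steps * dt seconds."""
    h = h0
    for _ in range(steps):
        h += dt * level_rate(h, q1, q2, q3)
    return h

# Hypothetical test: raise the base flow q3 from 15.6 to 17.6 ml/s.
h_end = simulate_level(H_NOMINAL, 16.6, 0.55, 17.6)
print(CV, h_end)
```

A higher total inflow raises the level until the outflow through the exit valve balances it again.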

The nominal operating conditions (steady state values) and parameter values are given in Table 5.11. The schematic of the system is depicted in Figure 5.7.

The simulated input-output data (the three flow rates are the inputs and pH is the output) were used to identify linear models with the Matlab® System Identification toolbox. None of the linear models (ARX, ARMAX, N4SID) was found to be satisfactory. Figure 5.8 shows the predictions obtained with the best ARX model. The inadequacy of the linear dynamic models suggests that a nonlinear model is needed.

Table 5.11 Nominal operating conditions for the acid-base neutralization system

| Variable | Value |
|---|---|
| Wa1 | 0.003 M |
| Wa2 | -0.03 M |
| Wa3 | -3.05 × 10^-3 M |
| Wb1 | 0 M |
| Wb2 | 0.03 M |
| Wb3 | 5 × 10^-5 M |
| K1 | 4.47 × 10^-7 |
| K2 | 5.62 × 10^-11 |
| A | 207 cm^2 |
| q1 | 16.6 ml/s |
| q2 | 0.55 ml/s |
| q3 | 15.6 ml/s |
| Wa4 | -4.32 × 10^-4 M |
| Wb4 | 5.28 × 10^-4 M |
| h | 14 cm |
| pH | 7 |
| n | 0.607 |

Parameterization of the Hammerstein structure is straightforward in the SISO case. For MIMO models, however, the structure is complex because the input polynomials can be combined in many ways. The inputs can be considered interactive or non-interactive with each other. Depending on this, two parameterization schemes can be realized: separated parameterization and combined parameterization (Lakshminarayanan et al., 1995). The predictions with the identified separately parameterized Hammerstein model are given in Figure 5.9. Even this model was not of acceptable quality. Such a problem is a good candidate to test the ability of the GP method to handle complex nonlinearity. The GP parameter setup for this problem is shown in Table 5.12.

Figure 5.8 Prediction with the best ARX model (plant: noisy trajectory; model: smooth trajectory; panel title: ARX [3 3 10]; x-axis: Time)

Figure 5.9 Predictions obtained with Hammerstein model (plant: noisy trajectory; model: smooth trajectory; panel title: 2nd-order separated Hammerstein model)

Table 5.12 Configuration for Case Study 6 (acid-base system)

| Parameter | Value |
|---|---|
| Terminal set | 1, uB1, uB2, uB3, uB4, uB5, yB1, yB2, yB3, yB4, yB5 |
| Functional set | +, −, ÷, ×, ^2, ^3 |
| Number of generations | 20 |
| Population size | 200 |
| Probability of genetic operator (Mutation, Crossover, Reproduction, Permutation) | 0.3:0.5:0.1:0.1 |
| Constants used in computing fitness measure, k1:k2:k3 | 0.45:0.45:0.1 |


Table 5.13 Result of GP runs on Case Study 6

| Fitness | RMSE | Model |
|---|---|---|
| 977.115 | 0.00734447 | 0.013018 (u1B1 * yB1) + -0.0031163 (yB2 * u1B2) + 0 u1B1 + 0.0035623 (yB3 * u1B3) + 0.080392 u2B2 + -1.1756 (u1B1 / yB1) + -0.036653 yB3 + -0.0705 u3B2 + -2.2356 u2B1 + 0.17191 17 + 0 yB2 + 1320.0871 (1/u1B1^2) + -0.016273 u3B8 + 0 16 + 0.070096 (yB2 * yB2) + 0 yB3 + 0 0.11111 + 0.5101 u1B1 + -0.0045406 (u3B5 * u1B5) + -0.52374 u3B7 + 0.14122 u3B5 + 0.26807 (u2B1 * yB1) + -1.0181 yB2 + 0.016388 (u3B7 * u3B7) + 0.0072806 (u3B2 * yB2) |
| 906.209 | 0.00453963 | 0.0524 u3B6 + 0.44573 16 + -0.36249 u1B1 + -0.012566 u3B7 + -0.45702 u2B1 + 0.049591 (u1B1 * yB1) + -0.015443 u3B8 |
| 809.359 | 0.00105849 | (0.91337 yB1 + (-1.1615 + (((u3B5 + 4) + u3B7) / (u1B1 + (((1 + (1/((((1 + 11.222 u1B2) - (u3B7)^2) + yB4) + yB5)^2)) + 1) + 1))))) |

The data was modeled using an older version of DACS-GP (version 0.5.4). We have two data sets with 500 samples in each of them. One of the data sets is used for modeling and the other is used for validation.

The results of the GP runs are shown in Table 5.13. The fit and the validation of the best GP model are shown in Figure 5.10. It is seen that the GP model provides much better prediction of the pH as it responds to changes in the flow rates of the three process streams.

Figure 5.10 Plot of model vs. data (x-axis: Samples, y-axis: pH; panels: estimation data, fresh data validation; legend: actual, model)

Case study 7: Modeling of rainfall-runoff data

Davidson et al. (2003) describe a state-of-the-art implementation of GP and include modeling of rainfall-runoff data as one of the applications of their tool. We consider the same data and example to illustrate how DACS-GP compares with the implementation of Davidson et al. (2003). The description of the data and their GP models can be found in the above reference. The data set has 1702 samples, of which we employ the first 1200 for modeling and retain the rest for validation. Table 5.14 shows the configuration file for this problem and Table 5.15 contains the results from the GP runs.
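The data partition used in this case study (samples 1 to 800 for parameter estimation and 801 to 1200 for fitness evaluation, as listed in Table 5.14, with the remaining samples held back for validation) can be written down explicitly. The sketch below is illustrative only: the function names are hypothetical, indices are 0-based, and the RMSE shown is the usual root-mean-square definition, assumed to match the one reported in the tables.

```python
import math

def split_samples(n_total=1702, n_estimation=800, n_modeling=1200):
    """Return 0-based, end-exclusive index ranges for the three data roles."""
    return {
        "parameter_estimation": range(0, n_estimation),
        "fitness_evaluation": range(n_estimation, n_modeling),
        "validation": range(n_modeling, n_total),
    }

def rmse(measured, predicted):
    """Root-mean-square error between two equally long sequences."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

parts = split_samples()
print({name: len(indices) for name, indices in parts.items()})
```

Keeping the last block of samples untouched during the GP run is what allows the fresh-data validation reported for the selected model.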


Table 5.14 Configuration for rainfall-runoff modeling (Case Study 7)

| Parameter | Value |
|---|---|
| Terminal set | 1, u, uB1, uB2, yB1, yB2, yB3 |
| Parameter estimation data | samples 1 to 800 |
| Fitness evaluation data | samples 801 to 1200 |
| Functional set | +, −, ÷, ×, ^2, ^3 |
| Number of generations | 20 |
| Population size | 200 |
| Probability of genetic operator (Mutation, Crossover, Reproduction, Permutation, Adaptation, Super-crossover) | [0.2 0.56 0.17 0.034 0.013 0.024] |
| Constants used in computing fitness measure, k1:k2:k3 | 0.5:0.5:0.5 |

In their paper, Davidson et al. (2003) reported an RMSE of 0.2846 for the “best” model. The models obtained with DACS-GP show similar or slightly lower RMSE values. The validation of the “best” model is shown in Figure 5.11 below.

Table 5.15 Result of GP runs for Case Study 7

| No | Fitness | RMSE | Model |
|---|---|---|---|
| 1 | -2134.66 | 0.279483 | ((-0.17734*(yB1^2))+(0.0004408*((-0.0025082*((-4.687*(yB1^2))^2))^2))+(0.72724*yB1)+(-0.075504*(yB3^2))+(0.046036*(((0.55024*u1B1)+yB3+u1+(0.55177*1))^2))) |
| 2 | -2162.2 | 0.277294 | ((0.034798*(((((u1B1+u1)-(yB2+(0.34514*(yB1*yB1))))*yB3)+u1B1+1+u1)*u1))+(((0.10078*u1B1)+(0.051432*u1))-((-0.6158*yB1)+(0.11814*(yB1*yB1))))+(0.093665*yB2)) |
| 3 | -573.735 | 0.302172 | ((0.066685*(u1B1*u1))+((0.23146*u1)^2)+(0.047766*(u1B1-((2.1034*yB1)^2)))+yB1+(-0.059359*u1B2)+(0.071113*(yB3*u1))) |

Figure 5.11 Fresh validation of model no. 2 (x-axis: sample; y-axis: runoff (millimeter); legend: data, model)
