New relationship for estimation of wave overtopping on vertical walls

Soft computing tools, in the form of a combination of multiple nonlinear regression and the M5′ model tree, were used to estimate the overtopping rate at vertical coastal structures. For reliable and precise estimation, the experimental data available in the CLASH database were used. The dimensionless overtopping rate was estimated in terms of conventional dimensionless parameters, including the relative crest freeboard Rc/Hs, seabed slope tanθ, deep water wave steepness Som, surf similarity ξom and local relative water depth ht/Hs. The accuracy of the new model was compared with that of existing models and was also evaluated against field measurements. Statistical parameters show that the model presented in this paper predicts the overtopping rate more accurately than the existing models.


Introduction
Seawalls are usually massive, vertical structures that are used to protect backshore areas from heavy wave action and, in lower wave energy environments, to separate land from water. For example, a caisson-type structure was used as a quay wall before the construction of breakwaters exposed to the open sea in Iran (Alikhani et al., 2003).
Reliable estimation of the wave overtopping rate is required for the planning and safety assessment of coastal structures. Overestimating the overtopping rate causes additional cost, and underestimating it can result in irreparable damage. The wave overtopping rate (q, in m³/s per unit width) is a fundamental design factor for coastal structures, which are designed to keep q below an allowable amount. The prediction of wave overtopping has therefore been studied by many researchers (van der Meer and Bruce, 2014). Verhaeghe (2005) presents a summary of various proposed formulae for the prediction of the overtopping rate. Goda et al. (1975) prepared design diagrams for the evaluation of overtopping using the equivalent deep water wave height H0′ as the main parameter. EurOtop (2007) presented several wave overtopping formulae for various kinds of coastal structures, employing the significant wave height at the toe of the structure, Hs, as the key parameter.
Recently, the artificial neural network (ANN), a soft computing method, has been employed to predict the mean wave overtopping rate for coastal structures. These studies were carried out within the European project CLASH (de Rouck et al., 2009). Van Gent et al. (2007) presented an ANN model for various types of coastal structures. Verhaeghe et al. (2008) developed a two-phase neural model to predict wave overtopping rates.
Here, the overtopping rate at vertical coastal structures was studied with data from the CLASH database, using soft computing tools in the form of a combination of multiple nonlinear regression and the M5′ model tree. The results were compared with those of other models as well as with available field data.

Existing formulas for estimation of overtopping
The best-known empirical models for vertical caisson breakwaters have been proposed by Franco et al. (1994). In Eq. (2), hs is the local water depth and Tm is the average wave period.
In fact, the number defined by Eq. (2) is the product of two dimensionless parameters: the ratio of the water depth to the wave height and the ratio of the water depth to the deep water wavelength. The formulae for estimating the overtopping rate are then written in terms of this number.
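As a worked form of the statement above, the product of the two ratios can be written out explicitly if the deep water wavelength is taken from linear wave theory. The symbol L_0 is introduced here only for illustration, and any leading constant used in the original Eq. (2) is not recoverable from the text and is therefore omitted.

```latex
\frac{h_s}{H_s}\cdot\frac{h_s}{L_0}
  = \frac{h_s}{H_s}\cdot\frac{2\pi h_s}{g\,T_m^{2}},
\qquad
L_0 = \frac{g\,T_m^{2}}{2\pi}
```

Small values of this product correspond to shallow water relative to both the wave height and the wavelength.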
Another formula is provided by Goda (2009). The constant coefficients in these equations are determined according to Table 1.

M5′ model tree and multiple regression
The decision tree is one of the most powerful and popular tools for the classification and prediction of data. Unlike neural networks, which treat input data only as numbers, a model tree places no limit on the type of input data. The most important advantage of the decision tree over the neural network is that it produces rules: the predictions are expressed in the form of a few rules. The rules are validated with a set of test data that was not used to build the tree.
The basic form of the model tree was presented by Quinlan (1986) as a new soft computing tool. These models fit linear functions to subsets of the data. The most important advantage of model trees over other soft computing tools is that they provide clear mathematical relationships. Recently, these models have been used in the fields of sediment transport (Bhattacharya et al., 2007), wave spectrum prediction (Sakhare and Deo, 2009), and significant wave height prediction (Etemad-Shahidi and Mahjoobi, 2009). In this study, the M5′ model tree is used to present separate models for predicting the average rate of wave overtopping on vertical structures.
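The idea of fitting linear functions to subsets of the data can be illustrated with a minimal sketch. This is not the M5′ algorithm itself (which also uses a standard-deviation-reduction splitting criterion, pruning and smoothing); it is a one-split tree with a linear model in each leaf, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y ~ slope * x + intercept."""
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # array([slope, intercept])

def sum_sq_error(x, y, coef):
    """Sum of squared residuals of a fitted line."""
    return float(np.sum((coef[0] * x + coef[1] - y) ** 2))

def fit_one_split_tree(x, y, min_leaf=3):
    """Pick the threshold whose two linear leaves give the lowest total error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]
    best = None
    for i in range(min_leaf, len(x) - min_leaf + 1):
        if x[i] == x[i - 1]:
            continue  # no threshold can separate identical inputs
        left = fit_line(x[:i], y[:i])
        right = fit_line(x[i:], y[i:])
        err = sum_sq_error(x[:i], y[:i], left) + sum_sq_error(x[i:], y[i:], right)
        if best is None or err < best[0]:
            best = (err, 0.5 * (x[i - 1] + x[i]), left, right)
    if best is None:
        raise ValueError("not enough distinct data points for a split")
    return best[1], best[2], best[3]  # threshold, left leaf, right leaf

def predict(tree, x):
    """Evaluate the leaf model that each input falls into."""
    threshold, left, right = tree
    x = np.asarray(x, dtype=float)
    return np.where(x <= threshold,
                    left[0] * x + left[1],
                    right[0] * x + right[1])

# Synthetic data with two linear regimes (illustrative only).
x = np.linspace(0.0, 2.0, 21)
y = np.where(x < 1.05, -2.0 * x, 1.0 - 5.0 * x)
tree = fit_one_split_tree(x, y)
```

Each leaf is an explicit linear equation, which is the sense in which a model tree yields clear mathematical relationships rather than a black box.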

Data selected from the database CLASH
The data used in this study were selected from the database created in the CLASH project (de Rouck et al., 2009). In this project, an extensive database on wave overtopping, including field measurements and laboratory data, was developed. The database contains a large body of experimental data obtained from 163 independent experiments performed in institutes and laboratories around the world. Further details about the collection of the database are presented by Verhaeghe (2005) and van der Meer et al. (2009).
The data selected for this study involve wave overtopping on vertical structures without any superstructure with a reverse slope and without a protective rubble mound in front of the caisson. The data for this type of structure are classified by the correction factor for permeability and roughness (γf) and the slope of the front face of the structure (cotα). As part of the CLASH project, a series of tests was conducted to evaluate the combined effect of roughness and permeability for various types of structures. Structures with γf=1 are considered smooth and impermeable (the data used in this study), and structures with γf≤0.6 are considered rubble mound structures.
Wave overtopping studies within the CLASH project show that, because of model and scale effects and errors caused by different measurement techniques, real conditions cannot be simulated accurately in a laboratory. For example, the effects of sea currents or winds are generally not taken into account in the laboratory. If large-scale as well as small-scale data are used for model training, scale effects may affect the accuracy of the model. To avoid a misleading model, data from large-scale tests (including tests with a significant wave height larger than 5.0 m and field measurements) were not used for training.
A Complexity Factor (CF) and a Reliability Factor (RF) are allocated to each experiment in the CLASH database. The complexity factor ranges from one for structures with a very simple cross-section to four for structures with a very complex cross-section. Similarly, the reliability factor ranges from one for very reliable tests to four for non-reliable tests. For this study, the data with CF=4 and RF=4 have been excluded. The selected data concern simple structures (without a platform) with a single slope, without a reverse-sloped superstructure and without a protective stone structure.
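The selection criteria described in this section can be summarized as a simple filter. The sketch below is only an illustration: the record fields are hypothetical names rather than the actual CLASH column names, a vertical face is assumed to be coded as cotα = 0, and a test is assumed to be excluded when either CF = 4 or RF = 4.

```python
from dataclasses import dataclass

@dataclass
class OvertoppingRecord:
    gamma_f: float    # roughness/permeability factor (1 = smooth, impermeable)
    cot_alpha: float  # front-face slope; assumed 0 for a vertical face
    cf: int           # complexity factor, 1 (simple) to 4 (very complex)
    rf: int           # reliability factor, 1 (reliable) to 4 (non-reliable)
    hs: float         # significant wave height at the toe [m]
    is_field: bool    # True for field measurements

def is_selected(rec, hs_max=5.0):
    """Keep small-scale laboratory tests on smooth vertical structures."""
    return (
        rec.gamma_f == 1.0
        and rec.cot_alpha == 0.0
        and rec.cf < 4            # assumption: CF = 4 alone excludes a test
        and rec.rf < 4            # assumption: RF = 4 alone excludes a test
        and rec.hs <= hs_max      # large-scale tests are left out of training
        and not rec.is_field
    )

# Made-up records, one per exclusion rule.
records = [
    OvertoppingRecord(1.0, 0.0, 1, 2, 0.12, False),  # kept
    OvertoppingRecord(0.5, 1.5, 2, 2, 0.10, False),  # rough, sloping structure
    OvertoppingRecord(1.0, 0.0, 4, 1, 0.15, False),  # CF = 4
    OvertoppingRecord(1.0, 0.0, 1, 4, 0.15, False),  # RF = 4
    OvertoppingRecord(1.0, 0.0, 1, 1, 2.00, True),   # field measurement
]
selected = [rec for rec in records if is_selected(rec)]
```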

Modeling
The experiments available in CLASH were performed at different scales (such as 1:20, 1:30, and 1:40). Therefore, the measured overtopping rates are not comparable with each other unless they are converted to dimensionless parameters. Consequently, we use dimensionless parameters such as the relative crest freeboard Rc/Hs, seabed slope tanθ, deep water wave steepness Som, surf similarity ξom, local relative water depth ht/Hs and dimensionless width of the structure crest Gc/Hs. The output is the dimensionless overtopping rate q/√(gHs³), as in most of the existing formulae.
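The reason for working with dimensionless parameters can be checked numerically. The sketch below assumes the common definitions Som = 2πHs/(gTm²) and ξom = tanθ/√Som (with the seabed slope, since the wall itself is vertical) and Froude scaling between model and prototype; these definitions and all function names are assumptions for illustration, not taken from the equations of this paper.

```python
import math

G = 9.81  # gravitational acceleration [m/s^2]

def dimensionless_inputs(rc, hs, tm, tan_theta, ht, gc):
    """Dimensionless inputs listed in the text (assumed standard definitions)."""
    s_om = 2.0 * math.pi * hs / (G * tm ** 2)   # deep water wave steepness
    xi_om = tan_theta / math.sqrt(s_om)         # surf similarity parameter
    return {
        "Rc/Hs": rc / hs,
        "tan_theta": tan_theta,
        "s_om": s_om,
        "xi_om": xi_om,
        "ht/Hs": ht / hs,
        "Gc/Hs": gc / hs,
    }

def dimensionless_discharge(q, hs):
    """q / sqrt(g * Hs^3), with q in m^3/s per metre width."""
    return q / math.sqrt(G * hs ** 3)

def froude_scale(test, length_scale):
    """Scale a model test to prototype: lengths * L, times * sqrt(L), q * L^1.5."""
    lam = length_scale
    return {
        "rc": test["rc"] * lam,
        "hs": test["hs"] * lam,
        "tm": test["tm"] * math.sqrt(lam),
        "tan_theta": test["tan_theta"],
        "ht": test["ht"] * lam,
        "gc": test["gc"] * lam,
        "q": test["q"] * lam ** 1.5,
    }

# Made-up 1:20 model test and its prototype equivalent.
model = {"rc": 0.15, "hs": 0.10, "tm": 1.5, "tan_theta": 0.02,
         "ht": 0.30, "gc": 0.05, "q": 1.0e-4}
proto = froude_scale(model, 20.0)

def all_inputs(test):
    return dimensionless_inputs(test["rc"], test["hs"], test["tm"],
                                test["tan_theta"], test["ht"], test["gc"])

model_inputs, proto_inputs = all_inputs(model), all_inputs(proto)
model_q = dimensionless_discharge(model["q"], model["hs"])
proto_q = dimensionless_discharge(proto["q"], proto["hs"])
```

The dimensional rate q differs by a factor of 20^1.5 between the two tests, while every dimensionless quantity is identical, which is why tests at different scales can be pooled only after this conversion.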
The models are trained and tested using small-scale laboratory data selected from the database CLASH. EurOtop (2007) and some of the previous empirical formulae for the estimation of wave overtopping rate have an exponential form. Hence, the exponential form was used for the present model. The M5′ model tree can only produce linear relationships between the input and output parameters. To remove this restriction, the model was developed using ln(q/√(gHs³)) as the output parameter.
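Why the logarithm removes the restriction can be seen from a generic exponential formula. The coefficients a and b and the single input Rc/Hs below are placeholders for illustration, not the fitted model of this paper.

```latex
\frac{q}{\sqrt{g H_s^{3}}} = a \exp\!\left(-b\,\frac{R_c}{H_s}\right)
\quad\Longrightarrow\quad
\ln\!\left(\frac{q}{\sqrt{g H_s^{3}}}\right) = \ln a - b\,\frac{R_c}{H_s}
```

The right-hand side is linear in Rc/Hs, so a tool that produces only linear relationships can still represent an exponential overtopping law once the output is log-transformed.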
Using the M5′ model tree alone, the best results are Eqs. (8) and (9). To overcome the limitation that the M5′ model can give only linear relationships, nonlinear regression modeling was carried out with the SPSS software; this method alone gives a single best-fit formula. Because the nonlinear fit yields only one model, which may not cover the entire range, the two methods were combined to obtain Eqs. (11) and (12). The M5′ model uses a special algorithm to divide the domain but presents only linear models; combining it with the nonlinear regression of SPSS eliminates this disadvantage. The method is illustrated in Fig. 1.
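The two-step idea, a split of the domain followed by a nonlinear fit in each sub-domain, can be sketched as follows. This is an illustration under stated assumptions and not the procedure run in SPSS: the threshold is taken as given (the value 1.31 for Rc/Hs is the one reported in this paper), the leaf form y = a + b·x^c is a hypothetical choice, and the nonlinear fit is done by scanning the exponent and solving for a and b by linear least squares.

```python
import numpy as np

def fit_power_leaf(x, y, exponents):
    """Fit y ~ a + b * x**c by scanning c and solving (a, b) linearly."""
    best = None
    for c in exponents:
        design = np.column_stack([np.ones_like(x), x ** c])
        (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
        err = float(np.sum((a + b * x ** c - y) ** 2))
        if best is None or err < best[0]:
            best = (err, float(a), float(b), float(c))
    return best[1:]  # (a, b, c)

def fit_two_regimes(x, y, threshold, exponents=np.linspace(0.5, 2.0, 16)):
    """Fit one nonlinear leaf on each side of a given threshold."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    low = x <= threshold
    return (threshold,
            fit_power_leaf(x[low], y[low], exponents),
            fit_power_leaf(x[~low], y[~low], exponents))

def predict_two_regimes(model, x):
    """Evaluate the leaf that each (positive) input falls into."""
    threshold, (a1, b1, c1), (a2, b2, c2) = model
    x = np.asarray(x, dtype=float)
    return np.where(x <= threshold, a1 + b1 * x ** c1, a2 + b2 * x ** c2)

# Synthetic data: x plays the role of Rc/Hs, y of the log-transformed output.
x = np.linspace(0.1, 3.0, 59)
y = np.where(x <= 1.31, -1.0 - 2.0 * x ** 1.5, -2.0 - 3.0 * x ** 0.8)
model = fit_two_regimes(x, y, threshold=1.31)
```

A single nonlinear curve fitted over the whole range would have to compromise between the two regimes, whereas the split lets each sub-domain have its own coefficients.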
The data range used to build the model is given in Table 2.

Results
Although the slope angle is embedded in the surf similarity parameter (Eq. (7)), the wave runup depends separately on the surf similarity parameter and the slope angle because of the interaction between individual runup bores. There is often significant interaction between successive waves: when a wave reaches the shoreline and travels up a beach face, it cannot always complete a full swash cycle before the next wave arrives. The second wave either overtakes the first wave during its uprush stage (catch-up) or collides with it during the backwash stage. This interaction continues with each incoming wave, so the maximum runup does not necessarily correspond to the uprush of the highest wave in the train. This is particularly true for mild foreshore slopes, where a swash lens takes longer to travel up and down than on steeper slopes. Thus, the slope angle has a significant effect in predicting the maximum runup height if the swash interaction is accounted for (Bakhtyar et al., 2008).
According to Eqs. (11) and (12), a reduction in the slope angle at the toe of the structure results in an increase in the wave overtopping rate. When the angle decreases, the water depth decreases more gradually, and the waves are therefore less affected by the roughness of the bed.
On the other hand, when the angle is reduced and the water depth is smaller than the breaking depth, the wave breaks farther from the crest. Thus, only a small volume of water passes over the wall and a small overtopping rate is recorded.
Similarly, according to Eqs. (11) and (12), the overtopping rate decreases as the water depth at the structure increases. With increasing depth, the wave is less affected by the bed and by shoaling, so its height does not increase. Moreover, the probability of breaking in the vicinity of the structure decreases sharply, so water droplets produced by breaking are not dispersed in the air and do not contribute to the overtopping rate. Instead, standing waves form at the toe of the structure, which decreases the runup and the overtopping rate.
Som is also a function of the significant wave height and the incident wave period. As the wave height increases, the waves break at larger depths; the distance between the breaking point and the structure therefore increases, and overtopping becomes less likely. Thus, increasing the wave height increases the wave steepness, which reduces the overtopping.
In the EurOtop (2007) formula, ξom is the surf similarity parameter (Iribarren number), defined as the slope divided by the square root of the wave steepness. Thus, ξom decreases as the wave steepness (Som) increases. According to the relationships presented in CEM (2006), the wave runup is directly related to the surf similarity parameter, which means that a reduction of ξom reduces the runup. With lower runup levels, fewer waves reach the crest level and overflow. In summary, increasing the wave steepness decreases ξom, and a reduction of ξom reduces the overtopping.
Eqs. (11) and (12) were obtained from the average of the input and output data. In other words, the measured values exceed the predicted values in about 50% of cases. It is sometimes necessary to predict overtopping rates for a lower exceedance level, and the coefficients of the above equations can be modified for such cases by using a confidence interval. A confidence interval of 90% means that 10% of the measured values fall outside this range. If the line of best fit is defined as y=ax, the lines bounding the confidence interval are defined in terms of a, σ and n, in which a is the fit factor, σ is the standard deviation of the data, and n depends on the distribution type of the data and the confidence interval. For example, assuming a normal distribution of the data and a confidence interval of 90%, n is equal to 1.65. If the distribution of the data around the line of best fit is assumed to be approximately symmetrical, then for a confidence interval of 90% the probability that the measured values exceed the upper bound is 5%.
Considering the above and the standard deviation of the data, Eqs. (11) and (12) are rewritten for design conditions as Eqs. (15) and (16). Values of n for the normal distribution and for various confidence intervals and exceedance levels are given in Table 3. Substituting the appropriate value of n in Eqs. (15) and (16) gives the desired prediction equation for the chosen exceedance level. These equations can be used as design relationships for vertical structures.
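The relation between n, the confidence interval and the exceedance level can be reproduced from the standard normal distribution. The sketch below assumes normally distributed scatter, as the text does; the function names are hypothetical, and the way n enters Eqs. (15) and (16) is not shown because those equations are not reproduced here.

```python
from statistics import NormalDist

def n_for_exceedance(p_exceed):
    """Multiplier n such that a standard normal value exceeds n with probability p_exceed."""
    return NormalDist().inv_cdf(1.0 - p_exceed)

def n_for_confidence(confidence):
    """Multiplier for a symmetric two-sided interval; half the remainder lies above it."""
    return n_for_exceedance((1.0 - confidence) / 2.0)

n_90 = n_for_confidence(0.90)   # about 1.645, quoted as 1.65 in the text
n_33 = n_for_exceedance(0.33)   # about 0.44
n_50 = n_for_exceedance(0.50)   # 0: the best-fit line itself
```

A 90% confidence interval and a 5% exceedance level therefore share the same n, while n = 0 recovers the mean prediction with a 50% exceedance level.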
For a better understanding, consider Fig. 2, in which the data selected from the CLASH database (for the range Rc/Hs≤1.31) are indicated by blue dots. The values predicted using Eq. (11), shown by the continuous black line, approximately follow the average of the data. However, if the overtopping rate is required at an exceedance level of 33% for design purposes, substituting n=0.44 (from Table 3 for an exceedance level of 33%) in Eq. (15) gives the values shown by the dashed black line. As seen in Fig. 2, about 33% of the measured data are larger than the values predicted by the dashed black line.

Discussion
The comparison between the measured dimensionless overtopping discharges and those predicted by the new formulas (15) and (16) is shown in Fig. 3. The figure indicates that the new formulas yield accurate predictions of the overtopping. The new model is more accurate especially for high overtopping discharges, which are more dangerous and may cause damage.
KAMALIAN Ulrich Reza China Ocean Eng., 2017, Vol. 31, No. 1, P. 48-54
To assess the accuracy of the models, statistical indicators such as the geometric mean (xG), the geometric standard deviation (σxG) (Goda, 2009), the scatter index (SI), BIAS, the correlation coefficient (R²), the root mean square error (RMSE) and the discrepancy ratio (DR) are used. These indicators are computed from the estimated and measured dimensionless overtopping rates, with n denoting the number of measurements. Table 4 shows the calculated errors for the various models.
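Several of these indicators can be sketched in a few lines. The exact definitions used for Table 4 are not reproduced in the text, so the forms below are commonly used ones and should be read as assumptions: the geometric statistics are taken over the ratio of estimated to measured values, BIAS is the mean error, and SI is the RMSE divided by the mean measured value. R² and DR are left out because their definitions vary between studies.

```python
import math

def error_metrics(estimated, measured):
    """Common error indicators for positive estimated/measured pairs."""
    n = len(measured)
    log_ratios = [math.log(e / m) for e, m in zip(estimated, measured)]
    mean_log = sum(log_ratios) / n
    var_log = sum(v * v for v in log_ratios) / n - mean_log ** 2
    errors = [e - m for e, m in zip(estimated, measured)]
    rmse = math.sqrt(sum(err * err for err in errors) / n)
    return {
        "x_G": math.exp(mean_log),                          # geometric mean of ratios
        "sigma_xG": math.exp(math.sqrt(max(var_log, 0.0))),  # geometric std. deviation
        "BIAS": sum(errors) / n,
        "RMSE": rmse,
        "SI": rmse / (sum(measured) / n),
    }

# Made-up dimensionless overtopping rates.
measured = [1.0e-3, 2.0e-3, 4.0e-3]
perfect = error_metrics(measured, measured)
doubled = error_metrics([2.0 * m for m in measured], measured)
```

A perfect model gives xG = 1 and σxG = 1; a model that always predicts twice the measured value gives xG = 2 with σxG still equal to 1, which separates systematic bias from scatter.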
As seen in Table 4, all the error parameters indicate that the new model improves the prediction of the overtopping rate. Fig. 4 shows the variation of DR with the relative crest freeboard (Rc/Hs) for the different models. The discrepancy ratio of the new model is clearly less sensitive to changes in Rc/Hs than that of the other models, and the new model is therefore more reliable.
To assess the applicability of the new model to real cases, the model is evaluated with field data from a vertical wall breakwater at Samphire Hoe, UK (Pullen et al., 2009). The distributions of the measured overtopping rates and of those predicted by the different models for this case are shown in Fig. 5. Compared with the other formulas, the new model gives more accurate predictions. Franco's formula underestimates the overtopping rate, whereas the formulas of EurOtop and Goda overestimate it, especially for low overtopping rates, which results in uneconomical designs.

Conclusions
A new relationship for the prediction of the overtopping rate at vertical walls was introduced using a combination of multiple nonlinear regression and the M5′ model tree. For this purpose, the relevant data from the CLASH database for caisson-type structures and other vertical walls were investigated. Data for breakwaters with a smooth surface were used, and the effective parameters were described. Among several relationships created from different combinations of the effective parameters, the most accurate and simplest one was presented. It shows a pronounced nonlinear effect of the relative water depth on wave overtopping of vertical structures for two classes of relative freeboard, with 1.31 being the threshold value between them. This was interpreted in terms of wave breaking caused by the interaction of reflected and incident waves and the energy loss caused by the overtopped water. Comparison with experimental data shows that the proposed model is more accurate than other existing models. The performance of the models against field measurements was evaluated using statistical parameters, which show that the accuracy of the predictions has been improved. The coefficients of the model were also modified for design at exceedance levels lower than 50%. Because the model gives more accurate results than other existing models and is simpler and more convenient to use than other soft computing tools such as neural networks, these relationships can be used effectively in practice.