This thesis provides novel solutions for regression modeling in two themes, water and energy, that are critical to sustainable development in the United Arab Emirates (UAE). Ensemble-based nonlinear regression modeling is applied to improve performance and generalization. A new framework integrating the concepts of diversity, bias-variance decomposition, and robust fusion is proposed. A novel two-stage resampling technique that facilitates homogeneity in sub-samples is introduced. It allows the degree of similarity between sub-samples to be controlled, which in turn controls the diversity among the individual learners in the ensemble and yields models with improved performance and generalization ability. The methodology is applied to two case studies and compared with earlier results in the literature. The first case study is regional flood frequency analysis, in which models are developed to estimate flood quantiles at ungauged sites. Canonical correlation analysis (CCA) is used to form a canonical physiographic space from the characteristics of gauged sites. Artificial neural network (ANN) based ensemble models are applied to identify the functional relationship between the physiographic variables in the CCA space and the flood quantiles. A jackknife validation procedure is used to evaluate the performance of the proposed models. The second case study is solar resource mapping. Ensemble models are applied to identify the functional relationship between data acquired from geostationary satellite images and the solar irradiance components. Treating cloud-free and cloudy days as separate cases, different ensemble models are validated and tested to estimate diffuse horizontal irradiance (DHI) and direct normal irradiance (DNI). The global horizontal irradiance (GHI) is then computed from the models' outputs.
For demonstration, models are trained using ground-truth data from three ground measurement stations for the year 2010 and tested against data from two other independent stations for the year 2009. Three-fold cross-validation is used to select the best ensemble models, which are then tested against the two independent stations to assess their performance and generalization ability. Generalized ensemble combination is employed to further improve the estimates. In both case studies, the generalized ensemble models show improved performance and generalization ability compared with the results reported in previous work.
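
The abstract states that GHI is computed from the DHI and DNI estimates but does not give the formula. A minimal sketch of that step, assuming the standard closure relation GHI = DHI + DNI · cos(θz), where θz is the solar zenith angle, is shown below. The function name, the clamping of the direct term when the sun is below the horizon, and the input values are illustrative assumptions and are not taken from the thesis.

```python
import math


def global_horizontal_irradiance(dhi, dni, zenith_deg):
    """Combine diffuse and direct components into GHI (all in W/m^2).

    Assumes the standard closure relation GHI = DHI + DNI * cos(zenith).
    """
    cos_zenith = math.cos(math.radians(zenith_deg))
    # With the sun at or below the horizon, the direct beam contributes nothing.
    direct_on_horizontal = dni * max(cos_zenith, 0.0)
    return dhi + direct_on_horizontal


# Hypothetical model outputs for one time step (illustrative values only).
dhi_estimate = 120.0  # W/m^2
dni_estimate = 800.0  # W/m^2
ghi = global_horizontal_irradiance(dhi_estimate, dni_estimate, zenith_deg=60.0)
print(round(ghi, 1))  # prints 520.0
```

Because GHI is derived from the two estimated components, errors in the DHI and DNI models carry through to it, with the DNI error scaled by the cosine of the zenith angle.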
| Date of Award | Jun 2013 |
|---|---|
| Original language | American English |
| Supervisor | Taha B. M. J. Ouarda (Supervisor) |
- Storm sewers; Urban runoff-Management; Ensemble; Sustainable Development; United Arab Emirates; Nonlinear Regression Modeling.
Ensemble-Based Nonlinear Regression Modeling Using a Novel Resampling & Model Integration Framework
Alobaidi, M. H. (Author). Jun 2013
Student thesis: Master's Thesis