The amount of data generated and collected is growing exponentially, and it arrives in many types and formats (text, images, video, audio, and so on). Researchers are trying to make better use of these data to enhance decision support and decision-making through machine learning techniques. Machine learning methods have proven effective in analyzing such large, heterogeneous data to find patterns, detect trends, gain insight, and predict outcomes from historical data. However, building models from scratch for each real-world application is costly in terms of time and data consumption. Model adaptation from one application domain to another is an efficient way to address this problem, because it reuses the knowledge embedded in an existing model to train another model. Model adaptation is nevertheless challenging because of dataset bias or domain shift. In addition, access to data from both the original (source) domain and the destination (target) domain is often limited in the real world by data privacy and cost (gathering additional data may cost money). Several domain adaptation algorithms and methodologies have been introduced in recent years; they reuse models trained on one source domain for a different but related target domain. According to the state of the art, most existing domain adaptation approaches modify the structure of the trained model or adjust the latent space of the target domain using data from the source domain. Domain adaptation approaches can be evaluated on several criteria: accuracy, knowledge transfer, training time, and budget. In some real-world scenarios, however, the owner of the trained model restricts access to the model structure and the source dataset.
To solve this problem, we propose a methodology to efficiently select data from the target domain (minimizing consumption of the target domain data) to adapt the existing model more economically while achieving acceptable accuracy. Our approach is designed for supervised and semi-supervised learning and is extendable to unsupervised learning.
| Date of Award | Apr 2023 |
|---|---|
| Original language | American English |
| Supervisor | Ernesto Damiani (Supervisor) |
- Domain Adaptation
- Supervised Learning
- Distribution Independence
- Data Selection
Grey Box Model Reuse via Domain Adaptation
Alshehhi, M. (Author). Apr 2023
Student thesis: Doctoral Thesis