Hand Gesture Recognition Under Dynamic Illumination Conditions

  • Buti Al Delail

Student thesis: Doctoral Thesis

Abstract

This thesis develops methods for hand gesture recognition under dynamic illumination conditions, enabling a user to interact with a Smart TV via hand gestures. Vision-based hand gesture recognition systems are employed as human-computer interfaces because they increase user comfort and provide a more intuitive interaction. In this thesis, we propose to solve the problem of varying illumination conditions in the different stages of a recognition pipeline. A typical hand gesture recognition system is composed of three subcomponents: 1) detection, to automatically identify the hand subregion within an image frame; 2) tracking, to determine the new location of the hand in consecutive frames; and 3) recognition, to recognize various hand poses, movements, and gestures as specific actions or commands. The problem of varying illumination is investigated within each of these three components. We address the problem using an ordinary video camera on a Smart TV, where the captured sequences contain illumination variations that degrade the performance of the recognition system. The challenges the system can address include low-light conditions, diverse hand colors, and disappearing targets, all of which are caused by varying illumination conditions. Advances in this area could make many applications of human recognition systems practical, inexpensive, and widely used.

With regard to detection, this thesis experiments with deep learning methods to achieve hand detection that is robust to illumination changes. We propose a novel approach for hand detection using multi-feature convolutional neural networks (CNNs). The algorithm incorporates CNNs that use color, shape, texture, and motion features to detect the hand.
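The multi-feature idea can be illustrated by stacking per-pixel cues into the channels of a CNN input tensor. The sketch below is a simplified stand-in, not the thesis's exact descriptors: gradient magnitude approximates shape/texture and a frame difference approximates motion.

```python
import numpy as np

def build_feature_stack(frame, prev_frame):
    """Stack per-pixel feature maps into a multi-channel CNN input.

    Minimal sketch (cues are illustrative simplifications): color is
    the raw RGB, shape/texture is approximated by gradient magnitude,
    and motion by an inter-frame difference of the luminance.
    """
    gray = frame.mean(axis=2)                         # luminance proxy
    gy, gx = np.gradient(gray)                        # shape/texture cue
    grad_mag = np.sqrt(gx ** 2 + gy ** 2)
    motion = np.abs(gray - prev_frame.mean(axis=2))   # motion cue
    # Channels: R, G, B, gradient magnitude, motion -> (H, W, 5)
    return np.dstack([frame, grad_mag[..., None], motion[..., None]])
```

The resulting (H, W, 5) tensor would then be fed to the convolutional layers in place of a plain RGB image.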
Furthermore, to address changes in the illumination conditions, the network is trained on scenes with varying illumination so that it learns features from hand images that are insensitive to movement, scaling, rotation, and illumination variations. The proposed features are easily and quickly learned using 3-layer convolutional networks; thus, they require less computational power and are faster than a deep network model. Moreover, in our experiments, we integrated our model with different bounding box detectors, and we also report results from fully training other deep learning models to detect hand gestures under varying illumination conditions.

With regard to tracking, we solve the problem of illumination-invariant tracking using likelihood estimation within a particle filter. Existing particle filter-based tracking frameworks mainly address changes in illumination through different choices of color space or features, which remain affected by illumination changes. We first introduce a simpler model based on homomorphic filtering, and then a more complex model based on the luminosity and direction of the light. The first is an alternative likelihood estimation algorithm that addresses illumination changes using a homomorphic filtering-based weighted illumination model: a homomorphic filter separates the illumination and reflectance components of the image, and by assigning appropriate weights to the illumination, the target image is reconstructed so that the likelihood can be measured accurately. We then propose a novel particle filter-based approach for robust object tracking under dynamic illumination conditions, i.e., to address illumination changes caused by variations in the ambient lighting of the scene.
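The homomorphic separation and weighted reconstruction can be sketched as follows. This is a minimal illustration under stated simplifications: in log space the image factors into illumination plus reflectance, and a crude box low-pass stands in for the smooth illumination estimate; the weight `w` is a hypothetical parameter, not the thesis's learned weighting.

```python
import numpy as np

def homomorphic_split(image, sigma=3):
    """Split an image into illumination and reflectance (log domain).

    Homomorphic filtering sketch: log(1 + I) = illumination + reflectance;
    a box low-pass (a simplification of the usual Gaussian) estimates the
    slowly varying illumination, and the residual is the reflectance.
    """
    log_img = np.log1p(image.astype(float))
    k = 2 * sigma + 1
    pad = np.pad(log_img, sigma, mode="edge")
    illum = np.zeros_like(log_img)
    for dy in range(k):                 # crude box low-pass filter
        for dx in range(k):
            illum += pad[dy:dy + log_img.shape[0], dx:dx + log_img.shape[1]]
    illum /= k * k
    refl = log_img - illum              # high-frequency reflectance
    return illum, refl

def reweighted_reconstruction(illum, refl, w=0.5):
    """Rebuild the image with down-weighted illumination before
    measuring the likelihood, as in a weighted illumination model."""
    return np.expm1(w * illum + refl)
```

With `w < 1` the reconstructed target is less sensitive to the ambient brightness, which is what allows the likelihood comparison against the template to stay stable under lighting changes.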
Our tracking model accommodates illumination variations by predicting changes in the illumination intensity and the direction of the light source. An appropriate update strategy for the template dictionary is used alongside a sparse representation model to solve the problem of drifting due to appearance changes during tracking. The proposed algorithm is evaluated within various particle filter-based tracking algorithms on scenarios from public datasets as well as on our gesture illumination variation dataset. Finally, with regard to recognition, a convolutional long short-term memory (LSTM) network is used to classify features extracted from video sequences of hand gestures using our previous models. We thus conclude with methods for hand gesture recognition that improve the robustness of gesture recognition systems to different illumination conditions.
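A particle filter that predicts illumination changes can be sketched by augmenting each particle's state with a brightness gain. The state layout `[x, y, gain]`, the Gaussian-like likelihood, and all parameter values below are illustrative assumptions, not the thesis's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, observe_patch, template):
    """One predict-update step of a particle filter whose state includes
    an illumination gain, so the likelihood stays meaningful when the
    ambient brightness changes.

    particles: (N, 3) array of [x, y, gain] hypotheses.
    observe_patch(x, y): returns the image patch at that location.
    """
    n = len(particles)
    # Predict: diffuse position and illumination gain
    particles = particles + rng.normal(0, [2.0, 2.0, 0.05], size=(n, 3))
    # Update: compare the gain-corrected patch to the template
    for i, (x, y, g) in enumerate(particles):
        patch = observe_patch(x, y) / max(g, 1e-6)   # undo predicted gain
        err = np.mean((patch - template) ** 2)
        weights[i] *= np.exp(-err / 0.1)             # Gaussian-like likelihood
    weights /= weights.sum()
    # Posterior estimate is the weighted mean state
    return particles, weights, weights @ particles
```

Because the gain is part of the state, particles whose predicted illumination matches the scene receive high weight, so a brightening or dimming target is tracked instead of being treated as a lost appearance (resampling and the template dictionary update are omitted here for brevity).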
Date of Award: Jul 2018
Original language: American English
Supervisor: Mohamed Zemerly

Keywords

  • Computer vision
  • Signal processing
  • Gesture recognition
  • Object detection
