Digital Forensic Analysis for Digital Files by Using Deep Learning

  • Mohammed A. Alnuaimi

Student thesis: Master's Thesis

Abstract

This thesis examines existing approaches to identifying file types for digital forensic purposes, as law enforcement seeks to understand the evidence held on a given digital device. Criminals tend to conceal any information that could lead back to them, including digital files. The thesis also highlights a possible direction for analyzing digital evidence that may have been destroyed, by reviewing digital forensic methods investigated by earlier researchers and offering a comparison between machine learning and deep learning techniques. Most current approaches were found to have challenges that affect how file types are determined, which has prompted the search for new systems to address this gap. The thesis first reviews attempts in the literature to introduce artificial intelligence into digital forensics. It then proposes a methodology that utilizes deep learning, since most traditional forensic tools rely on a file signature database to identify the file type and therefore fail when a file has been altered. The proposed methodology uses a dataset of different file types to train the model. The model then determines the file type from the file's hexadecimal values, as each file type has a characteristic signature, and it is able to detect different file types even if their hexadecimal values have been changed. To demonstrate the model's functionality, three testing scenarios were introduced. The experiments show that 99% accuracy was achieved in the first scenario, followed by 99.1% and 99.5% in the second and third scenarios, respectively.
As future work, the model could be made portable so that useful information can be found on site. One option is a mobile application to which copies of all files to be identified are uploaded at the crime scene. A second option is a USB device preloaded with the model, which could extract a detailed report of all file types on a digital device and clearly identify any damaged files. In addition, more file types could be added to the training process so that the model identifies a wider range of files.
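
The limitation of signature-based identification described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the small signature table, the function names `identify_by_signature`, `to_hex`, and `byte_features`, and the fixed feature length are all assumptions made for illustration. The sketch shows that a lookup on leading bytes stops working once a single header byte is changed, while the rest of the byte content remains available as input that a learned classifier could use.

```python
from typing import List, Optional

# A few well-known file signatures (magic numbers).
# Real forensic tools use far larger databases; this table is illustrative only.
SIGNATURES = {
    b"%PDF": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"PK\x03\x04": "zip",
}


def identify_by_signature(data: bytes) -> Optional[str]:
    """Return the file type whose signature matches the leading bytes, if any."""
    for magic, file_type in SIGNATURES.items():
        if data.startswith(magic):
            return file_type
    return None


def to_hex(data: bytes, limit: int = 8) -> str:
    """Show the first `limit` bytes as space-separated hexadecimal values."""
    return data[:limit].hex(" ")


def byte_features(data: bytes, length: int = 16) -> List[int]:
    """Hypothetical model input: the first `length` byte values, zero-padded."""
    chunk = data[:length]
    return list(chunk) + [0] * (length - len(chunk))


# An intact PDF header and a copy with its first byte overwritten.
original = b"%PDF-1.7\nexample body"
tampered = b"\x00" + original[1:]

print(to_hex(original), "->", identify_by_signature(original))
print(to_hex(tampered), "->", identify_by_signature(tampered))
print(byte_features(tampered))
```

In this sketch the tampered file no longer matches any signature, yet all but one of its byte values are unchanged, which is the kind of residual pattern a model trained on hexadecimal values is intended to exploit.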
Date of Award: Dec 2020
Original language: American English

Keywords

  • Digital Forensic
  • File Classification
  • Hexadecimal Value
  • Deep Learning
