Image-based spam is one of the latest techniques spammers use to avoid detection by text-based spam filters. It works by embedding the message text in an image. Optical Character Recognition (OCR) has been used in many image-based spam filters, and it can be integrated with text-based spam filters to create a filter that classifies both text and images. However, obscuring techniques pose a challenge to these types of filters. In our experiments, we show the effect of image obscuring on OCR-based filters. We tested three text-based filters on over five thousand ENRON text emails and also tested the same text embedded in images. We used Asprise OCR to extract the text and fed the output to the spam filters. The comparison showed the effectiveness of using OCR for image-based spam filtering.
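The comparison described in the abstract (the same filter scored on the original text and on OCR-extracted text) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration only: `keyword_filter` is a toy stand-in for a real text-based spam filter, `simulate_ocr_noise` is a hypothetical substitute for running an OCR engine on an obscured image, and the four sample messages are invented. None of this is the thesis's actual implementation or data.

```python
from typing import Callable, Dict, Iterable, Tuple


def evaluate_filter(
    samples: Iterable[Tuple[str, bool]],
    classify: Callable[[str], bool],
) -> Dict[str, float]:
    """Score a spam classifier on (text, is_spam) pairs."""
    tp = fp = tn = fn = 0
    for text, is_spam in samples:
        predicted_spam = classify(text)
        if predicted_spam and is_spam:
            tp += 1  # spam correctly flagged
        elif predicted_spam and not is_spam:
            fp += 1  # legitimate mail wrongly flagged
        elif not predicted_spam and is_spam:
            fn += 1  # spam missed
        else:
            tn += 1  # legitimate mail correctly passed
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "true_positive_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


def keyword_filter(text: str) -> bool:
    """Toy stand-in for a text-based spam filter (assumption)."""
    return "free money" in text.lower()


def simulate_ocr_noise(text: str) -> str:
    """Hypothetical stand-in for OCR errors on an obscured image."""
    return text.replace("o", "0")


# Invented sample messages: (text, is_spam)
samples = [
    ("Free money now", True),
    ("Meeting at noon", False),
    ("Claim your FREE MONEY", True),
    ("free money report attached", False),
]

# Same filter, scored on the clean text and on the simulated OCR output.
clean_scores = evaluate_filter(samples, keyword_filter)
ocr_scores = evaluate_filter(
    [(simulate_ocr_noise(text), label) for text, label in samples],
    keyword_filter,
)
print("text:", clean_scores)
print("ocr: ", ocr_scores)
```

In a real experiment, `simulate_ocr_noise` would be replaced by the text returned from the OCR engine, and `keyword_filter` by each filter under test; the scoring function stays the same, which keeps the text and image results directly comparable.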
| Date of Award | May 2019 |
|---|---|
| Original language | American English |
- OCR
- spam
- image-based spam
- spam filter
- accuracy
- true positive
- false positive
Protection Against Illicit Email Tracking
Al Zahmi, K. (Author). May 2019
Student thesis: Master's Thesis