DitDetector: Bimodal Learning based on Deceptive Image and Text for Macro Malware Detection
Macro malware remains a severe threat to cyber security, even though the Microsoft Office suite disables macros by default.
Among the defense solutions at different stages of the attack chain, document analysis is more targeted because it detects the malicious documents that carry macro malware.
Document analysis is effective, especially with machine learning methods, but it still struggles to handle malware variants, to support multiple file formats, and to counter advanced attack techniques (e.g., Excel 4.0 macros and remote template injection).
In this paper, we find it promising to detect the deceptive information that documents embed to trick users into enabling macros, rather than relying on file metadata or extracted macro code. We therefore propose a novel solution named DitDetector, which leverages bimodal learning on deceptive images and text for macro malware detection. Specifically, we extract preview images of documents with an image export SDK from Oracle and extract textual information from the preview images with an open-source OCR engine. The bimodal model of DitDetector contains a visual encoder, a textual encoder, and a forward neural network, which learns from the joint representation of the two encoders' outputs. We evaluate DitDetector on three datasets: an open-source malicious document dataset (i.e., MalDoc) and two collected real-world adversary datasets (i.e., a database of Excel macros and a database of remote template injection samples). Our experiments show that DitDetector outperforms four existing macro code-based machine learning methods and five reputable anti-virus engines. In particular, in the real-world test on advanced macro malware, DitDetector achieves an F1 score of 99.93%, which is at least 3.16% higher than the compared solutions.
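The bimodal design described in the abstract (two encoders whose outputs are joined and passed to a forward network) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the joint representation is formed, so plain concatenation is assumed here, and the class name `BimodalClassifier`, the layer sizes, the random untrained weights, and the toy embedding vectors are all hypothetical. The encoders themselves are left out; the inputs stand in for their outputs.

```python
import math
import random


def dense(inputs, weights, biases):
    """One fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def random_layer(rng, n_in, n_out):
    """Randomly initialised (untrained) weights, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class BimodalClassifier:
    """Hypothetical stand-in that fuses an image and a text embedding."""

    def __init__(self, image_dim, text_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.image_dim = image_dim
        self.text_dim = text_dim
        self.hidden = random_layer(rng, image_dim + text_dim, hidden_dim)
        self.output = random_layer(rng, hidden_dim, 1)

    def joint_representation(self, image_vec, text_vec):
        # Assumption: the joint representation is a simple concatenation.
        if len(image_vec) != self.image_dim or len(text_vec) != self.text_dim:
            raise ValueError("embedding size mismatch")
        return list(image_vec) + list(text_vec)

    def predict_proba(self, image_vec, text_vec):
        joint = self.joint_representation(image_vec, text_vec)
        hidden = relu(dense(joint, *self.hidden))
        logit = dense(hidden, *self.output)[0]
        return sigmoid(logit)  # score in (0, 1); higher means "malicious"


# Toy vectors standing in for the visual and textual encoder outputs.
model = BimodalClassifier(image_dim=4, text_dim=3, hidden_dim=5)
score = model.predict_proba([0.2, 0.1, 0.9, 0.4], [1.0, 0.0, 0.5])
```

In a real system the two input vectors would come from trained encoders applied to the preview image and the OCR text, and the weights would be learned from labelled documents; the sketch only shows how the two modalities meet in a single classifier.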