Droid-MCFG: Android malware detection system using manifest and control flow traces with multi-head temporal convolutional network

F Ullah, S Ullah, G Srivastava, JCW Lin - Physical Communication, 2023 - Elsevier
Abstract
Android is the most popular mobile operating system, making it the main target of malware attacks. Machine learning-based attack detection techniques have recently emerged as promising methods that rely heavily on particular features to classify malware. Although machine learning-based malware detectors use hundreds of features, attackers can exploit knowledge of those features to generate malware variants that evade detection. Therefore, the Android security team must constantly develop novel features to detect suspicious attacks. This paper proposes a novel malware detection method called Droid-MCFG that combines two kinds of Android features: the manifest and the Control Flow Graph (CFG). First, reverse engineering tools are used to extract manifest files and Java source code from the Android Package Kit (APK). Second, to represent Android apps with high-level features, we develop a feature selection method that retrieves API calls and API sequences from CFGs. The API calls and manifest information are then combined to produce digital fingerprints of Android app actions. Third, a transfer learning approach based on word2vec is developed to extract trained features from the digital fingerprints. To thoroughly analyze the novel features, word2vec is fine-tuned with random, static, and dynamic strategies. Finally, a multi-head Temporal Convolutional Network (TCN) is designed to identify malware based on the fine-tuned features. The TCN employs causal convolutions and dilations, which preserve temporal order and provide a broad receptive field, making it highly responsive to API-call sequences and malware activities in the manifest file. The proposed method achieves a classification accuracy of 96.24% on the CICInvesAndMal2019 dataset.
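The abstract names causal convolutions and dilations but gives no architectural details. The following is a minimal sketch of those two ideas only, not the authors' network: the function names (`causal_dilated_conv1d`, `receptive_field`), the two-tap kernel, and the toy input sequence are all assumptions chosen for illustration. It shows that each output depends only on the current and earlier positions, and how stacking dilated layers widens the receptive field.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Causal 1-D convolution: y[t] = sum_i weights[i] * x[t - i * dilation].

    Positions before the start of the sequence are treated as zero
    (left padding), so y[t] never depends on x[t + 1] or later.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            j = t - i * dilation
            if j >= 0:  # skip taps that fall before the sequence start
                acc += w * x[j]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Time steps (including the current one) visible to a stack of
    causal layers with the given kernel size and per-layer dilations."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Toy stand-in for an embedded API-call sequence (hypothetical values).
seq = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
kernel = [1.0, 1.0]

y1 = causal_dilated_conv1d(seq, kernel, dilation=1)
y2 = causal_dilated_conv1d(seq, kernel, dilation=2)
print(y1)  # [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
print(y2)  # [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(receptive_field(2, [1, 2, 4]))  # 8
```

With dilations that double at each layer, the receptive field grows exponentially with depth while the number of weights grows only linearly, which is the usual motivation for using dilations on long sequences such as API-call traces.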