DOI: https://doi.org/10.18517/ijods.4.1.10-25.2023

Adaptive Android APKs Reverse Engineering for Features Processing in Machine Learning Malware Detection

Benjamin Aruwa Gyunka (1), Aro Taye Oladele (2), Ojeniyi Adegoke (3)
(1) Department of Branch Operations, Central Bank of Nigeria, Kano, Nigeria
(2) Department of Mathematical and Computing Sciences, KolaDaisi University, Ibadan, Nigeria
(3) Department of Computer Science, Maldives National University, Male, Maldives

Abstract

Machine learning detection of Android malware depends on having the right triggers and pointers, known as features or attributes, which are found in Android application packages (APKs). These features are fundamental to training machine learning algorithms to produce the required detection model. The process of extracting them from the application packages is known as reverse engineering. This paper details the experimental process of applying a reverse engineering procedure, using Sublime Text 2 and the Androguard plugin, to Android application packages in order to extract permissions, which are the targeted features. The study further discusses the cleaning stages, carried out with Notepad++, Microsoft Excel, and Microsoft Word, which remove the noisy features and retain the relevant ones. A total of 1500 Android apps, downloaded from both benign and malicious sources, were used for the experiment. The reverse engineering process yielded 162 cleaned features, which were then used to form a binary feature matrix of size 1500 by 163 (including the class feature).
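The binary feature matrix described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' tooling (the paper uses Sublime Text 2 with the Androguard plugin plus manual cleaning); it assumes each manifest has already been decoded to plain XML text, and the sample manifests, the function names, and the 1 = malicious / 0 = benign label encoding are hypothetical choices made only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by android:* attributes in AndroidManifest.xml.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Hypothetical, already-decoded manifests (real APKs store binary XML).
BENIGN_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.notes">
  <uses-permission android:name="android.permission.INTERNET"/>
</manifest>"""

MALICIOUS_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.sms">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <uses-permission android:name="android.permission.READ_CONTACTS"/>
</manifest>"""


def permissions_from_manifest(manifest_xml):
    """Return the set of permission names requested in a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    key = ANDROID_NS + "name"
    return {
        elem.attrib[key]
        for elem in root.iter("uses-permission")
        if key in elem.attrib
    }


def build_binary_matrix(apps):
    """Build a binary feature matrix from (permission_set, label) pairs.

    Each row has one 0/1 column per distinct permission, followed by the
    class label, so n features give n + 1 columns.
    """
    features = sorted(set().union(*(perms for perms, _ in apps)))
    header = features + ["class"]
    rows = [
        [1 if feature in perms else 0 for feature in features] + [label]
        for perms, label in apps
    ]
    return header, rows


# Assumed label encoding: 0 = benign, 1 = malicious.
apps = [
    (permissions_from_manifest(BENIGN_MANIFEST), 0),
    (permissions_from_manifest(MALICIOUS_MANIFEST), 1),
]
header, rows = build_binary_matrix(apps)
```

With the study's 1500 apps and 162 cleaned permissions, the same construction would give 1500 rows and 163 columns, the last column being the class feature.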

Article Details

How to Cite
[1]
B. A. Gyunka, A. T. Oladele, and O. Adegoke, “Adaptive Android APKs Reverse Engineering for Features Processing in Machine Learning Malware Detection”, Int. J. Data. Science., vol. 4, no. 1, pp. 10-25, May 2023.
