Dataset asset | Open Source Community | Android Malware | Information Security

Android Malware Datasets

Contains multiple popular Android malware datasets for research and analysis of malware on the Android platform.

Source: github
Created: Oct 31, 2019
Updated: Oct 31, 2019

Dataset Overview

1. Android Malware Genome Project

  • Description: This project collected over 1,200 Android malware samples, covering most Android malware families known between August 2010 and October 2011.
  • Publication: Dissecting Android Malware: Characterization and Evolution. Yajin Zhou, Xuxian Jiang. Proceedings of the 33rd IEEE Symposium on Security and Privacy (Oakland 2012).
  • Homepage: http://www.malgenomeproject.org (dataset sharing discontinued)

2. M0Droid Dataset

  • Description: M0Droid is a tool that identifies and classifies Android malware by capturing system call requests and generating behavioral signatures from them.
  • Publication: M0droid: An android behavioral‑based malware detection model. Damshenas M, Dehghantanha A, Choo K K R, et al. Journal of Information Privacy and Security, 2015, 11(3): 141‑157.
  • Homepage: http://cyberscientist.org/m0droid-dataset/
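To make the idea of a behavioral signature built from system calls concrete, here is a minimal sketch. It is not M0Droid's actual algorithm: the functions `syscall_signature` and `signature_distance`, the trace format (a list of system call names), and the L1 distance are all assumptions chosen for illustration.

```python
from collections import Counter


def syscall_signature(trace):
    """Turn a list of system call names into a normalized frequency vector.

    Hypothetical helper: the trace format is an assumption, not M0Droid's.
    """
    counts = Counter(trace)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


def signature_distance(a, b):
    """L1 distance between two signatures (0.0 means identical profiles)."""
    keys = set(a) | set(b)
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)


# Illustrative traces only; real traces would come from a tracing tool.
known_malware = syscall_signature(["open", "read", "read", "write"])
unknown_app = syscall_signature(["open", "read", "write", "write"])
print(signature_distance(known_malware, unknown_app))
```

A classifier built on this idea would compare an unknown application's signature against signatures of known samples and flag close matches; the threshold for "close" would have to be tuned on labeled data.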

3. The Drebin Dataset

  • Description: This dataset contains 5,560 applications from 179 different malware families, collected from August 2010 to October 2012.
  • Publication: Drebin: Efficient and explainable detection of android malware in your pocket. Arp D, Spreitzenbarth M, Hübner M, et al. Proc. of the 21st Network and Distributed System Security Symposium (NDSS), 2014.
  • Homepage: http://user.informatik.uni-goettingen.de/~darp/drebin/

4. A Dataset based on ContagioDump

5. AndroMalShare

6. Kharon Malware Dataset

  • Description: The Kharon dataset is a fully reverse‑engineered and documented malware collection for evaluating research experiments.
  • Publication: Kharon dataset: Android malware under a microscope. CIDRE, EPI. Learning from Authoritative Security Experiment Results (2016): 1.
  • Homepage: http://kharon.gforge.inria.fr/dataset/

7. AMD Project

  • Description: AMD contains 24,553 samples grouped into 135 varieties across 71 malware families, collected between 2010 and 2016.
  • Publication: Android malware clustering through malicious payload mining. Li Y, Jang J, Hu X, et al. International Symposium on Research in Attacks, Intrusions, and Defenses. Springer, Cham, 2017: 192‑214.
  • Homepage: http://amd.arguslab.org

8. AAGM Dataset

  • Description: The AAGM dataset covers 1,900 applications and was generated semi‑automatically by installing Android applications on real smartphones.
  • Publication: Towards a Network‑Based Framework for Android Malware Detection and Characterization. Arash Habibi Lashkari, Andi Fitriah A. Kadir, Hugo Gonzalez, Kenneth Fon Mbah and Ali A. Ghorbani. PST, 2017.
  • Homepage: http://www.unb.ca/cic/datasets/android-adware.html

9. Android PRAGuard Dataset

  • Description: This dataset includes 10,479 samples, obtained by applying seven different obfuscation techniques to malware from the MalGenome and Contagio Minidump datasets.
  • Publication: Stealth attacks: an extended insight into the obfuscation effects on Android malware. Davide Maiorca, Davide Ariu, Igino Corona, Marco Aresu and Giorgio Giacinto. Computers and Security, 2015.
  • Homepage: http://pralab.diee.unica.it/en/AndroidPRAGuardDataset

10. AndroZoo

  • Description: AndroZoo is a collection of 5,781,781 distinct APKs, each analyzed by multiple antivirus products to determine maliciousness.
  • Publication: AndroZoo: Collecting Millions of Android Apps for the Research Community. K. Allix, T. F. Bissyandé, J. Klein, and Y. Le Traon. Mining Software Repositories (MSR) 2016.
  • Homepage: https://androzoo.uni.lu/
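Because each APK comes with results from multiple antivirus products, a common way to use such a collection is to derive a label from the number of engines that flag a sample. The sketch below shows that idea under stated assumptions: the column names `sha256` and `av_detections`, the sample rows, and the threshold of 2 are all hypothetical and do not describe AndroZoo's actual index format.

```python
import csv
import io

# Hypothetical index excerpt; column names and values are illustrative only.
SAMPLE_INDEX = """sha256,av_detections
aaa,0
bbb,1
ccc,7
"""


def label_apks(csv_text, threshold=2):
    """Label an APK 'malicious' when at least `threshold` engines flag it."""
    labels = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        detections = int(row["av_detections"])
        labels[row["sha256"]] = "malicious" if detections >= threshold else "benign"
    return labels


labels = label_apks(SAMPLE_INDEX)
print(labels)
```

The threshold is a design choice: a low value catches more malware but admits more false positives from a single overeager engine, so it should be stated explicitly alongside any results derived from the labels.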