Datasets

DroidMalVet Dataset

This dataset provides a curated collection of 20 code-level metrics extracted from Android malware applications, with each application labeled with its malware family. It comprises 29,162 malicious apps categorized into 202 malware families. The metrics capture structural and behavioral properties of malicious apps and are grouped into four categories: complexity, dimensional, object-oriented, and Android-oriented metrics.

These metrics were extracted by decompiling each malicious application into smali code and analyzing it with a modified static analysis tool. This dataset enables researchers to study malware family detection, characterization, and evolution using a compact and efficient feature set.
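As a minimal sketch of how a downloaded copy might be sanity-checked against the figures above, the snippet below counts apps, families, and metric columns in a labeled table. The file layout is an assumption, not something this page specifies: it supposes a CSV with one row per app, one label column named `family` (a hypothetical name), and every other column holding a metric. The sample rows are made up for illustration and are not real DroidMalVet data.

```python
import csv
import io
from collections import Counter


def summarize(csv_text, label_column="family"):
    """Return (app count, family count, metric-column count) for a labeled table.

    Assumes one row per app and that every column other than
    `label_column` (a hypothetical name) is a metric.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    families = Counter(row[label_column] for row in rows)
    metric_columns = [
        name for name in (reader.fieldnames or []) if name != label_column
    ]
    return len(rows), len(families), len(metric_columns)


# Tiny made-up sample, NOT real DroidMalVet data.
SAMPLE = """family,metric_a,metric_b
FamilyX,10,0.5
FamilyX,12,0.7
FamilyY,3,0.1
"""

print(summarize(SAMPLE))  # (3, 2, 2)
```

If the real files follow this layout, the same call on the full dataset should report 29,162 apps, 202 families, and 20 metrics; extra columns such as an app identifier would need to be excluded first.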

Further details can be found in our paper “Lightweight, Effective Detection and Characterization of Mobile Malware Families” [PDF], IEEE Transactions on Computers, 2022.

Citation

If you use this dataset in a project or publication, please cite:

@ARTICLE{DroidMalVet,
  author  = {Elish, Karim and Elish, Mahmoud and Almohri, Hussain},
  journal = {IEEE Transactions on Computers},
  title   = {Lightweight, Effective Detection and Characterization of Mobile Malware Families},
  year    = {2022},
  volume  = {71},
  number  = {11},
  pages   = {2982--2995}
}

Download Policy

We are happy to share this dataset. Please email kelish@floridapoly.edu stating your identity and research scope. We will then provide a download link.

  • Please do not share the data with others, except co-authors on the same project.
  • Other researchers who want the data should request it from us directly; we are happy to share it.
  • Please contact us from your institutional/organization email and provide your name, title, and affiliation.

We will maintain a public list of requesting organizations at the bottom of this page.