Senanayake, J.; Kalutarage, H.; Al-Kadri, M.O.; Piras, L.; Petrovski, A.

Labelled vulnerability dataset on Android source code (LVDAndro) to develop AI-based code vulnerability detection models

Conference paper

Senanayake, J., Kalutarage, H., Al-Kadri, M.O., Piras, L. and Petrovski, A. 2023. Labelled vulnerability dataset on Android source code (LVDAndro) to develop AI-based code vulnerability detection models. Vimercati, S. and Samarati, P. (ed.) International Conference on Security and Cryptography (SECRYPT) 2023. Rome, Italy 10 - 12 Jul 2023 SCITEPRESS - Science and Technology Publications. pp. 659-666 https://doi.org/10.5220/0012060400003555

Publication dates
Type	Conference paper
Title	Labelled vulnerability dataset on Android source code (LVDAndro) to develop AI-based code vulnerability detection models
Authors	Senanayake, J., Kalutarage, H., Al-Kadri, M.O., Piras, L. and Petrovski, A.
Abstract	Ensuring the security of Android applications is a vital and intricate aspect requiring careful consideration during development. Unfortunately, many apps are published without sufficient security measures, possibly due to a lack of early vulnerability identification. One possible solution is to employ machine learning models trained on a labelled dataset, but currently available datasets are suboptimal. This study creates a sequence of datasets of Android source code vulnerabilities, named LVDAndro, labelled based on Common Weakness Enumeration (CWE). Three datasets were generated through app scanning by altering the number of apps and their sources. LVDAndro includes over 2,000,000 unique code samples, obtained by scanning over 15,000 apps. The AutoML technique was then applied to each dataset, as a proof of concept to evaluate the applicability of LVDAndro in detecting vulnerable source code using machine learning. The AutoML model, trained on the dataset, achieved an accuracy of 94% and an F1-Score of 0.94 in binary classification, and an accuracy of 94% and an F1-Score of 0.93 in CWE-based multi-class classification. The LVDAndro dataset is publicly available, and continues to expand as more apps are scanned and added to the dataset regularly. The LVDAndro GitHub Repository also includes the source code for dataset generation and model training.
Keywords	Android Application Security; Code Vulnerability; Labelled Dataset; Artificial Intelligence; Auto Machine Learning.
Sustainable Development Goals	9 Industry, innovation and infrastructure
Middlesex University Theme	Creativity, Culture & Enterprise
Research Group	Software Engineering, Theory & Algorithms (SETA)
Conference	International Conference on Security and Cryptography (SECRYPT) 2023
Page range	659-666
Proceedings Title	Proceedings of the 20th International Conference on Security and Cryptography, SECRYPT - Volume 1
Series	SECRYPT
Editors	Vimercati, S. and Samarati, P.
ISSN	2184-7711
ISBN	9789897586668
Publisher	SCITEPRESS - Science and Technology Publications
Print	10 Jul 2023
Publication process dates
Accepted	23 Apr 2023
Deposited	18 Jul 2023
Output status	Published
Publisher's version	SECRYPT-23_Labelled_LVDAndro.pdf (License: CC BY-NC-ND 4.0; File access level: Open)
Copyright Statement	Senanayake, J., Kalutarage, H., Al-Kadri, M., Piras, L. and Petrovski, A., Labelled Vulnerability Dataset on Android Source Code (LVDAndro) to Develop AI-Based Code Vulnerability Detection Models. DOI: 10.5220/0012060400003555 In Proceedings of the 20th International Conference on Security and Cryptography (SECRYPT 2023), pages 659-666 ISBN: 978-989-758-666-8; ISSN: 2184-7711 Copyright © 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
Digital Object Identifier (DOI)	https://doi.org/10.5220/0012060400003555
Web of Science identifier	WOS:001072829100063
Web address (URL) of conference proceedings	https://doi.org/10.5220/0000167900003555
Language	English

Permalink	https://repository.mdx.ac.uk/item/8q739

Related outputs

A risk assessment of information security in a diet centre business: a case study

Annahdi, T., Alkubaisy, D. and Piras, L. 2025. A risk assessment of information security in a diet centre business: a case study. Mannion, M., Mannisto, T. and Maciaszek, L. (ed.) 20th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE). Porto (Portugal) 04 - 06 Apr 2025 SCITEPRESS - Science and Technology Publications. pp. 858-867 https://doi.org/10.5220/0013488300003928

Enhancing privacy, censorship resistance, and user engagement in a blockchain-based social network

Thiha, M., Yetgin, H., Piras, L. and Al-Obeidallah, M.G. 2025. Enhancing privacy, censorship resistance, and user engagement in a blockchain-based social network. Mannion, M., Mannisto, T. and Maciaszek, L. (ed.) 20th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE). Porto (Portugal) 04 - 06 Apr 2025 SCITEPRESS - Science and Technology Publications. pp. 67-79 https://doi.org/10.5220/0013238500003928

Assuring privacy of AI-powered community driven Android code vulnerability detection

Senanayake, J., Kalutarage, H., Piras, L., Al-Kadri, M.O. and Petrovski, A. 2025. Assuring privacy of AI-powered community driven Android code vulnerability detection. Garcia-Alfaro, J., Kalutarage, H., Yanai, N., Kozik, R., Ksieniewicz, P., Woźniak, M., Abie, H., Ranise, S., Verderame, L., Cambiaso, E., Ugarelli, R., Praça, I., Katt, B., Pirbhulal, S., Shukla, A., Pawlicki, M. and Choraś, M. (ed.) 3rd International Workshop on System Security Assurance. Bydgoszcz, Poland 19 - 20 Sep 2024 Springer. pp. 457-476 https://doi.org/10.1007/978-3-031-82362-6_27

Formalizing federated learning and differential privacy for GIS systems in IIIf

Kammueller, F., Piras, L., Fields, B. and Nagarajan, R. 2025. Formalizing federated learning and differential privacy for GIS systems in IIIf. Garcia-Alfaro, J., Kalutarage, H., Yanai, N., Kozik, R., Ksieniewicz, P., Woźniak, M., Abie, H., Ranise, S., Verderame, L., Cambiaso, E., Ugarelli, R., Praça, I., Katt, B., Pirbhulal, S., Shukla, A., Pawlicki, M. and Choraś, M. (ed.) 3rd International Workshop on System Security Assurance. Bydgoszcz, Poland 19 - 20 Sep 2024 Springer. pp. 477-487 https://doi.org/10.1007/978-3-031-82362-6_28

Model-based gamification design with Web-Agon: an automated analysis tool for gamification

Zaw, H.K., Piras, L., Calabrese, F. and Al-Obeidallah, M.G. 2024. Model-based gamification design with Web-Agon: an automated analysis tool for gamification. 50th Euromicro Conference Series on Software Engineering and Advanced Applications. Paris, France 28 - 30 Aug 2024 IEEE. pp. 168-171 https://doi.org/10.1109/SEAA64295.2024.00033

Defendroid: real-time Android code vulnerability detection via blockchain federated neural network with XAI

Senanayake, J., Kalutarage, H., Petrovski, A., Piras, L. and Al-Kadri, M. 2024. Defendroid: real-time Android code vulnerability detection via blockchain federated neural network with XAI. Journal of Information Security and Applications. 82. https://doi.org/10.1016/j.jisa.2024.103741

Gamification of E-Learning apps via acceptance requirements analysis

Calabrese, F., Piras, L., Al-Obeidallah, M., Egbikuadje, B. and Alkubaisy, D. 2024. Gamification of E-Learning apps via acceptance requirements analysis. 19th International Conference on Evaluation of Novel Approaches to Software Engineering. Angers, France 28 - 29 Apr 2024 SCITEPRESS - Science and Technology Publications. pp. 291-298 https://doi.org/10.5220/0012550400003687

FedREVAN: real-time detection of vulnerable Android source code through federated neural network with XAI

Senanayake, J., Kalutarage, H., Petrovski, A., Al-Kadri, M.O. and Piras, L. 2024. FedREVAN: real-time detection of vulnerable Android source code through federated neural network with XAI. ESORICS Workshop on Attacks and Software Protection (WASP). The Hague, The Netherlands 25 - 29 Sep 2023 Springer. pp. 426-441 https://doi.org/10.1007/978-3-031-54129-2_25

Android code vulnerabilities early detection using AI-powered ACVED plugin

Senanayake, J., Kalutarage, H., Al-Kadri, M.O., Petrovski, A. and Piras, L. 2023. Android code vulnerabilities early detection using AI-powered ACVED plugin. Atluri, V. and Ferrara, A. (ed.) 37th Annual IFIP WG 11.3 Conference (DBSec 2023). Sophia-Antipolis, France 19 - 21 Jul 2023 Cham, Switzerland Springer. pp. 339–357 https://doi.org/10.1007/978-3-031-37586-6_20

Goal-modeling privacy-by-design patterns for supporting GDPR compliance

Al-Obeidallah, M., Piras, L., Iloanugo, O., Mouratidis, H., Alkubaisy, D. and Dellagiacoma, D. 2023. Goal-modeling privacy-by-design patterns for supporting GDPR compliance. Fill, H.-G., Domínguez-Mayo, F.J., van Sinderen, M. and Maciaszek, L. (ed.) International Conference on Software Technologies (ICSOFT). Rome, Italy 10 - 12 Jul 2023 SCITEPRESS - Science and Technology Publications. pp. 361-368 https://doi.org/10.5220/0012080700003538

Android source code vulnerability detection: a systematic literature review

Senanayake, J., Kalutarage, H., Al-Kadri, M.O., Petrovski, A. and Piras, L. 2023. Android source code vulnerability detection: a systematic literature review. ACM Computing Surveys. 55 (9). https://doi.org/10.1145/3556974

A framework for privacy and security requirements analysis and conflict resolution for supporting GDPR compliance through privacy-by-design

Alkubaisy, D., Piras, L., Al-Obeidallah, M., Cox, K. and Mouratidis, H. 2022. A framework for privacy and security requirements analysis and conflict resolution for supporting GDPR compliance through privacy-by-design. Ali, R., Kaindl, H. and Maciaszek, L. (ed.) 16th International Conference on Evaluation of Novel Approaches to Software Engineering. Virtual 26 - 27 Apr 2021 Cham Springer. pp. 67-87 https://doi.org/10.1007/978-3-030-96648-5_4

Developing secured Android applications by mitigating code vulnerabilities with machine learning

Senanayake, J., Kalutarage, H., Al-Kadri, M., Petrovski, A. and Piras, L. 2022. Developing secured Android applications by mitigating code vulnerabilities with machine learning. ACM Asia Conference on Computer and Communications Security (ASIA CCS '22). Nagasaki, Japan 30 May - 03 Jun 2022 Association for Computing Machinery (ACM). pp. 1255–1257 https://doi.org/10.1145/3488932.3527290

Supporting the individuation, analysis and gamification of software components for acceptance requirements fulfilment

Calabrese, F., Piras, L. and Giorgini, P. 2022. Supporting the individuation, analysis and gamification of software components for acceptance requirements fulfilment. Barn, B. and Sandkuhl, K. (ed.) IFIP Working Conference on The Practice of Enterprise Modeling. London, UK 23 - 25 Nov 2022 Springer. pp. 33-48 https://doi.org/10.1007/978-3-031-21488-2_3

Confis: a tool for privacy and security analysis and conflict resolution for supporting GDPR compliance through privacy-by-design

Alkubaisy, D., Piras, L., Al-Obeidallah, M., Cox, K. and Mouratidis, H. 2021. Confis: a tool for privacy and security analysis and conflict resolution for supporting GDPR compliance through privacy-by-design. Ali, R., Kaindl, H. and Maciaszek, L. (ed.) 16th International Conference on Evaluation of Novel Approaches to Software Engineering. Virtual 26 - 27 Apr 2021 SCITEPRESS - Science and Technology Publications. pp. 80-91 https://doi.org/10.5220/0010406100800091

Privacy, security, legal and technology acceptance requirements for a GDPR compliance platform

Tsohou, A., Magkos, M., Mouratidis, H., Chrysoloras, G., Piras, L., Pavlidis, M., Debussche, J., Rotoloni, M. and Gallego-Nicasio Crespo, B. 2020. Privacy, security, legal and technology acceptance requirements for a GDPR compliance platform. 2019 International Workshop on Security and Privacy Requirements Engineering. Luxembourg City, Luxembourg 26 - 27 Sep 2019 Springer. https://doi.org/10.1007/978-3-030-42048-2_14

DEFeND DSM: a data scope management service for model-based privacy by design GDPR compliance

Piras, L., Al-Obeidallah, M., Pavlidis, M., Mouratidis, H., Tsohou, A., Magkos, E., Praitano, A., Iodice, A. and Gallego-Nicasio Crespo, B. 2020. DEFeND DSM: a data scope management service for model-based privacy by design GDPR compliance. 17th International Conference on Trust and Privacy in Digital Business. Bratislava, Slovakia 14 - 17 Sep 2020 Springer. https://doi.org/10.1007/978-3-030-58986-8_13

Design thinking and acceptance requirements for designing gamified software

Piras, L., Dellagiacoma, D., Perini, A., Susi, A., Giorgini, P. and Mylopoulos, J. 2019. Design thinking and acceptance requirements for designing gamified software. 13th International Conference on Research Challenges in Information Science. Brussels, Belgium 29 - 31 May 2019 IEEE. pp. 1-12 https://doi.org/10.1109/rcis.2019.8876973

Goal-oriented requirements engineering: an extended systematic mapping study

Horkoff, J., Aydemir, F., Cardoso, E., Li, T., Mate, A., Paja, E., Salnitri, M., Piras, L., Mylopoulos, J. and Giorgini, P. 2019. Goal-oriented requirements engineering: an extended systematic mapping study. Requirements Engineering. 24 (2), pp. 133-160. https://doi.org/10.1007/s00766-017-0280-z

DEFeND architecture: a privacy by design platform for GDPR compliance

Piras, L., Al-Obeidallah, M., Praitano, A., Tsohou, A., Mouratidis, H., Gallego-Nicasio Crespo, B., Bernard, J., Fiorani, M., Magkos, E., Castillo Sanz, A., Pavlidis, M., D'Addario, R. and Zorzino, G. 2019. DEFeND architecture: a privacy by design platform for GDPR compliance. 16th International Conference on Trust, Privacy and Security in Digital Business. Linz, Austria 26 - 29 Aug 2019 Springer. https://doi.org/10.1007/978-3-030-27813-7_6

Goal models for acceptance requirements analysis and gamification design

Piras, L., Paja, E., Giorgini, P. and Mylopoulos, J. 2017. Goal models for acceptance requirements analysis and gamification design. Mayr, H.C., Guizzardi, G., Ma, H. and Pastor, O. (ed.) 36th International Conference on Conceptual Modeling. Valencia, Spain 06 - 09 Nov 2017 Cham Springer. pp. 223-230 https://doi.org/10.1007/978-3-319-69904-2_18

Gamification solutions for software acceptance: a comparative study of requirements engineering and organizational behavior techniques

Piras, L., Paja, E., Giorgini, P., Mylopoulos, J., Cuel, R. and Ponte, D. 2017. Gamification solutions for software acceptance: a comparative study of requirements engineering and organizational behavior techniques. 11th International Conference on Research Challenges in Information Science. Brighton, UK 10 - 12 May 2017 IEEE. pp. 255-265 https://doi.org/10.1109/rcis.2017.7956544

Acceptance requirements and their gamification solutions

Piras, L., Giorgini, P. and Mylopoulos, J. 2016. Acceptance requirements and their gamification solutions. IEEE 24th International Requirements Engineering Conference. Beijing, China 12 - 16 Sep 2016 IEEE. pp. 365-370 https://doi.org/10.1109/RE.2016.43

Using gamification to incentivize sustainable urban mobility

Kazhamiakin, R., Marconi, A., Perillo, M., Pistore, M., Valetto, G., Piras, L., Avesani, F. and Perri, N. 2015. Using gamification to incentivize sustainable urban mobility. IEEE First International Smart Cities Conference. Guadalajara, Mexico 25 - 28 Oct 2015 IEEE. https://doi.org/10.1109/ISC2.2015.7366196

A portable wireless-based architecture for solving minimum digital divide problems

Fenu, G. and Piras, L. 2008. A portable wireless-based architecture for solving minimum digital divide problems. 4th International Conference on Wireless and Mobile Communications. Athens, Greece 27 Jul - 01 Aug 2008 IEEE. pp. 130-136 https://doi.org/10.1109/icwmc.2008.21
