Smelling Secrets: Leveraging Machine Learning and Language Models for Sensitive Parameter Detection in Ansible Security Analysis

Published in Proceedings of the 25th IEEE International Conference on Source Code Analysis and Manipulation (SCAM'25), 2025

Infrastructure as Code (IaC) is an emerging paradigm to automate the configuration of cloud infrastructures. Infrastructure code often processes secret information, such as passwords or private keys. Mishandling such secrets can lead to information disclosure vulnerabilities, yet existing efforts to detect them rely on pattern matching of parameter and variable names, causing false positives and negatives due to suboptimal string patterns.
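To illustrate why name-based pattern matching is brittle, the following minimal sketch flags a parameter as sensitive purely from its name. The regular expression and the three parameter names are hypothetical and chosen for illustration only; they are not taken from the paper or from any existing detection tool.

```python
import re

# Hypothetical pattern for illustration; not the pattern of any real tool.
SECRET_NAME_PATTERN = re.compile(r"(pass(word)?|secret|token|key)", re.IGNORECASE)


def looks_sensitive(param_name: str) -> bool:
    """Flag a parameter as sensitive based only on its name."""
    return SECRET_NAME_PATTERN.search(param_name) is not None


# Illustrative parameter names (assumptions, not drawn from the paper's dataset).
examples = ["login_password", "key_size", "bind_pw"]
flags = {name: looks_sensitive(name) for name in examples}

for name, flagged in flags.items():
    print(f"{name}: {'sensitive' if flagged else 'not sensitive'}")
```

Under this pattern, `key_size` is flagged although a key length is unlikely to be secret (a false positive), while `bind_pw` is missed although it plausibly holds a password (a false negative).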

This paper aims to address these limitations by assessing the effectiveness of traditional Machine Learning (ML) and transformer-based Language Model (LM) classifiers in predicting sensitive module parameters in Ansible, one of the most popular IaC tools. We collect a dataset of over 160,000 Ansible module parameters and their documentation, containing more than 16,000 parameters that expect secret data. Then, we train several ML algorithms and find that the Random Forest algorithm performs best, achieving 93.5% precision but limited recall (72.7%). In parallel, we evaluate multiple pretrained language models in a zero-shot setting, which achieve a recall of up to 90.4% at the expense of lower precision (up to 88.5%). We subsequently fine-tune the language models, resulting in nearly perfect precision (99.8%) and recall (99.8%) on the ground truth dataset.
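For readers less familiar with the two metrics reported above, the sketch below computes precision and recall for a binary "sensitive / not sensitive" classifier using their standard definitions. The label lists are toy values invented for illustration and are unrelated to the paper's dataset or results.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 means 'sensitive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)        # secrets correctly flagged
    fp = sum(1 for t, p in pairs if not t and p)    # non-secrets wrongly flagged
    fn = sum(1 for t, p in pairs if t and not p)    # secrets that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (assumed for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High precision with low recall, as reported for Random Forest, means few false alarms but many missed secrets; the zero-shot language models shift that balance in the opposite direction.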

We compare the best-performing ML and LM classifiers to two baselines that use string patterns. We find that the ML classifier achieves performance comparable to the two baselines, while the fine-tuned LM outperforms all approaches. A qualitative comparison reveals that the approaches are complementary to the baselines, motivating future work to use prediction models to reduce false positives in reports generated by inexpensive baselines. However, we also find that the fine-tuned LM misses several secrets because of noise in the dataset, highlighting the importance of fine-tuning on a high-quality ground truth.

Recommended citation: Opdebeeck, R., Pontillo, V., Velázquez-Rodríguez, C., De Meuter, W., & De Roover, C. (2025). Smelling Secrets: Leveraging Machine Learning and Language Models for Sensitive Parameter Detection in Ansible Security Analysis. In Proceedings of the 25th IEEE International Conference on Source Code Analysis and Manipulation (SCAM'25) [Accepted].