Ansible, a popular Infrastructure-as-Code platform, provides reusable collections of tasks called roles. Roles are often contributed by third parties, and like general-purpose libraries, they evolve. As such, new releases of roles need to be tagged with version numbers, for which Ansible recommends adhering to the semantic versioning format. However, roles significantly differ from general-purpose libraries, and it is not yet known what constitutes a breaking change or the addition of a feature to a role. Consequently, this can cause confusion for clients of a role and new role contributors. To alleviate this issue, we perform an empirical study on semantic versioning in Ansible roles to uncover the types of changes that trigger certain types of version bumps. We collect a dataset of over 70 000 version increments spanning upwards of 7 800 Ansible roles. Moreover, we design a novel structural model for these roles, and implement a domain-specific structural change extraction algorithm to calculate structural difference metrics. Afterwards, we quantitatively investigate the state of semantic versioning in Ansible roles and identify the most commonly changed components. Then, using the structural difference metrics, we train a Random Forest classifier to predict applicable version bumps for Ansible role releases. Lastly, we confirm our empirical findings with a developer survey. Our observations show that although most Ansible role developers follow the semantic versioning format, it appears that they do not always consistently follow the same rules when selecting the version bump to apply.
Recommended citation: Opdebeeck, R., Zerouali, A., Velazquez Rodriguez, C. E., & De Roover, C. (2020). Does Infrastructure as Code Adhere to Semantic Versioning? An Analysis of Ansible Role Evolution. In Proceedings of the 20th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2020) (pp. 238-248). IEEE. doi: 10.1109/SCAM51674.2020.00032