A global government services contractor recently migrated to the cloud with the intent of modernizing their application and increasing their business agility. But the effort introduced deployment delays and security issues as the teams adapted to the new environment. Lab Zero helped them implement DevSecOps best practices resulting in a more secure, agile and responsive infrastructure.
Recently we were approached by a client to help them implement DevSecOps best practices after their recent migration to the cloud. As with any adoption of new technologies and methods, the engineering teams experienced pain points in their day-to-day practice. Infrastructure requests were taking longer to fulfill, and there was growing concern from the organization that the security of their infrastructure was not appropriately considered in their new process flow.
Our role was to dive deep into their process and pinpoint where things were falling short. Through detailed interviews with various teams, we charted their existing process and uncovered key issues.
The client had lifted and shifted all on-prem services into AWS, utilizing Terraform and adopting infrastructure as code (IaC). IaC involves writing configuration files that serve as a “recipe” for infrastructure environments. These files can then be treated as code, enabling developers to implement, test and deploy infrastructure change requests more securely and automatically.
In reality, that wasn’t happening. Terraform was new to the operations team, and they were struggling to fulfill requests while maintaining existing infrastructure. As a result, they continued to make manual changes through the AWS console, with the intent to capture those changes later in Terraform.
This practice resulted in a constant need for infrastructure drift remediation, which slowed the fulfillment of infrastructure requests until remediation was complete. It also undermined the intent of Terraform as the single source of truth.
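To make the drift problem concrete, here is a minimal sketch of how a scheduled job could detect it. It relies on the documented `-detailed-exitcode` flag of `terraform plan`, which exits with 0 for an empty diff, 1 for an error, and 2 for a non-empty diff. The function names and the labels `in_sync`, `drift`, and `error` are our own illustrative choices, not something the client used. A non-empty diff only indicates drift when the checked-out configuration has already been applied, so this assumes the job runs against the main branch.

```python
import subprocess

# Exit codes documented for `terraform plan -detailed-exitcode`.
# The labels on the right are illustrative names, not Terraform terms.
PLAN_EXIT_MEANINGS = {
    0: "in_sync",  # plan succeeded with an empty diff
    1: "error",    # plan failed
    2: "drift",    # plan succeeded with a non-empty diff
}


def classify_plan_exit(code: int) -> str:
    """Map a plan exit code to a label; unknown codes count as errors."""
    return PLAN_EXIT_MEANINGS.get(code, "error")


def check_drift(workdir: str) -> str:
    """Run a plan in `workdir` and report whether live resources differ.

    Assumes the `terraform` binary is on PATH and the directory has
    already been initialized with `terraform init`.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
        check=False,  # exit code 2 is expected, so do not raise on it
    )
    return classify_plan_exit(result.returncode)
```

A nightly run that alerts on `drift` would surface console changes within a day, rather than during the next infrastructure request.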
Our first recommendation was to enforce the Terraform repository as the source of truth for the infrastructure. Infrastructure should be deployed only through automation, Terraform Enterprise in this case, after passing scans and tests as part of the continuous integration and deployment pipeline. This would ensure all changes were captured in code and minimize security flaws during the fulfillment of new infrastructure requests. It would also prevent deployment stoppages caused by remediation efforts to reconcile manual configuration changes. To support this, we recommended tightening role-based access control (RBAC) to effectively lock out manual configuration changes and prevent further resource drift.
We also recommended a Git-based development process for the Terraform repository, encouraging pair programming and reviews that would upskill team members on Terraform configurations. As part of the checks for pull request (PR) review, we recommended static analysis tools such as tfsec, Terrascan, and Checkov, which can detect common Terraform misconfigurations for cloud resources. These tools can be incorporated into the Operations team’s local workflow, for example as a pre-commit hook, and as a check in the continuous integration automation. Commonly provisioned resources can be modularized in Terraform configuration and unit tested to ensure they’re working as expected. These tests should also be part of the checks before a PR is ready for review. Generating a speculative plan prior to deployment would act as an additional validation check during the review process.
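As an illustration of the kind of check these steps automate, the sketch below scans a speculative plan exported with `terraform show -json` for security groups that allow ingress from anywhere. It is a deliberately simplified stand-in for what the dedicated scanners do, not a replacement for them. The function names are hypothetical, and the sketch only inspects inline `ingress` blocks on `aws_security_group` resources.

```python
import json

# CIDR ranges that mean "the whole internet" for IPv4 and IPv6.
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def load_plan(path: str) -> dict:
    """Load a plan previously exported with `terraform show -json <planfile>`."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_open_ingress(plan: dict) -> list[str]:
    """Return addresses of security groups whose planned state allows open ingress."""
    findings = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_security_group":
            continue
        # `after` is null for resources that are being destroyed.
        after = (rc.get("change") or {}).get("after") or {}
        for rule in after.get("ingress") or []:
            cidrs = set(rule.get("cidr_blocks") or [])
            cidrs |= set(rule.get("ipv6_cidr_blocks") or [])
            if cidrs & OPEN_CIDRS:
                findings.append(rc["address"])
                break  # one finding per resource is enough
    return findings
```

In a CI job, a non-empty result from `find_open_ingress(load_plan("plan.json"))` could fail the PR check so the issue is resolved before review.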
A key aspect of DevSecOps is shifting security earlier in the software development life cycle (SDLC). Detecting security flaws earlier in the process helps developers resolve them faster. We already saw part of this principle in the incorporation of static analysis tools into the CI automation process.
Another part is the active participation of the Security team in the pre-provisioning phase of the process. The Security team’s sign-off on the speculative plan during the review process adds another level of validation before the infrastructure changes are applied. After deployment, the Security team’s validation and certification of the deployed resources confirms that everything is appropriately provisioned.
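One way to keep this sign-off lightweight is to route only security-relevant changes to the Security team. The sketch below reads the same JSON plan format produced by `terraform show -json` and lists the changes that would need a reviewer. The list of sensitive resource type prefixes is an assumption made for illustration; each organization would choose its own.

```python
# Hypothetical list of resource type prefixes that warrant a security review.
SENSITIVE_PREFIXES = ("aws_iam_", "aws_security_group", "aws_kms_")


def changes_needing_signoff(plan: dict, prefixes=SENSITIVE_PREFIXES) -> list:
    """Return (address, actions) pairs for planned changes to sensitive resources."""
    flagged = []
    for rc in plan.get("resource_changes", []):
        actions = (rc.get("change") or {}).get("actions") or []
        # Skip entries that do not modify anything.
        if set(actions) <= {"no-op", "read"}:
            continue
        if rc.get("type", "").startswith(prefixes):
            flagged.append((rc["address"], "/".join(actions)))
    return flagged
```

An empty result means the plan could proceed on peer review alone, while any flagged entry would block the apply until the Security team approves.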
The combination of static analysis tools, speculative plan reviews, and post-deployment certification enables quick identification of security vulnerabilities at multiple stages of the provisioning process.
Another tenet of DevSecOps is to normalize shared responsibility for application security across teams, ensuring there’s effective communication between Operations, Security, and Application Development teams.
We observed that the main line of communication with the Operations team was a ticketing system. The Application teams would submit infrastructure requests through this system, but the requests didn’t take existing infrastructure into account. This forced the Operations team to re-architect already provisioned resources as requests were fulfilled. This refactoring was opaque to the Application and Security teams, so they didn’t have visibility into the status of their requests.
We recommended that Operations representatives attend the Technical Review Board meeting to provide feedback on any necessary refactoring before new requests are executed, and to identify that refactoring as a blocker to the new requests. This would prevent on-the-fly refactoring and the security flaws it can introduce when due diligence is skipped.
We also recommended that Operations have representatives in Application Team stand-ups, giving updates on infrastructure provisioning requests to increase visibility into the status of the request fulfillment process. Ultimately, we encouraged better cross-functional teams, communication, and collaboration to overcome some of the process pain points identified in the interviews.
Migrating on-prem services to the cloud presents new security challenges. DevSecOps practice normalizes cross-functional teams with shared responsibility for the security of applications. By building a solid GitOps infrastructure provisioning process that requires reviews and testing automation, security is designed in during development. This framework then becomes a strategic move towards a more secure, agile, and responsive system.