TL;DR

  • Static analysis for IaC security is a good start for “shift left”.
  • Cloud Security needs to continue evolving and moving things further left to achieve true “Continuous Security”.

Background

Let’s discuss what methods exist today and understand the endgame for IaC security. Please note, we will not be talking about each specific tool in the space. This is to maintain objectivity in this blog.

We spend a lot of time thinking about how cloud security has evolved in such a short time span. Recently, we underwent SOC2 compliance for our cloud infrastructure, which forced us to rethink how we managed our 7-year-old cloud infrastructure. In just a year we went from a mix of Ansible and the AWS console to “everything-as-code”, meaning we defined all our infrastructure in an Infrastructure-as-Code language (Terraform) and secured it using our own tool. You can read more about it here.

The focus of this blog is to explore the current state of “shift left” and how we can continue to automate additional security strategies. From a security perspective, the ability to codify infrastructure extends the opportunity to apply “shift left” concepts to Cloud Security. This article by Naor Penso is a great read to understand all the benefits of implementing security strategies for IaC (share it with your CISO!).

The Present: Securing the Code

At the moment, securing IaC involves some variation of Static Analysis (SAST), which is the process of analyzing your proprietary code for security issues. The main advantage is that security issues are identified before the code is compiled into a running application. Testing code against expected outcomes is also already part of a developer’s routine: linters, SCA and SAST tools are becoming standard practice.

Static Analysis, or SAST, presents a great opportunity to introduce security earlier in the development process. Additionally, SAST tools for IaC can be implemented at multiple points in a developer’s workflow. The reality is that SAST can be noisy, and noise breeds mistrust. This is not a new problem: developers also run into frequent false positives on CVEs with application security tools. The easiest reaction is to ignore these violations, and ultimately to ignore the tool.

When it comes to IaC security, there are generally two approaches to integrating SAST into your framework. The following table summarizes them:

Static Analysis Method          | Advantages                    | Disadvantages
Scanning the Terraform HCL code | Fastest scanning capabilities | Lower detection rate for complex Terraform
Scanning the Terraform plan     | Higher detection rate         | Slower; depending on Terraform complexity, generating the plan can take longer

It is hard to determine at a glance which method is best, because speed is an important factor for developers (who are the main consumers of these tools), and there are certainly several tools that can scan both. However, if you are optimizing for the highest detection rate, the following guideline applies.

If your Terraform contains any of the following, you will likely need to analyze from the Terraform plan:

  1. Use of Input variables
  2. Data sources external to the Terraform code
  3. HCL expressions that abstract deployment instructions

You will notice that all three cases share a common issue, which is why they call for analyzing the Terraform plan: the “instructions” to deploy the infrastructure are not self-contained within the Terraform files. SAST tools that rely solely on the raw Terraform files therefore have lower detection rates.
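
To make the difference concrete, here is a minimal sketch of a check that runs against the JSON form of a plan (the output of `terraform show -json`) rather than the raw HCL. The `resource_changes` list and its `change.after` object are part of Terraform's JSON plan representation; the sample data, the resource addresses and the `find_public_buckets` helper are hypothetical, and the sketch assumes the ACL is set directly on the `aws_s3_bucket` resource.

```python
import json

PUBLIC_ACLS = {"public-read", "public-read-write"}


def find_public_buckets(plan):
    """Return addresses of planned aws_s3_bucket resources with a public ACL."""
    findings = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_s3_bucket":
            continue
        # "after" holds the resolved attribute values the apply would produce.
        after = (rc.get("change") or {}).get("after") or {}
        if after.get("acl") in PUBLIC_ACLS:
            findings.append(rc["address"])
    return findings


# Hypothetical, heavily trimmed excerpt of `terraform show -json <planfile>`.
# In the HCL the first bucket might read `acl = var.bucket_acl`, which a
# raw-file scanner cannot resolve; the plan carries the concrete value.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {
      "address": "aws_s3_bucket.logs",
      "type": "aws_s3_bucket",
      "change": {"actions": ["create"], "after": {"acl": "public-read"}}
    },
    {
      "address": "aws_s3_bucket.private",
      "type": "aws_s3_bucket",
      "change": {"actions": ["create"], "after": {"acl": "private"}}
    }
  ]
}
""")

print(find_public_buckets(SAMPLE_PLAN))  # ['aws_s3_bucket.logs']
```

A scanner working only from the HCL would see a variable reference where this check sees `public-read`, which is the detection gap described above.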

Generally speaking, static analysis methods relying on raw Terraform are “good enough” when the Terraform environment is small and does not rely on dynamic data sources. At Indeni, we wanted to specialize in security and make sure organizations can enforce their policies with the highest degree of accuracy. We care about speed, but introducing 10-20 seconds of delay into the CI is acceptable for us. So we made an active choice to use the Terraform plan file to get the highest possible detection rate.

Now, in terms of integrating SAST tools into your workflow, both variations are acceptable at any stage, but speed becomes an important factor. Running “terraform plan” can take a significant amount of time if you have a complex environment. This is where SAST tools that scan the raw Terraform may be preferred, even though their detection rate is significantly lower. There is an active community project called tool-compare, worth looking at, that helps users benchmark the detection rate of each tool.

In conclusion, SAST tools are a great start for dealing with IaC security. There are two methods to implement Static Analysis, and the main decision criteria are accuracy and speed of scanning.

Analyzing the Terraform plan answers the question “How can we analyze the IaC with more context?” For DevSecOps practitioners, this also raises the question “What other context can be derived before analyzing IaC?” We believe there is an endgame that Shift Left can achieve with Cloud Security strategies by automating more security requirements into the code testing lifecycle.

The Endgame: Continuous Security

The premise of “shifting left” is to test the code early and often in the process to reduce technical debt. Studies show that fixing issues in code can take 10x less effort before deployment and 100x less effort before the project goes into maintenance. For Cloud Security, “shift left” means continuously reducing the cloud security debt. So it seems like IaC SAST tools fit the bill so far?

Well, SAST tools effectively enable organizations to reduce security debt by validating against static security requirements. For example, the infamous “Make sure S3 buckets are not public” requirement is validated by most IaC SAST tools. These requirements are commonly defined by security benchmarks like CIS. Benchmarks are a great start and everyone should begin their IaC security journey this way. But being compliant does not mean that you are secure. This was made evident by a SOC2 auditing startup that experienced a security breach in their own cloud.

For cloud security to truly shift left, we as an industry need to automate more security strategies earlier in the testing process. Currently, a lot of vendors and OSS projects focus on detecting security issues once infrastructure is deployed. In Application Security, this would be considered Dynamic Analysis or DAST, or “shift right” testing.

This anti-pattern is creating a divide between security strategies and how to implement them. On the one hand, Static Analysis can be used to scan for security issues contained within the target code, but it cannot catch every infrastructure security problem, because some only become visible once the infrastructure is deployed. On the other hand, “shift right” testing methodologies, commonly known as “Dynamic Analysis” or DAST, have existed for some time, but this strategy is problematic because it means security is observed only after deployment.

This is where the SAST framework lends itself very well to the situation. If there is a way to integrate security events from the DAST approach into a SAST tool, you can feed dynamic security events into the static scan and enrich your IaC security analysis with more context-awareness. Enriching SAST with DAST findings allows more automation. We call it the “Continuous Security” framework.

Let’s use the following example to see how we can apply this framework:

A developer wants to deploy a service in AWS. Their service has several Lambda functions, each requiring read access to several target S3 buckets. The developer has avoided some basic IAM misconfigurations using a SAST tool, but they are not clear on all the privileges needed to run the Lambda functions (they are not an IAM expert). As a result, the IAM roles defined are likely overly permissive.

An IaC SAST tool will likely be able to flag obvious scenarios, like an IAM policy that grants the Lambda function admin permissions. So it could catch the scenario below:

{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": "*",
    "Resource": "*"
  }
}

In reality, the developer has the following as the Lambda IAM policy, because they were not sure whether the Lambda function would need additional access in the future:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1620276342637",
      "Action": ["s3:Get*", "s3:Put*", "s3:List*"],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

The problem here is that the Lambda function is allowed to do more than just read the target S3 bucket. It can also copy the bucket’s contents into a public S3 bucket.
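
A minimal sketch of why the second policy slips through: the check below is a hypothetical, deliberately naive “full admin” rule of the kind a static scanner might ship (it is not taken from any specific tool). It flags the first policy and passes the second, even though the second still grants far more than read access.

```python
def _as_list(value):
    """IAM allows Statement, Action and Resource to be a single item or a list."""
    return value if isinstance(value, list) else [value]


def grants_full_admin(policy):
    """True if any Allow statement has both Action "*" and Resource "*"."""
    for stmt in _as_list(policy["Statement"]):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        if "*" in actions and "*" in resources:
            return True
    return False


# The two policies from the text, written as Python dicts.
ADMIN_POLICY = {
    "Version": "2012-10-17",
    "Statement": {"Effect": "Allow", "Action": "*", "Resource": "*"},
}
S3_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1620276342637",
            "Action": ["s3:Get*", "s3:Put*", "s3:List*"],
            "Effect": "Allow",
            "Resource": "*",
        }
    ],
}

print(grants_full_admin(ADMIN_POLICY))  # True
print(grants_full_admin(S3_POLICY))     # False
```

Nothing in the policy text alone says which of the granted S3 actions the function really needs, so a static rule has no basis for calling the second policy too broad.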

So how would the Lambda developer identify the “least-privilege” policy? Well, the first step is to make sure that all the Lambda operations are recorded in CloudTrail. Then, you would need to let the Lambda function operate in a test environment and simulate all of its actions. From CloudTrail, we would trace all of the Lambda function’s activities. For example, the Lambda function may need to fetch a list of buckets and make the ListBuckets call. From the CloudTrail events, the Lambda developer can identify the “diff” between CloudTrail and the IAM policy. If the IAM policy allows more actions than the Lambda function needs, then the policy is “overly permissive”.
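
The “diff” can be sketched roughly as follows. The `eventSource` and `eventName` fields are real CloudTrail record fields, but the sample events are made up, and deriving an IAM action as `<service>:<eventName>` is a simplifying assumption: CloudTrail event names do not always match IAM action names one-to-one, and S3 object-level calls only appear if data events are enabled. The helper names are hypothetical.

```python
from fnmatch import fnmatchcase


def observed_actions(events):
    """Derive IAM-style action names from CloudTrail-style records.

    Assumption: "<service>:<eventName>" approximates the IAM action name.
    """
    actions = set()
    for event in events:
        service = event["eventSource"].split(".")[0]  # "s3.amazonaws.com" -> "s3"
        actions.add(f"{service}:{event['eventName']}")
    return actions


def unused_grants(granted_patterns, used_actions):
    """Return granted action patterns that no observed action matched."""
    used = [a.lower() for a in used_actions]  # IAM actions are case-insensitive
    return [
        pattern
        for pattern in granted_patterns
        if not any(fnmatchcase(a, pattern.lower()) for a in used)
    ]


# Hypothetical events recorded while exercising the Lambda in a test environment.
SAMPLE_EVENTS = [
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    {"eventSource": "s3.amazonaws.com", "eventName": "ListBuckets"},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
]
GRANTED = ["s3:Get*", "s3:Put*", "s3:List*"]

print(unused_grants(GRANTED, observed_actions(SAMPLE_EVENTS)))  # ['s3:Put*']
```

Any pattern returned here was granted but never exercised, which is the signal that the policy is overly permissive.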

Now, this workflow is very manual. It can be automated, and there are tools that help with the detection aspect, but the remediation afterwards is still manual. Manual processes do not bode well for DevOps. This is where we can apply the Continuous Security framework. Once a policy is detected as “overly permissive”, we can treat this as a security event.

This security event can be integrated into the IaC security tool. In this scenario, the tool would have to merge this dynamic security event in-memory and use it as part of its understanding of the IAM policy. Then, when the IaC SAST tool scans the IAM policy again after the event has been received, it uses the “overly permissive” security event as part of the context and reports back to the Terraform developer that the scope of the policy can be reduced further. As part of the finding, the IaC security tool should also state what needs to be fixed to “right-size” the policy.
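
One way to picture that merge, as a sketch under stated assumptions rather than a description of how any particular tool works: the event shape (`source`, `role`, `used_actions`), the rule name and both helper functions are hypothetical. The scan takes the policy from the IaC plus the dynamic event and emits a finding that carries a narrowed policy suggestion. It only narrows actions; it leaves `Resource` untouched.

```python
import copy
from fnmatch import fnmatchcase


def _as_list(value):
    return value if isinstance(value, list) else [value]


def right_size(policy, used_actions):
    """Copy the policy, keeping only observed actions each Allow already granted."""
    suggested = copy.deepcopy(policy)
    for stmt in _as_list(suggested["Statement"]):
        if stmt.get("Effect") != "Allow":
            continue
        granted = [p.lower() for p in _as_list(stmt["Action"])]
        # Never broadens: an observed action is kept only if a granted pattern covers it.
        stmt["Action"] = sorted(
            a for a in set(used_actions)
            if any(fnmatchcase(a.lower(), p) for p in granted)
        )
    return suggested


def enrich_finding(resource_address, policy, event):
    """Merge a dynamic 'overly permissive' event into a static finding."""
    suggested = right_size(policy, event["used_actions"])
    if suggested == policy:
        return None  # nothing to narrow
    return {
        "resource": resource_address,
        "rule": "iam_policy_overly_permissive",
        "evidence": event["source"],
        "suggested_policy": suggested,
    }


POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1620276342637",
            "Action": ["s3:Get*", "s3:Put*", "s3:List*"],
            "Effect": "Allow",
            "Resource": "*",
        }
    ],
}
EVENT = {
    "source": "cloudtrail-diff",
    "role": "lambda-reader",
    "used_actions": ["s3:GetObject", "s3:ListBucket"],
}

finding = enrich_finding("aws_iam_policy.lambda_read", POLICY, EVENT)
print(finding["suggested_policy"]["Statement"][0]["Action"])
# ['s3:GetObject', 's3:ListBucket']
```

The suggested policy is what the finding would hand back to the Terraform developer as the “right-sized” fix.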

Implementing the Continuous Security framework makes the most sense when the Terraform developer analyzes their code as early as the testing phase. As a result, this helps organizations create a complete feedback loop for detecting overly permissive policies and fixing them before they ever make it into production.

IAM Security has now truly shifted left.

Now, let’s say that the Lambda developer from the earlier example needed to store credit card information of the application’s users. The developer accidentally stores it in an S3 bucket that is publicly accessible.

Again, identifying datastores that contain sensitive data requires a live environment and a “shift right” strategy. Several tools handle the discovery of sensitive datastores, and again, each finding can be considered a “security event”. For example, consider using Amazon Macie to detect sensitive data and reflect that back through the IaC security tool, within the CI/CD pipelines.

For the above case, one of these data security tools would flag the S3 bucket as containing sensitive data. That finding can be merged in-memory with the context of the IaC so that the Terraform developer is informed of any public S3 buckets that expose sensitive information.
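
As a rough sketch of that merge: the finding fields (`bucket`, `data_type`) are a hypothetical, simplified shape and not the schema of Amazon Macie or any other tool, and the bucket names and addresses are made up. The idea is simply to join the buckets the IaC marks as public with the buckets a data discovery tool has flagged.

```python
def exposed_sensitive_buckets(public_buckets, sensitive_findings):
    """Join public buckets from the IaC with sensitive-data findings.

    public_buckets: {terraform address: bucket name}, taken from the IaC scan.
    sensitive_findings: list of {"bucket": ..., "data_type": ...} events.
    """
    types_by_bucket = {}
    for f in sensitive_findings:
        types_by_bucket.setdefault(f["bucket"], set()).add(f["data_type"])

    results = []
    for address, name in sorted(public_buckets.items()):
        if name in types_by_bucket:
            results.append({
                "resource": address,
                "bucket": name,
                "data_types": sorted(types_by_bucket[name]),
            })
    return results


# Hypothetical inputs: two public buckets in the IaC, two discovery findings.
PUBLIC_BUCKETS = {
    "aws_s3_bucket.uploads": "acme-uploads",
    "aws_s3_bucket.site": "acme-static-site",
}
FINDINGS = [
    {"bucket": "acme-uploads", "data_type": "CREDIT_CARD_NUMBER"},
    {"bucket": "acme-internal", "data_type": "EMAIL"},
]

for hit in exposed_sensitive_buckets(PUBLIC_BUCKETS, FINDINGS):
    print(hit["resource"], hit["data_types"])
# aws_s3_bucket.uploads ['CREDIT_CARD_NUMBER']
```

Only buckets that are both public in the IaC and flagged as sensitive are reported, which keeps the result specific to real exposure rather than every public bucket.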

Data Security has now shifted left.

The main premise behind the Continuous Security framework is to automate more cloud security challenges, like right-sizing IAM policies and data security, into the shift-left framework. At the moment, IaC tools evaluate code against statically defined security issues. With the Continuous Security framework, you can introduce dynamic security requirements and bring the same shift-left benefits to all Cloud Security challenges.

Conclusion

Shifting left for cloud security has just started. As a community, we are all invested in making Cloud Security easier to tackle. We believe that the Continuous Security framework will drastically reduce the cloud security issues we see five years from now, while making the cost of securing cloud environments far lower than it is today.