TL;DR / Summary at the end of the post.
Full Disclosure up-front: I am employed as a Code Scanning Architect at GitHub at the time I published this article.
In Part 1 of this series I referred to how the “Three types of Static Analysis” differ in the ways they perform scans. I talked a bit about how Static Code Analysis (SCA) and Binary Static Analysis (BSA) perform shallow / imperfect data flow mapping in their respective “Accuracy, Speed, or Completeness” reviews, but the distinguishing difference between Comprehensive Static Analysis (CSA) and legacy methods of analysis boils down to how the technologies treat Local and Remote sources.
In Static Analysis there are three important terms used to define how data flows through an application - sources, sanitizers, and sinks. A source is an input; sanitizers are functions that transform that input so it is benign from a security perspective; and sinks are where data from a source is either stored or executed. So what differentiates a Local source from a Remote source? And why should development or security teams care?
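To make those three terms concrete, here is a minimal sketch in Python. It is a toy model and not how any particular Static Analysis tool works internally: the names `Value`, `read_source`, `sanitize`, and `render_sink` are made up for illustration, and the "analysis" here happens at run time only to keep the example short.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A piece of data plus a flag recording whether it is still tainted."""
    text: str
    tainted: bool


def read_source(raw: str) -> Value:
    # Source: an input enters the application and starts out tainted.
    return Value(raw, tainted=True)


def sanitize(value: Value) -> Value:
    # Sanitizer: transform the input so it is benign for an HTML context,
    # then clear the taint flag.
    return Value(html.escape(value.text), tainted=False)


def render_sink(value: Value) -> str:
    # Sink: the place where the data is finally used. Tainted data reaching
    # this point is what a tool would report as a finding.
    if value.tainted:
        raise ValueError("tainted data reached the sink")
    return f"<p>{value.text}</p>"
```

Passing `read_source(...)` straight into `render_sink` raises, while routing it through `sanitize` first does not. That source-to-sink path without a sanitizer in between is the shape of a data flow finding.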
Local sources are considered inputs that are either instantiated at compile / run time, or inputs controlled by system values that cannot be changed without privileged access. Some examples might include values found in properties files, environment variables, operating system arguments, or those pulled from trusted sources like a secrets management platform.
On the other hand, Remote sources are considered inputs controlled by unprivileged users once the application is running. Some examples include values provided by requests to a Uniform Resource Identifier (URI), inputs supplied to a form field, or other data that might be captured by the application - such as a user agent string. Really any input controlled by an untrusted or unprivileged user which is consumed by a running application should be considered a Remote source.
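The distinction can be sketched as a simple lookup. The origin labels below are hypothetical names I picked to mirror the examples in the two paragraphs above; a real tool would derive this from its own model of the language and frameworks in use.

```python
from enum import Enum


class SourceKind(Enum):
    LOCAL = "local"
    REMOTE = "remote"


# Hypothetical labels for the Local examples mentioned above.
LOCAL_ORIGINS = frozenset({
    "properties_file",
    "environment_variable",
    "os_argument",
    "secrets_manager",
})

# Hypothetical labels for the Remote examples mentioned above.
REMOTE_ORIGINS = frozenset({
    "uri_request",
    "form_field",
    "user_agent_header",
})


def classify(origin: str) -> SourceKind:
    """Return whether an input origin is a Local or a Remote source."""
    if origin in LOCAL_ORIGINS:
        return SourceKind.LOCAL
    if origin in REMOTE_ORIGINS:
        return SourceKind.REMOTE
    # Refuse to guess: an unknown origin needs a human decision.
    raise ValueError(f"unclassified source origin: {origin!r}")
```

Raising on unknown origins is a deliberate choice in this sketch, since silently defaulting either way would hide exactly the judgment call this post is about.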
How to support this content: If you find this post useful, enjoyable, or influential (and have the coin to spare) - you can support the content I create via Patreon. Thank you to those who already support this blog! 😊 And now, back to that content you were enjoying! 🎉
So why should we bother distinguishing between Local and Remote sources when performing Static Analysis? Because the consideration of Local sources as a provider of tainted input is a preposterous threat model, and leads to hundreds (or thousands) of false positive findings that developers just won’t remediate. With the exception of microservice architectures, if you’re considering Local sources a viable threat vector in your application then you’ve likely got much bigger problems to deal with than exploitable flaws in your code. The inclusion of Local sources as tainted input is where legacy Static Analysis technologies start to show their age.
Due to the shallow / imperfect data flow mapping that occurs with Static Code Analysis and Binary Static Analysis respectively, the easiest way to compensate for this lack of complete visibility has been to simply treat all sources as tainted inputs until they go through a sanitizer. Unfortunately this practice has led to security organizations being inundated with hundreds (or thousands) of vulnerabilities reported by their Static Analysis tools. This “noise in the machine” has engendered a high level of friction (and sometimes downright hostility) between security and development teams over the last two decades - but thankfully modern Static Analysis technologies are addressing this problem.
Since Comprehensive Static Analysis (CSA) is performed by accessing both the source code and build process, it is able to compile an accurate data flow map of the application - and therefore distinguish between Local and Remote sources. What this means in practice is that CSA tools produce a substantially higher ratio of “true positive” findings that are exploitable, while also reducing the overall volume of findings by removing vulnerabilities from Local sources. For many security teams this can be a hard pill to swallow because they have been conditioned to see hundreds (or thousands) of findings from Static Analysis tools, and have likewise grown accustomed to fighting with development teams in order to remediate vulnerabilities.
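Here is a rough sketch of how the two reporting policies differ. The `Flow` record and the four sample flows are invented purely for illustration and are not measurements from any real scan; the point is only the difference between the two filter conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One source-to-sink path found in a (hypothetical) data flow map."""
    origin: str      # where the input came from
    remote: bool     # True if an unprivileged user controls the input
    sanitized: bool  # True if the path passes through a sanitizer
    sink: str        # where the data ends up


def legacy_findings(flows):
    # Legacy policy: every source is tainted until it is sanitized.
    return [f for f in flows if not f.sanitized]


def remote_only_findings(flows):
    # Policy described above for CSA: only unsanitized Remote sources count.
    return [f for f in flows if f.remote and not f.sanitized]


# Made-up sample data, chosen to show the filters disagreeing.
SAMPLE_FLOWS = [
    Flow("environment_variable", remote=False, sanitized=False, sink="sql_query"),
    Flow("properties_file", remote=False, sanitized=False, sink="file_path"),
    Flow("form_field", remote=True, sanitized=False, sink="sql_query"),
    Flow("uri_request", remote=True, sanitized=True, sink="html_response"),
]
```

With this sample, the legacy filter reports the two Local flows plus the unsanitized form field, while the Remote-only filter reports just the form field.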
But what about microservice architectures? Well, this is the exception that proves the rule. Since microservice architectures tend to follow the Unix principle of “doing one thing really well”, they generally consume inputs that look like Local sources in CSA tools - and often won’t have sinks that occur within the same microservice codebase. The good news is that this can be overcome with small customizations that are easily made in modern CSA tools, and is achievable with the support of open source examples.
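As a sketch of what such a customization amounts to conceptually: you extend the set of origins that count as Remote for one particular service. The function names and the `message_queue_payload` label are hypothetical, and this is not the configuration format or API of any specific CSA tool.

```python
# Hypothetical default model: origins treated as Remote out of the box.
DEFAULT_REMOTE_ORIGINS = frozenset({
    "uri_request",
    "form_field",
    "user_agent_header",
})


def build_remote_model(extra_origins=()):
    """Return the default Remote origins plus any service-specific additions."""
    return DEFAULT_REMOTE_ORIGINS | frozenset(extra_origins)


def is_remote(origin: str, model: frozenset) -> bool:
    """Check whether an origin counts as a Remote source under a given model."""
    return origin in model


# A microservice that consumes messages produced by an upstream service.
# By default the payload looks Local; the customization marks it as Remote
# because an unprivileged user may ultimately control its contents.
default_model = build_remote_model()
custom_model = build_remote_model(["message_queue_payload"])
```

Under `default_model` the queue payload is ignored, while under `custom_model` it is tracked like any other Remote source. The defaults stay untouched, so the customization only applies to the services that need it.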
Since Local sources are either instantiated at compile / run time - or come from inputs controlled by system values that cannot be changed without privileged access (such as properties files or environment variables) - it is fairly ridiculous to consider them as tainted inputs.
On the other hand, Remote sources come from inputs that are controlled by unprivileged users after the application is running (such as requests to a URI, or values provided to a form field). These are the types of sources that lead to findings by bug bounty hunters and traditional application penetration tests, and are realistically exploitable within your application.
That being said, microservice architectures are the exception that proves the rule. Sources rarely sink within the same microservice, and are often perceived as Local sources by Comprehensive Static Analysis tools. Some minor modifications are generally necessary to support this kind of architecture when performing Static Analysis, and open source examples are available to make this easy to implement in modern CSA tools.
While the Signal to Noise ratio has been extremely poor in Static Analysis tools over the last two decades due to the inclusion of Local sources as a form of tainted input, Comprehensive Static Analysis has changed this paradigm by providing development and security teams with high impact findings specifically from Remote sources. Likewise, by weeding out Local sources from the results, Comprehensive Static Analysis is reducing friction between development and security teams by drawing attention to actionable findings that drive focus toward remediation.
And that concludes Part 3 of this series on Static Analysis! I’ve got a few ideas waiting in the backlog for my next few posts, such as a series on Building a Successful DevSecOps Program, a few singleton posts about Security and DevSecOps in general - as well as some “Off-topic” ideas. Stay tuned, and in the interim remember to
git commit && stay classy!
If you found this post useful or interesting, I invite you to support my content through Patreon 😊 and thanks once again to those who already support this content!