Abstract: This paper introduces the basic principles of SCA technology, its application scenarios, an analysis of the industry's top commercial SCA tools, and technical development trends. It aims to give readers a basic understanding of SCA technology so that they can use SCA tools more accurately to find security problems in application software and thereby improve software security quality.

This article is shared from the Huawei Cloud Community post “SCA Test Technology (I)”; original author: security technology ape.

1. What is SCA

SCA (Software Composition Analysis) is a technology that identifies, manages, and tracks software by analyzing the information and characteristics the software contains. In today's software development, introducing open source software into a project is a familiar way to avoid reinventing the wheel. Open source software in open source libraries is growing at a rate of 21% per year (source: Forrester report), and open source security threats have become an unavoidable topic for organizations. Using SCA technology is one of the most effective ways to detect and manage the security of applications.

2. Basic principles

SCA is in theory a general-purpose analysis method that can analyze targets written in any development language, such as Java, C/C++, Golang, Python, and JavaScript. It works at the file level, focusing on the contents of files, how files relate to one another, and how they are combined into a target. The targets of SCA analysis fall into two forms: source code and compiled binary files of various types. The data objects analyzed, such as class names, method/function names, and constant strings, are insensitive to program architecture and compilation mode: they are the same whether the target application runs on an x86 or an ARM platform, and whether it is a Windows or a Linux program. In short, SCA is an application analysis technology that works across development languages.
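
To make the platform-independence point concrete, here is a minimal sketch that pulls printable strings out of raw bytes, similar in spirit to the Unix `strings` tool. The two byte blobs, the function name `demo_parse_header`, and the component `demolib 1.2.3` are all invented for illustration; real SCA tools use far richer features than this.

```python
import re

def extract_strings(blob, min_len=4):
    """Return the set of printable ASCII runs of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return {match.decode("ascii") for match in pattern.findall(blob)}

# Hypothetical builds of the same library: the surrounding machine-code bytes
# differ between platforms, but the embedded names and constants do not.
x86_build = (b"\x55\x48\x89\xe5" + b"demo_parse_header\x00"
             + b"\xc3\x90" + b"demolib 1.2.3\x00")
arm_build = (b"\xfd\x7b\xbf\xa9" + b"demo_parse_header\x00"
             + b"\xc0\x03\x5f\xd6" + b"demolib 1.2.3\x00")

x86_features = extract_strings(x86_build)
arm_features = extract_strings(arm_build)
print(x86_features == arm_features)
```

Both builds yield the same two strings, so a signature built from them would match on either platform.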

The SCA analysis process is as follows: first decompress the target source code or binary file and extract features from the files; then identify and analyze those features to work out how the parts relate to each other. This yields a portrait of the application (component name + version number), which is then correlated with the list of known vulnerabilities.
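
The steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular tool works: `FEATURE_DB`, `KNOWN_VULNS`, the component names, the identifier `DEMO-2021-0001`, and the 0.5 matching threshold are all hypothetical, and features are simply NUL-separated text fragments.

```python
import io
import zipfile

# Hypothetical feature database: (name, version) -> characteristic strings.
FEATURE_DB = {
    ("demolib", "1.2.3"): {"demo_parse_header", "demo_read_chunk", "demolib 1.2.3"},
    ("otherlib", "0.9.0"): {"other_init", "other_close", "otherlib 0.9.0"},
}

# Hypothetical vulnerability list keyed by component name + version.
KNOWN_VULNS = {("demolib", "1.2.3"): ["DEMO-2021-0001"]}

def unpack(archive_bytes):
    """Step 1: decompress the target into {file name: content}."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}

def extract_features(files):
    """Step 2: collect NUL-separated text fragments from every file."""
    features = set()
    for content in files.values():
        for chunk in content.split(b"\x00"):
            text = chunk.decode("ascii", errors="ignore").strip()
            if text:
                features.add(text)
    return features

def identify(features, min_ratio=0.5):
    """Step 3: keep components whose matched feature share reaches min_ratio."""
    return [component for component, signature in FEATURE_DB.items()
            if len(signature & features) / len(signature) >= min_ratio]

def correlate(components):
    """Step 4: look up known vulnerabilities by name + version."""
    return {component: KNOWN_VULNS.get(component, []) for component in components}

def analyze(archive_bytes):
    return correlate(identify(extract_features(unpack(archive_bytes))))

# Build a small in-memory package to analyze.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("lib/libdemo.so",
                b"demo_parse_header\x00demo_read_chunk\x00demolib 1.2.3\x00")
    zf.writestr("bin/app", b"main\x00app_start\x00")

report = analyze(buf.getvalue())
print(report)
```

Nothing in the package is executed at any point; the analysis only reads bytes.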

Because the target program does not need to be run during SCA analysis, SCA has few external dependencies, analyzes the target comprehensively, and is fast and efficient.

3. Top SCA tool analysis in the industry

Forrester's latest SCA report rated the different tools on 10 dimensions (note 3). Based on the overall scores, the following chart of the top 10 SCA tools in the industry was compiled:



3.1 Tool overview analysis

  1. Five of the top 10 tools support SCA inspection of open source software in software packages (note 2) (Synopsys/Sonatype/going/JFrog/GitLab); the other tools support only source code SCA inspection.
  2. Among the five tools that support package SCA, C/C++, Java, and .NET are well supported, while Golang, Python, and JavaScript are weakly supported. For example, the first three languages account for over 90% of the components Synopsys supports, with a correspondingly high detection rate, while its detection rate for Golang components is much lower.
  3. SCA tools, whose main use is detecting open source software in applications, are expanding toward detecting typical coding problems. The going tool, for example, can detect typical coding problems such as buffer overflows, command injection, deadlocks, double frees, integer overflows, UAF (use after free), format string vulnerabilities, and SQL injection.

4. Factors influencing the accuracy of SCA analysis

  1. From the principles of SCA, two factors affect the accuracy of analysis: the number of components and the detection algorithms an SCA tool supports, and the way applications reference open source software.
  2. SCA tools decide whether an application references a component by matching the features of sample components against the features found in the application under test. The more components a tool supports, the higher its detection rate; the fewer it supports, the more components it misses. Whether the detection algorithm and feature design are sound also directly affects the accuracy and efficiency of the analysis. Different SCA tool vendors have different solutions, just as fingerprint and face recognition on mobile phones differ in sensitivity and accuracy from one manufacturer to another.
  3. Applications differ in how they reference open source software. Even when different applications reference the same component, they may use different functions of it, and different numbers of functions, so the number and size of the component features contained in each application differ: the more functions referenced, the more features included, and the fewer functions referenced, the fewer features included. The number of component features in an application directly affects the detection accuracy of SCA tools: the fewer the features, the harder the component is to detect. As a result, even if two applications reference the same component, the component may be detected in one and missed in the other. This is particularly noticeable when SCA tools analyze binaries.
  4. Because of these limits on accuracy, in the extreme case where a component cannot be detected, there is no way to know whether the application carries vulnerabilities from that component.
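
The effect described in point 3 can be shown with a small sketch. It assumes a tool reports a component once a fixed share of its recorded features is found in the application; the signature, the two applications, and the 0.5 threshold are all hypothetical, and real tools use more elaborate scoring.

```python
# Hypothetical signature: the features a tool has recorded for one component.
SIGNATURE = {
    "demo_open", "demo_close", "demo_read", "demo_write",
    "demo_seek", "demo_stat", "demo_lock", "demo_unlock",
}

def match_ratio(app_features, signature=SIGNATURE):
    """Share of the component's features that appear in the application."""
    return len(app_features & signature) / len(signature)

def is_detected(app_features, threshold=0.5):
    """The component is reported only when enough of its features are present."""
    return match_ratio(app_features) >= threshold

# Both applications reference the same component, but app_b uses one function,
# so only one of the component's features ends up in its binary.
app_a = {"main", "demo_open", "demo_close", "demo_read",
         "demo_write", "demo_seek", "demo_stat"}
app_b = {"main", "demo_open"}

for name, features in (("app_a", app_a), ("app_b", app_b)):
    print(name, match_ratio(features), is_detected(features))
```

Here `app_a` matches 6 of 8 features and is detected, while `app_b` matches 1 of 8 and is missed. Lowering the threshold would catch `app_b` but would also raise the chance of false matches against unrelated code.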

5. Conclusion

  1. Source code SCA tools and binary SCA tools complement each other, and each has its own advantages and disadvantages. For example, binary SCA can find security issues introduced during the build process and by the toolchain, which source code SCA cannot; the SolarWinds incident illustrates this point well.
  2. At present, SCA tools detect known vulnerabilities in open source software by correlating component name + version number with known vulnerabilities. In partial compilation scenarios (only part of a component's code is compiled into the binary) and patch scenarios (the vulnerability has already been fixed), the false positive rate is high.
  3. Scanning efficiency and accuracy pull against each other, and tool vendors have to weigh the trade-off. Technology that improves accuracy without reducing scanning efficiency remains a constant goal of research for SCA tool vendors.
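
The partial compilation false positive in point 2 can be illustrated with a short sketch. The advisory record, its identifier, and the function-level check are hypothetical and are not taken from any tool discussed here; the function-level check is included only as a contrast to show where the false positive comes from.

```python
# Hypothetical advisory that records which function contains the flaw.
ADVISORIES = [
    {"id": "DEMO-2021-0001",
     "component": ("demolib", "1.2.3"),
     "vulnerable_function": "demo_parse_header"},
]

def flag_by_version(component):
    """Name + version correlation: report every advisory for the component."""
    return [a["id"] for a in ADVISORIES if a["component"] == component]

def flag_by_function(component, functions_in_binary):
    """Finer check: report only if the vulnerable function is in the binary."""
    return [a["id"] for a in ADVISORIES
            if a["component"] == component
            and a["vulnerable_function"] in functions_in_binary]

# Partial compilation: only demo_read_chunk was built into the binary.
partial_build = {"demo_read_chunk"}

by_version = flag_by_version(("demolib", "1.2.3"))
by_function = flag_by_function(("demolib", "1.2.3"), partial_build)
print(by_version, by_function)
```

Name + version correlation reports the advisory even though the vulnerable function was never compiled in, which is the false positive described above.
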
  • Note 1: Top 10 Open Source Software Programming Languages: JavaScript (51%), (10%), Java, C++ (7%), Python (7%), Ruby (%), Go (4%), C (4%), PHP (4%), TypeScript (4%), (3%), Perl, C# (2%), Shell (1%).
  • Note 2: A software package is a distribution package used to install and run a product. It contains the runnable binary files compiled from the product, such as .so/.jar/.exe/.dll/.pyc files.
  • Note 3: A. License risk management; B. Vulnerability identification; C. Active vulnerability management; D. Policy management; E. SDLC integration; F. Container and serverless scanning; G. Audit report; H. Risk reporting; I. Repair speed report; J. Vendor's own analysis.
