Abstract: This article introduces the basic principles of SCA technology and its application scenarios, analyzes the industry's top commercial SCA tools, and describes technical development trends. It aims to give readers a preliminary understanding of SCA so that they can use SCA tools more accurately to find security problems in application software and thereby improve software security quality.

This article is shared from the Huawei Cloud community post “Talking About SCA Test Technology (1)”, originally written by: Security technology ape.

1. What is SCA

SCA (Software Composition Analysis) is commonly understood as a technology that identifies, manages, and tracks software by analyzing the information and features the software contains. Introducing open source software into a project to avoid reinventing the wheel is standard practice in software development today. Open source libraries are growing at a rate of 21 percent per year (source: Forrester Report), and open source security threats have become an unavoidable topic for organizations. SCA technology is one of the most effective ways to detect and manage the security of applications.

2. Fundamentals

SCA is, in theory, a general-purpose analysis method that can analyze targets written in any development language: Java, C/C++, Golang, Python, JavaScript, and so on. It focuses on file content at the file level, on how files relate to one another, and on how they are combined into the build target. The target of an SCA analysis can be either source code or compiled binary files of various types. The data objects it analyzes, such as class names, method/function names, and constant strings, are insensitive to program architecture and compilation mode: they are the same whether the target program runs on x86 or ARM, Windows or Linux. In short, SCA is an application analysis technology that works across development languages.
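As a rough illustration of why such features do not depend on the architecture, the following minimal Python sketch pulls printable strings out of raw bytes. The helper name `extract_strings`, the minimum length of 4, and the two byte blobs are assumptions made up for this example; they are not taken from any SCA tool.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> set[str]:
    """Return printable ASCII runs of at least min_len characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return {m.decode("ascii") for m in re.findall(pattern, data)}


# Two made-up blobs standing in for builds of the same code on different
# architectures: the surrounding machine-code bytes differ, but the
# embedded names and constant strings do not.
X86_LIKE = b"\x90\x90\x01" + b"demo_inflate\x00" + b"\xc3\x02" + b"demolib 1.2.0\x00"
ARM_LIKE = b"\x1f\x00\xe1" + b"demolib 1.2.0\x00" + b"\xff\x03" + b"demo_inflate\x00"

if __name__ == "__main__":
    print(sorted(extract_strings(X86_LIKE)))
    print(extract_strings(X86_LIKE) == extract_strings(ARM_LIKE))
```

Both blobs yield the same feature set, which is the property that lets one feature database serve targets built for different platforms.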

SCA analysis process: first decompress the target source code or binary package and extract features from its files, then identify and analyze those features to obtain the relationships between the parts. This yields a profile of the application (component name + version number), which is then associated with the list of known vulnerabilities.
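The steps above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the signature table `SIGNATURES`, the vulnerability list `KNOWN_VULNS`, the component `libexample`, and the identifier `VULN-EXAMPLE-1` are all hypothetical, and real tools use far richer features and matching logic.

```python
import io
import re
import zipfile

# Hypothetical signature table: component -> (version, characteristic strings).
SIGNATURES = {
    "libexample": ("1.2.0", {"libexample_init", "libexample 1.2.0"}),
}
# Hypothetical vulnerability list keyed by (component, version).
KNOWN_VULNS = {("libexample", "1.2.0"): ["VULN-EXAMPLE-1"]}


def unpack(archive: bytes) -> dict[str, bytes]:
    """Step 1: decompress the package into {file name: content}."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def extract_features(files: dict[str, bytes]) -> set[str]:
    """Step 2: collect printable strings of 4+ characters from every file."""
    features = set()
    for data in files.values():
        for match in re.findall(rb"[\x20-\x7e]{4,}", data):
            features.add(match.decode("ascii"))
    return features


def identify(features: set[str]) -> list[tuple[str, str]]:
    """Step 3: report a component when all of its signature strings appear."""
    return [(name, version)
            for name, (version, signature) in SIGNATURES.items()
            if signature <= features]


def analyze(archive: bytes) -> dict[tuple[str, str], list[str]]:
    """Step 4: associate each (component, version) with known vulnerabilities."""
    components = identify(extract_features(unpack(archive)))
    return {component: KNOWN_VULNS.get(component, []) for component in components}


def make_demo_archive() -> bytes:
    """Build a small in-memory zip that embeds the made-up library."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("lib/libexample.so",
                    b"\x7fELF\x00libexample_init\x00libexample 1.2.0\x00")
    return buf.getvalue()


if __name__ == "__main__":
    print(analyze(make_demo_archive()))
```

Nothing in the sketch executes the target, which matches the static nature of the process described here.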

The SCA analysis process does not need to run the target program, so it has few external dependencies, covers the target comprehensively, and is fast and efficient.

3. Industry TOP SCA tool analysis

According to Forrester's latest SCA report, tools are scored across 10 dimensions (see Note 3), and the report's quadrant chart ranks the industry's top 10 SCA tools.

3.1 Tool Overview Analysis

1. Five of the top 10 tools (Synopsys/Sonatype/going/JFrog/GitLab) can run SCA checks for open source software on software packages (Note 2); the other tools only support SCA checks on source code.

2. Among the five tools that support software packages, support for C/C++, Java, and .NET is good, but support for Golang, Python, and JavaScript is weak. For example, the first three languages account for more than 90% of the component objects supported by Synopsys and have a high detection rate, while Golang has a much lower detection rate.

3. SCA tools have expanded from primarily detecting open source software to also detecting typical coding problems in applications. For example, Veracode's tools can detect typical coding problems such as buffer overflow, command injection, deadlock, double free, integer overflow, UAF (use after free), format string vulnerabilities, and SQL injection.

4. Analysis of factors affecting the accuracy of SCA analysis

1. From the principle of SCA, the factors that affect analysis accuracy fall into two groups: the number of components an SCA tool supports together with its detection algorithm, and the way applications reference open source software.

2. An SCA tool decides whether an application references a component by matching the features of sample components against the features of the program under test. Therefore, the more components the tool supports, the higher the detection rate. In addition, whether the detection algorithm and feature design are sound directly affects the accuracy and efficiency of the analysis. Different SCA tool vendors have different solutions, just as fingerprint and face recognition on mobile phones differ in sensitivity and accuracy from one manufacturer to another.

3. Different applications that reference the same component may use different functions of it, and a different number of them. As a result, the number and size of the component features contained in each application differ: the more functions are referenced, the more features are included, and the fewer are referenced, the fewer features are included. The number of component features an application contains directly affects detection accuracy: the fewer there are, the harder detection becomes. Therefore, even if two applications reference the same component, it may be detected in one and missed in the other. This is especially pronounced when SCA tools analyze binaries.

4. Because of these accuracy limits, in the extreme case where a component is not detected at all, there is no way to know whether the application carries that component's vulnerabilities.
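Point 3 can be made concrete with a small Python sketch. It assumes, purely for illustration, that a tool scores a component by the fraction of its known features found in the program and reports it only above a threshold. The function names, the 0.5 threshold, and the feature sets are hypothetical and do not describe any particular vendor's algorithm.

```python
def match_ratio(component_features: set[str], program_features: set[str]) -> float:
    """Fraction of the component's known features found in the program."""
    if not component_features:
        return 0.0
    return len(component_features & program_features) / len(component_features)


def is_detected(component_features: set[str], program_features: set[str],
                threshold: float = 0.5) -> bool:
    """Report the component only when enough of its features are present."""
    return match_ratio(component_features, program_features) >= threshold


# Known features of a made-up component.
COMPONENT = {"comp_open", "comp_read", "comp_write", "comp_close"}
# App A links three of the component's functions, App B only one.
APP_A = {"main", "comp_open", "comp_read", "comp_close"}
APP_B = {"main", "comp_read"}

if __name__ == "__main__":
    for name, app in (("A", APP_A), ("B", APP_B)):
        print(name, match_ratio(COMPONENT, app), is_detected(COMPONENT, app))
```

Both applications reference the same component, yet only App A crosses the threshold. Lowering the threshold would catch App B but would also raise the chance of false matches, which is the tension vendors have to manage.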

5. Conclusion:

1. SCA tools for source files and SCA tools for binary files are complementary, each with its own advantages and disadvantages. For example, SCA on binary files can find security problems introduced by the toolchain during the build process, while SCA on source files cannot. The SolarWinds incident is a good example of this.

2. Currently, SCA tools detect known vulnerabilities in open source software by associating component name + version number with known vulnerabilities. In partial-compilation scenarios (only part of a component's code is compiled into the binary) and patched scenarios (the vulnerability has already been fixed), the false positive rate is high.

3. Scanning efficiency and accuracy pull against each other, and tool vendors have to weigh one against the other. Technology that improves accuracy without reducing scanning efficiency remains the research topic and goal of SCA tool vendors.
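The false positives mentioned in point 2 can be illustrated with a short Python sketch. Everything in it is an assumption for demonstration: the `Component` record, the `ADVISORIES` table, the `backported_fixes` field, and the `VULN-EXAMPLE-*` identifiers are hypothetical, and the second function assumes patch information that a pure name + version lookup does not have.

```python
from dataclasses import dataclass

# Hypothetical advisory table keyed by (component name, version).
ADVISORIES = {
    ("libexample", "1.2.0"): {"VULN-EXAMPLE-1", "VULN-EXAMPLE-2"},
}


@dataclass(frozen=True)
class Component:
    name: str
    version: str
    # Fixes applied by patching without changing the version number.
    backported_fixes: frozenset = frozenset()


def report_by_version(component: Component) -> set[str]:
    """What a name + version lookup reports, ignoring applied patches."""
    return set(ADVISORIES.get((component.name, component.version), set()))


def still_affected(component: Component) -> set[str]:
    """What remains once backported fixes are taken into account."""
    return report_by_version(component) - component.backported_fixes


PATCHED = Component("libexample", "1.2.0", frozenset({"VULN-EXAMPLE-1"}))

if __name__ == "__main__":
    print(sorted(report_by_version(PATCHED)))
    print(sorted(still_affected(PATCHED)))
```

The version-based report still lists `VULN-EXAMPLE-1` for the patched build, so that entry is a false positive. The same gap appears with partial compilation, where the vulnerable code may never have been built into the binary.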

Note 1: Top 10 open source software programming languages: JavaScript (51%), (10%), Java, C++ (7%), Python (7%), Ruby (%), Go (4%), C (4%), PHP (4%), TypeScript (4%), (3%), Perl, C# (2%), Shell (1%).

Note 2: A software package is a distribution package used to install and run the product. It contains the ready-to-run binaries compiled for the product, such as .so/.jar/.exe/.dll/.pyc files.

Note 3: Scoring dimensions: A. License risk management; B. Vulnerability identification; C. Active vulnerability management; D. Policy management; E. SDLC integration; F. Container and serverless scanning; G. Audit reports; H. Risk reporting; I. Remediation speed report; J. Vendor analysis
