Background

Tang Qingsong is a security engineer at Beijing Qujia Technology Co., Ltd. and the author of the printed book “PHP Web Security Development in Practice”. He specializes in enterprise security and SDL construction. He presented “PHP Security Coding Specification and Review” at PHPCon 2020 (the 8th PHP Developer Conference) and “PHP Deserialization Vulnerability Analysis Practice” at NSC 2019 (the 7th China Network Security Conference), and was a lecturer at the web security training camp of the Kanxue 2018 Security Developer Summit.

Hello, everyone. I am very glad to share the topic of building a code security system with you. My name is Tang Qingsong, and at present most of my work is in SDL. Today’s topic has a lot to do with SDL; in fact, what I am sharing is one part of it. Many of you who work on in-house security teams (Party A) also do some SDL work, so I hope this talk will be helpful to you.

This topic is part of SDL, but it is not exactly SDL, because I mainly focus on the idea of shifting security left. So today’s topic centers on managing risk in code: what are the possible risk points in code security? The work involved includes technical work and non-technical work, such as management and the learning material that goes with it.

1.1 Content Overview

Today’s topic covers four aspects of how we carry out security work: awareness, technology, supervision, and learning. I have drawn a mind map here. In the mind map, I think there are four levels of a code security system that we can work on, starting with security training.

In security training, the first thing is to tell developers what the risk points are. The second is to teach them how to get out of the pitfalls. To do that, we can take the code directly from their repository and analyze it, and then during training we can show them where their own code carries risk and explain the problem.

The third is that, having told people what they must not write and what they should write, and having set up these rules, we cannot leave it at that; there has to be a monitoring mechanism that actually keeps watch. Here I will also show you how to combine Semgrep and GitLab in a hook event to detect risk points in code in real time.

The fourth is that we always run security tests before going live. What problems can come up during security testing? I will talk about a few of them today. First, let’s talk about security training and how we do it. I believe many technical people are good at the technology itself, but they may not be good at explaining to others how people fall into pitfalls and walking them through cases.

Let me talk about how I understand securing an entire application. I think it is not measured by whether one dimension is good enough; it is a comprehensive effort. It is also multi-team work, and we, as security personnel, carry the main responsibility.

We need to do everything we can together with development and testing to make sure our applications are secure. First, we build security awareness among developers: we tell them that there are many vulnerabilities on the Internet and what harm those vulnerabilities cause. That way, when they are developing, they already have in mind that they must not write vulnerable code.

Second, once they are aware, we should teach them how to avoid these pitfalls. We should not leave them knowing that the risks exist but not knowing how to deal with them, so that they fall into the pitfalls anyway. This means our security staff need a certain level of technical ability.

The third is supervision. You have told developers that the Internet is full of vulnerabilities and taught them how to avoid them. But if nobody supervises, many people will not necessarily do what you asked, so you need some kind of oversight at this point.

The fourth is being event-driven. I believe that on an in-house team you will certainly run into security incidents that drive the work. For example, our company makes games, and from time to time there are problems such as bots or cheat plug-ins. We sort these incidents out and turn them into case studies for developers to learn from.

Second, security training

Now I am going to talk about a non-technical topic: how do we train developers? I think there are several aspects to training, and I can give you some suggestions for reference. For example, what should we cover in the first training? If I say everything at once, what is left to say next quarter? I think there is a bit of a trick to what you cover the first time.

The second aspect is how the training is run. One way is to bring in the whole development team, dozens or even hundreds of people, and present from a stage. The other is to do it group by group. I recommend doing it in groups, and I will explain why.

The third is that cases should use local materials. Before each training we must open the team’s code and take a look, and then use their own code in the training. After that, we come to the topics we train on and the form the training takes.

2.1 First basic training

What can we talk about in the first basic training? I think you can start by telling developers how you, as the in-house security team, audit their code and how you do security. That is what they really care about. Let them know about your work and which points you usually pay attention to, communicate with them fully, and build up trust.

Second, you can explain how vulnerabilities are classified. There are general coding vulnerabilities, such as SQL injection, XSS, CSRF, file upload, command execution, and code injection. Once you have covered these, you can also introduce logic vulnerabilities, such as payment vulnerabilities, permission vulnerabilities, verification code vulnerabilities, and SMS vulnerabilities, and give a general introduction to them.

When you talk about these vulnerabilities, tie them to the group in front of you. What kind of business do they do? For example, a middle-platform team may sit away from the front-line business, so it may have no payment problems and no user-related problems. Problems unrelated to their main work can be mentioned in general terms, but do not go into detail.

Third, you can teach some methods of code self-inspection: simple ways for developers to audit their own code for security problems after they finish writing it. Is this parameter filtered? Is this parameter cast to the expected type? For example, if PHP code receives an ID and does not convert it to an integer, it may accept arbitrary characters, and if that value is then concatenated into a SQL statement, you get SQL injection. So you have to tell them: filter the value when you receive it, and if you do not filter it, use PDO with prepared queries when you build the SQL statement. Then show them how to do that.
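The two habits described above can be sketched as follows. The talk is about PHP and PDO; this is a Python analogy using the standard `sqlite3` module, chosen only so the example is self-contained. The names `parse_id`, `find_user`, `demo_connection`, and the `users` table are made up for illustration.

```python
import sqlite3
from typing import Optional, Tuple


def parse_id(raw: str) -> int:
    """Force the received value to an integer; reject anything else."""
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"id must be a non-negative integer, got {raw!r}")
    return int(raw)


def find_user(conn: sqlite3.Connection, raw_id: str) -> Optional[Tuple[int, str]]:
    """Look up a user with a bound parameter instead of string concatenation."""
    user_id = parse_id(raw_id)
    # The "?" placeholder keeps the value out of the SQL text entirely.
    cursor = conn.execute("SELECT id, name FROM users WHERE id = ?", (user_id,))
    return cursor.fetchone()


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway in-memory database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
    return conn
```

Either measure alone would stop the injection in this case; showing both in training makes the point that the cast documents intent while the bound parameter protects the query even when a cast is forgotten.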

How do you check for SQL vulnerabilities? I believe technical colleagues on in-house teams have their own views, so I will not expand on it here; a general mention to the developers is enough. Generally speaking, developers only need a starting point, and then they can figure out a lot by themselves.

2.2 Group Training

For the second round of training, my suggestion is group training. Your company is probably divided into many groups. At the moment I mainly train back-end developers, so my training usually covers the back end. Each back-end group actually works with different things. For example, some groups do not use HTTP; they use sockets over TCP or something similar. If you tell them about web business vulnerabilities, they may not want to listen.

So the main purpose of group training, I think, is to teach people according to their needs. Each group has its own characteristics, and you have to speak to those characteristics. Try to keep every training session within 10 people, and limit it to 45 minutes. That is how long a class usually lasts, right? No more than 45 minutes.

You need to be aware of the time. Do not talk nonstop while nobody listens; you need some interaction. There are three things I want to point out. The first is that you must start from the team’s real code. Every time you train a team, audit their code first; the audit will more or less turn up some security problems or some non-standard practices. Use that code as your material. Do not open with an unrelated case, or many people may not want to listen. The second is to stay close to their business scenario, which I mentioned just now. If you talk about HTTP to a TCP team, they will not want to listen, because they have no such business at all. The third is to share more in the form of stories, to create an interactive atmosphere. For example, when you explain unauthorized access, you can talk about how you once found such a vulnerability on a website and how you found it. You may also have seen other people’s cases on the Internet, and you can walk through those analysis steps too.

For example: I found a URL with an ID equal to 100 and thought I would give it a try, so I changed the 100 to 101 and hit enter. I then saw other people’s messages, other people’s order information, personal user information, and other permission-related information. I often communicate with developers through stories like this, and I think it is very effective.
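When telling this story to developers, it can help to show the missing check next to it. The following is a minimal sketch under assumed names: `Order`, `ORDERS`, `Forbidden`, and `get_order` are hypothetical and only stand in for whatever the team’s real data access code looks like.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Order:
    order_id: int
    owner_id: int
    detail: str


class Forbidden(Exception):
    """Raised when the record is missing or belongs to someone else."""


# Hypothetical in-memory store: order 100 belongs to user 1, order 101 to user 2.
ORDERS: Dict[int, Order] = {
    100: Order(100, 1, "order of user 1"),
    101: Order(101, 2, "order of user 2"),
}


def get_order(current_user_id: int, order_id: int) -> Order:
    """Return the order only if it belongs to the logged-in user."""
    order = ORDERS.get(order_id)
    # Same error for "missing" and "not yours", so IDs cannot be probed.
    if order is None or order.owner_id != current_user_id:
        raise Forbidden(f"order {order_id} is not available")
    return order
```

With this check in place, user 1 changing the ID from 100 to 101 gets an error instead of another user’s order.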

2.3 Use local materials for the case

The third is the use of local materials. Often, when we present to a group, that group may not have a vulnerability case of its own. In that case I think you should pick the nearest one: you can use a case from their department, or from the company as a whole. The case should be as close to their team as possible. So where do you get these cases? Before 2016 there were many public vulnerability cases on the Internet, including cases at various large companies. But since the Cybersecurity Law came out (it took effect on June 1, 2017), we basically no longer see them, so the more intuitive vulnerability cases are no longer easy to obtain. At this point, you can get cases from three sources.

The first is your regular code audits: save these cases and take screenshots, mask the sensitive information but keep enough to get the idea across, and collect the cases together. The second is security testing. Every time a business goes online there is a round of security testing, and the problems found there can also be sorted out and added to our case database. The third is vulnerability incidents, including general ones. For example, at the end of 2021 there was the Log4j component. I remember that a few days before the Spring Festival there was a command execution vulnerability whose impact was quite large. Incidents like this can be sorted out too. You can tell developers that they rely on components and need to update them. If you reference a component, you had better check whether its version has known vulnerabilities, and if it does, do not use it.

Generally speaking, PHP uses Composer, Java has its own package manager such as Maven, and Python has one such as pip, so we need to update dependencies in time. Do not package once and then leave things unupdated for several years; that can easily create a security hole.

Third, risk reminders

The next thing we are going to talk about is real-time risk alerting. After the first training, the quarterly training, and the case walkthroughs, we also play a supervisory role: after developers write code, we have to remind them in time, and every quarter we also run a full scan and so on.

3.1 Function of risk reminder

First, I would like to talk about what the risk reminder is for. It is mainly to reinforce awareness. You do not want to give a training and then never remind anyone. After a week or two, developers go back to writing code the way they used to, and you will find that your training has not changed much. So you have to have a reminder.

The real-time reminder can be done with a hook event on the git repository. Every time someone pushes code, we pull the code, extract the changed lines, and check whether those lines contain dangerous functions or similar problems. If there are any, we give feedback and tell the developer where the risk may be.
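As a rough illustration of the “extract the changed lines” step, here is a minimal sketch that parses unified diff text and yields each added line with its file path and new line number. It is not the implementation from the talk; `AddedLine` and `added_lines` are hypothetical names, and renames or binary files are not handled.

```python
import re
from typing import Iterator, NamedTuple, Optional

# Hunk header, e.g. "@@ -10,3 +10,4 @@"; group 1 is the first new-file line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


class AddedLine(NamedTuple):
    path: str
    lineno: int
    text: str


def added_lines(diff_text: str) -> Iterator[AddedLine]:
    """Yield every line added by a unified diff."""
    path: Optional[str] = None
    lineno = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            path, in_hunk = None, False          # a new file section begins
        elif not in_hunk and line.startswith("+++ "):
            target = line[4:]
            if target == "/dev/null":            # the file was deleted
                path = None
            else:
                path = target[2:] if target.startswith("b/") else target
        elif HUNK_RE.match(line):
            lineno = int(HUNK_RE.match(line).group(1))
            in_hunk = True
        elif in_hunk and path is not None:
            if line.startswith("+"):
                yield AddedLine(path, lineno, line[1:])
                lineno += 1
            elif line.startswith("-") or line.startswith("\\"):
                continue                         # removed line or "no newline" marker
            else:
                lineno += 1                      # unchanged context line
```

In a server-side hook, the diff text would typically come from running `git diff` between the old and new commit IDs that git passes to the hook for each pushed ref.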

There are three points to this reminder. The first is to strengthen security awareness and let developers know that someone is watching security. The second is to block security problems at the source. But the hook should not reject a push just because it sees a dangerous function. Instead, it can return a message in the git output saying that there might be a risk and asking the developer to pay attention. For example, if a variable is passed into a command execution function, the hint says to make sure that the variable is controlled and filtered.

The third is to speed up security feedback. If you have no real-time reminder and only scan the repository every two weeks, the code may already be online by then, right?

3.2 Reminder of risk function

As for the risk functions, I have just listed a few: functions that allow code injection, functions that execute system commands, the use of plaintext FTP to upload files, some cryptographic libraries, some regular expression libraries, and some information leakage alerts.

You can put all of these into your security alerts. I think there is a priority order, so put the high-risk ones in first. Something like FTP you can mention or not. Something like pprof I think you should definitely pay attention to, and phpinfo exposes important information. In Golang, code that writes raw statements directly, reads file contents, or executes system commands uses dangerous functions too, and you should give a reminder for them.
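To make the idea of a prioritized list concrete, here is a deliberately naive sketch of such a rule table as regular expressions applied to a single line of code. The rule IDs, messages, and priorities are invented for illustration; the PHP function names and the Go `net/http/pprof` import are real, but which ones belong in a team’s list is a local decision.

```python
import re
from typing import List, NamedTuple


class Rule(NamedTuple):
    rule_id: str
    priority: str
    pattern: "re.Pattern[str]"
    message: str


# Hypothetical starter rules; a real list would be tuned per team and language.
RULES: List[Rule] = [
    Rule("code-injection", "high", re.compile(r"\beval\s*\("),
         "eval() runs code from a string; make sure no user input reaches it."),
    Rule("command-execution", "high",
         re.compile(r"\b(?:exec|system|shell_exec|passthru)\s*\("),
         "System command execution; make sure the arguments are not user-controlled."),
    Rule("info-leak-phpinfo", "high", re.compile(r"\bphpinfo\s*\("),
         "phpinfo() exposes configuration details."),
    Rule("go-pprof", "high", re.compile(r'"net/http/pprof"'),
         "pprof endpoints expose runtime internals; do not expose them publicly."),
    Rule("plaintext-ftp", "low", re.compile(r"\bftp_connect\s*\("),
         "FTP sends credentials and files in plaintext."),
]


def check_line(text: str) -> List[Rule]:
    """Return every rule whose pattern appears in the given line of code."""
    return [rule for rule in RULES if rule.pattern.search(text)]
```

A pattern match like this has no notion of syntax, so it will flag comments and unrelated methods that happen to share a name, which is one reason to hand the same rules to a syntax-aware tool such as Semgrep.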

3.3 Hook Usage

How do we use the hook event I just mentioned? The principle is that a hook script is stored on the Git server, and every time someone pushes, the server triggers that script.

When the script is triggered, you can use command-line tools to find out which files were changed and which line numbers were changed. Once you have that data, you write the detection rules into a rule file, covering the dangerous functions I listed just now plus any you add yourself.

Second, Semgrep is popular and used by many teams, so I think it is a mature tool. You can check the code with Semgrep and this rule file, and then return the risks it finds. As for the specific implementation, I wrote a fairly detailed article before; you can open it and follow the steps.

Let me show you what the hook looks like. For example, I type git commit on the command line to commit the code, and the hook triggers when I push. During the push, we can see which file it is pointing at and a message like this: exec is executing a variable, which may result in command injection; make sure the content is not user-controlled. This is only a hint, and you can refine how it is displayed yourself.
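The flow in this section (run Semgrep with a rule file on the changed files, then print hints back to the person pushing) might look roughly like the sketch below. It is not the script from the speaker’s article. The helper names and the rule file path are assumptions; the `--config`, `--json`, and `--quiet` options and the `results[].path`, `start.line`, `check_id`, and `extra.message` fields are part of Semgrep’s command line and JSON output.

```python
import json
import subprocess
from typing import List, NamedTuple, Sequence


class Finding(NamedTuple):
    path: str
    line: int
    check_id: str
    message: str


def semgrep_command(rule_file: str, targets: Sequence[str]) -> List[str]:
    """Build the Semgrep command line for the given rule file and files."""
    return ["semgrep", "--config", rule_file, "--json", "--quiet", *targets]


def parse_findings(raw_json: str) -> List[Finding]:
    """Extract the fields needed for a hint from Semgrep's JSON output."""
    data = json.loads(raw_json)
    return [
        Finding(item["path"], item["start"]["line"],
                item["check_id"], item["extra"]["message"])
        for item in data.get("results", [])
    ]


def format_hint(finding: Finding) -> str:
    """Render one finding as a single line for the git push output."""
    return f"{finding.path}:{finding.line}: [{finding.check_id}] {finding.message}"


def scan(rule_file: str, targets: Sequence[str]) -> List[Finding]:
    """Run Semgrep (it must be installed) and return its findings."""
    proc = subprocess.run(semgrep_command(rule_file, targets),
                          capture_output=True, text=True, check=False)
    return parse_findings(proc.stdout) if proc.stdout.strip() else []
```

A hook script would print each `format_hint(...)` line and still exit with status 0, so the push goes through and the developer simply sees the reminder, which matches the advice not to reject pushes outright.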

Third, code audit

In code audit, we also have four points to share with you.

First, the direction of code audit: how do we audit? This part is more technical. There is a saying that there is no first place in writing and no second place in martial arts, and I believe the same is true of technology.

3.1 Code audit direction

The first direction is the audit of generic code. We can audit for things like SQL injection, XSS, command execution, and file upload, all of which can be found through generic code patterns. The second is auditing in combination with the business. For example, if the system has users, you look at password retrieval and at permission-related vulnerabilities.

The third is components, which depends on the language the repository uses: PHP uses Composer, for example, and Java uses a different form. Collect these components, determine which versions are in use, and make sure those versions have no known vulnerabilities. As for generic code audit, which I briefly mentioned earlier, it mainly comes down to three approaches.

The first is to start from received parameters. Say I am receiving an ID that I know should be an integer, but the code does not cast it to an integer. That is common in a weakly typed language like PHP. You can then track that variable: is it executed in a SQL statement, is it returned to the front end, does it reach command execution or code execution, and so on. This approach tracks an unfiltered received parameter all the way to the end of the program. That is one way to do it.

The second way is to work backwards from dangerous functions. We go to the dangerous function and see whether a variable is passed into it; if so, we trace the source of that variable. If the variable receives user input, is not filtered, and ends up in the dangerous function, then there must be a security problem, right? The third is to parse the dependency files. PHP, Java, Python, and Go all have dependency package management now, so it is actually easier to check dependencies, and we can do it for all of them. Of course, some projects still work in a more traditional way. For example, there are many systems on PHP versions below 7.0 that do not use Composer; the source code was downloaded straight into the project directory. That can be a bit more difficult to parse.
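For the dependency-file approach, a minimal sketch for PHP projects is shown below. It reads the `packages` and `packages-dev` sections of a `composer.lock` file, which record each installed package’s `name` and `version`, and compares them with a list of known-bad versions. The advisory set and the helper names are placeholders; a real check would use an actual vulnerability feed and version ranges rather than exact matches.

```python
import json
from typing import Dict, List, Set, Tuple


def locked_packages(lock_text: str) -> Dict[str, str]:
    """Map package name to locked version from composer.lock contents."""
    data = json.loads(lock_text)
    packages: Dict[str, str] = {}
    for section in ("packages", "packages-dev"):
        for pkg in data.get(section) or []:
            packages[pkg["name"]] = pkg["version"]
    return packages


def flag_vulnerable(packages: Dict[str, str],
                    advisories: Set[Tuple[str, str]]) -> List[str]:
    """Return 'name@version' for every locked package listed in the advisories."""
    return sorted(
        f"{name}@{version}"
        for name, version in packages.items()
        if (name, version) in advisories
    )
```

Projects that copied library source straight into the repository have no lock file to read, which is the harder case mentioned above.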

3.2 Tool Selection

In that case, you have to use third-party tools for the analysis. On the choice of code audit tools, I have used most of them. The first is Fortify, which I am familiar with. The second is Checkmarx, which I have not used yet because we have not bought it. The third is Code Guard from QiAnXin, which I used for a while.

The fourth is Semgrep, which I currently use the most, mainly for hook detection. Some of its features are used in our audit system as well, but for full code audits I prefer Fortify. For hook events, because Fortify parses the AST, it eats memory and is slow, so there I still mainly rely on Semgrep. The fifth is CodeQL. I was still learning it a while ago and have not used it in production, so I am not sure about it.

The sixth tool I have also used. My impression is that it is completely open source, including its rules and its engine, but for now it can only scan PHP and cannot scan other back-end languages. So at present I use Fortify and Semgrep the most. Fortify and Checkmarx are commercial, and Code Guard is commercial too. I have used Fortify, but not Checkmarx, so I do not have much to say about the latter.

I used Code Guard for a while, for about a month last December. My main impression is that it detects a similar number of vulnerabilities to Fortify, but its interface design feels very awkward to click through. It is said that there will be a new version of Semgrep in March this year, and by then people can also try out its rules, which are open source, while its engine is encrypted.

The fifth is CodeQL. GitHub already uses it at large scale, and of course you can try it too, but you can only use it for learning, not commercially. Now let me talk about how batch code auditing can be implemented. The products above support a single repository quite well. But on an in-house team I ran into a problem: I was responsible for the security of the whole company’s code base, so I could not check only a few repositories. I had to do a batch audit. Our company might have 600 repositories, and going through them one by one would drive me crazy. Opening one large project in Fortify might take a day or two, so the analysis alone could take a year. It is not really realistic to use these tools as they are.

3.3 Batch code audit tools

So I wrote a batch code audit tool called QingScan. It has four parts: the first is information collection, the second is black-box detection, the third is code audit, and there is also one for special-purpose use. I am going to focus on the white-box audit. Its main job is to pull your project down, call Fortify, Semgrep, and the other code audit tools, and scan with them one by one; when one project is finished, it moves on to the next. It can also be distributed. I have used it in our company so far, and I recommend you try it out. The address is here:

Github.com/78778443/Qi…
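The batch idea itself (pull each repository, run every scanner on it, move on to the next) can be sketched in a few lines. This is not QingScan’s code; `audit_all`, `git_clone`, and the `Scanner` type are hypothetical names, and each scanner is simply a function that takes a checkout directory and returns a list of findings.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# A scanner takes the checkout directory and returns its findings as strings.
Scanner = Callable[[Path], List[str]]


def git_clone(url: str, dest: Path) -> None:
    """Shallow-clone one repository; raises if git fails."""
    subprocess.run(["git", "clone", "--depth", "1", url, str(dest)], check=True)


def audit_all(repo_urls: Iterable[str],
              scanners: Dict[str, Scanner],
              clone: Callable[[str, Path], None] = git_clone,
              ) -> Dict[str, Dict[str, List[str]]]:
    """Clone each repository in turn and run every scanner against it."""
    report: Dict[str, Dict[str, List[str]]] = {}
    for url in repo_urls:
        # The checkout is deleted as soon as this repository has been scanned.
        with tempfile.TemporaryDirectory() as tmp:
            dest = Path(tmp) / "repo"
            try:
                clone(url, dest)
            except (subprocess.CalledProcessError, OSError) as exc:
                report[url] = {"error": [str(exc)]}
                continue
            report[url] = {name: scan(dest) for name, scan in scanners.items()}
    return report
```

Because each repository is handled independently, the loop body is also the natural unit to hand out to several workers when the scan is distributed.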

Fourth, security testing

Next, security testing. It mainly covers these parts: web site testing, API interface testing, private protocol testing, and case output.

4.1 Web Site Testing

I think web site testing is relatively routine, without much technical difficulty.

For example, we test for SQL injection and XSS. Generally speaking, though, SQL injection and XSS have become relatively rare. For XSS, reflected XSS may be the more common kind, but I think its impact is limited, because cookies are now generally encrypted or set to HttpOnly. So there is not much to say.

4.2 API interface test

How is an API tested? An API is a little different from a web site. For a web site, we can crawl the addresses and then scan them. We verify the scan results, and then we report the confirmed ones.

The problem with an API is that we cannot crawl it. So we usually open a port with Xray, listening as a proxy, and point the phone’s proxy at that Xray address. Then we use the app to send some requests, collect the addresses, and scan them. The developers also give us a list of URLs. In addition, the functional testers have their own test list, and we run logic tests against those addresses, such as checks for unauthorized access and payment vulnerabilities.

4.3 Difficulties in testing private protocols

The third is private protocol testing, which is more troublesome, for example socket-based protocols over TCP. We have no way to parse the packets directly unless we have a client that simulates them. So far we have done this only for a few key projects: we go through the data format between the client and the server with the developers and then run simulated tests. That is a lot of work. So for private protocols it depends on whether you have enough staff. There are not many good methods; you can only simulate a client for the private protocol.

Traditional external site testing is the simplest: collect the addresses, test the general vulnerability points, and then test the business functions, such as unauthorized access, payment, password retrieval, and CAPTCHA. For APIs, the main thing is to get the addresses first; after that, the other tests are much the same. There are two ways to get the addresses. The first is to get a list of APIs from the development team and figure out what each parameter does. These interfaces then go through scanning tests and logic tests, not too different from the traditional approach.

The second is for when we did not get everything from the development team. We can open a port with Xray or with Burp Suite, set up the proxy on the phone, route the traffic through it, and collect a batch of addresses. Private protocols, as just mentioned, are more troublesome: there is no easy way to understand the packet format. It is also hard to handle each packet; you have to do it with a program, because there is hardly any way to change the data by hand, and some of the data is in hexadecimal. So whether the traffic is ciphertext or plaintext, you have to simulate a client that builds the packets. It depends on whether you have enough manpower; if you do not, the testing will not achieve much.

4.4 Case Output

Every time we find a vulnerability or handle an emergency response incident, we can record it in our security system, so that we accumulate experience.

Here is a picture, from our team, of vulnerabilities across the company as a whole. There is a quarterly report, with quarterly statistics for each department and statistics by vulnerability category.

Fifth, summary

Today I talked about four things: training, building the hook, code audit, and finally security testing. That is all for this topic. I hope it helps you. See you later.


Author: Tang Qingsong Date: March 15, 2022