Preface

As code security becomes more widely understood, more and more developers know how to defend against language-independent vulnerabilities such as SQLi and XSS, but far fewer know about vulnerabilities and defects tied to the development language itself. These points are therefore a focus of our code audits. This article summarizes the points that most often cause problems in PHP code and that we pay attention to when auditing. (PS: this article simply lists the problems; their underlying causes are not explained in detail.)

If there are any mistakes in this article, corrections are welcome.

1. Code audit definition

Code audit means inspecting source code to find bugs that can cause security problems. It draws on a wide range of skills, including programming (understanding the logic of the code), understanding how vulnerabilities arise, and familiarity with systems and middleware.

2. Code audit approaches

1) Reverse tracing: check the parameters of sensitive functions, then trace the variables backwards to determine whether they are controllable and not strictly filtered.
2) Forward tracing: first find which files receive external input, then follow the variables as they are passed along, watching for variables that reach high-risk functions or for logic flaws along the way. Forward tracing usually digs up more than reverse tracing.
3) Judging by experience: mine feature-level vulnerabilities directly. Based on experience, decide which features of this kind of application usually have vulnerabilities, and read the whole code of those features.
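The reverse-tracing idea in 1) can be sketched as a tiny scanner that lists the lines calling sensitive functions, which are then the starting points for tracing variables backwards. This is only an illustration: the sink list, the helper name find_sinks and the sample source are assumptions, not part of any real tool.

```php
<?php
// Hypothetical list of sensitive functions (sinks); extend it for a real audit.
$sinks = ['eval', 'system', 'exec', 'unserialize', 'file_put_contents'];

/**
 * Return [line number => matched sink] for every line that calls a sink.
 */
function find_sinks(string $source, array $sinks): array
{
    $pattern = '/\b(' . implode('|', array_map('preg_quote', $sinks)) . ')\s*\(/i';
    $hits = [];
    foreach (explode("\n", $source) as $index => $line) {
        if (preg_match($pattern, $line, $match)) {
            $hits[$index + 1] = strtolower($match[1]);
        }
    }
    return $hits;
}

// Hypothetical source under audit: line 3 passes request data to system().
$sample = "<?php\n\$cmd = \$_GET['cmd'];\nsystem(\$cmd);\necho 'done';\n";
print_r(find_sinks($sample, $sinks));
```

A text match like this only finds candidates; whether the argument is attacker-controlled still has to be decided by reading the code.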

3. What a PHP code audit requires (similar for other languages)

1) PHP language features and fundamentals
2) Web front-end programming fundamentals
3) How vulnerabilities arise
4) Code audit approaches
5) Differences in behavior between systems and middleware

Vulnerability instances

TODO: keep adding real vulnerability cases for each point, for example deletion bypasses caused by differences in path handling between read/write functions such as file_put_contents, copy and file_get_contents, and delete or existence-check functions such as unlink and file_exists.

Variable overwriting with extract() and parse_str()
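A minimal sketch of the overwrite and of the two defenses (EXTR_SKIP for extract(), a result array for parse_str()). The input array and variable names are hypothetical stand-ins for request data.

```php
<?php
// Hypothetical request data standing in for $_GET.
$input = ['is_admin' => '1', 'name' => 'guest'];

// Default flag (EXTR_OVERWRITE): the attacker-controlled key replaces $is_admin.
$is_admin = false;
extract($input);
$after_overwrite = $is_admin;
var_dump($after_overwrite); // string(1) "1"

// EXTR_SKIP leaves variables that already exist untouched.
$is_admin = false;
extract($input, EXTR_SKIP);
$after_skip = $is_admin;
var_dump($after_skip); // bool(false)

// parse_str() with a result array keeps parsed values out of the current scope.
$is_admin = false;
parse_str('is_admin=1&name=Bill&age=60', $parsed);
var_dump($is_admin);      // bool(false)
var_dump($parsed['age']); // string(2) "60"
```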

The extract() function imports variables from an array (such as $_GET or $_POST), using each key as the variable name and each value as that variable's value. The parse_str() function parses variables from a query-style string such as name=Bill&age=60. If the first function is not given EXTR_SKIP or EXTR_PREFIX_SAME to handle name conflicts, or the second is not given an array to receive the variables, existing variables can be overwritten.

intval() integer overflow, truncation, and integer checks

The signed integer range on a 32-bit system is -2147483648 to 2147483647; on a 64-bit system the maximum is 9223372036854775807. Therefore, on 32-bit systems intval('1000000000000') returns 2147483647. intval(10.99999) returns 10: intval() and (int) casts truncate rather than round. Conversion of a string starts only at a digit or sign, and stops at the first non-digit character or at the terminator (\0).
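The truncation and saturation described above can be checked with a short script. This is a sketch; the sample values are arbitrary, and PHP_INT_MAX is used so the last line holds on both 32-bit and 64-bit builds.

```php
<?php
var_dump(intval(10.99999));          // int(10): truncated, not rounded
var_dump((int) -7.9);                // int(-7): truncation is toward zero
var_dump(intval(" \n\r\t 12abc"));   // int(12): stops at the first non-digit
var_dump(intval('abc'));             // int(0): no leading digits

// A numeric string that is too large saturates at the platform maximum.
var_dump(intval('420000000000000000000') === PHP_INT_MAX); // bool(true)
```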

Comparison problems caused by floating-point precision

PHP cannot distinguish decimal differences smaller than about 10^-16

var_dump(1.000000000000000 == 1) >> TRUE
var_dump(1.0000000000000001 == 1) >> TRUE

is_numeric() differs from intval()

When deciding whether a value is a number, is_numeric() ignores ' ', '\t', '\n', '\r', '\v' and '\f' at the beginning of the string. A '.' can appear anywhere, and E or e can appear in the middle of the argument, and the value is still judged to be numeric. That is, is_numeric(" \n\r\t 0.1e2") >> TRUE.

intval() ignores '', ' ', '\r\n', '\t', '\n' and '\0', that is, intval(" \n\r\t 12") >> 12.
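A short sketch of why the two functions should not be mixed when validating input: they accept different strings. The helper is_plain_int is a hypothetical stricter check, not a PHP built-in.

```php
<?php
$value = " \n\r\t 0.1e2";
var_dump(is_numeric($value));    // bool(true): leading whitespace and exponent accepted

var_dump(is_numeric('12abc'));   // bool(false)
var_dump(intval('12abc'));       // int(12): intval() still extracts a number

/** Hypothetical stricter check: optional minus sign followed by digits only. */
function is_plain_int(string $s): bool
{
    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match('/^-?[0-9]+$/D', $s) === 1;
}

var_dump(is_plain_int('12'));    // bool(true)
var_dump(is_plain_int(" 12"));   // bool(false)
var_dump(is_plain_int("12\n"));  // bool(false)
```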

strcmp() array comparison bypass

int strcmp ( string $str1 , string $str2 )

Parameters: str1 is the first string, str2 is the second string. Returns < 0 if str1 is less than str2, > 0 if str1 is greater than str2, and 0 if they are equal.

strcmp() expects strings. Before PHP 8, passing an array makes it return NULL with a warning, and NULL == 0 is true, so a check of the form strcmp(...) == 0 is bypassed.
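One way to avoid this class of bypass is to reject non-string input before comparing, and to compare strictly. The sketch below assumes a hypothetical helper secret_matches; hash_equals() is a PHP built-in (5.6 and later).

```php
<?php
// The bypass relies on this loose comparison being true.
var_dump(null == 0); // bool(true)

/** Hypothetical helper: type-checked, strict comparison of a secret. */
function secret_matches(string $expected, $input): bool
{
    if (!is_string($input)) {
        return false; // arrays such as password[]=x are rejected here
    }
    return hash_equals($expected, $input);
}

var_dump(secret_matches('s3cret', 's3cret')); // bool(true)
var_dump(secret_matches('s3cret', ['x']));    // bool(false)
```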

sha1() and md5() array comparison bypass

sha1() and md5() accept string arguments by default, but return NULL if the argument is an array (PHP 8 throws a TypeError instead). A comparison such as sha1($_GET['name']) === sha1($_GET['password']) can therefore be bypassed by passing arrays for both parameters.

Weak typing == bypass

This problem is widespread, so it is not explained at length here.

md5('240610708'); // 0e462097431906509019562988736854
md5('QNKCDZO'); // 0e830400451993494058024219903391
md5('240610708') == md5('QNKCDZO')
md5('aabg7XSs') == md5('aabC9RqS')
sha1('aaroZmOk') == sha1('aaK1STfY')
sha1('aaO8zKZF') == sha1('aa3OFF9m')
'0010e2' == '1e3'
'0x1234Ab' == '1193131'
'0xabcdef' == ' 0xabcdef'
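The first pair above can be reproduced directly. Both digests have the form 0e followed by digits, so == compares them as numbers (zero times a power of ten), while === and hash_equals() compare them as strings. This is a sketch of the comparison only, not of any particular application.

```php
<?php
$a = md5('240610708');
$b = md5('QNKCDZO');

var_dump($a == $b);             // bool(true): both are read as the number 0
var_dump($a === $b);            // bool(false)
var_dump(hash_equals($a, $b));  // bool(false)
```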

When converted to boolean, only the following are considered FALSE: FALSE, 0, 0.0, "", "0", array(), NULL

Prior to PHP 7, if an octal literal contained an invalid digit (8 or 9), the rest of the number was ignored: var_dump(0123) and var_dump(01239) both print 83.

From PHP 7 onwards, a Parse Error is generated instead.

When a string is converted to a number, the leading numeric part is used and the non-numeric characters that follow are dropped. If the string does not start with a number, it converts to 0. (The results below are for PHP 5 and PHP 7; PHP 8 changed how non-numeric strings compare with numbers.)

$foo = 1 + "bob-1.3e3"; // $foo is integer (1)
$foo = 1 + "bob3"; // $foo is integer (1)
$foo = 1 + "10 Small Pigs"; // $foo is integer (11)
'' == 0 == false
'123' == 123
'ABC' == 0
'123a' == 123
'0x01' == 1
'0e123456789' == '0e987654321'
[false] == [0] == [NULL] == ['']
NULL == false == 0
true == 1
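Some of the comparisons above depend on the PHP version, so the sketch below sticks to pairs that compare loosely equal on both PHP 7 and PHP 8, and contrasts == with ===. The pair list is an illustrative selection.

```php
<?php
// Pairs chosen for illustration; each is loosely equal but not identical.
$pairs = [
    ['0e123456789', '0e987654321'], // both numeric strings equal to 0
    ['123', 123],
    ['1e3', '1000'],
    [null, false],
    [[], false],
    ['0', false],
];

$results = [];
foreach ($pairs as [$left, $right]) {
    $results[] = [$left == $right, $left === $right];
    printf(
        "%s vs %s: loose=%s strict=%s\n",
        json_encode($left),
        json_encode($right),
        var_export($left == $right, true),
        var_export($left === $right, true)
    );
}
```

Using === (or an explicit type check before comparing) removes this whole class of surprises.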

eregi() match bypass

eregi() accepts string arguments by default and returns NULL if an array is passed.

PHP variable names cannot contain dots [.] or spaces; in incoming parameter names they are converted to underscores [_].

parse_str("na.me=admin&pass wd=123", $test);
var_dump($test);

array(2) {
  ["na_me"]=>
  string(5) "admin"
  ["pass_wd"]=>
  string(3) "123"
}

in_array() uses loose comparison (with type conversion) by default

in_array("1asd", array(1,2,3,4)) => true
in_array("1asd", array(1,2,3,4), true) => false (strict comparison requires the strict parameter to be set to true)

htmlspecialchars() escapes double quotes but not single quotes by default (the default changed to ENT_QUOTES in PHP 8.1). To escape both kinds of quotes on older versions, including PHP 4 and PHP < 5.2.1, pass the ENT_QUOTES flag.

sprintf() formatting bug (can eat the backslash that escapes a single quote)

printf() and sprintf() support padding: a % followed by a padding specifier.

For example, %10s pads the string on the left with spaces to a width of 10, and %010s pads with the character 0. To pad with any other character, prefix it with a single quote: %'#10s pads with #. (The percent sign does not only eat a single quote this way; it will also eat a backslash.)

sprintf() also accepts an explicit argument position.

The number after % selects which argument is used, and the $ separates that number from the rest of the format specifier, as in %1$s.
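The padding and positional forms described above, as a small sketch with arbitrary sample strings:

```php
<?php
echo sprintf('[%10s]', 'abc'), "\n";          // [       abc]
echo sprintf('[%010s]', 'abc'), "\n";         // [0000000abc]
echo sprintf("[%'#10s]", 'abc'), "\n";        // [#######abc]

// Positional arguments: %2$s is the second argument, %1$s the first.
echo sprintf('[%2$s-%1$s]', 'a', 'b'), "\n";  // [b-a]

// Position and custom padding combined; the $ is escaped inside double quotes.
echo sprintf("[%1\$'#10s]", 'abc'), "\n";     // [#######abc]
```

Note that in the last two forms the character right after the padding quote is consumed as the padding character, which is what the quote-eating behavior below builds on.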

So when our input has been wrapped in quotes with its special characters escaped, but the statement is then assembled with sprintf(), the escaping can be undone.

For example, in %1$'%s' the ' after %1$ is treated as the padding flag, so it is consumed and the following ' escapes.

Another case is when ' has been escaped to \'. For example, the input %' and 1=1# passes through a filter that turns ' into \', so the statement becomes:

select * from user where username = '%\' and 1=1#';

If the statement is then assembled with sprintf(), the \ after % is eaten, and the ' escapes.

<?php
$sql = "select * from user where username = '%\' and 1=1#';";
$args = "admin";
echo sprintf($sql, $args);
// result: select * from user where username = '' and 1=1#'
?>

This approach can run into the error PHP Warning: sprintf(): Too few arguments

In that case we can use %1$ to eat the \ added by escaping

<?php
$sql = "select * from user where username = '%1$\' and 1=1#' and password='%s';";
$args = "admin";
echo sprintf($sql, $args);
// result: select * from user where username = '' and 1=1#' and password='admin';
?>

In PHP, the = assignment operator has higher precedence than and
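A small sketch of this precedence rule, using arbitrary sample values: with and, the assignment happens first, so only the left-hand test reaches the variable; with && both tests are combined before the assignment.

```php
<?php
$a = '1';
$b = 'abc';

// Parsed as ($c = is_numeric($a)) and is_numeric($b)
$c = is_numeric($a) and is_numeric($b);

// Parsed as $d = (is_numeric($a) && is_numeric($b))
$d = is_numeric($a) && is_numeric($b);

var_dump($c); // bool(true)
var_dump($d); // bool(false)
```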

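The statement $c = is_numeric($a) and is_numeric($b) is meant to require that both $a and $b are numeric, but in fact $c only receives the result of is_numeric($a); whether $b is numeric does not affect $c.

parse_url and libcurl parse URLs differently, which can lead to SSRF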

When a URL contains multiple @ symbols, parse_url takes the host after the last @, while libcurl takes the host after the first @. So when parsing http://[email protected]:[email protected], PHP gets host = baidu.com, while libcurl gets eval.com. In addition, for a URL like https://[email protected], PHP gets the host [email protected], but libcurl gets the host evil.com.
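The parse_url side of this mismatch can be observed with a sketch like the one below. The URL and its host names are hypothetical placeholders, and rejecting every URL that carries a user-info part is one possible precaution, not a complete SSRF defense. How libcurl reads the same URL depends on the libcurl version, so it is not exercised here.

```php
<?php
// Hypothetical URL with two @ signs; the host names are placeholders.
$url = 'http://foo@evil.com:80@google.com/';

$parts = parse_url($url);
var_dump($parts['host']); // string(10) "google.com": taken after the last @

/** Hypothetical precaution: refuse URLs that contain a user-info part at all. */
function has_userinfo(string $url): bool
{
    $parts = parse_url($url);
    return is_array($parts) && (isset($parts['user']) || isset($parts['pass']));
}

var_dump(has_userinfo($url));                  // bool(true)
var_dump(has_userinfo('http://google.com/'));  // bool(false)
```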

The flexibility of the URL standard allows filter_var and parse_url checks to be bypassed for SSRF

filter_var() returns false for evil.com;google.com because the URL is malformed, but returns true for 0://evil.com:80;google.com:80/, 0://evil.com:80,google.com:80/ and 0://evil.com:80\google.com:80/.

Retrieving web content via file_get_contents and returning it to the client can cause XSS