Geek xiaojun, a post-80s developer focused on web technology. I don't have to beat the smart people, I only need to beat the lazy ones, and I will surpass most people! First published as an original article on nuggets (Juejin). Personal blog: 👉 cnblogs.com 👈

Greedy and non-greedy matching in PHP regular expressions


Greedy matching

What is greedy matching? A greedy quantifier consumes as many characters as it can while still letting the whole pattern match. Without further ado, let's look at an example:

$string='aaaaaaabbbbbbbbbbbbccccccc';
// Use either of the following patterns
$pattern='/ab+/';
// Or as follows (this assignment replaces the one above)
$pattern='/ab.+/';
preg_match($pattern,$string,$arr);
// Dump the match array
print_r($arr);

Quantifiers in PHP regular expressions are greedy by default. To stop a quantifier from being greedy, append ? to it. The code is as follows:

$string='aaaaaaabbbbbbbbbbbbccccccc';
// Use either of the following patterns
$pattern='/ab+?/';
// Or as follows (this assignment replaces the one above)
$pattern='/ab.+?/';
preg_match($pattern,$string,$arr);
// Dump the match array
print_r($arr);
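To make the difference visible, here is a minimal sketch that runs the greedy and non-greedy forms side by side. The short sample string and the variable names are assumptions chosen for readability; they are not from the original example.

```php
<?php
// Hypothetical sample input, kept short so the results are easy to read.
$subject = 'aabbbcc';

// Greedy: b+ takes every consecutive b it can.
preg_match('/ab+/', $subject, $greedy);

// Non-greedy: b+? stops after the first b, since nothing after it forces more.
preg_match('/ab+?/', $subject, $lazy);

// Greedy .+ runs to the end of the string; non-greedy .+? takes one character.
preg_match('/ab.+/', $subject, $greedyDot);
preg_match('/ab.+?/', $subject, $lazyDot);

echo $greedy[0], PHP_EOL;    // abbb
echo $lazy[0], PHP_EOL;      // ab
echo $greedyDot[0], PHP_EOL; // abbbcc
echo $lazyDot[0], PHP_EOL;   // abb
```

Note that a non-greedy quantifier only takes the minimum when the rest of the pattern allows it; if something must match afterwards, it keeps expanding until that part succeeds.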

Small case 1: this one depends on whether the s modifier (single-line mode) is used. With s, the dot also matches newlines, so a greedy .+ can run across line breaks and swallow far more than intended. When searching for all matches, make the quantifier non-greedy. The code is as follows:

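$string='test test <a href="http://www.163.com">test test</a>';

// Match a URL (note the escaped dot before the top-level domain)
$pattern='/http:\/\/(ftp|www)\.\w+\.(com|org|net)/';
// Match the href value (this assignment replaces the one above).
// Without the ? after .+ the match would be greedy
$pattern='/href="(.+?)"/s';

preg_match_all($pattern,$string,$arr);
// Dump the match array
print_r($arr);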

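The interaction between the s modifier and greediness is easier to see with more than one link. The following sketch assumes a hypothetical two-line HTML fragment (the a.example and b.example URLs are placeholders) and compares three variants of the href pattern.

```php
<?php
// Hypothetical input: two links separated by a newline.
$html = '<a href="http://a.example">A</a>' . "\n"
      . '<a href="http://b.example">B</a>';

// Greedy, no s: the dot stops at the newline, so each line is matched on its own.
preg_match_all('/href="(.+)"/', $html, $greedyNoS);

// Greedy with s: the dot crosses the newline, so one match runs from the
// first href=" to the last double quote in the whole string.
preg_match_all('/href="(.+)"/s', $html, $greedyS);

// Non-greedy with s: each capture ends at the nearest closing quote.
preg_match_all('/href="(.+?)"/s', $html, $lazyS);

echo count($greedyNoS[1]), PHP_EOL; // 2
echo count($greedyS[1]), PHP_EOL;   // 1
print_r($lazyS[1]);                 // http://a.example and http://b.example
```

The greedy pattern without s only looks correct here because each line holds a single quoted attribute; the non-greedy form does not depend on that layout.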

Small case 2: delete all the comments in a class file. For example, the Car.class.php file contains the following:


      

<?php

/**
 * @description
 * @author 3#
 */
interface Car{
    function run();
}

/**
 * @description BMW
 */
class Bmw implements Car{
    public function run(){
        echo 'BMW is running!!';
    }
}

/**
 * @description Mercedes-Benz
 */
class Bz implements Car{
    public function run(){
        echo 'Mercedes is running!!';
    }
}

// $Bmw=new Bmw();
// $Bmw->run();
// echo '<br>';
// $Bz=new Bz();
// $Bz->run();

?>

Next, I want to strip every /** ... */ comment out of Car.class.php. A first attempt looks like this:

// Load the file
$file='./Car.class.php';
$content=file_get_contents($file);
// Regex that matches /** ... */ comments (greedy)
$pattern='/\/\*\*.*\*\//s';
// Replace each match with an empty string
$result=preg_replace($pattern,'',$content);
// Finally, write the processed text back to the file
file_put_contents($file,$result);

This may look fine, but you are in for a surprise when you open Car.class.php: only one class is left in the file, and the interface and the other class are gone!

class Bz implements Car{
    public function run(){
        echo 'Mercedes is running!!';
    }
}

==Reason==: .* is greedy. With the s modifier it matches from the first /** all the way to the last */ in the file, so everything in between, code included, falls inside the match and is deleted.

Disabling greedy matching

The right approach is to add ? after the quantifier to stop the greedy match. The code is as follows:

// Load the file
$file='./Car.class.php';
$content=file_get_contents($file);
// Regex that matches /** ... */ comments, with greedy matching disabled
$pattern='/\/\*\*.*?\*\//s';
// Replace each match with an empty string
$result=preg_replace($pattern,'',$content);
// Finally, write the processed text back to the file
file_put_contents($file,$result);

As a result, every /** ... */ comment in Car.class.php is removed while the code is kept, and the classes in the file now look like this:

interface Car{
    function run();
}



class Bmw implements Car{
    public function run(){
        echo 'BMW is running!!';
    }
}

class Bz implements Car{
    public function run(){
        echo 'Mercedes is running!!';
    }
}
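The same comparison can be tried without touching a file. This is a minimal sketch that applies both patterns to a hypothetical in-memory string (the classes A and B are placeholders, not the Car example).

```php
<?php
// Hypothetical source text with two doc comments, kept in memory.
$source = "/** first */\nclass A {}\n/** second */\nclass B {}\n";

// Greedy: one match from the first /** to the last */, so class A is lost.
$greedy = preg_replace('/\/\*\*.*\*\//s', '', $source);

// Non-greedy: each comment is matched separately and the code survives.
$lazy = preg_replace('/\/\*\*.*?\*\//s', '', $source);

var_dump($greedy); // "\nclass B {}\n"
var_dump($lazy);   // "\nclass A {}\n\nclass B {}\n"
```

Testing a destructive replacement on a string first is a cheap safeguard, since file_put_contents overwrites the original file.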

==Small case 3==: match b followed by 1 to 5 occurrences of a (any count in between also qualifies). The {1,5} quantifier is greedy by default too, so it takes the largest count it can, which here is five. The code is as follows:

$string='cbaaaaaa';
// b followed by 1 to 5 a's, greedy
$pattern='/ba{1,5}/is';
preg_match($pattern,$string,$arr);
// Dump the match array
print_r($arr);

Adding ? after the quantifier disables the greedy match, so the minimum number of repetitions is taken instead:

$string='cbaaaaaa';
// b followed by 1 to 5 a's, non-greedy
$pattern='/ba{1,5}?/is';
preg_match($pattern,$string,$arr);
// Dump the match array
print_r($arr);
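For reference, here is a small sketch of what the two range quantifiers return on the same sample string; the variable names are assumptions.

```php
<?php
// Same sample string as in the example: c, b, then six a's.
$subject = 'cbaaaaaa';

// Greedy range: takes the upper bound, five a's.
preg_match('/ba{1,5}/i', $subject, $greedy);

// Non-greedy range: takes the lower bound, one a.
preg_match('/ba{1,5}?/i', $subject, $lazy);

echo $greedy[0], PHP_EOL; // baaaaa
echo $lazy[0], PHP_EOL;   // ba
```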

==Tip==: a single ? after a means zero or one occurrence, and it is greedy, so /ba?/ still takes one a when one is available. A second ? makes that quantifier non-greedy, so it takes the minimum, which is zero. With ?? no a is matched at all here, which gives the same result as /ba{0}/:

$string='cbaaaaaa';
// a?? is the non-greedy form of a? and matches zero a's here
$pattern='/ba??/is';
preg_match($pattern,$string,$arr);
// Dump the match array
print_r($arr);
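The three forms mentioned in the tip can be checked directly. A minimal sketch, with variable names chosen only for illustration:

```php
<?php
$subject = 'cbaaaaaa';

// a? is greedy: it takes one a when one is available.
preg_match('/ba?/', $subject, $optional);

// a?? is non-greedy: it prefers zero a's.
preg_match('/ba??/', $subject, $lazyOptional);

// a{0} always matches zero a's.
preg_match('/ba{0}/', $subject, $zero);

echo $optional[0], PHP_EOL;     // ba
echo $lazyOptional[0], PHP_EOL; // b
echo $zero[0], PHP_EOL;         // b
```

The two patterns are only equivalent when nothing follows the quantifier; in a pattern such as /ba??c/ the non-greedy a?? would still take one a if that is what lets c match.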

If you like my article, please 👉 like, comment, and follow 👈. Everyone's support is my motivation to keep going!

If there are any mistakes or inaccuracies in the content above, please point them out in the comments 👇 below. If you have better ideas, you are welcome to share them so we can learn together.