Preface

I was recently puzzled while reading about PHP7 garbage collection: some code examples on the web gave different results when I ran them in my local environment. On reflection, the cause is not hard to find. Most of those articles date from the PHP 5.x era, and PHP7 adopted a new zval structure for which relatively little material is available. So I combined what I could find into this summary, which focuses on how reference counting works in the new zval container. If anything here is wrong, corrections are welcome.

New zval structure in PHP7

Look at the code first!

struct _zval_struct {
	union {
		zend_long         lval;             /* long value */
		double            dval;             /* double value */
		zend_refcounted  *counted;
		zend_string      *str;
		zend_array       *arr;
		zend_object      *obj;
		zend_resource    *res;
		zend_reference   *ref;
		zend_ast_ref     *ast;
		zval             *zv;
		void             *ptr;
		zend_class_entry *ce;
		zend_function    *func;
		struct {
			uint32_t w1;
			uint32_t w2;
		} ww;
	} value;
    union {
        struct {
            ZEND_ENDIAN_LOHI_4(
                zend_uchar    type,         /* active type */
                zend_uchar    type_flags,
                zend_uchar    const_flags,
                zend_uchar    reserved)     /* call info for EX(This) */
        } v;
        uint32_t type_info;
    } u1;
    union {
        uint32_t     var_flags;
        uint32_t     next;                 /* hash collision chain */
        uint32_t     cache_slot;           /* literal cache slot */
        uint32_t     lineno;               /* line number (for ast nodes) */
        uint32_t     num_args;             /* arguments number for EX(This) */
        uint32_t     fe_pos;               /* foreach position */
        uint32_t     fe_iter_idx;          /* foreach iterator index */
    } u2;
};

For a detailed description of this structure, see Brother Bird’s article referenced at the end of this article, which is very thorough; I will not try to show off in front of an expert. Here I only raise a few key points:

  1. A variable in PHP7 is split into two parts, the variable name and the variable’s value, which correspond to zval_struct and the value declared inside it, respectively.
  2. Within zval_struct.value, zend_long and double are simple data types that store a concrete value directly, whereas the other, complex data types store a pointer to another data structure.
  3. In PHP7, the reference counter is stored in the value rather than in zval_struct.
  4. NULL and Boolean are data types that have no value (Booleans are marked by the two constants IS_FALSE and IS_TRUE), so naturally they have no reference count.
  5. A reference becomes a data structure rather than just a flag bit. Its structure is as follows:
struct _zend_reference {
    zend_refcounted_h gc;
    zval              val;
};
  6. zend_reference is one of the value types a zval_struct can contain, and it has its own val, whose value points to the underlying zval_struct.value data. The zend_reference and a reference-counted value it wraps each have their own reference counter.

The reference counter records how many zvals currently point to the same zend_value.

For point 6, look at the following code:

$a = 'foo';
$b = &$a;
$c = $a;

The data structure looks like this:

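$a and $b each have their own zval_struct, and the value of both points to the same zend_reference. That zend_reference embeds a val, which points to a zend_string; that is where the contents of the string are stored.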

$c also has its own zval_struct, and its value can be initialized to point directly to the zend_string mentioned above, so the assignment does not copy the string.
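The layout just described can be sketched in C. This is a simplified model under stated assumptions, not the engine’s real headers: the mini_* types and functions are hypothetical names introduced here, storage is supplied by the caller instead of being allocated, and the string is treated as an ordinary reference-counted one so that the counters are visible.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for zend_refcounted_h, zend_string,
 * zval and zend_reference. They only model the counters and pointers. */
typedef struct { uint32_t refcount; } mini_gc;

typedef struct {
    mini_gc     gc;     /* the counter lives in the value, not in the zval */
    const char *chars;
} mini_string;

typedef struct mini_reference mini_reference;

typedef enum { MINI_IS_STRING, MINI_IS_REFERENCE } mini_type;

typedef struct {
    union {
        mini_string    *str;
        mini_reference *ref;
    } value;
    mini_type type;
} mini_zval;

struct mini_reference {
    mini_gc   gc;
    mini_zval val;      /* the wrapped zval, which points at the string */
};

/* $a = <string>: the zval points at the string, which now has one user. */
mini_zval mini_assign_string(mini_string *s)
{
    mini_zval z;
    s->gc.refcount = 1;
    z.type = MINI_IS_STRING;
    z.value.str = s;
    return z;
}

/* $b = &$a: wrap $a's value in a reference shared by both zvals. */
void mini_assign_ref(mini_zval *a, mini_zval *b, mini_reference *storage)
{
    storage->val = *a;              /* the reference takes over the old value */
    storage->gc.refcount = 2;       /* $a and $b both point at it */
    a->type = MINI_IS_REFERENCE;
    a->value.ref = storage;
    *b = *a;
}

/* $c = $a: look through the reference and share the string, no copy. */
mini_zval mini_assign_value(const mini_zval *src)
{
    mini_zval z = (src->type == MINI_IS_REFERENCE) ? src->value.ref->val : *src;
    z.value.str->gc.refcount++;     /* one more zval uses the same string */
    return z;
}
```

Calling the three functions in the order of the PHP snippet leaves the reference with a counter of 2 (for $a and $b) and the string with a counter of 2 (for the reference and $c) in this model.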

Let’s look at some of the phenomena that occur with this new zval structure and the reasons behind them.

Questions

Why do some variables start with a reference counter of 0?

The phenomenon

$var_int = 233;
$var_float = 233.3;
$var_str = '233';

xdebug_debug_zval('var_int');
xdebug_debug_zval('var_float');
xdebug_debug_zval('var_str');

/**
 * Output:
 * var_int: (refcount=0, is_ref=0)int 233
 * var_float: (refcount=0, is_ref=0)float 233.3
 * var_str: (refcount=0, is_ref=0)string '233' (length=3)
 */

Why

In PHP7, assigning a value to a variable involves two operations:

  1. Allocate a zval_struct for the symbol (that is, the variable name).
  2. Store the variable’s value in zval_struct.value. Values that fit directly in the zval’s value field are not reference counted; instead, they are copied directly on assignment. These types include:
  • IS_LONG
  • IS_DOUBLE

These correspond to the integer and floating-point types in PHP.

Why is the refcount of var_str also 0? This brings up the two types of strings in PHP:

  1. Interned strings (function names, class names, variable names, static strings):
 $str = '233';    // Static string
  2. Ordinary strings:
 $str = '233' . time();

For interned strings, the content is unique and never changes, which is comparable to string literals kept in static storage in C. They live for the whole request and are destroyed and freed together once the request completes, so there is no need to manage their memory through reference counting.
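One way to picture this distinction is a check that decides whether a value takes part in reference counting at all. The C sketch below is a simplified model with hypothetical mini_* names, not engine code; its reporting function mirrors the Xdebug output shown above by returning 0 for anything that is not counted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a string header with a counter and an "interned" mark. */
typedef struct {
    uint32_t    refcount;
    int         interned;   /* 1 for static strings that live for the whole request */
    const char *chars;
} mini_string;

typedef enum { MINI_IS_LONG, MINI_IS_DOUBLE, MINI_IS_STRING } mini_type;

typedef struct {
    union {
        long         lval;  /* stored inline, nothing to count */
        double       dval;  /* stored inline, nothing to count */
        mini_string *str;   /* pointer to a separate structure */
    } value;
    mini_type type;
} mini_zval;

/* Only ordinary (non-interned) strings take part in counting in this model. */
int mini_is_refcounted(const mini_zval *z)
{
    return z->type == MINI_IS_STRING && !z->value.str->interned;
}

/* Copy by value: scalars are copied as-is, strings share the pointer. */
mini_zval mini_copy(const mini_zval *src)
{
    if (mini_is_refcounted(src)) {
        src->value.str->refcount++;
    }
    return *src;
}

/* The count a debugger-style dump would show in this model. */
uint32_t mini_reported_refcount(const mini_zval *z)
{
    return mini_is_refcounted(z) ? z->value.str->refcount : 0;
}
```

In this model, copying an integer or an interned string never touches a counter, while copying an ordinary string raises its counter from 1 to 2.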

Why does the counter change to 2 when an integer, floating-point, or static string variable is assigned by reference?

The phenomenon

$var_int_1 = 233;
$var_int_2 = &$var_int_1;
xdebug_debug_zval('var_int_1');

/**
 * Output:
 * var_int_1: (refcount=2, is_ref=1)int 233
 */

Why

When an integer, floating-point, or static string value is assigned to a variable, the value field holds a zend_long, a double, or a pointer to an interned zend_string, and none of these is reference counted. Copying by value creates a new zval_struct and stores the value in its value field in the same way, so the refcount is always 0.

However, when reference copying is done using the & operator, the situation is different:

  1. PHP allocates a zend_reference structure for the variable that the & operator is applied to.
  2. It makes zend_reference.val hold the original zval_struct.value.
  3. It changes the data type of zval_struct.value to zend_reference.
  4. It points zval_struct.value to the zend_reference that was just allocated and initialized.
  5. It allocates a zval_struct for the new variable and points its value to the zend_reference just created.

$var_int_1 and $var_int_2 each have their own zval_struct, and the zval_struct.value of both points to the same zend_reference structure, so the reference counter of that structure is 2.

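The zend_reference in turn holds an integer or floating-point value, which has no counter of its own. If the value it holds is an ordinary zend_string, that string’s reference counter is 1. The refcount that Xdebug displays is the counter of the zend_reference, that is, 2.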
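The five steps can be traced in a small C model. The sketch below uses hypothetical mini_* names and caller-provided storage in place of the engine’s allocator, so it illustrates the pointer changes only and is not the actual implementation.

```c
#include <assert.h>
#include <stdint.h>

typedef struct mini_reference mini_reference;

typedef enum { MINI_IS_LONG, MINI_IS_REFERENCE } mini_type;

typedef struct {
    union {
        long            lval;
        mini_reference *ref;
    } value;
    mini_type type;
} mini_zval;

/* Hypothetical stand-in for zend_reference: a counter plus an embedded zval. */
struct mini_reference {
    uint32_t  refcount;
    mini_zval val;
};

/* $new_var = &$orig, following the five steps; `ref` is caller-provided storage. */
void mini_make_reference(mini_zval *orig, mini_zval *new_var, mini_reference *ref)
{
    /* Step 1: `ref` plays the role of the freshly allocated zend_reference. */
    ref->refcount = 1;
    /* Step 2: the reference takes over the original value. */
    ref->val = *orig;
    /* Steps 3 and 4: the original zval becomes a pointer to the reference. */
    orig->type = MINI_IS_REFERENCE;
    orig->value.ref = ref;
    /* Step 5: the new variable's zval points at the same reference. */
    *new_var = *orig;
    ref->refcount++;
}

/* Follow a reference, if any, to the zval that holds the actual value. */
mini_zval *mini_deref(mini_zval *z)
{
    return z->type == MINI_IS_REFERENCE ? &z->value.ref->val : z;
}
```

Because both zvals end up pointing at one reference, a write made through either variable in this model is visible through the other, and the reference’s counter reads 2 while the wrapped integer has no counter at all.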

Why is the reference counter of a newly initialized array 2?

The phenomenon

$var_arr = [1, 2, '3'];
xdebug_debug_zval('var_arr');

/**
 * Output:
 * var_arr: (refcount=2, is_ref=0)
 * array (size=3)
 *   0 => (refcount=0, is_ref=0)int 1
 *   1 => (refcount=0, is_ref=0)int 2
 *   2 => (refcount=1, is_ref=0)string '3' (length=1)
 */

Why

This relates to another concept in PHP7, the immutable array.

For arrays the not-refcounted variant is called an “immutable array”. If you use opcache, then constant array literals in your code will be converted into immutable arrays. Once again, these live in shared memory and as such must not use refcounting. Immutable arrays have a dummy refcount of 2, as it allows us to optimize certain separation paths.

An immutable array is an array type optimized by the OPcache extension. Simply put, any array whose contents are constant every time the code is compiled is optimized into an immutable array. Here is a counterexample:

$array = [1, 2, time()];

PHP does not know the return value of the time() function at compile time, so $array is a mutable array.

Immutable arrays do not use reference counting, just like the interned strings discussed above. The difference is that an interned string always reports a count of 0, whereas an immutable array uses a dummy count of 2.
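The quoted remark says the dummy count helps optimize separation paths. One plausible reading, sketched below in C, is that a single refcount > 1 test then covers both shared arrays and immutable arrays. This is an interpretation using hypothetical mini_* names and a fixed-size array, not the engine’s actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MINI_MAX_ITEMS 4

/* Hypothetical model of an array header with a counter and an immutable mark. */
typedef struct {
    uint32_t refcount;
    int      immutable;
    size_t   len;
    long     items[MINI_MAX_ITEMS];
} mini_array;

/* Sharing an array: immutable arrays never have their counter touched. */
void mini_addref(mini_array *arr)
{
    if (!arr->immutable) {
        arr->refcount++;
    }
}

/* One test covers both cases: shared arrays and immutable arrays
 * (dummy count 2) must be duplicated before they are modified. */
int mini_needs_separation(const mini_array *arr)
{
    return arr->refcount > 1;
}

/* Write items[idx] = v, duplicating into `scratch` first when required.
 * Returns the array that was actually modified. No bounds checking, for brevity. */
mini_array *mini_write(mini_array *arr, mini_array *scratch, size_t idx, long v)
{
    if (mini_needs_separation(arr)) {
        *scratch = *arr;
        scratch->refcount = 1;      /* the copy is private to the writer */
        scratch->immutable = 0;
        if (!arr->immutable) {
            arr->refcount--;        /* the writer no longer uses the original */
        }
        arr = scratch;
    }
    arr->items[idx] = v;
    return arr;
}
```

In this model, a write to an immutable array always lands in a private copy and leaves the original untouched, without mini_write needing a separate branch to detect immutability.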

Conclusion

  • Simple data types
    • Integer (no reference counting)
    • Floating point (no reference counting)
    • Boolean (no reference counting)
    • NULL (no reference counting)
  • Complex data types
    • String
      • Ordinary string (uses reference counting, initial value 1)
      • Interned string (no reference counting, the reported value is always 0)
    • Array
      • Ordinary array (uses reference counting, initial value 1)
      • Immutable array (no reference counting, uses the dummy value 2)
    • Object (uses reference counting, initial value 1)

References

  • “PHP7 Kernel Anatomy” (Qin Peng)
  • php7-internal
  • Confusion about PHP 7 refcount