The underlying structure of variables

We all know that PHP variables are weakly typed and that no type needs to be specified when they are declared. So how does this work? The answer starts with the underlying structure of variables.

The implementation of zval

In the source file zend_types.h, you can see the definition of zval:

typedef struct _zval_struct     zval;

struct _zval_struct {
	zend_value        value;			/* value */
	union {
		struct {
			ZEND_ENDIAN_LOHI_4(
				zend_uchar    type,			/* active type */
				zend_uchar    type_flags,
				zend_uchar    const_flags,
				zend_uchar    reserved)	    /* call info for EX(This) */
		} v;
		uint32_t type_info;
	} u1;
	union {
		uint32_t     next;                 /* hash collision chain */
		uint32_t     cache_slot;           /* literal cache slot */
		uint32_t     lineno;               /* line number (for ast nodes) */
		uint32_t     num_args;             /* arguments number for EX(This) */
		uint32_t     fe_pos;               /* foreach position */
		uint32_t     fe_iter_idx;          /* foreach iterator index */
		uint32_t     access_flags;         /* class constant access flags */
		uint32_t     property_guard;       /* single property guard */
		uint32_t     extra;                /* not further specified */
	} u2;
};

The zval structure consists of a zend_value union, which holds the variable's value or a pointer to it, and two further unions, u1 and u2.

  • u1

u1 stores the variable's type and related information. Its fields are used as follows:

type: records the type of the variable. It is accessed as u1.v.type.

type_flags: flags specific to the variable's type (such as constant, reference-counted, or immutable). Different types of variables carry different flags.

const_flags: flags for constant types.

reserved: reserved field.
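The overlay of the four one-byte fields and the 32-bit type_info word can be tried out in isolation. The following is a minimal sketch, not the engine's code: demo_u1, the DEMO_* constants and the helper functions are hypothetical names, and the byte-order switch only imitates what ZEND_ENDIAN_LOHI_4 is there for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for u1. The field order is flipped on big-endian
 * targets so that `type` always occupies the low byte of type_info. */
typedef union {
    struct {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        uint8_t reserved, const_flags, type_flags, type;
#else
        uint8_t type, type_flags, const_flags, reserved;
#endif
    } v;
    uint32_t type_info;   /* the same four bytes viewed as one word */
} demo_u1;

/* Hypothetical tag and flag values, used only in this sketch. */
enum { DEMO_IS_LONG = 4, DEMO_IS_STRING = 6 };
enum { DEMO_FLAG_REFCOUNTED = 1 };

/* Set the type and its flags with a single 32-bit store. */
static void demo_set_type_info(demo_u1 *u, uint8_t type, uint8_t type_flags)
{
    assert(u != NULL);
    u->type_info = (uint32_t)type | ((uint32_t)type_flags << 8);
}

/* Read the type back through the byte-sized view. */
static uint8_t demo_type_of(const demo_u1 *u)
{
    assert(u != NULL);
    return u->v.type;
}
```

Writing through type_info and reading through v.type touches the same memory, which is why one store is enough to set the type and clear the remaining flag bytes.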

  • u2

u2 serves mainly auxiliary purposes. Because of the structure's memory alignment, the space that u2 occupies would be taken up by padding whether or not u2 existed, so it is put to use. The auxiliary fields in u2 record information that benefits internal functionality, improves cache friendliness, or reduces memory addressing operations. Some of these fields are described here.

next: used to resolve hash collisions; it records the position of the next element in the collision chain.

cache_slot: runtime cache slot. When executing a function, the engine looks in the cache first; if the function is not cached, it looks in the global function table.

num_args: the number of arguments passed in a function call.

access_flags: access flags for class constants, such as public, protected, and private.
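The alignment argument for u2 can be checked with two simplified structs. This is a sketch under the assumption of a typical 64-bit ABI, where the value union is 8 bytes wide and 8-byte aligned; all demo_* names are hypothetical and not part of the PHP source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified value union: 8 bytes on typical 64-bit platforms. */
typedef union { int64_t lval; double dval; void *ptr; } demo_value;

/* A zval-like struct without the second 32-bit field. */
typedef struct {
    demo_value value;
    uint32_t   type_info;
} demo_without_u2;

/* The same struct with an extra 32-bit field standing in for u2. */
typedef struct {
    demo_value value;
    uint32_t   type_info;
    uint32_t   extra;
} demo_with_u2;

/* Tail padding the compiler adds when the extra field is absent. */
static size_t demo_tail_padding(void)
{
    return sizeof(demo_without_u2)
         - (offsetof(demo_without_u2, type_info) + sizeof(uint32_t));
}

/* Runtime check of the layout claim (assumes a typical 64-bit ABI). */
static void demo_check_layout(void)
{
    assert(sizeof(demo_with_u2) == sizeof(demo_without_u2));
    assert(demo_tail_padding() == sizeof(uint32_t));
}
```

On such a platform both structs are 16 bytes, so the four bytes used by the extra field would otherwise be padding.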

  • zend_value

typedef union _zend_value {
	zend_long         lval;				/* integer */
	double            dval;				/* float */
	zend_refcounted  *counted;
	zend_string      *str;
	zend_array       *arr;
	zend_object      *obj;
	zend_resource    *res;
	zend_reference   *ref;
	zend_ast_ref     *ast;
	zval             *zv;
	void             *ptr;
	zend_class_entry *ce;
	zend_function    *func;
	struct {
		uint32_t w1;
		uint32_t w2;
	} ww;
} zend_value;

As you can see from zend_value, integers (lval) and doubles (dval) store their values directly, while the other types are pointers to their respective structures. So, thanks to structures like zval, PHP variables do not need an explicit type when they are declared: whatever type of value you assign to a variable, the zval records the type and refers to the corresponding storage structure.
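The idea of a value union paired with a type tag can be illustrated with a much smaller structure. The sketch below is an assumption-laden simplification: demo_zval and its helpers are hypothetical, and the string is a plain C pointer rather than a zend_string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { DEMO_LONG, DEMO_DOUBLE, DEMO_STRING } demo_type;

/* A value union plus a tag that says which member is active. */
typedef struct {
    union {
        int64_t     lval;   /* stored directly */
        double      dval;   /* stored directly */
        const char *str;    /* pointer to storage held elsewhere */
    } value;
    demo_type type;
} demo_zval;

static void demo_set_long(demo_zval *z, int64_t n)
{
    z->value.lval = n;
    z->type = DEMO_LONG;
}

static void demo_set_string(demo_zval *z, const char *s)
{
    z->value.str = s;
    z->type = DEMO_STRING;
}

/* Reading a member is only valid when the tag matches. */
static int64_t demo_get_long(const demo_zval *z)
{
    assert(z->type == DEMO_LONG);
    return z->value.lval;
}

static const char *demo_type_name(const demo_zval *z)
{
    switch (z->type) {
    case DEMO_LONG:   return "integer";
    case DEMO_DOUBLE: return "double";
    case DEMO_STRING: return "string";
    }
    return "unknown";
}
```

Calling demo_set_long and then demo_set_string on the same demo_zval mirrors assigning an integer and then a string to one PHP variable: the storage is reused and only the tag changes.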

For example, in a variable whose value is a string, u1.v.type is set to IS_STRING and value.str points to the zend_string structure that holds the characters.

PHP5 and PHP7 zval structure comparison

  • PHP5 (figure: PHP 5 zval layout)

  • PHP7 (figure: PHP 7 zval layout)

As you can see, the PHP 7 zval takes up only 16 bytes in total, a large memory saving compared with the 48 bytes of the PHP 5 zval.

Also, in PHP 5 all variables are allocated on the heap, even though temporary variables do not need heap allocation. PHP 7 optimizes this so that temporary variables are allocated directly on the stack.

Common variable types

Here are the structures of a few common variable types. For the other types, see the source code.

Integer and floating point

Integers and floating-point numbers take up little space, so their values are stored directly in the zval: integers in lval and floating-point values in dval.

typedef union _zend_value {
    zend_long         lval;             /* integer */
    double            dval;             /* float */
    ...
} zend_value;

String

PHP 7 defines a new string structure:

struct _zend_string {
	zend_refcounted_h gc;
	zend_ulong        h;                /* hash value */
	size_t            len;
	char              val[1];
};

The meaning of each field:

gc: reference-counting information for the variable. Every variable type that uses reference counting has this structure.

h: the hash value, used when computing the index into an array. (Caching it reportedly improved PHP 7 performance by 5%.)

len: the string length. Storing it explicitly makes strings binary safe.

val: the string content. This is a variable-length member: the memory allocated for it depends on len.
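The three ideas in this field list (an inline variable-length tail, an explicit length, and a cached hash) can be sketched as follows. The names demo_string, demo_string_new and demo_string_hash are hypothetical; the sketch uses a C99 flexible array member where PHP declares val[1], and the "times 33" hash is chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint32_t      refcount;  /* stands in for the gc header */
    unsigned long h;         /* cached hash, 0 means "not computed yet" */
    size_t        len;       /* explicit length, so NUL bytes are allowed */
    char          val[];     /* character data stored inline after the header */
} demo_string;

/* Allocate header and characters in one block. Caller frees with free(). */
static demo_string *demo_string_new(const char *src, size_t len)
{
    assert(src != NULL);
    demo_string *s = malloc(offsetof(demo_string, val) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->refcount = 1;
    s->h = 0;
    s->len = len;
    memcpy(s->val, src, len);
    s->val[len] = '\0';      /* convenience terminator, not part of len */
    return s;
}

/* Compute the hash on first use and reuse the cached value afterwards.
 * A hash that happens to be 0 is simply recomputed on every call. */
static unsigned long demo_string_hash(demo_string *s)
{
    if (s->h == 0) {
        unsigned long h = 5381;
        for (size_t i = 0; i < s->len; i++) {
            h = h * 33 + (unsigned char)s->val[i];
        }
        s->h = h;
    }
    return s->h;
}
```

A string created as demo_string_new("a\0b", 3) keeps all three bytes because its extent comes from len rather than from a terminating NUL, which is what binary safety means here.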

Array

The array is a very powerful data structure in PHP. Its underlying implementation is an ordered hash table (HashTable). Here is a brief look at its structure; we'll go into more detail later.

typedef struct _zend_array HashTable;


struct _zend_array {
	zend_refcounted_h gc;
	union {
		struct {
			ZEND_ENDIAN_LOHI_4(
				zend_uchar    flags,
				zend_uchar    nApplyCount,
				zend_uchar    nIteratorsCount,
				zend_uchar    consistency)
		} v;
		uint32_t flags;
	} u;
	uint32_t          nTableMask;
	Bucket           *arData;
	uint32_t          nNumUsed;
	uint32_t          nNumOfElements;
	uint32_t          nTableSize;
	uint32_t          nInternalPointer;
	zend_long         nNextFreeElement;
	dtor_func_t       pDestructor;
};
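To give a feel for what "ordered hash table" means, here is a deliberately small sketch. It is not the layout of zend_array: the table has a fixed size, keys are C strings stored by pointer, and every demo_* name is hypothetical. What it keeps is the general idea: buckets are appended in insertion order (compare arData and nNumUsed), a separate slot array maps hashes to bucket indexes, and collisions are chained through a next index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_TABLE_SIZE 8u          /* power of two, fixed for brevity */
#define DEMO_INVALID    UINT32_MAX  /* marks "no bucket" */

typedef struct {
    uint32_t    h;      /* hash of the key */
    const char *key;    /* not copied; must outlive the table */
    int         value;
    uint32_t    next;   /* next bucket in the same collision chain */
} demo_bucket;

typedef struct {
    uint32_t    slots[DEMO_TABLE_SIZE];    /* hash slot -> first bucket */
    demo_bucket buckets[DEMO_TABLE_SIZE];  /* kept in insertion order */
    uint32_t    used;                      /* number of buckets in use */
} demo_table;

static uint32_t demo_hash(const char *key)
{
    uint32_t h = 5381;
    while (*key) {
        h = h * 33 + (unsigned char)*key++;
    }
    return h;
}

static void demo_init(demo_table *t)
{
    for (uint32_t i = 0; i < DEMO_TABLE_SIZE; i++) {
        t->slots[i] = DEMO_INVALID;
    }
    t->used = 0;
}

static demo_bucket *demo_find(demo_table *t, const char *key)
{
    assert(t != NULL && key != NULL);
    uint32_t h = demo_hash(key);
    uint32_t idx = t->slots[h & (DEMO_TABLE_SIZE - 1)];
    while (idx != DEMO_INVALID) {
        demo_bucket *b = &t->buckets[idx];
        if (b->h == h && strcmp(b->key, key) == 0) {
            return b;
        }
        idx = b->next;
    }
    return NULL;
}

/* Insert or update. Returns 0 on success, -1 when the table is full. */
static int demo_set(demo_table *t, const char *key, int value)
{
    demo_bucket *b = demo_find(t, key);
    if (b != NULL) {
        b->value = value;   /* updating keeps the original position */
        return 0;
    }
    if (t->used == DEMO_TABLE_SIZE) {
        return -1;
    }
    uint32_t h = demo_hash(key);
    uint32_t slot = h & (DEMO_TABLE_SIZE - 1);
    uint32_t idx = t->used++;
    t->buckets[idx] = (demo_bucket){ h, key, value, t->slots[slot] };
    t->slots[slot] = idx;   /* new bucket becomes the head of its chain */
    return 0;
}
```

Walking buckets[0] to buckets[used - 1] yields the elements in insertion order, independent of their hash values, which is the behavior PHP arrays expose in foreach.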

Object

The object structure of PHP7 has also been redesigned and is quite different from the implementation of PHP5.

struct _zend_object {
    zend_refcounted_h gc;
    uint32_t          handle;
    zend_class_entry *ce; 
    const zend_object_handlers *handlers;
    HashTable        *properties; 
    zval              properties_table[1];
};

Here are some of the fields:

gc: the GC header.

ce: the class the object belongs to.

properties: a HashTable in which the key is the object's property name and the value is the offset of the property value in the properties_table array. The offset is used to find the corresponding property value in properties_table.

properties_table[1]: stores the object's property values.
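The name-to-offset lookup described above can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's code: a linear array of (name, slot) pairs stands in for the hash table, the value slot holds only an integer, properties_table has a fixed size of two, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified value slot. */
typedef struct { int64_t lval; } demo_zval;

/* Maps a property name to its slot; a hash table in a real engine. */
typedef struct {
    const char *name;
    uint32_t    slot;
} demo_property_info;

typedef struct {
    const demo_property_info *props;
    uint32_t                  prop_count;
} demo_class;

typedef struct {
    const demo_class *ce;                   /* class of the object */
    demo_zval         properties_table[2];  /* values, addressed by slot */
} demo_object;

/* Resolve the name to a slot, then index properties_table with it. */
static demo_zval *demo_read_property(demo_object *obj, const char *name)
{
    assert(obj != NULL && name != NULL);
    for (uint32_t i = 0; i < obj->ce->prop_count; i++) {
        if (strcmp(obj->ce->props[i].name, name) == 0) {
            return &obj->properties_table[obj->ce->props[i].slot];
        }
    }
    return NULL;  /* no such property */
}
```

Once the slot for a name is known, reading the value is a plain array access, which is the benefit of keeping property values in a table addressed by offset.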


OK, that's all for now.

References

PHP 7 Underlying Design and Source Code Implementation