“This is the 13th day of my participation in the First Challenge 2022. For details: First Challenge 2022”

Preface

NumPy structured arrays can hold elements made up of multiple data types, similar to Python’s built-in dictionaries.

Previously, we defined structured array data types mainly with numpy.dtype.

Structured arrays can be indexed in three ways:

  • Access a single field by indexing with its field name
  • Access multiple fields by indexing with a list of field names
  • Access individual elements by indexing with an integer scalar

Indexing a structured array returns a view of the underlying data.
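As a minimal sketch of the three indexing forms listed above (the array people and its values are made up purely for illustration):

```python
import numpy as np

# Illustrative data; the names and values are assumptions, not from the article.
people = np.array([("Tom", 12), ("Anne", 10)],
                  dtype=[("name", "U5"), ("age", "i8")])

ages = people["age"]               # single field name -> view of that field
subset = people[["name", "age"]]   # list of field names -> the selected fields
first = people[0]                  # integer scalar -> one structured element

# Because ages is a view, writing through it changes the original array.
ages[0] = 13
print(people["age"])
print(subset.dtype.names)
print(first["name"])
```

Writing through ages and then reading people["age"] is a quick way to confirm that field indexing does not copy the data.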

NumPy provides the ndarray subclass numpy.recarray, which allows the fields of a structured array to be accessed directly as attributes.

numpy.rec.array() can be used to create record arrays or to convert structured arrays to record arrays.
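A small sketch of that conversion, assuming a hypothetical two-field array named people:

```python
import numpy as np

# Illustrative structured array; field names and values are assumptions.
people = np.array([("Tom", 12), ("Anne", 10)],
                  dtype=[("name", "U5"), ("age", "i8")])

# Convert the structured array to a record array.
rec = np.rec.array(people)

print(type(rec).__name__)
print(rec.age)       # field accessed as an attribute
print(rec["age"])    # indexing by field name still works
```

Attribute access is only a convenience; the record array still supports the same field-name indexing as the structured array it came from.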

In addition, numpy.lib.recfunctions provides helper functions for working with record arrays and structured arrays.

In this issue, we will learn how to use the common functions in the recfunctions module. Let’s go~

1. numpy.lib.recfunctions overview

numpy.lib.recfunctions contains a collection of helper functions for creating and manipulating structured arrays.

The functions in recfunctions have been rewritten and extended from their original implementations.

You can find them in the numpy/lib/recfunctions.py source file.

Before using any of these functions, import the recfunctions module with a from ... import statement:

from numpy.lib import recfunctions as rfn

2. Add new fields

The numpy.lib.recfunctions module provides the append_fields function to add new fields to an existing structured array.

append_fields(base, names, data, dtypes=None,
                  fill_value=-1, usemask=True, asrecarray=False)

Parameter description:

Parameter    Description
base         The array to extend
names        Name, or sequence of names, of the new fields
data         Array, or sequence of arrays, holding the data for the new fields
dtypes       Optional; data type or sequence of data types. If None, the types are estimated from data
fill_value   Optional; value used to fill missing data in shorter arrays
usemask      Optional; whether to return a masked array
asrecarray   Optional; whether to return a record array

  • The names of the new fields are given by the names parameter
  • The corresponding values are given by the data parameter
  • names, data, and dtypes can be single values instead of lists if only one field is appended
>>> import numpy as np
>>> from numpy.lib import recfunctions as rfn
>>> # "U5" keeps only the first 5 characters, so the addresses are truncated
>>> arr = np.array([("Tom", 12, "Beijing"), ("Anne", 10, "Guangzhou"), ("Kenty", 15, "Shengzheng")],
...                dtype=[("name", "U5"), ("age", "i8"), ("address", "U5")])
>>> rfn.append_fields(arr, "province", ["hebei", "guangdong", "guangdong"], "S16")
masked_array(data=[('Tom', 12, 'Beiji', b'hebei'),
                   ('Anne', 10, 'Guang', b'guangdong'),
                   ('Kenty', 15, 'Sheng', b'guangdong')],
             mask=[(False, False, False, False),
                   (False, False, False, False),
                   (False, False, False, False)],
       fill_value=('N/A', 999999, 'N/A', b'N/A'),
            dtype=[('name', '<U5'), ('age', '<i8'), ('address', '<U5'), ('province', 'S16')])

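The example above appends a single field and returns a masked array. As a hedged sketch of the other case described in the parameter notes, the snippet below appends two fields at once and passes usemask=False to get a plain ndarray back. The array base and the height and score fields are hypothetical names chosen for illustration:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Illustrative base array; names and values are assumptions.
base = np.array([("Tom", 12), ("Anne", 10)],
                dtype=[("name", "U5"), ("age", "i8")])

# names and data are sequences of the same length when several fields are added.
# dtypes is left as None, so the types are estimated from the data.
extended = rfn.append_fields(
    base,
    ["height", "score"],
    [np.array([1.52, 1.41]), np.array([90, 85])],
    usemask=False,   # return a plain ndarray instead of a masked array
)

print(extended.dtype.names)
print(extended["height"])
```

Setting usemask=False is convenient when no data is missing, since downstream code then does not need to handle masks.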

3. Structured array field reduction

recfunctions provides the apply_along_fields function, which applies a function func as a reduction across the fields of a structured array.

apply_along_fields(func, arr)

Parameter description:

Parameter    Description
func         Function applied across the field dimension; it must accept an axis argument, e.g. np.mean or np.sum
arr          Structured array

apply_along_fields is similar to numpy.apply_along_axis, except that the fields of the structured array are treated as an extra axis.

In the process, the fields are first cast to a common type, following the type-promotion rules of numpy.result_type.

>>> arr = np.array([(10, 5), (3, 9), (0, 8)], dtype=[("x", "i8"), ("y", "i8")])
>>> rfn.apply_along_fields(np.mean, arr)
array([7.5, 6. , 4. ])
>>> rfn.apply_along_fields(np.sum, arr)
array([15, 12,  8], dtype=int64)
>>> rfn.apply_along_fields(np.gradient, arr)
array([[-5., -5.],
       [ 6.,  6.],
       [ 8.,  8.]])
>>>

func can be any NumPy function that accepts an axis argument, for example:

  • numpy.mean: computes the mean
  • numpy.sum: computes the sum
  • numpy.gradient: computes the gradient of the array
  • numpy.nanmin: computes the array minimum, ignoring NaN
  • numpy.amin: computes the array minimum
  • numpy.amax: computes the array maximum
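To illustrate the type promotion, here is a minimal sketch with one integer field and one float field. It assumes that the result matches calling the function with axis=-1 on the output of rfn.structured_to_unstructured; the array mixed is a made-up example:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Illustrative array with fields of different types (an assumption for this sketch).
mixed = np.array([(10, 5.5), (3, 9.0)], dtype=[("x", "i8"), ("y", "f8")])

# Reduce across the fields of each element.
row_max = rfn.apply_along_fields(np.max, mixed)

# The same computation done by hand: convert to a plain 2-D array, reduce the last axis.
plain = rfn.structured_to_unstructured(mixed)

print(np.result_type(np.int64, np.float64))   # common type of the two fields
print(row_max)
print(plain.max(axis=-1))
```

The integer field is promoted to float64 before the reduction, so the result is a float array even when the maximum comes from the integer field.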

4. Delete the fields of the structured array

The recfunctions module provides the drop_fields function, which drops the specified fields and returns a new array.

drop_fields also supports nested fields.

drop_fields(base, drop_names, usemask=True, asrecarray=False)

Parameter description:

Parameter    Description
base         Input array
drop_names   Name, or sequence of names, of the fields to drop
usemask      Optional; whether to return a masked array
asrecarray   Whether to return a record array; defaults to False
>>> arr = np.array([(10, 5, "X"), (3, 9, "Y"), (0, 8, "Z")], dtype=[("x", "i8"), ("y", "i8"), ("z", "S6")])
>>> rfn.drop_fields(arr, "x")
array([(5, b'X'), (9, b'Y'), (8, b'Z')], dtype=[('y', '<i8'), ('z', 'S6')])
>>> rfn.drop_fields(arr, "z")
array([(10, 5), ( 3, 9), ( 0, 8)], dtype=[('x', '<i8'), ('y', '<i8')])
>>> rfn.drop_fields(arr, ["x", "z"])
array([(5,), (9,), (8,)], dtype=[('y', '<i8')])
>>>

Note: Changed in version 1.18.0: drop_fields returns an array with 0 fields if all fields are dropped, rather than returning None as it did previously.

>>> rfn.drop_fields(arr, ["x", "z", "y"])
array([(), (), ()], dtype=[])
>>>
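The examples in this section only use flat fields. As a small sketch of the nested-field support mentioned earlier, the snippet below drops sub-fields of a nested dtype; the array nested and its field names a, b, ba, and bb are illustrative assumptions:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Field "b" is itself a structure with two sub-fields, "ba" and "bb".
nested = np.array([(1, (2, 3.0)), (4, (5, 6.0))],
                  dtype=[("a", np.int64),
                         ("b", [("ba", np.double), ("bb", np.int64)])])

# Dropping one nested field keeps "b" with its remaining sub-field.
no_ba = rfn.drop_fields(nested, "ba")

# Dropping every sub-field of "b" removes "b" entirely.
no_b = rfn.drop_fields(nested, ["ba", "bb"])

print(no_ba.dtype)
print(no_b.dtype)
```

Nested field names are matched by name alone, so no path such as "b.ba" is needed.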

5. Join two structured arrays

recfunctions provides the join_by function to join two structured arrays on a key.

join_by(key, r1, r2, jointype='inner', r1postfix='1', r2postfix='2',
            defaults=None, usemask=True, asrecarray=False)

Parameter description:

Parameter    Description
key          String, or sequence of strings, naming the fields used for comparison
r1           Structured array 1
r2           Structured array 2
jointype     Join type: 'inner', 'outer', or 'leftouter'
r1postfix    Optional; string appended to the names of r1 fields that also exist in r2 but are not part of the key
r2postfix    Optional; string appended to the names of r2 fields that also exist in r1 but are not part of the key
defaults     Dictionary mapping field names to their default values
usemask      Whether to return a masked array; defaults to True
asrecarray   Whether to return a record array
>>> arr = np.array([(10, 5, "X"), (3, 9, "Y"), (0, 8, "Z")], dtype=[("x", "i8"), ("y", "i8"), ("z", "S6")])
>>> arr2 = np.array([(10, 5), (3, 9), (0, 8)], dtype=[("x", "i8"), ("y", "i8")])
>>> rfn.join_by("x", arr, arr2)
masked_array(data=[(0, 8, 8, b'Z'), (3, 9, 9, b'Y'), (10, 5, 5, b'X')],
             mask=[(False, False, False, False),
                   (False, False, False, False),
                   (False, False, False, False)],
       fill_value=(999999, 999999, 999999, b'N/A'),
            dtype=[('x', '<i8'), ('y1', '<i8'), ('y2', '<i8'), ('z', 'S6')])
>>>

Note:

  • The key should be a string or a sequence of strings naming the fields used to join the arrays.
  • If the key field cannot be found in both arrays, an exception is raised.
  • Neither r1 nor r2 should contain duplicate keys. join_by does not check for duplicates, and their presence makes the result unreliable.

For example, joining arr and arr2 on z raises a ValueError, because arr2 has no z field:

>>> rfn.join_by("z", arr, arr2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in join_by
  File "C:\Users\user\AppData\Roaming\Python\Python37\site-packages\numpy\lib\recfunctions.py", line 1480, in join_by
    raise ValueError('r2 does not have key field %r' % name)
ValueError: r2 does not have key field 'z'
>>>

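The examples in this section only use the default inner join. The following sketch compares the three jointype values on two small arrays whose keys overlap only partially. The arrays left and right and the field names id, a, and b are hypothetical:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Keys 2 and 3 appear in both arrays; key 1 only in left, key 4 only in right.
left = np.array([(1, 10), (2, 20), (3, 30)], dtype=[("id", "i8"), ("a", "i8")])
right = np.array([(2, 200), (3, 300), (4, 400)], dtype=[("id", "i8"), ("b", "i8")])

inner = rfn.join_by("id", left, right, jointype="inner")          # keys in both arrays
outer = rfn.join_by("id", left, right, jointype="outer")          # keys in either array
leftouter = rfn.join_by("id", left, right, jointype="leftouter")  # every key of left

for label, result in (("inner", inner), ("outer", outer), ("leftouter", leftouter)):
    print(label, len(result), result.dtype.names)
```

With the default usemask=True, entries that have no counterpart in the other array come back masked rather than filled with a value.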

6. Merge structured arrays

recfunctions provides the merge_arrays function to merge a sequence of arrays field by field.

merge_arrays(seqarrays, fill_value=-1, flatten=False,
                 usemask=False, asrecarray=False)

Parameter description:

Parameter    Description
seqarrays    Sequence of arrays
fill_value   Value used to fill missing data in shorter arrays
flatten      Optional; whether to collapse nested fields
usemask      Optional; whether to return a masked array
asrecarray   Whether to return a record array
>>> arr = np.array([(10, 5, "X"), (3, 9, "Y"), (0, 8, "Z")], dtype=[("x", "i8"), ("y", "i8"), ("z", "S6")])
>>> arr2 = np.array([(10, 5), (3, 9), (0, 8)], dtype=[("x", "i8"), ("y", "i8")])
>>> rfn.merge_arrays((arr, arr2))
array([((10, 5, b'X'), (10, 5)), (( 3, 9, b'Y'), ( 3, 9)),
       (( 0, 8, b'Z'), ( 0, 8))],
      dtype=[('f0', [('x', '<i8'), ('y', '<i8'), ('z', 'S6')]), ('f1', [('x', '<i8'), ('y', '<i8')])])
>>>
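The flatten and fill_value parameters are not exercised by the example above. A minimal sketch of both follows, using made-up arrays (coords, labels) whose field names do not clash, since flattening places all fields at the same level:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Illustrative inputs; names and values are assumptions.
coords = np.array([(10, 5), (3, 9)], dtype=[("x", "i8"), ("y", "i8")])
labels = np.array([(b"X",), (b"Y",)], dtype=[("z", "S6")])

# flatten=True collapses the nested f0/f1 structure into top-level fields.
flat = rfn.merge_arrays((coords, labels), flatten=True)
print(flat.dtype.names)

# Inputs of different lengths: the shorter one is padded with fill_value (-1 by default).
padded = rfn.merge_arrays((np.array([1, 2]), np.array([10.0, 20.0, 30.0])))
print(padded)
```

Unnamed inputs such as the two plain arrays in the second call receive the default field names f0 and f1.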

Conclusion

In this issue, we used the recfunctions module to perform common structured array operations: adding new fields with append_fields(), reducing across fields with apply_along_fields(), removing specified fields with drop_fields(), joining two arrays with join_by(), and merging arrays with merge_arrays().

The recfunctions module contains other functions for structured arrays, which we will continue to explore in the next installment.

That’s the content of this episode. Please give us your thumbs up and comments. See you next time