This is the 25th day of my participation in the August Challenge

1. Type conversion

Different data types support different operations, so raw data often has to be converted to the type we need before we can work with it. For NumPy arrays, the conversion method is astype(), which takes the target type as its argument and returns a new array of that type. As an example, first create an array of integers:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr)

result:

[[1 2 3]
 [4 5 6]
 [7 8 9]]

print(arr.dtype)

result:

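int32

Note: the default integer type depends on the platform and NumPy version; on Linux and macOS this typically prints int64.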

1.1 Convert the arr array from int to float

arr_float = arr.astype(np.float64)
print(arr_float)

result:

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]

print(arr_float.dtype)

result:

float64
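
Two details about astype() are worth checking in a small sketch: it returns a new array and leaves the original untouched, and converting in the other direction (float to int) drops the fractional part rather than rounding. The float values below are made up for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# astype() returns a converted copy; arr itself keeps its integer dtype
arr_float = arr.astype(np.float64)
arr_float[0, 0] = 100.0
print(arr[0, 0])         # still 1
print(arr.dtype.kind)    # 'i' (a signed integer type)

# Float to int truncates toward zero instead of rounding
values = np.array([1.9, -1.9, 2.5])   # illustrative values
print(values.astype(np.int64))        # [ 1 -1  2]
```

If rounding is what you want, one option is to call np.round() on the array before converting.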

1.2 Convert the arr array from int to string

# np.string_ was an alias of np.bytes_ and was removed in NumPy 2.0
arr_str = arr.astype(np.bytes_)
print(arr_str)

result:

[[b'1' b'2' b'3']
 [b'4' b'5' b'6']
 [b'7' b'8' b'9']]

print(arr_str.dtype)

result:

|S11
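
If text rather than bytes is wanted, one option (not used in the original example) is to pass Python's str as the target type, which produces a Unicode string array without the b'' prefix. The width in the resulting dtype depends on the platform's integer size, so the sketch below checks only the dtype kind.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# str as the target type gives Unicode strings (kind 'U'), not bytes (kind 'S')
arr_text = arr.astype(str)
print(arr_text[0])           # ['1' '2' '3']
print(arr_text.dtype.kind)   # U

# String arrays convert back to numbers the same way
arr_back = arr_text.astype(np.int64)
print(arr_back.sum())        # 45
```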

Note: The astype() method also appears in pandas, where it is used to convert data types when handling outliers. The NumPy astype() and the pandas astype() belong to two different libraries, but they work in essentially the same way, because a numeric pandas column is typically backed by a NumPy array.
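
The relationship described in the note can be sketched as follows, assuming pandas is installed. The Series contents are made-up sample data.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3])           # illustrative data

# pandas astype() accepts the same NumPy types as ndarray.astype()
s_float = s.astype(np.float64)
print(s_float.dtype)               # float64

# The values of a numeric Series can be retrieved as a NumPy array
values = s_float.to_numpy()
print(type(values).__name__)       # ndarray
print(values)                      # [1. 2. 3.]
```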

2. Handling missing values

In NumPy, missing values are represented by np.nan.
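
Two properties of np.nan explain the approach taken in the rest of this section, and the short sketch below illustrates them: np.nan is a float, and it does not compare equal to anything, including itself, so an equality test cannot be used to find it.

```python
import numpy as np

# np.nan is an ordinary Python float
print(type(np.nan).__name__)   # float

# NaN never compares equal, not even to itself
print(np.nan == np.nan)        # False

# An array containing np.nan is therefore stored as floats
arr = np.array([1, 2, np.nan, 4])
print(arr.dtype)               # float64

# == finds nothing; np.isnan() is the reliable test
print((arr == np.nan).any())   # False
print(np.isnan(arr).any())     # True
```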

Handling missing values takes two steps: first, check whether the array contains missing values and locate them; second, fill them in. The process is shown below.

Before handling missing values, create an array that contains one:

import numpy as np

arr = np.array([1, 2, np.nan, 4])
print(arr)

result:

[ 1.  2. nan  4.]

2.1 Finding missing values

Missing values are found with np.isnan(), which returns a Boolean array of the same shape as its input. True marks a position that holds a missing value, and False marks a position that does not.

The following example finds the missing value:

print(np.isnan(arr))

result:

[False False  True False]
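
Because the result is a Boolean array, it can also be summarized. The sketch below counts the missing values and reports their positions. np.flatnonzero is used here as one convenient way to get the indices and is not part of the original example.

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4])
mask = np.isnan(arr)

print(mask.any())              # True: at least one value is missing
print(mask.sum())              # 1: True counts as 1 when summed
print(np.flatnonzero(mask))    # [2]: indices of the missing values
```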

2.2 Filling missing values

The Boolean array returned by np.isnan() can be used as an index to overwrite the missing values, here with 0:

arr[np.isnan(arr)] = 0
print(arr)

result:

[1. 2. 0. 4.]
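
Zero is not always a sensible replacement. As a hedged alternative to the assignment above, the sketch below fills the gap with the mean of the remaining values using np.nanmean, and also shows np.nan_to_num, which returns a filled copy instead of modifying the array in place. Neither call appears in the original example.

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4])

# np.nan_to_num returns a new array with NaN replaced (0.0 by default)
filled_zero = np.nan_to_num(arr, nan=0.0)
print(filled_zero)             # [1. 2. 0. 4.]

# Fill with the mean of the non-missing values instead
mean_value = np.nanmean(arr)   # (1 + 2 + 4) / 3
filled_mean = arr.copy()
filled_mean[np.isnan(filled_mean)] = mean_value
print(np.isnan(filled_mean).any())   # False
print(np.isnan(arr).any())           # True: the original is unchanged
```

Which fill value is appropriate depends on the data, so treat both options as starting points.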

3. Handling duplicate values

Duplicate values are simpler to handle: call np.unique(), which returns the sorted unique values of the array. First create an array that contains duplicates, then look at the example below.

import numpy as np

arr = np.array([1, 2, 3, 2, 1])
print(arr)

result:

[1 2 3 2 1]

print(np.unique(arr))

result:

[1 2 3]
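
np.unique() can also report how often each value occurs. A minimal sketch, using the return_counts parameter (not used in the original example):

```python
import numpy as np

arr = np.array([1, 2, 3, 2, 1])

# return_counts=True adds an array with the number of occurrences
values, counts = np.unique(arr, return_counts=True)
print(values)    # [1 2 3]
print(counts)    # [2 2 1]

# Pair each unique value with its count
for value, count in zip(values, counts):
    print(value, count)
```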