Author: George ran


Translation: Good luck


Source:
Towardsdatascience.com/23-great-pa…

Here are 23 pandas snippets to help data analysts understand their data better. pandas is an open-source, BSD-licensed library that provides high-performance, easy-to-use data structures and analysis tools for the Python programming language. If you are not familiar with it, follow this link to the official website for an introduction that takes about 10 minutes: pandas.pydata.org/pandas-docs…


A collection of worked examples is also available at: pandas.pydata.org/pandas-docs…


(1) Read the CSV data set

pd.DataFrame.from_csv("csv_file")  # older pandas only; removed in pandas 1.0

Or, in any version:

pd.read_csv("csv_file")


(2) Read the Excel data set

pd.read_excel("excel_file")


(3) Write data directly to CSV

The data are separated by commas and have no index:

df.to_csv("data.csv", sep=",", index=False)
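
As a rough sketch of how (1) and (3) fit together, the snippet below writes a small frame to CSV text and reads it back. The column names and values are made up for illustration, and an in-memory buffer stands in for a file on disk:

```python
import io

import pandas as pd

# Hypothetical sample data; any DataFrame would do.
df = pd.DataFrame({"name": ["a", "b", "c"], "size": [1, 5, 3]})

# Write comma-separated text without the index column.
buffer = io.StringIO()
df.to_csv(buffer, sep=",", index=False)

# Rewind the buffer and read the same text back into a new DataFrame.
buffer.seek(0)
loaded = pd.read_csv(buffer)
```

Because index=False was used when writing, the file holds only the two data columns, and read_csv builds a fresh default index on load.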


(4) Basic information about the data set

df.info()


(5) Basic statistics of the data set

print(df.describe())
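
To make (4) and (5) concrete, here is a minimal sketch on an invented frame with one numeric and one text column. The column names are assumptions for the example only:

```python
import pandas as pd

# Hypothetical data: one numeric column and one text column.
df = pd.DataFrame({"height": [1.0, 2.0, 3.0, 4.0], "name": ["a", "b", "c", "d"]})

# info() prints column names, non-null counts and dtypes.
df.info()

# describe() returns a DataFrame of summary statistics;
# for a mixed frame it covers only the numeric columns by default.
stats = df.describe()
mean_height = stats.loc["mean", "height"]
```

Since describe() returns an ordinary DataFrame, individual statistics can be picked out with loc, as done for the mean above.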


(6) Print data in table form

from tabulate import tabulate

print(tabulate(print_table, headers=headers))

Here print_table is a list of rows (a list of lists) and headers is a list of column-name strings. tabulate comes from the separate tabulate package.


(7) List column names

df.columns


Basic data processing


(8) Delete the missing data

df.dropna(axis=0, how='any')

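Returns a new object in which the labels on the given axis are dropped wherever data is missing: with how='any' a row is dropped if any of its values is missing, with how='all' only if all of them are.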
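
The difference between how='any' and how='all' is easiest to see on a tiny frame. This is only a sketch with made-up values:

```python
import numpy as np
import pandas as pd

# Row 0 is complete, row 1 has one missing value, row 2 is entirely missing.
df = pd.DataFrame({"a": [1.0, np.nan, np.nan], "b": [4.0, 5.0, np.nan]})

# how="any": drop a row if at least one value is missing.
any_dropped = df.dropna(axis=0, how="any")

# how="all": drop a row only if every value is missing.
all_dropped = df.dropna(axis=0, how="all")
```

Neither call modifies df; each returns a new DataFrame.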


(9) Replace missing data

df.replace(to_replace=None, value=None)

Replaces the values given in “to_replace” with “value”.
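
One possible use, sketched under the assumption that the data marks missing measurements with a sentinel value of -1 (the sentinel and the column name are invented for the example):

```python
import numpy as np
import pandas as pd

# Assumption: -1 is a placeholder for "no measurement".
df = pd.DataFrame({"size": [5, -1, 3]})

# Swap the sentinel for NaN so pandas treats it as missing.
cleaned = df.replace(to_replace=-1, value=np.nan)
```

After this step, functions such as dropna() and isnull() recognize the former sentinel as missing data.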


(10) Check for NaN

pd.isnull(object)

Detect missing values (NaN in numeric arrays, None and NaN in object arrays)
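
A short sketch of how this can be used to count missing entries per column. The frame is invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with NaN in a numeric column and None in an object column.
df = pd.DataFrame({"a": [1.0, np.nan], "b": ["x", None]})

# Boolean frame of the same shape: True where a value is missing.
mask = pd.isnull(df)

# Summing the booleans counts missing values per column.
missing_per_column = mask.sum()
```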


(11) Delete features

df.drop('feature_variable_name', axis=1)

axis=0 refers to rows and axis=1 to columns.


(12) Convert object type to float

pd.to_numeric(df["feature_name"], errors='coerce')

Converts object-typed values (for example, strings) to numbers so they can be used in calculations. With errors='coerce', values that cannot be parsed become NaN.
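
For instance, on a column of strings where one entry is not a number (the values are made up for this sketch):

```python
import pandas as pd

# Hypothetical column read in as strings, with one unparseable entry.
df = pd.DataFrame({"feature_name": ["1.5", "2", "oops"]})

# errors="coerce" turns anything that cannot be parsed into NaN
# instead of raising an exception.
converted = pd.to_numeric(df["feature_name"], errors="coerce")
```

The result is a new Series; assign it back to df["feature_name"] if the frame itself should be updated.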


(13) Convert data to Numpy array

df.to_numpy()  # df.as_matrix() was removed in pandas 1.0
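
A small sketch of the conversion using DataFrame.to_numpy(), which is available from pandas 0.24 onward. The data is invented:

```python
import pandas as pd

# Hypothetical all-integer frame.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Each DataFrame row becomes one row of the 2-D NumPy array.
arr = df.to_numpy()
```

If the columns have different dtypes, the resulting array falls back to a common dtype, often object.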


(14) Get the first n rows of data

df.head(n)


(15) Obtain data by feature name

df.loc[:, feature_name]  # all rows of the column named feature_name


(16) Apply functions to data

This multiplies all values in the “height” column of the data by 2:

df["height"].apply(lambda height: 2 * height)

Or:

def multiply(x):
    return x * 2

df["height"].apply(multiply)
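
For simple arithmetic like this, pandas can also operate on the whole column at once. The sketch below, on made-up heights, shows that both routes give the same result:

```python
import pandas as pd

# Hypothetical heights.
df = pd.DataFrame({"height": [1.5, 2.0]})

# Element-by-element: the lambda is called once per value.
doubled_apply = df["height"].apply(lambda height: 2 * height)

# Vectorized: the multiplication is applied to the whole column in one step.
doubled_vectorized = df["height"] * 2
```

apply() remains the more general tool when the per-value logic cannot be expressed as a column operation.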


(17) Rename data column

Here we rename column 3 of the data to “size”

df.rename(columns = {df.columns[2]:'size'}, inplace=True)


(18) Get the unique values of a column

df["name"].unique()


(19) Access sub-data

We select the “name” and “size” columns from the data:

new_df = df[["name", "size"]]
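
Steps (17) and (19) can be combined as in the following sketch. The starting column names ("age", "area") are assumptions chosen for the example:

```python
import pandas as pd

# Hypothetical frame whose third column is not yet called "size".
df = pd.DataFrame({"name": ["a", "b"], "age": [1, 2], "area": [10, 20]})

# Rename the third column (position 2) to "size".
df = df.rename(columns={df.columns[2]: "size"})

# A list of column names inside the brackets returns a DataFrame
# containing just those columns.
new_df = df[["name", "size"]]
```

Note the double brackets: df["name"] with a single string returns a Series, while df[["name", "size"]] with a list returns a DataFrame.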


(20) Summarize data information

df.sum()       # sum of the values
df.min()       # minimum
df.max()       # maximum
df.idxmin()    # index label of the minimum
df.idxmax()    # index label of the maximum
df.describe()  # summary statistics
df.mean()      # mean
df.median()    # median
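
The idxmin() and idxmax() calls return index labels rather than values, which the following sketch illustrates on a single invented column with string labels:

```python
import pandas as pd

# Hypothetical column with explicit row labels.
df = pd.DataFrame({"size": [5, 2, 9]}, index=["a", "b", "c"])

smallest_label = df["size"].idxmin()  # label of the row holding the minimum
largest_label = df["size"].idxmax()   # label of the row holding the maximum
total = df["size"].sum()
middle = df["size"].median()
```

The labels can be passed straight to df.loc to fetch the full rows.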


(21) Sort the data

df.sort_values(by="size", ascending=False)  # "by" names the column to sort on; "size" is an example
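
A DataFrame needs to be told which column to sort on through the by argument. A minimal sketch, assuming a frame with “name” and “size” columns:

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"name": ["a", "b", "c"], "size": [3, 5, 1]})

# Sort rows by the "size" column, largest first.
sorted_df = df.sort_values(by="size", ascending=False)
```

The original row labels travel with the rows, so the index of sorted_df shows where each row came from.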


(22) Boolean index

Here we filter on the “size” column to show only the rows where the value equals 5:

df[df["size"] == 5]
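
Broken into two steps, the filter looks like this. The data is made up for the sketch:

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"name": ["a", "b", "c"], "size": [5, 3, 5]})

# Step 1: the comparison yields a boolean Series, one entry per row.
mask = df["size"] == 5

# Step 2: indexing with the mask keeps only the rows where it is True.
matches = df[mask]
```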


(23) Select a value

Select the first row of the “size” column (assuming the default integer index, where the first row has label 0):

df.loc[0, 'size']
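
Note that loc is indexed with square brackets, not called with parentheses. The sketch below, on invented data with the default integer index, contrasts scalar labels with list labels:

```python
import pandas as pd

# Hypothetical data with the default 0, 1, ... index.
df = pd.DataFrame({"name": ["a", "b"], "size": [5, 3]})

# Scalar row and column labels return a single value.
value = df.loc[0, "size"]

# Lists of labels return a (here 1x1) DataFrame instead.
sub = df.loc[[0], ["size"]]
```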



So how can programmers learn data analysis systematically?

Through career-path planning, Udacity's “Data Science” school offers one-stop courses that combine authoritative Silicon Valley course content, hands-on projects from well-known companies, and personalized study guidance. The aim is to help each student avoid detours on a data science career path, get started without pressure, and learn the knowledge and experience needed at each stage quickly and efficiently.

You can find a study plan that suits you in the Udacity Data Science School and, under the guidance of industry experts, avoid common learning mistakes, master the core skills of data analysis, get well ahead of competitors in the industry, and land a high-salary offer!




If you fall into one of these categories:

  • Beginners with no data or statistics background who want to add data skills to their profile;
  • People with some programming experience who face a career or salary bottleneck and want to change direction;
  • Professionals who want to pursue a career in data science or academia.

The Udacity Data Science School will provide you with a clearer and more efficient learning path, enabling you to master core technologies, gain rich practical experience, and quickly improve your competitiveness in the workplace.


Come and try our new school!