It is very convenient to be able to manipulate a text file with a fixed delimiter in the same way that SQL manipulates a table. The steps below show how to do this with pandas.

How do I load a TXT file?

Sample data file papa.txt (the two columns are separated by a tab):

 paxi_id grade
  1       50
  2       50
  3       100
  4       200
  3       100
  5       100
 

Install Jupyter and start Jupyter Notebook from the directory that contains the file. In the browser window that opens, create a new Python notebook and run the following:

import pandas  # import the pandas library
papa=pandas.read_csv('papa.txt',sep='\t')  # load papa.txt, using a tab (\t) as the delimiter
papa.head()  # display the first few rows of data

The notebook displays the loaded data as a table.
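
To try the load step without creating a file first, the sample rows can be fed to read_csv from memory. This is a minimal sketch: the `sample` string and the use of `io.StringIO` are assumptions made for illustration, standing in for papa.txt.

```python
import io
import pandas

# Same rows as the papa.txt sample, with a tab between the two columns.
sample = (
    "paxi_id\tgrade\n"
    "1\t50\n"
    "2\t50\n"
    "3\t100\n"
    "4\t200\n"
    "3\t100\n"
    "5\t100\n"
)

# read_csv accepts any file-like object, so StringIO stands in for the file on disk.
papa = pandas.read_csv(io.StringIO(sample), sep="\t")
print(papa.head())  # head() shows at most the first five rows by default
```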

How do I know how many rows and columns I just loaded?

Run the following:

rowNum=papa.shape[0]  # number of data rows; the header row is not counted
colNum=papa.columns.size  # number of columns

The results are as follows
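
Both counts can also be read in one step, because `shape` is a (rows, columns) tuple. A minimal sketch, assuming the sample data is rebuilt in memory instead of being read from papa.txt:

```python
import pandas

# In-memory copy of the papa.txt sample rows.
papa = pandas.DataFrame({
    "paxi_id": [1, 2, 3, 4, 3, 5],
    "grade": [50, 50, 100, 200, 100, 100],
})

# shape is a tuple: (number of rows, number of columns).
row_num, col_num = papa.shape
print("rows:", row_num, "columns:", col_num)  # rows: 6 columns: 2
```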

How do I deduplicate the rows based on one column?

Run the following:

uPapa=papa.drop_duplicates(['paxi_id'])

The results are as follows
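
When several rows share a paxi_id, drop_duplicates keeps the first one by default. The sketch below, which rebuilds the sample data in memory as an assumption, shows the `keep` parameter choosing the last occurrence instead:

```python
import pandas

# In-memory copy of the papa.txt sample rows.
papa = pandas.DataFrame({
    "paxi_id": [1, 2, 3, 4, 3, 5],
    "grade": [50, 50, 100, 200, 100, 100],
})

# Default keep="first": the first row for each paxi_id survives.
first_kept = papa.drop_duplicates(subset=["paxi_id"])

# keep="last": the last row for each paxi_id survives instead.
last_kept = papa.drop_duplicates(subset=["paxi_id"], keep="last")

print(first_kept.index.tolist())  # original row positions that were kept
print(last_kept.index.tolist())
```

The original row index is preserved in both results; calling `reset_index(drop=True)` renumbers the rows if that is preferred.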

How do I get the distinct values of a column, and how many are there?

Run the following:

uPaxiId=papa['paxi_id'].unique()
print("uPaxiId:",uPaxiId)
totalUPaxiIdNum=uPaxiId.size
print("num:",totalUPaxiIdNum)

The result is as follows
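
If only the count is needed, `nunique()` returns it directly, much like COUNT(DISTINCT ...) in SQL. A minimal sketch, assuming the sample data is rebuilt in memory:

```python
import pandas

# In-memory copy of the papa.txt sample rows.
papa = pandas.DataFrame({
    "paxi_id": [1, 2, 3, 4, 3, 5],
    "grade": [50, 50, 100, 200, 100, 100],
})

distinct_ids = papa["paxi_id"].unique()     # NumPy array, in order of first appearance
distinct_count = papa["paxi_id"].nunique()  # just the number of distinct values

print(distinct_ids.tolist(), distinct_count)
```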

How do you compute the sum of a column?

Run the following:

papa['grade'].sum()

The results are as follows

How do I filter rows for specific values?

Run the following:

papa[(papa['grade'] == 50) | (papa['grade'] == 100)]  # rows whose grade is 50 or 100

The results are as follows
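
When the list of accepted values grows, chaining `|` conditions becomes unwieldy. One alternative is `isin`, which plays the role of SQL's IN clause. A minimal sketch, assuming the sample data is rebuilt in memory:

```python
import pandas

# In-memory copy of the papa.txt sample rows.
papa = pandas.DataFrame({
    "paxi_id": [1, 2, 3, 4, 3, 5],
    "grade": [50, 50, 100, 200, 100, 100],
})

# Explicit OR of two comparisons.
by_or = papa[(papa["grade"] == 50) | (papa["grade"] == 100)]

# Same filter with isin, similar to WHERE grade IN (50, 100).
by_isin = papa[papa["grade"].isin([50, 100])]

print(by_isin)
```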

How do I count how many times each value occurs in a column?

Run the following:

gPapa=papa.groupby('grade').size()

The results are as follows
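
The same counts can be obtained with `value_counts()`. The difference is ordering: `groupby(...).size()` sorts by the grade value, while `value_counts()` sorts by count, largest first. A minimal sketch, assuming the sample data is rebuilt in memory:

```python
import pandas

# In-memory copy of the papa.txt sample rows.
papa = pandas.DataFrame({
    "paxi_id": [1, 2, 3, 4, 3, 5],
    "grade": [50, 50, 100, 200, 100, 100],
})

g_papa = papa.groupby("grade").size()  # index sorted by grade: 50, 100, 200
counts = papa["grade"].value_counts()  # sorted by count, most frequent grade first

print(g_papa)
print(counts)
```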

How do I add up two of those counts, or all of them?

Run the following:

v=gPapa[50]+gPapa[100]  # rows with grade 50 plus rows with grade 100
print("The sum of two:",v)
print("The sum of all:",gPapa.sum())

The results are as follows
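
To add up more than two groups, selecting a list of labels with `.loc` and calling `sum()` avoids writing each term by hand. A minimal sketch, assuming the sample data is rebuilt in memory:

```python
import pandas

# In-memory copy of the papa.txt sample rows.
papa = pandas.DataFrame({
    "paxi_id": [1, 2, 3, 4, 3, 5],
    "grade": [50, 50, 100, 200, 100, 100],
})
g_papa = papa.groupby("grade").size()

# .loc selects by label, so 50 and 100 are grade values here, not positions.
v = g_papa.loc[[50, 100]].sum()
total = g_papa.sum()

print("The sum of two:", v)
print("The sum of all:", total)
```

The total over all groups equals the number of rows in the table, which makes a quick sanity check.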

How do I plot the values?

Run the following:

import matplotlib.pyplot as plt
fig=plt.figure()
gPapa.plot(kind='bar',grid=True)  # kind='barh' draws horizontal bars instead, swapping the x and y axes
plt.show()  # display all open figures at once

The results are as follows

How do I join two TXT files on a shared column?

The other file is xixi.txt

paxi_id	type1, 3, 2, 4, 3, 4, 4, 5, 3

Run the following:

xixi=pandas.read_csv('xixi.txt',sep='\t')  # load the second file, also tab-delimited
uXixi=xixi.drop_duplicates(['paxi_id'])  # keep one row per paxi_id
pandas.merge(uPapa,uXixi,on=['paxi_id'])  # inner join on paxi_id

The results are as follows
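
By default `pandas.merge` performs an inner join, so only paxi_id values present in both tables survive. The `how` parameter selects other SQL-style joins. In the sketch below the `uXixi` values are invented for illustration and are not the contents of xixi.txt; `uPapa` is rebuilt in memory from the deduplicated sample.

```python
import pandas

# Deduplicated papa sample, rebuilt in memory.
uPapa = pandas.DataFrame({
    "paxi_id": [1, 2, 3, 4, 5],
    "grade": [50, 50, 100, 200, 100],
})

# Hypothetical stand-in for the deduplicated xixi table (values are made up).
uXixi = pandas.DataFrame({
    "paxi_id": [1, 2, 3, 6],
    "type": [3, 4, 4, 5],
})

# Inner join (the default): only ids found in both tables.
inner = pandas.merge(uPapa, uXixi, on=["paxi_id"])

# Left join: every uPapa row is kept; type is NaN where uXixi has no match.
left = pandas.merge(uPapa, uXixi, on=["paxi_id"], how="left")

print(inner)
print(left)
```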

How do I plot a dictionary as a bar chart?

import matplotlib.pyplot as plt
period={'1': 100,'2': 200,'3':150}
fig=plt.figure()
plt.bar(range(len(period)),list(period.values()),align='center')  # one bar per dictionary value
plt.xticks(range(len(period)),list(period.keys()))  # label each bar with its dictionary key
plt.show()

The official pandas documentation, which includes a tutorial, is available at:

Pandas.pydata.org/pandas-docs…