This is the 17th day of my participation in the November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021

I. Preface

Reference book for this series of study notes: Data Analysis in Action, by Tomaz Joubas. I will share my notes from studying this book with you as a series called Data Analysis in Action from Scratch.

The previous article covered reading and writing CSV files with the pandas library. In this article, you will continue by learning how to read and write TSV and JSON files with pandas.

Pandas reads and writes CSV data

II. Supplement to the previous article

CSV

Comma-Separated Values (CSV, sometimes called Character-Separated Values because the separator does not have to be a comma) files store tabular data (numbers and text) in plain text.

TSV

TSV stands for Tab-Separated Values: the values are separated by tab characters. Python's csv module could more accurately be called a DSV module, because it supports delimiter-separated values (DSV) files in general, not just comma-separated ones.

The delimiter parameter is set to a comma by default, which means that files are treated as CSV by default. With delimiter='\t', the file being processed is treated as TSV.
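The delimiter behavior described above can be sketched with Python's standard csv module. This is a minimal illustration, not code from the book: the two sample strings are made up (they reuse a station name and code that appear later in this article), and io.StringIO stands in for a real file.

```python
import csv
import io

# Illustrative in-memory data; a real script would open a file instead
csv_text = "Site name,Code\nBeijing North,VAP\n"
tsv_text = "Site name\tCode\nBeijing North\tVAP\n"

# No delimiter given: the reader splits on commas, i.e. treats the text as CSV
csv_rows = list(csv.reader(io.StringIO(csv_text)))

# delimiter="\t" makes the same reader split on tabs, i.e. treat the text as TSV
tsv_rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))

print(csv_rows)
print(tsv_rows)
```

Both calls produce the same list of rows, which shows that only the delimiter differs between the two formats.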

III. Summary of the basics

1. Read and write TSV files using pandas

2. Read and write JSON files using pandas

IV. Hands-on practice

1. Read and write TSV files using pandas

I explained the difference between CSV and TSV at the beginning of this article, and I’m sure that some of you who read the first article will know how to handle TSV files.

CSV and TSV differ only in the delimiter, so the same functions can read and write both kinds of file. The functions used here, read_csv() and to_csv(), were described in detail in the previous article, so I'll go straight to the example code.

(1) Read TSV file code
import pandas as pd
import os

# Get the current working directory
father_path = os.getcwd()

# Path to the original data file
rpath_tsv = os.path.join(father_path, 'data01', 'city_station.tsv')
# Read the data, using a tab as the separator
tsv_read = pd.read_csv(rpath_tsv, sep="\t")
# Display the first 10 rows of data
print(tsv_read.head(10))

The results

       Site name code
0  Beijing North  VAP
1   Beijing East  BOP
2        Beijing  BJP
3  Beijing South  VNP
4   Beijing West  BXP
(2) Write TSV file code
import pandas as pd
import os

# Get the current working directory
father_path = os.getcwd()

# Path where the data file will be saved
path_tsv = os.path.join(father_path, 'data01', 'temp_city.tsv')

data = {"Site name": ["Beijing North", "Beijing East", "Beijing", "Beijing South", "Beijing West"],
        "Code": ["VAP", "BOP", "BJP", "VNP", "BXP"]}
df = pd.DataFrame(data)
# Write with a tab separator and without the index column
df.to_csv(path_tsv, sep="\t", index=False)

(3) Bonus

Python's csv module can also read and write CSV and TSV files directly:

csv.reader(csvfile, dialect='excel', **fmtparams)
csv.writer(csvfile, dialect='excel', **fmtparams)
  • csvfile: for csv.reader(), any object that supports the iterator protocol and returns a string on each iteration, such as a file object or a list; for csv.writer(), any object with a write() method. If it is a file object, it should be opened with newline='' in Python 3 (the "b" flag was a Python 2 requirement).
  • dialect: the formatting style. The default is 'excel', which separates values with commas (,). Custom dialects are also supported.
  • fmtparams: format parameters used to override individual settings of the dialect specified above.
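The three parameters above can be seen together in a short round trip. This is a sketch under a few assumptions: the rows and the file name temp_city.tsv are illustrative, and a temporary directory is used so the example does not depend on the data01 folder used elsewhere in this article.

```python
import csv
import os
import tempfile

rows = [["Site name", "Code"], ["Beijing North", "VAP"], ["Beijing East", "BOP"]]

# A temporary directory keeps the sketch self-contained
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "temp_city.tsv")

    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        # delimiter="\t" is a fmtparam that overrides the comma of the 'excel' dialect
        writer = csv.writer(f, dialect="excel", delimiter="\t")
        writer.writerows(rows)

    with open(path, newline="", encoding="utf-8") as f:
        # 'excel-tab' is the built-in dialect for tab-separated files
        read_back = list(csv.reader(f, dialect="excel-tab"))

print(read_back)
```

Writing with dialect="excel" plus delimiter="\t" and reading with dialect="excel-tab" are two ways of expressing the same format, so the rows read back match the rows written.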
2. Read and write JSON files using pandas
(1) Read a JSON file using pandas
import pandas as pd
import os

# Get the current working directory
father_path = os.getcwd()
# Path to the original data file
rpath_json = os.path.join(father_path, 'data01', 'realEstate_trans.json')
json_read = pd.read_json(rpath_json)

# Print the first 10 rows
print(json_read.head(10))

Function analysis

read_json(path_or_buf,orient,encoding,numpy)

Common parameter analysis:

  • path_or_buf: the file path (a file-like object is also accepted).
  • orient: the expected layout of the JSON string. to_json() generates compatible JSON when called with the same orient value. The possible values are:
'split' : dict like {index -> [index], columns -> [columns], data -> [values]}
'records' : list like [{column -> value}, ... , {column -> value}]
'index' : dict like {index -> {column -> value}}
'columns' : dict like {column -> {index -> value}}
'values' : just the values array
  • encoding: a string naming the encoding used to decode the data. The default value is 'utf-8'.
  • numpy: a Boolean, False by default. If True, the data is decoded directly into NumPy arrays. Only numeric data is supported, although column and index labels may be non-numeric. Also note that with numpy=True the JSON ordering must be the same for each term. This parameter was deprecated in pandas 1.0 and removed in pandas 2.0, so it only applies to older versions.
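To make the orient layouts more concrete, here is a small sketch that reads the same two rows written in three of the layouts listed above. The JSON strings are hand-written for illustration (the beds and price values are borrowed from the sample records in this article), and io.StringIO is used in place of a file path.

```python
import io
import pandas as pd

# The same two rows, written by hand in three different layouts
json_split = '{"columns": ["beds", "price"], "index": [0, 1], "data": [[2, 59222], [3, 68212]]}'
json_records = '[{"beds": 2, "price": 59222}, {"beds": 3, "price": 68212}]'
json_columns = '{"beds": {"0": 2, "1": 3}, "price": {"0": 59222, "1": 68212}}'

# orient tells read_json() which layout to expect
df_split = pd.read_json(io.StringIO(json_split), orient="split")
df_records = pd.read_json(io.StringIO(json_records), orient="records")
df_columns = pd.read_json(io.StringIO(json_columns), orient="columns")

print(df_split)
print(df_records)
print(df_columns)
```

All three calls should yield a DataFrame with the columns beds and price and the same two rows; only the shape of the JSON text differs.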
(2) Write a JSON file using pandas
import pandas as pd
import os

# Get the current working directory
father_path = os.getcwd()
# Path where the data file will be stored
wpath_json = os.path.join(father_path, 'data01', 'temp_trans.json')
data = [
    {"city": "SACRAMENTO", "longitude": -121.434879, "street": "3526 HIGH ST", "sq__ft": 836, "latitude": 38.631913, "sale_date": "Wed May 21 00:00:00 EDT 2008", "zip": 95838, "beds": 2, "type": "Residential", "state": "CA", "baths": 1, "price": 59222},
    {"city": "SACRAMENTO", "longitude": -121.431028, "street": "51 OMAHA CT", "sq__ft": 1167, "latitude": 38.478902, "sale_date": "Wed May 21 00:00:00 EDT 2008", "zip": 95823, "beds": 3, "type": "Residential", "state": "CA", "baths": 1, "price": 68212},
    {"city": "SACRAMENTO", "longitude": -121.443839, "street": "2796 BRANCH ST", "sq__ft": 796, "latitude": 38.618305, "sale_date": "Wed May 21 00:00:00 EDT 2008", "zip": 95815, "beds": 2, "type": "Residential", "state": "CA", "baths": 1, "price": 68880},
]
df = pd.DataFrame(data)
# Write the DataFrame to a JSON file
df.to_json(wpath_json)

Function analysis

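to_json(path_or_buf, orient, index)

  • path_or_buf and orient have the same meaning as in read_json(). Unlike read_json(), to_json() has no encoding parameter; its force_ascii parameter (True by default) controls whether non-ASCII characters are escaped in the output.
  • index: whether to write the index. False means the index is not written, which pandas only supports for certain orient values such as 'split'. By default the index is written.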
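The effect of orient and index on the output can be sketched as follows. The two-row DataFrame is illustrative (values borrowed from the sample records above), and to_json() is called without a path so that it returns the JSON string instead of writing a file.

```python
import json
import pandas as pd

df = pd.DataFrame({"beds": [2, 3], "price": [59222, 68212]})

# No path given: to_json() returns the JSON text, using the default 'columns' layout
default_json = df.to_json()

# orient="split" with index=False leaves the index out of the output
no_index_json = df.to_json(orient="split", index=False)

print(default_json)
print(no_index_json)

# Parsed back with the standard json module for easy inspection
print(json.loads(no_index_json).keys())
```

In the default output each column maps index labels to values, while the split output without the index contains only the columns and data entries.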

The loads() and dumps() functions of Python's json module (together with load() and dump() for file objects) can also be used to read and write JSON.
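As a minimal sketch of the json module route, the example below serializes one record to a string and back, then does the same through a file. The record is a shortened, illustrative version of the sample data above, and the temporary directory and file name are assumptions made to keep the example self-contained.

```python
import json
import os
import tempfile

record = {"city": "SACRAMENTO", "beds": 2, "price": 59222}

# dumps() serializes a Python object to a JSON string; loads() parses it back
text = json.dumps(record)
parsed = json.loads(text)

# dump() and load() do the same with file objects
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "temp_trans.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump([record], f)
    with open(path, encoding="utf-8") as f:
        from_file = json.load(f)

print(text)
print(parsed)
print(from_file)
```

Unlike pandas, the json module works with plain dicts and lists, so it is a reasonable choice when the data does not need to become a DataFrame.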

V. Parting words

I have always believed that to learn a language well, the foundation matters most, so don't dismiss the basics as too easy. Master the basics and you can become an expert.

Persistence and hard work bring results.

The idea is very complicated,

The implementation is interesting,

As long as you don’t give up,

Fame will come.

— Old Watch doggerel

See you next time. I’m a cat lover and a tech lover. If you find this article helpful, please like, comment and follow me!