This is the 18th day of my participation in the First Challenge 2022.

Today I want to introduce Faker, a remarkably handy Python library for generating fake data. In the era of big data, the value of data is obvious. For testing we often need to simulate a realistic environment, but we cannot use real data directly, so we have to create some. Compared with Excel, I find that creating this kind of "virtual" data in Python takes far less time and effort.

Over the weekend I remembered that I had solved exactly this problem before, so here is a recap for everyone.

Requirement: my boss asked me to simulate a batch of data for a project experiment. Because the real data cannot be shown, I need to generate fake records containing a name, province, detailed address, mobile phone number, ID number, date of birth, email address, and so on. The whole batch must be written to an Excel file and handed over in one go. How would you handle this request?

Hands-on: writing 10,000 simulated records to Excel

Before covering the basics, let's jump straight into practice and see how to write simulated data to an Excel file.

from faker import Faker
import pandas as pd

fake = Faker(["zh_CN"])
Faker.seed(0)  # fixed seed so repeated runs produce the same data

# Column names, in the same order as the values built in get_data()
key_list = ["name", "full address", "province", "mobile number", "ID number", "date of birth", "email"]

def get_data():
    name = fake.name()
    address = fake.address()
    province = address[:3]       # leading characters of the address
    number = fake.phone_number()
    id_card = fake.ssn()
    birth_date = id_card[6:14]   # characters 7-14 of the ID number hold YYYYMMDD
    email = fake.email()
    info_list = [name, address, province, number, id_card, birth_date, email]
    return dict(zip(key_list, info_list))

# Collect all rows first, then build the DataFrame once
# (cheaper than calling pd.concat inside the loop)
rows = [get_data() for _ in range(10000)]
df = pd.DataFrame(rows, columns=key_list)
df.to_excel("simulation data.xlsx", index=False)  # needs an Excel engine such as openpyxl

The script produces an Excel file with one row per simulated person and one column per field.

All of the data above is fabricated; any resemblance to real people is purely coincidental.

Introducing the Faker library

How do you use such a handy Python library? First, install it with the following command.

pip install Faker -i https://pypi.tuna.tsinghua.edu.cn/simple/

To use it, import the library using the following code.

from faker import Faker

Now that we have seen the Excel example, let's look at how each function used in it works.

1. Generate a name

fake = Faker(locale='zh_CN')
name = fake.name()
name

2. Generate a detailed address

address = fake.address()
address

3. Generate the province

province = address[:3]
province

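Because fake.address() returns a different value on every call, I slice the province out of the address that was already generated, so the two fields stay consistent. Note that a fixed three-character slice only fits province names that are exactly three characters long. Faker also provides a dedicated function that generates a province on its own: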

fake.province()
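
If the province has to match the address and may be longer than three characters, one option is to cut the address at the first province-level suffix instead of at a fixed position. The following is only a sketch: it assumes the address starts with the province-level name (the same assumption the address[:3] slice makes), and both the pattern and the extract_province helper are illustrative names, not part of Faker.

```python
import re

# Province-level names end in one of these suffixes.
# The pattern and helper below are illustrative, not part of Faker.
PROVINCE_PATTERN = re.compile(r"^.+?(?:特别行政区|自治区|省|市)")


def extract_province(address):
    """Return the leading province-level name, or the first three characters as a fallback."""
    match = PROVINCE_PATTERN.match(address)
    return match.group(0) if match else address[:3]


# Hand-written sample strings, not Faker output.
print(extract_province("山东省济南市历下区"))      # 山东省
print(extract_province("内蒙古自治区呼和浩特市"))  # 内蒙古自治区
```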

4. Generate a mobile phone number

number = fake.phone_number()
number

5. Generate an ID number

id_card = fake.ssn()
id_card

6. Generate date of birth

birth_date = id_card[6:14]
birth_date
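
The slice works because characters 7 to 14 of an 18-character ID number encode the date of birth as YYYYMMDD. If you want a real date object rather than an eight-character string, a small helper along these lines may be useful. The name birth_date_from_id and the sample number are made up for this sketch.

```python
from datetime import datetime


def birth_date_from_id(id_card):
    """Parse characters 7-14 (YYYYMMDD) of an 18-character ID number into a date."""
    return datetime.strptime(id_card[6:14], "%Y%m%d").date()


# A made-up number used only to illustrate the slicing; not a real ID.
sample_id = "110101199003070003"
print(birth_date_from_id(sample_id))  # 1990-03-07
```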

7. Generate an email address

email = fake.email()
email

Supplement

Of course, Faker can generate much more than the information above. Its many other methods fall into the following categories:

  • address: addresses
  • person: gender, names, etc.
  • barcode: barcodes
  • color: colors
  • company: company name, company email address, company name prefix
  • credit_card: card number, expiry date, card type, etc.
  • currency: currencies
  • date_time: dates, years, months, etc.
  • file: file name, file type, file extension
  • internet: Internet-related data
  • job: jobs
  • lorem: filler ("lorem ipsum") text
  • misc: miscellaneous
  • phone_number: mobile number, carrier number segment
  • python: Python data
  • profile: personal profile (name, gender, address, company, etc.)
  • ssn: social security number (ID card number)
  • user_agent: user agents

For details on how to use these methods, see Faker's official documentation, which is very easy to navigate: faker.readthedocs.io/en/master/providers.html

1. address: addresses

fake.country()         # country
fake.city()            # city
fake.city_suffix()     # city suffix (city or county)
fake.address()         # full address
fake.street_address()  # street address
fake.street_name()     # street name
fake.postcode()        # postal code
fake.latitude()        # latitude
fake.longitude()       # longitude

2. person: people

fake.name()             # full name
fake.last_name()        # surname
fake.first_name()       # given name
fake.name_male()        # male full name
fake.first_name_male()  # male given name
fake.name_female()      # female full name

3. color: colors

fake.rgb_css_color()    # RGB color in CSS notation
fake.rgb_color()        # string representing an RGB color
fake.color_name()       # color name
fake.safe_hex_color()   # web-safe hexadecimal color
fake.safe_color_name()  # web-safe color name

4. company: companies

fake.company()  # company name

5. credit_card: bank credit cards

fake.credit_card_provider(card_type=None)       # card provider
fake.credit_card_security_code(card_type=None)  # card security code
fake.credit_card_expire()                       # card expiry date
fake.credit_card_full(card_type=None)           # complete card information

6. date_time: dates and times

fake.date_time(tzinfo=None)  # random date and time
fake.iso8601(tzinfo=None)    # date and time in ISO 8601 format
fake.date_time_this_month(before_now=True, after_now=False, tzinfo=None)    # a date and time in the current month
fake.date_time_this_year(before_now=True, after_now=False, tzinfo=None)     # a date and time in the current year
fake.date_time_this_decade(before_now=True, after_now=False, tzinfo=None)   # a date and time in the current decade
fake.date_time_this_century(before_now=True, after_now=False, tzinfo=None)  # a date and time in the current century
fake.date_time_between(start_date="-30y", end_date="now", tzinfo=None)      # a random date and time between two points in time
fake.timezone()                # time zone
fake.time(pattern="%H:%M:%S")  # time (customizable format)
fake.am_pm()                   # random AM or PM
fake.day_of_week()             # day of the week
fake.day_of_month()            # day of the month
fake.time_delta()              # timedelta
fake.date_object()             # date object
fake.unix_time()               # Unix timestamp
fake.date(pattern="%Y-%m-%d")  # random date (customizable format)
fake.date_time_ad(tzinfo=None) # random date and time in the Common Era

7. file: files

fake.file_name(category="image")    # file name within a given category
fake.file_name()                    # file name
fake.file_extension(category=None)  # file extension
fake.mime_type(category=None)       # MIME type

8. internet: Internet

fake.ipv4(network=False)      # IPv4 address
fake.ipv6(network=False)      # IPv6 address
fake.uri_path(deep=None)      # URI path
fake.uri_extension()          # URI extension
fake.uri()                    # URI
fake.url()                    # URL
fake.image_url(width=None)    # image URL
fake.domain_word()            # domain word
fake.domain_name()            # domain name
fake.tld()                    # top-level domain
fake.user_name()              # user name
fake.user_agent()             # user agent
fake.mac_address()            # MAC address
fake.safe_email()             # safe email address
fake.free_email()             # free email address
fake.company_email()          # company email address
fake.email()                  # email address

9. job: jobs

fake.job()  # job title

10. lorem: filler text

fake.word()             # a random word
fake.words(nb=3)        # several random words
fake.sentence()         # a random sentence
fake.sentences(nb=3)    # several random sentences
fake.paragraph()        # a random paragraph

11. phone_number: phone numbers

fake.phone_number()        # mobile phone number
fake.phonenumber_prefix()  # carrier number segment (the first three digits of the phone number)

12. ssn: social security number (ID card number)

fake.ssn()  # randomly generated ID number (18 digits)
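
If downstream code validates ID numbers, it can help to check the last character of a generated number before using it. The sketch below implements the standard check-character rule for 18-character ID numbers (weighted sum of the first 17 digits, modulo 11). The function names and the sample number are made up for this illustration, and it makes no claim about what fake.ssn() guarantees.

```python
# Weights and check characters used by the 18-character ID number format.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"


def check_char(first_17_digits):
    """Compute the expected 18th character from the first 17 digits."""
    total = sum(int(d) * w for d, w in zip(first_17_digits, WEIGHTS))
    return CHECK_CHARS[total % 11]


def has_valid_check_char(id_card):
    """Return True if id_card has 18 characters and a matching check character."""
    if len(id_card) != 18 or not id_card[:17].isdigit():
        return False
    return check_char(id_card[:17]) == id_card[17].upper()


# A made-up number used only for illustration; not a real ID.
print(has_valid_check_char("110101199003070003"))  # True
```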

13. user_agent: user agents

fake.user_agent()