Faker is a powerful Python library for generating fake data.

When developing programs, we often need some test data, and I suspect most of us build it like this:

test1
test01
test02
Test 1
Test 2
Test data 1
This is a test text
this is a long, long, long test text...

Hands up if this sounds like you.

Not only do you have to type the test data by hand, it also looks obviously fake. What can we do about it? Is there something that can automatically generate realistic data for us? As it happens, there is!

Python has a handy library called Faker that automatically generates all kinds of realistic-looking “fake” data for us. Let’s take a look!

Installation

Installing the library is simple: just use pip. For Python 3, the command is:

pip3 install faker


Once installed, let’s try using the simplest example to generate some fake data:

from faker import Faker
faker = Faker()
print('name:', faker.name())
print('address:', faker.address())
print('text:', faker.text())

First we import the Faker class from the faker package and instantiate it as a Faker object. Then we call its name, address, and text methods in sequence. The output looks like this:

name: Nicholas Wilson
address: 70561 Simmons Road Apt. 893
Lake Raymondville, HI 35240
text: Both begin bring federal space.
Official start idea specific. Able under young fire.
Who show line traditional easy people. Until economic lead event case. Technology college his director style.

As you can see, it gives us a realistic-looking English name, address, and long text.

But we are Chinese, so naturally we want to generate Chinese data. Don’t worry: the library supports many languages, Chinese included. For the full list of supported locales, see: https://faker.readthedocs.io/en/master/locales.html.

Here are some of the more common language codes:

  • Simplified Chinese: zh_CN
  • Traditional Chinese: zh_TW
  • American English: en_US
  • British English: en_GB
  • German: de_DE
  • Japanese: ja_JP
  • Korean: ko_KR
  • French: fr_FR

To generate Chinese data, pass the corresponding language code as the first argument to the Faker class. For Simplified Chinese, pass in zh_CN, so the code above becomes:

from faker import Faker
faker = Faker('zh_CN')
print('name:', faker.name())
print('address:', faker.address())
print('text:', faker.text())

The running results are as follows:

name: He Lin
address: Block F, Nanxi Beizhen Street, Liupanshui County, Ningxia Hui Autonomous Region 912311
text: Business software integral starting times major One kind of administration solves two kinds of people. Support only local everything. Culture is nothing but so. The system feels this is why they. Time and so on continues to be a state of prestige. Site password situation. The problem is that or. In fact, the process is detailed. Chinese historical environment telephone regulations. Experience Shanghai control don't live. Friends run projects on us. Later today those using free countries join however. The maximum number of Spaces is one. Date passed to Get Beijing, Japan.

You can see that a Chinese name, address, and long text were generated. The address appears to be a random combination of provinces, prefecture-level cities, county-level cities, and streets. The text is also a random combination of words, but it is still much better than the test data shown at the start of this article.

The code above produces different output each time it runs, because every result is a random combination.

Provider

Next, let’s look in detail at what kinds of data faker can generate. The available APIs are documented at https://faker.readthedocs.io/en/master/locales/zh_CN.html, which lists every available method.

When you open that page, you will see that the methods are grouped under Provider objects.

For the sake of decoupling, the faker library is designed around pluggable Provider objects. Providers can be added to a Faker object, and each one supplies the core implementation for generating a particular kind of data. Think of the Faker object as a generator: its ability to generate data comes from its Providers.

The Faker object could generate the name, address, and text above because it has some Providers built in. Let’s print them out:

from faker import Faker
faker = Faker('zh_CN')
print(faker.providers)

The running results are as follows:

[<faker.providers.user_agent.Provider object at 0x10249de48>, <faker.providers.ssn.zh_CN.Provider object at 0x10249dc18>, <faker.providers.python.Provider object at 0x10249dd68>, <faker.providers.profile.Provider object at 0x10249dcc0>, <faker.providers.phone_number.zh_CN.Provider object at 0x10249dc88>, <faker.providers.person.zh_CN.Provider object at 0x10249de80>, <faker.providers.misc.Provider object at 0x10249df60>, <faker.providers.lorem.zh_CN.Provider object at 0x10249dc50>, <faker.providers.job.zh_CN.Provider object at 0x10249de10>, <faker.providers.isbn.Provider object at 0x10249c6d8>, <faker.providers.internet.zh_CN.Provider object at 0x10249c828>, <faker.providers.geo.en_US.Provider object at 0x102484748>, <faker.providers.file.Provider object at 0x102484828>, <faker.providers.date_time.en_US.Provider object at 0x1023789e8>, <faker.providers.currency.Provider object at 0x102484780>, <faker.providers.credit_card.Provider object at 0x1024845f8>, <faker.providers.company.zh_CN.Provider object at 0x102499ef0>, <faker.providers.color.en_US.Provider object at 0x1023532e8>, <faker.providers.barcode.Provider object at 0x101cb6d30>, <faker.providers.bank.en_GB.Provider object at 0x102378f98>, <faker.providers.automotive.en_US.Provider object at 0x1017a5c50>, <faker.providers.address.zh_CN.Provider object at 0x101787c18>]


There are quite a few providers, such as user_agent, phone_number, isbn, and credit_card. Some of them vary by locale. For example, phone_number generates phone numbers, whose format differs from locale to locale, so it has an extra zh_CN layer to distinguish the locale.

In this way, general-purpose providers live directly in a module named after the provider category, while locale-dependent providers are further split into one module per locale. The design is sensible: easy to extend and free of redundancy.

Now that we know Faker has so many providers, let’s see how the name, address, and other methods we just called relate to them.

Let’s print the name, address, and text methods themselves:

from faker import Faker
faker = Faker('zh_CN')
print('name:', faker.name)
print('address:', faker.address)
print('text:', faker.text)

Note that the three methods are not called here; they are printed directly, so what we get is the description of each method object:

name: <bound method Provider.name of <faker.providers.person.zh_CN.Provider object at 0x10f6dea58>>
address: <bound method Provider.address of <faker.providers.address.zh_CN.Provider object at 0x10e9e6cf8>>
text: <bound method Provider.text of <faker.providers.lorem.zh_CN.Provider object at 0x10f6dfda0>>

Now it becomes clear: the methods we call on the Faker object are really the corresponding methods of its Providers. For example, name is the name method of faker.providers.person.zh_CN.Provider; the two are one and the same. Let’s dig into the source to verify this. The source is at https://github.com/joke2k/faker/blob/master/faker/providers/person/__init__.py, and sure enough, it defines the name method. Faker then brings this method in dynamically, so it is ready to use.

Method list

Since there are so many providers, let’s take a detailed look at the other commonly used methods. The full reference is at:

https://faker.readthedocs.io/en/master/providers.html.

Address

Address is used to generate address-related data, such as full addresses, cities, postcodes, and streets. Usage is as follows:

faker.address()
# 'Room 253105, Block D, Wuhan Street, Nanhu, Jie County, Xinjiang Uygur Autonomous Region'
faker.building_number()
# 'Tower B'
faker.city()
# 'Lu county'
faker.city_name()
# 'Guiyang'
faker.city_suffix()
# 'County'
faker.country()
# 'Alaska'
faker.country_code(representation="alpha-2")
# 'CR'
faker.district()
# 'West Peak'
faker.postcode()
# '726749'
faker.province()
# 'Fujian'
faker.street_address()
# 'Road N block'
faker.street_name()
# 'Li Lu'
faker.street_suffix()
# 'road'

Color

Color is used to generate color-related data, such as color names, HEX values, and RGB values. Usage is as follows:

faker.color_name()
# 'DarkKhaki'
faker.hex_color()
# '#97d14e'
faker.rgb_color()
# '107,179,...' (comma-separated R,G,B values)
faker.rgb_css_color()
# 'rgb(20,46,70)'
faker.safe_color_name()
# 'navy'
faker.safe_hex_color()
# '#dd2200'

Company

Company is used to generate company-related data, such as company names, prefixes, and suffixes. Usage is as follows:

faker.bs()
# 'grow rich initiatives'
faker.catch_phrase()
# 'Self-enabling encompassing function'
faker.company()
# 'Heng Cong Baihui Network Limited'
faker.company_prefix()
# 'Huilai Computer'
faker.company_suffix()
# 'Information LTD.'

Credit Card

Credit Card is used to generate credit-card-related data, such as expiry dates, card numbers, and security codes. Usage is as follows:

faker.credit_card_expire(start="now", end="+10y", date_format="%m/%y")
# '08/20'
faker.credit_card_full(card_type=None)
# 'Mastercard\n \n5183689713096897 01/25\nCVV: 012\n'
faker.credit_card_number(card_type=None)
# '4009911097184929918'
faker.credit_card_provider(card_type=None)
# 'JCB 15 digit'
faker.credit_card_security_code(card_type=None)
# '259'

Date Time

Date Time is used to generate date- and time-related data, such as years, months, days of the week, and dates of birth. Many of its methods return date or datetime objects. Usage is as follows:

faker.am_pm()
# 'AM'
faker.century()
# 'X'
faker.date(pattern="%Y-%m-%d", end_datetime=None)
# '1997-06-16'
faker.date_between(start_date="-30y", end_date="today")
# datetime.date(2000, 8, 30)
faker.date_between_dates(date_start=None, date_end=None)
# datetime.date(2019, 7, 30)
faker.date_object(end_datetime=None)
# datetime.date(1978, 3, 12)
faker.date_of_birth(tzinfo=None, minimum_age=0, maximum_age=115)
# datetime.date(2012, 6, 3)
faker.date_this_century(before_today=True, after_today=False)
# datetime.date(2011, 6, 12)
faker.date_this_decade(before_today=True, after_today=False)
# datetime.date(2011, 8, 22)
faker.date_this_month(before_today=True, after_today=False)
# datetime.date(2019, 7, 25)
faker.date_this_year(before_today=True, after_today=False)
# datetime.date(2019, 7, 22)
faker.date_time(tzinfo=None, end_datetime=None)
# datetime.datetime(2018, 8, 11, 22, 3, 34)
faker.date_time_ad(tzinfo=None, end_datetime=None, start_datetime=None)
# datetime.datetime(1566, 8, 26, 16, 25, 30)
faker.date_time_between(start_date="-30y", end_date="now", tzinfo=None)
# datetime.datetime(2015, 1, 31, 4, 14, 10)
faker.date_time_between_dates(datetime_start=None, datetime_end=None, tzinfo=None)
# datetime.datetime(2019, 7, 30, 17, 51, 44)
faker.date_time_this_century(before_now=True, after_now=False, tzinfo=None)
# datetime.datetime(2002, 9, 25, 23, 59, 49)
faker.date_time_this_decade(before_now=True, after_now=False, tzinfo=None)
# datetime.datetime(2010, 5, 25, 20, 20, 52)
faker.date_time_this_month(before_now=True, after_now=False, tzinfo=None)
# datetime.datetime(2019, 7, 19, 18, 4, 6)
faker.date_time_this_year(before_now=True, after_now=False, tzinfo=None)
# datetime.datetime(2019, 3, 15, 11, 4, 18)
faker.day_of_month()
# '04'
faker.day_of_week()
# 'Monday'
faker.future_date(end_date="+30d", tzinfo=None)
# datetime.date(2019, 8, 12)
faker.future_datetime(end_date="+30d", tzinfo=None)
# datetime.datetime(2019, 8, 24, 2, 59, 4)
faker.iso8601(tzinfo=None, end_datetime=None)
# '1987-07-01T18:33:56'
faker.month()
# '11'
faker.month_name()
# 'August'
faker.past_date(start_date="-30d", tzinfo=None)
# datetime.date(2019, 7, 25)
faker.past_datetime(start_date="-30d", tzinfo=None)
# datetime.datetime(2019, 7, 18, 22, 46, 51)
faker.time(pattern="%H:%M:%S", end_datetime=None)
# '16:22:30'
faker.time_delta(end_datetime=None)
# datetime.timedelta(0)
faker.time_object(end_datetime=None)
# datetime.time(22, 12, 15)
faker.time_series(start_date="-30d", end_date="now", precision=None, distrib=None, tzinfo=None)
# <generator object Provider.time_series at 0x7fcbce0604f8>
faker.timezone()
# 'Indian/Comoro'
faker.unix_time(end_datetime=None, start_datetime=None)
# 1182857626
faker.year()
# '1970'

File

File is used to generate file-related data, including file extensions, file paths, MIME types, and disk partitions. Usage is as follows:

faker.file_extension(category=None)
# 'flac'
faker.file_name(category=None, extension=None)
# 'then.numbers'
faker.file_path(depth=1, category=None, extension=None)
# '/relationship/technology.mov'
faker.mime_type(category=None)
# 'video/ogg'
faker.unix_device(prefix=None)
# '/dev/sdd'
faker.unix_partition(prefix=None)
# '/dev/xvds3'

Geo

Geo is used to generate location-related data, including latitude, longitude, and time zones. Usage is as follows:

faker.coordinate(center=None, radius=0.001)
# Decimal('114.420686')
faker.latitude()
# Decimal('9.772541')
faker.latlng()
# (Decimal('27.0730915'), Decimal('5.919460'))
faker.local_latlng(country_code="US", coords_only=False)
# ('41.47892', '-87.45476', 'Schererville', 'US', 'America/Chicago')
faker.location_on_land(coords_only=False)
# ('12.74482', '4.52514', 'Argungu', 'NG', 'Africa/Lagos')
faker.longitude()
# Decimal('40.885895')

Internet

Internet is used to generate Internet-related data, including random email addresses, domain names, IP addresses, URLs, user names, and domain suffixes. Usage is as follows:

faker.ascii_company_email(*args, **kwargs)
# '[email protected]'
faker.ascii_email(*args, **kwargs)
# '[email protected]'
faker.ascii_free_email(*args, **kwargs)
# '[email protected]'
faker.ascii_safe_email(*args, **kwargs)
# '[email protected]'
faker.company_email(*args, **kwargs)
# '[email protected]'
faker.domain_name(levels=1)
# 'xiulan.cn'
faker.domain_word(*args, **kwargs)
# 'luo'
faker.email(*args, **kwargs)
# '[email protected]'
faker.free_email(*args, **kwargs)
# '[email protected]'
faker.free_email_domain(*args, **kwargs)
# 'yahoo.com'
faker.hostname(*args, **kwargs)
# 'lt-18.pan.cn'
faker.image_url(width=None, height=None)
# 'https://placekitten.com/51/201'
faker.ipv4(network=False, address_class=None, private=None)
# '192.233.68.5'
faker.ipv4_network_class()
# 'a'
faker.ipv4_private(network=False, address_class=None)
# '10.9.97.93'
faker.ipv4_public(network=False, address_class=None)
# '192.51.22.7'
faker.ipv6(network=False)
# 'de57:9c6f:a38c:9864:10ec:6442:775d:5f02'
faker.mac_address()
# '99:80:5c:ab:8c:a9'
faker.safe_email(*args, **kwargs)
# '[email protected]'
faker.slug(*args, **kwargs)
# ' '
faker.tld()
# 'cn'
faker.uri()
# 'http://fangfan.org/app/tag/post/'
faker.uri_extension()
# '.php'
faker.uri_page()
# 'about'
faker.uri_path(deep=None)
# 'app'
faker.url(schemes=None)
# 'http://mingli.cn/'
faker.user_name(*args, **kwargs)
# 'jie54'

Job

Job is used to generate job-related data, as follows:

faker.job()
# 'Ironing'

Lorem

Lorem is used to generate fake text, including sentences, paragraphs, long texts, and individual words. You can pass different parameters to control the length of the generated text. Usage is as follows:

faker.paragraph(nb_sentences=3, variable_nb_sentences=True, ext_word_list=None)
# 'includes those points to report. Picture address basically all.'
faker.paragraphs(nb=3, ext_word_list=None)
# ['The plan provides so so organize the merchandise among them. Participate to become different publishing regions. Elite Technology thank you for your needs.',
# 'very relevant is that one is one article at a time. Add those as well as the following to your.',
# 'Students should come out and analyze increasing relationship organization. Friend registration should require units. Feeling finally unable to find the choice of the people.']
faker.sentence(nb_words=6, variable_nb_words=True, ext_word_list=None)
# 'Introduce the result and handle it yourself.'
faker.sentences(nb=3, ext_word_list=None)
# ['viewing actually a learning login browse is one of them.', 'and resources of people things.', 'Technology prices free college education.']
faker.text(max_nb_chars=200, ext_word_list=None)
# ('only current Domestic Chinese so. Prestige system online though.\n'
# 'Picture people are very cooperative with this kind of thanks for the update. Name detailed Direct Society all the way home completely.\n'
# 'important more as long as the market. It has to be just student music. System USA category these all environments.\n'
# 'But then the people of America about.\n'
# 'case professional international see research. Music Environment Market Search discovery.\n'
# 'tool is still where people are today. Message writer brand engineering project must. Now our news should be related.\n'
# 'Update the economic capacity of all resources if. The phone can log into the country.')
faker.texts(nb_texts=3, max_nb_chars=200, ext_word_list=None)
# ['success may recommend your industry. Region and recommend.\n'
# 'Networking continues to be a major must. Start security service.\n'
# 'should pass online later through the university. Management requirements related to international reading are current. In order to should result click on the company to start how.\n'
# 'success at one time maximum production site. This join her address is limited.\n'
# 'According to the news car up very subject display must. Some construction comes from author telephone support.\n'
# 'Just resources or because of economic things like. For what Chinese size to get served. Network password is free to join a community welcome.',
# 'Sector activity technology. Commodity impact occurs industry password completion. Is the department results data study of course. Or help the city ask home market to educate you.\n'
# 'Professional full analysis handles what city University does.\n'
# 'file is very international all up integral company. The material is not the film. This is what this site needs.\n'
# 'Cooperation is not as important as the current market development space. Your member recommends successful education for China.\n'
# 'file is not if comments. Because the empirical device prescribes.\n'
# 'Join together to influence everyone online to run online if. Engineering enterprises of this kind later.',
# 'Space market appears must be basic phone. Display a standard other design work. Project constant news issues more updates so.\n'
# 'Profiles online content together will not. Anyone who knows a variety of two. Category things run then the investment market.\n'
# 'Those who use the introduction company friends people you browse. Should express a little general note mainly thank you. Call back to experience a source to join.\n'
# 'district law otherwise expressed though. Participate in social like limited forums generally published. Category current culture can.\n'
# 'Report quality work primarily. Enterprise publish completely. Get name author level two forums just phone.']
faker.word(ext_word_list=None)
# 'Attention'
faker.words(nb=3, ext_word_list=None, unique=False)
# ['responsibility', 'organization', 'later']

The parameters of each method differ. For an explanation of each parameter, see the comments on each method in the source code: https://github.com/joke2k/faker/blob/master/faker/providers/lorem/__init__.py

Misc

Misc is used to generate miscellaneous data such as booleans, passwords, UUIDs, and hash values (MD5, SHA-1, SHA-256). Usage is as follows:

faker.boolean(chance_of_getting_true=50)
# True
faker.md5(raw_output=False)
# '3166fa26ffd3f2a33e020dfe11191ac6'
faker.null_boolean()
# False
faker.password(length=10, special_chars=True, digits=True, upper_case=True, lower_case=True)
# 'W7Ln8La@%O'
faker.sha1(raw_output=False)
# 'c8301a2a79445439ee5287f38053e4b3a05eac79'
faker.sha256(raw_output=False)
# '1e909d331e20cf241aaa2da894deae5a3a75e5cdc35c053422d9b8e7ccfa0402'
faker.uuid4(cast_to=str)
# '6e6fe387-6877-48d9-94ea-4263c4c71aa5'

Person

Person is used to generate name-related data, including last names, first names, full names, and romanized names, with separate methods for male and female names. Usage is as follows:

faker.first_name()
# 'clever'
faker.first_name_female()
# 'fang'
faker.first_name_male()
# ', '
faker.first_romanized_name()
# 'Jing'
faker.last_name()
# 'temperature'
faker.last_name_female()
# 'coach'
faker.last_name_male()
# 'Chen'
faker.last_romanized_name()
# 'Lei'
faker.name()
# 'Huang Ming'
faker.name_female()
# 'Zhang Kai'
faker.name_male()
# 'Huang Peng'

User-Agent

User-Agent is used to generate browser User-Agent strings. You can target specific browsers and pass in version information to control the generated content. Usage is as follows:

faker.chrome(version_from=13, version_to=63, build_from=800, build_to=899)
# ('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/5332 (KHTML, like Gecko) '
# 'Chrome/40.0.837.0 Safari/5332')
faker.firefox()
# ('Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_8_9; rv:1.9.4.20) '
# 'Gecko/2019-05-02 05:58:44 Firefox/3.6.19')
faker.internet_explorer()
# 'Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.2; Trident/3.0)'
faker.linux_platform_token()
# 'X11; Linux i686'
faker.linux_processor()
# 'x86_64'
faker.mac_platform_token()
# 'Macintosh; U; PPC Mac OS X 10_12_5'
faker.mac_processor()
# 'U; Intel'
faker.opera()
# 'Opera/9.77.(Windows NT 4.0; vi-VN) Presto/2.9.182 Version/11.00'
faker.safari()
# ('Mozilla/5.0 (Macintosh; PPC Mac OS X 10_7_1 rv:5.0; or-IN) '
# 'AppleWebKit/535.9.4 (KHTML, like Gecko) Version/5.0.2 Safari/535.9.4')
faker.user_agent()
# 'Opera/8.69.(X11; Linux i686; ml-IN) Presto/2.9.170 Version/11.00'
faker.windows_platform_token()
# 'Windows NT 6.1'

Other Providers

There are also community-contributed providers, for example for WiFi and microservices; see the documentation for details. To use them, you need to install the extension packages and add the providers yourself.

To add a Provider, call the add_provider method as shown in the following example:

from faker import Faker
from faker.providers import internet
faker = Faker()
faker.add_provider(internet)
print(faker.ipv4_private())