First published on the WeChat public account: Python Programming Time

Online blog: python.iswbm.com/en/latest/c…


1. Complaining about the problem

You’re familiar with pprint in Python, right?

Search for how to pretty-print a dictionary or format a string in Python, and most answers will recommend it.

For example, the following JSON string or dictionary (which I picked at random from the Internet) is nearly impossible to read without formatting.

[{"id":1580615."name":"It's leather."."packageName":"com.renren.mobile.android"."iconUrl":"app/com.renren.mobile.android/icon.jpg"."stars":2."size":21803987."downloadUrl":"app/com.renren.mobile.android/com.renren.mobile.android.apk"."des":"Your ironhead was here from 2011 to 2017. The largest real-name SNS network platform in China, Nentou Qing}, {"id":1540629."name":"Nonexistent"."packageName":"com.ct.client"."iconUrl":"app/com.ct.client/icon.jpg"."stars":2."size":4794202."downloadUrl":"app/com.ct.client/com.ct.client.apk"."des":"Betta fish 271934 Don't miss it. It's got the best chicken."}]

If you don't want to stare at a wall of text, try the highly recommended pprint and see what happens (the output below is from Python 2; Python 3 behaves differently).

>>> info=[{"id":1580615."name":"It's leather."."packageName":"com.renren.mobile.android"."iconUrl":"app/com.renren.mobile.android/icon.jpg"."stars":2."size":21803987."downloadUrl":"app/com.renren.mobile.android/com.renren.mobile.android.apk"."des":"Your ironhead was here from 2011 to 2017. The largest real-name SNS network platform in China, Nentou Qing}, {"id":1540629."name":"Nonexistent"."packageName":"com.ct.client"."iconUrl":"app/com.ct.client/icon.jpg"."stars":2."size":4794202."downloadUrl":"app/com.ct.client/com.ct.client.apk"."des":"Betta fish 271934 Don't miss it. It's got the best chicken."}]
>>> 
>>> from pprint import pprint
>>> pprint(info)
[{'des': '2011-2017 \xe4\xbd\xa0\xe7\x9a\x84\xe9\x93\x81\xe5\xa4\xb4\xe5\xa8\x83\xe4\xb8\x80\xe7\x9b\xb4\xe5\x9c\xa8\xe8\xbf\x99\xe5\x84\xbf \xe3\x80\x82\xe4\xb8\xad\xe5\x9b\xbd\xe6\x9c\x80\xe5\xa4\xa7\xe7\x9a\x84\xe5\xae\x9e\xe5\x90\x8d\xe5\x88\xb6SNS\xe7\xbd\ x91\xe7\xbb\x9c\xe5\xb9\xb3\xe5\x8f\xb0\xef\xbc\x8c\xe5\xab\xa9\xe5\xa4\xb4\xe9\x9d\x92'.'downloadUrl': 'app/com.renren.mobile.android/com.renren.mobile.android.apk'.'iconUrl': 'app/com.renren.mobile.android/icon.jpg'.'id': 1580615.'name': '\xe7\x9a\xae\xe7\x9a\x84\xe5\x98\x9b'.'packageName': 'com.renren.mobile.android'.'size': 21803987.'stars': 2},
 {'des': '\xe6\x96\x97\xe9\xb1\xbc271934 \xe8\xb5\xb0\xe8\xbf\x87\xe8\xb7\xaf\xe8\xbf\x87\xe4\xb8\x8d\xe8\xa6\x81\xe9\x94\x99\xe8\xbf\x87\xef\xbc\x8c\xe8\xbf\x99 \xe9\x87\x8c\xe6\x9c\x89\xe6\x9c\x80\xe5\xa5\xbd\xe7\x9a\x84\xe9\xb8\xa1\xe5\x84\xbf'.'downloadUrl': 'app/com.ct.client/com.ct.client.apk'.'iconUrl': 'app/com.ct.client/icon.jpg'.'id': 1540629.'name': '\xe4\xb8\x8d\xe5\xad\x98\xe5\x9c\xa8\xe7\x9a\x84'.'packageName': 'com.ct.client'.'size': 4794202.'stars': 2}]

It does seem to have some effect. It really is a "magic tool".

But tell me, what on earth is \xe4\xbd\xa0\xe7\x9a? Something that was supposed to improve readability has made the text completely unreadable.

Fortunately, I know a little about encoding in Python 2. A string literal without the u prefix is of type str in Python 2, which is really a byte string: it is stored as bytes, so pprint shows the raw bytes.
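To see where those escapes come from, here is a small sketch in Python 3, where the byte string type is called bytes. The variable names are only illustrative. Encoding text to UTF-8 and looking at the repr of the result gives the same \x sequences that the pprint output above starts with.

```python
# Python 3 sketch: bytes plays the role that str plays in Python 2
text = "你的"
raw = text.encode("utf-8")   # text -> UTF-8 encoded bytes

print(repr(raw))             # b'\xe4\xbd\xa0\xe7\x9a\x84'
print(raw.decode("utf-8"))   # decoding restores the readable text
```

Each Chinese character here takes three bytes in UTF-8, which is why a short sentence turns into such a long run of escapes.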

OK, it seems that was my mistake. Let me fix it and define the Chinese strings as unicode.

>>> info = [{"id":1580615."name":U "Leather"."packageName":"com.renren.mobile.android"."iconUrl":"app/com.renren.mobile.android/icon.jpg"."stars":2."size":21803987."downloadUrl":"app/com.renren.mobile.android/com.renren.mobile.android.apk"."des":U "2011-2017 Your ironhead was here the whole time. The largest real-name SNS network platform in China, Nentou Qing}, {"id":1540629."name":U "non-existent"."packageName":"com.ct.client"."iconUrl":"app/com.ct.client/icon.jpg"."stars":2."size":4794202."downloadUrl":"app/com.ct.client/com.ct.client.apk"."des":U "Betta fish 271934 Don't miss it. They have the best chicken."}]
>>> 
>>> from pprint import pprint
>>> pprint(info)
[{'des': u'2011-2017\u4f60\u7684\u94c1\u5934\u5a03\u4e00\u76f4\u5728\u8fd9\u513f\u3002\u4e2d\u56fd\u6700\u5927\u7684\u5b9e\u540d\ u5236SNS\u7f51\u7edc\u5e73\u53f0\uff0c\u5ae9\u5934\u9752'.'downloadUrl': 'app/com.renren.mobile.android/com.renren.mobile.android.apk'.'iconUrl': 'app/com.renren.mobile.android/icon.jpg'.'id': 1580615.'name': u'\u76ae\u7684\u561b'.'packageName': 'com.renren.mobile.android'.'size': 21803987.'stars': 2},
 {'des': u'\u6597\u9c7c271934\u8d70\u8fc7\u8def\u8fc7\u4e0d\u8981\u9519\u8fc7\uff0c\u8fd9\u91cc\u6709\u6700\u597d\u7684\u9e21\u51 3f'.'downloadUrl': 'app/com.ct.client/com.ct.client.apk'.'iconUrl': 'app/com.ct.client/icon.jpg'.'id': 1540629.'name': u'\u4e0d\u5b58\u5728\u7684'.'packageName': 'com.ct.client'.'size': 4794202.'stars': 2}]

Yes, that is a little better, but when I saw the following I was devastated. I have no idea what this says. Am I just not good enough? Am I supposed to be a computer?

u'\u6597\u9c7c271934\u8d70\u8fc7\u8def\u8fc7\u4e0d\u8981\u9519\u8fc7\uff0c\u8fd9\u91cc\u6709\u6700\u597d\u7684\u9e21\u513f'

In addition, JSON strictly requires double quotes, and I used double quotes when defining the dictionary. So why does the output use single quotes? This is too hard on me. Can't I even control my own code?

At this point, we have found two problems with pprint:

  1. It cannot print Chinese properly in Python 2
  2. It cannot produce output in standard JSON format (double quotes)
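The second problem can be checked with a short sketch. It is written for Python 3 with a made-up two-key record, and it uses pformat, which returns the same text that pprint would print. Feeding that text to json.loads fails because of the single quotes.

```python
import json
from pprint import pformat

# Made-up sample record, not the article's data
record = {"name": "com.ct.client", "stars": 2}

formatted = pformat(record)
print(formatted)  # {'name': 'com.ct.client', 'stars': 2}

# pprint emits Python repr syntax, which a JSON parser rejects
try:
    json.loads(formatted)
    is_valid_json = True
except ValueError:
    is_valid_json = False

print(is_valid_json)  # False
```

pprint is built on repr, so its output is Python literal syntax rather than JSON, even when the two happen to look alike.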

2. Solving the problems

Print Chinese

If you are using Python 3, you will find that Chinese will display normally.

# Python3.7
>>> info = [{"id":1580615."name":U "Leather"."packageName":"com.renren.mobile.android"."iconUrl":"app/com.renren.mobile.android/icon.jpg"."stars":2."size":21803987."downloadUrl":"app/com.renren.mobile.android/com.renren.mobile.android.apk"."des":U "2011-2017 Your ironhead was here the whole time. The largest real-name SNS network platform in China, Nentou Qing}, {"id":1540629."name":U "non-existent"."packageName":"com.ct.client"."iconUrl":"app/com.ct.client/icon.jpg"."stars":2."size":4794202."downloadUrl":"app/com.ct.client/com.ct.client.apk"."des":U "Betta fish 271934 Don't miss it. They have the best chicken."}]
>>> 
>>> from pprint import pprint
>>> pprint(info)
[{'des': '2011-2017 Your ironhead was here the whole time. Nentouqing, The largest real-name SNS network platform in China.'downloadUrl': 'app/com.renren.mobile.android/com.renren.mobile.android.apk'.'iconUrl': 'app/com.renren.mobile.android/icon.jpg'.'id': 1580615.'name': 'Leather.'.'packageName': 'com.renren.mobile.android'.'size': 21803987.'stars': 2},
 {'des': 'Betta fish 271934 Don't miss it, they have the best chickens'.'downloadUrl': 'app/com.ct.client/com.ct.client.apk'.'iconUrl': 'app/com.ct.client/icon.jpg'.'id': 1540629.'name': 'nonexistent'.'packageName': 'com.ct.client'.'size': 4794202.'stars': 2}]
>>> 


But a lot of the time (on some company servers, for example) you don't get to choose which version of Python you use. I could simply have given up on pprint, because there are better alternatives (more on that later).

But I was curious. Didn't I write an article about encoding just two days ago? I consider myself fairly proficient with encodings, so I wanted to solve this problem.

After reading the pprint source code, I really did find a solution. If you also want a challenge, stop here and work out your own implementation. I believe reading the source code will help you.

Here is my solution for your reference:

Write your own printer class derived from PrettyPrinter (the printer that pprint uses).

In it, override the format method to check the type of the object passed in. If the object is of type unicode rather than str, encode it to str using UTF-8.

# coding: utf-8
from pprint import PrettyPrinter

# Inherit PrettyPrinter, copy format method
class MyPrettyPrinter(PrettyPrinter):
    def format(self, object, context, maxlevels, level):
        if isinstance(object, unicode):
            return (object.encode('utf8'), True.False)
        return PrettyPrinter.format(self, object, context, maxlevels, level)

info = [{"id":1580615."name":U "Leather"."packageName":"com.renren.mobile.android"."iconUrl":"app/com.renren.mobile.android/icon.jpg"."stars":2."size":21803987."downloadUrl":"app/com.renren.mobile.android/com.renren.mobile.android.apk"."des":U "2011-2017 Your ironhead was here the whole time. The largest real-name SNS network platform in China, Nentou Qing}, {"id":1540629."name":U "non-existent"."packageName":"com.ct.client"."iconUrl":"app/com.ct.client/icon.jpg"."stars":2."size":4794202."downloadUrl":"app/com.ct.client/com.ct.client.apk"."des":U "Betta fish 271934 Don't miss it. They have the best chicken."}]

MyPrettyPrinter().pprint(info)

Running this shows that the Chinese display problem is solved: the strings are printed as readable text instead of escape sequences.
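As a side note, the format hook also exists in Python 3, where the unreadable case is bytes rather than unicode. The sketch below is a hypothetical counterpart (the class name Utf8BytesPrinter is made up, and it assumes the bytes are valid UTF-8). It formats a single top-level value only, because how the hook is applied to values nested inside containers appears to have differed between Python versions.

```python
from pprint import PrettyPrinter

class Utf8BytesPrinter(PrettyPrinter):
    """Hypothetical Python 3 counterpart of MyPrettyPrinter."""

    def format(self, obj, context, maxlevels, level):
        if isinstance(obj, bytes):
            # format() returns (text to print, is readable, is recursive)
            return obj.decode("utf-8"), True, False
        return super().format(obj, context, maxlevels, level)

raw_name = "不存在的".encode("utf-8")

default_text = PrettyPrinter().pformat(raw_name)
custom_text = Utf8BytesPrinter().pformat(raw_name)

print(default_text)  # the \x escape form of the bytes
print(custom_text)   # 不存在的
```

As in the Python 2 version, the returned text carries no quotes, so the result is easier to read but is no longer a valid Python literal.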

Print double quotes

With the Chinese problem solved, let’s see how to get pprint to print double quotes.

When instantiating a PrettyPrinter object, you can pass in a stream object that determines where the output goes. The default is sys.stdout, which is standard output.

Now we need to modify the output by replacing the single quotes with double quotes.

You can define such a stream object yourself. It does not need to inherit from any parent class; it only has to implement a write method.

With that in mind, you can start writing code as follows:

# coding: utf-8
from pprint import PrettyPrinter

class MyPrettyPrinter(PrettyPrinter):
    def format(self, object, context, maxlevels, level):
        if isinstance(object, unicode):
            return (object.encode('utf8'), True.False)
        return PrettyPrinter.format(self, object, context, maxlevels, level)

class MyStream(a):
    def write(self, text):
        print text.replace('\' '.'"')

info = [{"id":1580615."name":U "Leather"."packageName":"com.renren.mobile.android"."iconUrl":"app/com.renren.mobile.android/icon.jpg"."stars":2."size":21803987."downloadUrl":"app/com.renren.mobile.android/com.renren.mobile.android.apk"."des":U "2011-2017 Your ironhead was here the whole time. The largest real-name SNS network platform in China, Nentou Qing}, {"id":1540629."name":U "non-existent"."packageName":"com.ct.client"."iconUrl":"app/com.ct.client/icon.jpg"."stars":2."size":4794202."downloadUrl":"app/com.ct.client/com.ct.client.apk"."des":U "Betta fish 271934 Don't miss it. They have the best chicken."}]
MyPrettyPrinter(stream=MyStream()).pprint(info)

I tried it and, oh my god, this is what came out:

[{"des"
: 
2011- 2017.Your ironhead was here the whole time. Nentouqing, the largest real-name SNS network platform in China,"downloadUrl": 
"app/com.renren.mobile.android/com.renren.mobile.android.apk"
,
  "iconUrl": 
"app/com.renren.mobile.android/icon.jpg"
,
  "id": 
1580615
,
  "name": Leather."packageName": 
"com.renren.mobile.android"
,
  "size": 
21803987
,
  "stars": 
2}, {"des": bettas271934Don't miss it, they have the best chicken,"downloadUrl": 
"app/com.ct.client/com.ct.client.apk"
,
  "iconUrl": 
"app/com.ct.client/icon.jpg"
,
  "id": 
1540629
,
  "name": Does not exist,"packageName": 
"com.ct.client"
,
  "size": 
4794202
,
  "stars": 
2}]Copy the code

After some research, it turns out that this happens because the print statement appends a newline by default.

So how do I get print to output without a line break?

It's easy, though I'm sure many people don't know it: in Python 2, just put a comma after the content to be printed, as in print text, (note the trailing comma).
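The trailing comma only exists for the Python 2 print statement. For readers on Python 3, where print is a function, a rough equivalent is the end argument. The sketch below captures the output in a StringIO purely so the result can be inspected, and the sample strings are made up.

```python
import io
from contextlib import redirect_stdout

captured = io.StringIO()
with redirect_stdout(captured):
    print("[", end="")           # end="" suppresses the newline
    print('{"id": 1}', end="")
    print("]")                   # default end adds the final newline

joined = captured.getvalue()
print(repr(joined))  # '[{"id": 1}]\n'
```

With end="" the three pieces land on a single line, which is the effect the trailing comma is used for here.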

Now that we know what the problem is, change the code:

# coding: utf-8
from pprint import PrettyPrinter

class MyPrettyPrinter(PrettyPrinter):
    def format(self, object, context, maxlevels, level):
        if isinstance(object, unicode):
            return (object.encode('utf8'), True.False)
        return PrettyPrinter.format(self, object, context, maxlevels, level)

class MyStream(a):
    def write(self, text):
        print text.replace('\' '.'"'),

info = [{"id":1580615."name":U "Leather"."packageName":"com.renren.mobile.android"."iconUrl":"app/com.renren.mobile.android/icon.jpg"."stars":2."size":21803987."downloadUrl":"app/com.renren.mobile.android/com.renren.mobile.android.apk"."des":U "2011-2017 Your ironhead was here the whole time. The largest real-name SNS network platform in China, Nentou Qing}, {"id":1540629."name":U "non-existent"."packageName":"com.ct.client"."iconUrl":"app/com.ct.client/icon.jpg"."stars":2."size":4794202."downloadUrl":"app/com.ct.client/com.ct.client.apk"."des":U "Betta fish 271934 Don't miss it. They have the best chicken."}]

MyPrettyPrinter(stream=MyStream()).pprint(info)

And that's it. That really wasn't easy.
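For comparison, the same stream idea can be sketched in Python 3. QuoteSwappingStream and the shortened sample record are illustrative and not the author's code. Because the wrapper forwards to the target's own write method, which adds no newline, no trailing-comma trick is needed. Note that a blanket replacement like this would also change any apostrophe that appears inside the data.

```python
import io
from pprint import PrettyPrinter

class QuoteSwappingStream:
    """Hypothetical stream wrapper: PrettyPrinter only calls write()."""

    def __init__(self, target):
        self.target = target

    def write(self, text):
        # Swap quotes, then forward to the real stream (no newline added)
        self.target.write(text.replace("'", '"'))

# Shortened sample record for illustration
info = [{"id": 1540629, "name": "不存在的", "packageName": "com.ct.client"}]

buffer = io.StringIO()
PrettyPrinter(stream=QuoteSwappingStream(buffer)).pprint(info)

result = buffer.getvalue()
print(result, end="")
```

Passing sys.stdout instead of the StringIO buffer would print directly to the terminal; the buffer is used here only so the text can be checked.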

3. Why bother

After all that tossing and turning, I finally got the output I had dreamed of.

The price was two hours of my time. A beginner may have neither the confidence nor the patience to do such a thing.

So what I'm saying is: really, stop using pprint in Python 2.

Why? Because there are better alternatives, and life is short. Why bother writing two classes when one line of code will do? (Am I scolding myself?)

If you are willing to ditch pprint, I recommend json.dumps. I guarantee you will never want to use pprint again.

Print Chinese

pprint is not entirely to blame for the failure to print Chinese in Python 2.

json.dumps has the same problem, but solving it is far simpler than with pprint.

A concrete example:

>>> info = [{"id":1580615."name":"It's leather."."packageName":"com.renren.mobile.android"."iconUrl":"app/com.renren.mobile.android/icon.jpg"."stars":2."size":21803987."downloadUrl":"app/com.renren.mobile.android/com.renren.mobile.android.apk"."des":"Your ironhead was here from 2011 to 2017. The largest real-name SNS network platform in China, Nentou Qing}, {"id":1540629."name":"Nonexistent"."packageName":"com.ct.client"."iconUrl":"app/com.ct.client/icon.jpg"."stars":2."size":4794202."downloadUrl":"app/com.ct.client/com.ct.client.apk"."des":"Betta fish 271934 Don't miss it. It's got the best chicken."}]
>>> 
>>> import json
>>> 
>>> 
>>> print json.dumps(info, indent=4, ensure_ascii=False) [{"downloadUrl": "app/com.renren.mobile.android/com.renren.mobile.android.apk"."iconUrl": "app/com.renren.mobile.android/icon.jpg"."name": "It's leather."."stars": 2."packageName": "com.renren.mobile.android"."des": "Your ironhead was here from 2011 to 2017. The largest real-name SNS network platform in China, Nentou Qing."id": 1580615."size": 21803987
    }, 
    {
        "downloadUrl": "app/com.ct.client/com.ct.client.apk"."iconUrl": "app/com.ct.client/icon.jpg"."name": "Nonexistent"."stars": 2."packageName": "com.ct.client"."des": "Betta fish 271934 Don't miss it. It's got the best chicken."."id": 1540629."size": 4794202}]>>> 

json.dumps takes two key parameters here:

  • indent=4: indent by 4 spaces
  • ensure_ascii=False: output non-ASCII characters as they are instead of escaping them, so Chinese is displayed properly
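The effect of the two parameters can be seen side by side in a short sketch. It is written for Python 3 and uses a made-up two-key record instead of the full data above.

```python
import json

# Made-up sample record
record = {"name": "皮的嘛", "stars": 2}

# Default: non-ASCII characters are escaped as \uXXXX, all on one line
escaped = json.dumps(record)
print(escaped)   # {"name": "\u76ae\u7684\u561b", "stars": 2}

# indent=4 adds line breaks and indentation,
# ensure_ascii=False keeps the Chinese characters as they are
readable = json.dumps(record, indent=4, ensure_ascii=False)
print(readable)
```

Both results are valid JSON and parse back to the same data; the second form is just much easier for a person to read.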

json.dumps beats pprint hands down:

  1. Two arguments do everything I need (printing Chinese, with double quotes)
  2. Even in Python 2, you don't need to write Chinese strings as u'中文'
  3. The call is written exactly the same way in Python 2 and Python 3, so there is no need to worry about compatibility

4. To sum up

It was a simple idea, but I spent a lot of time reading the pprint source code to implement those two requirements (the process was actually quite convoluted). It worked out well in the end.

That's it for this article. I think there are three things you can take away from it:

  1. Core point: do not use pprint under Python 2
  2. If you really want to use it and have the same requirements, you can refer to my implementation
  3. A print statement in Python 2 can end with a comma to suppress the newline

That's all. I hope this article is helpful to you.