Files and exceptions

In real-world development, programs often need to persist data, and the simplest and most direct way to do so is to save the data in a file. The word “file” might call for an introduction to file systems, but Wikipedia already explains the concept well, so we will not repeat it here.

Python’s built-in open function takes a file name, an operation mode, an encoding, and other arguments, and returns an object for operating on the file, which you can then read from or write to. The operation mode specifies what kind of file to open (text or binary) and what to do with it (read, write, or append), as shown in the following table.

Operating mode    Meaning
'r'               Read (default)
'w'               Write (truncates any previous content)
'x'               Write; an exception is raised if the file already exists
'a'               Append; content is written to the end of the existing file
'b'               Binary mode
't'               Text mode (default)
'+'               Update (both read and write)
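The modes in the table can be tried out on a scratch file. The following is a minimal sketch, not part of the original article: the function name demo_modes and the file name demo.txt are made up for illustration, and a temporary directory is used so that nothing is left behind.

```python
import os
import tempfile


def demo_modes():
    """Exercise several open() modes on a scratch file and return what was read."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo.txt')  # hypothetical scratch file

        # 'w' creates the file, or truncates it if it already exists
        with open(path, 'w', encoding='utf-8') as f:
            f.write('first\n')

        # 'a' keeps the existing content and writes at the end
        with open(path, 'a', encoding='utf-8') as f:
            f.write('second\n')

        # 'x' refuses to overwrite: the file exists, so FileExistsError is raised
        try:
            open(path, 'x', encoding='utf-8')
            exclusive_failed = False
        except FileExistsError:
            exclusive_failed = True

        # 'rb' combines read with binary mode, so read() returns bytes
        with open(path, 'rb') as f:
            raw = f.read()

        # 'r' with text mode (the defaults) returns str
        with open(path, 'r', encoding='utf-8') as f:
            text = f.read()

    return text, raw, exclusive_failed
```

Calling demo_modes() returns the text 'first\nsecond\n', the same content as a bytes object, and True for the failed exclusive creation.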

The following figure, from the Newbie Tutorial website, shows how to choose the operation mode according to the needs of the application.

Reading and writing text files

To read a text file, pass the open function a file name with a path (relative or absolute) and set the file mode to 'r' (the default if no mode is given), then specify the encoding with the encoding argument. If encoding is not specified, it defaults to None, and the file is read using the platform's default encoding. If the encoding used to save the file does not match the one given by the encoding argument, the read may fail because some characters cannot be decoded. The following example demonstrates how to read a plain text file.

def main():
    # Open the text file for reading, decoding its bytes as UTF-8
    f = open('To oaks.txt', 'r', encoding='utf-8')
    print(f.read())
    f.close()


if __name__ == '__main__':
    main()
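The decoding failure described above can be reproduced with a small sketch. The helper name read_with and the file name poem.txt are hypothetical: the function writes a UTF-8 file into a temporary directory and then tries to read it back with whatever encoding is passed in.

```python
import os
import tempfile


def read_with(encoding):
    """Write a UTF-8 file, then try to read it back with the given encoding."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'poem.txt')  # hypothetical file name
        with open(path, 'w', encoding='utf-8') as f:
            f.write('致橡树')
        try:
            with open(path, 'r', encoding=encoding) as f:
                return f.read()
        except UnicodeDecodeError:
            # The bytes on disk are not valid in the requested encoding
            return None
```

Here read_with('utf-8') returns the original text, while read_with('ascii') returns None because the UTF-8 bytes of the Chinese characters cannot be decoded as ASCII.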

Note that if the file passed to the open function does not exist or cannot be opened, an exception is raised and, if it is not handled, the program crashes. To make the code more robust and fault tolerant, we can use Python’s exception mechanism to handle code that might fail at run time, as shown below.

def main():
    f = None
    try:
        f = open('To oaks.txt', 'r', encoding='utf-8')
        print(f.read())
    except FileNotFoundError:
        print('Cannot open the specified file!')
    except LookupError:
        print('An unknown encoding was specified!')
    except UnicodeDecodeError:
        print('Decoding error while reading the file!')
    finally:
        # Close the file only if it was actually opened
        if f:
            f.close()


if __name__ == '__main__':
    main()

In Python, we place code that might fail at run time in a try block, followed by one or more except blocks that catch the exceptions that may occur. In the file-reading example above, FileNotFoundError is raised if the file is not found, LookupError is raised if an unknown encoding is specified, and UnicodeDecodeError is raised if the file cannot be decoded with the specified encoding. The three except blocks after the try handle these three conditions. Finally, a finally block closes the open file and releases the external resources acquired by the program. Because the finally block runs whether or not an exception occurs (it runs even if the sys module’s exit function is called to quit Python, since exit essentially just raises SystemExit), it is often called the “always executed” block and is the best place to release external resources. If you would rather not close the file object in a finally block, you can use context syntax instead: the with keyword establishes a context for the file object, and the file resource is released automatically on leaving that context, as shown below.

def main():
    try:
        # The with statement closes the file automatically on leaving the block
        with open('To oaks.txt', 'r', encoding='utf-8') as f:
            print(f.read())
    except FileNotFoundError:
        print('Cannot open the specified file!')
    except LookupError:
        print('An unknown encoding was specified!')
    except UnicodeDecodeError:
        print('Decoding error while reading the file!')


if __name__ == '__main__':
    main()
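The earlier remark that a finally block still runs when sys.exit is called can be checked with a short sketch. The function names and the log list are made up for illustration; the outer function catches SystemExit only so that the demonstration itself does not terminate the interpreter.

```python
import sys


def exit_inside_try(log):
    """Call sys.exit() inside try and record whether finally still runs."""
    try:
        log.append('try')
        sys.exit(1)  # raises SystemExit rather than stopping immediately
    finally:
        log.append('finally')  # runs while SystemExit propagates


def run_demo():
    """Return the order in which the steps were executed."""
    log = []
    try:
        exit_inside_try(log)
    except SystemExit as ex:
        log.append('exit code %s' % ex.code)
    return log
```

Calling run_demo() returns ['try', 'finally', 'exit code 1'], which shows that the finally block ran before the SystemExit exception reached the caller.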

Besides reading a whole file with the file object’s read method, you can read it line by line with a for-in loop, or use the readlines method to read all of its lines into a list, as shown below.

import time


def main():
    # Read the entire file at once
    with open('To oaks.txt', 'r', encoding='utf-8') as f:
        print(f.read())

    # Read line by line through a for-in loop
    with open('To oaks.txt', mode='r', encoding='utf-8') as f:
        for line in f:
            # Each line already ends with a newline, so print adds nothing
            print(line, end='')
            time.sleep(0.5)
    print()

    # Read the file line by line into a list
    with open('To oaks.txt', encoding='utf-8') as f:
        lines = f.readlines()
    print(lines)


if __name__ == '__main__':
    main()

Writing text to a file is just as simple: specify the file name and set the file mode to 'w' when calling the open function. If you need to append to the existing content instead, set the mode to 'a'. If the file to be written does not exist, it is created automatically rather than an exception being raised. The following example writes the primes between 1 and 9999 to three separate files (primes between 1 and 99 go to a.txt, primes between 100 and 999 to b.txt, and primes between 1000 and 9999 to c.txt).

from math import sqrt


def is_prime(n):
    """Return True if n is a prime number."""
    assert n > 0
    for factor in range(2, int(sqrt(n)) + 1):
        if n % factor == 0:
            return False
    return n != 1


def main():
    filenames = ('a.txt', 'b.txt', 'c.txt')
    fs_list = []
    try:
        for filename in filenames:
            fs_list.append(open(filename, 'w', encoding='utf-8'))
        for number in range(1, 10000):
            if is_prime(number):
                if number < 100:
                    fs_list[0].write(str(number) + '\n')
                elif number < 1000:
                    fs_list[1].write(str(number) + '\n')
                else:
                    fs_list[2].write(str(number) + '\n')
    except IOError as ex:
        print(ex)
        print('Error writing file!')
    finally:
        # Close every file that was successfully opened
        for fs in fs_list:
            fs.close()
    print('Operation complete!')


if __name__ == '__main__':
    main()
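As an alternative to closing several files by hand in a finally block, the standard library's contextlib.ExitStack can manage a variable number of files with a single with statement. The following is only a sketch of that idea, not the article's own approach: write_by_range and demo are hypothetical names, and the files are written into a temporary directory.

```python
import os
import tempfile
from contextlib import ExitStack


def write_by_range(numbers, directory):
    """Write each number to a.txt, b.txt or c.txt depending on its size."""
    names = ('a.txt', 'b.txt', 'c.txt')
    paths = [os.path.join(directory, name) for name in names]
    # ExitStack closes every file registered with it when the with block ends,
    # even if opening a later file or writing raises an exception.
    with ExitStack() as stack:
        files = [stack.enter_context(open(path, 'w', encoding='utf-8'))
                 for path in paths]
        for number in numbers:
            if number < 100:
                index = 0
            elif number < 1000:
                index = 1
            else:
                index = 2
            files[index].write(f'{number}\n')
    return paths


def demo():
    """Run write_by_range in a scratch directory and return the file contents."""
    with tempfile.TemporaryDirectory() as tmp:
        contents = []
        for path in write_by_range([2, 97, 101, 1009], tmp):
            with open(path, encoding='utf-8') as f:
                contents.append(f.read())
    return contents
```

Calling demo() returns ['2\n97\n', '101\n', '1009\n'], one string per output file.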

Reading and writing binary files

Now that you know how to read and write text files, reading and writing binary files is straightforward. The following code copies an image file.

def main():
    try:
        # 'rb' reads raw bytes with no decoding
        with open('guido.jpg', 'rb') as fs1:
            data = fs1.read()
            print(type(data))  # <class 'bytes'>
        # Write the bytes to a second file (a different name, so the source is not overwritten)
        with open('guido_copy.jpg', 'wb') as fs2:
            fs2.write(data)
    except FileNotFoundError:
        print('The specified file cannot be opened.')
    except IOError:
        print('Error reading/writing file.')
    print('Program execution completed.')


if __name__ == '__main__':
    main()
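Reading a whole file with a single read call keeps all of its bytes in memory at once. For large files, one common variation is to copy in fixed-size chunks. The sketch below assumes a hypothetical helper named copy_in_chunks and an arbitrary default chunk size of 64 KiB; the demo function exercises it on a small scratch file.

```python
import os
import tempfile


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in binary mode, chunk_size bytes at a time."""
    copied = 0
    with open(src, 'rb') as fin, open(dst, 'wb') as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            fout.write(chunk)
            copied += len(chunk)
    return copied


def demo():
    """Copy a 2560-byte scratch file and check that the copy is identical."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, 'source.bin')
        dst = os.path.join(tmp, 'copy.bin')
        payload = bytes(range(256)) * 10
        with open(src, 'wb') as f:
            f.write(payload)
        copied = copy_in_chunks(src, dst, chunk_size=1000)
        with open(dst, 'rb') as f:
            return copied, f.read() == payload
```

Calling demo() returns (2560, True). For everyday use, the standard library's shutil.copyfile does the same job without a hand-written loop.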

Reading and writing JSON files

Now that we’ve seen how to save text and binary data to a file, another question arises: what if you want to save the data in a list or dictionary to a file? The answer is to save it in JSON format. JSON stands for “JavaScript Object Notation” and was originally the literal syntax for creating objects in JavaScript. It is now widely used for exchanging data across platforms and languages, simply because JSON is plain text, and any system or programming language can handle plain text. JSON has largely replaced XML as the de facto standard for exchanging data between heterogeneous systems. For more about JSON, see the official JSON website, which also lists the tools and third-party libraries that each language can use to process JSON. Here is a simple example of JSON.

{
    "name": "LuoHao",
    "age": 38,
    "qq": 957658,
    "friends": ["Hammer king", "Bai Yuanfang"],
    "cars": [
        {"brand": "BYD", "max_speed": 180},
        {"brand": "Audi", "max_speed": 280},
        {"brand": "Benz", "max_speed": 320}
    ]
}

As you may have noticed, JSON looks much like a Python dictionary. In fact, JSON data types map readily to Python data types, as shown in the following two tables.

JSON                    Python
object                  dict
array                   list
string                  str
number (int / real)     int / float
true / false            True / False
null                    None

Python                                    JSON
dict                                      object
list, tuple                               array
str                                       string
int, float, int- & float-derived Enums    number
True / False                              true / false
None                                      null
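The mappings in the two tables can be observed with a short round trip through the standard json module. This is an illustrative sketch; the dictionary and its keys are made up.

```python
import json

# A dictionary that uses several of the Python types listed above
original = {'point': (1, 2), 'ratio': 0.5, 'ok': True, 'missing': None}

# Python -> JSON: the tuple becomes an array, True becomes true, None becomes null
text = json.dumps(original)
print(text)  # {"point": [1, 2], "ratio": 0.5, "ok": true, "missing": null}

# JSON -> Python: the array comes back as a list, not as the original tuple
restored = json.loads(text)
print(type(restored['point']))  # <class 'list'>
```

Because a tuple and a list both map to a JSON array, the round trip is not exactly symmetric: restored['point'] is the list [1, 2], not the tuple (1, 2).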

You can use Python’s json module to save a dictionary or list to a file in JSON format, as shown below.

import json


def main():
    mydict = {
        'name': 'LuoHao',
        'age': 38,
        'qq': 957658,
        'friends': ['Hammer king', 'Bai Yuanfang'],
        'cars': [
            {'brand': 'BYD', 'max_speed': 180},
            {'brand': 'Audi', 'max_speed': 280},
            {'brand': 'Benz', 'max_speed': 320}
        ]
    }
    try:
        # Serialize the dictionary to data.json
        with open('data.json', 'w', encoding='utf-8') as fs:
            json.dump(mydict, fs)
    except IOError as e:
        print(e)
    print('Save data done!')


if __name__ == '__main__':
    main()

The json module has four main functions:

  • dump - Serializes a Python object to a file in JSON format
  • dumps - Serializes a Python object to a JSON-formatted string
  • load - Deserializes the JSON data in a file into a Python object
  • loads - Deserializes the contents of a string into a Python object
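The difference between the two pairs of functions listed above is only where the JSON text goes: dumps and loads work with strings, while dump and load work with file objects. The sketch below uses an in-memory io.StringIO buffer as a stand-in for a real file; the record dictionary is made up for illustration.

```python
import io
import json

record = {'name': 'LuoHao', 'age': 38, 'friends': ['Bai Yuanfang']}

# dumps / loads: Python object <-> JSON string
as_text = json.dumps(record)
from_text = json.loads(as_text)

# dump / load: Python object <-> file-like object
buffer = io.StringIO()  # stands in for a file opened in text mode
json.dump(record, buffer)
buffer.seek(0)          # rewind before reading back
from_file = json.load(buffer)

print(from_text == record, from_file == record)
```

Both round trips give back a dictionary equal to record, and the text written by dump is the same as the string returned by dumps.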

Two concepts appear here: serialization and deserialization. Wikipedia, the free encyclopedia, explains them as follows: “In data processing in computer science, serialization is the process of converting a data structure or object state into a form that can be stored or transmitted, so that it can be restored to its original state later when needed. The bytes obtained by serializing data can be used to produce a copy of the original object. The reverse operation, extracting a data structure from a series of bytes, is deserialization.”

At present, most network data services (web APIs) provide data in JSON format over HTTP. For background on the HTTP protocol, you can read Ruan Yifeng’s “Introduction to HTTP Protocol”. To learn about network data services in China, see websites such as Aggregation Data and Avatar Data; for services outside China, check out the {API}Search website. The following example uses the requests module (a well-packaged third-party module for network access) to call a web API for domestic news, then parses the JSON data with the json module and prints the news headlines. It uses the domestic news data interface provided by Tianxing Data (api.tianapi.com); you need to apply for an APIKey on that website.

import requests
import json


def main():
    # Replace APIKey in the URL with the key obtained from the website
    resp = requests.get('http://api.tianapi.com/guonei/?key=APIKey&num=10')
    data_model = json.loads(resp.text)
    for news in data_model['newslist']:
        print(news['title'])


if __name__ == '__main__':
    main()

In Python, besides the json module, you can also use the pickle and shelve modules for serialization and deserialization. However, these two modules use Python-specific serialization protocols, so the serialized data can only be read by Python. You can find more about these two modules online. Also, to learn more about Python’s exception mechanism, see the SegmentFault article “Summary: Exception Handling in Python”, which not only introduces how exceptions are used in Python but also summarizes a set of best practices and is well worth reading.
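For comparison with the json examples, here is a minimal sketch of a pickle round trip. The data dictionary is made up; it deliberately contains a tuple and a set, which JSON cannot represent as such.

```python
import pickle

data = {'name': 'LuoHao', 'scores': (90, 85), 'tags': {'a', 'b'}}

# dumps produces a Python-specific byte string, not human-readable text
blob = pickle.dumps(data)

# loads rebuilds an equal object, keeping the tuple and set types intact
restored = pickle.loads(blob)

print(type(blob))        # <class 'bytes'>
print(restored == data)  # True
```

Unlike JSON, the pickled bytes preserve Python-only types, but they are meaningful only to Python. Unpickling can also execute arbitrary code, so pickle.loads should only be used on data from a trusted source.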