30 Days of Python 👨‍💻 – Day 18 – File I/O

File Operation Overview

Today I’ll explore how to work with files in Python. Over the past days I’ve explored and shared several Python concepts and some programming best practices. However, our programs often need to communicate with the outside world for a variety of reasons, such as reading data from Excel, CSV, or PDF files, converting and compressing images, extracting data from text files, reading data from databases, and many more. These interactions with the outside world are performed using I/O (input/output) operations.

Files help us store data permanently in the system. While a program runs, its data is held temporarily in the machine’s RAM, which is cleared when the program exits or the computer is shut down. To keep data for a long time, it needs to be stored in a database or file system so that it can be used later.

Files can be broadly classified into two categories based on their content:

  • Binary (images, audio, compiled programs, and other non-text data)
  • Text

If you’re interested in these two file types and want to know more about them, check out this great article

Python provides a built-in open function to open arbitrary files. Any file needs to be opened before data can be read or written. Reading data from files is easy in Python.

I used the REPL as a platform to experiment with all the code snippets provided in this article.

Open the file

I created a test.txt file and wrote some fake data to test it.

# test.txt
I am learning python.

The contents of this file can now be read using Python, like this

main.py

content = open('test.txt')
output = content.read()
print(output) # I am learning python.

When we open a file with the open function, we can also specify the mode in which to open it. The default is 'r' (read mode). We can also specify whether the file should be opened in text or binary mode.

Mode    Description
r       Open the file for reading (default)
w       Open for writing. Create a new file if it doesn’t exist, and overwrite it if it does
x       Create a new file, error if the file already exists
a       Open for appending at the end, or create a new file if it doesn’t exist
t       Open in text mode (default)
b       Open in binary mode
+       Open the file for updating (reading and writing)
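
To see how a few of these modes behave, here is a minimal sketch. The file name modes_demo.txt and the temporary directory are my own assumptions for illustration, chosen so that no real files are touched.

```python
import os
import tempfile

# Work in a throwaway directory (hypothetical file name).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'modes_demo.txt')

# 'x' creates the file and fails if it already exists.
with open(path, mode='x', encoding='utf-8') as f:
    f.write('first\n')

try:
    open(path, mode='x', encoding='utf-8')
    created_twice = True
except FileExistsError:
    created_twice = False

# 'a' keeps the existing content and adds to the end.
with open(path, mode='a', encoding='utf-8') as f:
    f.write('second\n')

# Text mode returns str, binary mode ('rb') returns bytes.
with open(path, mode='r', encoding='utf-8') as f:
    as_text = f.read()
with open(path, mode='rb') as f:
    as_bytes = f.read()

print(created_twice)   # False
print(repr(as_text))   # 'first\nsecond\n'
print(type(as_bytes))  # <class 'bytes'>
```

Mode characters can be combined, so 'rb' means "read, binary" and 'r+' means "read and write".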

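We can also specify the encoding when opening the file. If no encoding is given, Python uses a platform-dependent default (often UTF-8, but not always), so it is safer to pass encoding='utf-8' explicitly.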
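
The following sketch shows why the encoding matters. It writes a word with an accented character as UTF-8 and reads it back twice, once with the same encoding and once with a different one. The file name and the sample word are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), 'encoding_demo.txt')
text = 'café'

with open(path, mode='w', encoding='utf-8') as f:
    f.write(text)

# Reading with the same encoding gives back the original text.
with open(path, mode='r', encoding='utf-8') as f:
    same = f.read()

# Reading with a different encoding misinterprets the bytes.
with open(path, mode='r', encoding='latin-1') as f:
    garbled = f.read()

print(same == text)     # True
print(garbled == text)  # False
```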

Close the file

It is important to close the file after performing operations on it. Closing flushes any buffered writes and frees the system resources associated with the file.

main.py

content = open('test.txt', mode='r')
output = content.read()
print(output)
content.close()

A try-except-finally statement can be added to the above code to ensure that the file is closed even if an error occurs during the operation.

main.py

content = None
try:
    content = open('test.txt', mode='r')
    output = content.read()
    print(output)
except FileNotFoundError as error:
    print(f'file not found {error}')
finally:
    # close only if the file was actually opened
    if content is not None:
        content.close()

Python provides a handy syntax for performing file-opening operations, using the with statement. It automatically closes the file once it’s done.

main.py

with open('test.txt', mode='r') as content:
    output = content.read()
    print(output) # I am learning python.
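
To check that the with statement really closes the file, we can inspect the closed attribute of the file object. This is a small sketch; the file name with_demo.txt and the simulated error are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('I am learning python.')

with open(path, mode='r', encoding='utf-8') as content:
    inside = content.closed   # still open inside the block
after = content.closed        # closed once the block ends

# The file is closed even when the block raises an exception.
try:
    with open(path, mode='r', encoding='utf-8') as content:
        raise ValueError('simulated failure')
except ValueError:
    closed_after_error = content.closed

print(inside, after, closed_after_error)  # False True True
```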

Write to the file

Python provides a write method to write data to a file. The file needs to be opened in 'w' mode for writing. Note that 'w' mode overwrites the existing contents of the file. If you need to append content, use 'a' mode instead. In both modes, the file is created first if it does not exist.

main.py

with open('test.txt', mode='w', encoding='utf-8') as my_file:
    my_file.write('This is the first line\n') # \n is used for line breaks
    my_file.write('This is the second line\n')
    my_file.write('This is the third line')

main.py

with open('test.txt', mode='a', encoding='utf-8') as my_file:
    my_file.write('This text will be appended')

An alternative is the writelines method, which accepts a list (or any other iterable) of strings. It does not add line breaks on its own.

with open('test.txt', mode='w', encoding='utf-8') as my_file:
    my_file.writelines(['First line', '\n', 'Second Line'])
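
Because writelines writes the strings exactly as given, forgetting the line breaks glues everything together. A minimal sketch, with a hypothetical file name and sample lines:

```python
import os
import tempfile

# Hypothetical file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), 'writelines_demo.txt')
lines = ['First line', 'Second line', 'Third line']

# Without line breaks, the strings are written back to back.
with open(path, mode='w', encoding='utf-8') as my_file:
    my_file.writelines(lines)
with open(path, mode='r', encoding='utf-8') as my_file:
    joined = my_file.read()

# Adding '\n' to each string gives one line per item.
with open(path, mode='w', encoding='utf-8') as my_file:
    my_file.writelines(line + '\n' for line in lines)
with open(path, mode='r', encoding='utf-8') as my_file:
    separated = my_file.read()

print(repr(joined))     # 'First lineSecond lineThird line'
print(repr(separated))  # 'First line\nSecond line\nThird line\n'
```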

Read the file

Python provides several ways to read files. The file needs to be opened in 'r' mode, or in 'r+' mode if we need to both read and write. The read method takes an optional size argument, the maximum number of characters to read (bytes in binary mode). If size is not provided, the entire file is read.

main.py

with open('test.txt', mode='r', encoding='utf-8') as my_file:
    content = my_file.read()
    print(content)
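
Here is a small sketch of how the size argument works. Each call to read continues where the previous one stopped, and an empty string signals the end of the file. The file name is an assumption for illustration.

```python
import os
import tempfile

# Hypothetical file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), 'read_demo.txt')
with open(path, mode='w', encoding='utf-8') as my_file:
    my_file.write('I am learning python.')

with open(path, mode='r', encoding='utf-8') as my_file:
    first = my_file.read(4)    # the first 4 characters
    second = my_file.read(9)   # the next 9 characters
    rest = my_file.read()      # everything that is left
    at_end = my_file.read()    # nothing left to read

print(repr(first))   # 'I am'
print(repr(second))  # ' learning'
print(repr(rest))    # ' python.'
print(repr(at_end))  # ''
```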

There is also a tell method that returns the current cursor position in the file.

The seek method moves the cursor to a given position in the file.

main.py

with open('test.txt', mode='r', encoding='utf-8') as my_file:
    my_file.seek(0) # Move the cursor to the beginning of the file
    print(my_file.tell()) # Output the cursor position
    content = my_file.read()
    print(content)
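
A slightly longer sketch shows tell and seek working together: tell gives a bookmark, and seek jumps back to it so the same part of the file can be read again. In text mode the value returned by tell should be treated as an opaque bookmark rather than a character count. The file name is an assumption for illustration.

```python
import os
import tempfile

# Hypothetical file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), 'seek_demo.txt')
with open(path, mode='w', encoding='utf-8') as my_file:
    my_file.write('I am learning python.')

with open(path, mode='r', encoding='utf-8') as my_file:
    start = my_file.tell()       # position at the beginning
    my_file.read(5)              # skip 'I am '
    bookmark = my_file.tell()    # remember where we are
    tail = my_file.read()        # read to the end
    my_file.seek(bookmark)       # jump back to the bookmark
    tail_again = my_file.read()  # same text as before
    my_file.seek(0)              # back to the very beginning
    whole = my_file.read()

print(start)       # 0
print(tail)        # learning python.
print(tail_again)  # learning python.
print(whole)       # I am learning python.
```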

If the file has many lines, looping over the file object is the most memory-efficient way to read it, because only one line is held in memory at a time.

main.py

with open('test.txt', mode='r', encoding='utf-8') as my_file:
    for line in my_file:
        print(line, end='') # each line already ends with \n

In addition, Python provides two methods, readline and readlines.

readline reads one line from the file, stopping after the newline character (\n).

readlines returns a list of all the lines in the file.
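
The difference between the two can be sketched as follows. Note that the newline character stays attached to each line, and that readlines only returns the lines that have not been read yet. The file name is an assumption for illustration.

```python
import os
import tempfile

# Hypothetical file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
with open(path, mode='w', encoding='utf-8') as my_file:
    my_file.write('This is the first line\n')
    my_file.write('This is the second line\n')
    my_file.write('This is the third line')

with open(path, mode='r', encoding='utf-8') as my_file:
    first = my_file.readline()       # one line, including its '\n'
    remaining = my_file.readlines()  # list of the lines not yet read

print(repr(first))  # 'This is the first line\n'
print(remaining)    # ['This is the second line\n', 'This is the third line']
```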

Python file methods

Here is a list of the most common file methods in Python

Method                          Description
close()                         Closes the open file. Has no effect if the file is already closed.
detach()                        Separates the underlying binary buffer from the TextIOBase and returns it.
fileno()                        Returns the file descriptor as an integer.
flush()                         Flushes the internal write buffer to the file.
isatty()                        Returns True if the file stream is interactive.
read(n)                         Reads at most n characters (bytes in binary mode). If n is negative or None, reads until the end of the file.
readable()                      Returns True if the file can be read from.
readline(n=-1)                  Reads and returns one line. If n is given, reads at most n characters (bytes in binary mode).
readlines(n=-1)                 Reads and returns a list of lines. If n is given, stops once roughly n characters/bytes have been read.
seek(offset, whence=SEEK_SET)   Moves the file position to offset, relative to whence (start, current position, or end).
seekable()                      Returns True if the file supports random access.
tell()                          Returns the current file position.
truncate(size=None)             Resizes the file to size bytes. If size is not given, resizes to the current position.
writable()                      Returns True if the file can be written to.
write(s)                        Writes the string s to the file and returns the number of characters written.
writelines(lines)               Writes a list of lines to the file. Line separators are not added.
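
A few of these methods can be tried out together in a short sketch. It opens a file in 'w+' mode (write and read), checks what the file object supports, and cuts the file down with truncate. The file name and sample text are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), 'methods_demo.txt')

with open(path, mode='w+', encoding='utf-8') as my_file:
    written = my_file.write('hello world')  # number of characters written
    can_read = my_file.readable()
    can_write = my_file.writable()
    my_file.seek(0)
    my_file.truncate(5)       # keep only the first 5 bytes
    my_file.seek(0)
    content = my_file.read()

# A file opened in plain 'r' mode cannot be written to.
with open(path, mode='r', encoding='utf-8') as my_file:
    read_only_writable = my_file.writable()

print(written)              # 11
print(can_read, can_write)  # True True
print(content)              # hello
print(read_only_writable)   # False
```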

Fun exercise

Let’s try to build a language translator that reads a file with English content and creates a new translation of that file in a different language.

For this exercise, we’ll use a third-party package on PyPI called translate. With the help of this package, we can translate text directly from Python. By default it relies on an online translation service, so an internet connection is needed.

First, the package needs to be installed. Because I am using the REPL, I will add it to the Packages section of the REPL. If using a local project, we can install it from the console using pip (pip install translate).

Create a file called quote.txt and write a quote:

quote.txt

If you can't make it good, at least make it look good. - Bill Gates

Now let’s generate two translations of this quote. One is in Spanish with a file named quote-es.txt, and the other is in French with a file named quote-fr.txt

main.py

from translate import Translator

spanish_translate = Translator(to_lang="es")
french_translate = Translator(to_lang="fr")

try:
    with open('quote.txt', mode='r', encoding='utf-8') as quote_file:
        # read the file
        quote = quote_file.read()
        # do the translations
        quote_spanish = spanish_translate.translate(quote)
        quote_french = french_translate.translate(quote)
        # create the translated files
        try:
            with open('quote-es.txt', mode='w', encoding='utf-8') as quote_es:
                quote_es.write(quote_spanish)
            with open('quote-fr.txt', mode='w', encoding='utf-8') as quote_fr:
                quote_fr.write(quote_french)
        except IOError:
            print('An error occurred')
            raise  # re-raise the original exception
except FileNotFoundError:
    print('File not found')
    raise

Two translation files for this quote will be automatically generated. It does look great!
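
If you want to try the file-handling part of this exercise without installing anything, here is a rough sketch with the translator swapped out. The translate_file helper and the fake_translate stand-in are hypothetical names of my own, not part of the translate package. The stand-in simply upper-cases the text.

```python
import os
import tempfile


def translate_file(source_path, target_path, translate):
    """Read source_path, pass its text through translate, write target_path."""
    with open(source_path, mode='r', encoding='utf-8') as source:
        text = source.read()
    with open(target_path, mode='w', encoding='utf-8') as target:
        target.write(translate(text))


def fake_translate(text):
    # Stand-in for a real translator: it only upper-cases the text.
    return text.upper()


# Hypothetical files in a throwaway directory.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, 'quote.txt')
target_path = os.path.join(workdir, 'quote-upper.txt')

with open(source_path, mode='w', encoding='utf-8') as quote_file:
    quote_file.write("If you can't make it good, at least make it look good.")

translate_file(source_path, target_path, fake_translate)

with open(target_path, mode='r', encoding='utf-8') as quote_file:
    result = quote_file.read()

print(result)  # IF YOU CAN'T MAKE IT GOOD, AT LEAST MAKE IT LOOK GOOD.
```

With the real package installed, the stand-in could be replaced by Translator(to_lang="es").translate, since translate_file only needs a function that takes a string and returns a string.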

Built-in file manipulation module

Python provides a built-in module called pathlib as part of the standard library. It provides classes that represent file system paths with semantics appropriate for different operating systems. This module was introduced in Python 3.4 and is ideal for working with many files and directories.

Here are some resources that explain the pathlib module:
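
As a quick taste, here is a minimal sketch of pathlib in action. The folder and file names are assumptions for illustration, and everything happens in a temporary directory.

```python
import tempfile
from pathlib import Path

# Hypothetical folder layout inside a throwaway directory.
folder = Path(tempfile.mkdtemp())

# The / operator joins path parts in a platform-independent way.
quote_path = folder / 'notes' / 'quote.txt'
quote_path.parent.mkdir(parents=True, exist_ok=True)

# write_text and read_text open, use, and close the file in one call.
quote_path.write_text('I am learning python.', encoding='utf-8')
text = quote_path.read_text(encoding='utf-8')

# rglob searches the folder and all of its subfolders.
txt_files = sorted(p.name for p in folder.rglob('*.txt'))

print(quote_path.name)      # quote.txt
print(quote_path.suffix)    # .txt
print(quote_path.exists())  # True
print(text)                 # I am learning python.
print(txt_files)            # ['quote.txt']
```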

  • Realpython.com/python-path…
  • Docs.python.org/3/library/p…
  • www.geeksforgeeks.org/pathlib-mod…

We will use the pathlib module later when we create the project.

That’s all for today. Tomorrow I plan to explore the use of regular expressions in Python and some examples.

The original link

30 Days of Python 👨‍💻 – Day 18 – File I/O