Openpyxl introduction

Openpyxl is a Python library that can read and write XLSX, XLSM, XLTX, and XLTM files directly. It can be used to automate batch operations on local XLSX files using Python syntax

If you are using Anaconda as a Python environment, OpenPyXL can be used directly without installation. If you need to install it, the method is very simple. The PIP tool is a single command

pip install openpyxl
Copy the code

XLSX file attributes

Before dealing with The Excel table, we need to understand a few terms of the XLSX file interpretation and construction

1. What does Workbook mean by god?

A Workbook is an XLSX file.

2. What do sheet, cell, row and col refer to respectively?

For question 2, please refer to the following figure (see note)

Openpyxl basic command operations

Create an empty workbook
from openpypl import Wrokbook
from openpyxl.utils import get_column_letter

wb = Workbook()
ws1 = wb.active
Copy the code

By default, a Workbook contains at least one worksheet. Run the workbook. active command to get the current first sheet.

Create a new Worksheet
ws1 = wb.create_sheet("Mysheet") # insert at the end (default)

ws2 = wb.create_sheet("Mysheet", 0) # insert at first position
# or
ws3 = wb.create_sheet("Mysheet", -1) # insert at the penultimate position
Copy the code

You can run create_sheet() to create a new worksheet. By default, create_sheet has two parameters: name and index.

  • Name defines the name of sheet;
  • Index, to set where to insert the sheet, default to None which means that the newly created sheet is inserted at the end; 0 is where the first sheet was inserted;
3. Change the name of the sheet

One line of code

ws.title = 'New Title'
Copy the code
4. Change the TAB background color on the sheet
ws.sheet_properties.tabColor = "1072BA"
Copy the code

You can modify the wroksheet.sheet_properties. TabColor parameter. Note that only the RRGGBB color code is accepted;

The background color of sheet TAB is not clear.

5, Return the names of all sheets in the Workbook

Run the workbook. sheetname command to view the sheetname

>>> Print (wb.sheetnames) ['Sheet2', 'New Title', 'Sheet1'] # print(sheet.title)Copy the code
6. Copy the existing Worksheets into the newly created Workbook

The workbook.copy_worksheet () function is available

Source = wb.active target = wb.copy_worksheet(source)Copy the code

Note that workbook cannot be copied when the workbook is in read-only or write-only mode. In addition, only cells(value, style, hyperlink, comment) and specific Worksheet properties (dimension, format, attribute) can be copied. Other workbook/Worksheet properties cannot be copied (Images Charts, etc.).

7. Obtain the data of a cell

Once you have successfully created the Workbook and Worksheet, you can then modify the cell(the contents of the cell can be retrieved by a specific keyword in the Worksheet)

>>> c = ws['A4'] #Copy the code

It is modified by an assignment command

ws['A4'] = 4
Copy the code

Openpyxl has a function called worksheet.cell () that modifies the data in a cell by locating it to a specific row or column.

d = ws.cell(row = 4,columns = 2,value = 10)
Copy the code
  • Row indicates a specified row
  • Columns Indicates a specified column
  • Value indicates the data value to be replaced in the cell. If this parameter is not set, only memory space is created for the cell and no value is assigned

For example,

>>> for x in range(1,101):... For y in range(1,101):... ws.cell(row=x, column=y)Copy the code
8. Get multiple cells

8.1 openPyXL can also be sliced to obtain multiple cells

>>> cell_range = ws['A1':'C2']
Copy the code

8.2. The acquisition methods of multi-row and multi-column data in Worksheet are similar

>>> colC = ws['C']
>>> col_range = ws['C:D']
>>> row10 = ws[10]
>>> row_range = ws[5:10]
Copy the code

Iter_row () retrieves the range of rows and columns in the sheet and iterates through each cell

>>> for row in ws.iter_rows(min_row=1, max_col=3, max_row=2):
...    for cell in row:
...        print(cell)
<Cell Sheet1.A1>
<Cell Sheet1.B1>
<Cell Sheet1.C1>
<Cell Sheet1.A2>
<Cell Sheet1.B2>
<Cell Sheet1.C2>
Copy the code

8.4, worksheet.iter_cols () can do the same

>>> for col in ws.iter_cols(min_row=1, max_col=3, max_row=2):
...     for cell in col:
...         print(cell)
<Cell Sheet1.A1>
<Cell Sheet1.A2>
<Cell Sheet1.B1>
<Cell Sheet1.B2>
<Cell Sheet1.C1>
<Cell Sheet1.C2>
Copy the code

Note that the Worksheet,iter_cols() methods cannot be used in read-only mode

9. View only the value in the cell

The worksheet. Values property iterates over all rows in the worksheet, but returns only cell values

for row in ws.values:
   for value in row:
     print(value)
Copy the code

Iter_rows () and worksheet.iter_cols (). Values_only = True returns the cell value

>>> for row in ws.iter_rows(min_row=1, max_col=3, max_row=2, values_only=True):
...   print(row)

(None, None, None)
(None, None, None)
Copy the code

10. Read and write files

10. Load the Excel file

The openPyxl.load_workbook () function opens an XLSX file specified for local storage

>>> from openpyxl import load_workbook
>>> wb2 = load_workbook('test.xlsx')
>>> print wb2.sheetnames
['Sheet2', 'New Title', 'Sheet1']
Copy the code

10.2. Excel file Storage

You can run the workbook.save (path) command to save the modified Workbook to the local disk after it is created

>>> wb = Workbook()
>>> wb.save('balances.xlsx')
Copy the code

Well, that’s all for this article. Thank you for reading!