Basic Concepts of Excel

  • Workbook: An Excel spreadsheet document with the extension.xlsx
  • Worksheets: A workbook can contain up to 255 worksheets
  • Active table: The last table the user exits before viewing or closing Excel
  • Column: starts at A by default, row: starts at 1 by default
  • Cell: The square where rows and columns cross is a cell

Install the OpenPyXL module

# pip install openpyxl
import openpyxl
pip show openpyxl Check the version of the package
Copy the code

Reading Excel documents

Open Excel documents with the OpenPyXL module

wb = openpyxl.load_workbook(r"C:\\Users\\Administrator\\example.xlsx") # wb means workbook
Copy the code

Get the worksheet from the workbook

Get all sheet_name for traversal
wb.get_sheet_names()

Get the sheet with the specified name
sheet3 = wb.get_sheet_by_name('Sheet3') #get sheet by name

sheet3.title #view the title of sheet
anotherSheet = wb.get_active_sheet() #get active sheet
anotherSheet.title # Active cell title
Copy the code

Retrieves cells from the table

import openpyxl
wb = openpyxl.load_workbook('example.xlsx')
sheet1 = wb.get_sheet_by_name("Sheet1")

Extract elements using alphanumeric methods
sheet1['A1'] # Cell Extracts the Cell
sheet1['A1'].value The data type and content of the cell
sheet1['A1'].row # The row on which the cell resides
sheet1['A1'].column The column in which the cell is located
sheet1['A1'].coordinate The column in which the cell is located

Use cell as a row or column
sheet1.cell(row=1,column=2) Extract the cells in row 1 and column 2
sheet1.cell(row=1,column=2).value Extract the values of the cells in row 1 and column 2
for i in range(1.8.2) :print(i,sheet1.cell(row=i,column=2).value)
Copy the code

Gets the maximum number of rows and columns

import openpyxl
wb = openpyxl.load_workbook(r"C:\\Users\\Administrator\\example.xlsx")
sheet1 = wb.get_sheet_by_name('Sheet1')
sheet1.max_row  # Maximum number of lines
sheet1.max_column # Maximum number of columns
Copy the code

Column conversion between letters and numbers

import openpyxl
from openpyxl.utils import get_column_letter,column_index_from_string 
get_column_letter(1)
get_column_letter(100)
column_index_from_string('A')
column_index_from_string('AA')
Copy the code

Retrieves rows and columns from a table

import openpyxl
wb = openpyxl.load_workbook(r"C:\\Users\\Administrator\\example.xlsx")
sheet1 = wb.get_sheet_by_name('Sheet1')

Pay attention to the use of the whole section
tuple(sheet1['A1':'C3']) Each row is a separate element of the tuple
list(sheet1['A1':'C3']) Each line is a separate element of the list

for rowOfCellObjects in sheet1['A1':'C3'] :# Print area
    for cellObj in rowOfCellObjects:
        print(cellObj.coordinate,cellObj.value)
    print('---End of Row---')
Copy the code

Print a separate column

import openpyxl
wb = openpyxl.load_workbook(r"C:\\Users\\Administrator\\example.xlsx")
sheet1 = wb.get_active_sheet()

Print a single column
for cellObj in list(sheet1.columns)[1] :print(cellObj.value)
Copy the code

summary

  • Import the OpenPyXL module
  • Call the openPyxl.load_workbook () function
  • Get the Workbook object
  • Call the get_active_sheet() or get_sheet_by_name() workbook method
  • Use the index or worksheet cell() method, which takes row and column arguments
  • Get Cell object
  • Read the value/row/column coordinate of the Cell object properties

Write to Excel document

Create and save Excel documents

In practice the following code will have a problem, we create a new WB object, no active page

import openpyxl

Create an empty object
wb = openpyxl.Workbook() 
wb.get_sheet_names() View the sheet of the empty object
sheet = wb.get_active_sheet() Get the current active worksheet
sheet.title
sheet.title = 'Spam Bacon Eggs Sheet' Change the title of the current worksheet
wb.get_sheet_names() View the modified active worksheet title
Copy the code

Improved code

import openpyxl

Create an empty object
wb = openpyxl.Workbook() 
wb.get_sheet_names() View the sheet of the empty object

# Note that active is not a function
sheet = wb.active Get the current active worksheet
sheet.title
sheet.title = 'Spam Bacon Eggs Sheet' Change the title of the current worksheet
wb.get_sheet_names() View the modified active worksheet title
Copy the code

Output [‘Sheet’] Sheet [‘Spam Bacon Eggs Sheet’]

To operate existing Excel, you can use the get_active_sheet routine

import openpyxl
wb = openpyxl.load_workbook('example.xlsx')
sheet = wb.get_active_sheet()
sheet.title = 'Spam Spam Spam'
wb.save('example_copy.xlsx') Save a copy of the modified worksheet name
Copy the code

Create and delete worksheets

import openpyxl
wb = openpyxl.Workbook()
wb.get_sheet_names()

Create a sheet page
wb.create_sheet()
wb.get_sheet_names()

Create a sheet page with the specified name and location
wb.create_sheet(index=0,title='First Sheet')
wb.get_sheet_names()
wb.create_sheet(index=2,title = 'Middle Sheet')
wb.get_sheet_names

# Delete sheet page by name
wb.remove_sheet(wb.get_sheet_by_name('Middle Sheet'))
wb.remove_sheet(wb.get_sheet_by_name('Sheet1'))
wb.get_sheet_names()
['First Sheet'.'Sheet']
Copy the code

Writes values to cells

import openpyxl
wb = openpyxl.Workbook()
sheet = wb.get_sheet_by_name('Sheet')
sheet['A1'] = 'Hello World'
sheet['A1'].value
Copy the code

Project: Update a spreadsheet

Use updated information to build data structures

import openpyxl
wb  = openpyxl.load_workbook('produceSales.xlsx')
sheet = wb.get_sheet_by_name('Sheet')
# the produce types and their updated prices
price_updates = {'Garlic':3.07.'Celery':1.19.'Lemon':1.27}
#ToDo:Loop through the rows and update the prices.
Copy the code

Check all rows and update incorrect prices (high combat value)

For rowNum in range(2,sheet.max_row+1):

for rowNum in range(2,sheet.max_row+1) :#the first row is heading, skip it
    produceName = sheet.cell(row=rowNum,column=1).value
    if produceName in price_updates:
        sheet.cell(row=rowNum,column=2).value = price_updates[produceName]
wb.save('updatedProduceSales.xlsx')
Copy the code

Set cell font style (Focus on)

from openpyxl.styles import Font
wb = openpyxl.Workbook()
sheet = wb.get_sheet_by_name('Sheet')
italic24Font = Font(size = 24, italic = True)
sheet['A1'].font = italic24Font
sheet['A1'] = 'Hello world! '
wb.save('styled.xlsx')
Copy the code

The Font object

import openpyxl
from openpyxl.styles import Font
wb = openpyxl.Workbook()
sheet = wb.get_sheet_by_name('Sheet')

fontobj1 = Font(name='Times New Roman',italic=True) #name Font size Indicates whether to bold italic indicates whether to italic
sheet['A1'].font = fontobj1
sheet['A1'] = 'Bold Times New Roman'

fontobj2 = Font(size=24,italic=True)
sheet['B3'].font = fontobj2
sheet['B3'] ='23 pt Italic'

wb.save('styles.xlsx') # Default size 11 name Calibri
Copy the code

Formula (To understand)

Written to the formula

import openpyxl
wb = openpyxl.Workbook()
sheet = wb.get_active_sheet()
sheet['A1'] = 200
sheet['A2'] = 300
sheet['A3'] = '=SUM(A1:A2)'
wb.save('writeFormula.xlsx')
Copy the code

Only the formula in the table can be read, not the result of the formula

import openpyxl
wb = openpyxl.load_workbook('writeFormula.xlsx')
sheet = wb.get_active_sheet()
sheet['A3'].value
Copy the code

Get the result of the formula

import openpyxl
wbDataonly = openpyxl.load_workbook('writeFormula.xlsx',data_only=True)
sheet1 = wbDataonly.get_active_sheet()
sheet1['A3'].value Open the.xlsx file manually
Copy the code

Adjust rows and columns (Focus on)

Adjust row height and column width

import openpyxl
wb = openpyxl.Workbook()
sheet = wb.active
sheet['A1'] = 'Tall row'
sheet['B2'] = 'Wide column'
sheet.row_dimensions[1].height = 70
sheet.column_dimensions['B'].width = 20
wb.save('dimensions.xlsx')
Copy the code

Merge and split cells (Focus on)

Merged cell

import openpyxl
wb = openpyxl.Workbook()
sheet = wb.active

# merge cells
sheet.merge_cells('A1:D3') 
sheet['A1'] = 'Twelve cells merged together'
sheet.merge_cells('C5:D5')
sheet['C5'] = 'Two merged cells.'
wb.save('merged.xlsx')
Copy the code

Split cell

import openpyxl 
wb = openpyxl.load_workbook('merged.xlsx') # Split cells
sheet = wb.active
sheet.unmerge_cells('A1:D3')
sheet.unmerge_cells('C5:D5')
wb.save('merged.xlsx')
Copy the code

Freeze pane (Focus on)

import openpyxl
wb = openpyxl.load_workbook('produceSales.xlsx')
sheet = wb.active
sheet.freeze_panes = 'C2' 
# "A2" means freeze row1 'B1' means freeze columnA 'C2' means freeze row1 and columnA/columnB 'A1' or None means no freezed
wb.save('freezeExample.xlsx')
Copy the code

The chart (To understand)

import openpyxl 
wb = openpyxl.Workbook()
sheet = wb.active
for i in range(1.11) :#create some data in column A
    sheet['A'+str(i)] = i

refObj = openpyxl.chart.Reference(sheet,min_row = 1,min_col = 1,max_row = 10,max_col = 1) Create data area
seriesObj = openpyxl.chart.Series(refObj,title='First series')
chartObj = openpyxl.chart.BarChart()
chartObj.title = 'My chart'
chartObj.append(seriesObj)

sheet.add_chart(chartObj,'C5') #'C5' indicates the starting position of the chart
wb.save('sampleChart.xlsx')
Copy the code

Problem sets and summaries

  • What does the openPyxl.load_workbook () function return?
    • Return the Workbook object
  • What does the get_sheet_names() workbook return?
    • Returns a list of all worksheet names
  • How to obtain a Worksheet object named “Sheet1”
    • openpyxl.get_sheet_by_name(“Sheet1”)
  • How do I get the Worksheet object of the inactive Worksheet
    • wb.get_active_sheet() / wb.active
  • How do I get the value in cell “C5”
    • sheet[‘C5’].value / sheet.cell(row=5,column=3).value
  • How to set the value in cell C5 to ‘Hello’
    • sheet[‘C5’] = ‘Hello’
  • How do I get the integers that represent the rows and columns of a cell?
    • get_column_letter(int) int –> char
    • column_index_from_string char –> int
  • What is returned by max_row/max_column, and what is the type of the returned value?
    • Sheet1.max_row returns the row number of the cell area as integers
    • Sheet1.max_column returns the column number of the cell region as an integer
  • What function is called to get the index of column ‘M’
    • column_index_from_string(‘M’)
  • What function is called to get the string name for column 14
    • get_column_letter(14)
  • How do I get tuples of all Cell objects from A1 to F1
    • tuple(sheet[‘A1’:’F1])
  • How do I save the workbook to the filename example.xlsx?
    • wb.save(‘example.xlsx’)
  • How do I set a formula in a cell?
    • sheet[‘B1’] = ‘==SUM(A1:B15)’
  • What must I do first if I want to get the result of a formula in a cell, rather than the formula itself?
    • Read with data_only = True
    • Mannual opens and saves a workbook
  • How do I set the height of line 5 to 100
    • sheet.row_dimensions[5] = 100
  • How do I set the width of column C to 70
    • sheet.column_dimensions[‘C’] = 70
  • List some features that OpenPyXl2.1.4 will not load from XLSX
    • Version of chart # has been updated, new version unknown
  • What is a freeze pane?
    • Sheet. Freeze_panes = ‘C2’ freezes the first row and columns A and B
  • Steps and methods to create a bar chart:
    • Loading data OpenPyxl.load_workbook ()
    • Create the reference object # data source area for the diagram
    • Create a series object and add the reference object to the series object
    • Create chart object and add series object to chart object (append method)
    • Add chart object to sheet (add_chart method)