Srikanth Technologies

How to use Excel in Python using openpyxl

Excel is a very popular spreadsheet software. There are lots of spreadsheets with very valuable data. In this blog, we learn how to use Excel spreadsheets from Python using a library called openpyxl.

Installing openpyxl

Install openpyxl library from pypi.org using the following pip command from command prompt.

pip install openpyxl

Creating workbook and sheet

In order to create a new .xlsx file with a new worksheet, use the following code.

from openpyxl import Workbook

wb = Workbook() # create a new workbook that contains sheets
ws = wb.active  # get access to active (default) sheet
ws.title = "Products"  # Change title of the sheet

# place data in different cells
ws['A1'] = 'Name'
ws['B1'] = 'Price'
ws['A2'] = 'iPhone 14 Pro'
ws['B2'] = 120000
ws['A3'] = 'AirPods 3rd Gen'
ws['B3'] = 20000

wb.save('products.xlsx')  # save workbook to products.xlsx file in the current directory

Reading data from a sheet

In order to open xlsx file and read data from a sheet called products, use the following code.

from openpyxl import load_workbook

wb = load_workbook(filename='products.xlsx')   # Open file products.xlsx
ws = wb["Products"]  # Access sheet with name Products

# display values from specific cells - a2 and b2
print(f"{ws['a2'].value} - {ws['b2'].value}")

Use the following code to read data from specific rows and columns using iter_rows() function.

from openpyxl import load_workbook

wb = load_workbook(filename='products.xlsx')   # Open file products.xlsx
ws = wb["Products"]  # Access sheet with name Products
 
from openpyxl import load_workbook

wb = load_workbook(filename='products.xlsx')
ws = wb["Products"]  # Access sheet with name Products

for row in ws.iter_rows(min_row=2, max_col=2, max_row=3):
    for cell in row:
        print(cell.value)

We can read all values from all cells using values() function as follows:

from openpyxl import load_workbook

wb = load_workbook(filename='products.xlsx')   # Open file products.xlsx
ws = wb["Products"]  # Access sheet with name Products
 
from openpyxl import load_workbook

wb = load_workbook(filename='products.xlsx')
ws = wb["Products"]  # Access sheet with name Products

# values returns a collection of tuples, where each tuple represents a row and contains values of columns
for row in ws.values:
    for cell in row:
        print(cell)

Converting worksheet data to JSON

We can convert values in a sheet to an array of JSON objects. In order to convert rows to an array of JSON objects, we need to first convert each row to a dictionary and add these dictionaries to a list.

Finally we can convert list of dict to an array of JSON objects using dumps() function of json module.

from openpyxl import load_workbook
import json

wb = load_workbook(filename='products.xlsx')
ws = wb["Products"]  # Access sheet with name Products

headings = [ws['a1'].value, ws['b1'].value]    # take values from first row 

products = []
for row in ws.iter_rows(min_row=2, max_col=2, max_row=3): # take two rows from 2nd row 
    # convert each row to a dict by using heading as key and cell value as value
    product = {} 
    for heading, cell in zip(headings, row):
        product[heading] = cell.value 

    products.append(product)    # add dict to list

# Convert a list of dict to an array of JSON Objects 
print(json.dumps(products))

# Write JSON array to file 
json.dump(products, open("products.json","wt"))

Load data from worksheet to table

Now we take data from worksheet and insert rows into PRODUCTS table (with id, name and price columns) in Sqlite database.

from openpyxl import load_workbook
import sqlite3

# connect to database
con = sqlite3.connect(r"c:\dev\python\test.db")
cur = con.cursor()

# Open Excel file and access sheet
wb = load_workbook(filename='products.xlsx')
ws = wb["Products"]  # Access sheet with name Products

for row in ws.iter_rows(min_row=2, max_col=2, max_row=3):
    values = []   # create a list that contains values of current row 
    for cell in row:
        values.append(cell.value)

    # insert a row into PRODUCTS table. Convert list to tuple and pass it as second parameter for execute()
    cur.execute("insert into products(name, price) values(?, ?)", tuple(values))

# make insertions permanent 
con.commit()

# close connection and cursor
cur.close()
con.close()

In this blog, I tried to give you an overview of what can be done with openpyxl library.

Do read its documentation if you want to know what else can be done.

All the best!

Srikanth Pragada