In this article we want to learn about Mastering Excel Manipulation with Python Openpyxl.
Introduction
Python has become a go-to language for data manipulation, and one of the most common data formats is Excel files. While Excel files can be manually modified, using Python to manipulate them can provide significant benefits. One popular library for working with Excel files in Python is openpyxl. In this blog post, we’ll explore openpyxl, and also how to be Mastering Excel Manipulation with Python Openpyxl.
What is Openpyxl ?
Openpyxl is a Python library for reading and writing Excel files (XLSX) and is an alternative to XlsxWriter, which we have covered in another blog post. It was created to support Excel 2010 and later, and it’s been a popular choice among developers since its creation.
Features of Openpyxl
Openpyxl provides several features for working with Excel files, these are some features:
- Reading and writing cell data: With openpyxl you can read and write cell data from Excel files. this includes text, numbers, dates and formulas.
- Formatting cells: You can also format cells in Excel files with openpyxl. This includes setting the font, color, and size of cells, as well as alignment, border styles and more.
- Working with charts: openpyxl supports working with charts allowing you to add charts to Excel files and customize their appearance.
- Data validation: You can add data validation to cells in Excel files with openpyxl. This ensures that users input valid data in specific cells.
- Working with worksheets: openpyxl provides several features for working with worksheets in Excel files, including adding new sheets, renaming sheets and deleting sheets.
How to use Openpyxl ?
To use openpyxl you first need to install the library. You can use pip which is package installer for Python to install openpyxl. Run the following command in your command prompt:
1 |
pip install openpyxl |
Once openpyxl is installed, you can start working with Excel files in Python.
Here’s a simple example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
import openpyxl # Open the Excel file workbook = openpyxl.load_workbook('example.xlsx') # Select a worksheet worksheet = workbook['Sheet1'] # Read cell data cell_value = worksheet['A1'].value print(cell_value) # Write cell data worksheet['B1'] = 'Hello, World!' # Save the changes workbook.save('example.xlsx') |
In this example we first opened an Excel file called “example.xlsx” and selected worksheet named “Sheet1”. After that we read the value of cell A1 and printed it to the console. Next we wrote the string “Hello, World!” to cell B1 and saved the changes to the file.
These are some more examples of what you can do with openpyxl in Python:
- Adding a new worksheet:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
import openpyxl # Open the Excel file workbook = openpyxl.load_workbook('example.xlsx') # Create a new worksheet worksheet = workbook.create_sheet('New Sheet') # Write data to the new worksheet worksheet['A1'] = 'New data' # Save the changes workbook.save('example.xlsx') |
In this example, we added a new worksheet called “New Sheet” to the Excel file and wrote data to cell A1.
- Formatting cells:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
import openpyxl from openpyxl.styles import Font, Alignment, Border, Side # Open the Excel file workbook = openpyxl.load_workbook('example.xlsx') # Select a worksheet worksheet = workbook['Sheet1'] # Set cell font, alignment, and border font = Font(name='Arial', size=12, bold=True, color='FF0000') alignment = Alignment(horizontal='center', vertical='center') border = Border(left=Side(style='thin'), right=Side(style='thin'), top=Side(style='thin'), bottom=Side(style='thin')) cell = worksheet['A1'] cell.font = font cell.alignment = alignment cell.border = border # Save the changes workbook.save('example.xlsx') |
In this example, we formatted cell A1 with a red, bold, Arial font, centered alignment, and a thin border on all sides.
- Working with charts:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
import openpyxl from openpyxl.chart import BarChart, Reference, Series # Open the Excel file workbook = openpyxl.load_workbook('example.xlsx') # Select a worksheet worksheet = workbook['Sheet1'] # Create a chart chart = BarChart() chart.title = 'Sales by Region' chart.x_axis.title = 'Region' chart.y_axis.title = 'Sales' # Define chart data data = Reference(worksheet, min_col=2, min_row=1, max_col=3, max_row=6) categories = Reference(worksheet, min_col=1, min_row=2, max_row=6) # Add chart data and categories chart.add_data(data, titles_from_data=True) chart.set_categories(categories) # Add the chart to the worksheet worksheet.add_chart(chart, 'E1') # Save the changes workbook.save('example.xlsx') |
In this example, we created a bar chart showing sales by region using data from columns B and C and categories from column A.
These examples demonstrate the versatility and power of openpyxl for working with Excel files in Python.
Learn More on Python
- Python-Docx: Creating Microsoft Word Documents with Python
- Python and Microsoft Word: A Beginner’s Guide to Automating Documents
- How to Install docx2python: Python Library for Word Documents
- Merge Microsoft Word Documents with Python Docxcompose
- Master Excel File Manipulation with Python XlsxWriter
- Python Treq: An Introduction to a Powerful HTTP Client Library
- Introduction to Python httplib2 Library
- An Introduction to Python’s urllib Library
- Python httpx: A High-Performance HTTP Client for Python 3
What are Other Python Libraries Instead of Openpyxl
There are several other Python libraries that can be used for working with Excel files. Some of these include:
- xlwt: A library for writing data to Excel files in the older .xls format.
- xlrd: A library for reading data from Excel files in the older .xls format.
- pandas: A library for data analysis and manipulation that includes functionality for reading and writing Excel files.
- pyxlsb: A library for reading and writing Excel files in the newer .xlsb binary format.
- xlsxwriter: A library for creating Excel files from scratch, with support for formatting and charting.
- win32com: A library for interacting with Microsoft Office applications, including Excel, using COM (Component Object Model) technology.
Each of these libraries has its own strengths and weaknesses, and the choice of which one to use will depend on the specific task and requirements of the project. For example, if you need to read and manipulate large amounts of data from Excel files, pandas may be a good choice due to its powerful data analysis tools. If you need to create an Excel file from scratch, xlsxwriter may be a good choice due to its formatting and charting capabilities. Ultimately, the best library to use will depend on your specific use case and preferences.
Final Thoughts
Openpyxl is popular Python library for working with Excel files, providing features for reading and writing cell data, formatting cells, working with charts and many more. With openpyxl, you can manipulate Excel files programmatically, automate data processing tasks and create reports. The library is easy to use and well-documented making it popular choice among developers.