This article is for Ver 3.0.2 or earlier. OpenPyXL is being improved day by day, and in the future it may be possible to resolve the issues described here.
OpenPyXL allows you to brute force create an Excel sheet from a brand new workbook with just Python code, but creating it from scratch makes the code more difficult to read and separates the code and design. It is more advantageous for maintenance, so I think it is a standard practice to read a template Excel workbook designed to some extent with OpenPyXL and set the necessary data.
However, OpenPyXL works in a completely different way from Excel, so there are some things that can be done easily with Excel VBA but not with OpenPyXL.
Except for the following conditions, it is often difficult to implement processing using the template Excel workbook only with the current OpenPyXL.
--Those without images or shapes --Those that do not require a copy of the worksheet --Those that do not make heavy use of merged cells --Do not add / remove rows or columns in the middle of the worksheet
--Things with images and shapes
In OpenPyXL, images and shape information are ignored when the book is opened. If you want to handle images and shapes in OpenPyXL, you need to set it in the OpenPyXL process after opening the book.
--What you need to copy the worksheet
The function workbook.copy_worksheet ()
that copies worksheets behaves differently than that of Excel (Worksheets.Copy
) and has some information that is not copied, such as conditional formatting.
Rewriting the OpenPyXL source code can help to some extent, but it still doesn't copy as perfectly as Excel.
--Those that make heavy use of merged cells
OpenPyXL can handle merged cells themselves, but there are many restrictions such as drawing ruled lines and setting formats. (It is easy to cause trouble) For the parts operated by OpenPyXL, it is necessary to avoid using merged cells as much as possible by devising the design, or to confirm that there is no problem in the operation of OpenPyXL before using it.
--Do not add rows or columns in the middle of the worksheet
worksheet.insert_rows ()
and worksheet.insert_col ()
that insert rows / columns are ** pretty ** behaviors of Excel's (row (). Insert
, col (). Insert
) Is different.
Even if the above function is executed, the attribute information (defined name, format, conditional format) given to the cell, merged cell, image information placed on the cell, function formula, etc. do not follow, so the current situation The version is ** virtually useless **.
If you add (overwrite) rows and columns to the end instead of in the middle of the worksheet, the problem should be relatively small.
OpenPyXL can handle a template Excel workbook (xlsm) containing VBA macros by adding keep_vba = True
to the open argument.
Therefore, it is possible to realize difficult / impossible processing with OpenPyXL by letting Excel VBA macros perform difficult processing with OpenPyXL.
(Unfortunately, it cannot be done in an environment where execution of Excel VBA macros is not permitted due to internal circumstances etc.)
We will organize the processing performed by OpenPyXL and the processing performed by Excel VBA macros, and perform code processing and settings for Python and Excel VBA respectively. For example, you can copy a worksheet and add / delete matrices in the middle of a worksheet with an Excel VBA macro.
For the data required by the Excel VBA macro, create a data worksheet separate from the template worksheet and set the data in it with OpenPyXL.
If you need to handle images that can only be created on the OpenPyXL side, create a data worksheet and place the images on it with OpenPyXL. If you cannot set an identification name for the image created by ʻadd_image () `of OpenPyXL and you need to pass multiple images, you can distinguish the images because the Excel VBA macro cannot distinguish the images. It is also necessary to devise something like this.
I think it's faster to take a look at the sample than to explain it in a long way. The main processing of both Python and Excel VBA is about 100 lines. To run the sample, Python requires the OpenPyXL package as well as the Pillow package for editing images. (If you get an error, please install the package if necessary)
It is available on Github (https://github.com/umazular/openpyxl).
--Standard Python 2.7.5 + OpenPyXL 2.6.4 on CentOS 7 --Python 3.7.3 + OpenPyXL 3.0.2 on Raspbian (Debian 10.2) --Python 3.7.2 + OpenPyXL 3.0.0 installed on Windows10
I have confirmed at. (I try not to depend on the environment as much as possible)
The sample was created by imagining the output of the receipt. Read one or more CSV receipt data + images and output the result to Excel. In the sample, CSV files and images for 2 sheets are created in advance. The default CSV file of the sample is Shift-JIS, so replace it with UTF-8 according to your environment.
I tried to set the template report sheet ("receipt") that seems difficult with OpenPyXL.
--Conditional formatting to change the background color for each line in the line --A function formula that sets the product of quantity and unit price in the amount --A function formula that sets the total amount of money --Format when printing a sheet (settings such as margins and printing of all columns)
Excel VBA can be opened by selecting "Development"-> "Visual Basic" from the Excel menu. If the "Development" tab is not displayed, open the Excel options screen with File-> Options and check "Development" on the main tab of the user settings on the ribbon to display it.
Running a python program will generate an Excel workbook (sampleoutput.xlsm) with the data set.
$ ls *.xlsm
sample.xlsm
$ python sample.py
$ ls *.xlsm
sample.xlsm sampleoutput.xlsm
$
When the output Excel workbook is opened in an environment where macro execution is permitted, the data setting macro processing is executed only for the first time and the result is created.
--Official document (English) https://openpyxl.readthedocs.io/
--Notes on how to use openpyxl in python https://qiita.com/sky_jokerxx/items/dc9d8827d946b467ba4b
Recommended Posts