Overview

I had a chance to process an Excel file for a while in my work, so of the processing at that time I tried to summarize the process of adding columns to the Excel file and deleting rows with specific conditions. It may be useful when you need to process the template or the original data a little.

Things necessary

python 3.7.2
pandas
numpy
openpyxl

There is no exe this time.

Publication place

It is published on github.

How to use

Perform various operations on the reference data .xlsx in the input directory.
Modify the contents of reference data .xlsx based on the definition file .xlsx in the input directory.
It is as shown under each item

Item number	Sheet name	Column name	Description
1	Additional columns	item name	Set the value corresponding to the column name of the reference data Add as column name
2	Additional columns	Additional columns	Criteria data Add to the specified column Set starting from column A
3	Additional columns	additional data	Specify the sheet that describes the data to be added. For data, set the value corresponding to the row position.
4	Delete line	item name	Specify the set value corresponding to the column name of the reference data. Specify the column to be used for the condition of the row to be deleted.
5	Delete line	conditions	削除する行のconditionsを指定する。

Source description

The following processing is part of the column addition processing. A sheet that acquires the column name and the position to be added from the acquired file and further describes the data I am getting the data to add from. ʻThe column is added at the position specified by the inser_cols` method. The value is set in the column that acquired the data and added it. I also draw a ruled line when setting the value.

        for columninfo in addColumnInfos:
            addColumnName=columninfo[1]
            addColumnPosition=columninfo[2]
            addColumnData=columninfo[3]
            addColmunDatas=inputWorkbook.parse(addColumnData)
            addDatas=np.asarray(addColmunDatas)
            baseSheet.insert_cols(addColumnPosition,1)
            baseSheet.cell(column=addColumnPosition,row=2,value=addColumnName).border=blackBorder
            for i,data in enumerate (addDatas):
                baseSheet.cell(column=addColumnPosition,row=i+3,value=data[1]).border=blackBorder

The following processing is the line deletion part. Acquires the column name and deletion condition for which the condition to be deleted is specified from the acquired file. After that, after deciding the position of the row to be deleted (the condition is requested by pandas query) The position is deleted by the delete_rows method. If you delete a line, it will be clogged by the deleted amount, so the position of the line is corrected.

        for columninfo in addColumnInfos:
            delTargetColumnName=columninfo[1]
            delRowCondition=columninfo[2]
            deleteRows=targetDeleteRowSheet.query(delTargetColumnName+'=="'+delRowCondition+'"')
            deleteIndexs=deleteRows.index

            log.info(deleteIndexs)
            for delcount,delindex in enumerate(deleteIndexs):
                
                baseSheet.delete_rows(startIndex+delindex-delcount)

How to use

This time, it is deleted based on the conditions of the file defined appropriately, but For example, the row whose summary table result is 0 is deleted. It may be used as one of the methods when processing Excel, such as adding a column to some form and making it for submission.

Excel file column addition and row deletion processing using Python Openpyxl

Overview

Things necessary

Publication place

How to use

Source description

How to use