Notes for reading and generating xlsx files from Java using Apache POI

About this article

The goal is for people who want to use Apache POI to be able to manipulate Excel files while copying and pasting. The implementation example shown here is just the minimum implementation for running Apache POI, and error handling is required separately. The version that has been confirmed to work with the code is 4.1.0.

Preparing Apache POI

You can prepare it from here. https://poi.apache.org/download.html

Objects in Apache POI

ʻOrg.apache.poi.ss.usermodel.Workbook` represents the entire workbook in the xlsx file.

For the xlsx file to be read / written, after creating the Workbook object, first specify the target sheet. This is ʻorg.apache.poi.ss.usermodel.Sheet`.

After getting the Sheet object from the Workbook object, specify the line to read / write. This is ʻorg.apache.poi.ss.usermodel.Row`.

After getting the Row object from the Sheet object, specify the cell to read or write. Specify the column and specify the cell. This is ʻorg.apache.poi.ss.usermodel.Cell`.

After getting the Cell object from the Row object, you can read and write values to the cell as you normally would in Excel.

Read .xlsx file

Get Cell object

First, get the Workbook object from the file path.

Workbook workbook = WorkbookFactory.create(new FileInputStream(filePath));

The Sheet object can be obtained from the sheet name and index (starting from 0 from the leftmost sheet).

//Obtained from the sheet name
Sheet sheet = workbook.getSheet(name);

//Get from index
Sheet sheet = workbook.getSheetAt(index);

From the Sheet object, get the Row object and then the Cell object.

Row row = sheet.getRow(rowIndex);
Cell cell = row.getCell(columIndex);

Basically, to get the Row object and Cell object, it is necessary to specify the index of 0 start such as what row and column, but by using CellReference, it is familiar in Excel. It is also possible to specify in A1 format.

CellReference reference = new CellReference(a1Position);
Row row = sheet.getRow(reference.getRow());
Cell cell = row.getCell(reference.getCol());

Acquisition of the contents of Cell

After getting the Cell object, if you want to get the contents of the cell as a string, execute theCell.getCellType ()method to get the type of value contained in the cell, and how many according to each Or you need to design the corresponding method individually.

If the cell contains a string (STRING), it can be read with the standardCell.getStringCellValue ().

In the case of a numerical value (NUMERIC), the date is also treated as a numerical value, so it is better to pay attention to this and prepare your own method. In the case of dates, in most cases you will want a value that has a styled display format, so you can handle this with your own method.

In the case of the truth value (BOOLEAN), you can get the value of boolean with the standard Cell.getBooleanCellValue (), so convert it to a character string.

For a function value (FORMULA), if you just want the function expression as a string, you can get it withCell.getCellFormula (). However, in general, I think that in most cases you want the calculation result of the function, so in that case, prepare your own method.

If there is no value (BLANK), it is often a merged cell. The merged cell is supposed to have a value in the upper left cell of the merged area, so reading another cell in the merged area will be treated as blank. Since the merged cells are saved in a special way for Excel, we will prepare a separate method to handle this.

Here's a simple implementation of a method that takes a Cell object and returns a String: We'll talk about our own methods below.

String readCell(Cell cell) {
    switch (cell.getCellType()) {
    //String(standard)
    case STRING:
    	return cell.getStringCellValue();
    //Numerical value(Own)
    case NUMERIC:
    	return getStringNmericValue(cell);
    //Truth value(standard)
    case BOOLEAN:
    	return Boolean.toString(cell.getBooleanCellValue());
    //Function value(Own)
    case FORMULA:
    	return getStringFormulaValue(cell);
    //Blank(Own)
    case BLANK:
    	return getStringRangeValue(cell);
    default:
    	System.out.println("Unexpected Type: "+cell.getCellType());
    	return null;
    }
}

Own method of number (NUMERIC)

If the value in the cell is just a number, you can convert it to a string and return it, but if it is a date, adapt the style and then get it as a string.

For standard date formats, you can get it as a string by using java.time, but for user-defined types, use ʻorg.apache.poi.ss.format.CellFormat` to style it. Gets the value of that text after adapting.

The following is a simple implementation example.

//Get the number considering the date
String getStringNmericValue(Cell cell) {
	//For standard dates
	if(DateUtil.isCellDateFormatted(cell)) {
		Date date = cell.getDateCellValue();
		//java.time.Stringify with DateTimeFormatter
		DateTimeFormatter dateTimeFormatter = DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss.SSS");
		LocalDateTime localDateTime = LocalDateTime.ofInstant(date.toInstant(), ZoneId.systemDefault());
		return dateTimeFormatter.format(localDateTime);
	}
	//If the date is set in the user-defined type
	if(BuiltinFormats.FIRST_USER_DEFINED_FORMAT_INDEX <= cell.getCellStyle().getDataFormat()) {
		CellFormat cellFormat = CellFormat.getInstance(cell.getCellStyle().getDataFormatString());
		return cellFormat.apply(cell).text;
	}
	//If it is not a date, it returns a number.
	return Double.toString(cell.getNumericCellValue());
}

Own method of function value (FORMULA)

To get the calculation result of the function, pass the target Cell object to the FormulaEvaluator object and get the calculation result.

However, since the calculation result is returned as a CellValue object instead of a character string, it is judged again byCellValue.getCellType ()whether the calculation result is a character string, a numerical value, or a truth value, and each , Read the value with the standard POI method.

Also note that there are some functions that the FormulaEvaluator object does not support. Please refer to the list of currently supported functions at the end. If you execute an unsupported function, ʻEvaluationException` will be thrown.

If you want to execute a user-defined function, please refer to here. https://poi.apache.org/components/spreadsheet/user-defined-functions.html

The following is a simple implementation example of the own method of FORMULA's own method.

//Execute function and read
String getStringFormulaValue(Cell cell) {
	try {
        //Function execution
		CreationHelper helper = workbook.getCreationHelper();
		FormulaEvaluator evaluator = helper.createFormulaEvaluator();
		CellValue value = evaluator.evaluate(cell);

        //Change the method call destination according to the contents of the calculation result.
		switch (value.getCellType()) {
		case STRING:
			return value.getStringValue();
		case NUMERIC:
			return Double.toString(value.getNumberValue());
		case BOOLEAN:
			return Boolean.toString(value.getBooleanValue());
		default:
			System.out.println("Unexpected Type: "+value.getCellType());
			return null;
		}
	}catch(Exception e) {
		return cell.getCellFormula();
	}
}

Blank (BLANK) own method

In the case of BLANK, it is not a problem if it cannot be read if it is just a blank cell, but in the case of a merged cell, it is a problem if it cannot be read, so make it correspond to the merged cell.

The merged cells are collectively stored in a special area of the sheet, and to determine whether the cell to be read is a merged cell, take out the list of the merged cell area and read it. Determines if there is a merged area that contains the cells in.

If there is a merged area where the cell to be read is included in the area, specify the upper left of the area to read the value.

At this time, the type of cell contents cannot be determined without using Cell.getCellType (), so pass the upper left cell to the method that returns String from Cell again. (ReadCell method in the above implementation example)

The following is a simple implementation example.

//Read merged cells
String getStringRangeValue(Cell cell) {
    //Get the index of the cell to be read
	int rowIndex = cell.getRowIndex();
	int columnIndex = cell.getColumnIndex();

    //Search carefully to see if it is included in the combined area.
	int size = sheet.getNumMergedRegions();
	for (int i = 0; i < size; i++) {
		CellRangeAddress range = sheet.getMergedRegion(i);
        //If there is an area included
		if (range.isInRange(rowIndex, columnIndex)) {
			//Get the upper left cell
			Cell firstCell = sheet.getRow(range.getFirstRow()).getCell(range.getFirstColumn());
            //Cell->Throw to the method of String.
			return readCell(firstCell);
		}
	}
	return null;
}

Write .xlsx file

Creating Workbook, Sheet, Row, Cell objects

When creating a new workbook in xlsx format, create an instance of ʻorg.apache.poi.xssf.usermodel.XSSFWorkbookand receive it as a variable of typeWorkbook`.

If you want to create a Sheet object, it is better to convert it to a name that can be used for the sheet withWorkbookUtil.createSafeSheetName ()and then create it.

Regarding the Row object and Cell object, in the Sheet object in the initial state, all rows and cells are null, so each object is created once before writing to the cell. is needed. If you want to specify rows / columns in A1 format, use CellReference as you did in the above read.

//Generate Workbook object
Workbook workbook = new XSSFWorkbook();

//Generate Sheet object
String safeName = WorkbookUtil.createSafeSheetName(name);
Sheet sheet = workbook.createSheet(safeName);

//Conversion from A1 format to index
CellReference reference = new CellReference(a1Position);
int rowIndex = reference.getRow();
int columIndex = reference.getCol();

//Generating a Row object
Row row = sheet.createRow(rowIndex);

//Generating a Cell object
Cell cell = row.createCell(columIndex);

File output

After making the various changes shown below to the Workbook object, use FileOutputStream to finally export the xlsx file. (Although it is omitted for the sake of brevity, please write try-catch appropriately in the actual code.)

FileOutputStream fileOut = new FileOutputStream(filePath);
workbook.write(fileOut);
fileOut.close();

Writing a string to a cell

Writing a string is very simple, just call Cell.setvalue (). However, when creating a method that writes by specifying the index of the cell, it is better to check the null of the Row object or Cell object at that position.

Here is a simple implementation example.

void writeCellAt(int rowIndex, int columIndex, String value) {
    //Row,Cell null check
	Row row = sheet.getRow(rowIndex);
    if(row == null) {
        row = sheet.createRow(rowIndex);
    }
	Cell cell = row.getCell(columIndex);
    if(cell == null) {
		cell = row.createCell(columIndex);
	}

    //Writing a string
	cell.setCellValue(value);
}

Basically, as long as you can write a character string, I think that complicated processing can be implemented using Java functions and libraries.

If it is a simple tabular format, it is possible with a csv file, but I think that there are some cell decorations as a reason to use xlsx. I think that there are many people in the environment where xlsx is used as the first document rather than PDF or docx, so the basic decoration method is shown below. It is still better to check the null of Row and Cell for the decorations shown below.

Merge cells

This time, use Sheet.addMergedRegion () to register with the Sheet object, just as it was obtained from the Sheet object when reading the merged cells.

//topRow: Row index of the upper left cell
//topColum: Column index of the upper left cell
//bottomRow: Row index of the lower right cell
//bottomColum: Column index of the lower right cell
CellRangeAddress cellRangeAddress = new CellRangeAddress(topRow, bottomRow, topColum, bottomColum);
sheet.addMergedRegion(cellRangeAddress);

Change row height

You can change the row height with Row.setHeightInPoints (). If it is difficult to specify a specific height, you can specify it in multiples of the default value.

//multiple
float mult = 2.0f;
//Set to twice the default value
row.setHeightInPoints(mult*sheet.getDefaultRowHeightInPoints());

To wrap and display the whole thing, use CellStyle for the target Cell object as follows:

CellStyle style = workbook.createCellStyle();
style.setWrapText(true);
cell.setCellStyle(style);

If this does not give you a good and dynamic fit, please refer to here. https://stackoverflow.com/questions/19145628/auto-size-height-for-rows-in-apache-poi

Also, if the height adjustment does not work, there seems to be a way to count line breaks and set it in multiples. https://stackoverflow.com/questions/45387206/how-to-increase-the-height-of-the-excel-row-using-apache-poi-having-merged-cell/45388037

Ruled line drawing

To draw a ruled line, specify the target Cell object individually using CellStyle as shown below. The thickness is specified by BorderStyle. It's a confusing name, but BorderStyle.THIN generally represents a thick border and BorderStyle.MEDIUM represents a thick border.

CellStyle style = workbook.createCellStyle();
style.setBorderTop(BorderStyle.THIN);
style.setBorderBottom(BorderStyle.THIN);
style.setBorderLeft(BorderStyle.THIN);
style.setBorderRight(BorderStyle.THIN);

cell.setCellStyle(style);

Change font color

The font color can also be changed with CellStyle for the target Cell object.

For colors, a standard color palette is provided as a variable in advance. Please refer to here for color samples. https://lincolnminto.wordpress.com/2014/04/07/apache-poi-referencia-para-estilos-de-celulas-apache-poi-cellstyles-reference/

CellStyle style = workbook.createCellStyle();
Font font = workbook.createFont();
font.setColor(IndexedColors.WHITE.getIndex());
style.setFont(font);
cell.setCellStyle(style);

Cell fill

The cell fill can also be changed with CellStyle for the Cell object.

Again, confusingly, use CellStyle.setFillForegroundColor () instead of CellStyle.setFillBackgroundColor ().

CellStyle style = workbook.createCellStyle();
style.setFillForegroundColor(IndexedColors.CORAL.getIndex());
style.setFillPattern(FillPatternType.SOLID_FOREGROUND);
cell.setCellStyle(style);

I'm not sure if it works properly, but there seems to be a way to specify RGB for font color and cell fill color as follows.

IndexedColorMap colorMap = workbook.getStylesSource().getIndexedColors();
XSSFColor grey = new XSSFColor(new java.awt.Color(192,192,192), colorMap);
cellStyle.setFillForegroundColor(grey);

Reference: https://stackoverflow.com/questions/10912578/apache-poi-xssfcolor-from-hex-code

Paste image

Paste the image on the Workbook object.

After converting to a byte string with ʻIOUtils.toByteArray (), registering it in the workbook with Workbook.addPicture ()`, the index of the image is acquired as the return value. Some of the file formats that can be specified at this time are shown below.

constant format
PICTURE_TYPE_DIB bmp
PICTURE_TYPE_JPEG jpeg
PICTURE_TYPE_PNG png

The position and size of the image are specified using ClientAnchor. Specify the upper left cell and the lower right cell to adjust the position and size. However, ClientAnchor.setCol2 () and ClientAnchor.setRow2 (), which specify the lower right cell, are actually expanded because they are expanded to the specified index-1 position (one upper left cell). Specify the value of index + 1 in the lower right cell of the desired range.

Anchor type is what Excel calls the shape formatting properties field, where you can specify whether to make changes to the image as cells move or resize. The following is a list of constants and settings.

constant Description
DONT_MOVE_AND_RESIZE Even if the row / column is resized, the image is not moved / resized.
DONT_MOVE_DO_RESIZE The image is not moved, but it is resized to fit the anchor cell.
MOVE_AND_RESIZE Move and resize.
MOVE_DONT_RESIZE It moves according to the cell of the anchor,

Finally, use the Drawing object to paste the image with the specified index at the specified anchor position. The following is a simple implementation example. (Although it is omitted for the sake of brevity, please write try-catch appropriately in the actual code.)

//Convert to byte string
InputStream in = new FileInputStream(filePath);
byte[] bytes = IOUtils.toByteArray(in);
in.close();

//Image registration, index acquisition
int pictureIndex = workbook.addPicture(byteWorkbook.PICTURE_TYPE_PNG);

//Specify the position and size with the anchor
ClientAnchor anchor = workbook.getCreationHelper().createClientAnchor();
anchor.setCol1(topColum);
anchor.setRow1(topRow);
anchor.setCol2(bottomColum+1);
anchor.setRow2(bottomRow+1);
anchor.setAnchorType(AnchorType.MOVE_AND_RESIZE);

//Paste the image of the specified index at the position of the specified anchor
Drawing patriarch = sheet.createDrawingPatriarch();
patriarch.createPicture(anchor, pictureIndex);

[Appendix] Built-in functions supported by POI

Reference: https://poi.apache.org/components/spreadsheet/eval-devguide.html

ABS ACOS ACOSH ADDRESS AND
ASIN ASINH ATAN ATAN2 ATANH
AVEDEV AVERAGE BIN2DEC CEILING CHAR
CHOOSE CLEAN CODE COLUMN COLUMNS
COMBIN COMPLEX CONCATENATE COS COSH
COUNT COUNTA COUNTBLANK COUNTIF COUNTIFS
DATE DAY DAYS360 DEC2BIN DEC2HEX
DEGREES DELTA DEVSQ DGET DMIN
DOLLAR EDATE EOMONTH ERROR.TYPE EVEN
EXACT EXP FACT FACTDOUBLE FALSE
FIND FIXED FLOOR FV HEX2DEC
HLOOKUP HOUR HYPERLINK IF IFERROR
IMAGINARY IMREAL INDEX INDIRECT INT
INTERCEPT IPMT IRR ISBLANK ISERR
ISERROR ISEVEN ISLOGICAL ISNA ISNONTEXT
ISNUMBER ISODD ISREF ISTEXT LARGE
LEFT LEN LN LOG LOG10
LOOKUP LOWER MATCH MAX MAXA
MDETERM MEDIAN MID MIN MINA
MINUTE MINVERSE MIRR MMULT MOD
MODE MONTH MROUND NA NETWORKDAYS
NOT NOW NPER NPV OCT2DEC
ODD OFFSET OR PERCENTILE PI
PMT POISSON POWER PPMT PRODUCT
PROPER PV QUOTIENT RADIANS RAND
RANDBETWEEN RANK RATE REPLACE REPT
RIGHT ROMAN ROUND ROUNDDOWN ROUNDUP
ROW ROWS SEARCH SECOND SIGN
SIN SINH SLOPE SMALL SQRT
STDEV SUBSTITUTE SUBTOTAL SUM SUMIF
SUMIFS SUMPRODUCT SUMSQ SUMX2MY2 SUMX2PY2
SUMXMY2 T TAN TANH TEXT
TIME TODAY TRANSPOSE TRIM TRUE
TRUNC UPPER VALUE VAR VARP
VLOOKUP WEEKDAY WEEKNUM WORKDAY YEAR
YEARFRAC

Referenced sites / articles

--Hishidama's homepage Apache POI http://www.ne.jp/asahi/hishidama/home/tech/apache/poi/

--Let's Programming Operate Excel with Apache POI https://www.javadrive.jp/poi/

--Sukero's story Insert an image into EXCEL with POI https://www.sukerou.com/2019/03/poiexcel.html

--Qiita Apache POI memorandum Excel reading https://qiita.com/panage/items/4476ff91d1fcec67f525

Recommended Posts

Notes for reading and generating xlsx files from Java using Apache POI
[Java] Text extraction from PowerPoint (ppt) using Apache POI
[Java] Handle Excel files with Apache POI
[Java] Reading and writing files with OpenCSV
Reading and writing gzip files in Java
Review notes for Java 1.7 and later file copies
[Review] Reading and writing files with java (JDK6)
Switched property files for development and production using JMockit
JSP error display from Servlet
A Simple CRUD Sample Using Java Servlet / JSP and MySQL
Classes and instances Java for beginners
Notes for reading and generating xlsx files from Java using Apache POI
[For myself] Transfer of Servlet class processing
Save files and folders using File Manager
Write code using Ruby classes and instances
Convert large XLSX files to CSV with Apache POI
Switched property files for development and production using JMockit
[Java] Development with multiple files using package and import
How to write and notes when migrating from VB to JAVA
Tips for using Salesforce SOAP and Bulk API in Java
Excel operation using Apache POI
Try scraping using java [Notes]
Apache Hadoop and Java 9 (Part 1)
apache POI personal notes crossfish21
Using Docker from Java Gradle
Java while and for statements
Excel output using Apache POI!
Notes on building Kotlin development environment and migrating from Java to Kotlin