[JAVA] I want to avoid OutOfMemory when outputting large files with POI

There is a process to output an Excel file using POI. The specification is to output up to 10000 rows of data in 48 columns, but when I tried to output 10000 rows, an OutOfMemory error occurred. Make a note of how to deal with such cases.

Don't say that Java doesn't handle such a large amount of data in Excel in the first place. ..

Implementation of the problem

The image of the source where the problem occurred is as follows.

Source of problem


	File file = new File(tempPath);	//Temporary file path File created in advance Use this as a template
	XSSFWorkbook workbook = (XSSFWorkbook) WorkbookFactory.create(file);
	XSSFSheet sheet = workbook.getSheetAt(0);
	XSSFRow baseRow = sheet.getRow(sheet.getLastRowNum());	//Reference row: Copy the style of each cell
	int rowCnt = 1;
	for (1 acquired data: list) {	//list is pre-obtained, source is not specified
		sheet.createRow(sheet.getLastRowNum() + 1);
		XSSFRow newRow = sheet.getRow(sheet.getLastRowNum());
		int cellCnt = 0;
		XSSFCell originCell = null;
		XSSFCell newCell = null;
		XSSFCellStyle style = workbook.createCellStyle();
		originCell = baseRow.getCell(cellCnt);
		newCell = newRow.createCell(cellCnt++);
		//Copy cell style
		style.cloneStyleFrom(originCell.getCellStyle());
		newCell.setCellStyle(style);
		//Copy of cell type
		newCell.setCellType(originCell.getCellType());
		newCell.setCellValue(1 acquired data.value);
		//Repeat setting for 14 columns
	}
	//Set creation date
	sheet.getRow(0).getCell(2).setCellValue(LocalDate.now().format(DateTimeFormatter.ofPattern("yyyy/MM/dd")));
	//Delete base line
	sheet.shiftRows(baseRow.getRowNum() + 1, sheet.getLastRowNum(), -1);
	//Write to the output stream and get a byte array
	ByteArrayOutputStream baos = new ByteArrayOutputStream();
	workbook.write(baos);
	excelData = baos.toByteArray();
	baos.close();
	workbook.close();
	//Delete temporary files
	file.delete();
	//Put the byte array in the response and download it

The outline of the process is as follows.

--Read the file created as a template file --Copy the style set in the bottom line (base line) of the template and reflect it in the additional line. --The reference line is deleted after adding all lines. --Be sure to set the created date on the first line.

An OutOfMemory error occurred when calling XSSFWorkbook.write ().

Countermeasures

When I googled the countermeasure method, it is said that processing will be lighter if SXSSF Workbook is used instead of XSSF Workbook. SXSSFWorkbook I referred to the following documents.

https://poi.apache.org/spreadsheet/#SXSSF+%28Since+POI+3.8+beta3%29

It is an extension of XSSF. XSSF can access all rows and can use all API functions, but the memory occupancy increases because all row information is expanded in memory. On the other hand, SXSSF seems to reduce the memory occupancy by reducing the number of accessible lines. (Please point out if there is a misunderstanding)

Make corrections

Fix 1

Try replacing the previous code using the SXSSF Workbook.

Replace XSSF Workbook with SXSSF Workbook


	File file = new File(tempPath);	//Temporary file path File created in advance Use this as a template
	SXSSFWorkbook workbook = (SXSSFWorkbook) WorkbookFactory.create(file);
	SXSSFSheet sheet = workbook.getSheetAt(0);
	SXSSFRow baseRow = sheet.getRow(sheet.getLastRowNum());	//Reference row: Copy the style of each cell
	int rowCnt = 1;
	for (1 acquired data: list) {	//list is pre-obtained, source is not specified
		sheet.createRow(sheet.getLastRowNum() + 1);
		SXSSFRow newRow = sheet.getRow(sheet.getLastRowNum());
		int cellCnt = 0;
		SXSSFCell originCell = null;
		SXSSFCell newCell = null;
		CellStyle style = workbook.createCellStyle();
		originCell = baseRow.getCell(cellCnt);
		newCell = newRow.createCell(cellCnt++);
		//Copy cell style
		style.cloneStyleFrom(originCell.getCellStyle());
		newCell.setCellStyle(style);
		//Copy of cell type
		newCell.setCellType(originCell.getCellType());
		newCell.setCellValue(1 acquired data.value);
		//Repeat setting for 14 columns
	}
	//Set creation date
	sheet.getRow(0).getCell(2).setCellValue(LocalDate.now().format(DateTimeFormatter.ofPattern("yyyy/MM/dd")));
	//Delete base line
	sheet.shiftRows(baseRow.getRowNum() + 1, sheet.getLastRowNum(), -1);
	//Write to the output stream and get a byte array
	ByteArrayOutputStream baos = new ByteArrayOutputStream();
	workbook.write(baos);
	excelData = baos.toByteArray();
	baos.close();
	workbook.close();
	//Delete temporary files
	file.delete();
	//Put the byte array in the response and download it

Doing this will result in an error. .. ..

error contents


java.lang.ClassCastException: org.apache.poi.xssf.usermodel.XSSFWorkbook cannot be cast to org.apache.poi.xssf.streaming.SXSSFWorkbook

WorkbookFactory.create (file); does not support SXSSF Workbook. After investigating, it is possible to specify an XSSF Workbook instance as an argument when creating an instance of SXSSF Workbook.

Fix 2

Instantiation of SXSSF Workbook


	File file = new File(tempPath);	//Temporary file path File created in advance Use this as a template
	XSSFWorkbook original = (XSSFWorkbook) WorkbookFactory.create(file);
	SXSSFWorkbook workbook = new SXSSFWorkbook(original);
	SXSSFSheet sheet = workbook.getSheetAt(0);
	SXSSFRow baseRow = sheet.getRow(sheet.getLastRowNum());	//Reference row: Copy the style of each cell
	original.close();	//Loaded into SXSSF Workbook(?)Close for

Replace the instance creation with the above and execute it again, and when the above SXSSFSheet.getLastRowNum () is executed, it becomes a nullpo.

Error content 2


java.lang.NullPointerException

Since the template file is being read by instantiating the XSSF Workbook, I imagined that the last line of the existing line of the template file could be accessed, but an error occurred. Therefore, specify the last number of existing lines "6" and execute again. The following error occurs.

Error content 3


java.lang.IllegalArgumentException: Attempting to write a row[6] in the range [0,6] that is already written to disk.

Apparently, the existing line of the file read by XSSF Workbook is inaccessible at the time of instantiation of SXSSF Workbook.

Fix 3

However, due to the implementation specifications this time, I would like to enable access to existing lines. As a result of various trials and errors, it was found that the existing line can be accessed in the following form. (I haven't confirmed the correct method, but it can be done)

Access an existing line in the template file


	File file = new File(tempPath);	//Temporary file path File created in advance Use this as a template
	XSSFWorkbook original = (XSSFWorkbook) WorkbookFactory.create(file);
	XSSFSheet orgSheet = original.getSheetAt(0);
	XSSFRow baseRow = sheet.getRow(sheet.getLastRowNum());	//Reference row: Copy the style of each cell
	SXSSFWorkbook workbook = new SXSSFWorkbook(original);
	SXSSFSheet sheet = workbook.getSheetAt(0);
	original.close();	//Loaded into SXSSF Workbook(?)Close for

In the above execution, an error occurred when calling SXSSFWorkbook.write ().

Error content 4


java.io.IOException: Zip bomb detected! The file would exceed the max. ratio of compressed file size to the size of the expanded data. This may indicate that the file is used to inflate memory usage and thus could pose a security risk. You can adjust this limit via ZipSecureFile.setMinInflateRatio() if you need to work with files which exceed this limit. Counter: 820224, cis.counter: 8192, ratio: 0.009987515605493134Limits: MIN_INFLATE_RATIO: 0.01

After investigating, it is possible to avoid it by calling ZipSecureFile.setMinInflateRatio ().

Completed version

The final finished source with the call to ZipSecureFile.setMinInflateRatio () is shown below.

Completed source


	File file = new File(tempPath);	//Temporary file path File created in advance Use this as a template
	XSSFWorkbook original = (XSSFWorkbook) WorkbookFactory.create(file);
	XSSFSheet orgSheet = original.getSheetAt(0);
	SXSSFWorkbook workbook = new SXSSFWorkbook(original);
	SXSSFSheet sheet = workbook.getSheetAt(0);
	Row baseRow = orgSheet.getRow(orgSheet.getLastRowNum());
	int rowCnt = 1;
	int rowNum = 6;
	boolean isFirst = true;
	for (1 acquired data: list) {    //list is pre-obtained, source is not specified
		SXSSFRow newRow = sheet.createRow(rowNum++);
		int cellCnt = 0;
		XSSFCell originCell = null;
		SXSSFCell newCell = null;
		CellStyle style = workbook.createCellStyle();
		if (isFirst) {    //Execute only the first line
			sheet.changeRowNum(newRow, 5);
			rowNum = 6;
			baseRow = newRow;
			isFirst = false;
		}
	}
	//Write to the output stream and get a byte array
	ByteArrayOutputStream baos = new ByteArrayOutputStream();
	ZipSecureFile.setMinInflateRatio(0.001);	//It seems that it is checking the compression rate and the size of one entry...
	workbook.write(baos);
	excelData = baos.toByteArray();
	baos.close();
	workbook.close();
	original.close();
	//Delete temporary files
	file.delete();
	//Put the byte array in the response and download it

When run, the OutOfMemory error no longer occurs.

Recommended Posts

I want to avoid OutOfMemory when outputting large files with POI
Avoid Zip bomb errors when reading large files with POI
Convert large XLSX files to CSV with Apache POI
[Rails] I want to add data to Params when transitioning with link_to
I want to use DBViewer with Eclipse 2018-12! !!
I want to test Action Cable with RSpec test
I want to use java8 forEach with index
I want to play with Firestore from Rails
I want to perform aggregation processing with spring-batch
[Rails] I want to load CSS with webpacker
I want to delete files managed by Git
I want to dark mode with the SWT app
I want to monitor a specific file with WatchService
I want to authenticate users to Rails with Devise + OmniAuth
I want to transition screens with kotlin and java!
I want to get along with Map [Java beginner]
I want to redirect sound from Ubuntu with xrdp
I want to set devise_parameter_sanitizer individually when I create two devises
I want to randomly generate information when writing test code
I want to push an app made with Rails 6 to GitHub
I want to make a list with kotlin and java!
I want to make a function with kotlin and java!
When you want to explicitly write OR or AND with ransack
Even in Java, I want to output true with a == 1 && a == 2 && a == 3
I want to easily back up files used at work
I want to manually send an authorization email with Devise
I want to distinct the duplicated data with has_many through
I want to implement various functions with kotlin and java!
docker-compose.yml when you want to keep mysql running with docker
lombok.config when you want to pass @Qualifier to @RequiredArgsConstructor with lombok
I want to pass the startup command to postgres with docker-compose.
[Java] I want to test standard input & standard output with JUnit
I want to convert characters ...
I want to judge the necessity of testing by comparing the difference of class files when refactoring Java
I want to recursively search for files under a specific directory
I want to make a button with a line break with link_to [Note]
I want to connect SONY headphones WH-1000XM4 with LDAC on ubuntu 20.04! !!
I want to convert an array to Active Record Relation with Rails
I want to hook log file generation / open with log4j # FileAppender
When I push to Heroku, I get angry with Precompiling assets failed.
I want to add a browsing function with ruby on rails
I want to return to the previous screen with kotlin and java!
I want to INSERT Spring Local Time with MySQL Time (also milliseconds)
I got an IllegalAccessError when trying to use PowerMock with JUnit
I want to operate cron with GUI, so I will install Dkron
What I was addicted to when implementing google authentication with rails
[Java] I want to perform distinct with the key in the object
I want to perform asynchronous processing and periodic execution with Rail !!!
I want to extract between character strings with a regular expression
I want to notice that I forgot to specify arg when building Docker
Swift: I want to chain arrays
I want to use FormObject well
To avoid errors when starting miChecker
I want to convert InputStream to String
I tried to interact with Java
Download large files with Apache JMeter
I want to docker-compose up Next.js!
I want to display images with REST Controller of Java and Spring!
Problems I was addicted to when building the digdag environment with docker
Rubocop gets angry when I try to fill zeros (fill 0s) with Ruby
I want to select multiple items with a custom layout in Dialog