build.gradle
dependencies {
    compile group: 'org.apache.poi', name: 'poi', version: '3.15'
    compile group: 'org.apache.poi', name: 'poi-ooxml', version: '3.15'
}
I want to read an excel file in Apache POI and get the result of the formula in the cell.
The code is below.
SampleApachePoiUtil.java
public class SampleApachePoiUtil {
    /**
     *Calculates the formula in the cell and returns the result as a String.
     * @param formulaCell Cell containing the formula
     * @return Formula result (String))
     */
    public static String getStringFormulaValue(Cell formulaCell) {
        Workbook book = formulaCell.getSheet().getWorkbook();
        CreationHelper helper = book.getCreationHelper();
        FormulaEvaluator evaluator = helper.createFormulaEvaluator();
        CellValue value = evaluator.evaluate(formulaCell);
        //[Caution] Assuming that it is a String
        return value.getStringValue();
    }
    public static void sample(File file) throws IOException, InvalidFormatException {
        Workbook workbook = WorkbookFactory.create(file);
        Sheet sheet = workbook.getSheet("main");
        //Cell containing the formula("main"A1 cell of the sheet)
        Cell formulaCell = sheet.getRow(0).getCell(0);
        //Calculate formula
        String result = getStringFormulaValue(formulaCell);
        System.out.println("result = " + result);
        workbook.close();
    }
}
When I tried to get the result of = Ai! A1, a formula that refers to the cell of the sheet" Ai ", I got a FormulaParseException.

Exception in thread "main" org.apache.poi.ss.formula.FormulaParseException: Specified named range 'Ah' does not exist in the current workbook.
	at org.apache.poi.ss.formula.FormulaParser.parseNonRange(FormulaParser.java:898)
	at org.apache.poi.ss.formula.FormulaParser.parseRangeable(FormulaParser.java:490)
	at org.apache.poi.ss.formula.FormulaParser.parseRangeExpression(FormulaParser.java:311)
	at org.apache.poi.ss.formula.FormulaParser.parseSimpleFactor(FormulaParser.java:1509)
	at org.apache.poi.ss.formula.FormulaParser.percentFactor(FormulaParser.java:1467)
	at org.apache.poi.ss.formula.FormulaParser.powerFactor(FormulaParser.java:1454)
	at org.apache.poi.ss.formula.FormulaParser.Term(FormulaParser.java:1827)
	at org.apache.poi.ss.formula.FormulaParser.additiveExpression(FormulaParser.java:1955)
	at org.apache.poi.ss.formula.FormulaParser.concatExpression(FormulaParser.java:1939)
	at org.apache.poi.ss.formula.FormulaParser.comparisonExpression(FormulaParser.java:1896)
	at org.apache.poi.ss.formula.FormulaParser.intersectionExpression(FormulaParser.java:1869)
	at org.apache.poi.ss.formula.FormulaParser.unionExpression(FormulaParser.java:1849)
	at org.apache.poi.ss.formula.FormulaParser.parse(FormulaParser.java:1997)
	at org.apache.poi.ss.formula.FormulaParser.parse(FormulaParser.java:170)
	at org.apache.poi.xssf.usermodel.XSSFEvaluationWorkbook.getFormulaTokens(XSSFEvaluationWorkbook.java:85)
	at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateAny(WorkbookEvaluator.java:315)
	at org.apache.poi.ss.formula.WorkbookEvaluator.evaluate(WorkbookEvaluator.java:259)
	at org.apache.poi.xssf.usermodel.BaseXSSFFormulaEvaluator.evaluateFormulaCellValue(BaseXSSFFormulaEvaluator.java:65)
	at org.apache.poi.ss.formula.BaseFormulaEvaluator.evaluate(BaseFormulaEvaluator.java:101)
	at SampleApachePoiUtil.getStringFormulaValue(SampleApachePoiUtil.java:21)
	at SampleApachePoiUtil.sample(SampleApachePoiUtil.java:33)
	at Main.main(Main.java:7)
I was told that the cell range named "A" does not exist. Apparently, the sheet name could not be parsed correctly and it was judged as "cell name" for some reason.
I changed the name of the referenced sheet and the extension of the file and checked if an error occurred.
| extension | Sheet name referenced by the formula | result | 
|---|---|---|
| xlsx | Ah | NG | 
| xlsx | Ai | OK | 
| xlsx | ・ | NG | 
| xlsx | Oh 〒 | NG | 
| xls | Ah | OK | 
Special characters such as "・" and "〒", not caused by non-ASCII characters? Seems to be the cause.
The following is a stack trace when the sheet name is "・". The message was different from when the sheet name was "Ai".
Exception in thread "main" org.apache.poi.ss.formula.FormulaParseException: Parse error near char 0 '・' in specified formula '・!A1'. Expected cell ref or constant literal
	at org.apache.poi.ss.formula.FormulaParser.expected(FormulaParser.java:262)
	at org.apache.poi.ss.formula.FormulaParser.parseSimpleFactor(FormulaParser.java:1514)
	at org.apache.poi.ss.formula.FormulaParser.percentFactor(FormulaParser.java:1467)
	at org.apache.poi.ss.formula.FormulaParser.powerFactor(FormulaParser.java:1454)
	at org.apache.poi.ss.formula.FormulaParser.Term(FormulaParser.java:1827)
	at org.apache.poi.ss.formula.FormulaParser.additiveExpression(FormulaParser.java:1955)
	at org.apache.poi.ss.formula.FormulaParser.concatExpression(FormulaParser.java:1939)
	at org.apache.poi.ss.formula.FormulaParser.comparisonExpression(FormulaParser.java:1896)
	at org.apache.poi.ss.formula.FormulaParser.intersectionExpression(FormulaParser.java:1869)
	at org.apache.poi.ss.formula.FormulaParser.unionExpression(FormulaParser.java:1849)
	at org.apache.poi.ss.formula.FormulaParser.parse(FormulaParser.java:1997)
	at org.apache.poi.ss.formula.FormulaParser.parse(FormulaParser.java:170)
	at org.apache.poi.xssf.usermodel.XSSFEvaluationWorkbook.getFormulaTokens(XSSFEvaluationWorkbook.java:85)
	at org.apache.poi.ss.formula.WorkbookEvaluator.evaluateAny(WorkbookEvaluator.java:315)
	at org.apache.poi.ss.formula.WorkbookEvaluator.evaluate(WorkbookEvaluator.java:259)
	at org.apache.poi.xssf.usermodel.BaseXSSFFormulaEvaluator.evaluateFormulaCellValue(BaseXSSFFormulaEvaluator.java:65)
	at org.apache.poi.ss.formula.BaseFormulaEvaluator.evaluate(BaseFormulaEvaluator.java:101)
	at SampleApachePoiUtil.getStringFormulaValue(SampleApachePoiUtil.java:21)
	at SampleApachePoiUtil.sample(SampleApachePoiUtil.java:33)
	at Main.main(Main.java:7)
In debugging, I checked the contents of formulaCell._cell.
<xml-fragment r="A1" s="1" t="str" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2" xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3" xmlns:main="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <main:f>Ah!A1</main:f>
  <main:v>hoge</main:v>
</xml-fragment>
The formula was stored in <main: f>, and the result of the formula was stored in <main: v>.
The stored values are as expected, so reading the Excel file seems to be fine.

<main: v> is not the value calculated by Apache POI, but the value stored in the Excel file. The calculation result seems to be cached.The Excel file format (both .xls and .xlsx) stores a "cached" result for every formula along with the formula itself.
https://poi.apache.org/spreadsheet/eval.html Quote
12539 in decimalhttps://www.fileformat.info/info/unicode/char/30fb/index.htm https://ja.wikipedia.org/wiki/%E4%B8%AD%E9%BB%92
When I raised the Apache POI version from 3.15 to 3.16 (strictly 3.16-beta2), the error disappeared and it worked as expected.
Apache POI 3.16-beta2 change history
Below are related bug reports. Here, it seems that an error occurred when executing the shiftRows method for the sheet name including" ・ ".
Partial support for unicode sheet names
This is the modified source difference. https://svn.apache.org/viewvc/poi/trunk/src/java/org/apache/poi/ss/formula/FormulaParser.java?r1=1778418&r2=1778417&pathrev=1778418
The ʻisUnquotedSheetNameChar method of FormulaParser.java` was the direct cause.
The sheet name included in the formula is obtained by the parseSheetName method of ʻorg.apache.poi.ss.formula.FormulaParser.java`.
java:poi-3.15-source.jar\org.apache.poi.ss.formula.FormulaParser.java
private SheetIdentifier parseSheetName() {
//...
        // unquoted sheet names must start with underscore or a letter
        if (look =='_' || Character.isLetter(look)) {
            StringBuilder sb = new StringBuilder();
            // can concatenate idens with dots
            while (isUnquotedSheetNameChar(look)) {
                sb.append(look);
                GetChar();
            }
            NameIdentifier iden = new NameIdentifier(sb.toString(), false);
            SkipWhite();
            if (look == '!') {
                GetChar();
                return new SheetIdentifier(bookName, iden);
            }
            // See if it's a multi-sheet range, eg Sheet1:Sheet3!A1
            if (look == ':') {
                return parseSheetRange(bookName, iden);
            }
            return null;
        }
//...
This is the definition of the ʻisUnquotedSheetNameChar method called by the parseSheetName` method.
java:poi-3.15-source.jar\org.apache.poi.ss.formula.FormulaParser.java
    /**
     * very similar to {@link SheetNameFormatter#isSpecialChar(char)}
     */
    private static boolean isUnquotedSheetNameChar(char ch) {
        if(Character.isLetterOrDigit(ch)) {
            return true;
        }
        switch(ch) {
            case '.': // dot is OK
            case '_': // underscore is OK
                return true;
        }
        return false;
    }
The return value of the ʻisUnquotedSheetNameChar` method is as follows.
| letter | Unicode Category | return value of isUnquotedSheetNameChar | 
|---|---|---|
| Ah | Letter, Other [Lo] | true | 
| ・ | Punctuation, Other [Po] | false | 
| I | Letter, Other [Lo] | true | 
When passing ・ to the ʻisUnquotedSheetNameChar` method, false is returned, so the sheet name could not be obtained correctly and an error occurred.
java:poi-3.16-beta2-source.jar\org.apache.poi.ss.formula.FormulaParser.java
    /**
     * very similar to {@link SheetNameFormatter#isSpecialChar(char)}
     * @param ch unicode codepoint
     */
    private static boolean isUnquotedSheetNameChar(int ch) {
        if(Character.isLetterOrDigit(ch)) {
            return true;
        }
        // the sheet naming rules are vague on whether unicode characters are allowed
        // assume they're allowed.
        if (ch > 128) {
            return true;
        }
        switch(ch) {
            case '.': // dot is OK
            case '_': // underscore is OK
                return true;
        }
        return false;
    }
There are two changes:
"・" Now returns true.
FormulaParseException
--XLSX format
--Refers to a cell in a sheet that contains characters such as "・" in the sheet name.Character.isLetterOrDigit method returns falsehttps://github.com/apache/poi/blob/trunk/src/java/org/apache/poi/ss/formula/FormulaParser.java Quote
 This class parses a formula string into a List of tokens in RPN order.
 Inspired by
           Lets Build a Compiler, by Jack Crenshaw
 BNF for the formula expression is :
 <expression> ::= <term> [<addop> <term>]*
 <term> ::= <factor>  [ <mulop> <factor> ]*
 <factor> ::= <number> | (<expression>) | <cellRef> | <function>
 <function> ::= <functionName> ([expression [, expression]*])
 
 For POI internal use only
Let's Build a Compiler, by Jack Crenshaw
It has nothing to do with Apache POI. The category of "・" is "Punctuation, Other [Po]", but before Unicode 4.1, it was "Punctuation, Connector [Pc]". http://www.unicode.org/reports/tr44/tr44-4.html#Change_History
In this connection, it seems that there was a problem that "In Java 6," ・ "was used in the method name, but it can no longer be used in Java 7." That's horrible. .. ..
https://www.hos.co.jp/blog/20111004/
Using the debugger, I found the following.
parseSheetName method of FormulaParser does not passThe JavaDoc of WorkbookEvaluator says" Keep a cache of calculation results ".
For performance reasons, this class keeps a cache of all previously calculated intermediate cell values.
https://poi.apache.org/apidocs/org/apache/poi/ss/formula/WorkbookEvaluator.html Quote
Didn't you get an error because you are referring to the calculation result cache?
https://svn.apache.org/viewvc/poi/trunk/src/java/org/apache/poi/ss/formula/FormulaParser.java?r1=1778418&r2=1778417&pathrev=1778418
Recommended Posts