When I output a certain log to an xlsx file with OpenPyXl and open it in Excel 2011 on Mac, the following dialog appears. I did.
After analyzing the problematic part, we found that it could not be opened due to the following two reasons.
Mobile emojis (U + E000-U + F8FF, very loosely) will pass OpenPyXl itself without any problems and an xlsx file will be generated. However, Excel does not seem to assume that these invisible characters will be written as they are in the file, and the above error will occur and character conversion (?) Will be performed. If you read the tsv file or copy and paste it into a cell, it seems that it will be converted by Excel at this stage, and the converted characters will be stored in the cell.
In this work, I'm glad I scraped the things that can't be displayed, so I scraped them below.
invalid_code = re.compile(ur'[\u0000-\u001F\u007f\uE000-\uF8FF]')
invalid_code.sub('', s)
In fact, the one that supports emoji is \ uE000- \ uF8FF
. \ u0000- \ u001F \ u007f
is a control character, but it is included because OpenPyXl does not pass it. (Aside from the problem of what a log contains such characters) Also, s
is a unicode string.
Like pictograms, it seems that Excel considers numbers and the beginning of'=' as an error. (I think that if it is a formula, it should be entered in the cell of the formula) And, by default, OpenPyXl writes to the cell as a number ([Be wary of setting Cell values text starting with "=".](Https:: //bitbucket.org/openpyxl/openpyxl/issues/442)), so it seems that Excel is still throwing an error.
As a countermeasure,
There is, and this time I chose the former.
ws.cell('A1').data_type = 's'
With the above, the generated file can be opened without any error. If there are other patterns, I would like to write them again.