Notes on reading and writing Shift-JIS (sjis) files in Python 2. The final result is quite ordinary, but ...
First of all, an ordinary web search turns up examples that use codecs.getreader / codecs.getwriter.
That approach did not work for me. In conclusion, use the string methods encode() and decode().
Read:
for line in open('file.txt', 'rt'):
    linedec = line.decode('cp932')  # decode the cp932 bytes into a unicode string
    ...
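As a minimal sketch of the same read pattern that does not depend on any default encoding, the file can be opened in binary mode and each line decoded explicitly. The helper name read_cp932_lines and the temporary sample file are hypothetical, introduced only for illustration; io.open is used so that the sketch runs on both Python 2.6+ and Python 3.

```python
import io
import os
import tempfile


def read_cp932_lines(path):
    """Yield each line of a cp932 (Shift-JIS) file as decoded text."""
    # Binary mode: the file object hands back raw bytes on every Python version.
    with io.open(path, 'rb') as f:
        for raw in f:
            # Decode explicitly, as in line.decode('cp932') above.
            yield raw.decode('cp932')


# Hypothetical sample file, created only so the sketch can be run as-is.
sample_path = os.path.join(tempfile.mkdtemp(), 'file.txt')
with io.open(sample_path, 'wb') as f:
    # U+65E5 U+672C U+8A9E is the word for "Japanese", followed by a newline.
    f.write(u'\u65e5\u672c\u8a9e\n'.encode('cp932'))

decoded = list(read_cp932_lines(sample_path))
```

Splitting on the newline byte is safe here because 0x0A never appears as the second byte of a Shift-JIS double-byte character.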
As for writing,
str(a_unicode_string)
raises UnicodeEncodeError when the string contains non-ASCII characters (a lovely spec, orz), so when handling objects of unknown type you cannot rely on the default conversion and have to convert them to strings yourself. For some reason, though,
'%s' % obj
works. (I would have expected this code to behave the same as str().)
Also, when I try to write a unicode string to a file that was opened without specifying any character encoding, I get a UnicodeEncodeError. Worse, if the file is opened in text mode, the error occurs at flush time rather than at the write call, so recovering from it is not possible, which is quite a problem. Nor does it seem to treat the file as a UTF-16 binary file.
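One way to keep the failure recoverable is to encode explicitly before writing, so that UnicodeEncodeError is raised at the encode call rather than later. The following is only a sketch under the assumption that replacing unencodable characters with '?' is an acceptable recovery policy; the helper name encode_or_replace is hypothetical.

```python
def encode_or_replace(text, encoding='cp932'):
    """Encode text, falling back to '?' for characters the codec cannot represent."""
    try:
        # Strict encoding raises here, at a point where the caller can still react.
        return text.encode(encoding)
    except UnicodeEncodeError:
        # Assumed recovery policy: substitute '?' instead of aborting the write.
        return text.encode(encoding, 'replace')


ok = encode_or_replace(u'abc')
# U+0E01 (a Thai letter) has no cp932 mapping, so the fallback path is taken.
fallback = encode_or_replace(u'a\u0e01')
```

Other policies (skipping the line, logging it, re-raising) fit in the same except block.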
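So, to write:
f = open('sjis.txt', 'wt')
lineenc = linestr.encode('cp932')  # encode the unicode string to cp932 bytes
print >>f, lineenc
print >>f, ('%s' % some_object).encode('cp932')  # stringify objects of unknown type first
f.close()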
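The write side can be sketched in the same version-neutral way: format each object to text, encode it to cp932, and write only bytes to a file opened in binary mode. The helper name write_cp932_lines and the temporary output path are hypothetical, and the sketch assumes every item can be represented in cp932.

```python
import io
import os
import tempfile


def write_cp932_lines(path, items):
    """Write each item as one cp932-encoded line."""
    # Binary mode: only bytes reach the file, so no implicit encoding happens.
    with io.open(path, 'wb') as f:
        for item in items:
            # u'%s' % (item,) turns any object into text; encode() makes it cp932 bytes.
            f.write((u'%s\n' % (item,)).encode('cp932'))


# Hypothetical output file in a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), 'sjis.txt')
write_cp932_lines(out_path, [u'\u65e5\u672c\u8a9e', 42])

with io.open(out_path, 'rb') as f:
    written = f.read()
```

Unlike print in text mode, binary mode writes '\n' as-is, so line endings are not translated on Windows.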