About handling strings that are placed in JSON communication

Handling Japanese in JSON communication

Under the JSON specification, Japanese text does not need to be Unicode-escaped. It is sometimes said that leaving it unescaped is bad from a confidentiality standpoint, but escaping offers little protection because the text can be read as soon as it is decoded. Even so, security issues such as XSS are said to be fine because control strings can be escaped with the \ mark (→ What does that mean?)

↓ This is what it means: cross-site scripting (XSS)

If data enclosed in a script tag is registered, the browser may interpret it as a real script tag and execute a malicious script. So control strings need to be escaped. Typical consequences of XSS include:

- Session hijacking
- An input form tag is embedded and personal information is stolen
- Fake information is displayed on the web page
- Operations are forced on the web page
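
As a rough illustration of why escaping matters, the sketch below embeds a value into an HTML fragment with and without escaping. The `user_input` value and the `<p>` wrapper are made-up examples, and `html.escape` from the Python standard library is just one way to do the escaping, assuming the data ends up in HTML text.

```python
import html

# Hypothetical user-submitted value containing a script tag.
user_input = '<script>alert("xss")</script>'

# Naive embedding: a browser would parse this as a real script element.
unsafe_page = "<p>" + user_input + "</p>"

# html.escape replaces &, <, > (and quotes by default) with HTML entities,
# so a browser displays the text instead of executing it.
safe_page = "<p>" + html.escape(user_input) + "</p>"

print(unsafe_page)
print(safe_page)
```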

The Japanese Unicode-escape problem in Python

json standard library

Japanese text can be output as-is with json.dumps(str, ensure_ascii=False).

> If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is.
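
A small sketch of the difference between the two settings. The `data` dictionary is an arbitrary example, not something from a real API.

```python
import json

# Arbitrary example payload containing Japanese text.
data = {"name": "日本語"}

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(data)

# ensure_ascii=False: non-ASCII characters are written as-is.
raw = json.dumps(data, ensure_ascii=False)

print(escaped)  # {"name": "\u65e5\u672c\u8a9e"}
print(raw)      # {"name": "日本語"}

# Both forms decode back to the same value, so escaping hides nothing.
assert json.loads(escaped) == json.loads(raw) == data
```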

However, if the above implementation is left as it is, control strings such as script tags are not escaped, which creates a vulnerability.
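
To make this concrete, the sketch below shows that json.dumps passes angle brackets through unchanged, and then one possible workaround. `dumps_for_html` is a hypothetical helper name, and replacing `<`, `>` and `&` with `\uXXXX` escapes is an assumption about where the JSON ends up (inside an HTML page), not something the json module does for you.

```python
import json

# Arbitrary example payload that tries to close a surrounding script element.
payload = {"comment": "</script><script>alert(1)</script>"}

dumped = json.dumps(payload, ensure_ascii=False)
print(dumped)  # the angle brackets are still present


def dumps_for_html(obj):
    # Hypothetical helper: JSON-encode, then rewrite HTML-significant
    # characters as JSON \uXXXX escapes. In JSON text these characters can
    # only occur inside strings, so the decoded value is unchanged.
    text = json.dumps(obj, ensure_ascii=False)
    return (
        text.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


safe = dumps_for_html(payload)
print(safe)

# The escaped form still decodes to the original data.
assert json.loads(safe) == payload
```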

Characters that MUST be escaped under the JSON specification (control character codes)

... the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).

No. Character
1. " (double quotation mark)
2. \ (backslash)
3. NULl
4. Start Of Heading
5. Start of TeXt (text start)
6. End of TeXt
7. End Of Transmission
8. ENQuiry (Inquiry)
9. ACKnowledge (acknowledgement)
10. BELl
11. Back Space
12. Horizontal Tabulation
13. Line Feed
14. Vertical Tabulation
15. Form Feed (page break)
16. Carriage Return
17. Shift Out
18. Shift In
19. Data Link Escape (Transmission control extension)
20. Device Control 1
21. Device Control 2
22. Device Control 3
23. Device Control 4
24. Negative AcKnowledge
25. SYNchronous idle (synchronous signal)
26. End of Transmission Block
27. CANcel
28. End of Medium
29. SUBstitute
30. ESCape (extended)
31. File Separator
32. Group Separator
33. Record Separator
34. Unit Separator
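
As a quick check, the sketch below runs the quotation mark, the backslash and every control character from U+0000 through U+001F through json.dumps with ensure_ascii=False. On CPython's standard json module these all come out as escape sequences; `must_escape` is just a name chosen for this example.

```python
import json

# Quotation mark, reverse solidus, and every control character U+0000..U+001F.
must_escape = ['"', "\\"] + [chr(cp) for cp in range(0x20)]

for ch in must_escape:
    encoded = json.dumps(ch, ensure_ascii=False)
    # Strip the surrounding quotes; what remains should be an escape sequence.
    body = encoded[1:-1]
    assert body.startswith("\\") and body != ch, repr(ch)
    # The escape decodes back to the original character.
    assert json.loads(encoded) == ch

print(json.dumps("a\tb\n\x00", ensure_ascii=False))  # "a\tb\n\u0000"
```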
