Source code character code check script

Introduction

A long time ago I needed to check the character code (encoding) of our source code, so I wrote a checker in Python and had Jenkins run it regularly. This is my memo of that.

Why is it necessary to check the character code of the source code?

Implementation

mojicodecheck.py


#! /usr/bin/env python
# -*- coding: utf-8 -*-
"""
	@brief Character code check
	@author DM
	@date   2012/10/13
"""

import os, os.path, sys, re, time
import glob
import codecs
import chardet

#Folders to ignore
ignorelist  = ["tgs","tools_src","shader","ext"]

#File extensions to check
extention  = [".c",".cpp",".h",".inc",".inl",".hpp"]

# 	main
def main():
	sys.stdout = codecs.getwriter("shift_jis")(sys.stdout) #output
	sys.stdin = codecs.getreader("shift_jis")(sys.stdin) #input

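	#Usage display
	#(__usage() is not defined in this file, so the message is printed inline)
	if len(sys.argv) < 2 or len(sys.argv) > 4:
		print("usage: python mojicodecheck.py <project_dir>")
		sys.exit(1)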
		
	start_time = time.clock()

	#Project directory
	path = os.path.normpath(sys.argv[1])	#Get the module file path
	print(path)
	
	
	ret = True

	
	for (root, dirs, files) in os.walk(path):
		for file in files:
			file_path = os.path.join(root, file)
			
			base, ext = os.path.splitext(file_path)
			
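			#Skip files under an ignored folder. Compare whole folder names,
			#so that "ext" does not also match e.g. "texture".
			folders = os.path.relpath(root, path).split(os.sep)
			if any(d in folders for d in ignorelist):
				continue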
			
			if ext in extention:
				try:
					#Read raw bytes; chardet guesses the encoding from them
					with open(file_path, 'rb') as f:
						data = f.read()
				except IOError:
					#If open() fails there is no file object to close
					print(file_path, 'cannot be opened.')
					continue
				p = chardet.detect(data)
				if p['encoding'] == 'SHIFT_JIS':
					print('error',p['encoding'],file_path)
					ret = False

	end_time = time.clock()
	print("complete![time: ",(end_time - start_time),"sec]")
	
	if ret == False:
		CODE='S001'
		MESSAGE='shift-jis still exists!!'
		EXPECTED='S999-error'
		ACTUAL='sample ' + CODE +' '+ MESSAGE
		sys.exit(EXPECTED+' '+ACTUAL)
	
# __main__
if __name__ == '__main__':
	#psyco.full()
	main()
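
As a cross-check that needs no third-party package, the same walk can be paired with a strict UTF-8 decode instead of chardet. This is only a minimal sketch of a different technique, not the script above: the names `is_valid_utf8`, `find_non_utf8` and `demo` are made up for illustration, and it assumes the goal is "every source file must be valid UTF-8". Note that pure ASCII files pass this check, and that it reports "not UTF-8" rather than naming Shift-JIS.

```python
import os
import shutil
import tempfile

def is_valid_utf8(file_path):
    """Return True if the file's raw bytes decode cleanly as UTF-8."""
    with open(file_path, 'rb') as f:
        data = f.read()
    try:
        data.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True

def find_non_utf8(root_dir, extensions=('.c', '.cpp', '.h')):
    """Walk root_dir and return the matching files that are not valid UTF-8."""
    bad = []
    for root, dirs, files in os.walk(root_dir):
        for name in files:
            if os.path.splitext(name)[1] in extensions:
                full = os.path.join(root, name)
                if not is_valid_utf8(full):
                    bad.append(full)
    return sorted(bad)

def demo():
    """Create one UTF-8 and one Shift-JIS file in a temp folder and scan it."""
    text = u'// \u65e5\u672c\u8a9e\n'  # a comment containing Japanese text
    tmp = tempfile.mkdtemp()
    try:
        with open(os.path.join(tmp, 'utf8.h'), 'wb') as f:
            f.write(text.encode('utf-8'))
        with open(os.path.join(tmp, 'sjis.h'), 'wb') as f:
            f.write(text.encode('shift_jis'))
        return [os.path.basename(p) for p in find_non_utf8(tmp)]
    finally:
        shutil.rmtree(tmp)

if __name__ == '__main__':
    print(demo())
```

The strict decode gives a yes/no answer, while chardet returns a heuristic guess, so the two can disagree on unusual files.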

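Run mojicodecheck.py from a Jenkins "Execute shell" build step like this:

python mojicodecheck.py ${BUILD_DIR}

The console output then looks like this: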

[MojiCodeCheck] $ /bin/sh -xe /tmp/hudson2121374754708119058.sh
+ python /var/lib/jenkins/workspace/soilproject/common/tools/mojicodecheck.py /var/lib/jenkins/workspace/soilproject
/var/lib/jenkins/workspace/soilproject
('error', 'SHIFT_JIS', '/var/lib/jenkins/workspace/soilproject/common/tgl/src/TGLSystemTypes.h')
('error', 'SHIFT_JIS', '/var/lib/jenkins/workspace/soilproject/common/tgl/src/Effect/TGLEffectEmit.h')
('error', 'SHIFT_JIS', '/var/lib/jenkins/workspace/soilproject/common/tgl/src/Effect/Program/EPrgZanplu.h')
('error', 'SHIFT_JIS', '/var/lib/jenkins/workspace/soilproject/common/tgl/src/Effect/Program/EPrgSpiral.cpp')
('complete![time: ', 15.200000000000001, 'sec]')
S999-error sample S001 shift-jis still exists!!
Build step 'Execute shell' marked build as failure
Notifying upstream projects of job completion
Finished: FAILURE

Files that are still in Shift-JIS are written to the log like this, and the build fails.
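
The build fails because the script ends with `sys.exit()` given a string: Python writes the string to stderr and exits with status 1, and the "Execute shell" step treats a non-zero status as a failure. The sketch below shows that behavior in a child interpreter; `run_and_get_status` is a hypothetical helper written for this memo, not part of the script.

```python
import subprocess
import sys

def run_and_get_status(exit_arg_source):
    """Run sys.exit(<exit_arg_source>) in a child Python; return (status, stderr text)."""
    code = "import sys; sys.exit(%s)" % exit_arg_source
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return proc.returncode, err.decode("ascii").strip()

if __name__ == "__main__":
    # A string argument: the message goes to stderr and the status is 1
    print(run_and_get_status("'S999-error sample'"))
    # An integer argument is used as the exit status itself
    print(run_and_get_status("0"))
```

This is also why falling off the end of `main()` without calling `sys.exit()` lets the Jenkins step succeed.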

That's it.
