It is sometimes argued that copy-and-paste code clones are more reliable, but in many cases their presence comes back to hurt us at the last minute.
Imagine that the person in charge has run away, and you stay up late fixing the source code and release it. Then you are told, "Actually, I built it with copy and paste, so please fix all the other copies and test them too," and you have to start over. It is very, very sad.
It is humiliating to modify nearly identical source code over and over, hunting for subtle variations as if playing spot the difference.
To avoid such punishment games at the last minute, the amount of copied code should be monitored at all times, and any unusual increase should be corrected immediately.
This section describes the tools for detecting copy and paste.
PMD-CPD
PMD is a tool, implemented in Java, for detecting potential problems in Java source code. http://pmd.sourceforge.net/snapshot/
It includes the CPD command, which detects duplicate code in the following programming languages:
・ Java
・ JSP
・ C++
・ Ruby
・ Fortran
・ PHP
・ C#
・ PLSQL
・ Ecmascript
Download one of the files from the following page and extract it to any folder. http://sourceforge.net/projects/pmd/files/pmd/
cpd --minimum-tokens 50 --language ecmascript --format text --encoding utf8 --files C:\tool\clonedigger\test\ > result.txt
bin/run.sh cpd --minimum-tokens 35 --format xml --language ruby --files /var/lib/redmine/app/ > result.xml
The first command above is for Windows and the second is for Linux. The parameters are the same on both.
Parameter | Description |
---|---|
--minimum-tokens | Minimum number of matching tokens for a fragment to be reported as a duplicate. |
--format | Output format: text, xml, or csv. XML output can be used in Jenkins. |
--language | Programming language of the source code. |
--files | Directory of the source code to be inspected. Subdirectories are searched recursively. |
--encoding | Character encoding of the source code to be inspected. |
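To make --minimum-tokens more concrete, here is a minimal Python sketch of token-based duplicate detection. It is not how CPD is implemented: the regex tokenizer and the `find_duplicates` helper are assumptions made purely for illustration.

```python
import re
from collections import defaultdict

# Crude tokenizer (an assumption): identifiers, numbers, or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(source):
    """Split source into (token, line_number) pairs."""
    tokens = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            tokens.append((match.group(), line_no))
    return tokens


def find_duplicates(source, minimum_tokens):
    """Map each token window seen more than once to the lines where it starts."""
    tokens = tokenize(source)
    seen = defaultdict(list)
    for i in range(len(tokens) - minimum_tokens + 1):
        window = tuple(tok for tok, _ in tokens[i:i + minimum_tokens])
        seen[window].append(tokens[i][1])
    return {w: lines for w, lines in seen.items() if len(lines) > 1}


# Hypothetical input: lines 1 and 3 are identical and 7 tokens long.
sample = """\
total = price * count + tax
print(total)
total = price * count + tax
"""

for window, lines in find_duplicates(sample, 7).items():
    print(" ".join(window), "starts at lines", lines)
```

With a threshold of 7 the repeated statement is reported; raising it to 8 reports nothing, because no run of 8 tokens repeats. A lower threshold finds more clones but also more noise, which is the trade-off --minimum-tokens controls.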
GUI
You can also use the GUI by running bin/cpdgui.bat.
Clonedigger
Clonedigger is a copy/paste detection tool implemented in Python and Java. http://clonedigger.sourceforge.net/
The supported programming languages are:
・ Python
・ Java
・ Lua
・ Javascript
Analyzing languages other than Python takes a few extra steps, which are described below.
easy_install clonedigger
__If you specify a file__
clonedigger -l python -o ./test.html C:\tool\clonedigger\test\test_utf8.py
__If you specify a folder__
clonedigger -l python -o ./test.html C:\tool\clonedigger\test\
If you specify a folder, its subfolders are also scanned. An HTML report is written to the file specified by -o.
If you use the --cpd-output option as follows, the result is output in XML format. This format is the same as that of PMD's CPD.
clonedigger -l python --cpd-output -o test.xml C:\tool\clonedigger\test\python\
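Because the XML shares its format with CPD, one small script can summarize reports from either tool. The following is a minimal sketch, assuming the report uses `duplication` elements with `lines` and `tokens` attributes and child `file` elements with `path` and `line` attributes. The `read_duplications` helper and the sample report are hypothetical, written by hand for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace prefix such as '{uri}duplication' if present."""
    return tag.rsplit("}", 1)[-1]


def read_duplications(xml_text):
    """Return one dict per <duplication> element in a CPD-style report."""
    root = ET.fromstring(xml_text)
    results = []
    for element in root.iter():
        if local_name(element.tag) != "duplication":
            continue
        locations = [
            (child.get("path"), int(child.get("line")))
            for child in element
            if local_name(child.tag) == "file"
        ]
        results.append({
            "lines": int(element.get("lines") or 0),
            "tokens": int(element.get("tokens") or 0),  # treated as optional
            "locations": locations,
        })
    return results


# Hypothetical report, not real tool output.
SAMPLE = """<pmd-cpd>
  <duplication lines="12" tokens="60">
    <file line="10" path="/src/a.py"/>
    <file line="40" path="/src/b.py"/>
    <codefragment>...</codefragment>
  </duplication>
</pmd-cpd>
"""

for dup in read_duplications(SAMPLE):
    print(dup["lines"], "duplicated lines at", dup["locations"])
```

To use it on a real report, read the file generated with --cpd-output (or with CPD's --format xml) and pass its contents to `read_duplications`.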
When analyzing source code other than Python, clonedigger will not work unless java_antlr exists in the current directory. On Windows with Python 2.7, change to the package directory before running it, as follows.
cd C:\Python27\Lib\site-packages\clonedigger-1.1.0-py2.7.egg\clonedigger
clonedigger -l java --cpd-output -o test.xml C:\tool\clonedigger\test\test.java
When analyzing JavaScript, clonedigger will not work unless js_antlr exists in the current directory. On Windows with Python 2.7, change to the package directory before running it, as follows.
cd C:\Python27\Lib\site-packages\clonedigger-1.1.0-py2.7.egg\clonedigger
clonedigger -l js --cpd-output -o test.xml C:\tool\clonedigger\test\test.js
Browsers will happily run sloppy JavaScript such as the following.
// Extra trailing comma
var questions = [
{message: "Ah ah", category: Category.emotionalExhaustion} ,
];
Clonedigger, however, aborts the analysis with an error when it encounters such invalid code. The error message it displays, shown below, gives a hint about what to fix.
line 14:2 rule arrayItem failed predicate: { input.LA(1) == COMMA }?
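Trailing commas like this can be located before running clonedigger. The following is a minimal sketch using a naive regular expression. The `find_trailing_commas` helper is hypothetical, and because it does not parse JavaScript it will also flag matches inside strings and comments.

```python
import re

# A comma followed only by whitespace and then a closing ] or }.
TRAILING_COMMA = re.compile(r",\s*[\]}]")


def find_trailing_commas(source):
    """Return the 1-based line numbers of commas that precede ] or }."""
    return [
        source.count("\n", 0, match.start()) + 1
        for match in TRAILING_COMMA.finditer(source)
    ]


sample = """var questions = [
  {message: "Ah ah", category: Category.emotionalExhaustion} ,
];
"""

print(find_trailing_commas(sample))  # prints [2]
```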
AIST CCFinderX
AIST CCFinderX, which a reader pointed out in the comments, can be downloaded from the following page. http://www.ccfinder.net/ccfinderxos-j.html
It requires 32-bit Java and exactly Python 2.6 (neither newer nor older versions will work). The Windows binary only works when run as 32-bit, so you need to modify gemx.bat as follows.
gemx.bat
set PATH=C:\Windows\SysWOW64;C:\TracLight\python;C:\TracLight\python\python\Scripts\;%~dp0\scripts
set CCFINDERX_PYTHON_INTERPRETER_PATH=C:\TracLight\python\python.exe
Running gemx.bat opens a GUI that shows the detected clones.
It can also display scatter plots. These GUIs are fascinating.
When running from the command line:
ccfx d java -d c:\dev\java\
The a.ccfxd file produced by this command is a binary file that can be opened with GemX. You can convert it to text format with the following command.
ccfx p a.ccfxd > test.txt
Unusually for a tool of this kind, it supports Visual Basic, and it appears to analyze code written in VB6 and VBA as well. However, it does not recognize .cls files by default, so modify ccfx_prep_scripts.ini as follows.
visualbasic=.vb;.bas;.frm;.cls
・ Processing of source files that contain multibyte characters may fail.
・ As far as I checked, integration with Jenkins does not look easy.
・ The update history page is a dead link, so the tool appears to be no longer maintained. The source code is available, so I would have to fix it myself if needed.
You can monitor how the number of code clones changes over time using the Jenkins Violations plugin. https://wiki.jenkins-ci.org/display/JENKINS/Violations
1. Generate the XML report in the workspace from a Jenkins build script.
2. In the post-build actions of the project settings, specify the XML from step 1 in the cpd field of Report Violations.
Each build will then produce a report.
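The idea of correcting any unusual increase immediately can also be checked outside the plugin. The following is a minimal sketch, assuming you record the number of detected clones for each build yourself. The `flag_increases` helper, the tolerance value, and the history data are all hypothetical.

```python
def flag_increases(counts, tolerance=0):
    """Return (build_index, previous, current) for each build whose
    clone count rose by more than `tolerance` over the previous build."""
    flagged = []
    for index in range(1, len(counts)):
        previous, current = counts[index - 1], counts[index]
        if current - previous > tolerance:
            flagged.append((index, previous, current))
    return flagged


# Hypothetical clone counts from six consecutive builds.
history = [4, 4, 5, 9, 9, 7]

for index, previous, current in flag_increases(history, tolerance=1):
    print(f"build {index}: clones rose from {previous} to {current}")
```

With a tolerance of 1, only the jump from 5 to 9 is flagged. A script like this could exit with a nonzero status to fail the build, though whether to fail or merely warn is a project decision.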