[JAVA] Compound descriptors: I calculated them with CDK

What is this article?

I want to try various things, but first I need the underlying data. I plan to use RDKit for the time being, but that is a separate story, so this time I used CDK.

■CDK (ver 1.4.6) http://www.rguha.net/code/java/CDKDescUI-1.4.6.jar

I did not really know Java or jar files, so I had a lot of trouble. More precisely, I did not understand the command-line arguments of the CDK tool...

Who is this for?

In the end, it is for me. However, since I got it to the point where each calculation runs with a single command, feel free to use the script as is.

What environment did I try it in?

Windows 8 Pro (x64)
jdk-11.0.2

What I want to do

Calculate descriptors with CDK.

What I did

The steps are listed in chronological order.

1. Get the file

Download it from the URL above. I had heard that CDK descriptors can be calculated through a GUI, so I first searched for "CDK GUI". Version 1.4.8 also came up and would have been fine, but for some reason I ended up with 1.4.6. Since I wanted to use it as a tool, though, I went with CUI (command-line) operation.

2. Call from Java

It seems that you can run a jar file with the following command.

java.exe -jar xxxxxx.jar

I see... well, thank you. Run this way, the GUI version starts up easily. I tried a few calculations with it, but what I want is CUI operation.

java.exe -jar CDKDescUI-1.4.6.jar -h

This at least printed the help, so please refer to it...

・ ・ ・

And this is what I ended up with. For readability I really should thin out lines such as the progress logging, but I wanted the script to be usable by copy and paste.

echo "just start"
echo %time%

echo " --- descriptors ---"
echo %time%
java -jar CDKDescUI-1.4.6.jar -t all          -o descriptors.csv    Compound_000000001_000025000.sdf -b

echo " --- f_estate ---"
echo %time%
java -jar CDKDescUI-1.4.6.jar -f estate       -o f_estate.csv       Compound_000000001_000025000.sdf -b

echo " --- f_extended ---"
echo %time%
java -jar CDKDescUI-1.4.6.jar -f extended     -o f_extended.csv     Compound_000000001_000025000.sdf -b

echo " --- f_graph ---"
echo %time%
java -jar CDKDescUI-1.4.6.jar -f graph        -o f_graph.csv        Compound_000000001_000025000.sdf -b

echo " --- f_standard ---"
echo %time%
java -jar CDKDescUI-1.4.6.jar -f standard     -o f_standard.csv     Compound_000000001_000025000.sdf -b

echo " --- f_pubchem ---"
echo %time%
java -jar CDKDescUI-1.4.6.jar -f pubchem      -o f_pubchem.csv      Compound_000000001_000025000.sdf -b

echo " --- f_substructure ---"
echo %time%
java -jar CDKDescUI-1.4.6.jar -f substructure -o f_substructure.csv Compound_000000001_000025000.sdf -b

echo " --- f_signature ---"
echo %time%
java -jar CDKDescUI-1.4.6.jar -f signature    -o f_signature.csv    Compound_000000001_000025000.sdf -b

echo " --- f_circular ---"
echo %time%
java -jar CDKDescUI-1.4.6.jar -f circular     -o f_circular.csv     Compound_000000001_000025000.sdf -b

echo "finished"
echo %time%
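
The batch script above repeats the same command nine times with only the option and the output name changing. As a rough sketch (not something I ran as part of this article), the same command lines could be generated from a list in Python. The jar name, the input file name, and the -t, -f, -o and -b options are copied from the script above; the function names build_commands and run_all are made up for illustration. Unlike the batch file, run_all stops at the first command that fails.

```python
import subprocess
import time

JAR = "CDKDescUI-1.4.6.jar"
SDF = "Compound_000000001_000025000.sdf"

# Fingerprint types passed to -f in the batch script above.
FINGERPRINTS = [
    "estate", "extended", "graph", "standard", "pubchem",
    "substructure", "signature", "circular",
]


def build_commands(jar=JAR, sdf=SDF):
    """Return (label, argv) pairs mirroring the batch script line by line."""
    jobs = [("descriptors", ["-t", "all", "-o", "descriptors.csv"])]
    for fp in FINGERPRINTS:
        jobs.append((f"f_{fp}", ["-f", fp, "-o", f"f_{fp}.csv"]))
    # Same argument order as the original: options, input file, then -b.
    return [(label, ["java", "-jar", jar, *opts, sdf, "-b"])
            for label, opts in jobs]


def run_all(commands):
    """Run each command in turn, printing a timestamp like `echo %time%`."""
    for label, argv in commands:
        print(f" --- {label} --- {time.strftime('%H:%M:%S')}")
        subprocess.run(argv, check=True)  # assumption: stop at first failure
    print(f"finished {time.strftime('%H:%M:%S')}")


# Dry run: only print the command lines; nothing is executed here.
for label, argv in build_commands():
    print(" ".join(argv))
```

Calling run_all(build_commands()) would actually start the calculations, assuming java is on the PATH and the jar and SDF files are in the current directory.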

Some muttering

The GUI and the CUI do not produce the same output... or am I wrong? To be honest, I was surprised when I calculated the descriptors with both the GUI and the CUI and compared the files. The difference is not a matter of CRLF versus LF: the order of the columns is different... Well, if each column is loaded into a DB under its column name, and the order is consistent within a file, then it still works, but...

Hmm... it is still unsettling.
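
One way to check whether the two files differ only in column order is to compare them by column name instead of by position. The following is a minimal sketch under assumptions I have not verified against the real output: both files have a header row with the same column names, the rows come in the same order, and the values are formatted identically as text. The function names and the sample column names (Title, x, y) are hypothetical.

```python
import csv
import io


def read_by_name(text):
    """Parse CSV text into a list of {column name: value} dicts."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [dict(row) for row in reader]


def same_content(text_a, text_b):
    """True if both CSVs hold the same values per column name, row by row."""
    # Plain dict equality ignores key order, so column order does not matter.
    # Values are compared as strings, so 1.0 and 1.00 count as different.
    return read_by_name(text_a) == read_by_name(text_b)


# Made-up sample data: same values, different column order and line endings.
gui_like = "Title,x,y\r\nmol1,1.0,2.0\r\n"
cui_like = "Title,y,x\nmol1,2.0,1.0\n"
print(same_content(gui_like, cui_like))  # prints True

# For real files (hypothetical names), something like:
# with open("gui.csv", newline="") as f, open("descriptors.csv", newline="") as g:
#     print(same_content(f.read(), g.read()))
```

If this reports a difference on the real files, the next step would be to compare the sets of column names first, since the two outputs may not contain the same columns at all.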

A grumbling aside

I feel that Java becoming a paid product has eliminated its biggest advantage on non-Windows platforms. After all, both .NET and PowerShell apparently run on Linux. From a Windows user's perspective, C# can take over the appeal I once saw in Java. I wonder whether I even need Java... No, if it disappeared, Java-dependent tools such as CDK would stop working, right? Well, it will not disappear.
