XSLT is the mainstream mechanism for transforming XML documents. Around 2009, however, Anakia, which uses Apache (Jakarta) Velocity as its template engine, was generating a lot of excitement, and I still use it myself for my blog (which I have not updated recently).

Velocity itself is still under development and is known as a template engine for Java-based web applications. Unfortunately, Anakia is no longer developed and is not included in the latest Velocity Engine 2.0 release. Tools built on a predefined schema such as DocBook are still in use, but mechanisms for converting an arbitrary home-grown XML format, Anakia or otherwise, never became widespread, and lightweight formats such as Markdown have taken over instead, I think.

XSLT is Turing complete and has a strong functional-language flavor, so processing is expressed through recursion, and because it follows XML syntax, the tags make it more verbose than a general-purpose programming language. As a result, some parts are more cumbersome than convenient.

XSLT and XSL-FO are great technologies, but I also think that the preconception that they must be used limits the use of XML.

Recently I started writing articles in Markdown (md) and Asciidoc (adoc), with Hugo as the back end for building a static website. Both formats are simple, but I personally have the following complaints.

[Md format] When you want to show a code fragment or keyboard input in a document, you can highlight it with markup (such as enclosing it in '+' or '*'), but both end up with the same representation. There is no way to distinguish them by context.
[Adoc format] Unlike md, it offers a rich set of expressions that cover many situations, but the notation is not intuitive.
[Common to md and adoc] Validation is limited to whether the file can be parsed, and because the parsers are so tolerant, input that you would expect to be an error is simply rendered as-is.
[Common to md and adoc] There is little room to intervene in the conversion.

Since these formats serve a different purpose in the first place, it is unreasonable to complain that they cannot do what XML does. Still, when writing technical documents, I want to mark up keyboard input and screen output clearly and separately.

The adoc format is convenient in this respect because you can add class and id attributes freely. If you keep the writing style consistent, CSS can take care of the rest. However, I still have doubts about its readability and about whether the built-in features are really convenient, and I feel that you end up writing with the screen output in mind rather than the pure context.

Also, the adoc format feels hard to impose on others. It is much better than the roff format, but even if it is an option for my own use, when I want to unify the format among an unspecified number of people, its validation features seem too weak or too hard to use.

Therefore, I tried placing the HTML files output by Anakia in part of Hugo's content/ hierarchy. The schema is defined in RELAX NG and validated with Jing.

I know there are people who think Markdown is enough and that simple is better. Personally, I hold to the belief that document data and output rendering should be kept separate, so I suppose I am strongly influenced by XML(-like) fundamentalism.

That was the introduction. I wanted to reuse documents I had written in the past in my own XML format for Anakia, so I have summarized the issues I have noticed so far.

Environment

Ubuntu 18.04
Java 8 (1.8.0_191)

The downloaded files are located in ~/Downloads/. Each command is run from the top-level working directory. Commands such as $ cd tools are repeated below; skip them if you are already in that directory.

Obtaining and setting up Anakia

There is a link to anakia-1.0 at the bottom of the download page on the official Velocity site, but I do not use it.

As described later, I get velocity-1.7 from the release page of the official site and use it instead of anakia-1.0. I believe the only difference is the package name of AnakiaTask, so if you are following the anakia-1.0 documentation, take care when defining the Ant task.

$ mkdir tools
$ cd tools
$ tar xvzf ~/Downloads/velocity-1.7.tar.gz
$ ln -s velocity-1.7 velocity

Tools other than Anakia

Since RELAX NG and Jing are used to validate the XML files, install the tools for that as well.

Jing's official site is on relaxng.org, but the source code is distributed on GitHub ([relaxng/jing-trang](https://github.com/relaxng/jing-trang)).

The JAR files required for execution come from the 20091111 release on the download page of the official site.

$ cd tools   ## Move to the same directory as anakia
$ unzip ~/Downloads/jing-20091111.zip
$ ln -s jing-20091111 jing

Since Anakia runs as an Ant task, get the latest release of the 1.9 series from the Ant download site.

$ cd tools  ## Move to the same directory as anakia and jing
$ tar xvzf ~/Downloads/apache-ant-1.9.13-bin.tar.gz 
$ ln -s apache-ant-1.9.13 apache-ant

So far, the tools directory looks like this:

$ ls -F
apache-ant@  apache-ant-1.9.13/  jing@  jing-20091111/   velocity@  velocity-1.7/

Preparation of execution environment

Create a bin directory alongside the tools directory to hold the files needed to run the commands.

$ mkdir bin   ##Create it in the same location as the tools directory.

First, prepare the bin/envrc file.

`bin/envrc`


## Please change the following line for your correct JDK location.
JAVA_HOME=/opt/jdk1.8.0_191
export JAVA_HOME

WD="$(pwd)"
## envrc is sourced, so $0 refers to the calling script (bin/run-anakia.sh).
SCRIPTFILE="$(readlink -f "$0")"
BASEDIR="$(dirname "$SCRIPTFILE")"
TOPDIR="${BASEDIR}/.."
export WD SCRIPTFILE BASEDIR TOPDIR

TOOLDIR="${TOPDIR}/tools"
ANT_HOME=${TOOLDIR}/apache-ant
VELOCITY_HOME=${TOOLDIR}/velocity
JING_HOME=${TOOLDIR}/jing

export ANT_HOME VELOCITY_HOME JING_HOME
export PATH=${ANT_HOME}/bin:${PATH}

CP_ANT=$(find ${ANT_HOME}/. -name '*.jar' | tr '\n' ':')
CP_VELOCITY=$(find ${VELOCITY_HOME}/. -name '*.jar' | tr '\n' ':')
CP_JING=$(find ${JING_HOME}/. -name '*.jar' | tr '\n' ':')
export CLASSPATH=${CLASSPATH}:${CP_VELOCITY}:${CP_ANT}:${CP_JING}
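
The `find ... | tr '\n' ':'` pattern above leaves a trailing colon on each list, and an empty CLASSPATH adds a leading one; Java may treat such empty entries as the current directory. If that matters in your setup, a variant along the following lines avoids them. This is only a sketch: `jar_classpath` and `CP_ALL` are names made up for this example and are not part of the original envrc.

```shell
# jar_classpath DIR: print the JAR files under DIR as one colon-separated
# string with no leading or trailing colon (hypothetical helper).
# -H makes find follow DIR itself when it is a symlink such as tools/velocity.
jar_classpath() {
  find -H "$1" -name '*.jar' | sort | paste -sd: -
}

# Possible use in bin/envrc (commented out so the sketch runs on its own):
# CP_ALL="$(jar_classpath "$VELOCITY_HOME"):$(jar_classpath "$ANT_HOME"):$(jar_classpath "$JING_HOME")"
# export CLASSPATH="${CLASSPATH:+${CLASSPATH}:}${CP_ALL}"
```

The `${CLASSPATH:+...}` expansion only adds the separating colon when CLASSPATH was already set.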

Next, prepare the XML file in bin that describes the Ant tasks.

`bin/run-anakia.xml`


<project name="build-site" default="doc" basedir=".">
  <property environment="env" />
  <!-- Please change the following property variables -->
  <property name="docs.infilepattern" value="20*.xml" />
  <property name="docs.basedir" value="${env.WD}" />
  <property name="docs.destdir" value="${env.WD}" />
  <property name="docs.vslfilename" value="blog.vsl"/>
  <property name="docs.projfilename" value="project.xml"/>
  <property name="docs.propfilepath" value="${env.BASEDIR}/velocity.properties"/>
  
  <taskdef name="jing" classname="com.thaiopensource.relaxng.util.JingTask"/>
  
  <target name="validate_relaxng">
    <jing rngfile="${docs.basedir}/blog.rng">
      <fileset dir="${docs.basedir}" includes="${docs.infilepattern}"/>
    </jing>
  </target>
  
  <target name="doc" depends="validate_relaxng">
    <taskdef name="anakia"
	     classname="org.apache.velocity.anakia.AnakiaTask"/>
    <anakia basedir="${docs.basedir}"
	    includes="${docs.infilepattern}"
	    destdir="${docs.destdir}"
	    extension=".html"
	    style="${docs.vslfilename}"
	    projectFile="${docs.projfilename}"
	    velocityPropertiesFile="${docs.propfilepath}"
	    lastModifiedCheck="true" >
    </anakia>
  </target>
  
</project>

Prepare bin/run-anakia.sh as a wrapper script to run these steps.

`bin/run-anakia.sh`


#!/bin/bash

SCRIPTFILE="$(readlink -f "$0")"
BASEDIR="$(dirname "$SCRIPTFILE")"
. "${BASEDIR}/envrc"

## main ##
ant -f "${BASEDIR}/run-anakia.xml" "$@"

Running this script performs validation with Jing followed by HTML generation in one go, but you can also run only the validation task with `$ bin/run-anakia.sh validate_relaxng`.
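
For a quick check of a single file it can be handy to call Jing without going through Ant. The following is a minimal sketch, not part of the setup above. It assumes the JAR from the jing-20091111 archive sits at tools/jing/bin/jing.jar (please check your extracted directory), and `jing_validate`, `JING_JAR` and `JAVA_CMD` are names made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: validate XML files against a RELAX NG schema by
# calling Jing directly. JING_JAR is an assumed location; adjust as needed.
JING_JAR="${JING_JAR:-tools/jing/bin/jing.jar}"
JAVA_CMD="${JAVA_CMD:-java}"

# jing_validate SCHEMA.rng FILE.xml...
jing_validate() {
  if [ "$#" -lt 2 ]; then
    echo "usage: jing_validate SCHEMA.rng FILE.xml..." >&2
    return 2
  fi
  local schema="$1"
  shift
  # Jing takes the schema first, followed by the documents to validate.
  "$JAVA_CMD" -jar "$JING_JAR" "$schema" "$@"
}
```

From the top working directory this would be used as `jing_validate target/blog.rng target/20190114.xml`, where `target` stands for your own article directory.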

Next, prepare velocity.properties. The default encoding is Latin-1 (ISO-8859-1), so this file is required in order to use UTF-8.

`bin/velocity.properties`


input.encoding=UTF-8
output.encoding=UTF-8

In the end, the following files are in place.

$ cd bin
$ ls
envrc  run-anakia.sh  run-anakia.xml  velocity.properties

Article preparation

Then go to the target directory and prepare the three files.

${target_dir}/project.xml
${target_dir}/blog.vsl
${target_dir}/blog.rng

The first, project.xml, is almost empty because I do not make any particular use of it.

`${target_dir}/project.xml`


<?xml version="1.0" encoding="UTF-8"?>
<project name="Anakia"
         href="http://velocity.apache.org/anakia">
</project>

The RELAX NG schema and VSL template I actually use are long, so as a sample I show an RNG file taken from the OASIS tutorial, together with a VSL template and an article XML file written to match it.

`${target_dir}/blog.vsl`


<html>
#set($cards = $xpath.applyTo("/card", $root))
#foreach($card in $cards)
<ul>
#foreach($c in $card.getContent())
#if($c.name == "name")
  <li>Name: $c.getValue()</li>
#elseif($c.name == "email")
  <li>EMail: $c.getValue()</li>
#end
#end
</ul>
#end
</html>

`${target_dir}/blog.rng`


<?xml version="1.0" encoding="UTF-8" ?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="addressBook">
      <zeroOrMore>
        <element name="card">
          <ref name="cardContent"/>
        </element>
      </zeroOrMore>
    </element>
  </start>
  <define name="cardContent">
    <element name="name">
      <text/>
    </element>
    <element name="email">
      <text/>
    </element>
  </define>
</grammar>

Document generation

After placing the configuration files, put a suitable document in the same directory.

`${target_dir}/20190114.xml`


<addressBook>
  <card>
    <name>John Smith</name>
    <email>[email protected]</email>
  </card>
  <card>
    <name>Fred Bloggs</name>
    <email>[email protected]</email>
  </card>
</addressBook>

If you launch the script in this directory, you will get the following results:

$ cd ${target_dir}
$ ls -F
20190114.xml  blog.rng*  blog.vsl   project.xml  velocity.log
$ ../bin/run-anakia.sh
...
$ cat 20190114.html 
<html>
<ul>
  <li>Name: John Smith</li>
  <li>EMail: [email protected]</li>
</ul>
<ul>
  <li>Name: Fred Bloggs</li>
  <li>EMail: [email protected]</li>
</ul>
</html>

Challenges in working with Hugo

Hugo itself has no mechanism for plugging in a custom preprocessor, so the HTML files must be written under content/ as a preprocessing step separate from running hugo.

Therefore, keep the source files written in XML in a directory outside Hugo's control, and use the Anakia task to write the HTML files to the desired location under content/.
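
The two-step flow could be wrapped roughly as follows. This is a sketch under several assumptions: the XML source directory sits next to bin/ (so that ../bin/run-anakia.sh resolves), the Hugo site path is absolute, and the names `run`, `build_site` and `DRY_RUN` are invented for this example. It relies only on the fact that run-anakia.sh passes its arguments on to ant, so `-Ddocs.destdir=...` overrides the property defined in run-anakia.xml.

```shell
#!/bin/bash
# Hypothetical wrapper: regenerate HTML with Anakia, then build the Hugo site.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# build_site SRC_DIR HUGO_SITE_DIR SECTION
#   SRC_DIR        directory with the XML sources, blog.vsl, blog.rng, project.xml
#   HUGO_SITE_DIR  absolute path of the Hugo site
#   SECTION        sub-directory of content/ that receives the HTML files
build_site() {
  local src="$1" site="$2" section="$3"
  local dest="${site}/content/${section}"
  run mkdir -p "$dest" || return 1
  # Run Anakia from the source directory, as in the earlier steps.
  ( cd "$src" && run ../bin/run-anakia.sh "-Ddocs.destdir=${dest}" ) || return 1
  run hugo --source "$site"
}
```

Calling `DRY_RUN=1 build_site "$PWD/articles" "$HOME/site" posts` (both paths are placeholders) prints the three commands without touching anything, which is a convenient way to check the paths first.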

Since front matter is required when the HTML file is placed there, I think it is best to emit it, including the title, from the VSL file.

Velocity itself has no good way to find out a file's modification time, so adding date to the front matter seems to need some ingenuity, such as passing in, as a custom context, an XML file that lists each file name together with its creation date as a string.
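
One possible way to produce such a file is sketched below. It is only an illustration: the element and attribute names (`files`, `file`, `name`, `date`) and the function name `write_file_dates` are invented here, it records the modification time rather than the creation time, it assumes GNU date as found on Ubuntu, and it does not escape XML special characters in file names. How the resulting file is handed to Anakia as a custom context is left to the Anakia documentation.

```shell
#!/bin/bash
# Hypothetical helper: list each source file with its last-modified time
# (UTC, RFC 3339 style) as a small XML document on stdout.
# Assumes GNU date, where "date -r FILE" prints FILE's modification time.

# write_file_dates DIR PATTERN
write_file_dates() {
  local dir="$1" pattern="$2" f
  printf '<?xml version="1.0" encoding="UTF-8"?>\n<files>\n'
  # $pattern is left unquoted on purpose so that the shell expands the glob.
  for f in "$dir"/$pattern; do
    [ -e "$f" ] || continue   # the pattern matched nothing
    printf '  <file name="%s" date="%s"/>\n' \
      "$(basename "$f")" "$(date -u -r "$f" +%Y-%m-%dT%H:%M:%SZ)"
  done
  printf '</files>\n'
}
```

For example, `write_file_dates . '20*.xml' > filedates.xml` run in the article directory would cover the same files as docs.infilepattern; `filedates.xml` is just a placeholder name.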

Problems that occur when using anakia-1.0

If you use the distributed anakia-1.0, it may not behave as described in the Anakia documentation because of the bundled velocity-1.5.jar.

Using $velocityHasNext

In the VSL template file, `$velocityHasNext`, which is needed to handle the first and last iterations of a #foreach loop, is not implemented in velocity-1.5.

In velocity.properties, these names are set as the default values:

directive.foreach.counter.name = velocityCount
directive.foreach.iterator.name = velocityHasNext

For example, when writing tags in TOML format in Hugo's front matter, the values form a list, so they must be enclosed in brackets, as in `tags = ["tag1", "tag2"]`. If the tag names are held as a list in a $tags variable, you need code like the following. There is no good way to compare the result of $tags.size() with $velocityCount, so $velocityHasNext is required.

#foreach ($tag in $tags)
  #if ($tags.size() == 1)
tags = [ "$tag.getValue()" ]
  #elseif ($velocityCount == 1)
tags = [ "$tag.getValue()"##
  #elseif ($velocityHasNext)
, "$tag.getValue()"##
  #else
, "$tag.getValue()" ]
  #end
#end

Versions before velocity-1.7 have various other problems as well, so I do not recommend anakia-1.0, which depends on velocity-1.5.

Summary: Why use Anakia

As an industry standard, the textbook approach is to use XSLT and XSL-FO. Still, I think Anakia is a good way to meet the need of writing meaningful, well-formed documents in a custom XML format defined by a schema such as RELAX NG and converting them to HTML and other formats.

Personally, I have no objection to the spread of Markdown and Asciidoc, but I cannot accept that something is good just because it is convenient.

I think documents need annotations that computer processing, AI or otherwise, can use as clues, and XML documents are good because they provide such clues even without RDF or an ontology.

Anakia is easier to learn and use than XSLT, so I would like to keep it as a handy tool and help spread it.

[JAVA] I tried using anakia + Jing now
