I've been hooked on the J language lately, but unfortunately Qiita's syntax highlighting doesn't support J :qiitan-cry:. ~~J is a particularly ugly-looking language, so~~ I think having syntax highlighting would motivate both article writers and readers.
So I decided to implement it myself, and this article is a record of how that went.
It may be useful when implementing syntax highlighting for other languages, although J may be too different from other languages for it to help much.
Qiita's syntax highlighting currently uses a Ruby library called Rouge, so my plan was to add J highlighting by sending a pull request to that library.
After forking Rouge, start by reading the Lexer Development Guide.
First, add the file that defines the lexer and the spec.
lib/rouge/lexers/j.rb
# -*- coding: utf-8 -*- #
# frozen_string_literal: true
module Rouge
  module Lexers
    class J < RegexLexer
      title 'J'
      desc "The J programming language (www.jsoftware.com)"
      tag 'j'
      filenames '*.ijs', '*.ijt'
      # Write the lexer implementation here
    end
  end
end
spec/lexers/j_spec.rb
# -*- coding: utf-8 -*- #
# frozen_string_literal: true
describe Rouge::Lexers::J do
  let(:subject) { Rouge::Lexers::J.new }
  describe 'guessing' do
    include Support::Guessing
    it 'guesses by filename' do
      assert_guess :filename => 'foo.ijs'
      assert_guess :filename => 'foo.ijt'
    end
  end
  describe 'lexing' do
    include Support::Lexing
    # Write the tests here
  end
end
In addition to these, you need `lib/rouge/demos/j` and `spec/visual/samples/j`, but both can start out as empty files.
After that, work on the following in parallel. Looking at actual code is faster than explaining it in words, so I won't go into detail. If you look at the lexer for a language you know or use, you will probably get the idea.
If you know how to write RSpec, you shouldn't have any problems [^1].
Use `assert_tokens_equal` for testing.
assert_tokens_equal "code", token1, token2, ...
Each token is represented as a `[name, text]` pair. See the [list of tokens](https://github.com/rouge-ruby/rouge/wiki/List-of-tokens) for the token names.
In practice, many languages seem to have only a few specs written, so *it may not be necessary to write them in much detail*.
The lexer itself is also described with a DSL (EDSL).
state symbol do
  rule regular_expression, token
  ...
end
I won't explain the details here. If something is unclear, it may help to look at the lexers for other languages [^2].
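To make the `state` / `rule` shape concrete, here is a minimal, standard-library-only sketch of how a regex lexer of this style can work. `ToyLexer` and `ToyStrings` are hypothetical names invented for this illustration, and the code is not Rouge's implementation; it only assumes the common convention that a rule may take a third argument naming a state to push, or `:pop!` to return to the previous state.

```ruby
require 'strscan'

# Toy regex lexer with a state/rule DSL. Hypothetical stand-in written for
# illustration; this is not how Rouge itself is implemented.
class ToyLexer
  Rule = Struct.new(:pattern, :token, :action)

  # Rules per state, stored on whichever subclass declares them.
  def self.states
    @states ||= {}
  end

  # Open a state and collect the rules declared inside the block.
  def self.state(name)
    @current_rules = states[name] = []
    yield
  end

  # action is nil, :pop!, or the name of a state to push.
  def self.rule(pattern, token, action = nil)
    @current_rules << Rule.new(pattern, token, action)
  end

  # Return an array of [token_name, text] pairs.
  def lex(source)
    scanner = StringScanner.new(source)
    stack = [:root]
    tokens = []
    until scanner.eos?
      # Try the rules of the current state in order; the first match wins.
      rule = self.class.states.fetch(stack.last).find { |r| scanner.scan(r.pattern) }
      if rule.nil?
        tokens << ['Error', scanner.getch] # no rule matched: consume one character
        next
      end
      tokens << [rule.token, scanner.matched]
      if rule.action == :pop!
        stack.pop
      elsif rule.action
        stack.push(rule.action)
      end
    end
    tokens
  end
end

# A tiny example language: names, integers, and single-quoted strings.
class ToyStrings < ToyLexer
  state :root do
    rule(/\s+/, 'Text')
    rule(/\d+/, 'Literal.Number')
    rule(/[a-z]+/, 'Name')
    rule(/'/, 'Literal.String', :string)
  end

  state :string do
    rule(/[^']+/, 'Literal.String')
    rule(/'/, 'Literal.String', :pop!)
  end
end

p ToyStrings.new.lex("ab 'c d' 12")
```

The result is a list of `[name, text]` pairs, the same shape that `assert_tokens_equal` compares against.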
The visual sample (`spec/visual/samples/j`) is a text file that lets you check by eye whether the highlighting is correct. It can be a reasonably sized program or just a list of tokens.
The demo (`lib/rouge/demos/j`) is the short snippet shown on rouge.jneen.net.
As the README says, the specs are run with `rake`, and the visual sample is checked by running `rackup` (`localhost:9292` shows the demo and `localhost:9292/j` shows the visual sample).
The most important thing when highlighting J code is that **not all symbols should be treated as operators**. If the symbols cannot be color-coded by kind, syntax highlighting loses half its value.
So I decided to treat verbs as functions (`Name.Function`) and adverbs and conjunctions as operators (`Operator`). This makes expressions like `>:@i.` easier to read.
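As a rough sketch of that classification, the following standard-library-only snippet splits a J expression into words and maps each to a token name. The word lists are a small illustrative subset chosen for this example, and `token_name` and `toy_tokens` are hypothetical helpers; the lexer merged into Rouge is more thorough and is not reproduced here.

```ruby
# A few J primitives grouped by part of speech. Illustrative subset only;
# these are not the tables the real lexer uses.
VERBS = ['+', '-', '*', '%', '>:', '<:', 'i.', '#'].freeze
MODIFIERS = ['/', '\\', '~', '@', '@:', '&', '&:', '^:'].freeze # adverbs and conjunctions

# Map one J word to a Rouge-style token name.
def token_name(word)
  if VERBS.include?(word)
    'Name.Function'
  elsif MODIFIERS.include?(word)
    'Operator'
  else
    'Text'
  end
end

# Split source into words: a lowercase letter with "." or ":" inflections,
# a symbol with optional inflections, a run of word characters, or whitespace.
def toy_tokens(source)
  words = source.scan(/[a-z][.:]+|[^\w\s][.:]*|\w+|\s+/)
  words.map { |word| [token_name(word), word] }
end

p toy_tokens('>:@i.')
```

Here `>:` and `i.` come out as `Name.Function` while `@` comes out as `Operator`, so the two verbs and the conjunction joining them get different colors.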
There is another trick for readability. When the body of an explicit definition is a string literal (e.g. `dyad : 'x + y'`), the inside of the literal is highlighted as an expression.
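The idea can be sketched as follows, using only the standard library. `lex_expression` and `lex_literal` are hypothetical helpers, the `as_code:` flag stands in for whatever the real lexer uses to recognize a definition body, and the token name chosen for the quote characters is an assumption made for this example.

```ruby
# Toy expression lexer: returns [token_name, text] pairs.
# Only a handful of arithmetic verbs are recognized (an assumption for this sketch).
def lex_expression(text)
  text.scan(/[A-Za-z_]\w*|\d+|\s+|[^\w\s]/).map do |word|
    name =
      case word
      when /\A[A-Za-z_]/ then 'Name'
      when /\A\d/ then 'Literal.Number'
      when '+', '-', '*', '%' then 'Name.Function'
      else 'Text'
      end
    [name, word]
  end
end

# Lex a single-quoted literal. With as_code: true the quotes stay string
# tokens but the body between them is lexed as an expression.
def lex_literal(literal, as_code: false)
  return [['Literal.String', literal]] unless as_code

  quote = ['Literal.String', "'"]
  [quote, *lex_expression(literal[1...-1]), quote]
end

p lex_literal("'x + y'", as_code: true)
```

Without `as_code: true` the whole literal stays a single string token, which is how an ordinary string elsewhere in the program would be treated.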
This was actually my first time writing Ruby in earnest, but it turned out to be surprisingly easy to write [^3]. Being able to debug visually also helped.
I sent a pull request to Rouge and it was successfully merged :tada:. It is included in the recently released v3.24.0 (demo).
Now all that is left is to wait for Qiita to pick it up. :qiitan:
Excerpt from comments on Rouge v3.24.0 release:
This release has two new lexers: one for e-mails *(...)* and one for J (why not another language starting with J?).
~~So J gets treated as a joke language after all...~~
[^1]: Minitest is actually used, but the basic writing style is the same.
[^2]: Obviously, plain copy and paste is not good.
[^3]: It feels less like I wrote Ruby and more like I wrote an EDSL hosted on Ruby.