Qiita doesn't have syntax highlighting for J language, so I implemented it myself

Introduction

I've been hooked on the J language lately, but unfortunately Qiita's syntax highlighting doesn't support J :qiitan-cry:. J is a particularly hard language to read at a glance, so I think syntax highlighting would motivate both the people writing articles and the people reading them.

So I decided to implement it myself, and this article is a record of that.

It may be helpful when implementing syntax highlighting for other languages, or J may be too different from other languages for it to help much.

Rough procedure

Qiita's syntax highlighting currently uses a Ruby library called Rouge. My plan was to add J highlighting by sending a pull request to this library.

After forking Rouge, start by looking at the lexer Development Guide.

First, add the file that defines the lexer and the spec.

lib/rouge/lexers/j.rb


# -*- coding: utf-8 -*- #
# frozen_string_literal: true

module Rouge
  module Lexers
    class J < RegexLexer
      title 'J'
      desc "The J programming language (www.jsoftware.com)"
      tag 'j'
      filenames '*.ijs', '*.ijt'

      # Write the lexer implementation here
    end
  end
end

spec/lexers/j_spec.rb


# -*- coding: utf-8 -*- #
# frozen_string_literal: true

describe Rouge::Lexers::J do
  let(:subject) { Rouge::Lexers::J.new }

  describe 'guessing' do
    include Support::Guessing

    it 'guesses by filename' do
      assert_guess :filename => 'foo.ijs'
      assert_guess :filename => 'foo.ijt'
    end
  end

  describe 'lexing' do
    include Support::Lexing

    # Write the tests here
  end
end

In addition to these, you need `lib/rouge/demos/j` and `spec/visual/samples/j`, but you can start with empty files.

After that, work on the following steps in parallel. Looking at actual code is faster than reading an explanation in words, so I won't go into detail. If you look at the lexer for a language you know or use, you will probably get the idea.

Write the spec

If you know how to write RSpec, you shouldn't have any problems [^1].

Use `assert_tokens_equal` for testing.

assert_tokens_equal "code", token1, token2, ...

Each token is represented by a pair of [name, text]. See the list of tokens (https://github.com/rouge-ruby/rouge/wiki/List-of-tokens) for the token names.

In practice, many languages seem to have only a few specs written, so *it may not be necessary to write them in too much detail*.

Write the lexer

The lexer is also described with a DSL (an embedded DSL).

state :state_name do
  rule %r/regular expression/, Token
  ...
end

I won't explain the details here. If something is unclear, it may help to look at the lexers for other languages [^2].
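To give a rough feel for the state / rule idea without reading Rouge itself, here is a toy version in plain Ruby. `ToyLexer` and `toy_j_lexer` are names made up for this sketch, not Rouge's API, and the rules cover only a tiny slice of J. A real `RegexLexer` can also do things like switch between states, which this sketch leaves out.

```ruby
require 'strscan'

# Toy lexer: NOT Rouge's implementation. It only illustrates the idea that a
# state is an ordered list of (regex, token name) rules tried at the cursor.
class ToyLexer
  def initialize
    @states = {}
  end

  # Collect the rules declared inside the block under the given state name.
  def state(name, &block)
    @current = (@states[name] ||= [])
    instance_eval(&block)
    @current = nil
  end

  def rule(regex, token)
    @current << [regex, token]
  end

  # Return an array of [token name, text] pairs for the whole input.
  def lex(source, start: :root)
    scanner = StringScanner.new(source)
    rules = @states.fetch(start)
    tokens = []
    until scanner.eos?
      # The first rule whose regex matches at the current position wins.
      hit = rules.find { |regex, _name| scanner.scan(regex) }
      tokens << (hit ? [hit[1], scanner.matched] : ['Error', scanner.getch])
    end
    tokens
  end
end

# Hypothetical rule set: whitespace, NB. comments, integers, plain names.
def toy_j_lexer
  lexer = ToyLexer.new
  lexer.state :root do
    rule(/\s+/, 'Text')
    rule(/NB\..*/, 'Comment.Single')
    rule(/\d+/, 'Literal.Number')
    rule(/[a-zA-Z]\w*/, 'Name')
  end
  lexer
end

p toy_j_lexer.lex('abc 12')
# => [["Name", "abc"], ["Text", " "], ["Literal.Number", "12"]]
```

Rule order matters: the `NB.` comment rule has to come before the plain name rule, otherwise `NB` would be picked up as a name.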

Write a visual sample / demo

The visual sample (`spec/visual/samples/j`) is a text file that lets you check by eye whether the code is highlighted correctly. It can be a program of reasonable size or just a list of tokens.

The demo (`lib/rouge/demos/j`) is the short piece of code that appears on rouge.jneen.net.

Test

As stated in the README, the specs are run with rake. The visual sample is checked by running rackup. (localhost:9292 shows the demo and localhost:9292/j shows the visual sample.)

Points to be particular about

The most important thing in highlighting J code is that **not all symbols should be treated as operators**. If the symbols cannot be color-coded by role, syntax highlighting loses half its value.

So I decided to treat verbs as functions (`Name.Function`) and adverbs / conjunctions as operators (`Operator`). This makes expressions like `>:@i.` easier to read.
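As a rough illustration of that split, the sketch below classifies a handful of J primitives by part of speech. The word regex, the hand-picked primitive lists, and the names `j_token_name` and `classify_j` are all assumptions made for this example; they are not taken from the lexer that was merged into Rouge.

```ruby
# A small hand-picked subset of J primitives, grouped by part of speech.
VERBS        = ['+', '-', '*', '%', '>:', '<:', 'i.'].freeze
ADVERBS      = ['/', '\\', '~'].freeze
CONJUNCTIONS = ['@', '@:', '&', '^:', '"'].freeze

# Simplified J word formation: a name or a single symbol character,
# optionally followed by "." or ":" inflections.
J_WORD = /\s+|\d+|[a-zA-Z]\w*[.:]*|[^\sa-zA-Z0-9][.:]*/

# Map one J word to a token name.
def j_token_name(word)
  if word.match?(/\A\s+\z/)
    'Text'
  elsif word.match?(/\A\d+\z/)
    'Literal.Number'
  elsif VERBS.include?(word)
    'Name.Function' # verbs are colored like functions
  elsif ADVERBS.include?(word) || CONJUNCTIONS.include?(word)
    'Operator'      # adverbs and conjunctions are colored like operators
  elsif word.match?(/\A[a-zA-Z]\w*\z/)
    'Name'
  else
    'Error'         # anything outside this toy subset
  end
end

# Split the source into J words and pair each with its token name.
def classify_j(source)
  source.scan(J_WORD).map { |word| [j_token_name(word), word] }
end

p classify_j('>:@i.')
# => [["Name.Function", ">:"], ["Operator", "@"], ["Name.Function", "i."]]
```

With this split, the two verbs in `>:@i.` get one color and the conjunction `@` joining them gets another, which is the effect I was after.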

There is another trick for readability. When the body of an explicit definition is a string literal (e.g. `dyad : 'x + y'`), the inside of the literal is highlighted as an expression.

In conclusion
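The idea can be sketched as follows. This is only an illustration: `toy_expression_tokens` is a crude stand-in for a real expression lexer, the `definition_body:` flag replaces the actual detection of a preceding `:`, and the choice of `Punctuation` for the quotes is my assumption here.

```ruby
# Crude stand-in for an expression lexer: names, whitespace, everything else.
def toy_expression_tokens(code)
  code.scan(/\s+|\w+|[^\s\w]+/).map do |piece|
    name = if piece.match?(/\A\s+\z/)
             'Text'
           elsif piece.match?(/\A\w+\z/)
             'Name'
           else
             'Operator'
           end
    [name, piece]
  end
end

# Tokens for a quoted J string literal such as "'x + y'".
# When the literal is the body of an explicit definition, its contents are
# lexed again as an expression instead of being emitted as one string token.
def string_literal_tokens(literal, definition_body:)
  return [['Literal.String', literal]] unless definition_body

  body = literal[1..-2] # strip the surrounding quotes
  [['Punctuation', "'"]] + toy_expression_tokens(body) + [['Punctuation', "'"]]
end

p string_literal_tokens("'x + y'", definition_body: false)
# => [["Literal.String", "'x + y'"]]
p string_literal_tokens("'x + y'", definition_body: true)
```

In the second call the `x`, `+` and `y` inside the quotes each get their own token, so the body of the definition is colored like ordinary code.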

Actually, this was my first time writing Ruby properly, but it turned out to be unexpectedly easy to write [^3]. It was also easy to debug visually.

I sent a pull request to Rouge and it was successfully merged :tada:. It is included in the recently released v3.24.0 (Demo).

Now all that is left is to wait for Qiita to pick it up. :qiitan:

Digression

Excerpt from comments on Rouge v3.24.0 release:

This release has two new lexers: one for e-mails *(...)* and one for J (why not another language starting with J?).

~~After all, J gets treated as a joke language...~~

[^1]: Minitest is actually used, but the basic writing style is the same.
[^2]: Obviously, plain copy and paste is not good.
[^3]: It feels more like I wrote an EDSL on top of Ruby than Ruby itself.
