For many years I have struggled to explain the difference between an interpreter and a compiler. There are several reasons: misunderstandings that come from confusing a programming language with its processing system (its implementation), misunderstandings that come from the difference between how a processing system looks from the outside and how it works on the inside, and misunderstandings rooted in historical background. As a result, it is hard to define "interpreter" and "compiler" briefly, and we have ended up in a situation where one misunderstanding breeds another. In this article I therefore try to explain the difference between interpreters and compilers with a Venn diagram, to improve the situation as much as I can.
(This article was inspired by "Explanation of the difference between an interpreter and a compiler using a formal language - Qiita".)
First, a brief explanation of "interpreter" and "compiler". Both are concepts attached to the "processing system" of a programming language (that is, its implementation); put simply, they are different ways of implementing a language. An interpreter "interprets" and "executes" the target programming language, where interpreting means deciding how to execute a program based on the language's semantics. A compiler, unlike an interpreter, involves no "execution": it only "translates" the target language into another language. In principle, any programming language can be implemented as either an "interpreter" or a "compiler" [^1].
Written out like this, the definition looks simple and clear, but it is really only the **broad** definition; classifying real-world implementations takes a little more care.
[^1]: This article assumes programming languages that run on von Neumann computers. Bringing in quantum computing and the like would complicate matters, so it is not covered here.
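The broad definitions can be made concrete with a minimal Python sketch. It assumes a toy language whose programs are nested tuples such as `("+", 1, ("*", 2, 3))`; the function names `interpret` and `compile_to_rpn` are hypothetical and chosen only for illustration. The first function interprets and executes, while the second only translates into another notation (reverse Polish notation) and never computes a value.

```python
# Toy language (an assumption for this sketch): an expression is either
# an int or a tuple (op, left, right) where op is "+" or "*".

def interpret(expr):
    """Interpreter: decide what the expression means and execute it right away."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    a, b = interpret(left), interpret(right)
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    raise ValueError(f"unknown operator: {op}")


def compile_to_rpn(expr):
    """Compiler (broad sense): translate to another language without executing."""
    if isinstance(expr, int):
        return str(expr)
    op, left, right = expr
    return f"{compile_to_rpn(left)} {compile_to_rpn(right)} {op}"


program = ("+", 1, ("*", 2, 3))  # represents 1 + 2 * 3

print(interpret(program))        # the program is executed here
print(compile_to_rpn(program))   # the program is only translated here
```

Running it prints `7` from the interpreter and the string `1 2 3 * +` from the compiler. Something else would still have to execute that string, which is exactly the difference described above.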
When I first tried to explain this with a diagram, I agonized over how to express it, but in the end I arrived at the following Venn diagram.
I tried to draw it as simply as possible, but some of the intent is probably hard to read from the figure alone, so here is a brief explanation.
First, a compiler in the broad sense simply "translates" a program into a different programming language. The popular image of a compiler, however, is something like a C or Java compiler, which translates a high-level language into a low-level one (machine language, or bytecode that is close to machine language). That image is defined here as the "compiler in the narrow sense".
Next, a "translator" is a processing system that converts a program into a language at the same level. "Same level" is admittedly vague, but if you consider only three categories (machine language, bytecode, and other high-level languages), languages within one category can be treated as roughly the same level. An assembler translates assembly language (mnemonics) into machine language, which is a conversion at about the same level, so it can be regarded as a translator [^2]. In this figure, a parser means the process that converts a programming language into an abstract syntax tree (AST); if the AST is regarded as a kind of formal language, I think the parser can also be classified as a translator [^3]. The processing systems of AltJS languages (TypeScript, CoffeeScript, Dart, PureScript, Elm, Scala.js, Opal, ...) are translators too, and are often called "transpilers". Because their target language is fixed to JavaScript, these languages are alternatives to JavaScript, which is where the name AltJS comes from.
[^2]: Being conceptually classifiable this way does not mean that assemblers are actually called "translators" or "compilers".
[^3]: Converting to an abstract syntax tree does not change the semantics of the original program, so it felt closer to a translator than to a compiler, but this is not a commonly accepted interpretation.
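The idea that a parser is a kind of translator can be illustrated with Python's standard `ast` module. This is only a sketch using Python's own parser as a stand-in; the expression and the variable names `price` and `tax` are made up for the example. The source text is converted to an AST, and the AST can be converted back to source text carrying the same meaning (`ast.unparse` requires Python 3.9 or later).

```python
import ast

# Hypothetical expression, used only for illustration.
source = "price * (1 + tax)"

# Parser as a "translator": source text -> abstract syntax tree.
tree = ast.parse(source, mode="eval")
print(ast.dump(tree.body))  # exact dump format differs between Python versions

# The AST keeps the meaning of the program, so it can be turned back into source.
regenerated = ast.unparse(tree)  # Python 3.9+
print(regenerated)

# Both forms evaluate to the same value for the same inputs.
env = {"price": 100, "tax": 0.5}
value_from_source = eval(source, {}, dict(env))
value_from_tree = eval(compile(tree, "<ast>", "eval"), {}, dict(env))
print(value_from_source == value_from_tree)
```

Nothing is executed by the parsing step itself, and the representation stays at roughly the same level of abstraction, which is why it sits closer to a translator than to a narrow-sense compiler in the classification above.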
Originally, an interpreter interpreted and executed the language directly, as the processing systems of early BASIC did. An interpreter of this kind is called an "interpreter in the narrow sense". Few modern high-level languages are implemented this way, because they have been designed in many ways to be easy for humans to handle and are ill-suited to direct execution by a computer. Conversely, a processing system for a low-level language rather than a high-level one, such as the JVM for Java bytecode, is also called a "bytecode interpreter" and counts as an "interpreter in the narrow sense".
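As a rough picture of a narrow-sense interpreter for a low-level language, here is a minimal stack-machine sketch in Python. The instruction set (`PUSH`, `ADD`, `MUL`) and the function name `run` are hypothetical; this is not the JVM's instruction set or any real bytecode format, only an illustration of executing instructions directly, one by one, with no conversion step.

```python
def run(bytecode):
    """Execute a list of (opcode, argument) pairs directly on a stack."""
    stack = []
    for opcode, arg in bytecode:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# Hypothetical bytecode for 1 + 2 * 3.
example = [
    ("PUSH", 1),
    ("PUSH", 2),
    ("PUSH", 3),
    ("MUL", None),
    ("ADD", None),
]
print(run(example))
```

The loop reads each instruction and acts on it immediately, so the program prints `7` without the input ever being translated into another form.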
If you implement a modern high-level language as an interpreter, you will almost certainly build an "indirect interpretation" interpreter. Such an interpreter first converts the source code into an intermediate representation that is easy to execute, such as an abstract syntax tree (AST) or bytecode, and then executes that. This is the typical processing system of so-called lightweight languages (LL) such as Python, Ruby, and JavaScript. As some readers may have noticed, converting to an "intermediate representation" implicitly means that compiler-equivalent processing is built into the interpreter. That is why, in the Venn diagram, the circles for the broad-sense interpreter and for indirect interpretation overlap the compiler. Recent interpreters in particular often adopt bytecode, for which optimization research is well advanced, as their "intermediate representation", and they make full use of techniques developed for compilers.
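Python, one of the languages named above, lets you observe indirect interpretation from inside the language. The sketch below uses only the built-in `compile` and `exec` functions and the standard `dis` module; the source string and the file label `<example>` are made up for illustration. The source is first converted to a code object holding bytecode, and only then executed.

```python
import dis

# Hypothetical one-line program, used only for illustration.
source = "total = sum(n * n for n in range(4))"

# Step 1: the compiler-like stage inside the interpreter.
# Nothing from the program has run yet; we only obtain a code object.
code = compile(source, "<example>", "exec")

# Show the intermediate representation.
# The exact instructions differ between CPython versions.
dis.dis(code)

# Step 2: the bytecode is executed.
namespace = {}
exec(code, namespace)
print(namespace["total"])
```

From the user's side, running a Python script looks like plain interpretation, but the two steps above happen every time. The final line prints `14` (0 + 1 + 4 + 9).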
Where the line between compiler and interpreter falls changes with context: your standpoint, your viewpoint, and your field of view. Take a recent version of CRuby (the Ruby implementation written in C) as an example. From the standpoint of a Ruby user, CRuby is an interpreter, because it appears to interpret and execute Ruby code. Move your viewpoint inside CRuby, however, and the scenery changes. CRuby first converts the Ruby source to an abstract syntax tree, then converts that tree to CRuby's own bytecode, and finally CRuby's own bytecode interpreter executes the generated bytecode. Seen in detail from the inside, then, three components are at work: a "translator" that converts Ruby to an AST, a "compiler in the narrow sense" that converts the AST to CRuby bytecode, and an "interpreter in the narrow sense" that executes that bytecode. Move your viewpoint further inside the bytecode interpreter, and you can also see a "JIT compiler" running there [^4]. Widen your field of view instead, and you find that Ruby also has an implementation called JRuby that runs on the JVM. JRuby ships with an ahead-of-time (AOT) compiler, which lets you compile Ruby scripts to Java bytecode in advance.
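The three stages seen inside CRuby can be mimicked with a small Python sketch. This is not CRuby's actual design or bytecode: Python's `ast` module stands in for the parser, the instruction names (`PUSH`, `ADD`, `SUB`, `MUL`) are hypothetical, and only integer arithmetic is supported. The point is that the single user-facing function `evaluate` looks like an interpreter, while a translator, a narrow-sense compiler, and a narrow-sense interpreter are chained inside it.

```python
import ast
import operator


def parse(source):
    """Stage 1, 'translator': source text -> AST."""
    return ast.parse(source, mode="eval").body


def compile_ast(node):
    """Stage 2, narrow-sense 'compiler': AST -> flat list of instructions."""
    if isinstance(node, ast.Constant):
        return [("PUSH", node.value)]
    if isinstance(node, ast.BinOp):
        opcode = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}[type(node.op)]
        return compile_ast(node.left) + compile_ast(node.right) + [(opcode, None)]
    raise NotImplementedError(type(node).__name__)


BINARY_OPS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}


def execute(bytecode):
    """Stage 3, narrow-sense 'interpreter': run the instructions on a stack."""
    stack = []
    for opcode, arg in bytecode:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(BINARY_OPS[opcode](left, right))
    return stack.pop()


def evaluate(source):
    """What the user sees: one call that behaves like a plain interpreter."""
    return execute(compile_ast(parse(source)))


print(compile_ast(parse("2 * (3 + 4) - 1")))  # the view from the inside
print(evaluate("2 * (3 + 4) - 1"))            # the view from the outside
```

Calling `evaluate` prints `13`, and nothing in that call reveals the compiler hidden inside. Whether you call the whole thing an interpreter or point at the compiler within it depends on where you stand, which is the point of this section.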
For a Java example, JShell was introduced in Java 9. It is a so-called REPL (Read-Eval-Print Loop), so from the user's standpoint it can be regarded as an **interpreter**. In addition, since Java 11 a .java source file can be executed directly. This is exactly interpreter-like behavior, and it is the style of execution that lightweight languages such as Ruby have long been good at.
Programming languages and their processing systems have thus evolved greatly, becoming more complex and more diverse. Interpreters and compilers are now nested inside one another, arranged in multiple stages, and offered as multiple implementations of the same language, so it matters a great deal which context you are in when you look at a "compiler" or an "interpreter". What looks like an interpreter from your standpoint may look like something else from another, so pay close attention to context when discussing these terms.
[^4]: An experimental JIT compiler has been available as an option since Ruby 2.6.
In this article I tried to explain the difference between interpreters and compilers with a Venn diagram. In summary:
- "Interpreter" and "compiler" are ways of implementing a language processing system; they are not properties of a programming language.
- In principle, any programming language can be implemented as either an interpreter or a compiler.
- An interpreter "interprets" and "executes" the target programming language.
- A compiler, unlike an interpreter, involves no "execution"; it only "translates" the target language into another language.
- Interpreters and compilers each have a narrow and a broad definition for historical reasons.
  - See the Venn diagram.
- Modern programming languages and their processing systems have evolved greatly and become more complex and diverse, and as a result compilers and interpreters are intertwined in very complicated ways.
  - They look different in different contexts, so take care when discussing them.
The Venn diagram is designed to organize modern interpreters and compilers in a rational, intuitive, and easy-to-understand way while still respecting the historical meanings of the terms. The way the circles overlap is also meant to express that an upper layer may contain a lower one.
I hope this article deepens your understanding of interpreters and compilers and leads to new discoveries.
- Explanation of the difference between an interpreter and a compiler using a formal language - Qiita
- [Interpreter - Wikipedia](https://ja.wikipedia.org/wiki/%E3%82%A4%E3%83%B3%E3%82%BF%E3%83%97%E3%83%AA%E3%82%BF)
- Compiler - Wikipedia
- [Transcompiler - Wikipedia](https://ja.wikipedia.org/wiki/%E3%83%88%E3%83%A9%E3%83%B3%E3%82%B9%E3%82%B3%E3%83%B3%E3%83%91%E3%82%A4%E3%83%A9)