The languages that influenced Rust

When using Rust, I sometimes think, **"Wait, doesn't this feature look like one from another language?"** The first thing that came to mind was traits, which I understood to be equivalent to Haskell's type classes and to the type class pattern built on implicits in Scala. I also felt that the closure syntax resembled Ruby's block notation. I wanted to dig a little deeper into these **"similarities"**, so I looked into the languages that influenced Rust.

(This article is a reprint from My Blog.)

TL;DR

- There was already a well-organized page about the languages that influenced Rust

Languages that influenced Rust

I arranged the languages[^1] listed in "Influences" in chronological order and made a matrix of their characteristics. For the feature columns, I chose the paradigms that Rust focuses on, GC being the exception. I've also added Rust itself for comparison.

Language | Year of appearance | FP | OOP | Concurrency | Static typing | Parametric polymorphism | Ad hoc polymorphism | GC
C 1972 o
Scheme 1975 o o
C++ 1983 o o o
Newsqueak 1984 o o o
Erlang 1986 o o o
SML 1990 o o o o
Haskell 1990 o o o o o
Alef 1992 o o o
Limbo 1995 o o o
Ruby 1995 o o
OCaml 1996 o o o o o
ML Kit 1997[^2] o o o △[^3]
C# 2000 o o o o
Cyclone 2006 o o o △[^4]
Rust 2010 o o o o o o
Swift 2014 o o o o o[^5]

The meaning of each column is as follows. The characteristics of each language are mainly based on Wikipedia, but please note that precise classification is difficult and that the table includes some subjective judgment and bias on my part.

- Year of appearance
  - The era in which the programming language appeared. Please overlook an error of about three years either way
- FP (Functional Programming)
  - Indicates whether the language strongly supports FP
  - Marked △ if it supports FP moderately
- OOP (Object Oriented Programming)
  - Indicates whether the language strongly supports OOP
- Concurrency (concurrent calculation)
  - Indicates whether the language itself strongly supports models such as actors and the CSP / π-calculus
  - Cases like "You can do it with an external library!" are excluded
- Static typing
  - Indicates whether the language's primary implementation supports static typing
- Parametric polymorphism
  - Indicates whether the language supports parametric polymorphism
  - Includes so-called generics (Java), templates (C++), let-polymorphism (ML family), etc.
- Ad hoc polymorphism
  - Indicates whether the language supports ad hoc polymorphism
  - Includes what are called type classes (Haskell), traits (Rust), protocols (Swift), etc.
  - Simple overloading is not included
- GC (garbage collection)
  - Indicates whether the language's primary implementation employs garbage collection

[^1]: Unicode Annex is not a language, so I excluded it. NIL and Hermes influenced features that have since been removed, so I excluded them as well. C was added because it is listed in Why Rust? - #Influences | Learning Rust. I considered including F#, but left it out because "Functional Programming" alone did not make clear which specific features it influenced.

[^2]: Based on the publication year of "Programming with Regions in the ML Kit".

[^3]: Some ML Kit implementations add GC on top of region-based memory management, so it is marked △. (References)

[^4]: The use of GC is optional, so it is marked △. Specifically, garbage collection is available for the heap region.

[^5]: Swift uses ARC (Automatic Reference Counting). Whether ARC should be classified as GC is debatable, but this article classifies it as "GC" because ARC has runtime overhead.

Features influenced by other languages

From here, I would like to look at each of the influenced features.

Algebraic data types

Algebraic data types are widely used in languages that support functional programming. An algebraic data type is a sum of product types. In Rust, algebraic data types are expressed with `enum`.

rust


enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

The above expresses an IP address type: the `V4` variant is a product of four `u8` values, and the `IpAddr` type is the sum of the `V4` and `V6` variants.

- Influenced by
  - SML, OCaml

Pattern matching

Roughly speaking, pattern matching is an enhanced version of common branching constructs such as `if` and `switch`. In Rust, pattern matching is available through `match`.

rust


let x = Some(2);
match x {
    Some(10) => println!("10"), // The `Option` type is destructured and matched
    Some(y) => println!("x = {:?}", y),
    _ => println!("error!"),
}

Because pattern matching can destructure a data structure while matching, as in the code above, it works well with algebraic data types.
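
As a minimal sketch of how pattern matching pairs with algebraic data types, the following reuses the `IpAddr` enum from the previous section. The `describe` function is a hypothetical helper introduced only for this illustration.

```rust
// The enum from the algebraic data types section.
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

// Hypothetical helper: destructures each variant and formats its contents.
fn describe(addr: &IpAddr) -> String {
    match addr {
        // Literal patterns can be mixed with wildcards and bindings.
        IpAddr::V4(127, _, _, _) => String::from("loopback"),
        IpAddr::V4(a, b, c, d) => format!("{}.{}.{}.{}", a, b, c, d),
        IpAddr::V6(text) => format!("v6 {}", text),
    }
}

fn main() {
    println!("{}", describe(&IpAddr::V4(127, 0, 0, 1)));
    println!("{}", describe(&IpAddr::V4(192, 168, 0, 1)));
    println!("{}", describe(&IpAddr::V6(String::from("::1"))));
}
```

Because the `match` has to cover every variant, adding a new variant to the enum makes the compiler point out each `match` that needs updating.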

I also wrote an article on pattern matching, "Dedicated to all programmers! Illustrated 'pattern matching'", so please refer to that as well.

- Rust book
  - Pattern matching

Type inference

Type inference is gradually becoming mainstream among statically typed languages, with Java and C# having adopted it as well. Rust's type inference is based on the **powerful Hindley-Milner type inference**, so the type of an earlier binding may be inferred from later expressions or statements.

rust


use std::collections::HashMap;

let mut x = HashMap::new(); // The key and value types of the HashMap are inferred from later expressions and statements. Convenient!

x.insert(String::from("Key"), 10);
println!("{:?}", x);

In the above example, the key and value types of the `HashMap` are inferred from the arguments of the later `insert` call. This is not possible with type inference in Java or C#, because the type inference introduced in those languages is local variable type inference, which differs from Rust's.

- Influenced by
  - SML, OCaml
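
To make the "inferred from later code" point more concrete, here is a small sketch. `word_counts` and `parse_port` are hypothetical helpers that do not appear in the original article; neither annotates the type of its local binding at the point of declaration.

```rust
use std::collections::HashMap;

// Hypothetical helper: counts how often each word occurs.
fn word_counts(text: &str) -> HashMap<String, usize> {
    // No annotation here: the key type comes from the `entry` call below,
    // and the value type from the function's return type.
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}

// Hypothetical helper: the target type of `parse` is inferred from the
// later assignment to a `u16` binding. Invalid input falls back to 0.
fn parse_port(text: &str) -> u16 {
    let parsed = text.trim().parse().unwrap_or(0);
    let port: u16 = parsed;
    port
}

fn main() {
    println!("{:?}", word_counts("to be or not to be"));
    println!("{}", parse_port(" 8080 "));
}
```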

Semicolon statement separation

A Rust function body consists mainly of "statements" and ends with either a "statement" or an "expression". Rust statements are terminated by a semicolon (`;`), so a statement and an expression can be told apart by whether a semicolon follows.

rust


fn test1() -> i32 {
    let mut x = 2; // statement
    x = x * 3;     // statement
    x + 5          // expression (the return value is 11)
}

fn test2() -> () {
    let mut x = 2; // statement
    x = x * 3;     // statement
    x + 5;         // statement (the return value is the unit `()`)
}

The interesting thing about Rust is that any "expression" can be turned into a "statement" simply by adding a semicolon (`;`) after it. When such a statement is placed at the end of a function, the return value is the unit `()`. In the code above, the return value changes depending on whether the last line has a semicolon[^6]. In other words, in Rust a "statement" can also be thought of as a kind of "expression" that returns a value. Rust is also called an **expression-oriented language** because control structures such as `if`, `match`, and `while` return values too.

Similarly, in OCaml, one of the languages that influenced Rust here, "expressions" are the basic building blocks of a program and are separated by semicolons.

- References
  - OCaml Program Structure - OCaml
- Influenced by
  - SML, OCaml

[^6]: If the return value is the unit `()`, the return type can be omitted from the function signature, so `fn test2() -> () {...}` is equivalent to `fn test2() {...}`.
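
The expression-oriented style described above can be sketched as follows. The three functions are hypothetical examples, not taken from the original article; they show `if`, a block, and `match` each producing a value.

```rust
// `if` is an expression, so its value can be bound directly.
fn sign(n: i32) -> &'static str {
    let label = if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    };
    label
}

// A block is an expression too: its value is the final expression
// that is written without a trailing semicolon.
fn block_value() -> i32 {
    let y = {
        let x = 2;
        x * 3 // no semicolon, so this is the value of the block
    };
    y + 5
}

// `match` as an expression in tail position.
fn parity(n: u32) -> &'static str {
    match n % 2 {
        0 => "even",
        _ => "odd",
    }
}

fn main() {
    println!("{} {} {}", sign(-4), block_value(), parity(7));
}
```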

References

A reference is created by adding `&` to a variable and acts as an "alias" for that variable. It is similar to a C pointer, except that there is no **null pointer**: a reference is always guaranteed to point to something. The referent can be accessed with the dereference operator `*`. With a mutable reference (created with `&mut`), the referent can also be rewritten.

rust


let mut x = "hoge".to_string();
let y = &mut x;
println!("y = {}", *y); // hoge
*y = "mohe".to_string(); // Assigning through `*y` rewrites x
println!("x = {}", x); // mohe

References are considered a characteristic feature of C++, and Rust's references were undoubtedly influenced by C++, but there are also major differences. Rust references are tied to "ownership and lifetimes", and unlike in C++, references in safe Rust never become **dangling references**.

- Influenced by
  - C++
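
As a small sketch of shared and mutable references being passed to functions, consider the following. `total` and `double_all` are hypothetical helpers introduced only for illustration.

```rust
// Hypothetical helper: a shared reference allows reading only.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Hypothetical helper: a mutable reference allows rewriting the referent.
fn double_all(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2; // dereference with `*` to modify the element in place
    }
}

fn main() {
    let mut data = vec![1, 2, 3];
    let before = total(&data); // shared borrow, ends after the call
    double_all(&mut data);     // exclusive borrow
    println!("{} -> {}", before, total(&data));
}
```

The compiler rejects code in which a shared borrow and a mutable borrow of `data` would be in use at the same time, which is part of how dangling references are ruled out.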

RAII(Resource Acquisition Is Initialization)

RAII literally translates to "resource acquisition is (variable) initialization." However, to understand the concept, it is quicker to think of it as **"releasing a resource is the destruction of a variable"**. The typical resource is memory, which is allocated when a variable is initialized and freed when the variable is destroyed. The term RAII is rarely used explicitly in Rust, but the idea is built into **"ownership"** and **"scope"**.

rust


{
    let x = 10; // At this point, the variable x is initialized and memory is allocated.
    {
        let y = "hoge".to_string(); // At this point, the variable y is initialized and memory is allocated.

        // Do various things

    } // Here the variable y goes out of scope and is destroyed, freeing its memory.

    // Do various things

} // Here the variable x goes out of scope and is destroyed, freeing its memory.

In the code example above, each variable's scope extends from its initialization to the end of the innermost enclosing block (the range enclosed in curly braces `{}`). In languages with garbage collection (GC), such as Java and Ruby, memory is not freed even after a variable is destroyed until the garbage collector reclaims it[^7]. The example above uses memory as the resource, but RAII also applies to other resources that are "used" and then "returned", such as opening and closing files. In fact, much of the Rust standard library uses RAII.

[^7]: Even in Java, primitive types and the like are allocated on the stack when declared as local variables and so meet the requirements of RAII, but most other objects are allocated on the heap and are subject to garbage collection.
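
For resources other than memory, the same idea can be sketched with the `Drop` trait. The `Guard` type and the shared log below are hypothetical, standing in for something like a file handle, and exist only to make the release order observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical resource type: records its acquisition and release in a shared log.
struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Guard {
    fn new(name: &'static str, log: &Rc<RefCell<Vec<String>>>) -> Guard {
        log.borrow_mut().push(format!("acquire {}", name));
        Guard { name, log: Rc::clone(log) }
    }
}

impl Drop for Guard {
    // Called automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

// Hypothetical helper: returns the events in the order they happened.
fn run_scopes() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Guard::new("outer", &log);
        {
            let _inner = Guard::new("inner", &log);
        } // `_inner` is dropped here
    } // `_outer` is dropped here
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run_scopes() {
        println!("{}", event);
    }
}
```

Release happens at the closing brace of each scope, without any explicit call and without waiting for a collector.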

Smart pointers

A smart pointer is a kind of pointer that not only points to a memory address but also has **additional functionality**. In Rust, smart pointers are first of all standard library types such as `String` and `Vec`, which allocate and free heap memory **"smartly"**. What distinguishes a smart pointer in Rust is whether it implements the **`Deref` trait** and the **`Drop` trait**.

rust


{
    let a = String::from("hoge"); // The string "hoge" is allocated on the heap
    let b = vec![1, 2, 3]; // The vector is allocated on the heap
    let c = Box::new(5); // An i32 integer is allocated on the heap
} // The variables a, b, and c are destroyed, and the heap memory they own is released at the same time

- Rust book
  - Smart Pointer
- Influenced by
  - C++
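
A minimal sketch of what implementing `Deref` buys you is shown below. `Wrapper` is a hypothetical type that delegates storage to `Box`, so the heap memory is still released by `Box`'s own `Drop` implementation.

```rust
use std::ops::Deref;

// Hypothetical smart pointer that owns a heap-allocated value.
struct Wrapper<T> {
    value: Box<T>,
}

impl<T> Wrapper<T> {
    fn new(value: T) -> Self {
        Wrapper { value: Box::new(value) }
    }
}

impl<T> Deref for Wrapper<T> {
    type Target = T;

    // Lets `*wrapper` and method calls reach the inner value.
    fn deref(&self) -> &T {
        &self.value
    }
}

// Takes a plain `&str`; it knows nothing about `Wrapper`.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let w = Wrapper::new(String::from("hoge"));
    // Deref coercion: &Wrapper<String> -> &String -> &str
    println!("{}", shout(&w));
    // Method calls are resolved through `Deref` as well.
    println!("{}", w.len());
}
```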

Move semantics

Roughly speaking, move semantics means that **ownership is transferred** when a value is assigned to a variable or passed to a function as an argument.

rust


let s1 = "Rust Life".to_string();
println!("{}", s1); // OK

let s2 = s1; // Move semantics: ownership moves from `s1` to `s2`

println!("{}", s2); // OK
println!("{}", s1); // Compile error: ownership has moved to `s2`, so `s1` is no longer accessible

In the code above, `let s2 = s1;` is where the move happens. Rust behaves this way because a value always has exactly one owner.

- Rust book
  - What is ownership?
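
The other case mentioned above, passing a value to a function, can be sketched like this. `add_suffix` is a hypothetical helper; the example also shows `clone` as the explicit way to keep using the original, and a `Copy` type for contrast.

```rust
// Hypothetical helper: takes ownership of the string and hands it back.
fn add_suffix(mut s: String) -> String {
    s.push_str(" Life");
    s
}

fn main() {
    let s1 = String::from("Rust");
    let copy = s1.clone();   // explicit deep copy; `s1` stays usable
    let s2 = add_suffix(s1); // ownership of `s1` moves into the function
    // `s1` can no longer be used here, but `copy` and `s2` can.
    println!("{} / {}", copy, s2);

    let n1 = 10;
    let n2 = n1; // `i32` implements `Copy`, so `n1` remains valid
    println!("{} {}", n1, n2);
}
```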

Monomorphization

Rust generics are expanded at compile time into the concrete types actually used in the program; this is called "monomorphization". Because monomorphization determines the function to be called at compile time, calls use **static dispatch**, and the abstraction carries no run-time call overhead.

In C++, which influenced Rust here, monomorphization is known as template instantiation and specialization.

- Influenced by
  - C++
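
As a rough illustration, the generic `show` below is used with two concrete types. `show_i32` and `show_str` are hand-written stand-ins for what the compiler conceptually generates (the real generated symbols look different), and `show_dyn` is included for contrast as a dynamically dispatched version.

```rust
use std::fmt::Display;

// Generic function: one specialized copy is generated per concrete type used.
fn show<T: Display>(value: T) -> String {
    format!("<{}>", value)
}

// Hand-written equivalents of the copies produced for `i32` and `&str`.
// These are for illustration only.
fn show_i32(value: i32) -> String {
    format!("<{}>", value)
}

fn show_str(value: &str) -> String {
    format!("<{}>", value)
}

// For contrast: a trait object is dispatched dynamically at run time.
fn show_dyn(value: &dyn Display) -> String {
    format!("<{}>", value)
}

fn main() {
    println!("{} {}", show(1), show_i32(1));
    println!("{} {}", show("a"), show_str("a"));
    println!("{}", show_dyn(&2.5));
}
```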

Memory model

"Memory model" has several meanings, but in this context it refers to the **consistency of shared memory accesses in a multithreaded environment**. In general, there are **"atomic operations"** that can be used safely from multiple threads, and a memory model is required to define how they behave. For more information, see the Rust documentation below.

- Standard library documentation
  - std::sync::atomic
  - std::sync::atomic::Ordering
- Influenced by
  - C++
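
A minimal sketch of an atomic operation shared between threads follows. `count_in_parallel` is a hypothetical helper; the choice of `Ordering::Relaxed` for the increments is an assumption that holds here only because nothing but the counter value itself is shared.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical helper: `threads` workers each increment a shared counter
// `per_thread` times, and the final value is returned.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Atomic read-modify-write: no increments are lost.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    // All workers have been joined, so every increment is visible here.
    counter.load(Ordering::SeqCst)
}

fn main() {
    println!("{}", count_in_parallel(4, 1_000));
}
```

The `Ordering` argument is where the memory model shows up in the API: it states how much ordering of surrounding memory accesses each atomic operation must guarantee.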

Region based memory management

Region-based memory management divides memory into areas called "regions" and ties them to the type system in order to manage memory. In Rust, it seems to be closely related to the lifetime management of references.

- References
  - Region-Based Memory Management in Cyclone
  - Cyclone: Memory Management Via Regions
- Influenced by
  - ML Kit, Cyclone
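
On the Rust side, the connection to lifetimes can be sketched with explicit lifetime parameters. Reading `'a` as a kind of region is an interpretation offered here for illustration, and both functions are hypothetical examples.

```rust
// The lifetime parameter `'a` names a span during which both inputs are valid.
// The returned reference is only usable within that span.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() {
        a
    } else {
        b
    }
}

// The result borrows from `text`, so it cannot outlive `text`.
fn first_word<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let outer = String::from("region");
    {
        let inner = String::from("gc");
        // The result must not outlive `inner`, the shorter-lived input.
        let result = longest(&outer, &inner);
        println!("{}", result);
    }
    println!("{}", first_word("hello world"));
}
```

Everything is checked at compile time; no lifetime information exists at run time.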

Type classes, type families

"Type class" is a term that comes from Haskell. The corresponding feature in Rust is the **trait**, which is used to define **behavior common to types**. Traits are close to Java interfaces, but a distinctive feature is that a trait can be implemented **later, after the type has been defined, rather than only at the point of definition**.

rust


trait Greeting { // Trait definition
    fn greet(&self) -> String;
}

fn print_greet<T: Greeting>(person: T) { // Function using a trait bound
    println!("{}!", person.greet());
}

struct Japanese { name: String, }        // Type definitions using `struct`
struct American { name: String, age: u32,}

impl Greeting for Japanese { // Trait implementation
    fn greet(&self) -> String { "Konnichiwa".to_string() }
}

impl Greeting for American {
    fn greet(&self) -> String { "Hello".to_string() }
}

impl Greeting for i32 { // Traits can also be implemented for built-in types!
    fn greet(&self) -> String { self.to_string() }
}

fn main() {
    let person_a = Japanese {name: "Taro".to_string(),};
    let person_b = American {name: "Alex".to_string(), age: 20,};

    // print_greet can be called with different types that implement Greeting (ad hoc polymorphism)
    print_greet(person_a);
    print_greet(person_b);
    print_greet(123);
}

In the code above, `print_greet()` can be called with any type that implements the `Greeting` trait. Even if the type `Japanese` is already defined, it can be passed to `print_greet()` once the `Greeting` trait is implemented for it (`impl Greeting for Japanese`). What's interesting is that traits can also be implemented after the fact for built-in types such as `i32`. The property of a function like `print_greet()`, where the set of types that can be passed to it can be extended later, is called **ad hoc polymorphism**.

A "type family" is a feature that provides type-level functions, which take a type and return a type. In Rust, this is related to **"associated types"**. The definition and usage example below are quoted from `Add` in the standard library.

rust


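// Definition in the standard library (`std::ops::Add`), shown for reference
pub trait Add<Rhs = Self> {
    type Output; // Associated type
    fn add(self, rhs: Rhs) -> Self::Output;
}

// Usage example (import the trait with `use std::ops::Add;` instead of redefining it)
#[derive(Debug, PartialEq)] // needed for `assert_eq!`
struct Point {
    x: i32,
    y: i32,
}

impl Add for Point {
    type Output = Self; // Associated type

    fn add(self, other: Self) -> Self {
        Self {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

assert_eq!(Point { x: 1, y: 0 } + Point { x: 2, y: 3 },
           Point { x: 3, y: 3 });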

Associated types are declared with `type` inside the trait, as in the code above. You can do something similar with generics, but associated types have their own uses. If you are interested, see the following references for the situations in which they are needed.

- Related references
  - Situations where associated types are needed - Rust By Example Japanese version
  - Use of associated types in Rust | Happy Hacκing Blog of κeen
- Influenced by
  - Haskell
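
Another familiar associated type is `Iterator::Item`. The sketch below uses a hypothetical `Countdown` iterator and a hypothetical generic helper `first_item` to show how `I::Item` behaves like a function from the type `I` to another type.

```rust
// Hypothetical iterator that counts down from `n` to 1.
struct Countdown {
    n: u32,
}

impl Iterator for Countdown {
    type Item = u32; // Associated type: fixed once per implementing type

    fn next(&mut self) -> Option<Self::Item> {
        if self.n == 0 {
            None
        } else {
            let current = self.n;
            self.n -= 1;
            Some(current)
        }
    }
}

// Hypothetical generic helper: the return type depends on `I` through `I::Item`.
fn first_item<I: Iterator>(mut iter: I) -> Option<I::Item> {
    iter.next()
}

fn main() {
    let all: Vec<u32> = Countdown { n: 3 }.collect();
    println!("{:?}", all);
    println!("{:?}", first_item(Countdown { n: 2 }));
    println!("{:?}", first_item("ab".chars()));
}
```

With a generic parameter instead (`trait Iterator<T>`), one type could implement the trait several times with different `T`; the associated type pins down exactly one `Item` per implementing type.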

Channels, concurrency

Channels are a primitive for asynchronous communication. **A sender and a receiver can pass data asynchronously through a channel**. The following code is quoted from Channel - Rust By Example Japanese version (the comments have been replaced with my own).

rust


use std::sync::mpsc::{Sender, Receiver};
use std::sync::mpsc;
use std::thread;

static NTHREADS: i32 = 3;

fn main() {
    let (tx, rx): (Sender<i32>, Receiver<i32>) = mpsc::channel(); // Create a channel
    let mut children = Vec::new();

    for id in 0..NTHREADS {
        let thread_tx = tx.clone();

        let child = thread::spawn(move || { // Create a thread
            thread_tx.send(id).unwrap(); // Send data through the channel
            println!("thread {} finished", id);
        });

        children.push(child);
    }

    let mut ids = Vec::with_capacity(NTHREADS as usize);
    for _ in 0..NTHREADS {
        ids.push(rx.recv()); // Receive data sent from the child threads
    }

    for child in children {
        child.join().expect("oops! the child thread panicked");
    }

    println!("{:?}", ids);
}

The code above creates a channel, passes it to child threads, sends data from the child threads through the channel, and receives the data in the parent thread.

- Influenced by
  - Newsqueak, Alef, Limbo

Message passing, thread failure

I could not find out what this refers to, so I will omit it.

- Influenced by
  - Erlang

Optional binding

Optional binding is a Swift feature that, as the name implies, binds a variable and executes a block of code when an Optional has a value. The corresponding feature in Rust is `if let`, which is not limited to `Option` and works with all kinds of patterns.

rust


let num = Some(10);

if let Some(i) = num {
    println!("num =  {}", i);
}

- Rust book
  - Concise control flow with if let
- Influenced by
  - Swift
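
To illustrate that `if let` is not limited to `Option`, here is a sketch using a user-defined enum and a `Result`. The `Shape` type and both functions are hypothetical and exist only for this example.

```rust
// Hypothetical enum used only for this illustration.
enum Shape {
    Circle(f64),
    Rect { w: f64, h: f64 },
}

// `if let` with a struct-like enum variant: binds `w` and `h` on a match.
fn rect_area(shape: &Shape) -> Option<f64> {
    if let Shape::Rect { w, h } = shape {
        Some(w * h)
    } else {
        None
    }
}

// `if let` with `Result`: binds the parsed value on success.
fn parse_or_zero(text: &str) -> i32 {
    if let Ok(n) = text.parse::<i32>() {
        n
    } else {
        0
    }
}

fn main() {
    println!("{:?}", rect_area(&Shape::Rect { w: 2.0, h: 3.0 }));
    println!("{:?}", rect_area(&Shape::Circle(1.0)));
    println!("{}", parse_or_zero("42"));
}
```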

Hygienic macros

A hygienic macro is a macro that guarantees that variable names introduced inside the macro do not collide with variable names at the call site. Below is sample code for a simple Rust macro.

rust


macro_rules! my_macro { // Macro definition
    ($x:expr) => {
        {
            let a = 2;
            $x + a
        }
    };
}

fn main() {
    let a = 5;
    println!("{}", my_macro!(a)); // 7
}

Rust macros are hygienic, so even when `my_macro!` is passed the variable `a`, it is treated as distinct from the variable `a` introduced by the `let` inside the macro. With C macros and Lisp macros, such names can collide, so you have to deliberately choose variable names that will not conflict.

- Influenced by
  - Scheme

Attributes

Attributes are mainly **additional information (metadata)** attached to declarations. A common sight in Rust is the `#[test]` attribute, which marks unit tests.

rust


#[test] // Attribute (marks a test function)
fn test_hoge() {
    // test code
}

#[allow(dead_code)] // Attribute (suppresses the warning for an unused function)
fn foo() {}

#[derive(Debug, Clone, Copy, Default, Eq, Hash, Ord, PartialOrd, PartialEq)] // Attribute (automatically implements traits)
struct Num(i32);
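
As a sketch of what the `derive` attribute above provides, the following exercises several of the generated trait implementations. `distinct` is a hypothetical helper added for illustration.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, Default, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct Num(i32);

// Hypothetical helper: relies on the derived `Eq` and `Hash` to deduplicate.
fn distinct(values: &[Num]) -> usize {
    values.iter().copied().collect::<HashSet<Num>>().len()
}

fn main() {
    let zero = Num::default(); // derived `Default`
    let three = Num(3);
    let copy = three;          // derived `Copy`: `three` stays usable

    println!("{:?} {:?}", zero, three); // derived `Debug`
    println!("{}", three == copy);      // derived `PartialEq`
    println!("{}", zero < three);       // derived `PartialOrd`
    println!("{}", distinct(&[three, copy, zero]));
}
```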

Closure syntax

Comparing Ruby's block notation with Rust's closure notation shows how similar they are.

ruby


ia = [1,2,3]

ia.each {|e| puts e } # Ruby block (the argument of `each`)

rust


let ia = [1, 2, 3];

ia.iter().for_each(|e| println!("{}", e)); // Rust closure (the argument of `for_each`)

- Influenced by
  - Ruby
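
Like a Ruby block, a Rust closure can capture variables from its surrounding scope. The sketch below uses two hypothetical helpers, `scaler` and `scaled_sum`, to show a closure capturing a value and being passed around.

```rust
// Hypothetical helper: returns a closure that captures `factor` by value.
fn scaler(factor: i32) -> impl Fn(i32) -> i32 {
    move |x| x * factor
}

// Hypothetical helper: uses the returned closure inside another closure.
fn scaled_sum(values: &[i32], factor: i32) -> i32 {
    let scale = scaler(factor);
    values.iter().map(|&v| scale(v)).sum()
}

fn main() {
    let triple = scaler(3);
    println!("{}", triple(2));
    println!("{}", scaled_sum(&[1, 2, 3], 2));
}
```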

Visualization of impact

I tried to visualize the languages that influenced Rust. The languages are roughly arranged clockwise in chronological order. Colors are roughly categorized by FP, OOP, concurrency, etc.

Looking at this figure, we can see that Rust has been influenced by languages of various paradigms in a well-balanced way.

Summary

I roughly categorized the languages that influenced Rust, put them into a table, and then visualized them. I also gave a brief introduction to each of the features that were influenced. The original "Influences" page of the Rust Reference is listed below.

At first glance, Rust seems to be packed with many advanced features, but many of them are **based on research results and on languages with a proven track record**. Even the often-discussed ownership system and borrow checker that characterize Rust are heavily influenced by languages such as C++, ML Kit, and Cyclone. And looking at the individual influences, it was interesting to see the story of **Rust achieving the complete break from GC that ML Kit and Cyclone could not**.

Until I looked it up, I did not realize that so many features came from other languages. The eras and paradigms of the influencing languages are diverse, and while researching Rust I felt as if I were **learning the history of language evolution**. I also feel that Rust shows us the process of overcoming the shortcomings of each language and successfully integrating their strengths.

It is true that, having been influenced by so many languages, Rust is said to have a steep learning curve for beginners. But looking at it the other way around, I thought while writing this that Rust is a very rewarding language in which you can learn **the culmination of the various languages that influenced it**.

I hope this article helps those who are interested in Rust.

References
