Introduction and Inspirations

What is Bleach and why was it created?

Bleach is a programming language designed to give Computer Science students a more interesting and rewarding experience while taking their first "Introduction to Compilers" course.

The language was implemented with this purpose because, based on my personal experience and on certain studies (which I will cite at the end of the chapter), such courses tend to focus too much on their theoretical aspects to the detriment of their practical ones.

Therefore, it's common for students taking this course to find it boring, or even uninteresting and demotivating.

Based on this, the motivation to build Bleach was born. The cornerstone idea behind Bleach is: a language that can be used in a classroom environment to teach the most fundamental ideas and concepts of Programming Languages implementation in an incremental way, using well-known languages and techniques as a basis for this objective.

A glimpse of Bleach's features

High-Level

Since an introductory Compilers course generally lasts one semester in most CS programs, Bleach can't be a big or complex programming language like Java, Rust or C++. Instead, it must be compact while still having the core features that any minimally functional language has.

Usually, when most people (myself included) think about a small but useful language, some options come to mind. To be more precise, three options come to mind: JavaScript, Lua and Scheme.

What do all of these options have in common? All of them are considered to be "high-level" scripting programming languages. (I'll explain below why I chose to implement this type of language.)

When it comes to syntax, Bleach looks most like JavaScript. This was an intentional choice. Since JavaScript has its syntax rooted in C, I thought it would be a good idea to follow this approach because most CS students have already worked (at least a little) with these two languages. Thus, familiarity would kick in.

However, you will notice, as you go through this book, that Bleach also has similarities with Java, Lox, Python, Ruby and Rust when it comes to syntax.

Dynamically Typed

Here is the first reason to make Bleach a "high-level" scripting language. Our three inspirations that fit in this category are all dynamically typed languages.

If you don't remember (or don't know) what a dynamically typed language is, don't get nervous. A dynamically typed language is simply a programming language where variables don't store type information. Instead, the value stored inside a variable holds the type information. This essentially implies two things:

  • Variables are able to store a value of any type.
  • Variables are able to store values of different types at different points in time.
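To make the two points above concrete, here is a minimal C++17 sketch of a value that carries its own type tag. It is only an illustration: the `Value` alias, the `typeName` helper, the `demoDynamicTyping` function and the type names it reports are assumptions made for this example, not necessarily how Bleach's implementation represents values.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical runtime value: the type tag lives inside the value itself,
// not in the variable that happens to hold it.
using Value = std::variant<std::monostate, bool, double, std::string>;

// Reports the runtime type of whatever is currently stored in the value.
// The names returned here are illustrative only.
std::string typeName(const Value& value) {
    struct Visitor {
        std::string operator()(std::monostate) const { return "nil"; }
        std::string operator()(bool) const { return "bool"; }
        std::string operator()(double) const { return "num"; }
        std::string operator()(const std::string&) const { return "str"; }
    };
    return std::visit(Visitor{}, value);
}

// The same variable holds values of different types at different times.
void demoDynamicTyping() {
    Value x = 42.0;                 // x currently holds a number
    assert(typeName(x) == "num");
    x = std::string("hello");       // now the very same x holds a string
    assert(typeName(x) == "str");
}
```

Calling `demoDynamicTyping()` from a `main` function runs the checks. Note that the string is wrapped in `std::string(...)` explicitly: assigning a bare string literal to this variant in C++17 would select the `bool` alternative instead.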

In contrast to dynamically typed programming languages, there are also statically typed ones. This raises the question:

Why not make Bleach a statically typed language?

The answer is straightforward: if I decided to make it static, I would need to implement a static type system for it, and this is simply too much work to learn and implement. Moreover, type systems are no joke. There is a reason this subject has its own course in master's or doctoral programs. That being said, I think it's fair to conclude that such a thing could not be taught in a complete way in an introductory undergraduate course. So... yeah, if I had made this decision, I would be contradicting Bleach's reason to exist, which is (in case you don't remember) to teach the most fundamental ideas and concepts of Programming Languages implementation. Thus, I didn't follow design decisions that led to advanced ideas and concepts in the area of compilers and programming languages.

Automatic Memory Management

One of the reasons that motivate the creation of "high-level" scripting languages is the need to free programmers from the burden of manually managing memory (Yes, I am looking at you: malloc, calloc, realloc and free).

Since Bleach fits into the category of "high-level" scripting languages, it's not a language with manual memory management. Instead, it has an automatic one.

Essentially, this means that Bleach's runtime will handle the allocation and deallocation of memory for us. Since this implementation of Bleach is a Tree-Walk Interpreter, this works differently from a language runtime that ships its own Garbage Collector: this implementation doesn't contain one written from scratch.

Briefly explaining, in a Tree-Walk Interpreter an AST (Abstract Syntax Tree) will be generated by the parser. After some static analysis steps, the AST will be given to the interpreter, so it can execute the code represented by the AST.
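As a rough illustration of this idea, the sketch below builds a tiny AST by hand and executes it by walking the tree. Everything in it is hypothetical: the `Expr`, `Literal` and `Binary` node types and the `buildSample` helper are assumptions made for this example, and a real parser would produce the tree from source code instead.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical AST node interface: every node knows how to evaluate itself.
struct Expr {
    virtual ~Expr() = default;
    virtual double evaluate() const = 0;
};

// Leaf node holding a numeric literal.
struct Literal : Expr {
    double value;
    explicit Literal(double v) : value(v) {}
    double evaluate() const override { return value; }
};

// Inner node: evaluates both children, then applies the operator.
struct Binary : Expr {
    char op;
    std::unique_ptr<Expr> left;
    std::unique_ptr<Expr> right;
    Binary(char o, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : op(o), left(std::move(l)), right(std::move(r)) {}
    double evaluate() const override {
        assert(left && right);           // both operands must be present
        const double a = left->evaluate();
        const double b = right->evaluate();
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            default:  return a / b;      // any other op is treated as '/'
        }
    }
};

// Builds the AST for 1 + 2 * 3 by hand, standing in for the parser.
std::unique_ptr<Expr> buildSample() {
    auto product = std::make_unique<Binary>(
        '*', std::make_unique<Literal>(2.0), std::make_unique<Literal>(3.0));
    return std::make_unique<Binary>(
        '+', std::make_unique<Literal>(1.0), std::move(product));
}
```

Calling `buildSample()->evaluate()` visits the root `+` node, which in turn evaluates the literal `1` and the `*` subtree, producing 7. Each node owns its children through `std::unique_ptr`, so the whole tree is freed when the root goes out of scope.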

In this type of interpreter, executing code simply means traversing the generated AST. During this traversal, two things happen:

  • Dynamic Memory Allocation: When the interpreter creates objects (such as lists, dicts, instances of classes or other complex data structures), it allocates memory for these objects dynamically at runtime.
  • Garbage Collection: In most modern interpreter implementations, especially those written in high-level, garbage-collected languages (such as Java or Python), garbage collection is used to manage this dynamically allocated memory. Right now, you might be wondering "Hey, but this implementation uses C++ and that language is not garbage collected. So, how can Bleach be?". The answer is simple: we are using C++17 to implement the interpreter, and this version of C++ has many features that allow us to write code that doesn't require us to manage memory manually. Deallocation happens automatically once an object is no longer referenced. If you want to learn more about it, search for C++ smart pointers and automatic memory management features like: std::shared_ptr, std::unique_ptr, std::weak_ptr, std::make_shared and std::enable_shared_from_this.
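The following sketch shows, under stated assumptions, how smart pointers can free runtime objects without any manual `delete`. The `Instance` struct, its `alive` counter, the `Environment` alias and the `liveInstancesAfterScope` function are all hypothetical names introduced for this illustration; they are not taken from Bleach's source code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical runtime object. The counter exists only so that object
// lifetimes can be observed in this example.
struct Instance {
    static inline int alive = 0;         // C++17 inline static data member
    std::string className;
    explicit Instance(std::string name) : className(std::move(name)) { ++alive; }
    ~Instance() { --alive; }
};

// Hypothetical environment: variable names mapped to shared runtime objects.
using Environment = std::unordered_map<std::string, std::shared_ptr<Instance>>;

// Creates an object referenced by two variables, lets the scope end, and
// returns how many instances are still alive afterwards.
int liveInstancesAfterScope() {
    {
        Environment scope;
        scope["a"] = std::make_shared<Instance>("Point");
        scope["b"] = scope["a"];         // two variables, one shared object
        assert(scope["a"].use_count() == 2);
        assert(Instance::alive == 1);
    }   // scope is destroyed here: both references drop, the object is freed
    return Instance::alive;
}
```

Here `liveInstancesAfterScope()` returns 0, since the last `std::shared_ptr` to the object disappears together with the environment. Keep in mind that `std::shared_ptr` relies on reference counting rather than a tracing collector, so objects that reference each other in a cycle are not reclaimed unless one side of the cycle uses `std::weak_ptr`.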