For our last essay, we decided to create an architecture.md file for the RustPython repository. Since the current documentation coverage of the RustPython project is less than 10%, we feel that creating an architecture.md will provide the most benefit. It will help new contributors understand the code architecture of RustPython and, in addition, help close the gap between occasional and core contributors.
This document contains a high-level architectural overview of RustPython, which makes it well suited for getting to know the codebase.
RustPython is an open-source (MIT) Python 3 interpreter written in Rust, available as both a library and a shell environment. Implementing the Python interpreter in Rust enables Python to be used as a scripting language inside Rust applications. Moreover, because RustPython can be compiled to WebAssembly, anyone can easily run their Python code in the browser. For a more detailed introduction to RustPython, have a look at this blog post.
RustPython consists of several components, which are described in the sections below. Take a look at this video for a brief walk-through of the components of RustPython. For a more elaborate introduction to one of these components, the parser, see this blog post.
Have a look at these websites for a demo of RustPython running in the browser using WebAssembly:
If, after reading this, you are interested in contributing to RustPython, take a look at these sources to find out how and where to start:
Bird’s eye view
A high-level overview of the workings of RustPython is visible in the figure below, showing how Python source files are interpreted.
Main architecture of RustPython.
The RustPython interpreter can be decoupled into three distinct modules: the parser, compiler and VM.
- The parser is responsible for converting the source code into tokens and deriving an Abstract Syntax Tree (AST) from them.
- The compiler converts the generated AST to bytecode.
- The VM then executes the bytecode, given user-supplied input parameters, and returns its result.
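The same three stages can be observed from within Python itself. The sketch below is only an analogy: it uses CPython's standard library (ast.parse, compile, exec) as a stand-in for the three modules and does not call any RustPython API. The helper name run_pipeline is hypothetical.

```python
import ast


def run_pipeline(source, variables):
    """Parse, compile, and execute `source`; return every intermediate product.

    Hypothetical helper that mirrors the parser -> compiler -> VM split
    using CPython's own machinery, not RustPython's crates.
    """
    # Stage 1 (parser): source text -> abstract syntax tree.
    tree = ast.parse(source, filename="<demo>", mode="exec")
    # Stage 2 (compiler): AST -> code object that holds the bytecode.
    code = compile(tree, filename="<demo>", mode="exec")
    # Stage 3 (VM): execute the bytecode with user-supplied inputs.
    namespace = dict(variables)
    exec(code, namespace)
    return tree, code, namespace


if __name__ == "__main__":
    tree, code, namespace = run_pipeline("total = a + b", {"a": 2, "b": 3})
    print(type(tree).__name__, type(code).__name__, namespace["total"])
```

Keeping the stages separate like this is what makes it possible to inspect or swap out the intermediate products (the AST and the bytecode) independently.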
The entry points are as follows:
- The ‘main’ method of the application: run, located in src/lib.rs:70. This method will call the compiler, which in turn will call the parser, and pass the compiled bytecode to the VM.
- parse, located in
- compile_top, located in
- run_code_obj, located in
Here we give a brief overview of each module and its function. For more details on the separate crates, please take a look at their respective READMEs.
This module (a single file at the time of writing) holds the representation of bytecode for RustPython.
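To get a feel for what a bytecode representation contains, the sketch below lists the instructions CPython generates for a one-line program using the standard dis module. This shows CPython's instruction set, not RustPython's; the two are defined separately and the opcode names should be expected to differ. The helper name list_instructions is hypothetical.

```python
import dis


def list_instructions(source):
    """Compile `source` with CPython and return (opname, argument) pairs."""
    code = compile(source, "<demo>", "exec")
    # Each instruction carries an operation name and a readable argument.
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(code)]


if __name__ == "__main__":
    # The exact listing depends on the CPython version in use.
    for opname, argrepr in list_instructions("x = y + 2"):
        print(f"{opname:<24} {argrepr}")
```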
Python compilation to bytecode. The interface is exposed through the porcelain crate with compile_symtable, while the inner workings are defined in compiler/src/compile.rs, which is mostly an adaptation of the CPython implementation.
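The name compile_symtable suggests that a symbol table is involved in compilation. As an illustration of what such a table records, the sketch below uses CPython's standard symtable module; it is an assumption that RustPython's table holds comparable information, and nothing here calls RustPython itself. The helper name describe_scopes is hypothetical.

```python
import symtable

SOURCE = "x = 1\ndef f(y):\n    return x + y\n"


def describe_scopes(source):
    """Summarize the scopes CPython's symbol-table pass finds in `source`."""
    top = symtable.symtable(source, "<demo>", "exec")
    func = top.get_children()[0]  # the only nested scope: function f
    return {
        "module_names": sorted(top.get_identifiers()),
        "function": func.get_name(),
        # y is bound as a parameter of f; x resolves to the module scope.
        "y_is_parameter": func.lookup("y").is_parameter(),
        "x_is_global": func.lookup("x").is_global(),
    }


if __name__ == "__main__":
    print(describe_scopes(SOURCE))
```

A compiler needs exactly this kind of information to decide whether a name should be loaded from a local slot or looked up in the global scope.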
Rust language extensions and macros specific to RustPython. Here we can find the definition of PyClass along with useful macros like
All the functionality required for parsing Python source code to an abstract syntax tree (AST):
- Lexical Analysis
Python relies heavily on whitespace and indentation to organize code, so before the raw source code reaches LALRPOP, the crate used for parsing, it is first preprocessed by a lexer, which makes sure that Dedent tokens occur at the correct locations. Then, the parser recursively generates an AST for the code, which can be processed by the compiler.
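The effect of this preprocessing can be seen with CPython's standard tokenize module, which likewise turns changes in indentation into explicit tokens. This is a sketch of the idea only: CPython names the tokens INDENT and DEDENT, and it is an assumption that RustPython's lexer places its own tokens at the same positions. The helper name indentation_tokens is hypothetical.

```python
import io
import tokenize

SOURCE = "if x:\n    y = 1\nz = 2\n"


def indentation_tokens(source):
    """Return (token name, line number) for each INDENT/DEDENT token."""
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.start[0])
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.INDENT, tokenize.DEDENT)
    ]


if __name__ == "__main__":
    # One INDENT opens the body of the `if`; one DEDENT closes it.
    for name, line in indentation_tokens(SOURCE):
        print(name, "at line", line)
```

Once block structure is carried by tokens rather than by whitespace, the grammar can treat an indented block much like a bracketed one.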
Python side of the standard library, copied over (with care) from the CPython source code.
CPython test suite, which can be used to compare with CPython in terms of functionality and performance. Many of these files have been modified to fit the state of RustPython at the time they were added, in one of three ways:
- The test has been commented out completely if the parser could not create a valid code object (Syntax Error)
- A test has been marked as unittest.skip("TODO: RustPython") if it led to a crash of RustPython
- A test has been marked with TODO: RustPython left on top if the test can run but failed
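The last two kinds of markers might look as follows in a test file. This is a minimal sketch with made-up test cases: the class and method names are hypothetical, and pairing the TODO comment with unittest.expectedFailure is an assumption, since the text above only says that the marker is left on top of the test.

```python
import unittest


class DemoTests(unittest.TestCase):
    # Skipped entirely: running it would crash the interpreter.
    @unittest.skip("TODO: RustPython")
    def test_crashes_interpreter(self):
        raise RuntimeError("never executed because the test is skipped")

    # TODO: RustPython
    @unittest.expectedFailure  # assumption: how a known failure is recorded
    def test_runs_but_fails(self):
        self.assertEqual(1, 2)  # stand-in for an unsupported feature

    def test_passes(self):
        self.assertEqual(1 + 1, 2)


def run_demo():
    """Run the demo tests and return the collected unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DemoTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    result = run_demo()
    print("skipped:", len(result.skipped))
    print("expected failures:", len(result.expectedFailures))
```

Searching the test directory for the marker text is then a quick way to find tests that are waiting for someone to fix the underlying feature.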
Note: This is a recommended route for getting started with contributing. To get started, please take a look at this blog post.
- builtins: all the builtin functions
- obj: Builtin types
- stdlib: the parts of the standard library implemented in Rust
The RustPython executable is implemented here; it is the interface through which users come into contact with the library. Some things to note:
- The CLI is defined in
- The interface and helper for the REPL are defined in this package, but the actual REPL can be found in
Crate for the WebAssembly build, which compiles the RustPython package to a format that can be run in any modern browser.
Integration and snippet tests for the entire project.