Part 0: Introduction

Return to Rust on the CH32V003
Published on 2023-03-26 by Noxim. 3889 words, about 20 minutes

Background

Over the past year or two I have been spending more and more of my time in the embedded software development world. I started off with your average Joe's ARM microcontrollers. STMicroelectronics makes the hugely popular STM32 line and there are tons of Rust projects using them. After a while I switched over to Espressif's ESP32 series, primarily the awesome ESP32-C3. The C3 is a WiFi and BLE capable microcontroller that packs quite a punch. Even better, it is built on the free RISC-V instruction set. I have tons of projects using the chip, because it's beefy enough for most things while still being a few € a piece. But at the same time, it leaves me wanting. It's too good. I mean, the damn thing runs at 160 MHz and has 400 kilobytes of RAM. Someone used to Chrome might think that's nothing, but for a microcontroller it is definitely on the bigger side.

But it does leave me wanting. Not more performance, but more constraints. I am a strong believer in the idea that constraints breed creativity. I also like making games, where I feel the principle is especially true. Constraints help me focus on what is most important, and make me think outside the box to achieve specific goals.

This year I want to start visiting Rust meetups somewhere abroad. And the best thing you can do at a meetup is show and share something neat you made yourself. What if I could combine my interest in games and embedded, in a tangible format? What about a small game-trinket-keychain-thing! Imagine the old Nintendo Game & Watch line of portable games, but running Rust instead. These days getting custom PCBs manufactured is dirt cheap and the bill of materials for one game board won't be much. I am setting a hard limit of 1€ per board, assembled, for the cost of these giveaway minigames. The idea is to have <= 4 buttons and a handful of LEDs, nothing more. Perhaps some very simple LCD panel, if I go wild.

So, what microcontroller do I choose? I won't need a lot of horsepower, and like I said, less is more. A few kilobytes of RAM is enough and I won't need any fancy hardware features like USB or wireless capabilities. After browsing a bit, I found a few popular candidates for a chip.

Name         Core     RAM   ROM   Manufacturer
ATmega328P   AVR      2K    32K   Microchip
STM32C011    ARM      6K    16K   STMicroelectronics
PY32F003     ARM      2K    16K   Puya
CH32V003     RISC-V   2K    16K   WinChipHead

Alright, let's figure out who's the best. Right at the top we have Microchip's ATmega328P, which is a very popular chip. This part has risen to fame with its use in the Arduino Uno and has tons of documentation out there. It is built on an 8-bit architecture called AVR, which Rust gained support for in 2020. However, this chip goes for roughly 2.5€, leaving it out of the race.

Next we have the STMicroelectronics part STM32C011. This one is a bit beefier than the rest with a whopping 6 kilobytes of RAM. It also runs a more traditional ARM instruction set with very good ecosystem support. Yet again, these features come at a cost of roughly 1.5€.

Last up we have the -003's. These are both much lesser known and more recent chips. Puya, maker of the PY32, and WinChipHead, maker of the CH32, are not very well known in the west. The hardware features on either won't turn any heads, but the cost will: both chips can be had for roughly 0.10€. This is a big drop from the others and makes our < 1€ BoM much more feasible. Both have the same amount of RAM and similar feature sets. The only real difference is that the WCH part is based on RISC-V, while Puya opted for the more traditional ARM. I personally chose the CH32V003 as the microcontroller for my new project. RISC-V is a nice open ISA and quite frankly... I didn't hear of the PY32 until after I had ordered a reel of 100x the CH32.

So. Our target is a CH32V003. It runs 32-bit RISC-V, with 2 kilobytes of RAM and 16 kilobytes of Flash storage for the program. It has a couple of clocks, max of 24 GPIO pins, serial buses like I2C and SPI and even an analog-to-digital converter. You can build a beast with these. To constrain myself even more (and to make soldering easier) I opted for the SOIC-8 package, which together with 3-5V power exposes 6 GPIO pins for us to play with. That is not a lot, but it's exactly what I wanted. With that settled, let's get to the meat: porting Rust.

CH32V003J4M6 packages the -V003 in an 8-pin SOIC. This little roll has 100 of them

Being a fairly new and fairly obscure chip, there isn't much support for the CH32V003 yet. WinChipHead provides C support through MounRiver Studio, which is an Eclipse-based IDE using GCC as the toolchain. If we want to use Rust, we need a few things: A Rust compiler, startup assembly and crates to access hardware features. This series will go through all three steps, each in their own post.

The compiler

Rust uses the LLVM compiler project to do most of the heavy lifting. LLVM has pretty wide hardware support, and does have targets for RISC-V. The RISC-V instruction set is an open standard designed around modular ISA extensions. The core idea is that you can benefit from shared tooling by modularising hardware features behind extensions. The RISC-V ISA comes in two base flavours, 32 and 64 bit. These are called the RV32I and RV64I. RV stands for RISC-V, the number denotes bitness and then all the letters beyond describe what hardware extensions are in use. The I here stands for Integer instructions, and provides the base functionality in any RISC-V processor. A bare RV32I processor only has a handful of control flow instructions, addition, subtraction, memory access etc. Every RISC-V processor is guaranteed to implement these base instructions, so any RV32I compiler can compile to any RV32I processor. As the name might give away, our CH32V003 is indeed a 32 bit RISC-V processor.

Then there are a number of optional extensions. Some of them are:

  • M extension: Provides integer multiplication and division instructions
  • A extension: Provides atomic memory instructions
  • C extension: Provides compressed (2-byte) shorthands for some I extension instructions
  • F extension: Provides single precision floating point instructions and registers

There are many more, some still under development. Processor designers can choose pretty freely which extensions to include in their design. For example, RV32IMAC is a common combination in microcontrollers, implementing the base integer instructions plus multiplication and division, atomics and the compressed code format.
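To make the naming scheme concrete, here is a minimal sketch that splits a name like RV32IMAC into its bit width, base letter and extension letters. The `Isa` struct and `parse_isa` function are made up for this illustration and are not part of any crate. It only understands single-letter extensions, so multi-letter ones are rejected or ignored at your peril.

```rust
/// A parsed RISC-V ISA name such as "RV32IMAC" (hypothetical helper type).
#[derive(Debug, PartialEq)]
struct Isa {
    /// Register width in bits: 32 or 64.
    xlen: u32,
    /// The base letter that directly follows the bit width.
    base: char,
    /// Single-letter extensions after the base, in the order written.
    extensions: Vec<char>,
}

/// Parses "RV" + bit width + base letter + single-letter extensions.
/// Returns None for anything that does not fit that simple shape.
fn parse_isa(name: &str) -> Option<Isa> {
    let upper = name.to_ascii_uppercase();
    let rest = upper.strip_prefix("RV")?;

    // Collect the leading digits that give the bit width.
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    let xlen: u32 = digits.parse().ok()?;
    if xlen != 32 && xlen != 64 {
        return None;
    }

    // Everything after the digits must be letters: base first, then extensions.
    let mut letters = rest[digits.len()..].chars();
    let base = letters.next()?;
    let extensions: Vec<char> = letters.collect();
    if !base.is_ascii_alphabetic() || !extensions.iter().all(|c| c.is_ascii_alphabetic()) {
        return None;
    }

    Some(Isa { xlen, base, extensions })
}

fn main() {
    let isa = parse_isa("RV32IMAC").expect("valid ISA name");
    println!("{}-bit, base {}, extensions {:?}", isa.xlen, isa.base, isa.extensions);
}
```

Keeping the base letter separate from the extension list mirrors how the names are read: the width and base pick the core instruction set, and every following letter is an optional add-on.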

LLVM, and Rust by extension, supports these fairly well. In fact you can check what your Rust toolchain currently supports by running:

$ rustc --print target-list
...
riscv32gc-unknown-linux-gnu
riscv32gc-unknown-linux-musl
riscv32i-unknown-none-elf
riscv32im-unknown-none-elf
riscv32imac-unknown-none-elf
riscv32imac-unknown-xous-elf
riscv32imc-esp-espidf
riscv32imc-unknown-none-elf
riscv64gc-unknown-freebsd
riscv64gc-unknown-linux-gnu
riscv64gc-unknown-linux-musl
riscv64gc-unknown-none-elf
riscv64gc-unknown-openbsd
riscv64imac-unknown-none-elf
...

You can see a bunch of different variants, for different operating systems and extensions. Note that the G in riscv32gc is not a "real" base extension, but rather a shorthand for riscv32imafd, meaning the real target is RV32IMAFDC. We are targeting bare metal without any operating system, so we only care about the -unknown-none- variants.
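As a small illustration of reading that list, the sketch below keeps only the bare-metal -unknown-none- targets and expands the g shorthand into imafd, as described above. The function names `expand_g` and `bare_metal_riscv` are invented for this example, and the target names are copied from the listing rather than queried from rustc.

```rust
/// Expands the "g" shorthand in an extension string, e.g. "gc" -> "imafdc".
fn expand_g(extensions: &str) -> String {
    extensions.replace('g', "imafd")
}

/// For a bare-metal RISC-V target name such as "riscv32imac-unknown-none-elf",
/// returns the bit width and the (expanded) extension letters.
/// Returns None for non-RISC-V targets and for targets with an OS or vendor.
fn bare_metal_riscv(target: &str) -> Option<(u32, String)> {
    let (arch, rest) = target.split_once('-')?;
    if !rest.starts_with("unknown-none-") {
        return None;
    }
    let isa = arch.strip_prefix("riscv")?;
    let (bits, extensions) = if let Some(ext) = isa.strip_prefix("32") {
        (32, ext)
    } else if let Some(ext) = isa.strip_prefix("64") {
        (64, ext)
    } else {
        return None;
    };
    Some((bits, expand_g(extensions)))
}

fn main() {
    // A few entries copied from the target list above.
    let targets = [
        "riscv32gc-unknown-linux-gnu",
        "riscv32i-unknown-none-elf",
        "riscv32imac-unknown-none-elf",
        "riscv32imc-esp-espidf",
        "riscv64gc-unknown-none-elf",
    ];
    for target in targets {
        if let Some((bits, ext)) = bare_metal_riscv(target) {
            println!("{target}: RV{bits}{}", ext.to_uppercase());
        }
    }
}
```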

So, things are looking good. Rust can out of the box target RV32I, so we should be golden!

... Except for the fact that our CH32V003 is not RV32I. When I earlier said that all RISC-V processors implement the I extension, I lied. If we look up the datasheet for the CH32V003, we find that the core (called QingKe RISC-V2A) actually implements an instruction set called RV32EC. The E extension is an alternative base extension for RISC-V that is specifically designed for very low spec processors. It is mostly identical to the RV32I base extension, except it only specifies 16 general purpose registers instead of 32. A smaller register file saves quite a lot of space on the silicon die, in turn reducing cost. The problem is, the E extension is still in development. It has mostly been set in stone and manufacturers are already implementing it in hardware, but the spec is not officially frozen so compilers are still catching up.

Creating a new target specification

Fine. rustc might not have a built-in RV32E target for us to use, so let's make one. In addition to choosing a preset with the --target flag, we can specify our own target specification file for rustc to use. It's a simple JSON file that describes various low level properties of the target. We can get a good base to start from by dumping the specification for an existing target. riscv32i-unknown-none-elf is the most bare bones, so let's start with it.

$ rustc +nightly --target riscv32i-unknown-none-elf -Z unstable-options --print target-spec-json

Printing target-spec-json is an unstable option, so we need to use the nightly compiler instead of stable. This is with 1.70.0-nightly (ff4b772f8 2023-03-10)

{
  "arch": "riscv32",
  "atomic-cas": false,
  "cpu": "generic-rv32",
  "data-layout": "e-m:e-p:32:32-i64:64-n32-S128",
  "eh-frame-header": false,
  "emit-debug-gdb-scripts": false,
  "is-builtin": true,
  "linker": "rust-lld",
  "linker-flavor": "ld.lld",
  "llvm-target": "riscv32",
  "max-atomic-width": 0,
  "panic-strategy": "abort",
  "relocation-model": "static",
  "target-pointer-width": "32"
}

Let's save this file as the base for our own target, riscv32ec-unknown-none-elf.json. Most of the options are alright for our chip, but we need to change a few. First off, since we are now using a custom target, let's remove the "is-builtin": true line.

RISC-V has a lot of extensions, so having an individual target for each permutation of extensions wouldn't really be feasible from a maintenance perspective. Instead, LLVM only has 2 generic RISC-V targets, riscv32 and riscv64, which map to RV32I and RV64I respectively. Each instruction extension is then configured as a target feature. For most applications LLVM's target features configure SIMD support. For example, your desktop computer is probably an Intel or AMD x86-64 processor with some degree of AVX instruction support. If it was produced in the last 10 years, it probably supports AVX2, but it needs to be a higher end model from the last 5 years to support AVX-512. By default you want your program to only use the more common AVX2 instructions so anyone can run it, but if you are deploying it to a well known server environment you can instruct LLVM to generate AVX-512 instructions with the avx512f target feature. RISC-V extensions are implemented similarly: the c target feature adds support for the C extension, for example. Let's check what target features LLVM supports for our riscv32ec-unknown-none-elf.json:

$ rustc --print target-features --target riscv32ec-unknown-none-elf.json
Features supported by rustc for this target:
    a                    - 'A' (Atomic Instructions).
    c                    - 'C' (Compressed Instructions).
    d                    - 'D' (Double-Precision Floating-Point).
    e                    - Implements RV32E (provides 16 rather than 32 GPRs).
    f                    - 'F' (Single-Precision Floating-Point).
    m                    - 'M' (Integer Multiplication and Division).
    v                    - 'V' (Vector Extension for Application Processors).
    zba                  - 'Zba' (Address Generation Instructions).
    ...
Code-generation features supported by LLVM for this target:
    lui-addi-fusion      - Enable LUI+ADDI macrofusion.
    no-default-unroll    - Disable default unroll preference..
    no-rvc-hints         - Disable RVC Hint Instructions..
    relax                - Enable Linker relaxation..
    reserve-x1           - Reserve X1.
    reserve-x10          - Reserve X10.
    reserve-x11          - Reserve X11.
    reserve-x12          - Reserve X12.
    ...

I've cut out some entries here, but you can see a large list of target features. And bingo! Among the list are the two we need for our RV32EC microcontroller. The e target feature changes the RV32I instruction set to the RV32E set by reducing the number of available registers. The c extension is optional and gives us a compressed instruction format for smaller binary sizes. There are also some code generation features that configure what sort of code LLVM generates. These can be used to (for example) control loop unrolling behaviour or reserve some registers for interrupt handler code. We only really care about the e and c target features however.
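Target features are also visible from inside Rust code through the target_feature cfg, which can help when checking that a custom target really enables what you expect. The sketch below is only an illustration: the `enabled_features` function is made up for this example and it checks a small hand-picked set of feature names, not the full list printed above.

```rust
/// Lists which of a few hand-picked target features were enabled when this
/// program was compiled. `cfg!` is evaluated at compile time, so the result
/// reflects the build target and flags, not the machine running the binary.
fn enabled_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();

    // x86-64 SIMD features, as in the desktop example from the text.
    if cfg!(target_feature = "sse2") {
        enabled.push("sse2");
    }
    if cfg!(target_feature = "avx2") {
        enabled.push("avx2");
    }

    // RISC-V extensions are exposed through the same mechanism.
    if cfg!(target_feature = "m") {
        enabled.push("m");
    }
    if cfg!(target_feature = "c") {
        enabled.push("c");
    }

    enabled
}

fn main() {
    println!("compiled with: {:?}", enabled_features());
}
```

On a desktop build you can flip one of these on with the -C target-feature codegen flag, for example by setting RUSTFLAGS="-C target-feature=+avx2" before running cargo.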

Let's modify our riscv32ec-unknown-none-elf.json to use these features by adding a "features": "+e,+c" field.

{
  "arch": "riscv32",
  "atomic-cas": false,
  "cpu": "generic-rv32",
  "data-layout": "e-m:e-p:32:32-i64:64-n32-S128",
  "eh-frame-header": false,
  "emit-debug-gdb-scripts": false,
  "features": "+e,+c",
  "linker": "rust-lld",
  "linker-flavor": "ld.lld",
  "llvm-target": "riscv32",
  "max-atomic-width": 0,
  "panic-strategy": "abort",
  "relocation-model": "static",
  "target-pointer-width": "32"
}

Putting it to use

Let's take our new Rust target for a spin. Let's create a new crate

$ cargo new --bin hello-wch

and replace the src/main.rs contents with a bare bones no_std start

#![no_std]

fn main() -> ! {
    loop {}
}

Let's try and build it with our custom target:

$ cargo build --target riscv32ec-unknown-none-elf.json
   Compiling hello-wch v0.1.0 (C:/Users/Aarop/repos/hello-wch)
error[E0463]: can't find crate for `core`
  |
  = note: the `riscv32ec-unknown-none-elf` target may not be installed
  = help: consider downloading the target with `rustup target add riscv32ec-unknown-none-elf`

error[E0463]: can't find crate for `compiler_builtins`

error: requires `sized` lang_item

For more information about this error, try `rustc --explain E0463`.
error: could not compile `hello-wch` due to 3 previous errors

There are some core crates in the Rust ecosystem that are distributed as prebuilt binaries instead of being built from source on every compilation. The std is one of these, as are compiler_builtins or even core itself. The help message here is actually incorrect, because rustup does not distribute files for the RV32EC we need. Luckily we can build the core crates ourselves.

Let's set up a cargo configuration file in .cargo/config.toml

[build]
target = "riscv32ec-unknown-none-elf.json"

[unstable]
build-std = ["core", "compiler_builtins"]

We've done two things here. First, we set the default target to our JSON spec so we do not need to specify it each time. Secondly, we have configured build-std to compile core and compiler_builtins from scratch. It is currently an experimental feature, so we need to specify the nightly toolchain to use it.

$ cargo +nightly build
   Compiling core v0.0.0 (C:/Users/Aarop/.rustup/toolchains/nightly-x86_64-pc-windows-msvc/lib/rustlib/src/rust/library/core)
   Compiling compiler_builtins v0.1.87
   Compiling rustc-std-workspace-core v1.99.0 (C:/Users/Aarop/.rustup/toolchains/nightly-x86_64-pc-windows-msvc/lib/rustlib/src/rust/library/rustc-std-workspace-core)
LLVM ERROR: Codegen not yet implemented for RV32E
error: could not compile `core` (lib)
warning: build failed, waiting for other jobs to finish...
LLVM ERROR: Codegen not yet implemented for RV32E
error: could not compile `compiler_builtins` (lib)

Okay, well, that's no good. Cargo did try to build both core and compiler_builtins, but the LLVM backend errored out to say RV32E is not supported yet. Uh oh.

We can check what LLVM version our Rust compiler is using

$ rustc +nightly --version --verbose
rustc 1.70.0-nightly (ff4b772f8 2023-03-10)
binary: rustc
commit-hash: ff4b772f805ec1e1c1bd7e189ab8d5a4e3a6ef13
commit-date: 2023-03-10
host: x86_64-pc-windows-msvc
release: 1.70.0-nightly
LLVM version: 15.0.7

If we search around in LLVM documentation, we find the following mention:

Currently, LLVM fully supports RV32I, and RV64I. RV32E and RV64E are supported by the assembly-based tools only.

As it turns out, LLVM 15 does not support RV32E yet. Neither does LLVM 16 or 17. In fact, no release version of LLVM supports RV32E fully yet! This is the ugly side of sitting on the bleeding edge. The tooling knows there is an E extension, but the support is not complete and for us, it's missing the most important part: actual code generation.

So what can we do? Are we out of luck with Rust on the CH32V003? Implementing a full code generation feature in LLVM is a bit out of our scope and definitely out of the time-budget I have.

cursed ferris interjects: Hmm... Couldn't we just mark the higher 16 as reserved?

Cursed idea

Now that's an idea. A very cursed idea, but an idea nonetheless. Okay, let's back up a little. Remember when we listed all the target features LLVM supports and there were a few of those reserve-x__ features? When enabled, these features mark a register as being reserved. To LLVM this means that the register is set aside for some specific use and cannot be written to. This is a somewhat rare but very useful capability. Imagine you are building a system that requires very fast interrupt handlers. An interrupt handler is a special piece of code that the processor will automatically jump to when some interrupt occurs. Interrupts can be triggered from a number of sources, but a common one is an electrical signal arriving at a microcontroller pin. For a number of safety critical applications, the latency between an interrupt triggering and some action being performed by your application can make a world of difference. A microsecond spent fetching out-of-cache data might be too much. So, you can tell LLVM to reserve a register X31, and write your own interrupt handler routine to store its most relevant data in X31, so that it is always immediately ready, without needing to access any on-chip or off-chip memory. But how is this relevant to RV32E?

If you remember, the defining feature of RV32E is the lack of the upper 16 registers. Regular RV32I has 32 registers, x0 through x31, but to save silicon area RV32E cuts it down to x0 - x15. That really is the only difference for hardware. What if, instead of needing a proper E-aware LLVM implementation, we could just use RV32I and reserve-x__ the upper x16 - x31 registers? LLVM would no longer generate any code that accesses them, making the output functionally equivalent to RV32E machine code. The idea is cursed enough that it might actually work. Let's give it a shot and reserve all the upper registers in our target specification.

{
  "arch": "riscv32",
  "atomic-cas": false,
  "cpu": "generic-rv32",
  "data-layout": "e-m:e-p:32:32-i64:64-n32-S128",
  "eh-frame-header": false,
  "emit-debug-gdb-scripts": false,
  "linker": "rust-lld",
  "linker-flavor": "ld.lld",
  "llvm-target": "riscv32",
  "features": "+reserve-x16,+reserve-x17,+reserve-x18,+reserve-x19,+reserve-x20,+reserve-x21,+reserve-x22,+reserve-x23,+reserve-x24,+reserve-x25,+reserve-x26,+reserve-x27,+reserve-x28,+reserve-x29,+reserve-x30,+reserve-x31",
  "max-atomic-width": 0,
  "panic-strategy": "abort",
  "relocation-model": "static",
  "target-pointer-width": "32"
}
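Typing sixteen reserve-x__ entries by hand is an easy place to drop a register. As a small sketch, the features value used in the specification above can be generated instead. The function name `reserve_upper_registers` is just a name picked for this example.

```rust
/// Builds the comma-separated feature string that reserves registers
/// x16 through x31, matching the "features" value in the target spec above.
fn reserve_upper_registers() -> String {
    (16..=31)
        .map(|n| format!("+reserve-x{n}"))
        .collect::<Vec<_>>()
        .join(",")
}

fn main() {
    // Paste the printed value into the "features" field of the JSON file.
    println!("\"features\": \"{}\"", reserve_upper_registers());
}
```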

Note that we still need a nightly compiler for build-std support

$ cargo +nightly build
   Compiling compiler_builtins v0.1.87
   Compiling core v0.0.0 (~/.rustup/toolchains/nightly-x86_64-pc-windows-msvc/lib/rustlib/src/rust/library/core)
   Compiling hello-wch v0.1.0 (~/hello-wch)
   Compiling rustc-std-workspace-core v1.99.0 (C:/Users/Aarop/.rustup/toolchains/nightly-x86_64-pc-windows-msvc/lib/rustlib/src/rust/library/rustc-std-workspace-core)
error: <unknown>:0:0: in function _ZN4core3num7flt2dec19to_shortest_exp_str17h9129ea2faaa1fcd7E void (ptr, double, i1, i16, i16, i1, ptr, i32, ptr, i32): Argument register required, but has been reserved.
error: could not compile `core` (lib)

Damn, not that easy. This error is telling us that even though some register has been reserved, a function is trying to use it. After some squint-your-eyes demangling, you can tell that the function in question is part of the float conversion code. This is a function internal to Rust's core library and in fact there is nothing special about it. If you are following along, you probably saw some other function get mentioned. The real problem here is that LLVM is still trying to use RV32I's calling convention. A calling convention is a standard that defines how parameters are passed across function calls (among other things). RISC-V defines the ilp32 and lp64 calling conventions, for RV32I and RV64I respectively. When the F and/or D extensions are enabled, there are also the -f and -d calling convention variations that define how floats are passed.
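As an aside, the squinting can be automated. The mangled name in the error uses Rust's older mangling scheme: an _ZN prefix, a sequence of length-prefixed path segments, and a closing E. The sketch below decodes just that shape. The function `demangle_legacy` is written for this illustration and ignores escape sequences such as $LT$ that appear in more complex symbols, so a proper demangling library is the better choice for real use.

```rust
/// Demangles a legacy "_ZN...E" Rust symbol into a `::`-separated path.
/// Only plain length-prefixed identifiers are handled in this sketch.
fn demangle_legacy(symbol: &str) -> Option<String> {
    let mut rest = symbol.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut parts = Vec::new();

    while !rest.is_empty() {
        // Each segment starts with its length written in decimal...
        let digits = rest.chars().take_while(|c| c.is_ascii_digit()).count();
        let len: usize = rest[..digits].parse().ok()?;

        // ...followed by exactly that many bytes of identifier.
        let end = digits.checked_add(len)?;
        let ident = rest.get(digits..end)?;
        parts.push(ident);
        rest = &rest[end..];
    }

    Some(parts.join("::"))
}

fn main() {
    let symbol = "_ZN4core3num7flt2dec19to_shortest_exp_str17h9129ea2faaa1fcd7E";
    match demangle_legacy(symbol) {
        Some(path) => println!("{path}"),
        None => println!("not a legacy-mangled symbol"),
    }
}
```

The last segment, starting with h, is a hash that keeps otherwise identical paths apart; the segments before it spell out the module path and function name.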

Since we are "hacking" RV32I, LLVM is using the ilp32 calling convention to pass arguments to functions. Summarised in a table, it looks like this:

ilp32

Register    Alternative name   Description                   Saved by
x0          zero               Always contains the value 0
x1          ra                 Return address                Caller
x2          sp                 Stack pointer                 Callee
x3          gp                 Global pointer
x4          tp                 Thread pointer
x5 - x7     t0 - t2            Temporary registers           Caller
x8 - x9     s0 - s1            Saved registers               Callee
x10 - x17   a0 - a7            Function arguments            Caller
x18 - x27   s2 - s11           Saved registers               Callee
x28 - x31   t3 - t6            Temporary registers           Caller

This table describes the usage of each of the 32 registers and if they are saved to stack by the code invoking a function, or the function itself. And as you can see, the registers used for function arguments extend all the way up to x17. While LLVM can reserve a register from being used in computation, it won't be able to override the calling convention. What we need is a calling convention that won't use any of the upper 16 registers. One exists, and is called the ilp32e ABI. It is specifically designed to go with RV32E, so it's perfect for us. Let's add "llvm-abiname": "ilp32e" to our target specification to tell LLVM to use it.

$ cargo +nightly build
   Compiling compiler_builtins v0.1.87
   Compiling core v0.0.0 (~/.rustup/toolchains/nightly-x86_64-pc-windows-msvc/lib/rustlib/src/rust/library/core)
   Compiling hello-wch v0.1.0 (T:/cursed-wch)
   Compiling rustc-std-workspace-core v1.99.0 (~/.rustup/toolchains/nightly-x86_64-pc-windows-msvc/lib/rustlib/src/rust/library/rustc-std-workspace-core)
LLVM ERROR: Don't know how to lower this ABI
error: could not compile `core` (lib)

Unfortunately just as with the e extension, the ilp32e calling convention is not actually fully implemented in LLVM yet.
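To see concretely why plain ilp32 clashed with our reserved registers, the ilp32 table above can be written down as a lookup and asked which argument registers fall outside x0 - x15. This is only a sketch of the table as given in this post. The names `SavedBy`, `ilp32_role` and `missing_argument_registers` are invented for the example.

```rust
/// Who is responsible for preserving a register across a call.
#[derive(Debug, PartialEq, Clone, Copy)]
enum SavedBy {
    Caller,
    Callee,
    /// Registers the table lists without a saver (zero, gp, tp).
    Nobody,
}

/// Returns the ABI name and saver of integer register x<n> under ilp32,
/// following the table above. Returns None for n > 31.
fn ilp32_role(n: u8) -> Option<(String, SavedBy)> {
    let role = match n {
        0 => ("zero".to_string(), SavedBy::Nobody),
        1 => ("ra".to_string(), SavedBy::Caller),
        2 => ("sp".to_string(), SavedBy::Callee),
        3 => ("gp".to_string(), SavedBy::Nobody),
        4 => ("tp".to_string(), SavedBy::Nobody),
        5..=7 => (format!("t{}", n - 5), SavedBy::Caller),
        8..=9 => (format!("s{}", n - 8), SavedBy::Callee),
        10..=17 => (format!("a{}", n - 10), SavedBy::Caller),
        18..=27 => (format!("s{}", n - 16), SavedBy::Callee),
        28..=31 => (format!("t{}", n - 25), SavedBy::Caller),
        _ => return None,
    };
    Some(role)
}

/// Argument registers that ilp32 expects but that live in x16 - x31,
/// which is exactly the range we reserved.
fn missing_argument_registers() -> Vec<String> {
    (16..=31u8)
        .filter_map(ilp32_role)
        .map(|(name, _)| name)
        .filter(|name| name.starts_with('a'))
        .collect()
}

fn main() {
    for n in 10..=17u8 {
        if let Some((name, saved_by)) = ilp32_role(n) {
            let status = if n < 16 { "usable" } else { "reserved" };
            println!("x{n} ({name}, saved by {saved_by:?}): {status}");
        }
    }
    println!("argument registers lost: {:?}", missing_argument_registers());
}
```

The two registers it reports, a6 and a7, are the ones behind the "Argument register required, but has been reserved" error: any function with enough arguments needs them under ilp32, no matter what we reserve.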

Back on track

So, our cursed idea hits a wall. Hey, it was a bit of a long shot anyway. So what can we do then? LLVM doesn't have any releases with RV32E support and we can't really do it on our own. Well, we are in luck: some very smart people have already done the heavy lifting for us. Turns out we are not the only people interested in RV32E support and some folks from T-Head (Alibaba's chip unit) are working on a patch to LLVM to support RV32E. It's not merged yet and some parts are potentially buggy, but it's a starting point for us. The patch D70401 implements codegen for RV32E and the ilp32e calling convention, so if we can apply it to a custom Rust toolchain we might actually get somewhere.

In the next post we will go over a custom LLVM and Rust build against D70401.

Next: Part 1: Custom Rust toolchain