A Mere Mortal's Guide to Webassembly

Sep 22, 2021

Let’s learn some WebAssembly. This guide is intended if you plan on getting nice and intimate with WebAssembly, whether that means writing a back-end for a compiler, hand writing WASM or inspecting the output of an existing compiler. It’s not really that useful if you’re just using Rust/C/whatever compiled to WebAssembly.

I wrote this guide because I’ve spent a lot of time trying to understand the WebAssembly specification for my own projects. Hopefully it’ll make other people’s attempts a little easier.

Basics

WebAssembly is a low level language that can be run in various contexts such as a web browser, a server side runtime, and so on. It’s intended to be a fast, sandboxed compile target. While it is low level, it’s not quite as low level as regular assembly. WebAssembly has some quite high level features such as structured control flow, functions, etc. while lacking low level features such as registers, arbitrary jumps, etc.

WebAssembly is a stack based language. What that means is that computations generally occur by manipulating a stack data structure. Operations can push data onto the stack and pop data off of it. For instance, instead of writing 1 + 2 we’d do:

i32.const 1
i32.const 2
i32.add

Don’t worry if this seems daunting. We’ll worry about the actual opcodes later. What this does is push 1, an i32, i.e. a signed 32 bit integer, onto the stack, then push 2, another i32, onto the stack, then pop both off, add them and push that sum onto the stack.

With the stack annotated:

i32.const 1 // [1]
i32.const 2 // [1, 2]
i32.add     // [3]

The main code units for WebAssembly are functions and modules. Functions are not particularly unusual. They take in a list of parameters and return a list of return values. Yes, multiple return values are allowed. You can treat functions as first class variables using call_indirect but we’ll talk about that later. Modules contain sections for various stuff, mostly tied around functions and the data they need.

While WebAssembly is a stack language, it also has local and global variables. Global and local variables act as you’d expect. Local variables are specific to a function while global variables are specific to a module. Locals are always mutable, globals can be immutable.

The types in WebAssembly are very simple, just 32 and 64 bit integers, along with 32 and 64 bit floats. WebAssembly uses 32 bit indexing (as of September 2021) for memory and most other things.

Encoding

There are two main encoding formats for WebAssembly: WAT and WASM bytecode. WebAssembly Text Format or WAT¹ is an s-expression format, basically it uses a lot of parentheses and is very readable to humans.

WebAssembly bytecode is a very compact encoding that is way less readable to humans and much more readable to machines. It uses LEB128, a variable length encoding for integers. Variable length means that a 32 bit integer doesn’t have to take up a whole 4 bytes if it’s small. For instance something like 2048 is encoded as [0x80, 0x10], using 2 bytes instead of 4. LEB128 is Little Endian. If you don’t remember your computer systems, that means the least significant byte, i.e. the byte corresponding to the lowest part of the number, is stored first. LEB128 can encode signed and unsigned integers, both of which are used in WASM.

I will occasionally say something like “a 32 bit LEB128 encoded integer”, which means that the integer is at maximum 32 bits long.

For this guide we’ll worry about WASM bytecode because it’s what most compilers output. However, valid WASM bytecode is easily convertible to WAT using wasm2wat. You can get wasm2wat as part of WABT, The WebAssembly Binary Toolkit. However invalid bytecode will not always be convertible. You can use the --no-check flag with wasm2wat but if the bytecode is badly encoded, wasm2wat can’t do much. Speaking of tooling…

Tooling

I highly recommend downloading WABT. It’s a great set of tools that makes your life a lot easier. I use wasm-objdump, wasm2wat and wasm-validate all the time. Of course good old hexdump and hexl-mode in emacs are good, especially if you’re rolling your own encoder. Binaryen is also really great for optimizing WebAssembly and has some command line tools as well.

The tooling and infrastructure around WebAssembly is still very new but there’s a couple decent libraries in the Rust ecosystem. walrus is an excellent library for code generating WebAssembly. It isn’t meant to be a full fledged compiler back-end but for small tasks it’s great. There’s other WebAssembly libraries like parity-wasm that also seem great.

I also have a library for emitting WebAssembly in JavaScript called wasup. It’s quite good if I do say so myself.

I’m not familiar enough with the different runtimes to say which one’s the best. I’ve used wasmtime and it seems to work great.

Hex

With WebAssembly, there’s a really high chance that you’re gonna have to read some hexidecimal. In theory you could use something like walrus or emit WAT and not deal with hex, but it always pops up somewhere. It’s not that bad though. Your brain is very good at pattern recognition and will start to see the patterns quite soon.

Keeping an ASCII table around too isn’t a bad idea.

Concepts

Here’s some concepts that you’ll need to understand to write WASM

Indices

WebAssembly does almost everything via indices. Functions are called by their index, local and global variables are referenced by index, etc. Indices are again 32 bit integers encoded using LEB128.

If you’re using something like walrus, indices are abstracted away which is quite nice. If you’re not, well you’ll have to keep track of them.

Sections

WebAssembly modules can have up to 12 sections plus unlimited custom sections. The sections are:

Type

Type definitions for functions, either defined in the module or imported. Note that multiple functions can share the same type definition

Import

Imports for stuff like functions, tables, memories or global variables.

Important note: imported functions share the same index space as functions defined in the module. I.e. if there are two functions imported they’ll have indicies 0 and 1, then the first function defined in the module will have index 2.

Function

Function is a deceptively simple section. All it does is map the functions defined in the module to their type definitions.

Table

Declares the tables used in the module. Tables are used to store references and can be indexed by runtime values. This can be used to implement dynamic function calls by keeping a table of function references.

Memory

Declares the module’s memory. Currently only one memory is supported.

You can also have a module import a memory if you want.

Global

Declares global variables by giving their type, whether they’re mutable and a (limited) expression to evaluate to give their initial value.

Export

Declares exports. Like imports these can be functions, tables, memories, and global variables.

Start

Just a single index of a function to be called on module instantiation. Function cannot take arguments.

Element

Declares the actual elements of the tables declared in the tables section.

Code

The code section declares the actual code for functions. It also declares the local variables used in the code body. We should note that parameters and local variables shared the same index space. So if you have a function that takes in one argument, it’ll have index 0, and the first local variable will have index 1.

Also, local variables are declared by giving a type and the number of local variables, something like:

[(2, i32), (5, f32)]

Which defines 2 i32 local variables and 5 f32 local variables.

Data

Declares constant data for programs such as strings. This data will be copied into memory. There are active and passive data segments; active segments are copied at instantiation while passive segments are copied using the memory.init instruction.

Data count

Stores the number of data sections. Used to keep validation of modules a single pass.

Vectors

A very common pattern in WebAssembly bytecode are vectors. Vectors consist of a 32 bit LEB128 length, followed by the items. These show up everywhere.

Length

Because WebAssembly is made to be deserialized very quickly, the bytecode includes length info for various constructs. Each section starts with the section id then the length of the section in bytes, again as an unsigned 32 bit LEB128 integer. Same with code bodies. This can be tricky to serialize as you need to know the length up front.

Let’s Code

Let’s write a program to encode WebAssembly. This is a simplified version of wasup, my WebAssembly encoder/decoder library. I’m going to use a JavaScript array with a whole bunch of concatenations. I’m not going to cover all of WebAssembly cause you can just read the wasup source for that. Instead I’m gonna write just enough to get a binary search program working in WebAssembly.

The preamble consists of the Unicode string \0asm and the version number, in this case 1. Note that the version is a fixed width 32 bit integer, not an LEB128 encoded int. However it is still little endian.

function encodePreamble() {
	return [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]
}

Next we need to encode types. Let’s start with the basic primitive types: numbers and references.

enum NumType {
  i32 = 0x7f,
  i64 = 0x7e,
  f32 = 0x7d,
  f64 = 0x7c
}

enum RefType {
  funcRef = 0x70,
  externRef = 0x6f
}

type ValueType = RefType | NumType

Function types have a very simple setup. They’re the constant 0x60 followed by a vector of parameter types and a vector of return types.

Let’s write a quick helper for vectors. This is the first case where we need LEB128 ints, so let’s write the implementation for encoding signed and unsigned LEB128. I’m not gonna go into too much detail about LEB128 encoding. It’s a fairly easy algorithm to read about.

function encodeLEB128U(
  n: number,
): number[] {
  if (!Number.isInteger(n)) {
    throw new RangeError(`n must be an integer, instead is ${n}`);
  }

  const buffer = [];
  // Loop until the next to last 7 bytes
  while (n > 0x7f) {
    buffer.push((n & 0x7f) | 0x80);
    n = n >> 7;
  }

  buffer.push(n);
  return buffer;
}

export function encodeLEB128S(
  n: number,
): number[] {
  if (!Number.isInteger(n)) {
    throw new RangeError(`n must be an integer, instead is ${n}`);
  }

  let buffer = [];
  n |= 0;
  while (true) {
    const byte = n & 0x7f;
    n >>= 7;
    if ((n === 0 && (byte & 0x40) === 0) || (n === -1 && (byte & 0x40) !== 0)) {
      buffer.push(byte);
      return buffer;
    }
    buffer.push(byte | 0x80);
  }
}

function encodeVector<T>(items: T[], encodeFn: (t: T) => number[]): number[] {
  return [
    ...encodeLEB128U(items.length),
	...items.flatMap(item => encodeFn(item))
  ]
}

Okay now function types:

interface FuncType {
  paramTypes: ValueType[];
  returnTypes: ValueType[];
}

function encodeFuncType(funcType: FuncType) {
  return [
    0x60,
    ...encodeVector(funcType.paramTypes, (a) => [a]),
    ...encodeVector(funcType.returnTypes, (a) => [a]),
  ];
}

The type section is just a vector of func types combined with the standard section preamble. Most sections have this format so let’s just make a function to encode that:

function encodeSection<T>(id: number, items: T[], encodeFn: (t: T) => number[]) {
  const bytes = encodeVector(items, encodeFn);
  return [
    id,
	...encodeLEB128U(bytes.length),
	...bytes
  ]
}

Now a specific function for the type section:

function encodeTypeSection(types: FuncType[]) {
  return encodeSection(1, types, encodeFuncType);
}

Now let’s handle the function section. The function section is very simple, just a list of type indices.

function encodeFuncSection(typeIndicies: number[]) {
  return encodeSection(3, typeIndicies, encodeLEB128U);
}

The memory section is next. Memories are encoded as limits, basically a minimum and optional maximum value.

interface Limits {
  min: number;
  max?: number;
}

function encodeLimits(limits: Limits) {
  if (limits.max) {
    return [0x01, ...encodeLEB128U(limits.min), ...encodeLEB128U(limits.max)]
  } else {
    return [0x00, ...encodeLEB128U(limits.min)]
  }
}

Encode the section:

function encodeMemorySection(memories: Limits[]) {
  return encodeSection(5, memories, encodeLimits);
}

Next are exports, since we need to access the memory object from JS in order to write the array to memory, and we need to call the function.

Let’s define them. Exports are a name, followed by a type and an index, where the type explains what kind of index.

enum ExportType {
  Function = 0,
  Table = 1,
  Memory = 2,
  Global = 3,
}

interface Export {
  name: string;
  type: ExportType;
  index: number;
}

Encoding isn’t too surprising. We use the TextEncoder API to convert the string to bytes:

function encodeExport(exportEntry: Export) {
  const nameBytes = (new TextEncoder()).encode(exportEntry.name);
  return [
    ...encodeLEB128U(nameBytes.length),
    ...nameBytes,
    exportEntry.type,
    ...encodeLEB128U(exportEntry.index),
  ];
}

function encodeExportSection(exports: Export[]) {
  return encodeSection(7, exports, encodeExport);
}

After that we need a code section and that’s it. We could put the array in the data section but that’s not strictly necessary. I’m gonna define a few instructions for our binary search.

enum InstrType {
  If = 0x04,
  BrIf = 0x0d,
  Return = 0x0f,
  Call = 0x10,
  Drop = 0x1a,
  I32Load = 0x28,
  I32Const = 0x41,
  I32GtS = 0x4a,
  I32Eq = 0x46,
  I32Add = 0x6a,
  I32Mul = 0x6c,
  I32DivU = 0x6e,
  LocalGet = 0x20,
  LocalSet = 0x21,
  LocalTee = 0x22,
}

type Instruction =
  | [InstrType.I32Load, number, number]
  | [InstrType.I32Const, number]
  | [InstrType.I32GtS]
  | [InstrType.I32Eq]
  | [InstrType.I32Add]
  | [InstrType.I32Mul]
  | [InstrType.I32DivU]
  | [InstrType.LocalGet, number]
  | [InstrType.LocalSet, number]
  | [InstrType.LocalTee, number]
  | [InstrType.Drop]
  | [InstrType.BrIf, number]
  | [InstrType.Return]
  | [InstrType.Call, number]
  | [InstrType.If, ValueType, Instruction[], Instruction[]];

We use this two type setup of InstrType and Instruction because instructions can take intermediates, basically static values that come after the instruction in the code.

Technically the definition for the if instruction is incorrect because if instructions can have more types than ValueType expresses, but that’s not necessary for this example.

Now let’s define function bodies:

interface FuncBody {
  locals: [number, ValueType][];
  instructions: Instruction[];
}

Now we can encode instructions:

function encodeInstruction(instruction: Instruction): number[] {
  switch (instruction[0]) {
    case InstrType.I32Load:
      return [
        instruction[0],
        ...encodeLEB128U(instruction[1]),
        ...encodeLEB128U(instruction[2]),
      ];
    case InstrType.I32GtS:
    case InstrType.I32Eq:
    case InstrType.I32Add:
    case InstrType.I32Mul:
    case InstrType.I32DivU:
    case InstrType.Return:
    case InstrType.Drop:
      return [instruction[0]];
    case InstrType.I32Const:
      return [instruction[0], ...encodeLEB128S(instruction[1])];
    case InstrType.LocalGet:
    case InstrType.LocalSet:
    case InstrType.LocalTee:
    case InstrType.BrIf:
    case InstrType.Call:
      return [instruction[0], ...encodeLEB128U(instruction[1])];
    case InstrType.If:
      return [
        instruction[0],
        instruction[1],
        ...instruction[2].flatMap(encodeInstruction),
        0x05,
        ...instruction[3].flatMap(encodeInstruction),
        0x0b,
      ];
  }
}

Depending on the instruction, we encode the intermediates accordingly.

Encoding the function body is similar, the only difference is that we need to know the length of the body in advance. Fortunately that’s pretty easy with our setup:

function encodeFuncBody(body: FuncBody) {
  const encodedBody = [
    ...encodeVector(
      body.locals,
      ([count, type]) => [...encodeLEB128U(count), type],
    ),
    ...body.instructions.flatMap(encodeInstruction),
    0x0b, // end opcode
  ];
  return [...encodeLEB128U(encodedBody.length), ...encodedBody];
}

function encodeCodeSection(bodies: FuncBody[]) {
  return encodeSection(10, bodies, encodeFuncBody);
}

And with this we’re done with our encoding functions! Let’s start writing our binary search implementation. But first, let’s write a TS implementation:

function exampleBinarySearch(
  arr: number[],
  n: number,
  start: number,
  end: number,
): number {
  const mid = Math.floor((start + end) / 2);

  if (n == arr[mid]) {
    return mid;
  }

  if (mid == start) {
    return -1;
  }

  if (n > arr[mid]) {
    return binarySearch(arr, n, mid, end);
  } else {
    return binarySearch(arr, n, start, mid);
  }
}

There’s the classic overflow bug in this implementation but meh. Also yes, the final else clause is there for a reason.

Let’s translate this to WebAssembly! Our function is going to take in a pointer to an array (i32), the number we’re looking for (i32), a start index (i32), and an end index (i32). It’ll return an index or -1 (i32). So the signature will be (i32, i32, i32, i32) => i32:

Our index space looks like:

arr
n
start
end

const binarySearchType = {
  paramTypes: [NumType.i32, NumType.i32, NumType.i32, NumType.i32],
  // return index
  returnTypes: [NumType.i32],
}

Now what should our local variables be? Looking at the TS implementation, we could use a local variable for mid and arr[mid]:

const binarySearchLocals = [[2, NumType.i32]]

Adding on to our index space:

mid
arr[mid]

Let’s start writing the code…

const binarySearchInstructions = [

First let’s start with computing mid. We get start and end, pop them onto the stack…

  [InstrType.LocalGet, 2],
  [InstrType.LocalGet, 3],

Then we add them together

  [InstrType.I32Add],

And divide by 2

  [InstrType.I32Const, 2],
  [InstrType.I32DivU],

Now let’s assign this result to mid

  [InstrType.LocalSet, 4],

Now let’s work on getting arr[mid] assigned to a local. Let’s get mid:

  [InstrType.LocalGet, 4],

We take mid, multiply it by 4, and add arr to it, getting arr + (mid * 4). The multiplication is because each entry is 32 bits or 4 bytes long. Adding that to arr gets us a pointer to the element.

  [InstrType.I32Const, 4],
  [InstrType.I32Mul],
  [InstrType.LocalGet, 0],
  [InstrType.I32Add],

We then load from this address, i.e. we dereference

  [InstrType.I32Load, 0, 0],

And now we assign this to a local

  [InstrType.LocalSet, 5],

Alright let’s work on the if (arr[mid] == n) { return mid } code. This is a really easy pattern to translate to WASM because we can use the br_if instruction. We start by pushing on the value we will return if the condition is true, i.e. mid

  [InstrType.LocalGet, 4],

Next let’s push on the condition:

  [InstrType.LocalGet, 5],
  [InstrType.LocalGet, 1],
  [InstrType.I32Eq],

Alright now we call br_if with a relative depth¹ of 0. This will break out of our current block if the condition is satisfied, i.e. if the topmost value on the stack is not 0. Since the current block is the function block, this will cause an early return and since mid is already on the stack, that’ll be the return value.

  [InstrType.BrIf, 0],

However if we don’t break, we have this value on the stack that’s useless, so we drop it

  [InstrType.Drop],

Alright let’s repeat this pattern but for the if (start == mid) { return -1 } code. Pop on -1

  [InstrType.I32Const, -1],

Now check if start == mid and break if so

  [InstrType.LocalGet, 4],
  [InstrType.LocalGet, 2],
  [InstrType.I32Eq],
  [InstrType.BrIf, 0],

If not, drop the -1

  [InstrType.Drop],

Now let’s do the final condition. Because the resulting value requires calling a function, we can’t do the same br_if pattern. Instead let’s use an if instruction. Again, we pop on the condition

  [InstrType.LocalGet, 1],
  [InstrType.LocalGet, 5],
  [InstrType.I32GtS],

And then we use an if instruction that returns an i32. If n > arr[mid] then we call binarySearch(arr, n, mid, end)

  [InstrType.If, NumType.i32, [
    [InstrType.LocalGet, 0],
    [InstrType.LocalGet, 1],
    [InstrType.LocalGet, 4],
    [
      InstrType.LocalGet,
      3,
    ],
    [InstrType.Call, 0],
  ],

Otherwise we call binarySearch(arr, n, start, mid)

  [
    [InstrType.LocalGet, 0],
    [InstrType.LocalGet, 1],
    [InstrType.LocalGet, 2],
    [InstrType.LocalGet, 4],
    [InstrType.Call, 0],
  ]]
];

Now let’s wrap it up in a nice function and convert it to a byte array (Uint8Array)

function encodeModule() {
  return new Uint8Array([
    ...encodePreamble(),
    ...encodeTypeSection([binarySearchType]),
    ...encodeFuncSection([0]),
    ...encodeMemorySection([{ min: 1 }]),
    ...encodeExportSection([{
      name: "binarySearch",
      type: ExportType.Function,
      index: 0,
    }, { name: "memory", type: ExportType.Memory, index: 0 }]),
    ...encodeCodeSection([{
      locals: binarySearchLocals,
      instructions: binarySearchInstructions,
    }]),
  ]);
}

Here’s a helper to write the array to memory:

function writeArrayToMemory(arr: number[], startIndex: number, memory: WebAssembly.Memory) {
  const buffer = new Uint32Array(memory.buffer);

  for (let i = 0; i < arr.length; i++) {
    buffer[startIndex + i] = arr[i];
  }
}

Now we instantiate:

const { instance } = await WebAssembly.instantiate(encodeModule());

Get the exports:

const { memory, binarySearch } = instance.exports;

Write to memory:

writeArrayToMemory([-4, -2, 0, 2, 8, 13, 21, 54], 0, memory as WebAssembly.Memory);

And now let’s run binary search! The array is at address 0, n will be 21, our start will be 0 and our end will be 7. TypeScript will complain cause it doesn’t know our export is a function but that’s okay we can tell it to be quiet.

// @ts-ignore
console.log(binarySearch(0, 21, 0, 7));

And finally let’s wrap this in a function to make it cleaner:

function wasmBinarySearch(arr: number[], n: number): number {
  writeArrayToMemory(arr, 0, memory as WebAssembly.Memory);
  // @ts-ignore
  return binarySearch(0, n, 0, arr.length - 1);
}

Congrats! You’ve successfully written binary search in WebAssembly with your own emitter!

Sources

I guess WTF was a little vulgar? ↩ ↩²