wasm and C - The bare necessities
#################################

:date: 2021-05-05 06:15
:author: Louis Holbrook
:category: Code
:tags: wasm,c,clang,llvm
:slug: wasm-c-bare
:summary: The bare minimum needed for two-way communication between wasm and C
:lang: en
:status: published


I am currently resuming my self-improvement task of learning WebAssembly. Or, I should rather say, *using* WebAssembly.

Since I have some small, optimized code chunks that I want to use in an embedded environment, it seemed sensible that the first step would be to establish two-way communication with C.

In my last outing around three years ago, I was using Emscripten_ to bridge the gap. That tool adds quite a few bells and whistles, and doesn't quite yield that warm, fuzzy bare-metal rush. Emscripten_ relies on Clang_ and LLVM_, both of which seem to have gained built-in :code:`wasm` support in the meantime (at least on my Arch Linux system). This integrates nicely with wabt_ - the Swiss Army knife of WebAssembly.

So how far do we get with just clang_, LLVM_ and wabt_? Let's see if we can at least set up a code snippet which simply writes *"foobar"* to memory. The host will write *"foo"*, and :code:`wasm` will write *"bar"*.



Without libc
============

This excellent `tutorial by Surma <https://surma.dev/things/c-to-webassembly/>`_ provides a good starting point. Go ahead and **read that first**. This text is not a WebAssembly primer, so the following will make a lot more sense if you do.

That setup still adds some magic. Namely, the *memory* and the *function table* are added there by the *wasm linker*. It would be even more fun to pass these in from the host system instead.

And so we start: [1]_

.. include:: code/wasm-c-bare/bare.c
   :code: c
   :number-lines: 0

Compiling this without linking gives us a hint on what needs to be defined.

.. code-block:: bash

    $ clang --target=wasm32 -nostdlib -nostartfiles -o bare.wasm -c bare.c
    $ wasm-objdump -x bare.wasm
    bare.wasm: file format wasm 0x1

    Section Details:

    Type[2]:
     - type[0] () -> nil
     - type[1] (i32) -> nil
    Import[4]:
     - memory[0] pages: initial=0 <- env.__linear_memory
     - table[0] type=funcref initial=0 <- env.__indirect_function_table
     - global[0] i32 mutable=1 <- env.__stack_pointer
     - func[0] sig=1 <env.call_me_sometime> <- env.call_me_sometime
    Function[1]:
     - func[1] sig=0 <foo>
    Code[1]:
     - func[1] size=138 <foo>
    Custom:
     - name: "linking"
      - symbol table [count=4]
       - 0: F <foo> func=1 binding=global vis=hidden
       - 1: G <env.__stack_pointer> global=0 undefined binding=global vis=default
       - 2: D <__heap_base> undefined binding=global vis=default
       - 3: F <env.call_me_sometime> func=0 undefined binding=global vis=default
    Custom:
     - name: "reloc.CODE"
      - relocations for section: 3 (Code) [5]
       - R_WASM_GLOBAL_INDEX_LEB offset=0x000007(file=0x000099) symbol=1 <env.__stack_pointer>
       - R_WASM_GLOBAL_INDEX_LEB offset=0x00001c(file=0x0000ae) symbol=1 <env.__stack_pointer>
       - R_WASM_MEMORY_ADDR_SLEB offset=0x000031(file=0x0000c3) symbol=2 <__heap_base>
       - R_WASM_FUNCTION_INDEX_LEB offset=0x000073(file=0x000105) symbol=3 <env.call_me_sometime>
       - R_WASM_GLOBAL_INDEX_LEB offset=0x000086(file=0x000118) symbol=1 <env.__stack_pointer>
    Custom:
     - name: "producers"

Using nodejs as the host, we check if we can instantiate a :code:`WebAssembly` object:

.. include:: code/wasm-c-bare/bare_naive.js
   :code: javascript
   :number-lines: 0

Running this tells us we are apparently missing a property :code:`env` in the imports object.

.. code-block:: bash

    $ node bare_naive.js
    /home/lash/src/tests/wasm/bare/bare_naive.js:8
    const i = new WebAssembly.Instance(m, imports);
              ^

    TypeError: WebAssembly.Instance(): Import #0 module="env" error: module is not an object or function

That matches the :code:`Import` section in the :code:`objdump` output above. Let's stick the *memory* and *table* in there. [2]_

And let's make a bold guess that the callback function :code:`call_me_sometime` needs to go in there as well.

.. include:: code/wasm-c-bare/bare.js
   :code: javascript
   :number-lines: 0

The linker needs a little help from us for this:

- Our callback function will not be available at link time, so we have to pass :code:`--allow-undefined` to promise that the host has got this covered.
- :code:`--import-memory` and :code:`--import-table` to enable us to get the memory and the function table from the host.
- :code:`--export="foo"` to make sure we export exactly what we intend to from our :code:`wasm`, and nothing more.


.. code-block:: bash

    $ clang --target=wasm32 -nostdlib -nostartfiles -Wl,--no-entry -Wl,--export="foo" -Wl,--import-memory -Wl,--import-table -Wl,--allow-undefined -o bare.wasm bare.c


And that should give us:

.. code-block:: bash

    $ node bare.js
    heap is at: 66560
    heap contains: foobar

This way of pointing to memory is of course grossly inadequate *and* unsafe *and* ridiculous for any purpose more advanced than this one. So some proper memory management would not be a bad thing.


Adding libc
===========

And what do you know. Also new since I last looked at this is "a libc for WebAssembly programs built on top of WASI system calls." [wasi-libc]_. Let's see if we can add a slightly less manual way of handling memory with :code:`malloc` and :code:`memcpy`:

.. include:: code/wasm-c-bare/bare_libc.c
   :code: c
   :number-lines: 0

We need a few more parameters for the compiler and linker at this point. The :code:`--target=wasm32-unknown-wasi --sysroot /opt/wasi-libc .. /opt/wasi-libc/lib/wasm32-wasi/libc.a` is needed to hook us up with headers and symbols for the libc.

My Arch Linux puts that sysroot in :code:`/opt/wasi-libc`; that may of course not be the case elsewhere.

.. code-block:: bash

    $ clang -DHAVE_LIBC=1 --target=wasm32-unknown-wasi --sysroot /opt/wasi-libc -nostdlib -nostartfiles -Wl,--no-entry -Wl,--export="foo" -Wl,--import-memory -Wl,--import-table -Wl,--allow-undefined -o bare.wasm bare.c /opt/wasi-libc/lib/wasm32-wasi/libc.a
    $ wasm-objdump -x bare.wasm

    bare.wasm: file format wasm 0x1

    Section Details:

    Type[3]:
     - type[0] (i32) -> nil
     - type[1] () -> nil
     - type[2] (i32) -> i32
    Import[3]:
     - memory[0] pages: initial=2 <- env.memory
     - table[0] type=funcref initial=1 <- env.__indirect_function_table
     - func[0] sig=0 <call_me_sometime> <- env.call_me_sometime
    Function[7]:
     - func[1] sig=1 <foo>
     - func[2] sig=2 <malloc>
     - func[3] sig=2 <dlmalloc>
     - func[4] sig=0 <free>
     - func[5] sig=0 <dlfree>
     - func[6] sig=1 <abort>
     - func[7] sig=2 <sbrk>
    Global[2]:
     - global[0] i32 mutable=1 - init i32=67072
     - global[1] i32 mutable=0 <__heap_base> - init i32=67072
    Export[2]:
     - func[1] <foo> -> "foo"
     - global[1] -> "__heap_base"
    Code[7]:
     - func[1] size=171 <foo>
     - func[2] size=10 <malloc>
     - func[3] size=6984 <dlmalloc>
     - func[4] size=10 <free>
     - func[5] size=1908 <dlfree>
     - func[6] size=4 <abort>
     - func[7] size=78 <sbrk>
    Data[2]:
     - segment[0] memory=0 size=7 - init i32=1024
      - 0000400: 6261 7a62 6172 00                        bazbar.
     - segment[1] memory=0 size=500 - init i32=1032
      - 0000408: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000418: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000428: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000438: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000448: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000458: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000468: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000478: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000488: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000498: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00004a8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00004b8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00004c8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00004d8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00004e8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00004f8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000508: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000518: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000528: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000538: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000548: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000558: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000568: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000578: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000588: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 0000598: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00005a8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00005b8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00005c8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00005d8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00005e8: 0000 0000 0000 0000 0000 0000 0000 0000  ................
      - 00005f8: 0000 0000                                ....
    Custom:
     - name: "name"
     - func[0] <call_me_sometime>
     - func[1] <foo>
     - func[2] <malloc>
     - func[3] <dlmalloc>
     - func[4] <free>
     - func[5] <dlfree>
     - func[6] <abort>
     - func[7] <sbrk>
    Custom:
     - name: "producers"

What luxury. And of course, our :code:`bare.wasm` file just grew from 350 bytes to 10k...

We don't have to change our JavaScript code at this point. Simply run again, and get:

.. code-block:: bash

    $ node bare.js
    heap is at: 67088
    heap contains: foobazbar


.. _Emscripten: https://emscripten.org/

.. _LLVM: https://llvm.org/

.. _clang: https://clang.llvm.org/

.. _wabt: https://github.com/WebAssembly/wabt

.. _libc for wasi: https://github.com/WebAssembly/wasi-libc

..

.. [1] :code:`__heap_base` is defined by default by the wasm linker, and is thus available as an external symbol.

..

.. [2] After linking, the memory import needs to be called :code:`memory` instead of :code:`__linear_memory` for some reason. Thus we add both here for clarity.


..

.. [wasi-libc] https://github.com/WebAssembly/wasi-libc
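
As an addendum on the host side of the exchange: the sketch below shows one way the *"foo"* and *"bar"* writes could be laid out in a :code:`WebAssembly.Memory` from nodejs. It is not the contents of :code:`bare.js`. The helper names :code:`writeCString` and :code:`readCString` are made up for illustration, the offset is simply the heap address printed in the first run, and the second write merely stands in for what the :code:`wasm` code would do.

.. code-block:: javascript

    // Hypothetical helper: copy a string into linear memory as a NUL-terminated C string.
    function writeCString(memory, ptr, s) {
        const bytes = new TextEncoder().encode(s);
        const view = new Uint8Array(memory.buffer, ptr, bytes.length + 1);
        view.set(bytes);
        view[bytes.length] = 0; // terminating NUL
        return bytes.length;
    }

    // Hypothetical helper: read bytes from ptr up to (not including) the first NUL.
    function readCString(memory, ptr) {
        const view = new Uint8Array(memory.buffer);
        let end = ptr;
        while (end < view.length && view[end] !== 0) {
            end++;
        }
        return new TextDecoder().decode(view.subarray(ptr, end));
    }

    // Two pages of 64 KiB each, matching "initial=2" in the linked module's import.
    const memory = new WebAssembly.Memory({ initial: 2 });

    // Assumption: the heap address printed in the first run (66560).
    const heapBase = 66560;

    // The host writes "foo" ...
    const n = writeCString(memory, heapBase, 'foo');
    // ... and this write is a stand-in for the wasm side appending "bar".
    writeCString(memory, heapBase + n, 'bar');

    console.log('heap contains:', readCString(memory, heapBase));

Reading until the first NUL byte is what lets the host pick up whatever the other side appended without passing a length back explicitly.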