The Wasm Workers API enables C/C++ code to leverage Web Workers and shared WebAssembly.Memory (SharedArrayBuffer) to build multithreaded programs via a direct web-like programming API.
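#include <emscripten/wasm_worker.h>
#include <stdio.h>

// Runs inside the Wasm Worker once the posted function is dispatched.
void run_in_worker()
{
  printf("Hello from Wasm Worker!\n");
}

int main()
{
  // Create a Worker whose stack is allocated dynamically.
  emscripten_wasm_worker_t worker = emscripten_malloc_wasm_worker(/*stackSize: */1024);
  // Ask the Worker to call run_in_worker() asynchronously.
  emscripten_wasm_worker_post_function_v(worker, run_in_worker);
}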
Build the code by passing the Emscripten flag -sWASM_WORKERS at both the compile and link steps. The example code creates a new Worker on the main browser thread, which shares the same WebAssembly.Module and WebAssembly.Memory object. Then a postMessage() is passed to the Worker to ask it to execute the function run_in_worker() to print a string.
To explicitly control the memory allocation placement when creating a worker, use the emscripten_create_wasm_worker() function. This function takes a region of memory that must be large enough to hold both the stack and the TLS data for the worker. You can use __builtin_wasm_tls_size() to find out at runtime how much space is required for the program’s TLS data.
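As a rough sketch of the sizing step described above, the helper below computes how large a caller-provided region would need to be. The names `align_up`, `worker_region_size`, `worker_region` and `create_static_worker`, the 16-byte alignment, and the 4096-byte budget are assumptions made for illustration and are not part of the Emscripten API. The guarded section only compiles in a -sWASM_WORKERS build and shows one way the result might be passed to emscripten_create_wasm_worker().

```c
#include <assert.h>
#include <stddef.h>

/* Assumed alignment for the stack and TLS areas (illustrative). */
#define WORKER_ALIGN 16u

/* Round n up to the next multiple of WORKER_ALIGN. */
size_t align_up(size_t n)
{
  return (n + (WORKER_ALIGN - 1)) & ~(size_t)(WORKER_ALIGN - 1);
}

/* Hypothetical helper: bytes needed for a worker's stack plus its TLS data. */
size_t worker_region_size(size_t stack_size, size_t tls_size)
{
  assert(stack_size > 0); /* a worker needs a non-empty stack */
  return align_up(stack_size) + align_up(tls_size);
}

#ifdef __EMSCRIPTEN_WASM_WORKERS__
#include <emscripten/wasm_worker.h>

/* Statically placed region; 4096 bytes is an arbitrary example budget. */
static _Alignas(16) char worker_region[4096];

/* Returns 0 if the static region is too small; otherwise returns
   whatever emscripten_create_wasm_worker() returns. */
emscripten_wasm_worker_t create_static_worker(size_t stack_size)
{
  size_t needed = worker_region_size(stack_size, __builtin_wasm_tls_size());
  if (needed > sizeof(worker_region))
    return 0;
  return emscripten_create_wasm_worker(worker_region, needed);
}
#endif
```

Placing the region in static storage avoids any dependency on malloc(), at the cost of fixing the memory budget at build time.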
In WebAssembly programs, the Memory object that contains the application state can be shared across multiple Workers. This enables direct, high-performance access to shared data from multiple Workers (shared-state multithreading), which is also racy if explicit care is not taken.
Emscripten supports two multithreading APIs that build on this shared memory: the POSIX Threads (Pthreads) API and the Wasm Workers API.
The Pthreads API has a long history with native C programming and the POSIX standard, while the Wasm Workers API is specific to the Emscripten compiler.
These two APIs provide largely the same feature set, but have important differences, which this documentation seeks to explain to help decide which API one should target.
The intended audience and use cases of these two multithreading APIs are slightly different.
The focus of the Pthreads API is on portability and cross-platform compatibility. This API is best used in scenarios where portability is most important, e.g. when a codebase is cross-compiled to multiple platforms, like building both a native Linux x64 executable and an Emscripten WebAssembly based web site.
The Pthreads API in Emscripten seeks to carefully emulate the features that native Pthreads platforms already provide. This helps when porting large C/C++ codebases over to WebAssembly.
The Wasm Workers API, on the other hand, seeks to provide a “direct mapping” to the multithreading primitives as they exist on the web, and nothing more. If an application is only developed to target WebAssembly, and portability is not a concern, then using Wasm Workers can provide great benefits in the form of simpler compiled output, less complexity, smaller code size and possibly better performance.
However, this benefit might not be an obvious win. The Pthreads API was designed to be useful from the synchronous C/C++ language, whereas Web Workers are designed to be useful from asynchronous JavaScript. WebAssembly C/C++ programs can find themselves somewhere in the middle.
Pthreads and Wasm Workers share several similarities:
Both can use emscripten_atomic_* Atomics API.
Both can use GCC __sync_* Atomics API.
Both can use C11 and C++11 Atomics APIs.
Both types of threads have a local stack.
Both types of threads have thread-local storage (TLS) support via thread_local (C++11), _Thread_local (C11) and __thread (GNU11) keywords.
Both types of threads support TLS via explicitly linked in Wasm globals (see test/wasm_worker/wasm_worker_tls_wasm_assembly.c/.S for example code).
Both types of threads have a concept of a thread ID (pthread_self() for pthreads, emscripten_wasm_worker_self_id() for Wasm Workers).
Both types of threads can perform an event-based and an infinite loop programming model.
Both can use EM_ASM and EM_JS API to execute JS code on the calling thread.
Both can call out to JS library functions (linked in with --js-library directive) to execute JS code on the calling thread.
Neither pthreads nor Wasm Workers can be used in conjunction with -sSINGLE_FILE linker flag.
However, the differences are more notable.
Only pthreads can use the MAIN_THREAD_EM_ASM*() and MAIN_THREAD_ASYNC_EM_ASM() functions and the foo__proxy: 'sync'/'async' proxying directive in JS libraries.
Wasm Workers on the other hand do not provide a built-in JS function proxying facility. Proxying a JS function with Wasm Workers can be done by explicitly passing the address of that function to the emscripten_wasm_worker_post_function_* API.
If you need to synchronously wait for the posted function to finish from within a Worker, use one of the emscripten_wasm_worker_*() thread synchronization functions to sleep the calling thread until the callee has finished the operation.
Note that Wasm Workers cannot
At the expense of performance and code size, pthreads implement a notion of POSIX cancellation points (pthread_cancel(), pthread_testcancel()).
Wasm Workers are more lightweight and performant by not enabling that concept.
Creating new Workers can be slow. Spawning a Worker in JavaScript is an asynchronous operation. In order to support synchronous pthread startup (for applications that need it) and to improve thread startup performance, pthreads are hosted in a cached Emscripten runtime managed Worker pool.
Wasm Workers omit this concept, and as a result Wasm Workers will always start up asynchronously. If you need to detect when a Wasm Worker has started up, manually post a ping function to the Worker and have it post a reply back to its creator. If you need to spin up new threads quickly, consider managing a pool of Wasm Workers yourself.
On the web, if a Worker spawns a child Worker of its own, it will create a nested Worker hierarchy that the main thread cannot directly access. To sidestep portability issues stemming from this kind of topology, pthreads flatten the Worker creation chain under the hood so that only the main browser thread ever spawns threads.
Wasm Workers do not implement this kind of topology flattening, and creating a Wasm Worker in a Wasm Worker will produce a nested Worker hierarchy. If you need to create Wasm Workers from within a Wasm Worker, consider which type of hierarchy you would like, and if necessary, flatten the hierarchy manually by posting the Worker creation over to the main thread yourself.
Note that support for nested Workers varies across browsers. As of 02/2022, nested Workers are not supported in Safari. A polyfill is available for browsers that lack support.
The multithreading synchronization primitives offered in emscripten/wasm_worker.h (emscripten_lock_*, emscripten_semaphore_*, emscripten_condvar_*) can be freely invoked from within pthreads if one so wishes, but Wasm Workers cannot utilize any of the synchronization functionality in the Pthread API (pthread_mutex_*, pthread_cond_*, pthread_rwlock_*, etc.), since they lack the needed pthread runtime.
The startup/execution model of pthreads is to start up executing a given thread entry point function. When that function exits, the pthread will also (by default) quit, and the Worker hosting that pthread will return to the Worker pool to wait for another thread to be created on it.
Wasm Workers instead implement the direct web-like model, where a newly created Worker sits idle in its event loop, waiting for functions to be posted to it. When those functions finish, the Worker will return to its event loop, waiting to receive more functions (or worker scope web events) to execute.
A Wasm Worker will only quit with a call to emscripten_terminate_wasm_worker(worker_id) or emscripten_terminate_all_wasm_workers().
Pthreads allow one to register thread exit handlers via pthread_atexit, which will be called when the thread quits. Wasm Workers do not have this concept.
To enable flexible synchronous execution of code on other threads, and to implement support APIs for features such as the MEMFS filesystem and the Offscreen Framebuffer (WebGL emulated from a Worker), the main browser thread and each pthread have a system-backed “proxy message queue” to receive messages.
This enables user code to call API functions such as emscripten_sync_run_in_main_runtime_thread(), emscripten_async_run_in_main_runtime_thread(), emscripten_dispatch_to_thread(), etc. from emscripten/threading.h to perform proxied calls.
Wasm Workers do not provide this functionality. If needed, such messaging should be implemented manually by users via regular multithreaded synchronized programming techniques (mutexes, futexes, semaphores, etc.)
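One way such manual messaging could look is a small single-producer, single-consumer queue placed in shared memory and built only on C11 atomics, which both thread types can use. This is a minimal sketch under stated assumptions: `mailbox_t`, `mailbox_init`, `mailbox_try_post`, `mailbox_try_receive` and the capacity of 8 are hypothetical names and values, not Emscripten APIs, and the queue is only safe with exactly one posting thread and one receiving thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAILBOX_CAPACITY 8u /* must be a power of two */

/* Hypothetical single-producer, single-consumer queue in shared memory. */
typedef struct {
  _Atomic uint32_t head; /* next slot to read, advanced by the consumer */
  _Atomic uint32_t tail; /* next slot to write, advanced by the producer */
  int32_t slots[MAILBOX_CAPACITY];
} mailbox_t;

void mailbox_init(mailbox_t *mb)
{
  atomic_init(&mb->head, 0);
  atomic_init(&mb->tail, 0);
}

/* Producer side: returns false when the queue is full. */
bool mailbox_try_post(mailbox_t *mb, int32_t message)
{
  uint32_t tail = atomic_load_explicit(&mb->tail, memory_order_relaxed);
  uint32_t head = atomic_load_explicit(&mb->head, memory_order_acquire);
  if (tail - head == MAILBOX_CAPACITY)
    return false;
  mb->slots[tail % MAILBOX_CAPACITY] = message;
  /* Release: publish the slot contents before the new tail becomes visible. */
  atomic_store_explicit(&mb->tail, tail + 1, memory_order_release);
  return true;
}

/* Consumer side: returns false when the queue is empty. */
bool mailbox_try_receive(mailbox_t *mb, int32_t *out)
{
  assert(out);
  uint32_t head = atomic_load_explicit(&mb->head, memory_order_relaxed);
  uint32_t tail = atomic_load_explicit(&mb->tail, memory_order_acquire);
  if (head == tail)
    return false;
  *out = mb->slots[head % MAILBOX_CAPACITY];
  /* Release: the slot may be reused only after it has been read. */
  atomic_store_explicit(&mb->head, head + 1, memory_order_release);
  return true;
}
```

Both functions return immediately instead of blocking, so the caller decides whether to retry, spin, or sleep on a synchronization primitive; that choice matters on the main browser thread, which should not be put to sleep.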
Another portability-aiding emulation feature that Pthreads provide is that the time values returned by emscripten_get_now() are synchronized to a common time base across all threads.
Wasm Workers omit this concept. It is recommended to use the function emscripten_performance_now() for high performance timing in a Wasm Worker, and to either avoid comparing the resulting values across Workers or synchronize them manually.
The multithreaded input API provided in emscripten/html5.h only works with the pthread API. When calling any of the functions emscripten_set_*_callback_on_thread(), one can choose the target pthread to be the recipient of the received events.
With Wasm Workers, if desired, “backproxying” events from the main browser thread to a Wasm Worker should be implemented manually e.g. by using the emscripten_wasm_worker_post_function_*() API family.
However, note that backproxying input events has a drawback: it prevents security sensitive operations, like fullscreen requests, pointer locking and audio playback resuming, because the code handling the input event no longer runs inside the event callback context that would authorize the operation.
The mutex implementation from pthread_mutex_* has a few different creation options, one being a “recursive” mutex.
The lock implemented by the emscripten_lock_* API is not recursive (and does not provide an option to make it so).
Pthreads also offer a guard against the programming error of one thread releasing a lock that is owned by another thread. The emscripten_lock_* API does not track lock ownership.
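If recursion or ownership checking is needed in a Wasm Workers program, it has to be layered on by the application. The following is a minimal sketch of that idea using C11 atomics only; `recursive_lock_t` and its functions are hypothetical names, `NO_OWNER` is an assumed sentinel that no real thread ID uses, and `self_id` is supplied by the caller (in a Wasm Worker it could plausibly be derived from emscripten_wasm_worker_self_id()). Only a non-blocking try-acquire is shown; a blocking variant would have to spin or sleep when the attempt fails.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_OWNER UINT32_MAX /* assumed never to be a real thread ID */

/* Hypothetical recursive, ownership-checked lock built from one atomic word. */
typedef struct {
  _Atomic uint32_t owner; /* ID of the holding thread, or NO_OWNER */
  uint32_t depth;         /* nesting count; only touched by the owner */
} recursive_lock_t;

void recursive_lock_init(recursive_lock_t *lock)
{
  atomic_init(&lock->owner, NO_OWNER);
  lock->depth = 0;
}

/* Non-blocking acquire; self_id identifies the calling thread. */
bool recursive_lock_try_acquire(recursive_lock_t *lock, uint32_t self_id)
{
  assert(self_id != NO_OWNER);
  if (atomic_load_explicit(&lock->owner, memory_order_relaxed) == self_id) {
    lock->depth++; /* re-entry by the current owner */
    return true;
  }
  uint32_t expected = NO_OWNER;
  if (!atomic_compare_exchange_strong_explicit(&lock->owner, &expected, self_id,
                                               memory_order_acquire,
                                               memory_order_relaxed))
    return false; /* held by another thread */
  lock->depth = 1;
  return true;
}

/* Returns false (and changes nothing) if the caller does not own the lock. */
bool recursive_lock_release(recursive_lock_t *lock, uint32_t self_id)
{
  assert(self_id != NO_OWNER);
  if (atomic_load_explicit(&lock->owner, memory_order_relaxed) != self_id)
    return false;
  if (--lock->depth == 0)
    atomic_store_explicit(&lock->owner, NO_OWNER, memory_order_release);
  return true;
}
```

The extra owner word and nesting counter are the kind of bookkeeping that the plain emscripten_lock_* API leaves out, which is part of why it stays small.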
Pthreads have a fixed dependency on dynamic memory allocation, and perform calls to malloc and free to allocate thread specific data, stacks and TLS slots.
With the exception of the helper function emscripten_malloc_wasm_worker(), Wasm Workers are not dependent on a dynamic memory allocator. Memory allocation needs are met by the caller at Worker creation time, and can be statically placed if desired.
The disk size overhead from pthreads is on the order of a few hundred KBs. The Wasm Workers runtime, on the other hand, is optimized for tiny deployments, adding just a few hundred bytes on disk.
To further understand the different APIs available between Pthreads and Wasm Workers, refer to the following table.
Feature | Pthreads | Wasm Workers |
Thread termination | Thread calls pthread_exit(status) or main thread calls pthread_kill(code) | Worker cannot terminate itself, parent thread terminates by calling emscripten_terminate_wasm_worker(worker) |
Thread stack | Specify in pthread_attr_t structure. | Manage thread stack area explicitly with emscripten_create_wasm_worker_*_tls() functions, or automatically allocate stack with emscripten_malloc_wasm_worker() API. |
Thread Local Storage (TLS) | Supported transparently. | Supported either explicitly with emscripten_create_wasm_worker_*_tls() functions, or automatically via emscripten_malloc_wasm_worker() API. |
Thread ID | Creating a pthread obtains its ID. Call pthread_self() to acquire ID of calling thread. | Creating a Worker obtains its ID. Call emscripten_wasm_worker_self_id() to acquire ID of calling thread. |
High resolution timer | emscripten_get_now() | emscripten_performance_now() |
Synchronous blocking on main thread | Synchronization primitives internally fall back to busy spin loops. | Explicit spin vs sleep synchronization primitives. |
Futex API | emscripten_futex_wait, emscripten_futex_wake in emscripten/threading.h | emscripten_atomic_wait_u32, emscripten_atomic_wait_u64, emscripten_atomic_notify in emscripten/atomic.h |
Asynchronous futex wait | N/A | emscripten_atomic_wait_async(), emscripten_*_async_acquire(). However, these are a difficult footgun; read WebAssembly/threads issue #176. |
C/C++ Function Proxying | emscripten/threading.h API for proxying function calls to other threads. | Use emscripten_wasm_worker_post_function_*() API to message functions to other threads. These messages follow event queue semantics rather than proxy queue semantics. |
Build flags | Compile and link with -pthread | Compile and link with -sWASM_WORKERS |
Preprocessor directives | __EMSCRIPTEN_SHARED_MEMORY__=1 and __EMSCRIPTEN_PTHREADS__=1 are active | __EMSCRIPTEN_SHARED_MEMORY__=1 and __EMSCRIPTEN_WASM_WORKERS__=1 are active |
JS library directives | USE_PTHREADS and SHARED_MEMORY are active | USE_PTHREADS, SHARED_MEMORY and WASM_WORKER are active |
Atomics API | Supported, use any of __atomic_* API, __sync_* API or C++11 std::atomic API. | Supported (same APIs as Pthreads). |
Nonrecursive mutex | pthread_mutex_* | emscripten_lock_* |
Recursive mutex | pthread_mutex_* | N/A |
Semaphores | N/A | emscripten_semaphore_* |
Condition Variables | pthread_cond_* | emscripten_condvar_* |
Read-Write locks | pthread_rwlock_* | N/A |
Spinlocks | pthread_spin_* | emscripten_lock_busyspin* |
WebGL Offscreen Framebuffer | Supported with -sOFFSCREEN_FRAMEBUFFER | Not supported. |
The following build options are not supported at the moment with Wasm Workers:
-sSINGLE_FILE
Dynamic linking (-sLINKABLE, -sMAIN_MODULE, -sSIDE_MODULE)
-sPROXY_TO_WORKER
-sPROXY_TO_PTHREAD
See the directory test/wasm_worker/ for code examples on different Wasm Workers API functionality.