This article is for people who want to contribute to Emscripten. We welcome contributions from anyone who is interested in helping out!
Tip
The information will be less relevant if you’re just using Emscripten, but may still be of interest.
For contributing to core Emscripten code, such as emcc.py, you don’t need to build any binaries, as emcc.py is in Python and the core JS generation is in JavaScript. You do still need binaries for LLVM and Binaryen, which you can get using the emsdk:
emsdk install tot-upstream
emsdk activate tot-upstream
That gets a “tip-of-tree” build of the very latest binaries. You can use those binaries with a checkout of the core Emscripten repository, simply by calling emcc.py from that checkout, and it will use the binaries from the emsdk.
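As a rough illustration of calling emcc.py from a checkout, the sketch below builds and runs the command from Python. The checkout directory and the file names (emscripten, hello.c, hello.js) are hypothetical, and it assumes the emsdk environment has already been activated in the current shell so that emcc.py can find the LLVM and Binaryen binaries.

```python
import subprocess
import sys
from pathlib import Path


def emcc_command(checkout, source, output):
    """Build the argument list for running emcc.py from a source checkout."""
    emcc = Path(checkout) / "emcc.py"
    # Run emcc.py with the current Python interpreter rather than relying on
    # the file being marked executable.
    return [sys.executable, str(emcc), str(source), "-o", str(output)]


def compile_with_checkout(checkout, source, output):
    """Run emcc.py from the checkout and raise if compilation fails."""
    subprocess.run(emcc_command(checkout, source, output), check=True)


# Hypothetical paths, for illustration only; nothing is compiled here.
cmd = emcc_command("emscripten", "hello.c", "hello.js")
```

Calling compile_with_checkout with the same arguments would actually run the compiler, so it is left out of the sketch.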
If you do want to contribute to LLVM or Binaryen, or to test modifications to them, you can build them from source.
The Emscripten main repository is https://github.com/emscripten-core/emscripten.
Aside from the Emscripten repo, the other codebases of interest are LLVM and Binaryen, which Emscripten invokes and which have their own repos.
Patches should be submitted as pull requests in the normal way on GitHub.
When submitting patches, please:
- Search tests/*.py for related tests, as often the simplest thing is to add to an existing one. If you’re not sure how to test your code, feel free to ask for help.
One of the core developers will review a pull request before merging it. If several days pass without any comments on your PR, please comment in the PR, which will ping them. (If that happens, sorry! Sometimes things get missed.)
The Emscripten Compiler Frontend (emcc) is a Python script that manages the entire compilation process:
- emcc calls Clang to compile the code and wasm-ld to link it. It builds and integrates with the Emscripten system libraries, both the compiled ones and the ones implemented in JS.
- emcc then calls the Emscripten JS compiler (see src/compiler.js and related files), which emits the JS.
Emscripten has a comprehensive test suite, which covers virtually all Emscripten functionality. These tests are run on CI automatically when you create a pull request, and they should all pass. If you run into trouble with a test failure you can’t fix, please let the developers know.
If you find a regression, bisection is often the fastest way to figure out what went wrong. This is true not just for finding an actual regression in Emscripten but also if your project stopped working when you upgraded, and you need to investigate whether it’s an Emscripten regression or something else. The rest of this section covers bisection on Emscripten itself. It is hopefully useful both for people using Emscripten and for Emscripten developers.
If you have a large bisection range (for example, one that covers more than one version of Emscripten) then you probably have changes across multiple repos (Emscripten, LLVM, and Binaryen). In that case the easiest and fastest thing is to bisect using emsdk builds. Each step of the bisection will download a build produced by the Emscripten releases builders. Using this approach you don’t need to compile anything yourself, so it can be very fast!
To do this, you need a basic understanding of Emscripten’s release process. The key idea is that:
emsdk install [HASH]
can install an arbitrary build of Emscripten from any point in the past (assuming the build succeeded). Each build is identified by a hash (a long string of numbers and letters), which is the hash of a commit in the releases repo. The mapping of Emscripten release numbers to such hashes is tracked by emscripten-releases-tags.json in the emsdk repo.
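To make the version-to-hash mapping concrete, here is a minimal sketch of looking up a build hash from the parsed contents of emscripten-releases-tags.json. It assumes the file has a top-level "releases" object mapping version numbers to hashes and an "aliases" object for names such as "latest"; check the file in your emsdk checkout before relying on that layout. The function name and the sample data are made up for illustration.

```python
import json


def hash_for_version(tags, version):
    """Return the releases-repo hash recorded for an Emscripten version."""
    # Assumed layout: {"aliases": {...}, "releases": {"<version>": "<hash>"}}.
    aliases = tags.get("aliases", {})
    releases = tags.get("releases", {})
    # Resolve an alias such as "latest" to a concrete version number first.
    version = aliases.get(version, version)
    if version not in releases:
        raise KeyError(f"no build hash recorded for version {version!r}")
    return releases[version]


# Made-up sample data; real hashes are much longer and come from the file.
sample = json.loads("""
{
  "aliases": {"latest": "9.9.2"},
  "releases": {"9.9.2": "bbbb2222", "9.9.1": "aaaa1111"}
}
""")
good = hash_for_version(sample, "9.9.1")
bad = hash_for_version(sample, "latest")
```

In real use, the dictionary would be loaded with json.load from the file in the emsdk checkout, and the two resulting hashes would serve as the endpoints of the bisection.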
With that background, the bisection process would look like this:
1. Find the hashes to bisect between. You may already know them, for example if you use tot builds. If instead you only know Emscripten version numbers, use emscripten-releases-tags.json to find the hashes.
2. Using those hashes, do a normal git bisect on the emscripten-releases repo.
3. In each step of the bisection, download the build for the current commit hash (in the emscripten-releases repo that you are bisecting on) using emsdk install HASH. Then test your code and do git bisect good or git bisect bad accordingly, and keep bisecting until you find the first bad commit.
The first bad commit is a single change in the releases repo. That commit will generally update a single sub-repo (Emscripten, LLVM, or Binaryen) to add one or more new changes. Often that list will be very short or even a single commit, and you can see which actual commit caused the problem. When filing a bug, mentioning such a bisection result can greatly speed things up (even if that commit contains multiple changes).
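The per-step work described above (install the build, test, report good or bad) could be automated with git bisect run, which treats exit code 0 as good, 125 as “skip this commit”, and other codes from 1 to 127 as bad. The following is only a sketch under several assumptions: emsdk is on the PATH, a build can be activated by the same hash with emsdk activate, and the test command (the hypothetical ./run_my_test.sh) picks up the newly activated toolchain by itself.

```python
import subprocess

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_step(build_hash, test_cmd, run=subprocess.run):
    """Install one emsdk build, run the test command, and classify the build."""
    # A build only exists if it succeeded on the builders, so a failed
    # install is reported as "skip" rather than "bad".
    if run(["emsdk", "install", build_hash]).returncode != 0:
        return SKIP
    # Assumption: the build can be activated by the same hash.
    if run(["emsdk", "activate", build_hash]).returncode != 0:
        return SKIP
    return GOOD if run(test_cmd).returncode == 0 else BAD


# Dry run with a stand-in runner, so nothing is installed or executed here.
def fake_run(cmd):
    failed = cmd[0] == "./run_my_test.sh"  # pretend the project test fails
    return subprocess.CompletedProcess(cmd, returncode=1 if failed else 0)


verdict = bisect_step("abc123", ["./run_my_test.sh"], run=fake_run)
```

To use this for real, a small wrapper script would read the current commit of the emscripten-releases checkout (for example with git rev-parse HEAD), call bisect_step with the default runner, and pass the result to sys.exit so that git bisect run can act on it.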
If that commit contains multiple changes then you can optionally bisect further on the specific repo (as all the changes will normally be in just one of them, with the others kept fixed). Doing this will require rebuilding locally, which was not needed in the main bisection described in this section.