Developer’s Guide

This article provides information relevant to people who want to contribute to Emscripten. We welcome contributions from anyone who is interested in helping out!

Tip

The information will be less relevant if you’re just using Emscripten, but may still be of interest.

Setting up

For contributing to core Emscripten code, such as emcc.py, you don’t need to build any binaries as emcc.py is in Python, and the core JS generation is in JavaScript. You do still need binaries for LLVM and Binaryen, which you can get using the emsdk:

emsdk install tot-upstream
emsdk activate tot-upstream

That gets a “tip-of-tree” build of the very latest binaries. You can use those binaries with a checkout of the core Emscripten repository, simply by calling emcc.py from that checkout, and it will use the binaries from the emsdk.
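
As a rough illustration of the workflow above, the following sketch builds the command line for running emcc.py directly from a source checkout. The helper name `emcc_command`, the checkout directory `emscripten`, and the input file `hello.c` are hypothetical, and the sketch assumes the emsdk binaries installed above are already active in your environment.

```python
import sys
from pathlib import Path


def emcc_command(checkout, source, output):
    """Build an argv list that runs emcc.py from a local Emscripten checkout."""
    emcc = Path(checkout) / "emcc.py"
    # Run emcc.py with the current Python interpreter; -o names the output file.
    return [sys.executable, str(emcc), str(source), "-o", str(output)]


# Hypothetical paths, for illustration only.
cmd = emcc_command("emscripten", "hello.c", "hello.js")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to perform the actual compilation.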

If you do want to contribute to LLVM or Binaryen, or to test modifications to them, you can build them from source.

Repositories and branches of interest

The Emscripten main repository is https://github.com/emscripten-core/emscripten.

Aside from the Emscripten repo, the other codebases of interest are LLVM and Binaryen, which Emscripten invokes and which have their own repos.

Submitting patches

Patches should be submitted as pull requests in the normal way on GitHub.

Note

Together with your first patch, add yourself to the AUTHORS file. By doing so, you agree to license your code under the project’s open source licenses (MIT/LLVM).

When submitting patches, please:

  • Add an automatic test if you add any new functionality or fix a bug. Search in tests/*.py for related tests, as often the simplest thing is to add to an existing one. If you’re not sure how to test your code, feel free to ask for help.
  • Put a good description for the final commit in the PR description. We normally squash and merge PRs, which means the PR turns into a single commit on the target branch, and we’ll use that description when squashing. Because of the squash, it’s OK to have merge commits in the PR itself, as they get removed.

Code reviews

One of the core developers will review a pull request before merging it. If several days pass without any comments on your PR, please add a comment to the PR, which will ping them. (If that happens, sorry! Sometimes things get missed.)

Compiler overview

The Emscripten Compiler Frontend (emcc) is a Python script that manages the entire compilation process:

  • emcc calls Clang to compile C++ and wasm-ld to link it. It builds and integrates with the Emscripten system libraries, both the compiled ones and the ones implemented in JS.
  • emcc then calls emscripten.py which performs the final transformation to wasm (including invoking wasm-emscripten-finalize from Binaryen) and calls the JS compiler (see src/compiler.js and related files) which emits the JS.
  • If optimizing wasm, emcc will then call wasm-opt, run meta-dce, and do other useful things. It will also run JS optimizations on the JS that is emitted alongside the wasm.
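
To make the ordering of these steps easier to see, here is a small sketch that models them as data. It is only a summary of the list above, not how emcc is actually structured; the function name `pipeline_stages` and the stage descriptions are illustrative.

```python
def pipeline_stages(optimize):
    """Return the ordered (tool, role) stages described in the list above."""
    stages = [
        ("clang", "compile the sources"),
        ("wasm-ld", "link, including the system libraries"),
        ("emscripten.py", "final wasm transformation (wasm-emscripten-finalize)"),
        ("src/compiler.js", "emit the JS"),
    ]
    if optimize:
        # These steps only apply when optimizing.
        stages += [
            ("wasm-opt", "optimize the wasm"),
            ("meta-dce", "dead code elimination"),
            ("JS optimizations", "optimize the JS emitted alongside the wasm"),
        ]
    return stages


for tool, role in pipeline_stages(optimize=True):
    print(f"{tool}: {role}")
```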

Emscripten Test Suite

Emscripten has a comprehensive test suite, which covers virtually all Emscripten functionality. These tests are run on CI automatically when you create a pull request, and they should all pass. If you run into trouble with a test failure you can’t fix, please let the developers know.

Bisecting

If you find a regression, bisection is often the fastest way to figure out what went wrong. This is true not just for finding an actual regression in Emscripten but also if your project stopped working when you upgraded, and you need to investigate whether it’s an Emscripten regression or something else. The rest of this section covers bisection on Emscripten itself. It is hopefully useful both for people using Emscripten and for Emscripten developers.

If you have a large bisection range (for example, one that covers more than one version of Emscripten), then you probably have changes across multiple repos (Emscripten, LLVM, and Binaryen). In that case the easiest and fastest thing is to bisect using emsdk builds. Each step of the bisection will download a build produced by the emscripten-releases builders. Using this approach you don’t need to compile anything yourself, so it can be very fast!

To do this, you need a basic understanding of Emscripten’s release process. The key idea is that:

emsdk install [HASH]

can install an arbitrary build of emscripten from any point in the past (assuming the build succeeded). Each build is identified by a hash (a long string of numbers and characters), which is a hash of a commit in the releases repo. The mapping of Emscripten release numbers to such hashes is tracked by emscripten-releases-tags.txt in the emsdk repo.
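
As an illustration of the version-to-hash mapping, the sketch below looks up the hash for a given release number. It assumes the tags file is JSON with a top-level "releases" object mapping version strings to hashes; check the actual file in your emsdk checkout before relying on that layout. The helper name `hash_for_version` and the sample versions and hashes are made up.

```python
import json


def hash_for_version(tags_text, version):
    """Return the releases-repo hash recorded for an Emscripten version."""
    tags = json.loads(tags_text)
    # Assumed layout: {"releases": {"<version>": "<hash>", ...}}
    return tags["releases"][version]


# Made-up sample data; real entries are full commit hashes.
sample = '{"releases": {"9.9.1": "aaaa1111", "9.9.0": "bbbb2222"}}'
print(hash_for_version(sample, "9.9.0"))  # prints bbbb2222
```

In practice you would read the text from emscripten-releases-tags.txt in the emsdk repo instead of using the inline sample.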

With that background, the bisection process would look like this:

  1. Find the hashes to bisect between. You may already know them if you found the problem on tot builds. If instead you only know Emscripten version numbers, use emscripten-releases-tags.txt to find the hashes.
  2. Using those hashes, do a normal git bisect on the emscripten-releases repo.
  3. In each step of the bisection, download the binary build for the current commit hash (in the emscripten-releases repo that you are bisecting on) using emsdk install HASH. Then test your code and do git bisect good or git bisect bad accordingly, and keep bisecting until you find the first bad commit.
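
The manual steps above can also be automated with `git bisect run`, which marks a commit good when the given command exits with 0, bad for most other codes up to 127, and skips it for 125. The following is a minimal sketch of such a step script, not an official tool. It assumes `emsdk` is on your PATH, that it is run from the emscripten-releases checkout being bisected, and that the test command you pass in uses the freshly installed build. The names `bisect_exit_code` and `main` are illustrative.

```python
import shutil
import subprocess
import sys

SKIP = 125   # tells `git bisect run` to skip this commit
ABORT = 128  # exit codes above 127 abort the whole bisection


def bisect_exit_code(install_ok, test_ok):
    """Map one bisection step's outcome to a `git bisect run` exit code."""
    if not install_ok:
        return SKIP  # no usable build for this commit
    return 0 if test_ok else 1  # 0 marks the commit good, 1 marks it bad


def main(test_cmd):
    emsdk = shutil.which("emsdk")
    if emsdk is None:
        return ABORT  # a missing emsdk must not be mistaken for a bad commit
    # The commit currently checked out in the emscripten-releases repo.
    commit = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    install_ok = subprocess.run([emsdk, "install", commit]).returncode == 0
    # Only run the user's test when the build was installed successfully.
    test_ok = install_ok and subprocess.run(test_cmd).returncode == 0
    return bisect_exit_code(install_ok, test_ok)


# Only act when a test command was passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

Saved under a hypothetical name such as `bisect_step.py`, it could be used as `git bisect run python3 bisect_step.py ./my_test.sh`, where `./my_test.sh` stands for your own check. Skipping commits whose install fails reflects the caveat that only successful builds can be installed.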

The first bad commit is a single change in the releases repo. That commit will generally update a single sub-repo (Emscripten, LLVM, or Binaryen) to add one or more new changes. Often that list will be very short or even a single commit, and you can see which actual commit caused the problem. When filing a bug, mentioning such a bisection result can greatly speed things up (even if that commit contains multiple changes).

If that commit contains multiple changes then you can optionally bisect further on the specific repo (as all the changes will normally be in just one of them, with the others kept fixed). Doing this will require rebuilding locally, which was not needed in the main bisection described in this section.