What is the current state of more advanced Glimmer VM features?

Hello,

I’m working on an article comparing the latest rendering engines of React, Angular and Ember, so I’m trying to better understand the internals of the Glimmer VM. I came up with some questions about advanced features that were discussed in its early days. It seems as if they haven’t landed yet, and I was wondering about their current state.

I watched two talks about it from 2017: a talk by Tom Dale at ReactConf 2017 and a similar one by Yehuda Katz at an Ember San Francisco meetup in the same year. Both mention advanced features of the Glimmer VM which don’t seem to be used in Ember yet:

  1. Sending templates as bytecode over the wire. This was claimed to provide two benefits: 1. avoiding the cost of parsing JavaScript or JSON entirely and 2. a smaller bundle size.
  2. String interning. It’s only mentioned in Tom’s talk, if I recall correctly. It was claimed to reduce the bundle size for apps with a lot of templates, since a shared dictionary of static strings is used across them.
  3. Writing parts of the Glimmer VM in WebAssembly. Yehuda said that this would help prevent performance pitfalls that are easy to hit when writing JavaScript.

Here is a compiled template with the latest Ember:

Ember.HTMLBars.template(
  /*
    <p>Count: {{count}}</p>
    <button {{on "click" this.incrementCount}}>increment count</button>
   */
  {
    id: "7uhCW8v2",
    block:
      '{"symbols":[],"statements":[[10,"p"],[12],[2,"Count: "],[1,[34,0]],[13],[2,"\\n"],[11,"button"],[4,[38,1],["click",[32,0,["incrementCount"]]],null],[12],[2,"increment count"],[13]],"hasEval":false,"upvars":["count","on"]}',
    meta: { moduleName: "ember-hello-world/components/hello-world.hbs" },
  }
);

This template shows that bytecode is not used as the wire format. The template metadata is encoded as JSON in the block property. That JSON is lazily parsed when needed.
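To illustrate the lazy-parsing point, the factory above can be sketched roughly like this (a simplified sketch with invented names, not Ember’s actual implementation):

```javascript
// Simplified sketch of lazy wire-format parsing (invented names, not
// Ember's actual implementation): the JSON string in `block` is only run
// through JSON.parse() the first time the template is needed, not at
// module evaluation time.
function createTemplateFactory({ id, block, meta }) {
  let parsed = null;
  return {
    id,
    meta,
    get block() {
      // Defer the JSON.parse cost until first access, then cache it.
      if (parsed === null) {
        parsed = JSON.parse(block);
      }
      return parsed;
    },
  };
}

const factory = createTemplateFactory({
  id: "7uhCW8v2",
  block: '{"symbols":[],"statements":[],"hasEval":false,"upvars":[]}',
  meta: { moduleName: "example.hbs" },
});

// No parsing happens at module load; it occurs on first access.
console.log(factory.block.hasEval); // false
```

The upshot is that loading a bundle full of templates only pays the cost of evaluating small object literals; the heavier JSON parsing is spread out over first renders.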

String interning also doesn’t seem to be used. There isn’t even a separation between instructions and static strings in the current wire format.
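For context, a wire format with string interning would look something like the following (a purely hypothetical illustration, not Ember’s format): static strings live once in a shared pool and templates refer to them by index.

```javascript
// Hypothetical illustration of string interning (not Ember's actual wire
// format): repeated static strings are stored once in a shared constants
// pool, and each template references them by index instead of inlining them.
const constants = ["Count: ", "increment count", "click"];

// Two templates sharing the same pool entries.
const templateA = { statements: [["text", 0], ["text", 1]] };
const templateB = { statements: [["listener", 2], ["text", 0]] };

function resolveString(index) {
  return constants[index];
}

// "Count: " is stored once in the pool but used by both templates.
console.log(resolveString(templateA.statements[0][1])); // "Count: "
console.log(resolveString(templateB.statements[1][1])); // "Count: "
```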

A new Ember app also doesn’t use any WebAssembly, as far as I can tell.

What is the current state of these three features? Are they still planned? Or did you figure out that they come with trade-offs that aren’t acceptable? Maybe they are even usable through a feature flag, or in the latest standalone Glimmer library?

I don’t want to put any pressure on you. I totally understand that it’s complex to land such changes in an existing ecosystem. I’m just wondering whether I should discuss these features as something still planned, or better not mention them in the article.

Best,
Jeldrik


Hey Jeldrik,

Thank you for the thoughtful and detailed question! I want to start off by saying that most of the ideas and features you’ve asked about are definitely still in the works. We’ve learned some important things in the years since those talks were given:

  1. A proof of concept was built that used Ahead-of-Time (AoT) template compilation to send Glimmer’s bytecode directly over the wire. This proof of concept did show us that transferring the bytecode directly had the benefits we expected in terms of lower parse cost and smaller bundle size. However, this only worked as long as we could resolve everything about components at compile time, which wasn’t possible in Ember at the time, so there was no clear path to bringing this improvement to Ember.

    In fact, we were moving toward Module Unification (MU) specifically to make Ember’s resolution system more static so that ahead-of-time static analysis could be done. But even with that extra static-ness, it was a major lift to resolve everything correctly at compile time, without running the app, and it wasn’t clear how much would have to change in Ember apps to make it possible. This was actually one of the many reasons why MU ended up being discarded in favor of Template Imports.

  2. It was also discovered that the custom file format we intended to use, .gbx, was not going to work as well as we thought. It had issues with CDNs, since they didn’t want to host generic binary files with unknown formats (go figure :stuck_out_tongue:).

  3. A spike was done to use Wasm for certain parts of the VM (in fact, this is the genesis of the low-level VM in the Glimmer codebase). This ran into a few issues, including that there were many instances where a construct would be created in Wasm but passed out to JS. Until WeakRefs arrived recently, there was no way to free the corresponding memory in Wasm once the value was no longer referenced in JS and had been garbage collected.

With these learnings in mind, we also realized that a large part of Ember’s excess cost at the time had nothing to do with the over-the-wire transfer cost. JSON parsing is really fast overall, since JSON is a less complex language than JavaScript and uses a different parser, and the parsing can happen off the main thread (see this video for more details), so the current wire-format setup was pretty efficient as is. And new compression methods like Brotli ended up being difficult to beat with techniques like string and symbol interning. Deduplication definitely helps, but not as much as you might think.

By contrast, there were a number of features that were expensive no matter what we did to make load times better, including:

  • The legacy mixin-based object model provided by EmberObject, which cost a large amount in terms of byte size and was really costly to boot up in the first place.
  • The chains-based state model with computed properties, which created a large amount of overhead up front for things that might never even change, and that were not costly when they did change.
  • Classic components, which have a lot of really dynamic capabilities that add overhead for every single component invocation.

This is a large part of why we pushed to replace these features in Ember Octane, alongside the DX benefits of their replacements.

Now that Octane has landed and its features can be absorbed by the community, we’re focusing on the replacement for MU: Template Imports. Once we land those changes, it should be possible to begin exploring transferring bytecode directly over the wire again, since we’ll have a way to link up templates more directly and we no longer need the full static analysis that AoT mode used to require. We also plan to address the custom file format problem by using Wasm modules to transfer the bytecode, since those are necessarily supported by CDNs.

This is also how we plan to begin introducing Wasm to Glimmer and Ember overall. While it’s definitely likely that we’ll continue to experiment with Wasm for other parts of the VM now that WeakRefs are available, the most natural place to start is templates, since they are already a lot of binary data and only need to be read once. There are no worries about overhead from transitioning between JS and Wasm, and no worries about conceptual complexity, so it’s the perfect starting point!

So, the TL;DR is: We’ve been continuing to explore these concepts and features as we implemented Octane, and are getting closer to them now that Template Imports and Strict Mode are being worked on actively!


Thanks a lot, Chris, for your response! It was very helpful.

If it’s okay with you, I’d like to ask one follow-up question to make sure that an assumption about tree shaking is correct.

The Angular team has worked hard to make sure that their rendering engine is tree-shakeable. As part of Ivy, they changed the format to which templates are compiled. Angular compiles a template to a function consisting of a list of instructions. Since Ivy, each instruction is its own function and is called explicitly in the template function. It looks like this:

import * as ng from '@angular/core';

function HelloWorldComponent_Template(rf, ctx) {
  if (rf & 1) {
    ng.ɵɵelementStart(0, "div");                                     // <div>
    ng.ɵɵtext(1);                                                    //   Count: {{count}}
    ng.ɵɵelementEnd();                                               // </div>
    ng.ɵɵelementStart(2, "button", 0);                               // <button
    ng.ɵɵlistener("click", function() { return ctx.increment(); });  //  (click)="increment()">
    ng.ɵɵtext(3, "Increment");                                       //   Increment
    ng.ɵɵelementEnd();                                               // </button>
  }
  if (rf & 2) {
    ng.ɵɵadvance(1);                                                 // <div>
    ng.ɵɵtextInterpolate1("Count: ", ctx.count, "");                 //   Count: {{count}}
  }
}

This allows a bundler like Webpack or Rollup to tree shake all instructions from the rendering engine, which aren’t used by the application.

The Glimmer VM is not tree-shakeable due to its architecture. The instructions are represented as opcodes, which are interpreted at run time, so there are no explicit references to the Glimmer VM instructions that are actually used. This seems more similar to Angular before Ivy than after.

Am I correct in assuming that Glimmer explicitly takes the trade-off that the Glimmer VM is not tree-shakeable? I think this trade-off is acceptable because the list of built-in instructions in the Glimmer VM is very small. It’s very likely that all real-world applications use all existing instructions, so there isn’t much room for tree shaking at all. Is this correct?

You’re absolutely correct that the Glimmer VM is not tree-shakeable via standard build steps using JavaScript bundlers. Since it converts templates into an interpreted bytecode, JS bundlers aren’t able to understand which instructions are used and which are not, so they cannot tree-shake the same way they would for standard JS code. I also think that, as you said, the list of opcodes is overall very small, and most real-world apps will end up using all of them.
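To make the point concrete, here is a simplified sketch (with invented opcodes, not Glimmer’s actual ones) of why an opcode interpreter defeats tree shaking: every handler is reachable through a dispatch table indexed by runtime data, so a bundler has to keep all of them.

```javascript
// Simplified sketch (invented opcodes, not Glimmer's actual ones) of an
// opcode interpreter. All handlers are referenced through the dispatch
// table, so a bundler must retain every one of them, even if the compiled
// templates on a given page never emit some of the opcodes.
const OPCODES = {
  0: (vm, tag) => { vm.out.push(`<${tag}>`); vm.stack.push(tag); }, // openElement
  1: (vm) => { vm.out.push(`</${vm.stack.pop()}>`); },              // closeElement
  2: (vm, text) => { vm.out.push(text); },                          // text
};

function evaluate(vm, program) {
  for (const [op, ...operands] of program) {
    // The opcode is plain data read at runtime; static analysis cannot
    // tell which entries of OPCODES will ever be looked up here.
    OPCODES[op](vm, ...operands);
  }
}

const vm = { out: [], stack: [] };
evaluate(vm, [[0, "p"], [2, "Count: 1"], [1]]);
console.log(vm.out.join("")); // "<p>Count: 1</p>"
```

Compare this with the Ivy output above, where `ng.ɵɵelementStart` and friends are plain imported functions that a bundler can follow and drop when unused.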

There is one exception here, and that is deprecated functionality. Over time, features may be removed from a rendering engine and replaced with more streamlined, minimal features. For instance, Classic components actually require quite a few more capabilities than Glimmer components, and those capabilities exist even if you are writing a new Octane app with no Classic components at all.

This is what the component manager’s capabilities feature is all about. As part of the conversion from the wire format to the bytecode, the capabilities are loaded for the component, and based on those, instructions for the enabled capabilities are emitted. If a capability is not enabled, the opcode is not added. This reduces the runtime cost of the component, so it is only impacted the very first time it is loaded. It does not reduce the cost of shipping those instructions over the wire, though.
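A rough sketch of capability-driven opcode emission might look like this (all names here are invented for illustration; this is not the actual Glimmer compiler):

```javascript
// Hypothetical sketch of capability-driven opcode emission (invented names,
// not the actual Glimmer compiler): opcodes for a capability are only
// emitted when the component's manager declares that capability.
function compileInvocation(capabilities, emit) {
  emit("createComponent");
  if (capabilities.attributeHook) {
    // Only components that opt into attribute handling (e.g. Classic
    // components) pay for this opcode at runtime.
    emit("didCreateElement");
  }
  if (capabilities.updateHook) {
    emit("registerUpdateHook");
  }
  emit("invokeComponent");
}

// A minimal, Glimmer-style component skips the extra opcodes entirely.
const glimmerOps = [];
compileInvocation({ attributeHook: false, updateHook: false }, (op) => glimmerOps.push(op));
console.log(glimmerOps); // ["createComponent", "invokeComponent"]

// A Classic-style component with more capabilities gets more opcodes.
const classicOps = [];
compileInvocation({ attributeHook: true, updateHook: true }, (op) => classicOps.push(op));
console.log(classicOps.length); // 4
```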

The idea at the moment is that we’ll reduce the Intermediate Representation (IR) of the current wire format down to a minimal bytecode, and have a very thin layer in Wasm that receives the capabilities and expands some minimal opcodes based on them, which should be much faster than the compilation process today. This will all still be done at load time in the browser, though, so we won’t be able to tree-shake any capabilities’ instructions. However, in the future we do want to explore using static analysis to figure out what the capabilities of components are at compile time, and if that works out I could also imagine being able to exclude unused opcodes.

All that to say, this is firmly in the “maybe it could work and we definitely want to explore it” space, not the “we definitely plan on shipping something like this” space. So, for your article, I would say that we are accepting the trade-off as you outlined it, and unless future R&D pays off, I think that’s probably what will happen. And, like you pointed out, the trade-off isn’t all that huge, because the opcodes are minimal and there aren’t many capabilities anyway (and most are likely to be deprecated and removed eventually).
