ES2015 Generators

Recently I had the opportunity to re-write the content tree control that we use to manage content nodes in www.enotes.com. We’ve all worked with the DOM, which represents HTML nodes in a tree structure but has some challenging deficiencies and a relatively grumpy API, the tortures of which prompted me to take a stab at a smoother tree-like design. During this process I experimented with several approaches to managing nodes in a tree structure, the first of which was a “flat hierarchy” which is as obtuse as it sounds and didn’t get much traction. I then opted for the more traditional parent/child approach, but still wanted a way to treat an entire tree of nodes in a “flat” manner. The ES2015 generator function was my solution.

Groot

I affectionately named my new tree control “Groot”. It consists of a small number of files, the most important of which are groot.js and groot-leaf.js.

├── groot-event-args.js
├── groot-leaf.hbs
├── groot-leaf.js
├── groot-menu.hbs
├── groot-menu.js
├── groot-templates.js
├── groot-tree.hbs
└── groot.js

The groot module contains a tree of data–a hierarchy of groot-leaf instances. It listens for DOM events and manipulates its inner tree accordingly. A “tree” is merely a root groot-leaf instance with children, who also have children, and so on. So every node in the tree is a “leaf”. The API for each leaf is large but not conceptually complex, and may be divided into the following categories:

  • modifying this leaf’s relationship to other leafs
    • moveBefore(leaf)
    • moveAfter(leaf)
    • makeParentOf(leaf), etc.
  • getting leafs related to this leaf
    • getParent()
    • getChildren()
    • hasAnyActiveChildren(deep=false)
    • hasAnyInactiveChildren(deep=false)
    • getSiblings()
    • traverse(deep=false)
    • get(position)
    • find(id, throwOnMissing=true)
    • findByAttributes(attributes={}, throwOnMissing=true), etc.
  • querying this leaf’s relationship to other leafs
    • isParentOf(leaf)
    • isChildOf(leaf)
    • isSibling(leaf), etc.
  • how this leaf represents itself
    • expand(deep=false)
    • collapse(deep=false)
    • commitLabelChange(label), etc.
  • utility methods
    • equals(leaf)
    • length()
    • toString(), etc.

Each leaf also contains two important properties:

  • parent
  • leafs=[] (children)

For many of a leaf’s operations, it is concerned only with either its immediate parent or children. But for some operations, such as expand(), find(), hasAnyActiveChildren(), etc., recursion is necessary to locate and affect all of a leaf’s descendants.

Common accumulation and why it sucks

Suppose that we implemented the traverse() method using a common accumulation pattern.

{
    traverse: function (deep = true) {
        if (!deep) {
            // shallow copy
            return this.leafs.slice();
        }
        // shallow copy
        const results = [];
        this.leafs.forEach((leaf) => {
            results.push(leaf);
            if (leaf.hasChildren()) {
                results = results.concat(leaf.find(deep));
            }
        });
        return results;
    }
}

In this scenario, we are adding each leaf to an array of leafs and then appending its children in order, who will also append their children in order, until all leafs “below” this leaf are members of the results array, and are then returned. We could pass the results array by reference to traverse() to eliminate the overhead of concatenation, but the idea is still the same–results are built up as an array in memory and then returned. We could then loop over this array to modify all nodes within a leaf.

The find() method would look similar to this pattern–start with the immediate children and search “downwards” to find a given leaf by id, iterating over each child and bubbling up some result.

While both of these approaches are valid, there is a cooler and better way to perform this work, and that involves the use of ES2015 generators.

Generating instead of accumulating

A generator is an object returned from a special kind of function (which we will examine shortly) that has an API that shares similarities with the JavaScript iterator protocol–that is to say, it has a next() method that will fetch a result within a sequence.

Using a generator is pretty straight-forward. For example, the hasAnyActiveChildren() method uses a generator to iterate recursively over a leaf’s children to find any that are active.

{
    hasAnyActiveChildren: function (deep = false) {
        // #1
        const generator = this.traverse(deep);
        // #2
        let result = generator.next();
        // #3
        while (!result.done) {
            // #4
            const leaf = result.value;
            if (leaf.isActive) {
                return true;
            }
            // #5
            result = generator.next();
        }
        return false;
    }
}

In the code sample above:

  1. The generator is created by calling this.traverse(deep). This is a special kind of method that is responsible for locating the leafs that the generator will deliver when its next() method is called.
  2. Each successive leaf within the generator is fetched by calling next(). The result obtained from next() is not actually a leaf, but a results object with two important properties:
  3. done, which indicates whether the generator has finished iterating over its leafs, and
  4. value, the next leaf in the sequence.
  5. The next result in the sequence is obtained by calling next() again. The loop is repeated until all leafs in the sequence have been exhausted.

In both the accumulation example and this generator example, a sequence of leafs is being iterated over for some purpose. So what makes generation superior to accumulation, then, if both have the same purpose? The answer is: the generator function.

Is that a typo?

A generator function is responsible for creating a generator and supplying it with the items over which it iterates. In a Groot leaf, the traverse() method is a generator function, and is declared:

{
    traverse: function* (deep = false) { /*...*/ }
}

No, that asterisk is not a typo. That is, in fact, what distinguishes this function as a generator function. It tells the babel transpiler that, even though this function will appear to return leaf objects, babel should re-write it in such a way that it returns a generator instead, and that generator then iterates over leaf objects as traverse() supplies them. This is a fundamental difference between an accumulation function and a generator function: a generator function will supply values as they are requested, whereas an accumulation function collects all values and returns them at once.

The body of traverse() contains some interesting logic:

{
    traverse: function* (deep = false) {
        // #1
        for (let i = 0; i < this.length(); i += 1) {
            const leaf = this.get(i);
            // #2
            yield leaf;
            // #3
            if (!deep || leaf.length() === 0) {
                continue;
            }
            // #4
            const generator = leaf.traverse(true);
            let result = generator.next();
            while (!result.done) {
                const childLeaf = result.value;
                // #5
                yield childLeaf;
                result = generator.next();
            }
        }
    }
}
  1. First, a standard for loop iterates over the leaf’s immediate children. This is no surprise until we see…
  2. A yield statement. The first child leaf is fetched then yielded up to the generator object. This happens when the next() method is invoked on the generator. The yielded leaf will be the value of the generated result object. When the generator’s next() method is invoked again, the generator function will resume where it left off.
  3. Once resumed, the generator function will evaluate whether or not it should go deep. If not, it will simply continue looping through the leaf’s immediate children, yielding up each in turn to the generator object. This is simple enough, but the real power of this generator is in recursion, so we assume that deep === true and move on.
  4. Since we want to recurse through the child leaf’s children, we ask the child leaf for its own generator by calling leaf.traverse(true). We then iterate over this generator’s results…
  5. …yielding up each generated leaf as a result to the enclosing generator. The net effect is that all leafs will eventually be yielded up, in sequence, to the topmost generator, allowing calling code to “walk” the tree in order without accumulation.

But wait, there’s more!

Generators can also consume data passed to the next() method, which can be assigned to a variable through the yield expression. Consider an example where we want to progressively hash some accumulated input, to see how the hash changes over time as the input changes (maybe we are capturing a keydown event for an input field and show the changing hash as the user types). We can implement the following generator function.

// node.js crypto
const crypto = require('crypto');

const progressiveHasher = function* (value) {
    console.log('HASHING', value);
    let hash = crypto.createHash('sha256');
    hash.update(value);
    moar = yield hash.digest('hex');
    while (true) {
        value += moar;
        console.log('HASHING', value);
        hash = crypto.createHash('sha256');
        hash.update(value);
        moar = yield hash.digest('hex');
    }
}

The progressiveHasher function accepts an initial input value and then yields up its hash. Each time the next() method is called on the returned generator, a bit of additional input will be supplied, and then the recomputed hash for the accumulated input will be yielded up again.

// user gives some initial input
let input = "Nich";
let hasher = progressiveHasher(input);
// first "next()" forces the generator to start
console.log('>>', input, hasher.next());
// user adds some additional input
input = "olas ";
console.log('>>', input, hasher.next(input));
input = "Clo";
console.log('>>', input, hasher.next(input));
input = "ud";
console.log('>>', input, hasher.next(input));

The console output shows the progressive hashing that is performed.

HASHING Nich
>> Nich { value: '8a5e83d26360937f69b55871f6b106736f80eb0e2fe64f2ceba5cdafbed6c3c9',
  done: false }
HASHING Nicholas
>> olas  { value: 'bd5d2cc47e69b718bc22523096a4c5c62e6068f9d7046e4a19332c38997bb9a3',
  done: false }
HASHING Nicholas Clo
>> Clo { value: '7cbc7d03a684091b972a8771c232d3b2b3bad14a903ba472eb4859e25ead9353',
  done: false }
HASHING Nicholas Cloud
>> ud { value: '47bf74db3917b66855d5b4cb2b76c7548256ecbcd429519a01f8217fe8644568',
  done: false }

Note that the result returned from each invocation of next() is never done. This generator operates an internal while loop, and so will operate in perpetuity. However, if the generator’s return() function is called, it will run until the next yield and then terminate. Successive calls to next() will return undefined values. Changing the second occurrence of next() to return() reveals that the generator does, in fact, give up the ghost.

// setting up the generator...
input = "olas ";
console.log('>>', input, hasher.return(input));
// more calls to next() ...
HASHING Nich
>> Nich { value: '8a5e83d26360937f69b55871f6b106736f80eb0e2fe64f2ceba5cdafbed6c3c9',
  done: false }
>> olas  { value: 'olas ', done: true }
>> Clo { value: undefined, done: true }
>> ud { value: undefined, done: true }

Cash me owside how bow dat

You can also use destructuring shorthand to acquire values from a generator if you know how many values you want ahead of time. The following generator function will take a starting year, increment it by some value, up to a stopping year (exclusive). Each incremented year will be yielded as a generator value. Destructuring is used to capture the first three years in the potential sequence.

const theFuture = function* (startYear, stopYear, increment) {
    while (startYear < stopYear) {
        yield startYear += increment;
    }
};

const [year1, year2, year3] = theFuture(2017, 2200, 10);
console.log(year1);
console.log(year2);
console.log(year3);

Because the generator object is never assigned to a variable in this example, it cannot be used further. If theFuture() is invoked again it will create a brand new generator object, so for destructuring to be helpful the generator function must produce some known, useful quantity.

Resources

There is a lot more to learn about generators that I don’t cover it here. For further reading I recommend the following resources:

  1. Exploring ES6, Chapter 22 - Generators
  2. function* - MDN
  3. Generator - MDN