Why is PhantomJS disconnecting?

For a few months now, we’ve been having problems with our tests aborting in CI. On Ember-CLI 1.13.13, we got the error

not ok 130 PhantomJS - Browser “phantomjs /home/travis/build/fastly/Tango/node_modules/ember-cli/node_modules/testem/assets/phantom.js http://localhost:7357/9353” exited unexpectedly.

That message comes from Testem.

It would happen fairly erratically, but once it happened on a build, it would likely recur at least a few times. Here are some of our hypotheses and what we did to try to fix the problem:

Bad version of PhantomJS

This StackOverflow answer suggests it’s a bad build of PhantomJS. We upgraded to a release version of PhantomJS 2.0.0 in dev and CI using this PPA. That didn’t fix the problem.

Out of memory

Since the error only happened on some builds and usually fairly late into the build, we thought perhaps the PhantomJS process was crashing because it was running out of memory.

We upgraded our CI infrastructure to machines with more memory. We split the test run into a single build step followed by four separately filtered test runs:

- ember build --environment=test
- ember test --path=dist --filter="JSHint"
- ember test --path=dist --filter="JSCS"
- ember test --path=dist --filter="Unit: "
- ember test --path=dist --filter="Acceptance: "
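For anyone who wants to try the same split, here is a minimal sketch of how those steps could be chained in one script. The function name `run_split_suites` and the `EMBER` override variable are hypothetical, and stopping at the first failing step is an assumption rather than a description of our actual CI setup.

```shell
# Build once, then run each filtered suite against the prebuilt dist/ directory.
# EMBER is a hypothetical override so the script can be dry-run (EMBER=echo).
run_split_suites() {
  ember_cmd="${EMBER:-ember}"

  # Single build shared by all test runs.
  "$ember_cmd" build --environment=test || return 1

  # Same filters as in the list above; the trailing spaces are intentional.
  for filter in "JSHint" "JSCS" "Unit: " "Acceptance: "; do
    echo "== suite: $filter"
    "$ember_cmd" test --path=dist --filter="$filter" || return 1
  done
}
```

Calling `EMBER=echo run_split_suites` prints the commands without running them, which is a quick way to check the filters before handing the script to CI.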

I went through the build in Chrome many times looking for memory leaks and eliminated all I could find. (Other than upgrading Mirage to take advantage of this cleanup, all were jQuery event handlers.)

I also ran the test suite on the CI machines and SSH’d into them while the tests were running. I ran free -m occasionally to find out how much memory was left. Over the course of the suite, it went from 7000MB to 4500MB – definitely some memory leaking, but not nearly enough to crash the process.
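Rather than running free -m by hand, the sampling could be scripted. This is only a sketch: the helper names `mem_free_mb` and `log_memory` and the 30-second default interval are made up for illustration, and it assumes the number of interest is the "free" column of the "Mem:" row.

```shell
# Read `free -m` output on stdin and print the "free" column of the Mem: row.
# Columns on that row are: label, total, used, free, ...
mem_free_mb() {
  awk '/^Mem:/ { print $4 }'
}

# Print a timestamped sample every N seconds (default 30) until killed.
log_memory() {
  interval="${1:-30}"
  while :; do
    printf '%s free=%sMB\n' "$(date +%H:%M:%S)" "$(free -m | mem_free_mb)"
    sleep "$interval"
  done
}
```

Starting `log_memory` in the background before the test run and killing it afterwards would leave a memory trace in the build log next to the test output.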

Upgrade Ember-CLI

I noticed that more recent versions of Testem had a slightly more informative error message, so I upgraded to Ember-CLI 2.3.0-beta.1. The upgrade went fine, but didn’t help. Now we get

not ok 130 PhantomJS - Browser “phantomjs /home/travis/build/fastly/Tango/node_modules/ember-cli/node_modules/testem/assets/phantom.js http://localhost:7357/9353” disconnected unexpectedly.

from this line.

Print STDOUT, STDERR

I tried monkey-patching Testem to print STDOUT and STDERR when it exits, but they were empty.

Summary

I’m at a loss. I can’t think of anything that would cause the socket to disconnect that wouldn’t also crash the browser. There’s no code that navigates away from the page or closes the window. The failures happen only sporadically – always on an acceptance test, but on many different ones. If anyone has any suggestions, please let me know.

I had the same disconnection problem. Downgrading testem to version 0.9.11 solves the problem for now.

How did you downgrade? Did you have to fork Ember-CLI? Or is there some setting you can use to tell Ember-CLI to use another version?

I wouldn’t have thought it was related, but since @rwjblue mentioned it in Slack, I will. When I tried switching the test runner to Chrome, the tests seemed to boot, but never got any output back from the browser, so they just timed out.

Go to node_modules/ember-cli/ and run npm install testem@0.9.11.

If you can create reproducible steps with a vanilla ember app, then let the dev know (see testem/testem pull request #764, “Fixes crash on missing items in dev mode”, by johanneswuerbach).

Testem maintainer here :wave:

I tried the same, but sadly phantomjs isn’t logging anything so I skipped the logging for now. If somebody knows how to get more details from phantom before it crashes I’m happy to change that in testem. PhantomJS debugging is currently rough.

The lost socket connection looks more “promising”, as test runs usually put real stress on the browser and socket.io might have some timeouts that are too small. The previous testem version was somehow hiding this by always reconnecting and restarting your tests, which looks better in CI, but can be super expensive with bigger test suites.

I’ll look into the timeout issue as soon as time permits.

Testem issue, please add any additional details here: https://github.com/testem/testem/issues/777