Crawlable Ember Apps

blatyo · March 17, 2013, 1:32pm

I was wondering what approaches people were taking to make Ember apps crawlable by search engines. I know discuss is supposed to be crawlable, but looking through the source, it wasn’t obvious to me what they did to make it crawlable. I’d be interested in hearing about discuss’s approach and any others.

Wildhoney · March 17, 2013, 2:53pm

I don’t have the answer, but I thought I’d add my two-penneth. I’ve not looked into it too far, because I only use Ember for one public-facing website; all the others are private applications that require a login.

I remember reading a while ago that Google executes JavaScript if the JavaScript is all on one page (inline JavaScript), so that Googlebot doesn’t have to go and grab external resources to run it. The experiment I was reading about used semi-complex JavaScript, but I am not at all sure how Google handles Handlebars. I’d be very interested to see if there are any experiments on whether Ember gets executed when it’s inline. That doesn’t mean we have to write all of our Ember code as inline JavaScript, since we could use a Ruby/PHP script to inject it into the index file.

However, I’ve created an application in the past where I had limited time to concern myself with a non-JavaScript version and simply wanted it to be crawlable. For that I used PhantomJS to take the URLs (where they don’t include a hash) and render the JavaScript into a static HTML file, which would then be cached for a predetermined duration before being generated again upon request. If a genuine user visited the non-hashed URL (such as from Google), I would add the hash back into the URL and forward the user to the Ember version.
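A minimal sketch of the caching and redirect logic described above, not the poster's actual code. The names `createSnapshotCache`, `render`, and `toHashUrl` are hypothetical; `render` stands in for whatever produces the HTML (a PhantomJS run in the post), and the `/#` prefix assumes Ember's default hash-based URLs.

```javascript
// Sketch only: `render(path)` is assumed to return the static HTML for a path.
// `now` is injectable so expiry can be tested without waiting.
function createSnapshotCache(render, ttlMs, now) {
  now = now || Date.now;
  var entries = Object.create(null); // path -> { html, expires }

  return function getSnapshot(path) {
    var entry = entries[path];
    if (entry && entry.expires > now()) {
      return entry.html; // still fresh: reuse the cached snapshot
    }
    var html = render(path); // missing or expired: regenerate on request
    entries[path] = { html: html, expires: now() + ttlMs };
    return html;
  };
}

// Map a crawlable, hash-free path back to the URL the Ember app expects,
// so a genuine visitor can be forwarded to the JavaScript version.
function toHashUrl(path) {
  return '/#' + path;
}
```

Crawlers would be served `getSnapshot(path)`, while a normal browser would be redirected to `toHashUrl(path)`.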

With all that said, I’m far from an expert when it comes to this, and hopefully others with more experience can add their wisdom.

system · March 17, 2013, 2:53pm

As @sam said in this post, Discourse uses a noscript tag to make the content of the posts/topics available to crawlers.

i_am_brennan · March 17, 2013, 6:42pm

Not that anyone outside of the developer community cares, but this is really a Google problem. I mean, it’s our responsibility to make sure our apps are crawlable, but… I think they realize that and are working on some secret new crawling techniques.

Do a search for pangratz’s emberjs dashboard and you’ll see recent results previewed in the search results.

Also search google for this form, the previews show the emberjs app and not the noscript version: http://cl.ly/image/1H2E0M131w1j

seilund · March 17, 2013, 7:26pm

You need to make your back-end support every URL that your Ember.Router does. Do a “view source” on different pages of discuss.emberjs.com, and you’ll see that each page serves static HTML that matches completely with the content you see on your screen. This HTML needs to have real links to the next pages.

Take the source code of http://discuss.emberjs.com/ for example. It contains HTML like this:

```
...
<div class="topic-list">
<a href="/t/welcome-to-the-ember-js-discussion-forum/185">Welcome to the Ember.js Discussion Forum</a> <span title='posts'>(4)</span><br/>
<a href="/t/how-to-add-child-in-ember-data/470">How to add child in ember-data?</a> <span title='posts'>(2)</span><br/>
<a href="/t/todomvc-based-getting-started-guide/433">TodoMVC-based Getting Started Guide</a> <span title='posts'>(2)</span><br/>
...
```

If you check the source for Welcome to the Ember.js Discussion Forum, you can see that it also contains the actual post content, author name, and everything else.

```
...
<div class='creator'>
  #1 By: <b>Tom Dale</b>, March 11th, 2013 12:23
</div>
<div class='post'>
  <p>Welcome to the Ember.js discussion forum.</p>

  <p>We're running on <a href="http://www.discourse.org/" rel="nofollow">the open source, Ember.js-powered Discourse forum software</a>. They are also providing the hosting for us. Thanks guys!</p>
...
```

All this HTML should be wrapped in a <noscript>...</noscript> tag, so normal users with modern browsers only see the JavaScript-generated content. Search engines that don’t run your JavaScript will read what’s inside the noscript tag, which is how they are able to crawl your site.
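A rough sketch of how a back end might emit such a block, assuming a Node-style JavaScript server. The function names and the shape of the `topics` array (`url`, `title`, `posts`) are made up for illustration; this is not how Discourse itself generates its markup.

```javascript
// Hypothetical helper: escape text so titles and URLs cannot break the markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render a topic list as plain links wrapped in <noscript>, mirroring the
// structure quoted above. Each topic is assumed to be { url, title, posts }.
function renderNoscriptTopicList(topics) {
  var links = topics.map(function (topic) {
    return '<a href="' + escapeHtml(topic.url) + '">' +
      escapeHtml(topic.title) + '</a>' +
      " <span title='posts'>(" + Number(topic.posts) + ')</span><br/>';
  });
  return '<noscript>\n<div class="topic-list">\n' +
    links.join('\n') +
    '\n</div>\n</noscript>';
}
```

The important part is that every link is a real `href` the server can answer with its own static HTML, so a crawler can follow it.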

stusalsbury · April 17, 2013, 6:45pm

My plan, unless I figure out something better:

1. Use PhantomJS to crawl my site for me and produce static files; and
2. In a Spring servlet, use the Robots database to serve the static files to known bots instead of the AJAX site.

pjscrape looks like it might come in handy.

Aside from this being an awful lot of work to do for SEO, does this sound possible? Obviously it would be automated.
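The bot-detection half of that plan could look roughly like the sketch below. It is written in JavaScript to match the rest of the thread rather than as a Spring servlet, and the short pattern list is only a stand-in for a real robots database. All function names and the `snapshots/` directory layout are assumptions.

```javascript
// Assumption: a small hand-picked list instead of a full robots database.
var BOT_PATTERNS = [/googlebot/i, /bingbot/i, /yandex/i, /baiduspider/i];

function isKnownBot(userAgent) {
  if (!userAgent) { return false; }
  return BOT_PATTERNS.some(function (pattern) {
    return pattern.test(userAgent);
  });
}

// '/posts/1' -> 'snapshots/posts/1.html', '/' -> 'snapshots/index.html'
// (hypothetical layout; a real server should also validate the path).
function snapshotFileFor(path) {
  var clean = path.replace(/^\/+|\/+$/g, '');
  return 'snapshots/' + (clean || 'index') + '.html';
}

// Decide what to serve: a pre-rendered snapshot for bots, the app otherwise.
function chooseResponse(userAgent, path) {
  if (isKnownBot(userAgent)) {
    return { kind: 'snapshot', file: snapshotFileFor(path) };
  }
  return { kind: 'app', file: 'index.html' };
}
```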

alexsferreira · April 22, 2013, 8:34pm

I developed a tool that helps create HTML snapshots dynamically, in real time.

I have used this tool in all of my Ember applications and have had no problems with indexing on Google and Bing.

To use it with Ember, you need to change the router’s location to a hash-based one.

Tool: ajax-seo

### Example:

Screenshot of a Google search: http://cl.ly/image/3n450K450h10

```
(function() {

var get = Ember.get, set = Ember.set;

// Register a 'hashbang' location so that URLs look like /#!/posts/1.
Ember.Location.registerImplementation('hashbang', Ember.HashLocation.extend({

  // Current path: the hash without its leading "#!".
  getURL: function() {
    return get(this, 'location').hash.substr(2);
  },

  setURL: function(path) {
    get(this, 'location').hash = "!"+path;
    // Store the bare path so it matches the value compared in onUpdateURL.
    set(this, 'lastSetURL', path);
  },

  onUpdateURL: function(callback) {
    var self = this;
    var guid = Ember.guidFor(this);

    Ember.$(window).bind('hashchange.ember-location-'+guid, function() {
      Ember.run(function() {
        var path = location.hash.substr(2);
        // Ignore the hashchange event triggered by our own setURL call.
        if (get(self, 'lastSetURL') === path) { return; }
        set(self, 'lastSetURL', null);
        callback(path);
      });
    });
  },

  // Used when generating hrefs for links.
  formatURL: function(url) {
    return '#!'+url;
  }

}));

})();
```

```
App.Router.reopen({
  location: 'hashbang'
});
```
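For readers who want to see the URL mapping without Ember, here is a small stand-alone sketch of the two conversions that `getURL` and `formatURL` perform. The helper names are hypothetical, and unlike the snippet above it returns an empty path when the hash does not start with `#!`.

```javascript
// '#!/posts/1' -> '/posts/1'; any hash without the '#!' prefix maps to ''.
function pathFromHash(hash) {
  return hash.indexOf('#!') === 0 ? hash.slice(2) : '';
}

// '/posts/1' -> '#!/posts/1'
function hashFromPath(path) {
  return '#!' + path;
}
```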

stusalsbury · April 22, 2013, 10:30pm

Thanks, Alex. That looks really interesting. I might be able to use it. A couple of questions:

1. I’m confused by all the hashbang stuff… is this meant for an app that uses HTML5 history?
2. How do you consider it to be usable (there’s no license information)? Does that mean this applies?

alexsferreira · April 23, 2013, 1:39am

The solution is based on Google’s documentation, but you can use it without the hashbang.

You can use ajax-seo freely. I had forgotten to include the license; it is MIT.

alexsferreira · April 23, 2013, 6:23pm

@stusalsbury, have you tested the proposed solution?

stusalsbury · April 23, 2013, 6:46pm

No, I’m “frying some other fish” right now. I will certainly let you know when I get to this. I’m glad to know that it should work for non-hashed URLs.

alexsferreira · April 23, 2013, 7:17pm

If you have any questions, just ask.

Thanks.

Scott_Baggett · April 24, 2013, 5:01am

This is something I’m going to need to also be thinking about very soon. Glad to see this post! @alexsferreira the app looks very useful, I’ll be in touch if I have any questions. Thanks again.

alexsferreira · April 24, 2013, 6:15am

@Scott_Baggett, if you have any difficulty using ajax-seo, get in touch.

Best regards

alexsferreira · May 15, 2013, 5:05am

Here is an example of a running Ember.js application, created with emberGen and indexed with the help of seoJS.

Application Website http://seojs.alexferreira.eti.br/

The application is already beginning to be indexed on Google: http://goo.gl/jIA1F

If you’re curious, open any link from the Google results and view the source code.

persocon · July 14, 2013, 3:49pm

And how do I run SeoJS on a server like Dreamhost? :s Sorry for the dull question.

alexsferreira · July 19, 2013, 7:56pm

If you have SSH access on Dreamhost, you can run it: just download the PhantomJS package onto the server and run seoJS with it.

Bear in mind that I have not tested it on Dreamhost, but I have had no problems using it.

douglasbhill · July 21, 2013, 11:43am

The way I approached this was to extend HashLocation to get a hashbang-style URL. Then I used nginx to forward requests of the form http://www.myapp.com/?_escaped_fragment_ to an instance of PhantomJS running a custom script that calls my app, loads the page, strips script tags and other junk that isn’t needed, caches the result, and returns it. Works fairly well.

This also works very well for linking dynamic pages with the right Open Graph info on Facebook, as they use the same #! scheme as Google.
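Under Google’s AJAX crawling scheme, a crawler that sees `http://www.myapp.com/#!/posts/1` requests `http://www.myapp.com/?_escaped_fragment_=/posts/1` instead. The sketch below shows the reverse mapping a snapshot script needs before loading the page in PhantomJS. It assumes a runtime with the standard `URL` class (Node.js or a modern browser), the function name is made up, and for simplicity it drops any other query parameters.

```javascript
// Turn a crawler request URL back into the hashbang URL the app understands.
// Returns null when the request is not an _escaped_fragment_ request.
function escapedFragmentToHashbang(requestUrl) {
  var url = new URL(requestUrl);
  if (!url.searchParams.has('_escaped_fragment_')) {
    return null;
  }
  // searchParams.get() returns the value already percent-decoded.
  var fragment = url.searchParams.get('_escaped_fragment_');
  // Simplification: other query parameters are not carried over.
  return url.origin + url.pathname + '#!' + fragment;
}
```

The returned URL is what the headless browser would open to render the snapshot.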

yannick · August 21, 2013, 9:44pm

We provide a service to simplify the process of snapshot creation, host them and serve them to search engine bots.

It works with hashbang or pushState (adding in your tag) and provides some interesting features, such as:

- the ability to specify an HTTP status code to be returned to bots for crawled routes/paths (especially useful for 404s)
- crawling the site in advance and at regular intervals, so that up-to-date captures are always served without the long response times of real-time rendering
- a mechanism to detect when the page is ready to capture, either automatically or programmatically (through a CSS selector or a callback)

This may help you if you need a solution to index your Ember SPA. We are in beta, so the service is free to use during this period. Do not hesitate to have a look at it!

More information here: http://www.seo4ajax.com

ahaurw01 · August 22, 2013, 4:14pm

I have written a blog post about how I do it with my blog, which is a simple ember app: Making your ajax webapp crawlable

My approach uses a lot of the techniques you folks are talking about. My case is pretty straightforward; I don’t really need to render the pages using Phantom or anything. But if you needed to do such things, the concepts in my post would still definitely apply.

One thing I haven’t written about yet is using a sitemap.xml to hint to El Goog about which pages to crawl and how often they might be updated.
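As an illustration of the sitemap idea, here is a minimal sketch that builds a sitemap.xml string from a list of pages, following the sitemaps.org format (`urlset`, `url`, `loc`, optional `changefreq`). The function names and the shape of the `entries` array are assumptions, not something taken from the blog post.

```javascript
// Escape the five characters that are special in XML text and attributes.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Each entry is assumed to be { loc: absoluteUrl, changefreq: optionalHint }.
function buildSitemap(entries) {
  var urls = entries.map(function (entry) {
    var lines = ['  <url>', '    <loc>' + escapeXml(entry.loc) + '</loc>'];
    if (entry.changefreq) {
      lines.push('    <changefreq>' + escapeXml(entry.changefreq) + '</changefreq>');
    }
    lines.push('  </url>');
    return lines.join('\n');
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  ].concat(urls, ['</urlset>']).join('\n');
}
```

A blog could call `buildSitemap` with one entry per post and serve the result at `/sitemap.xml`.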
