Performance with large data sets


#1

Our Ember application has to work with quite large data sets and we’ve found that trying to load 5000+ objects into the store impacts performance.

To test this we used an ajax GET to return 17000+ objects from our backend and display the results in ember-models-table; grabbing the data and fully rendering the display takes about 1 second.

Next we tried using the REST adapter to grab 13000+ objects, import them into the store, and fully render the display; this takes 18 seconds.

We started using pagination from the backend to work around this problem, but I can see us needing to return lots of objects at some point in the future. It would also mean creating multiple endpoints that all do pretty much the same thing.
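One way to avoid near-duplicate endpoints is a single endpoint parameterised by page and size. As a minimal sketch (the endpoint path and parameter names here are assumptions, not the poster's actual API):

```javascript
// Build a URL for one paginated endpoint instead of many near-identical ones.
// `page`/`size` param names are an assumption about the backend.
function buildPageUrl(base, page, size) {
  const params = new URLSearchParams({ page: String(page), size: String(size) });
  return `${base}?${params.toString()}`;
}
```

With Ember Data, the same idea maps onto `store.query('item', { page, size })`, which the RESTAdapter turns into an equivalent query string against one endpoint.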

  1. How are other people working with large data sets?
  2. Is the JSONAPIAdapter any quicker, or does the bottleneck lie in ember-data importing into the store?
  3. Would rewriting our adapter improve performance? If so, what should we be doing, and how would it look? (See below for our very basic current adapter and serializer.)

I’d like to stick with Ember if possible, but if there are performance issues with the store and large data sets we might have to evaluate other frameworks (which I’d prefer not to do!).

Adapter

import DS from 'ember-data';
import DataAdapterMixin from 'ember-simple-auth/mixins/data-adapter-mixin';
import { inject as service } from '@ember/service';
import config from '../config/environment';

export default DS.RESTAdapter.extend(DataAdapterMixin, {
  session: service(),

  // Attach the ember-simple-auth access token to every request.
  authorize(xhr) {
    let { accessToken } = this.get('session.data.authenticated');
    xhr.setRequestHeader('Authorization', `Bearer ${accessToken}`);
  },

  // API host comes from config/environment.
  host: config.hostname
});

Serializer

import DS from 'ember-data';

export default DS.RESTSerializer.extend({
  // The backend uses Mongo-style `_id` as the primary key.
  primaryKey: '_id',

  // Ensure ids are always serialized as strings.
  serializeId(id) {
    return id.toString();
  }
});

#2

If the issue is only related to ember-data, you might want to drop just that one and not the entire framework. You could try out ember-easy-orm or go without an ORM. Before choosing another framework for this reason, I would recommend performance testing it (including a similar ORM to ember-data). I bet most frontend frameworks are not built with such large data sets in mind.

What is your use case for transferring such large datasets to the client? That seems strange to me. How do you visualize that amount of data? I can’t imagine how a user would benefit from a table with 17000+ rows.


#3

We have had a brief look at other ORMs and ways of retrieving our data, in particular these:

  • ember-restless
  • ember-data-storefront
  • ember-cli-simple-store

There didn’t seem to be many use cases for these; most people are using ember-data, and as that is pretty much part of the Ember core we thought it would be the best option, being the most supported going forward.

As for the amount of data we need to retrieve, I may not have explained myself well: it might not always be displayed in a table. Sometimes we need this amount of data for live statistical analysis, and we thought that having the data in the store would help reduce network requests.

It might be the case that we drop ember-data in favour of something else; I’m just not sure what that something else should be, and I was looking to the community to see what other people are using with large data sets.

I’ll have a look into ember-easy-orm and see what that offers.


#4

Did you investigate other approaches, such as doing the statistical analysis before pushing into ember-data’s store to reduce the number of records in the store? Usually this is done server-side, but if that is not possible (e.g. a third-party API where you can’t add a custom service in between due to authentication issues), it might be possible to do the statistical analysis in the serializer. I bet the performance issues you are facing are caused by ember-data’s internal model doing a lot of work (e.g. setting up relationships to other records, tracking changes to attributes, etc.).
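To illustrate the idea: a plain reduction step can collapse thousands of raw records into one summary object before anything touches the store. This is only a sketch; the `value` field name is an assumption about the payload, and where exactly it runs (serializer `normalizeResponse`, or any code before a push) depends on the app.

```javascript
// Reduce raw records to summary statistics so only a handful of objects
// ever reach ember-data's store. `value` is an assumed field name.
function summarize(records) {
  const values = records.map((r) => r.value);
  const count = values.length;
  const sum = values.reduce((a, b) => a + b, 0);
  return {
    count,
    sum,
    mean: count ? sum / count : 0,
    min: count ? Math.min(...values) : null,
    max: count ? Math.max(...values) : null,
  };
}
```

In a RESTSerializer this could live inside `normalizeResponse`, replacing the raw array in the payload with a single summary record.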


#5

Ember Data is most definitely optional, and lots of people use other things. You can also use Ember Data for some kinds of data and not others.

Large statistical data sets are a good example of a use case where there’s not much point in using Ember Data, or any ORM really. If you’re not doing create/update/delete of individual data points, there’s no reason they need to each be backed by a model.


#6

In this screencast I showed how to do some basic statistical work and plotting using just fetch in an Ember app.


#7

@ef4 We are using fetch at the moment; that’s what returns the 17000+ objects from our backend and populates a table in 1 second.

Looking at how we are doing things and what has been suggested here, it looks like we might stop using ember-data and switch to some variant of POJOs with ajax GETs and POSTs, which ember-data does under the hood anyway.
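The POJO route mostly means doing yourself the small amount of shaping the serializer did (here, mapping the Mongo-style `_id` to `id`, matching the `primaryKey` in the serializer above). A minimal sketch, where the endpoint path and payload shape are assumptions:

```javascript
// Map Mongo-style `_id` to a string `id`, replicating what the
// RESTSerializer's primaryKey/serializeId settings were doing.
function normalizeRecords(raw) {
  return raw.map(({ _id, ...attrs }) => ({ id: String(_id), ...attrs }));
}

// Fetch records as plain objects, no store involved.
// `/items` and the bearer-token auth mirror the adapter above,
// but are assumptions about the actual backend.
async function loadItems(host, token) {
  const response = await fetch(`${host}/items`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return normalizeRecords(await response.json());
}
```

The resulting array of POJOs can feed ember-models-table or statistical code directly, with none of ember-data's per-record bookkeeping.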