Robust client state? How/where to store?


#1

Imagine you’re taking an online exam, and spend 2 hours answering questions, then your system decides to turn itself off to install updates, or your browser crashes, or your cat walks on the keyboard and reloads the page. Everything is lost when you go back to the page. Time to demand a refund and use another company.

How, as Ember / Ember.data developers, should we prevent this from happening? How do we get robust state in the client?

Another similar example is this one about shopping carts

Use the URL

The current situation seems to be assuming everything in the app can be rebuilt from the URL, either by query params or by re-loading external data.

I don’t think the URL and query params are appropriate for large amounts of data. I also find query params really ugly when they’re not used to control a query, but that’s personal preference.

Sending all updates immediately to the server isn’t the architecture we want. We want to use the client’s local storage, both for robustness and lower latency on the client and to reduce load on the server.

Store locally, commit later

Instead we want to use client state to roll those requests into one big update later, once the data is ready to submit (once the exam is completed).

In practice that means storing the exam responses (e.g. a hash of question ids and selected answer ids, or text) in the client until the user clicks “mark exam”.
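A minimal sketch of that hash-of-responses idea, in plain TypeScript rather than Ember Data. The names `KeyValueStore`, `MemoryStore`, `ExamDraft` and the `exam-draft:` key prefix are hypothetical; the interface is shaped so that the browser's `window.localStorage` could be passed in, and the in-memory store is only there so the sketch runs outside a browser.

```typescript
// Minimal key-value interface; window.localStorage has these three methods.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// In-memory stand-in (hypothetical) so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();

  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }

  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }

  removeItem(key: string): void {
    this.data.delete(key);
  }
}

// Question id -> selected answer id, or free text.
type Responses = Record<string, string>;

class ExamDraft {
  private store: KeyValueStore;
  private key: string;

  constructor(store: KeyValueStore, examId: string) {
    this.store = store;
    this.key = `exam-draft:${examId}`;
  }

  // Read whatever survived the last reload or crash (empty if nothing did).
  load(): Responses {
    const raw = this.store.getItem(this.key);
    return raw === null ? {} : (JSON.parse(raw) as Responses);
  }

  // Record one answer and write the whole draft back immediately.
  answer(questionId: string, value: string): void {
    const responses = this.load();
    responses[questionId] = value;
    this.store.setItem(this.key, JSON.stringify(responses));
  }

  // Drop the draft, e.g. once the server has confirmed the submission.
  clear(): void {
    this.store.removeItem(this.key);
  }
}
```

On “mark exam” the app would read `load()`, send it in a single request, and call `clear()` only after the server confirms.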

A survey or feedback form might work differently here, reporting data to the back-end immediately to limit what is lost if the user abandons it, but we don’t need that and can make do with a single POST request.

Ideally in this case the ember store would not be flushed when the page is reloaded. Since it is, I’m trying to find a good, or at least working, alternative.

Make pending models with the local storage adapter

After failing with ember-local-storage, I’m now trying to build the exam (form) using normal ember-cli/ember.data pages and models backed by the ember.data localstorage adapter. Once the user finishes the exam, the app will build up and submit a set of duplicate (or very similar) models backed by the rest-adapter.

This seems like a lot of extra work? Is it normal?

Could ember.data, or a new adapter, handle it?

It seems like ember data models have two states: ephemeral (updated or new) or saved-to-backend? There could be a 3rd state: persisted on client (local storage), which could replace the ephemeral state in the same way that some apps write to disk instead of to memory?

In my case that’d mean an adapter pointing at the rest api as normal, but also saving itself to local storage (and being able to restore from it) up until submission to the rest api. Whether this should be automatic (work by default) or invoked by the app (more performant) isn’t clear, but both are probably possible?
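To make the automatic-versus-invoked question concrete, here is a rough sketch outside of Ember Data. `LocallyBackedRecord`, `SaveMode` and the `persist` callback are hypothetical names, not an existing adapter API; `persist` stands in for whatever writes to local storage.

```typescript
// "auto" writes to local storage on every change; "manual" waits for the app.
type SaveMode = "auto" | "manual";

class LocallyBackedRecord<T extends object> {
  private data: T;
  private mode: SaveMode;
  private persist: (json: string) => void;
  private dirty = false;

  constructor(initial: T, mode: SaveMode, persist: (json: string) => void) {
    this.data = { ...initial };
    this.mode = mode;
    this.persist = persist;
  }

  // True while there are changes that exist only in memory.
  get isDirty(): boolean {
    return this.dirty;
  }

  // Change one attribute; in "auto" mode this also checkpoints.
  set<K extends keyof T>(key: K, value: T[K]): void {
    this.data[key] = value;
    this.dirty = true;
    if (this.mode === "auto") {
      this.checkpoint();
    }
  }

  // Write the current state locally, if anything changed since the last write.
  checkpoint(): void {
    if (!this.dirty) {
      return;
    }
    this.persist(JSON.stringify(this.data));
    this.dirty = false;
  }
}
```

In “auto” mode every `set` costs one local write; in “manual” mode the app decides when to call `checkpoint()`, which means fewer writes but a window in which a crash loses the latest edits.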

Does that sound like it could work, or maybe someone has another idea? Or there’s an addon already?

Does it matter?

Ember’s marketing says it’s for making apps that compete with native clients.

I’d argue that native clients use local storage to be responsive, with good performance and availability despite network or server problems, and patching over reloads and browser crashes is part of this?

I seem able to answer a lot of my own questions by saying “store in the server”, but that’s forcing one architecture on all applications. Is that expected, or is this a real problem?


#2

I personally believe you should persist to backend as early and often as possible and use URLs to manage state. Sure the routes are a little complex but you can fetch multiple resources in a single route, you just need to make sure your relationships are modeled correctly.

Orbit JS offers a few more fine-grained controls that, in theory, let you manage the sync between local, remote (REST) and in-memory stores.

If you have complex state you might want to check out Orbit JS.


#3

Thanks that looks interesting.

My main concerns are scalability and client latency rather than how the urls look.


#4

Yeah, Orbit might give you more fine-grained control to manage the apparent latency.

There are lots of ways to hide the latency if you are smart about when you persist and do the work continuously, in small bits, in the background. The user does not always need all the data “right now”, and the same goes for saving: not everything has to be saved to the backend “right now” if you have it committed locally and a queue is flushed out to the backend.
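A minimal sketch of “commit locally, flush a queue to the backend”, assuming a hypothetical `Sender` function that resolves to `true` when the backend accepted a change, and a hypothetical `persist` callback that writes the queue to local storage. None of these names come from Ember Data or Orbit.

```typescript
// One queued change; the shape is illustrative.
interface PendingChange {
  id: number;
  payload: unknown;
}

// Hypothetical sender: resolves true when the backend accepted the change.
type Sender = (change: PendingChange) => Promise<boolean>;

class Outbox {
  private queue: PendingChange[] = [];
  private nextId = 1;
  private persist: (queue: readonly PendingChange[]) => void;

  // `persist` is called after every change to the queue so that the
  // caller can mirror it into durable local storage.
  constructor(persist: (queue: readonly PendingChange[]) => void) {
    this.persist = persist;
  }

  get pending(): number {
    return this.queue.length;
  }

  // Commit locally first; nothing is sent to the backend here.
  enqueue(payload: unknown): void {
    this.queue.push({ id: this.nextId++, payload });
    this.persist(this.queue);
  }

  // Send changes oldest-first and stop at the first failure, so that
  // ordering is preserved and the rest is retried on the next flush.
  async flush(send: Sender): Promise<number> {
    let sent = 0;
    while (this.queue.length > 0) {
      const head = this.queue[0];
      if (head === undefined || !(await send(head))) {
        break;
      }
      this.queue.shift();
      this.persist(this.queue);
      sent++;
    }
    return sent;
  }
}
```

The app could call `flush` on a timer or when connectivity returns. Stopping at the first failure is one design choice among several; it keeps changes ordered at the cost of one bad item blocking the queue.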

The trick is the “eventual” consistency: you have to manage the change and synchronization logic. I think that is the problem Orbit is trying to solve, by treating the different stores (memory, local and REST) as relatively equal, with transparent sync logic.


#5

Commit to local and flush is exactly what I want to do, but I’m not clear how to do that in Ember / Ember data. It seems like the models can get very wet (not dry) or very knotty very quickly?

Or perhaps it’s fine to write the extra models, since that’s expressing exactly what is going on?

Orbit looks interesting (I was working on/around CRDTs a couple of years ago as a doctoral student, so nice to see a place that can plug in with & that’s ember-related) but for now I just want short-lived durable storage of some new data until it’s submitted, rather than full or partial replicas of data in the client.

If anyone reads this and isn’t sure about “consistency” or why it isn’t a solved problem in ember.data I recommend Doug Terry’s article Replicated Data Consistency Explained Through Baseball, 2013 CACM, which Microsoft Research also share as a free pdf.


#6

You might also be interested in coalesce.js

This originally started life as EPF (Ember Persistence Foundation), an alternative approach to Ember Data.

http://epf.io

It has the concept of a “session” that you can flush. Not sure about the local storage story though.


#7

Ember Pouch 2.0.0 is now an Ember CLI addon. You can try that. Example


#8

This looks really interesting, thanks!

all of your app’s data is automatically saved on the client-side

If that’s true w.r.t. what I’ve been moaning about then it’s going to solve my problem quite nicely.

Initially I guess I’ll need to write some submit code that reads data out of pouch and posts to a rails api (that writes into postgres), but we’re thinking about switching to document storage later anyway so this could be a nice route to that.


#9

Or you sync data from a CouchDB like iriscouch or your own with rails and connect Pouch to Couch.


#10

Interesting topic which reminded me of a few projects of my own. I would focus on building a robust application and not yet jump to the conclusion that a robust client state is the best way to accomplish that.

Storing the data client side and syncing it periodically or at the end sounds nice in the best case scenario, but what do the other scenarios look like? (The user closes the browser after finishing but before the sync is complete, so no data at all arrives or some things are missing. Will you be able to understand and debug the new, more complex stack and library needed for syncing? And there are a lot of different things no one can think of right now.)

The end user does not care about all these things. As someone who takes the exam I want to know what is saved and when. I want to trust the system that it does not lose my answers. And if something goes wrong I want to know, and please don’t show me some cryptic developer error messages. Instead you should give me some actionable advice (check if you have an internet connection, …).

I would approach the problem from this point of view. Having built something similar, a survey tool, I would structure my data in a way that I can save single questions + answers on their own. This way it’s easy to show what is saved and what not and you don’t run into the problems associated with having a big ball of JSON that gets passed along whenever you make the tiniest change.
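One way to picture saving each question and answer on its own and showing what is saved is a small status tracker. This is a sketch under assumptions: `AnswerTracker`, the four status names and the message wording are all made up for illustration.

```typescript
type SaveStatus = "unsaved" | "saving" | "saved" | "failed";

class AnswerTracker {
  private statuses = new Map<string, SaveStatus>();

  // The user changed an answer; it now exists only in the client.
  markEdited(questionId: string): void {
    this.statuses.set(questionId, "unsaved");
  }

  // A save request for this single answer is in flight.
  markSaving(questionId: string): void {
    this.statuses.set(questionId, "saving");
  }

  // The save request for this answer finished.
  markResult(questionId: string, ok: boolean): void {
    this.statuses.set(questionId, ok ? "saved" : "failed");
  }

  // Ids of all questions currently in the given status.
  withStatus(status: SaveStatus): string[] {
    const ids: string[] = [];
    this.statuses.forEach((current, id) => {
      if (current === status) {
        ids.push(id);
      }
    });
    return ids;
  }

  // A user-facing summary with actionable advice on failure.
  message(): string {
    const total = this.statuses.size;
    const saved = this.withStatus("saved").length;
    const failed = this.withStatus("failed").length;
    if (failed > 0) {
      return `${saved} of ${total} answers saved; ${failed} could not be saved. Check your internet connection and retry.`;
    }
    if (saved === total) {
      return `All ${total} answers saved.`;
    }
    return `${saved} of ${total} answers saved; the rest are still pending.`;
  }
}
```

Because each answer has its own status, a failed save only affects that one question instead of a whole blob of JSON.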

Since you briefly talked about server load, what are the requirements in terms of concurrent users? How many users does the server have to handle at a minimum? I doubt you will run into problems with a decent server and modern backend application. And even if you do, it’s easier to scale a backend than to build a complex and fancy client that tries to mitigate that problem. (What happens if all clients sync at the end and the server gets into trouble at that point?)


#11

Thanks for reading and thinking about this. It’s making me revisit some past decisions, which is healthy. Scaling the front-end, back-end, or both, is clearly a key decision, and for our problem and current resources we feel the front-end is the way to go. I also wouldn’t want to make a very complex client, but I don’t think we’re going to end up with one, thanks to Ember.

I would focus on building a robust application and not yet jump to the conclusion that a robust client state is the best way to accomplish that.

I think I left myself open to this with how I wrote the original post. I should have asked how to idiomatically get durable client state in an ember-cli app, as talking a lot about my app doesn’t seem appropriate here. Also I think robust was the wrong word. Still, the cat is out of the bag now :smile:

From what I’ve seen of it I think ember-pouch / pouchdb is a viable drop-in answer to that question. Orbit also looks interesting, but I didn’t get past looking at github.

The end user does not care about all these things. As someone who takes the exam I want to know what is saved and when. I want to trust the system that it does not lose my answers. And if something goes wrong I want to know, and please don’t show me some cryptic developer error messages. Instead you should give me some actionable advice (check if you have an internet connection, …).

I agree completely with this, but think it’s another topic. I’m asking a technical question that, as you say, the users don’t need to know about; in fact, they shouldn’t notice we’ve replaced part of our system, except that everything will feel faster than before & there might be a little extra visual feedback if things are loading.

Storing the data client side and syncing it periodically or at the end sounds nice in the best case scenario, but what do the other scenarios look like?

Quite! I expect this will lead to the design starting one way but changing as we go on. It will start with the assumption that the user completes their session on a single device, so local storage is fine, but of course it’s nicer if they’re able to change browser, change computer, etc.

I doubt you will run into problems with a decent server and modern backend application

Ouch :slight_smile: but fair comment. We’re avoiding working on this due to resource constraints, but ideally it would be done. However I think it’s a band-aid for the real problem, which is the clients sending too many requests.

Roughly, for the part of the system we’re talking about, our current rails app gets a request from each client approx every 2-10 seconds, for 100-2000 concurrent clients at peak season, and clients are exposed to the database write latency each time they change question. The trouble is that’s a mix of cheap operations (perhaps 30ms) and expensive (a few seconds), and things get choked up.

There are a few ways to improve this, and although it’s desirable to keep everything in-house where we have more control of it, if we manage the exam state in the browser instead of in rails, the client will be more responsive and the majority of the server requests will disappear, with the time between requests increasing to between 10 minutes and 2 hours (ignoring reads, which hit memcached).
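As a back-of-the-envelope check on those figures, the aggregate request rate can be worked out directly. The function name is made up, and the calculation assumes requests are spread evenly over time, which the end-of-exam sync spike raised earlier in the thread would violate.

```typescript
// Aggregate requests per second for `clients` concurrent clients that
// each send one request every `intervalSeconds`.
function requestsPerSecond(clients: number, intervalSeconds: number): number {
  return clients / intervalSeconds;
}

// Current pattern: 100-2000 clients, one request every 2-10 seconds.
const currentRate = {
  low: requestsPerSecond(100, 10), // 10 requests/s
  high: requestsPerSecond(2000, 2), // 1000 requests/s
};

// Proposed pattern: one request every 10 minutes to 2 hours.
const proposedRate = {
  low: requestsPerSecond(100, 2 * 60 * 60), // about 0.014 requests/s
  high: requestsPerSecond(2000, 10 * 60), // about 3.3 requests/s
};

// Reduction at peak: 1000 / 3.33..., roughly 300x fewer requests.
const peakReduction = currentRate.high / proposedRate.high;
```

So at peak the write traffic would drop from roughly 1000 requests per second to a little over 3, again ignoring reads.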

It seems to me that this is a big difference, and changing the communication pattern between the client and the server will not only make the back-end less of an issue, but also improve the user experience.

Data architecture for Ember docs?

Something I’ve wondered while working on this is whether the Ember docs/guides should talk about this? I’ve seen people say React isn’t Ember because it only handles views, but if Ember is doing more, should it talk about data architecture a bit, rather than assume Rails? For example, a guide page could describe the options (client-only, client and server replicas, or server-only) and give pointers to appropriate adapters. Currently you end up with a very different system if you use the RESTAdapter vs the pouch adapter. It could also warn about the data consistency trade-offs and pitfalls. I guess a blog post (or book) with code examples would also be nice, but all of these require time and expertise to write.

Or perhaps it’s outside of the framework responsibilities; I’m not sure what they are.


#12

If you are targeting mobile, I do not think browsers are ever going to compete on a level playing field, as native implies more access to system resources that browsers are likely never going to expose.

We have a similar client-side requirement in our system. We use lawnchair (http://brian.io/lawnchair/) as our storage adapter. Initially all our data was stored using the DOM local storage. However, at some stage it came to light that when clearing one’s browser cache the local storage is also wiped. That was somewhat of a problem.

What we opted for was using a local node-js web-api to store our local data on the client side. This required a client side installation. Since our application is an enterprise solution it doesn’t really present us with a problem. This solution would not work for a public-facing site. For mobile devices we still use the local storage and hope for the best :smile:

Browser development does present some unique challenges and I, like many others, believe it really should be used for applications that are always online. Offline browser applications are currently blown out of the water by native applications on any platform.