Ember Data Merge Hook

Ember Data Merge Handling

Internally, Ember Data models use a state machine to represent their lifecycle. One of the limitations of the current implementation is that records cannot be modified:

  • By the client application if the record is in-flight, or
  • By the server (via an out-of-band load()) if the record has pending changes on the client.

One solution to both of these limitations is to implement merging logic if there a conflict between client and server truth is detected. The default could be as simple as last-write-wins or as sophisticated as a three-way merge (à la Git’s default), with the application developer given the option of overriding the default merge strategy.

App.Person = DS.Model.extend({
  firstName: DS.attr('string'),
  lastName: DS.attr('string'),

  addresses: DS.hasMany('App.Address'),

  merge: function(original, local, remote) {
    //... resolve conflict here
    return resolved;
  }
});

Outstanding Issues

Dirtiness

After a merge is resolved, how do you indicate whether the record is still dirty? For example, imagine that after a merge all of the changes made locally have been subsumed by the new truth that arrived from the adapter. At this point, the record should be considered clean.

Perhaps the best thing to do here is to simply diff the returned hash against the new remote hash, and mark the record as clean if they are the same.

Merge Hook Location

The most natural place for the merge hook to be implemented seems to be on the DS.Model class, but:

  • Is this actually a model concern? May people want adapter-specific merge semantics?
  • The above proposed API makes merge a reserved attribute and relationship name. Are we okay with this?

Arguments to Merge Hook

We obviously need to provide original, local and remote copies of the data. But should those arguments be raw data hashes, or DS.Model instances? I think it would break people’s mental model if they weren’t models. For example, imagine you have a computed property on your DS.Model and you want to use that to compare the two. The fact that this wouldn’t work if we passed raw hashes would be confusing.

If we do pass DS.Model instances, we may need to do some work to make sure that people don’t somehow hang on to references, because they will essentially be inert. This may take some infrastructure work.

8 Likes

Seems good! I think having it at the model level seems good, especially if records are provided as attributes, but that also seems crazy non-trivial (relationships, iScare.) Maybe a “simpler” implementation at the adapter level that doesn’t contain any model concerns, sort of like “got these attrs, and these attrs, what attrs you want dogg?” and then that gets loaded into a record. Maybe at that lower level it would be a more reliable implementation? — I don’t know what I’m saying!

2 Likes

I think merge is adapter concern and should operate on raw hashes. Creating 3 instances of the “same” model and needing to create and manipulate one as a response seems overkill. Merging data seems like a low level concern.

2 Likes

First off, super excited by this proposal!

Thoughts on details:

  • Dirtiness: This would be a big change, but would it be possible to store the dirty state of attributes rather than just the state of the model itself? Presumedly, if a merge overwrites a dirty attribute, that attribute should be set to clean. Otherwise, I think the only sane way to do this would be to diff the hashes.
  • Merge Hook Location: I think it’s a per-model concern. Even if it was something that the adapter could handle, I would like for it to be as a default handler and allow models to override it. It also may be a better idea to use a property like _merge for the handler method. It is, after all, a private method that won’t be called directly.
  • Arguments: The major advantage I see to passing hashes is that they’re far easier to reason about and use. It’s very clear from a user perspective that these are local copies of various attributes. I think the downside here isn’t so much the lack of computed properties, but how you’d represent relationships. For example, if a model containing a hasMany relationship needs to be merged, should the relationship be represented as IDs, as a hash of the contained models’ attributes (if it’s embedded), etc.
1 Like

it’s awesome to think of having a framework solve these problems and not worry about them in the app :smile:

i think it could be both – is it possible to provide an override at both the model and adapter level, such that the adapter merge is used if available, then it falls back to the model if available, then the default?

it seems like you might want to take a more granular approach and think in terms of merging model properties. the conflicts are happening between the properties, and the model is meant to capture the relationship between those properties. all that to say i’d expect them to be more along the lines of

  merge: function(localModel, propName, originalPropVal, localPropVal, remotePropVal) {
    //... resolve conflict here
    return resolvedProp;
  }

the downside is that you don’t have the computed properties you get with a model instance, the benefit is the application developer doesn’t need to worry about determining which properties caused the conflict (which seems like a framework concern).

this does seem like a better approach.

This seems like an awesome idea in theory, but 3-way merges sometimes require intervention to be resolved. Do we really want to be presenting the user with a merge conflict while they’re trying to use an app?

A 3-way merge could be useful as a first-attempt to resolve a model diff, but it would have to fall back to a pre-set authority (probably the server).

This exists as a starting point: https://github.com/rsms/js-object-merge

We have cases where we would want user intervention in a three-way merge and other cases where we can do it automatically. So far, those cases are all on separate models. I don’t believe we need controller-specific merge strategies, but I’m not certain of that.

I can forsee that you would like the user to resolve some conflicts (or all, depending on the type of data in your application) depending on the context of the application. It feels clunky to put merging logic in the model. Maybe we have a MergeStrategy object that we pass to the model to resolve this?

@thomasboyt we have support for per-model adapters, so pouting the merge on the adapter makes it per-model. If we go with hashes it make no sens to expose it on model. A model should newer deal with raw data, only adapter should.

In a way I like the idea of per attribute/relationship hook. Not sure right now about downfalls.

@kieran @jamesarosen we definitely want an easy way to plug some UI on top of merge process. We probably need to fire some events on the model and expose a resolved hook or something.

Very excited to see this feature proposal. Up-to-now ember data was mostly a great match for apps with long-lived data. Giving a standard way to solve merging issues is a big step towards some use cases.

Merge Hook Location: I’m with @tchak that it should be a concern of the adapter, although configurable per-model in adapter.

Arguments: Comparing hashes seems easier and more appropriate if located in adapter.

Merge Strategy:

Yes, definitely. Sometimes, user intervention could be necessary (eg. for resolving a conflict in a property that contains lots of content eg. blog posts content). Providing a UI for this should be a snap for app developers. I would suggest that all models/hashes that have conflicts and need to be resolved by a user, get buffered in a bucket in the store. An app could observe this bucket and provide the user a notification in the UI.

what is the status of this feature now?