Technical Notes on the New ChangeApplier Implementation - FLUID-5045
The intention and thinking behind the new ChangeApplier implementation have been sketched out on a couple of pages: New Notes on the ChangeApplier and Notes on Expressionism in Model Relay. However, the implementation itself has proved quite complex, so this page provides a guide to the implementation strategy and data structures, which will be useful to those reviewing the implementation code or reading it for other purposes.
Broadly speaking, the new ChangeApplier manages a "skeleton" of models linked together via relay, connected as a subgraph of the graph of components in the IoC component tree. To be part of the same skeleton, a set of components must i) be descended from the new model-bearing grade fluid.modelRelayComponent, and ii) declaratively define a relay link to another element of the skeleton - either via a) "implicit relay" by declaring IoC references to the other element's model in the model area of the component, or b) a transforming relay in the modelRelay area.
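As an illustration of the two kinds of link, a minimal configuration sketch follows. The component name {display}, the model paths and the choice of transform are hypothetical, and the exact grade list and transform options should be checked against the framework version in use. The options are written as a plain object that would in practice be supplied as component defaults.

```javascript
// Hypothetical options for a component taking part in a model skeleton.
// All names other than fluid.modelRelayComponent are invented for illustration.
var thermometerOptions = {
    gradeNames: ["fluid.modelRelayComponent"],
    model: {
        // a) implicit relay: an IoC reference to another component's model,
        //    written directly in the model area
        celsius: "{display}.model.temperature"
    },
    modelRelay: {
        // b) transforming relay: a link which passes through a transformation
        source: "{that}.model.celsius",
        target: "{that}.model.fahrenheit",
        singleTransform: {
            type: "fluid.transforms.linearScale", // assumed transform name
            factor: 9 / 5,
            offset: 32
        }
    }
};
```

Either entry alone would be enough to make this component part of the same skeleton as the component it refers to.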
When a change arrives from the outside at a model which is part of a skeleton (by virtue of a "user" firing a ChangeRequest using its ChangeApplier's "change" method), this change is propagated around the skeleton until the skeleton becomes consistent. This propagation is called a "transaction", although it has only a little in common with transactions as understood in typical persistence technologies (see the "New Notes" for discussion). A special case of such a transaction is the "initial transaction" - each model in the skeleton is considered to start off with the value "undefined", and all initial values collected from configuration and other relays are applied to the skeleton as part of the "initial transaction", which then propagates in the usual way until a consistent value has been achieved. Only after a transaction has "stabilised" are user listeners (registered as part of the modelListeners section of component configuration) notified. In terms of the new implementation, these user listeners are all considered "transactional listeners", whereas the listeners registered by model relay itself in order to propagate the transaction are considered "nontransactional listeners".
Several data structures are needed in order to track the progress of the propagation of "invalidation" throughout the model skeleton. These are all stored in a special area in the standard "instantiator" which is the book-keeping structure for the IoC instantiation system - this area is named modelTransactions. This contains two types of record, one named init holding details of an "initial transaction" (in theory there may be several of these but limitations on JS concurrency and the implementation imply that at any time there will be only one), and standard records which are indexed by a transaction id.
Fields within instantiator.modelTransactions:

| Record index | 2nd-level index | Record type | Contents and significant fields | Field type | Field purpose |
|---|---|---|---|---|---|
| init | <component id> | initial transaction record | | Boolean | "Is this component ready to participate in this transaction?" - that is, has it reached the end of interpretation of its model and modelRelay configuration |
| | | | completeOnInit | Boolean | "Was this component's model already fully initialised when this transaction started?" |
| | | | that | Component | The component itself participating in the init transaction |
| <transaction id> | <link id> | link count field | | Integer | Number of times during this cycle that this link (relay) has been operated. All of these counts are reset whenever new data enters the transaction from the outside - e.g. when a new "initial model" is set or a user change arrives. They are also cleared when a relay document is activated (see fluid.clearLinkCounts). These counts are capped at 2 per cycle |
| | <applier id> | participating applier transaction | transaction | Transaction | The representative for this "overall transaction" with a particular ChangeApplier. These all share the same "transaction id" |
| | | | options | Object | Optional - only present if this applier is the applier attached to a relay document, rather than to a model. Contains fields: |
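To make the nesting in the table concrete, here is an illustrative snapshot of what the structure might look like part-way through a transaction. Every id is a made-up placeholder, the component and transaction objects are stubs, and the helper readLinkCount is hypothetical; only the field names completeOnInit, that, transaction and options come from the table.

```javascript
// Illustrative snapshot of instantiator.modelTransactions.
// All ids and stub objects here are invented for the example.
var modelTransactionsSketch = {
    init: {
        "component-1": {
            // the Boolean readiness flag from the first table row is omitted here
            completeOnInit: false,
            that: { id: "component-1" } // stands in for the component itself
        }
    },
    "transaction-7": {
        "link-3": 1, // link count: this link has been operated once this cycle
        "applier-2": {
            transaction: { id: "transaction-7" }
        },
        "applier-5": {
            transaction: { id: "transaction-7" },
            options: {} // present only for appliers attached to a relay document
        }
    }
};

// Hypothetical helper: reads a link count, treating a missing record as zero
function readLinkCount(modelTransactions, transactionId, linkId) {
    var record = modelTransactions[transactionId] || {};
    return record[linkId] || 0;
}
```

Link counts and applier records can share one level of the structure because all ids are drawn from the same allocator, so their keys cannot collide.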
Notes on propagation damping:
The new ChangeApplier features several schemes for preventing changes from propagating indefinitely. The old ChangeApplier made use of a "source tracking" system which assigned an id to every participant in a change, and ensured each one was notified only once "per change" (this notion was vague in the old applier and relied on identity of stack frames). In the new ChangeApplier there are several reasons that changes may pass through a node more than once - for example, in a situation involving "transforming relay to self" for operating model-based constraints - and so this system is not currently implemented, although it may be again in the future. We currently rely on:
- Fine-grained "genuine change" detection - The algorithm operated by fluid.model.applyHolderChangeRequest recursively compares the incoming change object in detail against the existing model, and only registers changes where there are either i) alteration in primitive values, ii) change in trunk type between Object and Array - these are stored in a structure "changeMap" which holds flags at the "deepest common path" of a set of such changes.
  - This is in contrast to the old ChangeApplier which could only check change contents at the level of single leaves
- "Floating point slop" - The algorithm operated by fluid.model.isSameValue applies a forgiving algorithm for floating point equality that is designed to prevent values which have passed through "reasonable", continuous and invertible model transformations from being detected as different on round-tripping. Based on the standard JS 64-bit float resolution, it allows for a magnification in error of about 1000 relative to the "floating point epsilon" for the value. This implementation is intended one day to be pluggable and extendable to accommodate an open repertoire of strategies for "unchanged value detection".
- Link counting - Each applier, and separately, each relay document, is allocated an "activation count" which tracks the number of times it has received changes within a particular transaction. We allow each link to be activated up to 2 times "in the absence of separate invalidation", which has a different meaning for the two types of links. As the comments in the above table explain, "relay counts" are reset whenever new data enters the transactional system from the outside - either from user changes or from consideration of a fresh "init model". Link counts are reset at the same time, but in addition they are reset whenever a relay is activated.
- This is clearly somewhat "heuristic" but covers current use cases which involve simple relays operating straightforward constraints which are expected to result in a stable resultant model in a dependence tree of depth about 2. It should at least prevent annoying "runaway recursion" which will typically bomb the runtime - although it is worth noting that there are valid use cases for "nearly infinite" patterns of propagation, for example in cases where multiple numerical constraints are being satisfied simultaneously (cf. Sutherland's "relaxation" system for Sketchpad - http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-574.pdf - p.96). We don't have any such use cases at present but clearly the "link counting" system itself should at some point be externalised as a configurable aspect of the ChangeApplier implementation.
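The first two damping schemes can be sketched together as a recursive comparison which uses a tolerant equality test at the leaves. This is a simplified illustration and not the code of fluid.model.applyHolderChangeRequest or fluid.model.isSameValue: the function names, the exact tolerance formula and the flat list of changed paths (in place of a changeMap) are all assumptions made for the example.

```javascript
// Assumed tolerance: about 1000 times the relative floating point epsilon
var SLOP_FACTOR = 1000;

// Tolerant equality for primitives: numbers within the slop compare as equal
function isSameValueSketch(a, b) {
    if (a === b) {
        return true;
    }
    if (typeof a !== "number" || typeof b !== "number") {
        return false;
    }
    var scale = Math.max(Math.abs(a), Math.abs(b));
    return Math.abs(a - b) <= SLOP_FACTOR * Number.EPSILON * scale;
}

function trunkType(value) {
    if (Array.isArray(value)) {
        return "array";
    }
    return value !== null && typeof value === "object" ? "object" : "primitive";
}

// Collects the paths at which newValue genuinely differs from oldValue
function collectChangedPaths(oldValue, newValue, path, changed) {
    var oldType = trunkType(oldValue);
    if (oldType !== trunkType(newValue)) {
        changed.push(path); // e.g. a change in trunk type between Object and Array
    } else if (oldType === "primitive") {
        if (!isSameValueSketch(oldValue, newValue)) {
            changed.push(path);
        }
    } else {
        var keys = {};
        Object.keys(oldValue).concat(Object.keys(newValue)).forEach(function (key) {
            keys[key] = true;
        });
        Object.keys(keys).forEach(function (key) {
            collectChangedPaths(oldValue[key], newValue[key], path.concat(key), changed);
        });
    }
    return changed;
}

var oldModel = { a: 0.3, b: { c: 1 }, d: [1, 2] };
var newModel = { a: 0.1 + 0.2, b: { c: 2 }, d: { 0: 1 } };
var changedPaths = collectChangedPaths(oldModel, newModel, [], []);
```

Here the round-tripped value at a is not reported, since 0.1 + 0.2 differs from 0.3 by far less than the tolerance, while b.c (a primitive alteration) and d (an Array replaced by an Object) are.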
On Transactions
Identification
Each transaction is identified by an id - the field "id" held in the transaction object itself. This id is shared across all transactions held in different appliers which are coordinated as part of the same "overall transaction" which is taking place over the entire model skeleton. This id is the key which is used in the instantiator.modelTransactions structure described in the table above. For debugging purposes each transaction also contains an individual field instanceId which identifies uniquely this particular transaction on a particular applier. This field may be removed in future versions. Note that every id in the system, whether for components, appliers, transactions or transaction instances is allocated in the same way, using the standard fluid.allocateGuid() method which the core framework uses for other purposes, e.g. DOM nodes etc. This is why we happen to be able to mix together such disparate things in the modelTransactions record, since their ids are guaranteed to be unique across all types of things.
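The relationship between the shared id and the per-applier instanceId might be pictured as follows. This is a sketch under stated assumptions: allocateIdSketch is a stand-in counter rather than fluid.allocateGuid(), and openOverallTransaction is a hypothetical helper which does not correspond to a framework function.

```javascript
// Stand-in for a single id allocator shared by every kind of object
var idCounter = 0;
function allocateIdSketch() {
    return "id-" + (idCounter++);
}

// Hypothetical helper: opens one transaction per applier, all sharing one id
function openOverallTransaction(modelTransactions, applierIds) {
    var sharedId = allocateIdSketch();
    var record = modelTransactions[sharedId] = {};
    applierIds.forEach(function (applierId) {
        record[applierId] = {
            transaction: {
                id: sharedId,                  // common to the overall transaction
                instanceId: allocateIdSketch() // unique to this applier's transaction
            }
        };
    });
    return sharedId;
}

var sketchTransactions = {};
var applierA = allocateIdSketch();
var applierB = allocateIdSketch();
var overallId = openOverallTransaction(sketchTransactions, [applierA, applierB]);
```

Because applier ids and transaction ids come from the same allocator, they can safely be used as keys in the same structure.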
Demarcation
Transactions, as the "New Notes" commentary suggests, are much simpler than before and also than in other implementations, as a result of, for a start, not being "cancellable". We view the cancellation of a state update as an "impolite act" by a local member of a model ecology. However, we may at some point need better schemes for "demarcation" of units where changes are batched up before being passed on. One "informal demarcation strategy" of this kind is currently operated within the machinery for relay itself, and is the cause of a lot of complexity in the implementation. The reasoning for this is as follows:
"Half-transactions" for relay documents
Relay documents are rather more "fragile" than general pieces of model state, and may well produce peculiar results if operated when some of their fields have been updated and others have not, midway through a particular "transaction". Even if they are stable, they are in general expensive to recompute, since (currently) the whole document needs to be parsed to operate the relay and, with current transactions, a fresh copy of the model is taken. (A future version of the relay system will be able to trace "invalidation" through a relay document based on the transaction's changeMap and eliminate the evaluation of parts of the relay which do not correspond to an update.) Therefore we would like to "batch up" changes as much as possible and operate relay documents as infrequently as we can. The current system for this registers a listener to the "preCommit" event which the overall transaction exposes - that is, only when the system has "nothing better to do" and is just about to commit the overall transaction will it come to recompile and run all transforming relays. This, of course, may lead to further waves of activity in the system, especially since it will reset all the standard link counts (see above). Unfortunately this system is somewhat informal, and has itself required the design of transactions to be modified to take care of this use case, which requires the "bulking" capabilities of transactions without causing the overall transaction to be committed early or any events to be fired. This is the reason for two features of the transaction system: firstly, the reset method exposed by transactions, which allows them to be reused repeatedly as part of a larger transaction, and secondly, the preCommit event itself, which is used to trigger the relay operation process.
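The batching idea can be illustrated with a toy transaction which defers an expensive relay until just before commit. This is a deliberately simplified sketch and not the framework's transaction implementation: makeBatchingTransaction, its shallow model copy and the operateRelay callback are all hypothetical, and the loop stands in for the preCommit stage.

```javascript
// Toy transaction: changes accumulate in a working copy, and the relay is
// operated only in a "preCommit" stage, repeating while it produces changes.
function makeBatchingTransaction(model, operateRelay) {
    var working = Object.assign({}, model); // shallow working copy of the model
    var dirty = false;
    var transaction = {
        relayRuns: 0,
        change: function (key, value) {
            if (working[key] !== value) {
                working[key] = value;
                dirty = true;
            }
        },
        commit: function () {
            // preCommit stage: nothing better to do, so operate the relay
            while (dirty) {
                dirty = false;
                transaction.relayRuns++;
                operateRelay(transaction, working);
            }
            Object.assign(model, working);
        }
    };
    return transaction;
}

var sumModel = { a: 0, b: 0, total: 0 };
var sumTransaction = makeBatchingTransaction(sumModel, function (transaction, working) {
    transaction.change("total", working.a + working.b);
});
sumTransaction.change("a", 1);
sumTransaction.change("b", 2);
sumTransaction.commit();
```

The relay runs twice here: once to compute the total from both batched changes, and once more as a further "wave" which finds nothing left to change. The sketch has no guard against a relay which never stabilises.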
Cost of transactions
We would like the cost of "plain" transactions to be as low as possible, given that one is created for effectively every change triggered by the user (e.g. in an app UI). At the moment it is unavoidable that a copy is taken of the local component's model on opening each transaction, and it is difficult to think of a scheme whereby we could operate the fine-grained change detection (described in the "damping" section) without this. However, for large models we do need to look at reviving the "thin" transaction system operated by the old ChangeApplier, intended for use in very large and frequently updating models (perhaps displays of big tables such as stock prices, weather reports etc.). No such use cases have come up in our community since 2008.
Unfortunately the reset method requires the model to be copied a second time - again for purposes of reliable change tracking. An annoying bug appeared at the last moment when integrating the Pager (whose model structure has seemingly driven ChangeApplier development for the last 5 years) with the new implementation: the "half-transaction" for a relay document committed whilst leaving the old model in place. Given that the relay had itself just reverted the user's original change to the previous model condition, this created a corrupt workflow where it seemed that no change had taken place in the transaction, and so the relay's own document failed to be updated with the new (censored) value, causing further corruption on the next event cycle.
Whilst we must work hard in the future to eliminate unnecessary model copies (and in fact, if we can, the creation of ChangeAppliers at all - in the "flyweight component" system we will have to build for the "new renderer" it will seem imperative that large numbers of simple components can under many conditions share access to the same applier), this is probably right now a small cost relative to others in the IoC framework which need to be eliminated before 2.0. The power to issue arbitrary IoC expressions to bind change listeners declaratively to any appliers in the tree somewhat reduces the burden on each component to create a fresh applier or relayed model.
One scheme for reducing the cost of transactions is to place the registration of transaction events onto the applier itself rather than onto the transaction - that is, each transaction operated by an applier fires to the same list of listeners. To allow a little dynamism in this system we allowed an optional "commit disposition argument" to the transaction's commit method, which allows a listener to preCommit to distinguish between different kinds of commit events (again, only important in the "half-transaction" system operated by relay).
Important locations in the code
The implementation at time of writing is at https://github.com/amb26/infusion/blob/cc3fd22e253d139ddc4179a1cb55ed40a92c7b8a/src/framework/core/js/DataBinding.js - line numbers may drift, but function names should remain reasonably stable.
The core ChangeApplier itself is fluid.makeNewChangeApplier at line 1066 - this core is extremely short (about 90 lines), especially compared to the previous implementation, although there are a number of utilities which have been broken out into global functions. Unfortunately the great simplicity in the ChangeApplier itself has been balanced by a great increase in complexity elsewhere, mainly in the model relay system itself which meets numerous new requirements.
The central point of coordination is at fluid.registerDirectChangeRelay, line 469. This is larded with many comments, but still remains a confusing site since it is the point where every kind of relay is eventually registered - the "sourceListener" at line 476 is the only ChangeApplier listener implementation used by the system. This consolidation results in the overall function being called with several different sets of arguments in different situations. All of these calls are made from fluid.connectModelRelay below.
In a "vanilla" call to fluid.registerDirectChangeRelay, options and transducer are empty, and the first four arguments are set, representing a "direct relay" between one applier's model and another. These are typically set up in pairs (lines 544 and 546). In the more complex cases, the relay is either on to or away from a model relay document itself - that is, changes are bound from the "model at large" onto a model relay document at line 533, causing various special options to be set (targetApplier, relayCount and update). Some of these options end up dumped directly into the modelTransactions record once a transaction is actually active. Going out from a model relay document, we use the special "half-transactional" system described above, on line 541 - this ensures that model relays are operated as infrequently as possible.
Some of the main responsibilities of fluid.registerDirectChangeRelay are to enter the proper records into the modelTransactions structure described above, and to update and check link counts. The relay counts are updated and checked elsewhere in fluid.model.updateRelays. This latter method is executed repeatedly from the top-level fluid.establishModelRelay until no further changes have been registered.
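The link count check itself amounts to a small capped counter per link. The following sketch shows the principle only; makeLinkCounter, tryActivate and clear are hypothetical names, and the cap of 2 is the one quoted in the damping notes above.

```javascript
// Each link may be operated at most this many times per cycle
var MAX_LINK_ACTIVATIONS = 2;

function makeLinkCounter() {
    var counts = {};
    return {
        // Returns true and records the activation if the link is still under
        // its cap, or false if the change should not be relayed again
        tryActivate: function (linkId) {
            var count = counts[linkId] || 0;
            if (count >= MAX_LINK_ACTIVATIONS) {
                return false;
            }
            counts[linkId] = count + 1;
            return true;
        },
        // Called when new data enters the transaction from the outside
        clear: function () {
            counts = {};
        }
    };
}

var linkCounter = makeLinkCounter();
var activations = [
    linkCounter.tryActivate("link-1"),
    linkCounter.tryActivate("link-1"),
    linkCounter.tryActivate("link-1")
];
linkCounter.clear();
var afterClear = linkCounter.tryActivate("link-1");
```

A relay listener built on this would simply decline to forward a change whenever tryActivate returns false, which is what stops runaway recursion between mutually linked models.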
A further but more pleasant coordination point is fluid.parseImplicitRelay on line 625. This replaces some of the natural action of the IoC framework in parsing "by hand" the contents of the model record in the component's options. Whereas IoC itself would resolve references eagerly, resulting in a "dead" initial model, we need to discover the positions and contents of inter-model references listed in this area so that we can convert them into live "links" in the model skeleton - operated by fluid.registerDirectChangeRelay's listener. This method does double duty since we also use it for parsing modelRelay (model transformation format) documents themselves, before they are sent to the model transformation system. Similarly, we discover the location of inter-model references and use them to bind listeners onto the model relay document itself so that it may be kept live to changes elsewhere in the model skeleton.
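The essence of this "by hand" parse is a walk over the record which notes where each reference sits and what it points at, rather than resolving it. The sketch below is an approximation for illustration: findModelReferences and its output format are hypothetical, and the regular expression recognises only the simplest "{context}.path" form of reference.

```javascript
// Matches simple references of the form "{context}.some.path"
var IOC_REFERENCE = /^\{([^{}]+)\}\.(.+)$/;

// Walks a model record, returning the position and contents of each reference
function findModelReferences(record, path, found) {
    path = path || [];
    found = found || [];
    Object.keys(record).forEach(function (key) {
        var value = record[key];
        var here = path.concat(key);
        if (typeof value === "string") {
            var match = IOC_REFERENCE.exec(value);
            if (match) {
                found.push({
                    targetPath: here.join("."), // where the link lands in this model
                    context: match[1],          // the component being referred to
                    sourcePath: match[2]        // the path within that component
                });
            }
        } else if (value !== null && typeof value === "object") {
            findModelReferences(value, here, found);
        }
    });
    return found;
}

var modelRecord = {
    volume: "{mixer}.model.master",
    label: "Volume",
    nested: { pan: "{mixer}.model.pan" }
};
var discoveredLinks = findModelReferences(modelRecord);
```

Each discovered entry is the raw material for one live link, while plain values such as label remain part of the initial model.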
Notes on model discovery and "init" timing problems
Unfortunately the workflow of the "init transaction" which tries to both discover and set all models in the model skeleton to their initial values is somewhat at odds with the workflow of the current IoC instantiation system. In particular, components are discovered "opportunistically" and we can never be entirely sure when we have really finished discovering all models in the skeleton. On discovery of each such component we call fluid.enlistModelComponent, and on concluding the parse of each such component we call fluid.deenlistModelComponent. If the latter discovers that parsing is complete for all components so far discovered, it will try to operate the initial transaction for all of these models. The "ginger process" of IoC guarantees that we will at least discover all components participating in a cyclic linkage passing through the first component, but it is possible that many components will "miss the boat" if they are not referenced by outgoing links written in the initial set. These components can participate in a further round of an "init transaction", but as well as being inefficient this is a cause of special cases in the implementation and probably at least one bug. The "completeOnInit" flag in fluid.operateInitialTransaction checked at line 325 is one of these special cases - on discovering a component which is already fully initialised, it will fire a "fake initialisation event" so that the component beginning to observe it can see it apparently going through the init transaction process again. Correspondingly there is another special case on line 797 in fluid.mergeModelListeners which extends the same facility to user listeners as for relay listeners. Unfortunately this latter is almost certainly the site of a bug, in that in some situations we may "doubly observe" model initialisation if we observe a component that was part of the initially observed set that registers a listener for a different such member.
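The enlist/deenlist bookkeeping, and the way a late component "misses the boat", can be pictured with the following toy tracker. It is a sketch under stated assumptions: makeInitTracker and its parsed flag are hypothetical, and the callback merely records which components would take part in each round of the initial transaction.

```javascript
// Toy tracker for components discovered during instantiation
function makeInitTracker(operateInitialTransaction) {
    var enlisted = {};
    return {
        // Called on discovery of a model-bearing component
        enlist: function (componentId) {
            if (!enlisted[componentId]) {
                enlisted[componentId] = { parsed: false };
            }
        },
        // Called when the parse of a component's configuration concludes
        deenlist: function (componentId) {
            enlisted[componentId].parsed = true;
            var ids = Object.keys(enlisted);
            var allParsed = ids.every(function (id) {
                return enlisted[id].parsed;
            });
            if (allParsed) {
                enlisted = {};
                operateInitialTransaction(ids);
            }
        }
    };
}

var initRounds = [];
var initTracker = makeInitTracker(function (ids) {
    initRounds.push(ids);
});
initTracker.enlist("a");
initTracker.enlist("b");   // discovered while "a" is still being parsed
initTracker.deenlist("b"); // "a" is not yet complete, so nothing happens
initTracker.deenlist("a"); // everything discovered so far is complete
initTracker.enlist("c");   // discovered too late for the first round
initTracker.deenlist("c");
```

Components a and b share the first round, whereas c, which nothing in the initial set referred to, ends up in a second round of its own.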
These problems will be difficult to resolve without an implementation of http://issues.fluidproject.org/browse/FLUID-4925 "wave of explosions" which thankfully is one of the very next upcoming framework tasks since it is required to enable the "new renderer" work. With this implementation, the system will eagerly instantiate all "component shells" in the system complete with their grade lists, and so it will be easy to proceed to a further "early eager" stage of eagerly parsing all model, modelRelay and modelListeners sections of all components discovered in the tree which are descended from fluid.modelRelayComponent - this guarantees that we will have the complete set of references available before instantiation proceeds any further and so our knowledge of the model skeleton geometry during the "initial transaction" will be complete, removing the need for multiple "init transactions" within the same skeleton (unless some components are instantiated late via "createOnEvent") and bugs such as the "double init observation bug" described above.