Plan to Abolish Invokers and Events

This page contains a writeup of a discussion between Antranig Basman and Simon Bates on 15/11/16, during which the idea was floated to eliminate some of our most fundamental and long-standing framework primitives (invokers, events, and possibly some others) in favour of a new, unified primitive with improved authorial values and greater generality.

The Invoker Hole

It has become increasingly apparent that invokers (which are modelled on the very well-established paradigm of methods as part of object orientation, which themselves were derived from the even more basic abstraction of function calls) fail to meet our increasingly clear criteria for open authoring, articulated in our paper introducing the Open Authorial Principle (OAP).

The improvement that invokers offer over methods/function calls is, at the least, to allow the signatures of callers and callees to be freely detached by means of Infusion's reference resolution system. However, whilst arbitrary numbers of contributions of configuration to a particular invoker site can be arbitrated by the framework in a fine-grained way (especially by means of options distributions), the final authorial result mostly remains a crude "winner-takes-all" system, whereby the winner of the arbitration gets the right to author the whole invoker definition. There is a minor exception to this, in that the funcName and args of an invoker might result from different authorial expressions - but in practice this is a freedom we have never seen exercised. It's overwhelmingly likely that, for a given invoker definition, the same author will want to determine both the implementation's global function name and its argument resolution at the same time.

As we were writing the OAP paper, the contrast with the affordances of other systems became clearer. In Aspect-Oriented Programming (AOP), it is possible to issue "around advice" which can interpose arbitrarily many layers of gearing between callers and callees. The ultimate prior art for all of these systems is the method combination facility provided in CLOS.

Whilst AOP advice is a powerful system (and strictly more powerful than we can currently express using invokers), it creates an opaque authorial system. In most models for AOP, each element entering method combination is only allowed to interpose on the outside of any existing method combination. CLOS does feature a multiple inheritance system with a complex model for specificity, which allows method definitions to be interleaved amongst each other. However, this is hardwired to the class inheritance structure, and its rules are so complex that authors could not be confident that the order they obtained in practice would be the one they wanted. It also privileges certain authors as a result of their position in the inheritance hierarchy, cutting off interception possibilities from those who are unable to arrange for their rule to appear in the necessary place. In AOP method combination, once the order of combination has been settled, arguments and return values become hardwired within the call graph: an earlier method dispatches to a later one via a primitive such as CLOS's call-next-method with its choice of arguments, and makes its own choice of return value. This only allows local control of the flow of data - a return value or an argument which is not forwarded by a predecessor becomes permanently lost.
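
The loss of data described above can be illustrated with a minimal sketch in plain JavaScript. The helper around and the functions greet, trimmed and shouted are hypothetical names invented for this illustration, and are not part of Infusion or of any AOP library. The outer layer of advice can only wrap the outside of the inner one, so it cannot recover the argument which the inner layer declines to forward.

```javascript
// Hypothetical helper: wraps `target` so that `advice` runs around it.
// `advice` receives `callNext`, its only route to the wrapped function.
function around(target, advice) {
    return function () {
        var args = Array.prototype.slice.call(arguments);
        return advice(function callNext() {
            return target.apply(null, arguments);
        }, args);
    };
}

var greet = function (name, punctuation) {
    return "Hello " + name + (punctuation || "");
};

// Inner advice: forwards only the first argument, dropping the second
var trimmed = around(greet, function (callNext, args) {
    return callNext(args[0].trim());
});

// Outer advice: wraps the outside of `trimmed`. It forwards both arguments,
// but has no way to restore the one that the inner layer discards.
var shouted = around(trimmed, function (callNext, args) {
    return callNext(args[0], args[1]).toUpperCase();
});

var result = shouted("  world ", "!"); // "HELLO WORLD" - the "!" is lost
```

Each author here controls only the data passing through their own layer, which is the "local control" limitation noted above.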

Summary on Invokers

In Favour:

  • Can be highly optimised
    • Plain function calls could even be inlined, though much harder to combine this with dynamic dispatch
  • Highly intelligible and familiar
    • Debugger traces featuring stack frames are a powerful source of explanation

Against:

  • No clear model for method combination within the paradigm
    • AOP techniques allow for interception but this lies outside the function call/method model itself
  • Tight coupling of senders and receivers
    • The fact that there is exactly one receiver is tightly woven into this model (again, ignoring interception models which lie outside the paradigm)

Events and Their Limitations

The Infusion Event system began life as a very straightforward implementation of the Observer/Listener pattern with free argument lists - but gradually began to outstrip the invoker system in power and authorial affordances as the implementation quality of the rest of the framework improved. An early innovation, inspired by a misunderstanding of the corresponding feature in jQuery's event system, was the ability to label particular listeners to events with strings representing namespaces. This allowed the listeners to be identified after the fact, and particular listeners to be targeted for overriding as configuration for grades was merged together. However, this feature didn't obtain its full power until the innovation of priorities in the 2015-era framework (FLUID-5506). This allowed the namespaces attached to listeners to be used to derive "positional constraints", giving authors the ability to target positions in the listener notification order in a fine-grained way.

Whilst these were called "constraints", an early realisation was that they did not precisely fit the traditional model of a constraint as seen in constraint programming. That model would find any position which did not violate a constraint as suitable as any other, and would make the algorithm determining the notification order a topological sort. Instead, a "constraint" in our system of the form before:other does not signify that "this element should appear anywhere before other" but instead that "this element should appear directly before other unless some other element interposes". In this sense our "constraints" should be seen rather as positional directives, which specify the progress of the assembly algorithm, rather than abstract mathematical constraints. This leads not only to a more efficient algorithm but also to clearer and more deterministic authorial results.
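
The reading of before:other as a positional directive can be sketched as follows. This is a minimal illustration under stated assumptions, and is not the framework's actual priority algorithm: the function assembleOrder and the sample namespaces are hypothetical, only the string forms before:<namespace> and after:<namespace> are handled, and records are processed in declaration order.

```javascript
// Hypothetical sketch: each record has a namespace and an optional
// "before:<namespace>" or "after:<namespace>" directive.
function assembleOrder(records) {
    var order = [];
    var pending = records.slice();
    var progressed = true;
    while (pending.length > 0 && progressed) {
        progressed = false;
        pending = pending.filter(function (record) {
            if (!record.priority) {
                order.push(record.namespace);
            } else {
                var parts = record.priority.split(":");
                var index = order.indexOf(parts[1]);
                if (index === -1) {
                    return true; // target not placed yet, retry on the next pass
                }
                // Place directly adjacent to the target
                order.splice(parts[0] === "before" ? index : index + 1, 0, record.namespace);
            }
            progressed = true;
            return false;
        });
    }
    if (pending.length > 0) {
        throw new Error("Unresolvable directive for " + pending[0].namespace);
    }
    return order;
}

var order = assembleOrder([
    {namespace: "render"},
    {namespace: "log"},
    {namespace: "validate", priority: "before:render"},
    {namespace: "sanitise", priority: "before:render"}
]);
// order is ["validate", "sanitise", "render", "log"]
```

Here sanitise interposes between validate and render, and the result is fully determined by the directives and their order of declaration. A topological sort would accept sanitise, validate, render, log just as readily.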

However, the model underlying listeners is fundamentally limited in several ways. To start with, the observer pattern itself lacks any form of reciprocity. The dataflow is "push-only" - out from the observable to the observers. It is also a strictly multicast model - each observer is forced to receive exactly the same payload (in our model, this is mitigated, as with invokers, by the possibility for each event listener to specify its own boiled signature by means of Infusion's IoC reference resolution system). As a result, the ability to finely control listener notification order falls somewhat upon deaf ears: such fine control could only be useful if the listeners had strong side-effects on some other part of the system, which would signify a somewhat broken overall design.

These limitations are effectively baked into the source design pattern that our feature was modelled on, and are somewhat part and parcel of each other - it's precisely because traditional events are forcibly multicast, that the idea of a "return value" seems faulty - since it would be impossible to abstract over the multiplicity of listeners in order to determine whether one or many values should be returned, and/or which one(s) of the collection should be responsible for computing it.

Summary on Events

In Favour:

  • Good decoupling of senders and receivers - there may be zero or more receivers
    • Under Infusion's model, a receiver may receive an arbitrary signature
  • Publish/subscribe model is again familiar and well-attested
  • Using Infusion's model for priorities, good authorial properties allowing listener sequence to be authored collaboratively

Against:

  • Hard to incorporate a notion of a return value
    • Since each receiver is treated symmetrically, there is no clear way to distinguish one or more which may collaborate on a return value
  • Since values are not returned, this relies on a "side-effect" model of computation which is inherently dangerous
    • For example, the situation may have changed substantially during the process of event firing, which in theory is synchronous
    • This may cause the entire notification process to need to be aborted - which is already an extension of the traditional event model
  • Already some loss of debuggability values - effect propagation forms a tree rather than a stack, of which previous branches are not visible

Transforming Promise Chains

A somewhat improved primitive is that of a transforming promise chain. So far this only has "pseudo-framework" status: whilst it is supported by some utilities delivered along with parts of the framework, it is not a first-class framework feature, and is delivered solely using current "userland" features. It accepts configuration syntactically identical to that of a standard Infusion event, but misuses the resulting collection of listeners by firing them in a custom workflow, with a particular stereotypical argument set. It is intended that each listener may choose to return a promise rather than a plain value, leading to an interpretation of the listener as a task. These tasks are then run back-to-back, implementing the promise sequence algorithm. Note that our hardwired signature allows for a blend of the sequence and pipeline algorithms, in that argument 1 represents the cascaded value operating the pipeline, and argument 2 represents material invariant across the task array.
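
The blend of the sequence and pipeline algorithms can be sketched with native promises. This is an illustration only, and is not the code of the framework's own utility: the function runTransformChain and the two sample listeners are hypothetical. Each listener receives the cascaded value as argument 1 and the invariant material as argument 2, and may return either a plain value or a promise.

```javascript
// Hypothetical sketch using native Promises. Each listener receives
// (value, options) and may return a plain value or a promise.
function runTransformChain(listeners, initialValue, options) {
    return listeners.reduce(function (promise, listener) {
        return promise.then(function (value) {
            // Argument 1 cascades along the pipeline, argument 2 is invariant
            return listener(value, options);
        });
    }, Promise.resolve(initialValue));
}

var addPrefix = function (value, options) {
    return options.prefix + value; // plain value: a synchronous listener
};

var upperLater = function (value) {
    return new Promise(function (resolve) { // promise: the listener acts as a task
        setTimeout(function () {
            resolve(value.toUpperCase());
        }, 10);
    });
};

// Resolves with "MY-PAYLOAD" once both listeners have run in sequence
var chainResult = runTransformChain([addPrefix, upperLater], "payload", {prefix: "my-"});
```

Note that the choice between a plain value and a promise is only discovered at runtime in this style, which is one source of the per-invocation cost listed in the summary that follows.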

Summary on Chains

In Favour:

  • Can incorporate asynchronous processes
  • Built-in, native semantics for cancellation

Against:

  • Rigid signature requirements in current implementation - natural model of unary functions with one return value extended by placing invocation-static material in 2nd argument
  • Dataflow is thus also hardwired - each element can only consume values output by the immediate predecessor
  • Current implementation reuses syntax from event/listener system and is not a first-class framework primitive (must be fired manually and can't form part of ginger world)
  • Expensive datastructures are definitely created on each invocation and cannot be optimised away

Towards a Generalised Feature

The "transforming promise chain" feature seems to point the way, especially when situated in the current discussion, to a single, generalised feature which serves all of these use cases as well as some others. Some implementation hazards/considerations in the way of such a feature are:

  • When acting to replace an invoker, the performance impact of the new feature should be negligible. We worked hard during the FLUID-5249 work to bring invoker performance close to the limits of what is possible in pure JavaScript (without code generation) and we could not accept any degradation from this level. This implies that we must be somewhat stricter in drawing up a "type system" for the configuration, and should not allow, for example, a listener to make its choice as to whether to return a promise or a plain value at runtime as with the "transforming promise chain" implementation - this should instead be signalled by some static configuration.
  • We expect to construct new forms of context reference in order to express the argument and result dependence of each of the listeners. They should be able to nominate either the arguments or the return value of any other listener by means of a syntax such as, say, {element}.<name>.args, in addition to the ability to make use of the special name previous in order to refer to their immediate predecessor in the sequence if this is meaningful. We would adopt the CLOS-like conventions that any listener that made no such declaration would receive the argument list of its predecessor (in the "chain" form) or of the first element (in the "event" form), and would pass on the return value of its predecessor by default.
  • This implementation promises to become the direct implementation of "megapayloadism". Given that the data dependence of each listener could be inferred statically, we could then execute arbitrarily parallel graphs in order to build up a megapayload, rather than being limited to the strict sequence algorithm implied by the default definitions in which each listener implicitly referred to previous. This then removes the clunky hardwiring in fluid.promise.fireTransformEvent that restricts us to purely sequential dataflows (we would be freed to express dataflows outside the convex hull of sequence and pipeline).
  • Our requirement for a "type system" in point 1 also suggests that we should explicitly characterise plain values within the pipeline - rather than clunkily expecting the user to express these as the confection of fluid.identity(value). This then promises to be a generalised replacement for the members directive resulting in data, etc.
  • The final "grand unification" plots the interaction of this system with our model idiom. Given our initial megapayloadic model, we would cast this scheme as building up a payload as a whole considered immutable - that is, with the same idiom as general framework configuration, which is that an elaborated value should be considered "immutable once first observed". However, there's no reason why this system couldn't be generalised to support periodic/repeating activities resulting in a "model-semanticed area".
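
The idea that each listener's data dependence can be read statically from its references can be sketched with a small evaluator. Everything here is hypothetical: the function operateElement, the registry of global functions and the demo.* names are invented for illustration, and only two reference forms are recognised, {element}.top.args.<n> and {element}.<name>.return. The sketch evaluates on demand and sequentially; the point is that the same references would tell a scheduler which listeners are independent and could therefore run in parallel.

```javascript
// Hypothetical mini-evaluator for two of the reference forms discussed above
function operateElement(definition, registry, topArgs) {
    var returns = Object.create(null); // memoised return value per listener
    var inProgress = Object.create(null); // guards against circular references

    function resolve(ref) {
        var topArg = /^\{element\}\.top\.args\.(\d+)$/.exec(ref);
        if (topArg) {
            return topArgs[Number(topArg[1])];
        }
        var returned = /^\{element\}\.(\w+)\.return$/.exec(ref);
        return returned ? evaluate(returned[1]) : ref; // anything else is a plain value
    }

    function evaluate(name) {
        if (!(name in returns)) {
            if (inProgress[name]) {
                throw new Error("Circular reference through " + name);
            }
            inProgress[name] = true;
            var record = definition[name];
            var args = record.args.map(resolve);
            returns[name] = registry[record.funcName].apply(null, args);
        }
        return returns[name];
    }

    return resolve(definition.top["return"]);
}

var registry = {
    "demo.parse": function (text) { return Number(text); },
    "demo.multiply": function (a, b) { return a * b; }
};

// width and height do not depend on each other; area depends on both
var definition = {
    width: {funcName: "demo.parse", args: ["{element}.top.args.0"]},
    height: {funcName: "demo.parse", args: ["{element}.top.args.1"]},
    area: {funcName: "demo.multiply", args: ["{element}.width.return", "{element}.height.return"]},
    top: {"return": "{element}.area.return"}
};

var area = operateElement(definition, registry, ["3", "4"]); // 12
```

This dataflow is neither a pure sequence nor a pure pipeline, since area consumes the outputs of two listeners, neither of which consumes the other's.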


Discussion on the nature of feature evolution and learnability

An important topic surfaced during the discussion related to the user experience of users both new to the framework and those experienced with it. Presenting new users with an over-generalised solution risks having them lose their bearings, whilst making excessive efforts to preserve forms of configuration accepted by previous versions of the framework risks increasing their learning burden, requiring them to internalise the complete history of the system (as seen with users of C++) before they can become proficient with it. An excellent example of "generalised new features for old" can nonetheless be seen in the evolution of the C++ class primitive out of the former C primitive of a struct. This put old features and new features in parity - in that it could be said that "a struct is retrospectively interpreted as a class in which the definition of all the members has implicitly been prefixed with the keyword public:".

This would be one useful way to cast the old features in terms of the new - that is, by considering them as "configurational sugar" for longer definitions with some elements retrospectively considered implicit being interpolated. The question then remains whether the old constructs should be deprecated, or else retained as "learning aids" to the framework.

What can this new feature be called?

"elements"?

Some Sketches for Syntax

We should be able to model all of the old primitives described in terms of the new, unified primitive. Given our discussion on learnability, it's not clear whether we would abolish the old primitives or not - but we should design the new syntax in such a way that there is sufficient static information that will allow any instances of the new primitive to be optimised to use no more resources than the old. The use of the old primitives might then become a matter of convention - that is, they represent a more compact encoding of the user's intention in a particular situation, since they would represent a different choice of defaults - but all of the full syntax would be available in each location.

We have choices as to whether to expose more context names to give a more natural fit where these can't correspond to concretely manifest data (e.g. previous) or else to reduce our intrusion on the limited resource of resolvable context names.

Invoker as Element

Original invoker definition:

invokers: {
    anInvoker: {
        args: ["{arguments}.1", "{exterior}.value"],
        funcName: "myNamespace.funcName"
    }
}

Represented as element:

elements: {
    anInvoker: {
        body: {
            args: ["{element}.top.args.1", "{exterior}.value"],
            funcName: "myNamespace.funcName"
        },
        top: {
            return: "{element}.body.return"
        }
    }
}

We reserve two namespaces, body representing the (normally single) listener forming the method invocation, and top representing the element's dataflow as a whole (that is, its input arguments and return value).
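
Treating the old invoker syntax as "configurational sugar" for the element form can be sketched as a mechanical rewrite. The function invokerToElement is hypothetical and exists only to illustrate the correspondence between the two definitions above; it assumes that every {arguments} reference maps directly onto {element}.top.args.

```javascript
// Hypothetical desugaring: rewrites an old-style invoker record into the
// element form sketched above. This is not framework API.
function invokerToElement(invokerRecord) {
    var args = (invokerRecord.args || []).map(function (arg) {
        // Only string references are rewritten, other values pass through
        return typeof arg === "string" ?
            arg.replace(/\{arguments\}/g, "{element}.top.args") : arg;
    });
    return {
        body: {
            args: args,
            funcName: invokerRecord.funcName
        },
        top: {
            "return": "{element}.body.return"
        }
    };
}

var element = invokerToElement({
    args: ["{arguments}.1", "{exterior}.value"],
    funcName: "myNamespace.funcName"
});
// element.body.args is ["{element}.top.args.1", "{exterior}.value"]
```

Because the rewrite is purely structural, an implementation could recognise the single-body shape statically and dispatch it as cheaply as a current invoker.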

Event as Element

Original event definition:

events: {
    anEvent: null
},
listeners: {
    "anEvent.aNamespace": {
        funcName: "myNamespace.funcName"
    }
}
   

Represented as element:

elements: {
    anEvent: {
        aNamespace: {
            args: "{element}.top.args",
            funcName: "myNamespace.funcName"
        }
    }
}


Improvements:

We get rid of the clumsy split of events/listeners being declared in separate blocks.
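
The merging of the two blocks can be sketched as a regrouping of listener records under the event they belong to. The function eventsToElements is hypothetical, and assumes that every listener key takes the form <event>.<namespace>; for simplicity it rejects listeners which have no namespace, and defaults args to the element's whole argument list.

```javascript
// Hypothetical regrouping of separate events/listeners blocks into a
// single elements block. This is not framework API.
function eventsToElements(options) {
    var elements = {};
    Object.keys(options.events || {}).forEach(function (eventName) {
        elements[eventName] = {};
    });
    Object.keys(options.listeners || {}).forEach(function (key) {
        var dot = key.indexOf(".");
        if (dot === -1) {
            throw new Error("Listener " + key + " has no namespace");
        }
        var eventName = key.substring(0, dot);
        var namespace = key.substring(dot + 1);
        var record = options.listeners[key];
        elements[eventName] = elements[eventName] || {};
        elements[eventName][namespace] = {
            // By default each listener receives the element's own arguments
            args: record.args || "{element}.top.args",
            funcName: record.funcName
        };
    });
    return elements;
}

var elements = eventsToElements({
    events: {
        anEvent: null
    },
    listeners: {
        "anEvent.aNamespace": {
            funcName: "myNamespace.funcName"
        }
    }
});
// elements.anEvent.aNamespace now holds the listener record
```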

Hazards:

We've got several hazards stemming from the traditional packaging of events as specially treated top-level elements.

  • The definition within the top-level events option also gives rise to a top-level definition within events in the resulting component
  • The procedural API (addListener, removeListener, fire) creates both a support burden and a garbage burden - we are forced to create some kind of closure to close over the internal state required for this support
    • Although we note that the existing addListener/removeListener API is actually sufficient for configuration of complex model elements, and that actually operating them will always require some kind of external machine such as fireTransformEvent etc
  • The "unnamed namespace" can't really be supported cleanly. We had discouraged users from putting things there anyway
    • Note that FLUID-5948 describes a significant problem with the design of namespaces throughout the system - which has to be resolved as part of the route to normalising the "unnamed namespace" 

Chain as Element

Original chain definition:

events: {
    aChain: null
},
listeners: {
    "aChain.firstNamespace": {
        funcName: "myNamespace.firstFuncName"
    },
    "aChain.secondNamespace": {
        priority: "after:firstNamespace",
        funcName: "myNamespace.secondFuncName" // This function actually returns a promise
    }
}

Represented as element:

elements: {
    aChain: {
        firstNamespace: {
            funcName: "myNamespace.firstFuncName",
            args: ["{element}.previous.args.0", "{element}.top.args.0"]
        },
        secondNamespace: {
            taskName: "myNamespace.secondFuncName",
            args: ["{element}.previous.args.0", "{element}.top.args.0"]
        },
        top: {
            return: "{element}.final.return"
        }
    }
} 

We need to reserve two more special namespaces, previous and final in order to replicate the original functionality.


Further Use Cases

The original inspiration for this feature emerged whilst contemplating the markup generation pipeline for the fluid authoring/debugging system, which still needs to be implemented without anything in the way of a "renderer". Written out by hand, there are various functions which need to generate pieces of markup, then integrate them into wider assemblages of markup, and then inject them into various places in the document.

This original use case suggests that we have a further space of requirements, which is also suggested by the fireTransformEvent workflow that we implemented for promise chains - that of subsetting the workflow at execution time. In that API, this makes use of the somewhat cumbersome filterNamespaces option, which allows just some elements of the sequence to be selected for execution. This is not a very "open" API - in that its cost scales with the number of authors. More natural would be the ability to select custom "input" and "output" points for the workflow, and expect that the system would arrange that all points between them would be executed. This would handle our use cases within Kettle dataSources as well as with markup generation.

However, it's unclear how this could be fitted into a system which allowed completely free signatures. The "options" system for fireTransformEvent provides a clear place to put such per-request policy information. The way out of this is perhaps to allow the entire input argument set to be "wrapped" by some routes and not others. event.fire would be retained as a native entry point that allowed no space for policy, whereas we would then support a fuller element.operate(args, options) API that left space for it in arg 2. The declarative form of invocation would then support options at every site where invocation was possible, in addition to args. At the cost of yet more garbage, we could accommodate this in our existing recordToApplicable utility.
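
Selection by "input" and "output" points can be sketched as follows. The function selectRange and the option names from and to are hypothetical, as are the sample namespaces; the sketch assumes that the listeners have already been sorted into their final sequence.

```javascript
// Hypothetical sketch: `sequence` is the already-sorted list of listener
// namespaces, `options.from` and `options.to` are optional entry/exit points
function selectRange(sequence, options) {
    var from = options.from === undefined ? 0 : sequence.indexOf(options.from);
    var to = options.to === undefined ? sequence.length - 1 : sequence.indexOf(options.to);
    if (from === -1 || to === -1 || from > to) {
        throw new Error("Invalid input/output points");
    }
    return sequence.slice(from, to + 1); // every point between them, inclusive
}

var sequence = ["fetch", "parse", "validate", "render", "inject"];
var selected = selectRange(sequence, {from: "parse", to: "render"});
// selected is ["parse", "validate", "render"]
```

The caller names only two points, however many authors have contributed listeners between them, whereas a filter which enumerates namespaces must be revised whenever a new listener is interposed.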

Parallels with model relay definitions

There's a close analogy with the Model Relay definitions that can be attached to model material within our new ChangeApplier model scheme. These express dataflow elements where data is ingested from a named path within a document, undergoes some processing, and is output to another path.

The contrasts are:

  • An element operates only once (per cycle), and in one direction - whereas a model relay element may operate any number of times (during a transaction) and possibly in both directions
  • The output path for an element is derived from its namespace, and nested within a further segment named return, whereas the output of a model relay is configured via target and can be freely ranged over the model material. Whilst model relay rules can, like elements, freely range over the document for their input material (via IoC references within the body of the transform), they traditionally draw input from a distinguished path named source.
  • The "document" that an element operates on is considered immutable, to the extent that it is "write once" - similar to options material, once a value has been observed, its value cannot be changed. In contrast, model documents can be mutated indefinitely
  • The signature to element listeners is free, whereas model relay elements are Model Transformation transforms. Whilst free signatures are a possibility (via fluid.transforms.free) these need to be treated specially. There is an ontology of metadata applied to transforms, whereas element listeners are all arbitrary, uninterpreted functions
  • There is an explicit concept of sequence applied to elements - giving rise to the idea of special context references such as previous and final - all the listeners in an element are sorted into a single, stable sequence before they are operated, whereas model relay rules operate in a free graph, triggered by model invalidation
  • Very cheap examples of elements (the analogues of invokers) could proceed in a way which is allocation free, whereas dealing with a model area via relay rules will invariably produce a lot of garbage

That said, there are routes through which we could close up the gap between these features.

  • As suggested above by the notes on megapayloadism, we could allow the sequence order for elements to be drawn up in a dataflow-sensitive way, as well as simply driven by the sequence induced by namespace-directed priority constraints. This eases the distinction between "namespaces" and "paths". Presently, an element outputs to a path derived from its namespace. As with model listeners, we could support a separate "path" entry which is "traditionally multiplexed" with the namespace, but can be configured separately.
    • Again note FLUID-5948 which will be needed to reform the role of namespaces, especially our current support for the "unnamed namespace"
  • We could reduce the costs of component construction such that we could make a more direct analogy between "a component construction" and "an element firing". Traditionally, after an element fires, its "model area" is wholly lost, unless it is exported to its exterior via a top.return directive - whereas the model area of a component is persistent. However, we could make an analogy between an element firing and a very short-lived dynamic component