Notes on Modularisation of Infusion

(Original discussion as of July 2014)

The current packaging of Infusion is rather monolithic and old-fashioned. Modern JS users require builds that vary along several axes:

i) A variety of different module loaders and contexts, e.g. the plain browser, node.js, require.js in the browser, AMD and non-AMD aware module loaders, CommonJS module loaders

ii) A variety of different elements inserted in the build - framework-only, a custom mixture of components, a custom pattern of 3rd-party dependencies, some of which may be in common (e.g. jQuery and jQuery UI etc.)

iii) A variety of different treatments (concat/minification, custom boilerplate; the latter feeds back into i)

 

"bower" is a particularly interesting current environment. Things in its favour are that it is extremely easy to get to grips with and has a completely static structure (a "module" is simply a git repository) and flat checkout structure. These same are the key points against it also - for example, it supports no plugin structure nor alternative technologies for repositories (for example, plain HTTP). It's quite possible it may cease to be "flavour of the month" before long and give way to a more sound solution covering several more of the deployment points and scenarios (especially more sane treatment of 3rd-party/multiply occurring dependencies) but the current proliferation of "crazy bananas" module loaders shows no signs of bringing this about very soon.

As well as considering the use of bower, we considered strategies for modularising "Infusion" or "an Infusion artefact" and how this modularisation should interact with our current git repository structure. 

 

 

Proposals:
1a:
    i) Fully modularize Infusion at the level of git repositories
    ii) Use bower to express dependencies amongst modules that were formerly known as Infusion
    iii) Provide a Grunt plugin that helps users produce a manageable single-file Infusion build from their various bower dependencies
    
1b:
    i) Fully modularize Infusion at the level of git repositories
    ii) Use bower to express dependencies amongst modules that were formerly known as Infusion
    iii) Commit the build artifacts for each module into its Git repository
    
2a:
    i) Don't modularize Infusion
    ii) Create a single repository, called infusion-bower containing "common builds"
        a) All of Infusion
        b) Framework-only
        c) UI Options
        ...
    iii) Come up with some kind of post-commit workflow that allows us to easily:
        a) produce the above-mentioned builds
        b) push them to the infusion-bower repository
2b:
    i) Modularize Infusion into:
        a) framework ("infusion")
        b) all the components ("infusion-ui")
        c) jqUnit + IoC testing framework
        ** What about circular dependencies?
    ii) Create bower repositories for:
        a) framework
        b) infusion-ui
    iii) Come up with some kind of post-commit workflow that allows us, for ii.a and ii.b to easily:
        a) produce appropriate builds
        b) push them to the -bower repositories
        ** This will likely involve a grunt plugin
        ** Minified builds should be ditched; minification should be the responsibility of the user of the builds
        ** How should we handle tags?
       
3: We could adopt a WACKY NAMING CONVENTION, together with some kind of scheme for automatically publishing artifacts which match a specification into a CDN
THAT IS, for each possible build profile of Infusion there corresponds a unique, easily and publicly determinable stable global name - a request to that URL either fetches a pre-built artifact cached there, or else triggers the build process in order to populate it
As in ---- ***NIXOS***!!
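
To make the naming idea in proposal 3 concrete, here is a minimal sketch of deriving a stable name from a build profile. The function name, the name format and the module names are all assumptions made for illustration; nothing like this exists in Infusion's build today. The only point it shows is that the name must depend on the selection itself and not on the order in which it was written down.

```javascript
"use strict";

// Hypothetical helper: maps a build profile (a set of module names plus
// options) onto one canonical artifact name. The format is an assumption.
function buildProfileName(modules, options) {
    const opts = Object.assign({ version: "x.y.z", minified: false }, options);
    // Deduplicate and sort, so that the same selection always yields the
    // same name whatever order the modules were listed in.
    const unique = Array.from(new Set(modules)).sort();
    const suffix = opts.minified ? ".min.js" : ".js";
    return "infusion-" + opts.version + "+" + unique.join("+") + suffix;
}

// Both calls print the same name: infusion-2.0.0+framework+uiOptions.js
console.log(buildProfileName(["uiOptions", "framework"], { version: "2.0.0" }));
console.log(buildProfileName(["framework", "uiOptions", "framework"], { version: "2.0.0" }));
```

A CDN front end could then treat this name as a cache key: serve the artifact if it is present, otherwise trigger the build and store the result under that name.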
Summary:
1. Bower isn't a sufficient package manager for large, modularized systems where a user is able to freely choose parts of the larger system to include in their application (e.g. the choice amongst a set of Infusion components, or jQuery UI widgets, or Angular modules, or whatever)
2. The model where modules are always directly tied to Git repositories is inappropriate because - 
   i) build artefacts are derived FROM material which is version managed, so it is not appropriate to version manage the artefacts themselves (a.k.a. it's noisy and annoying to have your build artifacts under version control)
   ii) there is no model for acquiring a part of a github repository either via git or its REST API - each user of a build system just wants ONE configuration of build artefact rather than having a collection of them pushed to them most of which they will ignore
      a) There are 2^N build combinations based on N binary choices - and in reality there is a very "long tail" of users, each of whom would force all the others to check out build artefacts only 1 of which any of them needs
3. Until either a suitable web package manager emerges or we introduce one ourselves (which we're not going to do any time soon), we prefer proposal #2 because it is the least disruptive to Infusion's repository status quo while enabling many users to quickly "bower install" and use Infusion, at the cost of an additional build/commit stage for committers to Infusion
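
The 2^N growth mentioned in point 2.ii.a can be seen with a small sketch. The choice names below are made up for illustration; each stands for one binary "include this or not" decision.

```javascript
"use strict";

// Enumerate every build that N binary choices give rise to, using the bits
// of a counter to decide which choices are switched on.
function enumerateBuilds(choices) {
    const builds = [];
    for (let mask = 0; mask < (1 << choices.length); mask++) {
        builds.push(choices.filter(function (choice, i) {
            return mask & (1 << i);
        }));
    }
    return builds;
}

// Hypothetical choices: 3 of them already give 2^3 = 8 distinct builds.
const builds = enumerateBuilds(["framework", "uiOptions", "jQueryUI"]);
console.log(builds.length); // prints 8
```

With ten such choices a repository of prebuilt artifacts would already hold 1024 files, which is why committing "all the builds" does not scale.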

THOUGHTS that were not captured in the above:

  • A big risk is our use of raw commit hashes in several venues, esp. in JIRAs. The traditional use of git filter-branch would destroy all of these. A possible route around this is to simply clone our infusion repo into infusion-framework and infusion-components, and simply git rm the mismatching files. Our repo has not become so large that this doubling would be prohibitive
  • Fully modularising Infusion into one repository per component would be prohibitively expensive. Our directory structure currently consolidates artefacts by type and not by component - e.g. all jQuery UI plugins are sourced from a central location, all tests are held in their own area, etc. Reorganising all of this would make matchwood of our git history, and would depend on really indomitable performance by our hypothetical build/module system in allowing us to consolidate dependencies which do indeed end up being shared.

We really don't release enough, and splitting off the framework might allow us to make more frequent releases of at least that part, given that its (internal) QA consists of just running its tests. However, a framework build can't be considered desirable unless it has been validated against most or all of our "live" users ("the components", the metadata tool, the video player, kettle, the GPII, etc.). This is another route to ease off our current recidivist reliance on git hashes and become a slightly more recognisable citizen of the npm semver-toting club.
 

Updates to this Discussion as of October 2016

The world has moved on somewhat since this original discussion. "bower" has largely become derelict, and we have a lot more experience of the really obstructive bureaucracy of working with multiple git repositories. The general movement, as per the increasingly popular "lerna" package - https://github.com/lerna/lerna, is to work with single, large git repositories ("monorepos") containing multiple npm packages. Interestingly we already had partial experience of this as part of a pattern we developed in the GPII's projects, but this pattern could be taken further than we did and supported with a bit of tooling.

After discovering the insufferable problems npm poses to those trying to work with scripts in nested packages as part of https://issues.fluidproject.org/browse/FLOE-484 we convened a small meeting to revisit our modularisation issues. Notes were taken at https://pad.gpii.net/p/infusion-module-discussion-oct-4-2016-okg4ntt which are wikified below:


Notes on Infusion Modularisation Meeting October 4 2016

Interestingly this ancient prior art doesn't include any of the options that we are now interested in - e.g. monorepos in the "lerna" style
--> Just as well we didn't decide to do anything about it at the time
The Issue
  • How do we source Infusion in a "third party" application?
  • concatting builds
  • moving "resources" (html, css, etc.) into the correct spot
  • building style sheets using Stylus
  • Alan's strategy:
  • Within a "third party" application, a grunt script invokes Infusion's grunt script in order to produce a custom build
  • Copy the custom build + (other assets?) into somewhere in the application's directory structure
  • Requires a "double npm install" in order to avoid being blasted by grunt's idiotic module loading strategy
  • Not sustainable in a wider context
  • WHAT ANTRANIG DID
  • Wrote a utility that will load a grunt plugin using npm's module resolution strategy—this avoids having to recursively call "npm install" on all the dependencies we want to build
  • What I would like to do, is to consume chartAuthoring as an npm dependency, and, in practice, run its demo application
  • Even if I don't actually want the results of the postinstall script (which I actually sort of do), as it stands, I cannot consume chartAuthoring as a dependency at all without its postinstall script concluding successfully - if it fails, current npm will just blow the module away completely
  • Alan's argument:
  • The postinstall script does nothing "to the contents of chartAuthoring itself" and so could be considered inessential to consuming it
  • Its purpose is to prepare a static file layout suitable for hosting the chartAuthoring demo "from the filesystem"
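
As a rough illustration of the kind of utility described under "WHAT ANTRANIG DID", the sketch below locates a grunt plugin through Node's own module resolution and hands its tasks directory to grunt. This is not the actual utility: the function name is hypothetical, and it assumes the plugin follows the usual convention of keeping its tasks in a "tasks" directory next to its package.json. The only grunt call used is grunt.loadTasks.

```javascript
"use strict";
const path = require("path");

// Hypothetical helper, not the utility referred to in the notes.
// `resolve` is injectable so that the lookup can be made relative to some
// other module; it defaults to this module's own require.resolve.
function loadTasksViaNodeResolution(grunt, pluginName, resolve) {
    const resolver = resolve || require.resolve;
    // Resolving the plugin's package.json finds its install directory
    // wherever npm happened to place it (nested or hoisted).
    const packageJson = resolver(pluginName + "/package.json");
    const tasksDir = path.join(path.dirname(packageJson), "tasks");
    grunt.loadTasks(tasksDir);
    return tasksDir;
}
```

In a Gruntfile this might be called as loadTasksViaNodeResolution(grunt, "grunt-contrib-concat"), in place of a second "npm install" inside the dependency.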

THINGS WE WILL TAKE AWAY FROM THIS:

    
    i) postinstall is the wrong place to put such material <-- leaves many important questions unanswered
    ii) prepublish is probably a better time - this is problematic for Infusion with its numerous build configurations

    -> Does nothing for people who want to consume a module as a git dependency

       -> This could be fixed up via a big "index.js" at the root of the monorepo that forcibly loads all of the dependencies and thus registers all of their base paths into the fluid module registry - even if they ordinarily wouldn't be resolvable via npm

        -> This would require further special intelligence inside fluid.require such that it would know that, say, an attempt to require("inlineEdit") needs to be converted first into an attempt to load infusion (ability to pre-inspect its package.json - this implies we need some kind of pre-publish step which registers all submodules into the package.json of, say, Infusion, before it gets published)

    -> However, note that a "monorepo" style will ALSO make it essentially impossible to consume a module as a git dependency: even if it comes with a suitable index.js which loads EVERYTHING, the module as a whole has a different identity to any of its contents, and so switching between the use of npm and git dependencies will require changes at the code level (for as long as we continue to use plain "require").
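
The "big index.js" idea above can be sketched as follows. Everything here is a stand-in: the registry object, the function names and the "submodules" map in package.json are hypothetical and are not Infusion's actual module registry or fluid.require. The sketch only shows the shape of the mechanism, in which a pre-publish step records submodule locations and the root index.js registers them all.

```javascript
"use strict";
const path = require("path");

// Hypothetical stand-in for the module registry: module name -> base directory.
const registry = Object.create(null);

function registerModule(name, baseDir) {
    registry[name] = baseDir;
}

// What a root index.js might do: register the monorepo itself, then every
// submodule listed in an (assumed) "submodules" map written into its
// package.json by a pre-publish step.
function registerMonorepo(rootName, rootDir, packageJson) {
    registerModule(rootName, rootDir);
    const submodules = packageJson.submodules || {};
    Object.keys(submodules).forEach(function (name) {
        registerModule(name, path.join(rootDir, submodules[name]));
    });
}

// Look a module up by name, whether or not npm could have resolved it.
function resolveModule(name) {
    if (!(name in registry)) {
        throw new Error("Module " + name + " is not registered");
    }
    return registry[name];
}

registerMonorepo("infusion", "/repo/infusion", {
    submodules: { inlineEdit: "packages/inlineEdit" }
});
console.log(resolveModule("inlineEdit"));
```

Note that this does nothing about the identity problem raised in the last point: code that asks for "inlineEdit" still has to know that it lives inside "infusion".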

SOME GIZMOS WE COULD IMPLEMENT TO FIND A WAY OUT

a) (FOR APPLICATIONS?) Cease to rely on "the filesystem" morally as a deployment area - in practice we do this anyway because of the strict same-origin policy (SOP) that Chrome etc. apply to apps hosted from the filesystem
--> Instead produce a self-contained kettleish hoster that allows the use of material in <head> such as 
<script src="%infusion/src/framework/js/Fluid.js">
etc.
This could be run in two modes - either i) a "live" mode which does such things in place
OR/AND ii) a "publish" mode which runs a build step which produces a static site which allows such a thing to be taken away (in effect, a kind of "browserify lite")
b) A kind of "lerna lite" which is an improvement to our existing fluid-publish plugin that allows it to work with the module layout that we currently have in the GPII (swapping out the path "gpii" for "packages")
--> This would be greatly simplified because we would bypass all the complex "bootstrapping" workflow and simply allow the inbuilt use of "node_modules" relative resolution to allow each of the submodules to mutually find each other where they have mutual dependencies
--> The hard problems are the ones which lerna doesn't really answer, which are - HOW do we assign version numbers to things? Lerna has two policies, a "fixed" one where all packages update in lockstep and an "independent" one where each module can declare its own version. Neither is really a great universal solution, particularly for Infusion, which contains a heterogeneous mixture of things, some of which might want to be versioned together (parts of the framework) and some of which might want to be versioned separately (the components).
c) Move over where we can to a prepublish/dist model
iii) for Alan's case, this is actually a DEPLOYMENT STEP
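
For gizmo a), the "publish" mode could amount to little more than rewriting module-relative references into static paths. The sketch below is an assumption about how that might look, not an existing tool: the function name and the URL prefix "/lib/infusion" are invented, and a real hoster would also need to copy the referenced files into place.

```javascript
"use strict";

// Hypothetical mapping from module name to the URL prefix it is published under.
const publishedRoots = { infusion: "/lib/infusion" };

// Rewrite src/href attributes of the form "%module/rest/of/path" into
// plain paths under the module's published root.
function rewriteModuleRefs(html, roots) {
    return html.replace(/(src|href)="%([^\/"]+)\/([^"]*)"/g, function (match, attr, name, rest) {
        if (!Object.prototype.hasOwnProperty.call(roots, name)) {
            return match; // leave references to unknown modules untouched
        }
        return attr + "=\"" + roots[name] + "/" + rest + "\"";
    });
}

const page = "<script src=\"%infusion/src/framework/js/Fluid.js\"></script>";
console.log(rewriteModuleRefs(page, publishedRoots));
// prints <script src="/lib/infusion/src/framework/js/Fluid.js"></script>
```

The "live" mode would apply the same mapping at request time instead of at build time.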
Chart Authoring's current workflow to use Infusion:
  • npm install Infusion A SECOND TIME (installs Infusion's dev dependencies, to enable the following grunt build script)
  • run Infusion's grunt build script
  • copy out the build directory structure

Immediate Actions (as of November 2016)