General motivation

...

This shows how "half-assed" this support for global installation really is. The mess is compounded by the fact that npm's location for global modules and CLI tools is subtly different from the locations node itself searches, operated by the require.paths property. The node docs section "Loading from the global folders" explains that, for example, node will look in locations such as $HOME/.node_modules. The 3rd option, $PREFIX/lib/node, closely resembles the location used by npm, but does not actually agree with it - see npm's documentation on npm folders, which explains that global installs on Unix systems go to {prefix}/lib/node_modules, whereas global installs on Windows go to {prefix}/node_modules (that is, no lib folder).
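The mismatch can be made concrete with a small sketch. The function names below are hypothetical helpers written for this page, not part of node or npm; the paths themselves are the ones given in the node and npm documentation referred to above.

```javascript
const path = require("path");

// npm's documented global install directory for a given prefix.
// The platform argument uses node's process.platform values ("win32", "linux", ...).
function npmGlobalModulesDir(prefix, platform) {
    return platform === "win32"
        ? path.win32.join(prefix, "node_modules")          // no "lib" folder on Windows
        : path.posix.join(prefix, "lib", "node_modules");
}

// node's "global folders" as listed in its module documentation (Unix-style paths).
function nodeGlobalFolders(home, prefix) {
    return [
        path.posix.join(home, ".node_modules"),
        path.posix.join(home, ".node_libraries"),
        path.posix.join(prefix, "lib", "node")             // close to npm's folder, but not equal
    ];
}
```

For the same prefix, none of the folders node searches coincides with the folder npm installs into, which is the disagreement described above.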

One of the core use cases we mention above, speeding development when a deeply nested, multiply shared dependency is being worked on, is handled by an npm function known as npm link. A posting on the node.js blog (6/4/2011) explains how this feature came about and was properly implemented for npm 1.0: link. The thinking does appear to address a very closely related problem - however, there are two serious issues:

  • No support is possible for this feature under Windows (except through the prohibitive route of compiling your node.exe under Cygwin), since the feature relies on symlinks
  • The installation of the modules is indeed global - the developer is guided into a globally stateful workflow and must remember to "unlink" their package, otherwise work in every other package will be corrupted. Ridiculously, this still needs to be done manually - see "npm unlink does not unlink from global path"

The shell of a plan

The things which are shared, and the things which are not shared, need to be considered very carefully. Our use case is clearly wider than the one acknowledged by "liftoff" - it is not just CLI tools themselves, but the entire build and load chain, which needs to be able to establish agreement on what modules are loadable from where. A core use case comes from grunt plugins themselves, and from our replacement for npm - let's call it ipm. We need to identify the barest minimum of the "module bootstrap loader" which receives this treatment. It clearly makes no sense to put all of infusion, or all of any sizable project, into a shared module area - but nonetheless there should be something in that area which quickly and authoritatively determines, given any invocation point, which particular instance of infusion should service the current project, and ensures that there is exactly one of it.

We need the concept of "workspaces" (or "multi-projects", or some other hackneyed and disagreeable term): that is, a "space" of related projects which have been checked out/installed and have agreed to cooperate on consolidating dependencies and module resolution. This space of projects will be guaranteed to have exactly one copy of infusion (and any other cooperating dependencies) resolvable within it, and in addition, to have never even attempted to check out a duplicate version of such a dependency. Tasks like "npm dedupe" and the constant harping about how disk space and network bandwidth are cheap notwithstanding, npm installs are now extremely slow and wasteful and only getting more so. A checkout of express 4.0 includes 7 copies of "bluebird" and 2 separate instances of all of selenium in its 80MB project tree. "lodash" may be very small, but guaranteeing a copy of it in every project, because grunt depends on it and grunt insists on being separately installed in every npm module, is a route to insanity. The idea of "locally installed tools" makes sense but we need much more control over what the scope of "local" actually is - it can't simply be "the smallest unit that our package manager knows how to manage".

Such a workspace will be delimited in the filesystem by a file named "ipmRoot.json" (say) held at its root. This will serve a variety of functions:

  • Providing clearer delimitation of the space searched through for modules - the "keep looking upwards for node_modules" rule is easily prone to accidental leakage
  • Providing a portable and easily readable place for dumping "redirect rules" of the kind morally operated by npm link - this makes it easy to continue, for example, working with a given checkout of a project (say, one's "main checkout of infusion") just in a particular space, whilst leaving other spaces unaffected
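The two functions listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the "redirects" key inside ipmRoot.json, the fallback to a node_modules folder at the workspace root, and both function names are invented here and are not a settled design.

```javascript
const fs = require("fs");
const path = require("path");

// Walk upwards from startDir until a directory holding the marker file is found.
// Returns the workspace root, or null if no marker exists above startDir.
function findWorkspaceRoot(startDir, markerName = "ipmRoot.json") {
    let dir = path.resolve(startDir);
    while (true) {
        if (fs.existsSync(path.join(dir, markerName))) {
            return dir;
        }
        const parent = path.dirname(dir);
        if (parent === dir) {
            return null; // reached the filesystem root without finding a marker
        }
        dir = parent;
    }
}

// Resolve a module name to a physical directory inside the enclosing workspace.
// ASSUMPTION: ipmRoot.json may hold { "redirects": { "<module>": "<relative path>" } }.
// ASSUMPTION: modules without a redirect live in <root>/node_modules/<module>.
function resolveInWorkspace(moduleName, startDir) {
    const root = findWorkspaceRoot(startDir);
    if (root === null) {
        throw new Error("No ipmRoot.json found at or above " + startDir);
    }
    const config = JSON.parse(fs.readFileSync(path.join(root, "ipmRoot.json"), "utf8"));
    const redirect = (config.redirects || {})[moduleName];
    return redirect
        ? path.resolve(root, redirect)
        : path.join(root, "node_modules", moduleName);
}
```

Because the search stops at the first marker file, nothing above the workspace root can leak into resolution, and a redirect recorded in one workspace has no effect on any other.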

We can defer to require <physical path> for our physical mechanism for loading a particular module within node.js - however, we can't defer to npm install <git URL, etc.> for our mechanism for installing an individual dependency of one package - because npm will still continue to cascade endlessly through derived dependencies of that package before we can stop it. The package.json packaging scheme however is good, as is the registry itself, and these are both pieces of infrastructure we can easily reuse with good profit.

...

  1. fluid's global is just straightforwardly advertised as fluid.global - allowing access to all such shared dependencies to be bootstrapped from any participating library
  2. it stores a registry of module loaders associated with each (npm) project name. fluid.resolveModulePath("universal/src/testData") etc. should then be capable of getting filesystem paths into any such dependency for the purpose of resolving resources. cf. ancient prior art in the form of Java's ClassLoader.getResourceAsStream().

For point 2 - this hopefully eliminates the need for "kettleModuleLoader.js" - since we will just use the utility to rebase file paths rather than needing to export "require".
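As a rough illustration of point 2, path rebasing against a registry of module base directories might look like the sketch below. It is written as standalone functions rather than as members of fluid; moduleRegistry and registerModule are hypothetical names, and nothing here reflects an actual Infusion implementation.

```javascript
const path = require("path");

// Hypothetical registry: npm project name -> absolute base directory of that module.
const moduleRegistry = Object.create(null);

function registerModule(name, baseDir) {
    moduleRegistry[name] = path.resolve(baseDir);
}

// Rebase "moduleName/rest/of/path" onto the registered base directory of moduleName.
function resolveModulePath(modulePath) {
    const slash = modulePath.indexOf("/");
    const name = slash === -1 ? modulePath : modulePath.slice(0, slash);
    const rest = slash === -1 ? "" : modulePath.slice(slash + 1);
    const baseDir = moduleRegistry[name];
    if (baseDir === undefined) {
        throw new Error("Module " + name + " has not been registered");
    }
    return path.join(baseDir, rest);
}
```

A module would call registerModule once as it loads (for example with its own __dirname), after which any other module can resolve resources inside it without being handed its "require".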

However, the actual use of fluid.require will become unnecessary - except for gaining access to "fluid" itself - since the entire purpose of the module loader is to cause all dependencies to be located and installed automatically. As we have been saying for a while, we will initially go with a "load everything loadable" model, and then over time we can think about more sophisticated schemes for only loading files holding defaults blocks and/or global functions that have actually been referenced. "load everything loadable" needs to be qualified by platform and/or other environmental dependencies though. This militates towards using a separate marker file (ipmPackage.json?) rather than stuffing things into package.json - which has only a few hard-wired kinds of env dependencies such as os, cpu, etc.
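To make the "qualified by platform" idea a little more tangible, here is a minimal sketch of filtering a list of loadable files by environment. The entry shape ({ file, platforms }) is purely an assumed format for a marker file such as ipmPackage.json; no such schema has been defined.

```javascript
// ASSUMPTION: each entry names a file and, optionally, the platforms it applies to,
// using node's process.platform values. Entries without "platforms" load everywhere.
function selectLoadable(entries, env) {
    return entries
        .filter(entry => !entry.platforms || entry.platforms.includes(env.platform))
        .map(entry => entry.file);
}

// Example: "load everything loadable" on the current platform.
const loadable = selectLoadable([
    { file: "src/core.js" },
    { file: "src/windowsOnly.js", platforms: ["win32"] },
    { file: "src/unixOnly.js", platforms: ["linux", "darwin"] }
], { platform: process.platform });
```

Other environmental conditions could be added as further optional keys on each entry, which is the kind of open-ended qualification that package.json's hard-wired os and cpu fields do not offer.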

The "bare minimum module loader" will be small, for two reasons:

  • Firstly, to ensure it represents as stable a commitment as possible to something which is globally shared
  • Secondly, to minimise its impact should it end up unfortunately being installed multiple times locally

We would hope that it just consists of a couple of dozen lines of code.

...