Some ideas about dependency graph generation
============================================

Ok, here's the new version.

Each variant is represented by a number of nodes in the dependency
graph. These nodes are called the "history" of the variant in the
graph. Variants can have different forms of histories, reflecting
different processes that happen during the resolution of the
dependency graph.

For instance, a variant can have an "install" history if it isn't
available in the beginning, but will be pulled into the system. It
can have an "available" history if it is available and will remain
so. It can have a "remove" history if it is available but will be
removed.

An "upgrade" from one variant to another is, in essence, a combination
of a "remove" history for the old variant with an "install" history
for the new variant.

One design principle is that problems should appear as cycles in the
dependency graph. In the end, we can check if the graph is cycle-free
and if it is, we can go ahead and perform the resulting actions.

Aha, actions. Actions such as "fetching", "building", "merging",
"unmerging", are attached to some of the nodes.


= Representation =

For easy access, we should retain the histories that are stored in the
graph.

> histories :: Map PS History
>
> History = Install Version
>         | Available Version
>         | Remove Version
>         | Upgrade Version Version  -- can also be used for downgrades

We should also have a direct map to look up nodes.

> nodes :: Map Node Int
>
> Node = (Action, PVS)  -- or PV?


= Actions =

In principle, we require the following sorts of nodes:

Built: a variant is built
Available: a variant is available for use
Removed: a variant is no longer on the system

Possibly, we require Pre-Available to resolve PDEPEND cycles.

In addition, we require the following two meta-nodes:

Top: This is the goal.
Bottom: This is what's present before we start.


= "Install" history =

An install history looks as follows


   /- X.R ---\
   |         |
   v         v
  Top ----> X.A ----> X.B ----> Bot

Explanation:

X.B --> Bot: The package can only be built if we have a consistent
             status quo.
X.A --> X.B: The package cannot be used before it isn't built.
Top --> X.A: We *want* X on the system in the end.
X.R --> Top: If we have Top --> X.R somehow, this is a desired
             conflict.
X.R --> X.A: We cannot remove X before it isn't available.


= "Available" history =

   /- X.R ---\
   |         |
   v         v
  Top ----> X.A <---- Bot
           
            X.B <---- Bot

Bot --> X.A: The package is available before we start.
Bot --> X.B: Dito for building.
Top --> X.A: We want X on the system in the end.
X.R --> Top: If we have Top --> X.R somehow, this is a desired
             conflict.
X.R --> X.A: We cannot remove X before it isn't available.


= "Remove" history =

  Top ----> X.R ----> X.A <---- Bot
                      
                      X.B <---- Bot

Bot --> X.A: The package is available before we start.
Bot --> X.B: Dito for building.
X.R --> X.A: We cannot remove X before is isn't available.
Top --> X.R: We want X removed in the end.


= Dependencies =

For X DEPENDs on Y, add X.B --> Y.A, and Y.R --> X.B.
For X RDEPENDs on Y, add X.A --> Y.A, and Y.R --> X.R.

PDEPENDs necessarily create cycles.
If X PDEPENDS on Y, but Y DEPENDs on X (which it usually
does), then Y cannot assume that X is completely usable before
Y is installed. After all, for X to be completely usable Y
would have to be installed already.

For X PDEPENDs on Y, add X.A --> Y.A, and Y.R --> X.R.

So, a typical situation is that of ghc and cabal:

ghc PDEPENDs on cabal
cabal DEPENDs on ghc
cabal RDEPENDs on ghc

The idea is that cabal's dependencies should not point to ghc's Available
state, but to something intermediate.

If a cycle contains a path

X.? -(R)DEPEND-> Y.A -PDEPEND-> Z.A

we change it to

X.? -(R)DEPEND-> Y.P
Y.A -PDEPEND-> Z.A
Y.A -META-> Y.P

Additionally, Y.A -META-> Y.P

This will resolve this particular cycle.
For RDEPEND/PDEPENDs, there's also a cycle between X.R and Y.R.
Since a PDEPEND is installed after the originating package, we want
it removed first, too, so we change

Y.R -(R)DEPEND-> X.R -PDEPEND-> Y.R

to

X.R -PDEPEND-> Y.R

= Blockers =

For X DEPENDs on !Y, add Y.B --> X.B.
For X RDEPENDs on !Y, add Y.B --> X.R.


Are these sufficient?

Yes, I think so: if X is installed, and Y available, and we have
X RDEPEND !Y, we get the cycle:

  Top --> X.A --> X.B --> Bot --> Y.B ==> X.R --> Top

Nice, huh?

What if X is installed, Y is available, and we have
X DEPEND !Y?

  Top --> X.A --> X.B --> Bot --> Y.B ==> X.B

Still a conflict, but we could do better (remove Y, but
not without reverse dependency handling).

What if X is installed, Y is installed, and we have
X DEPEND !Y?

  Top --> X.A --> X.B --> Bot
   |               ^       ^
   |               |       |
   \----> Y.A --> Y.B -----/

Ok, but X must be built before Y.


= Histories vs. dependencies =

If we take dependencies into account, histories can be simplified.
Essentially, there are two histories only:

~ "Available" history (for packages that are available initially)

  X.R ----> X.A <---- Bot
             =
            X.B


~ "Install" history

  X.R ----> X.A ----> X.B ----> Bot


[2]
TOP

There is a special node called TOP. If a package X is to be installed
on the system, there should be a direct edge from TOP to X.Available.

If we do not want something to happen, we can depend on TOP. For instance,
if X is supposed not to be installed on the system, let X.Available
depend on TOP. As a result, X.Available cannot be depended on from
TOP without introducing a cycle, and thus an inconsistency.

[3]
For blockers, we have to distinguish between DEPEND blockers, and
R/PDEPEND blockers. The latter are far more severe and can never be
resolved if encountered.

If X DEPENDs on !Y, we have that Y.Built depends on X.Built.
That's if Y has an installation history (see [4]). If Y has a
remove history, then X.Built depends on Y.Removed. If Y is
already installed, there's an unresolvable conflict. One way
to represent this is to add a remove history for Y even though
an installed history already exists.

[5]
Cycle detection
===============

A really important feature will be that we have reasonably
efficient detection of cycles. We could first compute SCCs.
We know that each SCC has at least one back-edge, which yields
a cycle. We could iterate this process.

We require cycles to be given as lists of edges, so that
we can check what kind of edges make up a path.

Data.Graph.Inductive contains a type Path, which is a list of
nodes. We can compute our representation from this list
of nodes.