24 Jun 2015
This blog post is intended to answer two very frequent questions about stack: How is it different from Cabal? And: Why was it developed as a separate project instead of being worked on with Cabal?
Before we delve into the details, let’s first deconstruct the premises of the questions. There are really three things that people talk about when they say “Cabal”:
- a package metadata format (.cabal files) and specification for a “common architecture for building applications and tools”, aka Cabal-the-spec,
- an implementation of the spec as a framework, aka Cabal-the-library,
- cabal-install, aka Cabal-the-tool, which is a command-line tool that uses Cabal-the-library.
Stack complies with Cabal-the-spec, both in the sense that it groks
.cabal files in their entirety and behaves in a way that complies
with the spec (insofar as that is relevant since the spec hasn’t seen
any updates in recent years). In fact it was easy for Stack to do,
because just like Cabal-the-tool, it is implemented using
Cabal-the-library. Therefore, a first answer to the questions at hand
is that stack is Cabal: it is 100% compatible with the existing
Cabal file format for specifying package metadata, supports exactly
the same package build harnesses and is implemented on top of the same
reference implementation of the spec as cabal-install, which is just
one tool among others using Cabal-the-library. cabal-install and
stack are separate tools that both share the same framework.
A successful framework at that: Haskell’s ecosystem would not be where
it is today without Cabal, which way back in 2004, for the first time
in the long history of Haskell made it possible to easily reuse code
across projects by standardizing the way packages are built and used
by the compiler.
Stack is different in that it is a from-the-ground-up rethink of Cabal-the-tool. So the real questions are: why was a new tool necessary, and why now? We’ll tackle these questions step-by-step in the remainder of this post:
- What problem does stack address?
- How are stack’s design choices different?
- stack within the wider ecosystem
The problem
Stack was started because the Haskell ecosystem has a tooling problem. Like any number of other factors, this tooling problem is limiting the growth of the ecosystem and of the community around it. The effort to fix it was born out of a systematic exercise in growth hacking: identify the bottlenecks that hamper growth and remove them one by one.
The fact that Haskell has a tooling problem is not a rumour, nor is it
a fringe belief of disgruntled developers. In an effort to collect the
data necessary to identify the bottlenecks in the growth of the
community, FP Complete conducted a wide survey of
the entire community on behalf of the Commercial Haskell SIG. The
results are in and the 1,200+ respondents are unequivocal: package
management with cabal-install is the single worst aspect of using
Haskell. Week after week, Reddit and mailing list posts pop up
regarding basic package installation problems using cabal-install.
Clearly there is a problem, no matter whether seasoned users
understand their tools well, know how to use them exactly right and
how to back out gracefully from tricky situations. For every
battle-hardened power user, there are 10 enthusiasts willing to give
the language a try, if only simple things were simple.
Of a package building and management tool, users expect, out-of-the-box (that means, by default!):
- that the tool facilitates combining sets of packages to build new applications, rather than failing without pointing to a solution just because packages advertise conservative bounds on their dependencies;
- that the tool ensures that success today is success tomorrow: instructions that worked for a tutorial writer should continue to work for all her/his readers, now and in the future;
- that invoking the tool to install one package doesn’t compromise the success of invoking the tool for installing another package;
- that, much like make, the UI does not require users to remember which previous commands they did or did not run (dependent actions should be run automatically and predictably).
In fact these are the very same desirable properties that Johan Tibell identified in 2012 and which the data supports today. If our tooling does not support them, this is a problem.
Stack is an attempt to fix this problem - oddly enough, by building in
at its core many of the same principles that underlie how power users
utilize cabal-install successfully. The key to stack’s success is to
start from common workflows, choosing the right defaults to support
them, and making those defaults simple.
The design
One of the fundamental problems that users have with package management systems is that building and installing a package today might not work tomorrow. Building and installing on my system might not work on your system. Despite typing exactly the same commands. Despite using the exact same package metadata. Despite using the exact same version of the source code. The fundamental problem is: lack of reproducibility. Stack strives hard to make the results of every single command reproducible, because that is the right default. Said another way, stack applies to package management the same old recipe that made functional programming successful: manage complexity by making the output of all actions proper functions of their inputs. State explicitly what your inputs are. Gain the confidence that the outputs that you see today are the outputs that you see tomorrow. Reproducibility is the key to understandability.
In the cabal workflow, running cabal install is necessary to get
your dependencies. It's also a black box which depends on three pieces
of global, mutable, implicit state: the compiler and versions of
system libraries on your system, the Cabal packages installed in GHC’s
package database, and the package metadata du jour downloaded from
Hackage (via cabal update). Running cabal install at different times
can lead to wildly different install plans, without giving any good
reason to the user. The interaction with the installed package set is
non-obvious, and arbitrary decisions made by the dependency solver can
lead to broken package databases. Due to lack of isolation between
different invocations of cabal install for different projects,
calling cabal install the first time can affect whether cabal install
will work the second time. For this reason, power users use
the cabal freeze feature to pin down exactly the version of every
dependency, so that every invocation of cabal install always comes
up with the same build plan. Power users also build in so-called
“sandboxes”, in order to isolate the actions of calling cabal install
for building one project from the actions of calling cabal install
for building another.
In stack, all versions of all dependencies are explicit and determined
completely in a stack.yaml file. Given the same stack.yaml and OS,
stack build should always run the exact same build plan. This does
wonders for avoiding accidentally breaking the package database,
having reproducible behavior across your team, and producing reliable
and trustworthy build artifacts. It also makes it trivial for stack to
have a user-friendly UI of just installing dependencies when
necessary, since future invocations don’t have to guess what the build
plan of previous invocations was. The build plan is always obvious and
manifest. Unlike cabal sandboxes, isolation in stack is complete:
packages built against different versions of dependencies never
interfere, because stack transparently installs packages in separate
databases (but is smart enough to reuse databases whenever it is
safe to do so, hence keeping build times low).
Note that this doesn’t mean users have to painstakingly write out all package versions longhand. Stack supports naming package snapshots as shorthand for specifying sets of package versions that are known to work well together.
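To make the shorthand concrete, a minimal stack.yaml might look roughly like the sketch below. The snapshot name and the extra package are placeholders picked for illustration, not recommendations, and the comments describe the intent of each field rather than every detail of how stack interprets it.

```yaml
# stack.yaml (illustrative sketch; snapshot name and package are placeholders)

# The resolver names a package snapshot: this one line stands in for a
# whole set of package versions that are known to build together.
resolver: lts-2.15

# Local packages to build; '.' is the directory containing this file.
packages:
- '.'

# Packages that are not part of the snapshot are pinned to an exact version.
extra-deps:
- acme-missiles-0.3
```

With a file like this checked into the project, everyone who runs stack build starts from the same explicit set of inputs.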
Other key design principles are portability (work consistently and have a consistent UI across all platforms), and a very short ramp-up phase. It should be easy for a new user with little knowledge of Haskell to write “hello world” in Haskell, package it up and publish it with just a few lines of configuration or none at all. Learning a new programming language is challenge enough that learning a new package specification language is quite unnecessary. These principles are in contrast with those of platform-specific and extremely general solutions such as Nix.
Modularity (do one thing and do it well), security (don’t trust stuff pulled from the Internet unless you have a reason to) and UI consistency are also principles fundamental to the design, and key strategies for keeping the bug count low. But more on that another time.
These have informed the following "nice to have" features compared to
cabal-install:
- multi-package project support (build all packages in one go, test all packages in one go…),
- depend on experimental and unpublished packages directly, stored in Git repositories, not just Hackage and the local filesystem,
- transparently install the correct version of GHC automatically so that you don’t have to (and multiple concurrently installed GHC versions work just fine),
- optionally use Docker for bullet-proof isolation of all system resources and deploying full, self-contained Haskell components as microservices.
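As a hedged sketch of how several of these features can surface in a project's stack.yaml: the directory names, repository URL and commit hash below are hypothetical, and the snapshot name is a placeholder. The keys are written in the style of more recent stack releases; the way Git dependencies are spelled in particular has changed between versions, so check the documentation for the release you use.

```yaml
# stack.yaml (illustrative sketch; directories, URL and commit are hypothetical)
resolver: lts-2.15

# Multi-package project: each entry is a directory with its own .cabal file,
# and all of them are built and tested together.
packages:
- server
- client

# A dependency taken directly from a Git repository, pinned to one commit.
extra-deps:
- git: https://example.com/someone/some-package.git
  commit: 0123456789abcdef0123456789abcdef01234567

# Opt in to running builds inside a Docker container.
docker:
  enable: true
```

The GHC version does not appear explicitly in this sketch because it follows from the chosen snapshot; the stack setup command fetches a matching compiler when one is not already installed.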
The technologies underpinning these features include:
- Git (for package index management),
- S3 (for high-reliability package serving),
- SSL libraries (for secure HTTP uploads and downloads),
- Docker,
- many state-of-the-art Haskell libraries.
These technologies have enabled swift development of stack without
reinventing the wheel and have helped keep the implementation stack
simple and accessible. With the benefit of a clean slate to start
from, we believe stack to be very hackable and easy to contribute to.
These are also technologies that cabal-install did not have the
benefit of being able to use when it was first conceived some years
ago.
Whither cabal-install, stack and other tools
Stack is but one tool for managing packages on your system and
building projects. Stack was designed specifically to interoperate
with the existing frameworks for package management and package
building, so that all your existing packages work as-is with stack,
but with the added benefit of a modern, predictable design. Because
stack is just Cabal under the hood, other tools such as
Halcyon for deployment and Nix complement stack nicely, as does
cabal-install for those who prefer to work
with a UI that they know well. We have already heard reports of users
combining these tools to good effect. And remember: stack packages are
cabal-install packages are super-new-fangled-cabal-tool packages.
You can write the exact same packages in stack or in another tool,
using curated package sets if you like, tight version bounds à la PVP
if you like, none or anything at all. stack likes to make common usage
easy but is otherwise very much policy agnostic.
Stack is a contributor-friendly project, with 18 contributors to the code already in its very short existence, several times more bug reporters and documentation writers, and counting! Help make stack a better tool that suits your needs by filing bug reports and feature requests, improving the documentation and contributing new code. Above all, use stack and tell your friends about it. We hope stack will eliminate a documented bottleneck to community growth and turn Haskell into a more productive language accessible to many more users.