<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" xml:lang="en-US" xml:base="http://blog.ezyang.com/wp-atom.php">
	<title type="text">Inside 214-1E</title>
	<subtitle type="text">Existential Pontification and Generalized Abstract Digressions</subtitle>

	<updated>2017-03-26T15:36:54Z</updated>

	<link rel="alternate" type="text/html" href="http://blog.ezyang.com" />
	<id>http://blog.ezyang.com/feed/atom/</id>
	

	<generator uri="https://wordpress.org/" version="4.7.2">WordPress</generator>
	<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/ezyang" /><feedburner:info uri="ezyang" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Proposal: Suggest explicit type application for Foldable length and friends]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/78i3FFifBeo/" />
		<id>http://blog.ezyang.com/?p=9919</id>
		<!-- ezyang: omitted update time -->
		<published>2017-03-21T23:50:13Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[tl;dr If you use a Foldable function like length or null, where instance selection is solely determined by the input argument, you should make your code more robust by introducing an explicit type application specifying which instance you want. This isn't necessary for a function like fold, where the return type can cross-check if you've [&#8230;]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2017/03/proposal-suggest-explicit-type-application-for-foldable-length/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;&lt;strong&gt;tl;dr&lt;/strong&gt; &lt;em&gt;If you use a Foldable function like length or null, where instance selection is solely determined by the input argument, you should make your code more robust by introducing an explicit type application specifying which instance you want. This isn't necessary for a function like fold, where the return type can cross-check if you've gotten it right or not. If you don't provide this type application, GHC should give a warning suggesting you annotate it explicitly, in much the same way it suggests adding explicit type signatures to top-level functions.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Recently, there has been some dust kicked up about &lt;a class="reference external" href="https://mail.haskell.org/pipermail/libraries/2017-March/027716.html"&gt;Foldable instances causing &amp;quot;bad&amp;quot; code to compile&lt;/a&gt;. The prototypical example is this: you've written &lt;tt class="docutils literal"&gt;length (f x)&lt;/tt&gt;, where &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; is a function that returns a list &lt;tt class="docutils literal"&gt;[Int]&lt;/tt&gt;. At some future point in time, a colleague refactors &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; to return &lt;tt class="docutils literal"&gt;(Warnings, [Int])&lt;/tt&gt;. After the refactoring, will &lt;tt class="docutils literal"&gt;length (f x)&lt;/tt&gt; continue to type check? Yes: &lt;tt class="docutils literal"&gt;length (f x)&lt;/tt&gt; will always return 1, no matter how long the inner list is, because it is using the &lt;tt class="docutils literal"&gt;Foldable&lt;/tt&gt; instance for &lt;tt class="docutils literal"&gt;(,) Warnings&lt;/tt&gt;.&lt;/p&gt;
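To make the hazard concrete, here is a minimal sketch, with hypothetical fOld and fNew standing in for the function before and after the refactoring: the call still typechecks afterwards, but silently measures the tuple instead of the list.

```haskell
-- A minimal sketch of the hazard above; fOld and fNew are hypothetical
-- stand-ins for f before and after the refactoring.
fOld :: Int -> [Int]
fOld x = [x, x, x]

fNew :: Int -> (String, [Int])  -- now also returns a warning string
fNew x = ("warning", [x, x, x])

oldLen :: Int
oldLen = length (fOld 5)  -- 3: the Foldable instance for lists

newLen :: Int
newLen = length (fNew 5)  -- still compiles, but 1: Foldable ((,) String)

main :: IO ()
main = print (oldLen, newLen)
```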
&lt;p&gt;The solution proposed in the mailing list was to remove &lt;tt class="docutils literal"&gt;Foldable&lt;/tt&gt; for &lt;tt class="docutils literal"&gt;Either&lt;/tt&gt;, a cure which is, quite arguably, worse than the disease. But I think there is definitely merit to the complaint that the &lt;tt class="docutils literal"&gt;Foldable&lt;/tt&gt; instances for tuples and &lt;tt class="docutils literal"&gt;Either&lt;/tt&gt; enable you to write code that typechecks, but is totally wrong.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://mail.haskell.org/pipermail/libraries/2017-March/027743.html"&gt;Richard Eisenberg&lt;/a&gt; described this problem as the tension between the goals of &amp;quot;if it compiles, it works!&amp;quot; (Haskell must &lt;em&gt;exclude&lt;/em&gt; programs which don't work) and general, polymorphic code, which should be applicable in as many situations as possible. I think there is some more nuance here, however. Why is it that &lt;tt class="docutils literal"&gt;Functor&lt;/tt&gt; polymorphic code never causes problems for being &amp;quot;too general&amp;quot;, but &lt;tt class="docutils literal"&gt;Foldable&lt;/tt&gt; does? We can construct an analogous situation: I've written &lt;tt class="docutils literal"&gt;fmap (+2) (f x)&lt;/tt&gt;, where &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; once again returns &lt;tt class="docutils literal"&gt;[Int]&lt;/tt&gt;. When my colleague refactors &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; to return &lt;tt class="docutils literal"&gt;(Warnings, [Int])&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;fmap&lt;/tt&gt; now makes use of the &lt;tt class="docutils literal"&gt;Functor&lt;/tt&gt; instance &lt;tt class="docutils literal"&gt;(,) Warnings&lt;/tt&gt;, but the code fails to compile anyway, because the type of &lt;tt class="docutils literal"&gt;(+1)&lt;/tt&gt; doesn't line up with &lt;tt class="docutils literal"&gt;[Int]&lt;/tt&gt;. Yes, we can still construct situations with &lt;tt class="docutils literal"&gt;fmap&lt;/tt&gt; where code continues to work after a type change, but these cases are far more rare.&lt;/p&gt;
&lt;p&gt;There is a clear difference between these two programs: the &lt;tt class="docutils literal"&gt;fmap&lt;/tt&gt; program is &lt;em&gt;redundant&lt;/em&gt;, in the sense that the type is constrained by the input container, the function mapping over it, and the context which uses the result. Just as with error-correcting codes, redundancy allows us to detect when an error has occurred; when you reduce redundancy, errors become harder to detect. With &lt;tt class="docutils literal"&gt;length&lt;/tt&gt;, the &lt;em&gt;only&lt;/em&gt; constraint on the selected instance is the input argument; if you get it wrong, we have no way to tell.&lt;/p&gt;
&lt;p&gt;Thus, the right thing to do is &lt;em&gt;reintroduce&lt;/em&gt; redundancy where it is needed. Functions like &lt;tt class="docutils literal"&gt;fold&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;toList&lt;/tt&gt; don't need extra redundancy, because they are cross-checked by the use of their return values. But functions like &lt;tt class="docutils literal"&gt;length&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;null&lt;/tt&gt; (and arguably &lt;tt class="docutils literal"&gt;maximum&lt;/tt&gt;, which only weakly constrains its argument to have an &lt;tt class="docutils literal"&gt;Ord&lt;/tt&gt; instance) don't have any redundancy: we should introduce redundancy in these places!&lt;/p&gt;
&lt;p&gt;Fortunately, GHC 8.0 provides a very easy way of introducing this redundancy: an &lt;strong&gt;explicit type application.&lt;/strong&gt; (This was also independently &lt;a class="reference external" href="https://www.reddit.com/r/haskell/comments/5x4yka/deprecate_foldable_for_either/def96j4/"&gt;suggested by Faucelme&lt;/a&gt;.) In this regime, rather than write &lt;tt class="docutils literal"&gt;length (f x)&lt;/tt&gt;, you write &lt;tt class="docutils literal"&gt;length &amp;#64;[] (f x)&lt;/tt&gt;, saying that you want length for lists. If you want length for maps, you write &lt;tt class="docutils literal"&gt;length &amp;#64;(Map _) (f x)&lt;/tt&gt;. Now, if someone changes the type of &lt;tt class="docutils literal"&gt;f&lt;/tt&gt;, you will get a type error, since the explicit type application no longer matches.&lt;/p&gt;
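As a sketch of how this looks in practice (again with a hypothetical f), the explicit application pins the Foldable instance, so the tuple-returning refactoring becomes a type error instead of silently returning 1:

```haskell
{-# LANGUAGE TypeApplications #-}

-- Hypothetical f, as in the running example.
f :: Int -> (String, [Int])
f x = ("warning", [x, x])

-- With the type application, using the tuple by accident is rejected:
--   length @[] (f 5)          -- type error: (String, [Int]) is not [a]
-- whereas pinning the list instance on the intended argument is fine:
okLen :: Int
okLen = length @[] (snd (f 5))  -- 2

main :: IO ()
main = print okLen
```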
&lt;p&gt;Now, you can write this with your FTP code today. So there is just one more small change I propose we add to GHC: let users specify the type parameter of a function as &amp;quot;suggested to be explicit&amp;quot;. At the call-site, if this function is used without giving a type application, GHC will emit a warning (which can be disabled with the usual mechanism) saying, &amp;quot;Hey, I'm using the function at this type, maybe you should add a type application.&amp;quot; If you really want to suppress the warning, you could just type apply a type hole, e.g., &lt;tt class="docutils literal"&gt;length &amp;#64;_ (f x)&lt;/tt&gt;. As a minor refinement, you could also specify a &amp;quot;default&amp;quot; type argument, so that if we infer this argument, no warning gets emitted (this would let you use the list functions on lists without needing to explicitly specify type arguments).&lt;/p&gt;
&lt;p&gt;That's it! No BC-breaking flag days, no poisoning functions, no getting rid of FTP, no dropping instances: just a new pragma, and an opt-in warning that will let people who want to avoid these bugs do so. It won't solve all &lt;tt class="docutils literal"&gt;Foldable&lt;/tt&gt; bugs, but it should squash the most flagrant ones.&lt;/p&gt;
&lt;p&gt;What do people think?&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/78i3FFifBeo" height="1" width="1" alt=""/&gt;</content>
			<link rel="replies" type="text/html" href="http://blog.ezyang.com/2017/03/proposal-suggest-explicit-type-application-for-foldable-length/#comments" thr:count="8" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2017/03/proposal-suggest-explicit-type-application-for-foldable-length/feed/atom/" thr:count="8" />
		<thr:total>8</thr:total>
		<feedburner:origLink>http://blog.ezyang.com/2017/03/proposal-suggest-explicit-type-application-for-foldable-length/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Prio: Private, Robust, and Scalable Computation of Aggregate Statistics]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/N89RboIToCA/" />
		<id>http://blog.ezyang.com/?p=9885</id>
		<!-- ezyang: omitted update time -->
		<published>2017-03-17T23:35:48Z</published>
		<category scheme="http://blog.ezyang.com" term="Security" />		<summary type="html"><![CDATA[I want to take the opportunity to advertise some new work from a colleague of mine, Henry Corrigan-Gibbs (in collaboration with the venerable Dan Boneh) on the subject of preserving privacy when collecting aggregate statistics. Their new system is called Prio and will be appearing at this year's NSDI. The basic problem they tackle is [&#8230;]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2017/03/prio-private-robust-and-scalable-computation-of-aggregate-statistics/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;I want to take the opportunity to advertise some new work from a colleague of mine, &lt;a class="reference external" href="https://www.henrycg.com/"&gt;Henry Corrigan-Gibbs&lt;/a&gt; (in collaboration with the venerable Dan Boneh) on the subject of preserving privacy when collecting aggregate statistics.  Their new system is called &lt;a class="reference external" href="https://www.henrycg.com/pubs/nsdi17prio/"&gt;Prio&lt;/a&gt; and will be appearing at this year's NSDI.&lt;/p&gt;
&lt;p&gt;The basic problem they tackle is this: suppose you're Google and you want to collect some statistics on your users to compute some aggregate metrics, e.g., averages or a linear regression fit:&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/prio/regression-good.png" src="/img/prio/regression-good.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;A big problem is how to collect this data without compromising the privacy of your users. To preserve privacy, you &lt;em&gt;don't&lt;/em&gt; want to know the data of each of your individual users: you'd like to get this data in completely anonymous form, and only at the end of your collection period, get an aggregate statistic.&lt;/p&gt;
&lt;p&gt;This is an old problem; there are a &lt;a class="reference external" href="https://github.com/google/rappor"&gt;number&lt;/a&gt; of &lt;a class="reference external" href="http://nms.csail.mit.edu/projects/privacy/privstats-ccs.pdf"&gt;existing&lt;/a&gt; &lt;a class="reference external" href="https://iakkus.github.io/papers/2013-sigcomm-chen.pdf"&gt;systems&lt;/a&gt; which achieve this goal with varying tradeoffs. Prio tackles one particularly tough problem in the world of private aggregate data collection: robustness in the face of malicious clients. Suppose that you are collecting data for a linear regression, and the inputs your clients send you are completely anonymous.  A malicious client could send you a bad data point that could skew your entire data set; and since you never get to see the individual data points of your data set, you would never notice:&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/prio/regression-bad.png" src="/img/prio/regression-bad.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Thus, Prio looks at the problem of anonymously collecting data, while at the same time being able to &lt;em&gt;validate&lt;/em&gt; that the data is reasonable.&lt;/p&gt;
&lt;p&gt;The mechanism by which Prio does this is pretty cool, and so in this post, I want to explain the key insights of their protocol. Prio operates in a regime where a client &lt;em&gt;secret shares&lt;/em&gt; their secret across a pool of servers which are assumed to be non-colluding; as long as at least one server is honest, nothing is revealed about the client's secret until the servers jointly agree to publish the aggregate statistic.&lt;/p&gt;
&lt;p&gt;Here is the problem: given a secret share of some hidden value, how can we &lt;em&gt;efficiently&lt;/em&gt; check if it is valid?  To answer this question, we first have to explain a little bit about the world of secret sharing.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;A secret sharing scheme allows you to split a secret into many pieces, so that the original secret cannot be recovered unless you have some subset of the pieces. There are amazingly simple constructions of secret sharing: suppose that your secret is the number &lt;em&gt;x&lt;/em&gt; in some field (e.g., integers modulo some prime &lt;em&gt;p&lt;/em&gt;), and you want to split it into &lt;em&gt;n&lt;/em&gt; parts. Then, let the first &lt;em&gt;n-1&lt;/em&gt; shares be random numbers in the field, and let the last share be &lt;em&gt;x&lt;/em&gt; minus the sum of the previous shares. You reconstruct the secret by summing all the shares together. This scheme is information theoretically secure: with only &lt;em&gt;n-1&lt;/em&gt; of the shares, you have learned nothing about the underlying secret. Another interesting property of this secret sharing scheme is that it is homomorphic over addition. Let your shares of x and y be &lt;img src='http://s0.wp.com/latex.php?latex=%5Bx%5D_i&amp;#038;bg=ffffff&amp;#038;fg=000000&amp;#038;s=0' alt='[x]_i' title='[x]_i' class='latex' /&gt; and &lt;img src='http://s0.wp.com/latex.php?latex=%5By%5D_i&amp;#038;bg=ffffff&amp;#038;fg=000000&amp;#038;s=0' alt='[y]_i' title='[y]_i' class='latex' /&gt;: then &lt;img src='http://s0.wp.com/latex.php?latex=%5Bx%5D_i+%2B+%5By%5D_i&amp;#038;bg=ffffff&amp;#038;fg=000000&amp;#038;s=0' alt='[x]_i + [y]_i' title='[x]_i + [y]_i' class='latex' /&gt; form secret shares of &lt;em&gt;x + y&lt;/em&gt;, since addition in a field is commutative and associative (so I can reassociate each of the pairwise sums into the sum for x, and the sum for y.)&lt;/p&gt;
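This construction is small enough to sketch directly; the modulus and the "random" shares below are fixed constants, purely for illustration:

```haskell
-- A sketch of additive secret sharing over the integers mod p.
p :: Integer
p = 2147483647  -- a Mersenne prime, chosen arbitrarily for this sketch

-- Split a secret into n shares: n-1 field elements (random in a real
-- scheme) plus one correction share so the shares sum to the secret.
share :: Integer -> [Integer] -> [Integer]
share secret randoms = randoms ++ [(secret - sum randoms) `mod` p]

reconstruct :: [Integer] -> Integer
reconstruct shares = sum shares `mod` p

main :: IO ()
main = do
  let xs = share 42 [123456, 987654]        -- shares of 42
      ys = share 100 [555, 666]             -- shares of 100
  print (reconstruct xs)                    -- 42
  print (reconstruct (zipWith (+) xs ys))   -- 142: addition is homomorphic
```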
&lt;p&gt;Usually, designing a scheme with homomorphic addition is easy, but having a scheme that supports addition and multiplication simultaneously (so that you can compute interesting arithmetic circuits) is a bit more difficult. Suppose you want to compute an arithmetic circuit on a secret-shared value: additions are easy, but to perform a multiplication, most multiparty computation schemes (Prio uses &lt;a class="reference external" href="https://www.cs.bris.ac.uk/~nigel/FHE-MPC/Lecture8.pdf"&gt;Beaver's MPC protocol&lt;/a&gt;) require you to perform a round of communication:&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/prio/mpc.png" src="/img/prio/mpc.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;While you can batch up multiplications on the same &amp;quot;level&amp;quot; of the circuit, so that you only need to do as many rounds as the maximum depth of multiplications in the circuit, for large circuits, you may end up having to do quite a bit of communication. Henry tells me that fully homomorphic secret sharing has been the topic of some ongoing research; for example, &lt;a class="reference external" href="https://eprint.iacr.org/2016/585"&gt;this paper&lt;/a&gt; about homomorphic secret sharing won best paper at CRYPTO last year.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;Returning to Prio, recall that we had a secret share of the user provided input, and we would like to check if it is valid according to some arithmetic circuit. As we've seen above, we could try using a multi-party computation protocol to compute shares of the output of the circuit and then reveal the output: if it says that the input is valid, accept it. But this would require quite a few rounds of communication to actually do the computation!&lt;/p&gt;
&lt;p&gt;Here is one of the key insights of Prio: we don't need the servers to &lt;em&gt;compute&lt;/em&gt; the result of the circuit--an honest client can do this just fine--we just need them to &lt;em&gt;verify&lt;/em&gt; that a computation of the circuit is valid. This can be done by having the client ship shares of all of the intermediate values on each of the wires of the circuit, having the servers recompute the multiplications on these shares, and then comparing the results with the intermediate values provided to us by the client:&lt;/p&gt;
&lt;div class="outer-image"&gt;&lt;div class="inner-image"&gt;&lt;img alt="/img/prio/validate.png" src="/img/prio/validate.png" /&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;When we transform the problem from a &lt;em&gt;computation&lt;/em&gt; problem to a &lt;em&gt;verification&lt;/em&gt; one, we now have an &lt;em&gt;embarrassingly parallel&lt;/em&gt; verification circuit, which requires only a single round to multiply each of the intermediate nodes of the circuit.&lt;/p&gt;
&lt;p&gt;There is only one final problem: how are we to check that the recomputed multiplies of the shares and the client provided intermediate values are consistent? We can't publish the intermediate values on the wires (that would leak information about the input!). We &lt;em&gt;could&lt;/em&gt; build a bigger circuit to do the comparison and combine the results together, but this would require more rounds of communication.&lt;/p&gt;
&lt;p&gt;To solve this problem, Prio adopts an elegant trick from Ben-Sasson'12 (&lt;a class="reference external" href="https://eprint.iacr.org/2011/629.pdf"&gt;Near-linear unconditionally-secure multiparty computation with a dishonest minority&lt;/a&gt;): rather than publish all of the intermediate wires in their entirety, treat them as polynomials and publish the evaluation of each polynomial at a random point. If the servers behave correctly, they reveal nothing about the original polynomials; furthermore, with high probability, if the original polynomials are not equal, then their evaluations at a random point will also not be equal.&lt;/p&gt;
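The check can be sketched with a tiny polynomial evaluator. The coefficients and the evaluation point below are arbitrary illustrative constants; a real protocol samples the point randomly from a large field, which is what makes a collision between distinct polynomials unlikely:

```haskell
-- Horner-rule evaluation of a polynomial (coefficients in ascending
-- order of degree) over the integers mod p.
evalPoly :: Integer -> [Integer] -> Integer -> Integer
evalPoly p coeffs x = foldr step 0 coeffs `mod` p
  where step c acc = (c + x * acc) `mod` p

main :: IO ()
main = do
  let q = 2147483647
      f = [1, 2, 3]  -- 1 + 2x + 3x^2
      g = [1, 2, 4]  -- differs from f in a single coefficient
  -- Distinct degree-d polynomials agree on at most d points, so a
  -- random evaluation point exposes the difference with high probability.
  print (evalPoly q f 7)  -- 162
  print (evalPoly q g 7)  -- 211
```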
&lt;hr class="docutils" /&gt;
&lt;p&gt;This is all very wonderful, but I'd like to conclude with a cautionary tale: you have to be &lt;em&gt;very&lt;/em&gt; careful about how you set up these polynomials. Here is the pitfall: suppose that a malicious server homomorphically &lt;em&gt;modifies&lt;/em&gt; one of their shares of the input, e.g., by adding some delta. Because our secret shares are additive, adding a delta to one of the shares causes the secret to also be modified by this delta! If the adversary can carry out the rest of the protocol with this modified share, when the protocol finishes running, he finds out whether or not the &lt;em&gt;modified&lt;/em&gt; secret was valid. This leaks information about the input: if your validity test was &amp;quot;is the input 0 or 1&amp;quot;, then if you (homomorphically) add one to the input and it is still valid, you know that it definitely was zero!&lt;/p&gt;
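A sketch of this attack on the additive scheme from earlier (constants are illustrative): shifting a single share shifts the reconstructed secret by exactly the same delta, which is what lets the malicious server probe the validity predicate.

```haskell
-- Illustration of the share-shifting pitfall: with additive sharing,
-- adding a delta to any one share adds that delta to the secret.
p :: Integer
p = 2147483647

reconstruct :: [Integer] -> Integer
reconstruct shares = sum shares `mod` p

main :: IO ()
main = do
  let original = [123456, 987654, (42 - 1111110) `mod` p]  -- shares of 42
      tampered = (head original + 1) : tail original       -- adversary adds 1
  print (reconstruct original)  -- 42
  print (reconstruct tampered)  -- 43: the secret itself was shifted
```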
&lt;p&gt;Fortunately, this problem can be fixed by &lt;em&gt;randomizing&lt;/em&gt; the polynomials, so that even if the input share is shifted, the rest of the intermediate values that it computes cannot be shifted in the same way.  The details are described in the paper's section &amp;quot;Why randomize the polynomials?&amp;quot; I think this just goes to show how tricky the design of cryptographic systems can be!&lt;/p&gt;
&lt;p&gt;In any case, if this has piqued your interest, &lt;a class="reference external" href="https://www.henrycg.com/pubs/nsdi17prio/"&gt;go read the paper&lt;/a&gt;! If you're at MIT, you can also go see Henry give a seminar on the subject on &lt;a class="reference external" href="http://css.csail.mit.edu/security-seminar/details.html#Mar2217"&gt;March 22&lt;/a&gt; at the MIT CSAIL Security Seminar.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/N89RboIToCA" height="1" width="1" alt=""/&gt;</content>
			<link rel="replies" type="text/html" href="http://blog.ezyang.com/2017/03/prio-private-robust-and-scalable-computation-of-aggregate-statistics/#comments" thr:count="0" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2017/03/prio-private-robust-and-scalable-computation-of-aggregate-statistics/feed/atom/" thr:count="0" />
		<thr:total>0</thr:total>
		<feedburner:origLink>http://blog.ezyang.com/2017/03/prio-private-robust-and-scalable-computation-of-aggregate-statistics/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Designing the Backpack signature ecosystem]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/ugpwE4LcCCU/" />
		<id>http://blog.ezyang.com/?p=9867</id>
		<!-- ezyang: omitted update time -->
		<published>2017-03-11T11:40:42Z</published>
		<category scheme="http://blog.ezyang.com" term="Backpack" />		<summary type="html"><![CDATA[Suppose you are a library writer interested in using Backpack. Backpack says that you can replace a direct dependency on a function, type or package with one or more signatures. You typecheck against a signature and your end user picks how they want to eventually implement the signature. Sounds good right? But there's a dirty [&#8230;]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2017/03/designing-the-backpack-signature-ecosystem/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Suppose you are a library writer interested in using Backpack.  Backpack says that you can replace a direct dependency on a function, type or package with one or more &lt;em&gt;signatures&lt;/em&gt;.  You typecheck against a signature and your end user picks how they want to eventually implement the signature.&lt;/p&gt;
&lt;p&gt;Sounds good right? But there's a dirty little secret: to get all of this goodness, you have to &lt;em&gt;write&lt;/em&gt; a signature--you know, a type signature for each function and type that you want to use in your library. And we all know how much Haskellers &lt;a class="reference external" href="https://ghc.haskell.org/trac/ghc/ticket/1409"&gt;hate writing signatures&lt;/a&gt;. But Backpack has a solution to this: rather than repeatedly rewrite signatures for all your packages, a conscientious user can put a signature in a package for reuse in other packages.&lt;/p&gt;
&lt;p&gt;For the longest time, I thought that this was &amp;quot;enough&amp;quot;, and it would be a simple matter of sitting down and writing some tutorials for how to write a signature package. But as I sat down and started writing signature packages myself, I discovered that there was more than one way to set things up. In this post, I want to walk through two different possible designs for a collection of signature packages. They fall out of the following considerations:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;How many signature packages for, e.g., &lt;tt class="docutils literal"&gt;bytestring&lt;/tt&gt;, should there be? There could be exactly one, or perhaps a separate &lt;em&gt;package&lt;/em&gt; for each API revision?&lt;/li&gt;
&lt;li&gt;Should it be possible to post a new version of a signature package? Under what circumstances should this be allowed?&lt;/li&gt;
&lt;li&gt;For developers of a library, a larger signature is more convenient, since it gives you more functionality to work with.  For a client, however, a smaller signature is better, because it reduces the implementation burden. Should signature packages be setup to encourage big or small signatures by default?&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="section" id="a-signature-package-per-release"&gt;
&lt;h3&gt;A signature package per release&lt;/h3&gt;
&lt;p&gt;Intuitively, every release of a package is also associated with a &amp;quot;signature&amp;quot; specifying what functions that release supports.  One could conclude, then, that there should be a signature package per release, each describing the interface of one version of the package in question. (Or, one could reasonably argue that GHC should be able to automatically infer the signature from a package. This is not so easy to do, for reasons beyond the scope of this post.)&lt;/p&gt;
&lt;p&gt;However, we have to be careful how we perform releases of each of these signatures.  One obvious but problematic thing to do is this: given &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-0.10.8.1&lt;/span&gt;&lt;/tt&gt;, also release a &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-sig-0.10.8.1&lt;/span&gt;&lt;/tt&gt;. The problem is that in today's Haskell ecosystem, it is strongly assumed that only &lt;em&gt;one&lt;/em&gt; version of a package is ever selected. Thus, if I have one package that requires &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-sig&lt;/span&gt; == 0.10.8.1&lt;/tt&gt;, and another package that requires &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-sig&lt;/span&gt; == 0.10.8.2&lt;/tt&gt;, this will fail if we try to dependency solve for both packages at the same time. We could make this scheme work by teaching Cabal and Stack how to link against multiple versions of a signature package, but at the moment, it's not practical.&lt;/p&gt;
&lt;p&gt;An easy way to work around the &amp;quot;multiple versions&amp;quot; problem is to literally create a new package for every version of bytestring. The syntax for package names is a bit irritating (alphanumeric characters plus hyphens only, and every hyphen-separated component must contain at least one letter), but you could imagine releasing &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v1008&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v1009&lt;/span&gt;&lt;/tt&gt;, etc., one for each version of the API that is available. Once a signature package is released, it should never be updated, except perhaps to fix a mistranscription of a signature.&lt;/p&gt;
&lt;p&gt;Under semantic versioning, packages which share the same major version are supposed to only add functionality, not take it away. Thus, these successive signature packages can also be built on one another: for example &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v1009&lt;/span&gt;&lt;/tt&gt; can be implemented by inheriting all of the functions from &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v1008&lt;/span&gt;&lt;/tt&gt;, and only adding the new functions that were added in 0.10.9.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="a-signature-package-per-major-release-series"&gt;
&lt;h3&gt;A signature package per major release series&lt;/h3&gt;
&lt;p&gt;There is something very horrible about the above scheme: we're going to have &lt;em&gt;a lot&lt;/em&gt; of signature packages: one per version of a package! How awful would it be to have in the Hackage index &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v900&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v901&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v902&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v1000&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v1002&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v1004&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v1006&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v1008&lt;/span&gt;&lt;/tt&gt; as package choices? (With perhaps more if there exist patch releases that accidentally changed the API.) Thus, it is extremely tempting to try to find ways to reduce the number of signature packages we need to publish.&lt;/p&gt;
&lt;p&gt;Here is one such scheme which requires a signature package only for major releases; e.g., for &lt;tt class="docutils literal"&gt;bytestring&lt;/tt&gt;, we would only have &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v9&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v10&lt;/span&gt;&lt;/tt&gt;:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;The latest version of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v9&lt;/span&gt;&lt;/tt&gt; should correspond to the &amp;quot;biggest&amp;quot; API supported by the 0.9 series.  Thus, for every minor version release of &lt;tt class="docutils literal"&gt;bytestring&lt;/tt&gt; in the 0.9 series, there is a new release of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v9&lt;/span&gt;&lt;/tt&gt;: e.g., when &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-0.9.1.0&lt;/span&gt;&lt;/tt&gt; is released, we release &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v9-1.0&lt;/span&gt;&lt;/tt&gt;. Each of these releases increases the functionality recorded in the signature, but is not permitted to make any other changes.&lt;/li&gt;
&lt;li&gt;When depending on the signature package, we instead provide a version bound specifying the minimum functionality of the signature required to build our package; e.g., &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v9&lt;/span&gt; &amp;gt;= 1.0&lt;/tt&gt;. (Upper bounds are not necessary, as it assumed that a signature package never breaks backwards compatibility.)&lt;/li&gt;
&lt;/ul&gt;
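&lt;p&gt;Concretely, an indefinite library written against this scheme might declare its dependency as follows (this stanza is a hypothetical sketch illustrating the convention, not taken from any published package):&lt;/p&gt;
&lt;pre class="literal-block"&gt;
library my-library-indef
  signatures:          Data.ByteString
  build-depends:       base, bytestring-v9 &amp;gt;= 1.0
&lt;/pre&gt;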
&lt;p&gt;There is one major difficulty: suppose that two unrelated packages both specify a version bound on &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v9&lt;/span&gt;&lt;/tt&gt;. In this case, the ultimate version of the signature package we pick will be one that is compatible with both ranges; in practice, the &lt;em&gt;latest&lt;/em&gt; version of the signature. This is bad for two reasons: first, it means that we'll always end up requiring the client to implement the full glory of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v9&lt;/span&gt;&lt;/tt&gt;, even if we are compatible with an earlier version in the release series. Second, it means that whenever &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v9&lt;/span&gt;&lt;/tt&gt; is updated, we may bring more entities into scope: and if that introduces ambiguity, it will cause previously compiling code to stop compiling.&lt;/p&gt;
&lt;p&gt;Fortunately, there is a solution for this problem: use &lt;em&gt;signature thinning&lt;/em&gt; to reduce the required entities to precisely the set of entities you need. For example, suppose that &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v9-0.0&lt;/span&gt;&lt;/tt&gt; has the following signature:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
signature Data.ByteString where
    data ByteString
    empty :: ByteString
    null :: ByteString -&amp;gt; Bool
&lt;/pre&gt;
&lt;p&gt;As a user, we only needed &lt;tt class="docutils literal"&gt;ByteString&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;empty&lt;/tt&gt;. Then we write in our local &lt;tt class="docutils literal"&gt;ByteString&lt;/tt&gt; signature:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
signature Data.ByteString (ByteString, empty) where
&lt;/pre&gt;
&lt;p&gt;and now &lt;em&gt;no matter&lt;/em&gt; what new functions get added to &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;bytestring-v9-0.0&lt;/span&gt;&lt;/tt&gt;, this signature will only ever require &lt;tt class="docutils literal"&gt;ByteString&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;empty&lt;/tt&gt;. (Another way of thinking about signature thinning is that it is a way to &lt;em&gt;centralize&lt;/em&gt; explicit import lists.) Notice that this scheme does &lt;em&gt;not&lt;/em&gt; work if you don't have a separate package per major release series, since thinning can't save you from a backwards incompatible change to the types of one of the functions you depend on.&lt;/p&gt;
&lt;p&gt;These signature thinning headers can be automatically computed; I've &lt;a class="reference external" href="https://hackage.haskell.org/package/ghc-usage"&gt;written a tool (ghc-usage)&lt;/a&gt; which does precisely this. Indeed, signature thinning is useful even in the first design, where it can be used to reduce the requirements of a package; however, with a signature package per major release, it is &lt;em&gt;mandatory&lt;/em&gt;: if you don't use it, your code might break.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="conclusion"&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;So, what design should we adopt? I think the first scheme (a signature package per release) is more theoretically pure, but I am very afraid of the &amp;quot;too many packages&amp;quot; problem. Additionally, I do think it's a good idea to thin signatures as much as possible (it's not good to ask for things you're not going to use!) which means the signature thinning requirement may not be so bad. Others I have talked to think the first scheme is just obviously the right thing to do.&lt;/p&gt;
&lt;p&gt;Which scheme do you like better? Do you have your own proposal? I'd love to hear what you think. (Also, if you'd like to bikeshed the naming convention for signature packages, I'm also all ears.)&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="appendix"&gt;
&lt;h3&gt;Appendix&lt;/h3&gt;
&lt;p&gt;After publishing this post, the comments of several folks made me realize that I hadn't motivated &lt;em&gt;why&lt;/em&gt; you would want to say something about the API of bytestring-0.10.8; don't you just want a signature of strings? So, to address this question, I want to describe the line of reasoning that led me down this path.&lt;/p&gt;
&lt;p&gt;I started off with a simple goal: write a signature for strings that had the following properties:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;Be reasonably complete; i.e., contain all of the functions that someone who wanted to do &amp;quot;string&amp;quot; things might want, but&lt;/li&gt;
&lt;li&gt;Be reasonably universal; i.e., only support functions that would be supported by all the major string implementations (e.g., String, strict/lazy Text, strict/lazy Word8/Char8 ByteString and Foundation strings.)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It turned out that I needed to drop quite a number of functions to achieve universality; for example, transpose, foldl1, foldl1', mapAccumL/R, scanl, replicate, unfoldr, group, groupBy, inits, tails are not implemented in Foundation; foldr', foldr1', scanl1, scanr, scanr1, unfoldN, spanEnd, breakEnd, splitOn, isInfixOf are not implemented by the lazy types.&lt;/p&gt;
&lt;p&gt;This got me thinking that I could provide bigger signatures, if I didn't require the signature to support &lt;em&gt;all&lt;/em&gt; of the possible implementations; you might have a signature that lets you switch between only the &lt;em&gt;strict&lt;/em&gt; variants of string types, or even a signature that just lets you swap between Word8 and Char8 ByteStrings.&lt;/p&gt;
&lt;p&gt;But, of course, there are combinatorially many different ways one could put signatures together and it would be horrible to have to write (and name) a new signature package for each. So what is the &lt;em&gt;minimal&lt;/em&gt; unit of signature that one could write? And there is an obvious answer in this case: the API of a specific module (say, &lt;tt class="docutils literal"&gt;Data.ByteString&lt;/tt&gt;) in a specific version of the package. Enter the discussion above.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="appendix-2"&gt;
&lt;h3&gt;Appendix 2&lt;/h3&gt;
&lt;p&gt;Above, I wrote:&lt;/p&gt;
&lt;blockquote&gt;
But, of course, there are combinatorially many different ways one could put signatures together and it would be horrible to have to write (and name) a new signature package for each. So what is the &lt;em&gt;minimal&lt;/em&gt; unit of signature that one could write? And there is an obvious answer in this case: the API of a specific module (say, &lt;tt class="docutils literal"&gt;Data.ByteString&lt;/tt&gt;) in a specific version of the package.&lt;/blockquote&gt;
&lt;p&gt;I think there is an alternative conclusion to draw from this: someone should write a signature containing every single possible function that all choices of modules could support, and then have end-users responsible for paring these signatures down to the actual sets they use. So, everyone is responsible for writing big export lists saying what they use, but you don't have to keep publishing new packages for different combinations of methods.&lt;/p&gt;
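&lt;p&gt;To sketch what this would look like (the entity names here are hypothetical), a client of such an all-encompassing signature package would keep a local signature that thins the requirement down to exactly the set it uses, just as in the thinning example earlier:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
signature Str (Str, empty, null, append) where
&lt;/pre&gt;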
&lt;p&gt;I'm pursuing this approach for now!&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/ugpwE4LcCCU" height="1" width="1" alt=""/&gt;</content>
			<link rel="replies" type="text/html" href="http://blog.ezyang.com/2017/03/designing-the-backpack-signature-ecosystem/#comments" thr:count="2" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2017/03/designing-the-backpack-signature-ecosystem/feed/atom/" thr:count="2" />
		<thr:total>2</thr:total>
		<feedburner:origLink>http://blog.ezyang.com/2017/03/designing-the-backpack-signature-ecosystem/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[How to integrate GHC API programs with Cabal]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/sIN8dxIVflc/" />
		<id>http://blog.ezyang.com/?p=9846</id>
		<!-- ezyang: omitted update time -->
		<published>2017-02-09T00:45:00Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[GHC is not just a compiler: it is also a library, which provides a variety of functionality that anyone interested in doing any sort of analysis on Haskell source code. Haddock, hint and ghc-mod are all packages which use the GHC API. One of the challenges for any program that wants to use the GHC [&#8230;]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2017/02/how-to-integrate-ghc-api-programs-with-cabal/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;GHC is not just a compiler: it is also a library, which provides a variety of functionality to anyone interested in doing any sort of analysis on Haskell source code. Haddock, hint and ghc-mod are all packages which use the GHC API.&lt;/p&gt;
&lt;p&gt;One of the challenges for any program that wants to use the GHC API is integration with Cabal (and, transitively, cabal-install and Stack). The most obvious problem is that, when building against packages installed by Cabal, GHC needs to be passed appropriate flags telling it which package databases and actual packages should be used.  At this point, people tend to adopt &lt;a class="reference external" href="https://groups.google.com/forum/#!topic/haskell-cafe/3ZgLB2khhcI"&gt;some hacky strategy&lt;/a&gt; to get these flags, and hope for the best. For commonly used packages, this strategy will get the job done, but for the rare package that needs something extra--preprocessing, extra GHC flags, building C sources--it is unlikely that it will be handled correctly.&lt;/p&gt;
&lt;p&gt;A more reliable way to integrate a GHC API program with Cabal is &lt;em&gt;inversion of control&lt;/em&gt;: have Cabal call your GHC API program, not the other way around! How are we going to get Cabal/Stack to call our GHC API program? What we will do is replace the GHC executable with a wrapper that passes all commands through to an ordinary GHC, except for &lt;tt class="docutils literal"&gt;ghc &lt;span class="pre"&gt;--interactive&lt;/span&gt;&lt;/tt&gt;, which it hands off to the GHC API program instead.  Then, we will call &lt;tt class="docutils literal"&gt;cabal repl&lt;/tt&gt;/&lt;tt class="docutils literal"&gt;stack repl&lt;/tt&gt; with our overloaded GHC, and where we would have opened a GHCi prompt, our API program gets run instead.&lt;/p&gt;
&lt;p&gt;With this, all of the flags which would have been passed to the invocation of &lt;tt class="docutils literal"&gt;ghc &lt;span class="pre"&gt;--interactive&lt;/span&gt;&lt;/tt&gt; are passed to our GHC API program. How should we go about parsing the flags?  The most convenient way to do this is by creating a &lt;a class="reference external" href="https://downloads.haskell.org/~ghc/master/users-guide/extending_ghc.html#frontend-plugins"&gt;frontend plugin&lt;/a&gt;, which lets you create a new major mode for GHC. By the time your code is called, all flags have already been processed (no need to muck about with &lt;tt class="docutils literal"&gt;DynFlags&lt;/tt&gt;!).&lt;/p&gt;
&lt;p&gt;Enough talk, time for some code.  First, let's take a look at a simple frontend plugin:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
module Hello (frontendPlugin) where

import GhcPlugins
import DriverPhases
import GhcMonad

frontendPlugin :: FrontendPlugin
frontendPlugin = defaultFrontendPlugin {
  frontend = hello
  }

hello :: [String] -&amp;gt; [(String, Maybe Phase)] -&amp;gt; Ghc ()
hello flags args = do
    liftIO $ print flags
    liftIO $ print args
&lt;/pre&gt;
&lt;p&gt;This frontend plugin is taken straight from the GHC documentation (but with enough imports to make it compile ;-). It prints out the arguments passed to it.&lt;/p&gt;
&lt;p&gt;Next, we need a wrapper program around GHC which will invoke our plugin instead of regular GHC when it is called with the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;--interactive&lt;/span&gt;&lt;/tt&gt; flag.  Here is a simple script which works on Unix-like systems:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
import GHC.Paths
import System.Posix.Process
import System.Environment

main = do
  args &amp;lt;- getArgs
  let interactive = &amp;quot;--interactive&amp;quot; `elem` args
      args' = do
        arg &amp;lt;- args
        case arg of
          &amp;quot;--interactive&amp;quot; -&amp;gt;
            [&amp;quot;--frontend&amp;quot;, &amp;quot;Hello&amp;quot;,
             &amp;quot;-plugin-package&amp;quot;, &amp;quot;hello-plugin&amp;quot;]
          _ -&amp;gt; return arg
  executeFile ghc False (args' ++ if interactive then [&amp;quot;-user-package-db&amp;quot;] else []) Nothing
&lt;/pre&gt;
&lt;p&gt;Give this a Cabal file, and then install it to the user package database with &lt;tt class="docutils literal"&gt;cabal install&lt;/tt&gt; (see the second bullet point below if you want to use a non-standard GHC via the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;-w&lt;/span&gt;&lt;/tt&gt; flag):&lt;/p&gt;
&lt;pre class="literal-block"&gt;
name:                hello-plugin
version:             0.1.0.0
license:             BSD3
author:              Edward Z. Yang
maintainer:          ezyang&amp;#64;cs.stanford.edu
build-type:          Simple
cabal-version:       &amp;gt;=1.10

library
  exposed-modules:     Hello
  build-depends:       base, ghc &amp;gt;= 8.0
  default-language:    Haskell2010

executable hello-plugin
  main-is:             HelloWrapper.hs
  build-depends:       base, ghc-paths, unix
  default-language:    Haskell2010
&lt;/pre&gt;
&lt;p&gt;Now, to run your plugin, you can do any of the following:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;cabal repl &lt;span class="pre"&gt;-w&lt;/span&gt; &lt;span class="pre"&gt;hello-plugin&lt;/span&gt;&lt;/tt&gt;&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;cabal &lt;span class="pre"&gt;new-repl&lt;/span&gt; &lt;span class="pre"&gt;-w&lt;/span&gt; &lt;span class="pre"&gt;hello-plugin&lt;/span&gt;&lt;/tt&gt;&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;stack repl &lt;span class="pre"&gt;--system-ghc&lt;/span&gt; &lt;span class="pre"&gt;--with-ghc&lt;/span&gt; &lt;span class="pre"&gt;hello-plugin&lt;/span&gt;&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To run the plugin on a specific package, pass the appropriate flags to the &lt;tt class="docutils literal"&gt;repl&lt;/tt&gt; command.&lt;/p&gt;
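&lt;p&gt;For example (the package name here is hypothetical), with &lt;tt class="docutils literal"&gt;cabal &lt;span class="pre"&gt;new-repl&lt;/span&gt;&lt;/tt&gt; you might select the library component of a particular package for the plugin to run over:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
cabal new-repl lib:mypackage -w hello-plugin
&lt;/pre&gt;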
&lt;p&gt;The full code for this example can be retrieved at &lt;a class="reference external" href="https://github.com/ezyang/hello-plugin"&gt;ezyang/hello-plugin&lt;/a&gt; on GitHub.&lt;/p&gt;
&lt;p&gt;Here are a few miscellaneous tips and tricks:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;To pass extra flags to the plugin, add &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;--ghc-options=-ffrontend-opt=arg&lt;/span&gt;&lt;/tt&gt; as necessary (if you like, make another wrapper script around this!)&lt;/li&gt;
&lt;li&gt;If you installed &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;hello-plugin&lt;/span&gt;&lt;/tt&gt; with a GHC that is not the one from your PATH, you will need to put the correct &lt;tt class="docutils literal"&gt;ghc&lt;/tt&gt;/&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;ghc-pkg&lt;/span&gt;&lt;/tt&gt;/etc executables first in the PATH; Cabal's autodetection will get confused if you just use &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;-w&lt;/span&gt;&lt;/tt&gt;.  If you are running &lt;tt class="docutils literal"&gt;cabal&lt;/tt&gt;, another way to solve this problem is to pass &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;--with-ghc-pkg=PATH&lt;/span&gt;&lt;/tt&gt; to specify where &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;ghc-pkg&lt;/span&gt;&lt;/tt&gt; lives (Stack does not support this.)&lt;/li&gt;
&lt;li&gt;You don't have to install the plugin to your user package database, but then the wrapper program needs to be adjusted to be able to find wherever the package does end up being installed. I don't know of a way to get this information without writing a Custom setup script with Cabal; hopefully installation to the user package database is not too onerous for casual users.&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;cabal-install&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;stack&lt;/tt&gt; differ slightly in how they go about passing home modules to the invocation of GHCi: &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;cabal-install&lt;/span&gt;&lt;/tt&gt; will call GHC with an argument for every module in the home package; Stack will pass a GHCi script of things to load. I'm not sure which is more convenient, but it probably doesn't matter too much if you already know which module you want to look at (perhaps you got it from a frontend option.)&lt;/li&gt;
&lt;/ul&gt;
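&lt;p&gt;To illustrate the last two points, here is a sketch (not from the example repository) of a frontend that reads a module name from its frontend options; the first argument of the frontend receives each &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;-ffrontend-opt&lt;/span&gt;&lt;/tt&gt; value in order:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
hello :: [String] -&amp;gt; [(String, Maybe Phase)] -&amp;gt; Ghc ()
-- the first frontend option names the module we want to analyze
hello (modname:_) _targets = liftIO $ putStrLn (&amp;quot;Analyzing &amp;quot; ++ modname)
hello []          _targets = liftIO $ putStrLn &amp;quot;Usage: -ffrontend-opt=MODULE&amp;quot;
&lt;/pre&gt;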
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/sIN8dxIVflc" height="1" width="1" alt=""/&gt;</content>
			<link rel="replies" type="text/html" href="http://blog.ezyang.com/2017/02/how-to-integrate-ghc-api-programs-with-cabal/#comments" thr:count="6" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2017/02/how-to-integrate-ghc-api-programs-with-cabal/feed/atom/" thr:count="6" />
		<thr:total>6</thr:total>
		<feedburner:origLink>http://blog.ezyang.com/2017/02/how-to-integrate-ghc-api-programs-with-cabal/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Try Backpack: Cabal packages]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/0igSLleym6k/" />
		<id>http://blog.ezyang.com/?p=9771</id>
		<!-- ezyang: omitted update time -->
		<published>2017-01-18T04:17:21Z</published>
		<category scheme="http://blog.ezyang.com" term="Backpack" />		<summary type="html"><![CDATA[This post is part two of a series about how you can try out Backpack, a new mixin package system for Haskell. In the previous post, we described how to use a new ghc --backpack mode in GHC to quickly try out Backpack's new signature features. Unfortunately, there is no way to distribute the input [&#8230;]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2017/01/try-backpack-cabal-packages/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;This post is part two of a series about how you can try out Backpack, a new mixin package system for Haskell. In the &lt;a class="reference external" href="http://blog.ezyang.com/2016/10/try-backpack-ghc-backpack/"&gt;previous post&lt;/a&gt;, we described how to use a new &lt;tt class="docutils literal"&gt;ghc &lt;span class="pre"&gt;--backpack&lt;/span&gt;&lt;/tt&gt; mode in GHC to quickly try out Backpack's new signature features. Unfortunately, there is no way to distribute the input files to this mode as packages on Hackage. So in this post, we walk through how to assemble equivalent Cabal packages which have the same functionality.&lt;/p&gt;
&lt;div class="section" id="download-a-cabal-install-nightly"&gt;
&lt;h3&gt;Download a cabal-install nightly&lt;/h3&gt;
&lt;p&gt;Along with the GHC nightly, you will need a cabal-install nightly to run these examples.  Assuming that you have installed &lt;a class="reference external" href="https://launchpad.net/~hvr/+archive/ubuntu/ghc"&gt;hvr's PPA&lt;/a&gt; already, just &lt;tt class="docutils literal"&gt;aptitude install &lt;span class="pre"&gt;cabal-install-head&lt;/span&gt;&lt;/tt&gt; and you will get a Backpack-ready cabal-install in &lt;tt class="docutils literal"&gt;/opt/cabal/head/bin/&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;Otherwise, you will need to build &lt;a class="reference external" href="https://github.com/haskell/cabal"&gt;cabal-install from source&lt;/a&gt;. I recommend using a released version of GHC (e.g., your system GHC, not a nightly) to build cabal-install.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="where-we-are-going"&gt;
&lt;h3&gt;Where we are going&lt;/h3&gt;
&lt;p&gt;Here is an abridged copy of the code we developed in the last post, where I have removed all of the module/signature contents:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
unit str-bytestring where
    module Str

unit str-string where
    module Str

unit regex-types where
    module Regex.Types

unit regex-indef where
    dependency regex-types
    signature Str
    module Regex

unit main where
    dependency regex-types
    dependency regex-indef[Str=str-string:Str]     (Regex as Regex.String)
    dependency regex-indef[Str=str-bytestring:Str] (Regex as Regex.ByteString)
    module Main
&lt;/pre&gt;
&lt;p&gt;One obvious way to translate this file into Cabal packages is to define a package per unit.  However, we can also define a single package with many &lt;em&gt;internal libraries&lt;/em&gt;—a new feature, independent of Backpack, which lets you define private helper libraries inside a single package. Since this approach involves less boilerplate, we'll describe it first, before &amp;quot;productionizing&amp;quot; the libraries into separate packages.&lt;/p&gt;
&lt;p&gt;For all of these examples, we assume that the source code of the modules and signatures has been copy-pasted into appropriate &lt;tt class="docutils literal"&gt;hs&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;hsig&lt;/tt&gt; files respectively. You can find these files in the &lt;a class="reference external" href="https://github.com/ezyang/backpack-regex-example/tree/source-only"&gt;source-only branch of backpack-regex-example&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="single-package-layout"&gt;
&lt;h3&gt;Single package layout&lt;/h3&gt;
&lt;p&gt;In this section, we'll step through the Cabal file which defines each unit as an internal library. You can find all the files for this version at the &lt;a class="reference external" href="https://github.com/ezyang/backpack-regex-example/tree/single-package"&gt;single-package branch of backpack-regex-example&lt;/a&gt;. This package can be built with a conventional &lt;tt class="docutils literal"&gt;cabal configure &lt;span class="pre"&gt;-w&lt;/span&gt; &lt;span class="pre"&gt;ghc-head&lt;/span&gt;&lt;/tt&gt; (replace &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;ghc-head&lt;/span&gt;&lt;/tt&gt; with the path to your copy of GHC HEAD) and then &lt;tt class="docutils literal"&gt;cabal build&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;The header of the package file is fairly ordinary, but as Backpack uses new Cabal features, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;cabal-version&lt;/span&gt;&lt;/tt&gt; must be set to &lt;tt class="docutils literal"&gt;&amp;gt;=1.25&lt;/tt&gt; (note that Backpack does NOT work with &lt;tt class="docutils literal"&gt;Custom&lt;/tt&gt; setup):&lt;/p&gt;
&lt;pre class="literal-block"&gt;
name:                regex-example
version:             0.1.0.0
build-type:          Simple
cabal-version:       &amp;gt;=1.25
&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Private libraries.&lt;/strong&gt; &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;str-bytestring&lt;/span&gt;&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;str-string&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-types&lt;/span&gt;&lt;/tt&gt; are completely conventional Cabal libraries that only have modules.  In previous versions of Cabal, we would have to make a package for each of them.  However, with private libraries, we can simply list multiple library stanzas annotated with the internal name of the library:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
library str-bytestring
  build-depends:       base, bytestring
  exposed-modules:     Str
  hs-source-dirs:      str-bytestring

library str-string
  build-depends:       base
  exposed-modules:     Str
  hs-source-dirs:      str-string

library regex-types
  build-depends:       base
  exposed-modules:     Regex.Types
  hs-source-dirs:      regex-types
&lt;/pre&gt;
&lt;p&gt;To keep the modules for each of these internal libraries separate, we give each a distinct &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;hs-source-dirs&lt;/span&gt;&lt;/tt&gt;. These libraries can be depended upon inside this package, but are hidden from external clients; only the &lt;em&gt;public library&lt;/em&gt; (denoted by a &lt;tt class="docutils literal"&gt;library&lt;/tt&gt; stanza with no name) is publicly visible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Indefinite libraries.&lt;/strong&gt; &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-indef&lt;/span&gt;&lt;/tt&gt; is slightly different, in that it has a signature.  But writing a library for it is not too different: signatures go in the aptly named &lt;tt class="docutils literal"&gt;signatures&lt;/tt&gt; field:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
library regex-indef
  build-depends:       base, regex-types
  signatures:          Str
  exposed-modules:     Regex
  hs-source-dirs:      regex-indef
&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Instantiating.&lt;/strong&gt; How do we instantiate &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-indef&lt;/span&gt;&lt;/tt&gt;? In our &lt;tt class="docutils literal"&gt;bkp&lt;/tt&gt; file, we had to explicitly specify how the signatures of the package were to be filled:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
dependency regex-indef[Str=str-string:Str]     (Regex as Regex.String)
dependency regex-indef[Str=str-bytestring:Str] (Regex as Regex.ByteString)
&lt;/pre&gt;
&lt;p&gt;With Cabal, these instantiations can be specified through a more indirect process of &lt;em&gt;mix-in linking&lt;/em&gt;, whereby the dependencies of a package are &amp;quot;mixed together&amp;quot;, with required signatures of one dependency being filled by exposed modules of another dependency.  Before writing the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-example&lt;/span&gt;&lt;/tt&gt; executable, let's write a &lt;tt class="docutils literal"&gt;regex&lt;/tt&gt; library, which is like &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-indef&lt;/span&gt;&lt;/tt&gt;, except that it is specialized for &lt;tt class="docutils literal"&gt;String&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
library regex
  build-depends:       regex-indef, str-string
  reexported-modules:  Regex as Regex.String
&lt;/pre&gt;
&lt;p&gt;Here, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-indef&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;str-string&lt;/span&gt;&lt;/tt&gt; are mix-in linked together: the &lt;tt class="docutils literal"&gt;Str&lt;/tt&gt; module from &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;str-string&lt;/span&gt;&lt;/tt&gt; fills the &lt;tt class="docutils literal"&gt;Str&lt;/tt&gt; requirement from &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-indef&lt;/span&gt;&lt;/tt&gt;. This library then reexports &lt;tt class="docutils literal"&gt;Regex&lt;/tt&gt; under a new name that makes it clear it's the &lt;tt class="docutils literal"&gt;String&lt;/tt&gt; instantiation.&lt;/p&gt;
&lt;p&gt;We can easily do the same for a &lt;tt class="docutils literal"&gt;ByteString&lt;/tt&gt; instantiated version of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-indef&lt;/span&gt;&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
library regex-bytestring
  build-depends:       regex-indef, str-bytestring
  reexported-modules:  Regex as Regex.ByteString
&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;Tie it all together.&lt;/strong&gt; It's simple enough to add the executable and then build the code:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
executable regex-example
  main-is:             Main.hs
  build-depends:       base, regex, regex-bytestring, regex-types
  hs-source-dirs:      regex-example
&lt;/pre&gt;
&lt;p&gt;In the root directory of the package, you can &lt;tt class="docutils literal"&gt;cabal configure; cabal build&lt;/tt&gt; the package (make sure you pass &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;-w&lt;/span&gt; &lt;span class="pre"&gt;ghc-head&lt;/span&gt;&lt;/tt&gt;!) Alternatively, you can use &lt;tt class="docutils literal"&gt;cabal &lt;span class="pre"&gt;new-build&lt;/span&gt;&lt;/tt&gt; to the same effect.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="there-s-more-than-one-way-to-do-it"&gt;
&lt;h3&gt;There's more than one way to do it&lt;/h3&gt;
&lt;p&gt;In the previous code sample, we used &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;reexported-modules&lt;/span&gt;&lt;/tt&gt; to rename modules at &lt;em&gt;declaration-time&lt;/em&gt;, so that they did not conflict with each other. However, this was possible only because we created extra &lt;tt class="docutils literal"&gt;regex&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-bytestring&lt;/span&gt;&lt;/tt&gt; libraries. In some situations (especially if we are actually creating new packages as opposed to internal libraries), this can be quite cumbersome, so Backpack offers a way to rename modules at &lt;em&gt;use-time&lt;/em&gt;, using the &lt;tt class="docutils literal"&gt;mixins&lt;/tt&gt; field. It works like this: any package declared in &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends&lt;/span&gt;&lt;/tt&gt; can be specified in &lt;tt class="docutils literal"&gt;mixins&lt;/tt&gt; with an explicit renaming, specifying which modules should be brought into scope, with what name.&lt;/p&gt;
&lt;p&gt;For example, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;str-string&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;str-bytestring&lt;/span&gt;&lt;/tt&gt; both export a module named &lt;tt class="docutils literal"&gt;Str&lt;/tt&gt;. To refer to both modules without using package-qualified imports, we can rename them as follows:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
executable str-example
  main-is:             Main.hs
  build-depends:       base, str-string, str-bytestring
  mixins:              str-string     (Str as Str.String),
                       str-bytestring (Str as Str.ByteString)
  hs-source-dirs:      str-example
&lt;/pre&gt;
&lt;p&gt;The semantics of the &lt;tt class="docutils literal"&gt;mixins&lt;/tt&gt; field is that we bring only the modules explicitly listed in the import specification (&lt;tt class="docutils literal"&gt;Str as Str.String&lt;/tt&gt;) into scope for import. If a package never occurs in &lt;tt class="docutils literal"&gt;mixins&lt;/tt&gt;, then we default to bringing all modules into scope (giving us the traditional behavior of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends&lt;/span&gt;&lt;/tt&gt;). This does mean that if you say &lt;tt class="docutils literal"&gt;mixins: &lt;span class="pre"&gt;str-string&lt;/span&gt; ()&lt;/tt&gt;, you can force a component to have a dependency on &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;str-string&lt;/span&gt;&lt;/tt&gt;, but NOT bring any of its modules into scope.&lt;/p&gt;
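&lt;p&gt;Spelled out as a (purely illustrative) stanza, that degenerate case of depending on &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;str-string&lt;/span&gt;&lt;/tt&gt; without bringing any of its modules into scope looks like this:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
executable str-example
  main-is:             Main.hs
  build-depends:       base, str-string
  mixins:              str-string ()
  hs-source-dirs:      str-example
&lt;/pre&gt;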
&lt;p&gt;It has been argued that package authors should avoid defining packages with &lt;a class="reference external" href="http://www.snoyman.com/blog/2017/01/conflicting-module-names"&gt;conflicting module names&lt;/a&gt;.  So supposing that we restructure &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;str-string&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;str-bytestring&lt;/span&gt;&lt;/tt&gt; to have unique module names:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
library str-string
  build-depends:       base
  exposed-modules:     Str.String
  hs-source-dirs:      str-string

library str-bytestring
  build-depends:       base, bytestring
  exposed-modules:     Str.ByteString
  hs-source-dirs:      str-bytestring
&lt;/pre&gt;
&lt;p&gt;We would then need to rewrite &lt;tt class="docutils literal"&gt;regex&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-bytestring&lt;/span&gt;&lt;/tt&gt; to rename &lt;tt class="docutils literal"&gt;Str.String&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;Str.ByteString&lt;/tt&gt; to &lt;tt class="docutils literal"&gt;Str&lt;/tt&gt;, so that they fill the hole of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-indef&lt;/span&gt;&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
library regex
  build-depends:       regex-indef, str-string
  mixins:              str-string (Str.String as Str)
  reexported-modules:  Regex as Regex.String

library regex-bytestring
  build-depends:       regex-indef, str-bytestring
  mixins:              str-bytestring (Str.ByteString as Str)
  reexported-modules:  Regex as Regex.ByteString
&lt;/pre&gt;
&lt;p&gt;In fact, with the &lt;tt class="docutils literal"&gt;mixins&lt;/tt&gt; field, we can avoid defining the &lt;tt class="docutils literal"&gt;regex&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-bytestring&lt;/span&gt;&lt;/tt&gt; shim libraries entirely. We can do this by declaring &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-indef&lt;/span&gt;&lt;/tt&gt; twice in &lt;tt class="docutils literal"&gt;mixins&lt;/tt&gt;, renaming the &lt;em&gt;requirements&lt;/em&gt; of each separately:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
executable regex-example
  main-is:             Main.hs
  build-depends:       base, regex-indef, str-string, str-bytestring, regex-types
  mixins:              regex-indef (Regex as Regex.String)
                          requires (Str as Str.String),
                       regex-indef (Regex as Regex.ByteString)
                          requires (Str as Str.ByteString)
  hs-source-dirs:      regex-example
&lt;/pre&gt;
&lt;p&gt;This particular example is given in its entirety at the &lt;a class="reference external" href="https://github.com/ezyang/backpack-regex-example/tree/better-single-package"&gt;better-single-package branch in backpack-regex-example&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Note that requirement renamings are syntactically preceded by the &lt;tt class="docutils literal"&gt;requires&lt;/tt&gt; keyword.&lt;/p&gt;
&lt;p&gt;The art of writing Backpack packages is still in its infancy, so it's unclear what conventions will win out in the end. But here is my suggestion: when defining a module intended to implement a signature, follow the existing no-conflicting-module-names convention.  However, also add a reexport of your module under the name of the signature.  This trick takes advantage of the fact that Cabal will not report that a module is redundant unless it is actually used.  So, suppose we have:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
library str-string
  build-depends:       base
  exposed-modules:     Str.String
  reexported-modules:  Str.String as Str
  hs-source-dirs:      str-string

library str-bytestring
  build-depends:       base, bytestring
  exposed-modules:     Str.ByteString
  reexported-modules:  Str.ByteString as Str
  hs-source-dirs:      str-bytestring
&lt;/pre&gt;
&lt;p&gt;Now all of the following components work:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
library regex
  build-depends:       regex-indef, str-string
  reexported-modules:  Regex as Regex.String

library regex-bytestring
  build-depends:       regex-indef, str-bytestring
  reexported-modules:  Regex as Regex.ByteString

-- &amp;quot;import Str.String&amp;quot; is unambiguous, even though &amp;quot;import Str&amp;quot; is ambiguous
executable str-example
  main-is:             Main.hs
  build-depends:       base, str-string, str-bytestring
  hs-source-dirs:      str-example

-- All requirements are renamed away from Str, so all the
-- instantiations are unambiguous
executable regex-example
  main-is:             Main.hs
  build-depends:       base, regex-indef, str-string, str-bytestring, regex-types
  mixins:              regex-indef (Regex as Regex.String)
                          requires (Str as Str.String),
                       regex-indef (Regex as Regex.ByteString)
                          requires (Str as Str.ByteString)
  hs-source-dirs:      regex-example
&lt;/pre&gt;
&lt;/div&gt;
&lt;div class="section" id="separate-packages"&gt;
&lt;h3&gt;Separate packages&lt;/h3&gt;
&lt;p&gt;OK, so how do we actually scale this up into an ecosystem of indefinite packages, each of which can be used individually and maintained by separate individuals? The library stanzas stay essentially the same as above; just create a separate package for each one. Rather than reproduce all of the boilerplate here, the full source code is available in the &lt;a class="reference external" href="https://github.com/ezyang/backpack-regex-example/tree/multiple-packages"&gt;multiple-packages branch of backpack-regex-example&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There is one important gotcha: the package manager needs to know how to instantiate and build these Backpack packages (in the single package case, the smarts were encapsulated entirely inside the &lt;tt class="docutils literal"&gt;Cabal&lt;/tt&gt; library). As of writing, the only command that knows how to do this is &lt;tt class="docutils literal"&gt;cabal &lt;span class="pre"&gt;new-build&lt;/span&gt;&lt;/tt&gt; (I plan on adding support to &lt;tt class="docutils literal"&gt;stack&lt;/tt&gt; eventually, but not until after I am done writing my thesis; and I do not plan on adding support to old-style &lt;tt class="docutils literal"&gt;cabal install&lt;/tt&gt; ever.)&lt;/p&gt;
&lt;p&gt;Fortunately, it's very easy to use &lt;tt class="docutils literal"&gt;cabal &lt;span class="pre"&gt;new-build&lt;/span&gt;&lt;/tt&gt; to build &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;regex-example&lt;/span&gt;&lt;/tt&gt;; just say &lt;tt class="docutils literal"&gt;cabal &lt;span class="pre"&gt;new-build&lt;/span&gt; &lt;span class="pre"&gt;-w&lt;/span&gt; &lt;span class="pre"&gt;ghc-head&lt;/span&gt; &lt;span class="pre"&gt;regex-example&lt;/span&gt;&lt;/tt&gt;. Done!&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="conclusions"&gt;
&lt;h3&gt;Conclusions&lt;/h3&gt;
&lt;p&gt;If you actually want to use Backpack &lt;em&gt;for real&lt;/em&gt;, what can you do? There are a number of possibilities:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;If you are willing to use GHC 8.2 only, and you only need to parametrize code internally (where the public library looks like an ordinary, non-Backpack package), using Backpack with internal libraries is a good fit. The resulting package will be buildable with Stack and cabal-install, as long as you are using GHC 8.2. This is probably the most pragmatic way you can make use of Backpack; the primary problem is that Haddock doesn't know how to deal with &lt;a class="reference external" href="https://github.com/haskell/haddock/issues/563"&gt;reexported modules&lt;/a&gt;, but this should be fixable.&lt;/li&gt;
&lt;li&gt;If you are willing to use &lt;tt class="docutils literal"&gt;cabal &lt;span class="pre"&gt;new-build&lt;/span&gt;&lt;/tt&gt; only, then you can also write packages which have requirements, and let clients decide however they want to implement their packages.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Probably the biggest &amp;quot;real-world&amp;quot; impediment to using Backpack, besides any lurking bugs, is subpar support for Haddock. But if you are willing to overlook this (for now, in any case), please give it a try!&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/0igSLleym6k" height="1" width="1" alt=""/&gt;</content>
			<link rel="replies" type="text/html" href="http://blog.ezyang.com/2017/01/try-backpack-cabal-packages/#comments" thr:count="10" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2017/01/try-backpack-cabal-packages/feed/atom/" thr:count="10" />
		<thr:total>10</thr:total>
		<feedburner:origLink>http://blog.ezyang.com/2017/01/try-backpack-cabal-packages/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[A tale of backwards compatibility in ASTs]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/LJoUTRs0Rjc/" />
		<id>http://blog.ezyang.com/?p=9820</id>
		<!-- ezyang: omitted update time -->
		<published>2017-01-01T04:35:33Z</published>
		<category scheme="http://blog.ezyang.com" term="Haskell" />		<summary type="html"><![CDATA[Those that espouse the value of backwards compatibility often claim that backwards compatibility is simply a matter of never removing things. But anyone who has published APIs that involve data structures know that the story is not so simple. I'd like to describe my thought process on a recent BC problem I'm grappling with on [&#8230;]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2016/12/a-tale-of-backwards-compatibility-in-asts/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Those that espouse the value of backwards compatibility often claim that backwards compatibility is simply a matter of never &lt;em&gt;removing&lt;/em&gt; things. But anyone who has published APIs that involve data structures knows that the story is not so simple. I'd like to describe my thought process on a recent BC problem I'm grappling with on the Cabal file format. As usual, I'm always interested in any insights and comments you might have.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The status quo.&lt;/strong&gt; The &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends&lt;/span&gt;&lt;/tt&gt; field in a Cabal file is used to declare dependencies on other packages. The format is a comma-separated list of package name and version constraints, e.g., &lt;tt class="docutils literal"&gt;base &amp;gt;= 4.2 &amp;amp;&amp;amp; &amp;lt; 4.3&lt;/tt&gt;.  Abstractly, we represent this as a list of &lt;tt class="docutils literal"&gt;Dependency&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
data Dependency = Dependency PackageName VersionRange
&lt;/pre&gt;
&lt;p&gt;The effect of an entry in &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends&lt;/span&gt;&lt;/tt&gt; is twofold: first, it specifies a version constraint which a dependency solver takes into account when picking a version of the package; second, it brings the modules of that package into scope, so that they can be used.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The extension.&lt;/strong&gt; We added support for &amp;quot;internal libraries&amp;quot; in Cabal, which allow you to specify multiple libraries in a single package. For example, suppose you're writing a library, but there are some internal functions that you want to expose to your test suite but not the general public. You can place these functions in an internal library, which is depended upon by both the public library and the test suite, but not available to external packages.&lt;/p&gt;
&lt;p&gt;For more motivation, see the original &lt;a class="reference external" href="https://github.com/haskell/cabal/issues/269"&gt;feature request&lt;/a&gt;, but for the purpose of this blog post, we're interested in the question of how to specify a dependency on one of these internal libraries.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Attempt #1: Keep the old syntax.&lt;/strong&gt; My first idea for a new syntax for internal libraries was to keep the syntax of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends&lt;/span&gt;&lt;/tt&gt; &lt;em&gt;unchanged&lt;/em&gt;. To refer to an internal library named &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt;, you simply write &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends:&lt;/span&gt; foo&lt;/tt&gt;; an internal library shadows any external package with the same name.&lt;/p&gt;
&lt;p&gt;Backwards compatible? Absolutely not. Remember that the original interpretation of entries in &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends&lt;/span&gt;&lt;/tt&gt; is of &lt;em&gt;package&lt;/em&gt; names and version ranges. So any code that assumed there actually was an external package for each entry in &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends&lt;/span&gt;&lt;/tt&gt; would choke in an unexpected way when a dependency on an internal library was specified. This is exactly what happened with cabal-install's dependency solver, which needed to be updated to filter out dependencies that corresponded to internal libraries.&lt;/p&gt;
&lt;p&gt;One might argue that it is acceptable for old code to break if the new feature is used. But there is a larger, philosophical objection to overloading package names in this way: don't call something a package name if it... isn't actually a package name!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Attempt #2: A new syntax.&lt;/strong&gt; Motivated by this philosophical concern, as well as the problem that you couldn't simultaneously refer to an internal library named &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt; and an external package named &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt;, we introduce a new syntactic form: to refer to the internal library &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt; in the package &lt;tt class="docutils literal"&gt;pkg&lt;/tt&gt;, we write &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends:&lt;/span&gt; pkg:foo&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;Since there's a new syntactic form, our internal AST also has to change to handle this new form. The obvious thing to do is introduce a new type of dependency:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
data BuildDependency =
  BuildDependency PackageName
                  (Maybe UnqualComponentName)
                  VersionRange
&lt;/pre&gt;
&lt;p&gt;and say that the contents of &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends&lt;/span&gt;&lt;/tt&gt; is a list of &lt;tt class="docutils literal"&gt;BuildDependency&lt;/tt&gt;.&lt;/p&gt;
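&lt;p&gt;The conversion back to the old form is then just a projection that forgets the component name. Here is a minimal, self-contained sketch, using simplified stand-ins for the real Cabal types (which carry more structure):&lt;/p&gt;

```haskell
newtype PackageName = PackageName String deriving (Eq, Show)
newtype UnqualComponentName = UnqualComponentName String deriving (Eq, Show)
newtype VersionRange = VersionRange String deriving (Eq, Show)

data Dependency = Dependency PackageName VersionRange
  deriving (Eq, Show)

data BuildDependency =
  BuildDependency PackageName
                  (Maybe UnqualComponentName)
                  VersionRange
  deriving (Eq, Show)

-- Forget the (optional) internal library name to recover the old form.
buildDependencyToDependency :: BuildDependency -> Dependency
buildDependencyToDependency (BuildDependency pn _ vr) = Dependency pn vr

main :: IO ()
main = print (buildDependencyToDependency
                (BuildDependency (PackageName "pkg")
                                 (Just (UnqualComponentName "foo"))
                                 (VersionRange ">= 1.0")))
```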
&lt;p&gt;When it comes to changes to data representation, this is a &amp;quot;best-case scenario&amp;quot;, because we can easily write a function &lt;tt class="docutils literal"&gt;BuildDependency &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Dependency&lt;/tt&gt;. So supposing our data structure for describing library build information looked something like this:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
data BuildInfo = BuildInfo {
    targetBuildDepends :: [Dependency],
    -- other fields
  }
&lt;/pre&gt;
&lt;p&gt;We can preserve backwards compatibility by turning &lt;tt class="docutils literal"&gt;targetBuildDepends&lt;/tt&gt; into a function that reads out the new, extended field and converts it to the old form:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
data BuildInfo = BuildInfo {
    targetBuildDepends2 :: [BuildDependency],
    -- other fields
  }

targetBuildDepends :: BuildInfo -&amp;gt; [Dependency]
targetBuildDepends = map buildDependencyToDependency
                   . targetBuildDepends2
&lt;/pre&gt;
&lt;p&gt;Critically, this takes advantage of the fact that record selectors in Haskell look like functions, so we can replace a selector with a function without affecting downstream code.&lt;/p&gt;
&lt;p&gt;Unfortunately, this is not actually true. Haskell also supports &lt;em&gt;record update&lt;/em&gt;, which lets a user overwrite a field as follows: &lt;tt class="docutils literal"&gt;bi { targetBuildDepends = new_deps }&lt;/tt&gt;. If we look at Hackage, there are actually a dozen or so uses of &lt;tt class="docutils literal"&gt;targetBuildDepends&lt;/tt&gt; in this way. So, if we want to uphold backwards-compatibility, we can't delete this field. And unfortunately, Haskell doesn't support overloading the meaning of record update (perhaps the lesson to be learned here is that you should never export record selectors: export some lenses instead).&lt;/p&gt;
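&lt;p&gt;To make the &amp;quot;export some lenses instead&amp;quot; remark concrete: with a lens, the library author gets to define what a legacy update &lt;em&gt;means&lt;/em&gt; in the new representation, rather than being stuck with built-in record update. A hand-rolled sketch with simplified stand-in types (hypothetical names; the real Cabal types are richer):&lt;/p&gt;

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const(..))
import Data.Functor.Identity (Identity(..))

type PackageName = String
type VersionRange = String

data Dependency = Dependency PackageName VersionRange
  deriving (Eq, Show)

data BuildDependency =
  BuildDependency PackageName (Maybe String) VersionRange
  deriving (Eq, Show)

newtype BuildInfo = BuildInfo { targetBuildDepends2 :: [BuildDependency] }
  deriving Show

toOld :: BuildDependency -> Dependency
toOld (BuildDependency pn _ vr) = Dependency pn vr

injDep :: Dependency -> BuildDependency
injDep (Dependency pn vr) = BuildDependency pn Nothing vr

-- A van Laarhoven lens; no lens library needed.
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- The setter decides what a legacy update means in the new world:
-- updated entries are reinjected with no component name.
targetBuildDependsL :: Lens BuildInfo [Dependency]
targetBuildDependsL k bi =
  fmap (\ds -> bi { targetBuildDepends2 = map injDep ds })
       (k (map toOld (targetBuildDepends2 bi)))

view :: Lens s a -> s -> a
view l = getConst . l Const

set :: Lens s a -> a -> s -> s
set l x = runIdentity . l (const (Identity x))

main :: IO ()
main = do
  let bi = BuildInfo [BuildDependency "pkg" (Just "foo") ">= 1.0"]
  print (view targetBuildDependsL bi)
  print (set targetBuildDependsL [Dependency "base" ">= 4"] bi)
```

&lt;p&gt;Note that the setter here commits to one particular update semantics: every written entry is reinjected with no component name, so internal-library information is deliberately dropped rather than silently left stale.&lt;/p&gt;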
&lt;p&gt;It is possible that, on balance, breaking a dozen packages is a fair price to pay for a change like this. But let's suppose that we are dead-set on maintaining BC.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Attempt #3: Keep both fields.&lt;/strong&gt; One simple way to keep the old code working is to just keep both fields:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
data BuildInfo = BuildInfo {
    targetBuildDepends  :: [Dependency],
    targetBuildDepends2 :: [BuildDependency],
    -- other fields
  }
&lt;/pre&gt;
&lt;p&gt;We introduce a new invariant, which is that &lt;tt class="docutils literal"&gt;targetBuildDepends bi == map buildDependencyToDependency (targetBuildDepends2 bi)&lt;/tt&gt;. See the problem? Any legacy code which updates &lt;tt class="docutils literal"&gt;targetBuildDepends&lt;/tt&gt; probably won't know to update &lt;tt class="docutils literal"&gt;targetBuildDepends2&lt;/tt&gt;, breaking the invariant and probably resulting in some very confusing bugs. Ugh.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Attempt #4: Do some math.&lt;/strong&gt; The problem with the representation above is that it is redundant, which meant that we had to add invariants to &amp;quot;reduce&amp;quot; the space of acceptable values under the type. Generally, we like types which are &amp;quot;tight&amp;quot;, so that, as Yaron Minsky puts it, we &amp;quot;make illegal states unrepresentable.&amp;quot;&lt;/p&gt;
&lt;p&gt;To think a little more carefully about the problem, let's cast it into a mathematical form.  We have an &lt;tt class="docutils literal"&gt;Old&lt;/tt&gt; type (isomorphic to &lt;tt class="docutils literal"&gt;[(PN, VR)]&lt;/tt&gt;) and a &lt;tt class="docutils literal"&gt;New&lt;/tt&gt; type (isomorphic to &lt;tt class="docutils literal"&gt;[(PN, Maybe CN, VR)]&lt;/tt&gt;).  &lt;tt class="docutils literal"&gt;Old&lt;/tt&gt; is a subspace of &lt;tt class="docutils literal"&gt;New&lt;/tt&gt;, so we have a well-known injection &lt;tt class="docutils literal"&gt;inj :: Old &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; New&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;When a user updates &lt;tt class="docutils literal"&gt;targetBuildDepends&lt;/tt&gt;, they apply a function &lt;tt class="docutils literal"&gt;f :: Old &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; Old&lt;/tt&gt;. In making our systems backwards compatible, we implicitly define a new function &lt;tt class="docutils literal"&gt;g :: New &lt;span class="pre"&gt;-&amp;gt;&lt;/span&gt; New&lt;/tt&gt;, which is an extension of &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; (i.e., &lt;tt class="docutils literal"&gt;inj . f == g . inj&lt;/tt&gt;): this function tells us what the &lt;em&gt;semantics&lt;/em&gt; of a legacy update in the new system is.  Once we have this function, we then seek a decomposition of &lt;tt class="docutils literal"&gt;New&lt;/tt&gt; into &lt;tt class="docutils literal"&gt;(Old, T)&lt;/tt&gt;, such that applying &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; to the first component of &lt;tt class="docutils literal"&gt;(Old, T)&lt;/tt&gt; gives you a value equivalent to the result of applying &lt;tt class="docutils literal"&gt;g&lt;/tt&gt; to the original &lt;tt class="docutils literal"&gt;New&lt;/tt&gt; value.&lt;/p&gt;
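&lt;p&gt;The extension condition is easy to state in code. A small sketch, with lists of pairs and triples standing in for &lt;tt class="docutils literal"&gt;Old&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;New&lt;/tt&gt;, plus a sample &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; that adds a dependency and one plausible extension &lt;tt class="docutils literal"&gt;g&lt;/tt&gt;:&lt;/p&gt;

```haskell
type PN = String
type CN = String
type VR = String

type Old = [(PN, VR)]
type New = [(PN, Maybe CN, VR)]

-- The well-known injection: old entries carry no component name.
inj :: Old -> New
inj = map (\(pn, vr) -> (pn, Nothing, vr))

-- A sample legacy update: add a dependency on parsec.
f :: Old -> Old
f old = ("parsec", ">= 3") : old

-- One plausible extension of f to the new world.
g :: New -> New
g new = ("parsec", Nothing, ">= 3") : new

-- The extension property inj . f == g . inj, checked on one sample.
main :: IO ()
main = print (inj (f sample) == g (inj sample))
  where sample = [("base", ">= 4")]
```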
&lt;p&gt;Because in Haskell, &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; is an opaque function, we can't actually implement many &amp;quot;common-sense&amp;quot; extensions. For example, we might want it to be the case that if &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; updates all occurrences of &lt;tt class="docutils literal"&gt;parsec&lt;/tt&gt; with &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;parsec-new&lt;/span&gt;&lt;/tt&gt;, the corresponding &lt;tt class="docutils literal"&gt;g&lt;/tt&gt; does the same update. But there is no way to distinguish between an &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; that updates, and an &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; that deletes the dependency on &lt;tt class="docutils literal"&gt;parsec&lt;/tt&gt;, and then adds a new dependency on &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;parsec-new&lt;/span&gt;&lt;/tt&gt;. (In the bidirectional programming world, this is the distinction between &lt;a class="reference external" href="https://www.cis.upenn.edu/~bcpierce/papers/lenses-etapsslides.pdf"&gt;state-based and operation-based approaches&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;We really only can do something reasonable if &lt;tt class="docutils literal"&gt;f&lt;/tt&gt; only ever adds dependencies; in this case, we might write something like this:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
data BuildInfo = BuildInfo {
    targetBuildDepends :: [Dependency],
    targetSubLibDepends :: [(PackageName, UnqualComponentName)],
    targetExcludeLibDepends :: [PackageName],
    -- other fields
  }
&lt;/pre&gt;
&lt;p&gt;The conversion from this to &lt;tt class="docutils literal"&gt;BuildDependency&lt;/tt&gt; goes something like:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;For each &lt;tt class="docutils literal"&gt;Dependency pn vr&lt;/tt&gt; in &lt;tt class="docutils literal"&gt;targetBuildDepends&lt;/tt&gt;, if the package name is not mentioned in &lt;tt class="docutils literal"&gt;targetExcludeLibDepends&lt;/tt&gt;, we have &lt;tt class="docutils literal"&gt;BuildDependency pn Nothing vr&lt;/tt&gt;.&lt;/li&gt;
&lt;li&gt;For each &lt;tt class="docutils literal"&gt;(pn, cn)&lt;/tt&gt; in &lt;tt class="docutils literal"&gt;targetSubLibDepends&lt;/tt&gt; where there is a &lt;tt class="docutils literal"&gt;Dependency pn vr&lt;/tt&gt; (the package names are matching), we have &lt;tt class="docutils literal"&gt;BuildDependency pn (Just cn) vr&lt;/tt&gt;.&lt;/li&gt;
&lt;/ol&gt;
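&lt;p&gt;Those two steps translate directly into code. A self-contained sketch, again with simplified stand-in types and a hypothetical conversion function name:&lt;/p&gt;

```haskell
type PackageName = String
type UnqualComponentName = String
type VersionRange = String

data Dependency = Dependency PackageName VersionRange
  deriving (Eq, Show)

data BuildDependency =
  BuildDependency PackageName (Maybe UnqualComponentName) VersionRange
  deriving (Eq, Show)

data BuildInfo = BuildInfo
  { targetBuildDepends      :: [Dependency]
  , targetSubLibDepends     :: [(PackageName, UnqualComponentName)]
  , targetExcludeLibDepends :: [PackageName]
  }

toBuildDependencies :: BuildInfo -> [BuildDependency]
toBuildDependencies bi = step1 ++ step2
  where
    excluded pn = pn `elem` targetExcludeLibDepends bi
    -- Step 1: ordinary dependencies, minus the excluded package names.
    step1 = map (\(Dependency pn vr) -> BuildDependency pn Nothing vr)
                (filter (\(Dependency pn _) -> not (excluded pn))
                        (targetBuildDepends bi))
    -- Step 2: sub-library dependencies, taking the version range from
    -- the matching targetBuildDepends entry.
    step2 = concatMap subLibDep (targetSubLibDepends bi)
    subLibDep (pn, cn) =
      map (\(Dependency _ vr) -> BuildDependency pn (Just cn) vr)
          (filter (\(Dependency pn2 _) -> pn2 == pn)
                  (targetBuildDepends bi))

main :: IO ()
main = mapM_ print (toBuildDependencies bi)
  where
    bi = BuildInfo
      { targetBuildDepends      = [Dependency "base" ">= 4", Dependency "pkg" "any"]
      , targetSubLibDepends     = [("pkg", "foo")]
      , targetExcludeLibDepends = ["pkg"]
      }
```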
&lt;p&gt;Stepping back for a moment, &lt;em&gt;is this really the code we want to write&lt;/em&gt;? If the modification is not monotonic, we'll get into trouble; if someone reads out &lt;tt class="docutils literal"&gt;targetBuildDepends&lt;/tt&gt; and then writes it into a fresh &lt;tt class="docutils literal"&gt;BuildInfo&lt;/tt&gt;, we'll get into trouble. Is it really reasonable to go to these lengths to achieve such a small, error-prone slice of backwards compatibility?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conclusions.&lt;/strong&gt; I'm still not exactly sure what approach I'm going to take to handle this particular extension, but there seem to be a few lessons:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;Records are bad for backwards compatibility, because there is no way to overload a record update with a custom new update. Lenses for updates would be better.&lt;/li&gt;
&lt;li&gt;Record update is bad for backwards compatibility, because it puts us into the realm of &lt;em&gt;bidirectional programming&lt;/em&gt;, requiring us to reflect updates from the old world into the new world. If our records are read-only, life is much easier. On the other hand, if someone ever designs a programming language that is explicitly thinking about backwards compatibility, bidirectional programming better be in your toolbox.&lt;/li&gt;
&lt;li&gt;Backwards compatibility may be worse in the cure. Would you rather your software break at compile time because, yes, you really do have to think about this new case, or would you rather everything keep compiling, but break in subtle ways if the new functionality is ever used?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;What's your take? I won't claim to be an expert on questions of backwards compatibility, and would love to see you weigh in, whether it is about which approach I should take, or general thoughts about the interaction of programming languages with backwards compatibility.&lt;/p&gt;
&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/ezyang/~4/LJoUTRs0Rjc" height="1" width="1" alt=""/&gt;</content>
			<link rel="replies" type="text/html" href="http://blog.ezyang.com/2016/12/a-tale-of-backwards-compatibility-in-asts/#comments" thr:count="9" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2016/12/a-tale-of-backwards-compatibility-in-asts/feed/atom/" thr:count="9" />
		<thr:total>9</thr:total>
		<feedburner:origLink>http://blog.ezyang.com/2016/12/a-tale-of-backwards-compatibility-in-asts/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Backpack and the PVP]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/I5t00gun6tg/" />
		<id>http://blog.ezyang.com/?p=9814</id>
		<!-- ezyang: omitted update time -->
		<published>2016-12-30T06:32:31Z</published>
		<category scheme="http://blog.ezyang.com" term="Backpack" />		<summary type="html"><![CDATA[In the PVP, you increment the minor version number if you add functions to a module, and the major version number if you remove function to a module. Intuitively, this is because adding functions is a backwards compatible change, while removing functions is a breaking change; to put it more formally, if the new interface [&#8230;]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2016/12/backpack-and-the-pvp/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;In the &lt;a class="reference external" href="http://pvp.haskell.org/"&gt;PVP&lt;/a&gt;, you increment the minor version number if you add functions to a module, and the major version number if you remove functions from a module. Intuitively, this is because adding functions is a backwards compatible change, while removing functions is a breaking change; to put it more formally, if the new interface is a &lt;em&gt;subtype&lt;/em&gt; of the older interface, then only a minor version number bump is necessary.&lt;/p&gt;
&lt;p&gt;Backpack adds a new complication to the mix: signatures. What should the PVP policy for adding/removing functions from signatures be? If we interpret a package with required signatures as a &lt;em&gt;function&lt;/em&gt;, theory tells us the answer: signatures are &lt;a class="reference external" href="http://blog.ezyang.com/2014/11/tomatoes-are-a-subtype-of-vegetables/"&gt;contravariant&lt;/a&gt;, so adding required functions is breaking (bump the major version), whereas it is &lt;strong&gt;removing&lt;/strong&gt; required functions that is backwards-compatible (bump the minor version).&lt;/p&gt;
&lt;p&gt;However, that's not the end of the story.  Signatures can be &lt;em&gt;reused&lt;/em&gt;, in the sense that a package can define a signature, and then another package reuse that signature:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
unit sigs where
  signature A where
    x :: Bool
unit p where
  dependency sigs[A=&amp;lt;A&amp;gt;]
  module B where
    import A
    z = x
&lt;/pre&gt;
&lt;p&gt;In the example above, we've placed a signature in the &lt;tt class="docutils literal"&gt;sigs&lt;/tt&gt; unit, which &lt;tt class="docutils literal"&gt;p&lt;/tt&gt; uses by declaring a dependency on &lt;tt class="docutils literal"&gt;sigs&lt;/tt&gt;. &lt;tt class="docutils literal"&gt;B&lt;/tt&gt; has access to all the declarations defined by the signature &lt;tt class="docutils literal"&gt;A&lt;/tt&gt; in &lt;tt class="docutils literal"&gt;sigs&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;But there is something very odd here: if sigs were to ever remove its declaration for x, p would break (x would no longer be in scope). In this case, the PVP rule from above is incorrect: p must always declare an exact version bound on sigs, as any addition or deletion would be a breaking change.&lt;/p&gt;
&lt;p&gt;So we are in this odd situation:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;If we include a dependency with a signature, and we never use any of the declarations from that signature, we can specify a loose version bound on the dependency, allowing for it to remove declarations from the signature (making the signature easier to fulfill).&lt;/li&gt;
&lt;li&gt;However, if we ever import the signature and use anything from it, we must specify an exact bound, since removals are now breaking changes.&lt;/li&gt;
&lt;/ol&gt;
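&lt;p&gt;In Cabal terms (with hypothetical package names and versions), the two cases call for different bound styles on a signature dependency:&lt;/p&gt;

```
-- Case 1: we never use declarations from sigs' signatures,
-- so a PVP-style loose bound is fine:
build-depends: sigs >= 0.1

-- Case 2: we import and use declarations from sigs' signatures,
-- so we must pin an exact version:
build-depends: sigs == 0.1.0.0
```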
&lt;p&gt;I don't think end users of Backpack should be expected to get this right on their own, so GHC (in this &lt;a class="reference external" href="https://phabricator.haskell.org/D2906"&gt;proposed patchset&lt;/a&gt;) tries to help users out by attaching warnings like this to declarations that come solely from packages that may have been specified with loose bounds:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
foo.bkp:9:11: warning: [-Wdeprecations]
    In the use of ‘x’ (imported from A):
    &amp;quot;Inherited requirements from non-signature libraries
    (libraries with modules) should not be used, as this
    mode of use is not compatible with PVP-style version
    bounds.  Instead, copy the declaration to the local
    hsig file or move the signature to a library of its
    own and add that library as a dependency.&amp;quot;
&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;UPDATE.&lt;/strong&gt; After the publishing of this post, we ended up removing this error, because it triggered in situations which were PVP-compatible. (The gory details: if a module reexported an entity from a signature, then a use of the entity from that module would have triggered the error, due to how DEPRECATED notices work.)&lt;/p&gt;
&lt;p&gt;Of course, GHC knows nothing about bounds, so the heuristic we use is that a package is a &lt;em&gt;signature package&lt;/em&gt; with exact bounds if it does not expose any modules. A package like this is only ever useful by importing its signatures, so we never warn about this case. We conservatively assume that packages that do expose modules might be subject to PVP-style bounds, so we warn in that case, e.g., as in:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
unit q where
  signature A where
    x :: Bool
  module M where -- Module!
unit p where
  dependency q[A=&amp;lt;A&amp;gt;]
  module B where
    import A
    z = x
&lt;/pre&gt;
&lt;p&gt;As the warning suggests, this error can be fixed by explicitly specifying &lt;tt class="docutils literal"&gt;x :: Bool&lt;/tt&gt; inside &lt;tt class="docutils literal"&gt;p&lt;/tt&gt;, so that, even if &lt;tt class="docutils literal"&gt;q&lt;/tt&gt; removes its requirement, no code will break:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
unit q where
  signature A where
    x :: Bool
  module M where -- Module!
unit p where
  dependency q[A=&amp;lt;A&amp;gt;]
  signature A where
    x :: Bool
  module B where
    import A
    z = x
&lt;/pre&gt;
&lt;p&gt;Or by putting the signature in a new library of its own (as was the case in the original example.)&lt;/p&gt;
&lt;p&gt;This solution isn't perfect, as there are still ways you can end up depending on inherited signatures in PVP-incompatible ways. The most obvious is with regard to types.  In the code below, we rely on the fact that the signature from &lt;tt class="docutils literal"&gt;q&lt;/tt&gt; forces &lt;tt class="docutils literal"&gt;T&lt;/tt&gt; to be type-equal to &lt;tt class="docutils literal"&gt;Bool&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
unit q where
  signature A where
    type T = Bool
    x :: T
  module Q where
unit p where
  dependency q[A=&amp;lt;A&amp;gt;]
  signature A where
    data T
    x :: T
  module P where
    import A
    y = x :: Bool
&lt;/pre&gt;
&lt;p&gt;In principle, it should be permissible for q to relax its requirement on T, allowing it to be implemented as anything (and not just a synonym of Bool), but that change will break the usage of x in P.  Unfortunately, there isn't any easy way to warn in this case.&lt;/p&gt;
&lt;p&gt;A perhaps more principled approach would be to ban the use of signature imports that come from non-signature packages. However, in my opinion, this complicates the Backpack model without much payoff (after all, some day we'll augment version numbers with signatures and it will be glorious, right?)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;To summarize.&lt;/strong&gt; If you want to reuse signatures from a signature package, specify an &lt;em&gt;exact&lt;/em&gt; version bound on that package. If you use a component that is parametrized over signatures, do &lt;em&gt;not&lt;/em&gt; import and use declarations from those signatures; GHC will warn you if you do.&lt;/p&gt;
&lt;/div&gt;
</content>
			<link rel="replies" type="text/html" href="http://blog.ezyang.com/2016/12/backpack-and-the-pvp/#comments" thr:count="4" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2016/12/backpack-and-the-pvp/feed/atom/" thr:count="4" />
		<thr:total>4</thr:total>
		<feedburner:origLink>http://blog.ezyang.com/2016/12/backpack-and-the-pvp/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Left-recursive parsing of Haskell imports and declarations]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/EH7wf5bO08o/" />
		<id>http://blog.ezyang.com/?p=9803</id>
		<!-- ezyang: omitted update time -->
		<published>2016-12-22T01:24:11Z</published>
		<category scheme="http://blog.ezyang.com" term="GHC" />		<summary type="html"><![CDATA[Suppose that you want to parse a list separated by newlines, but you want to automatically ignore extra newlines (just in the same way that import declarations in a Haskell file can be separated by one or more newlines.) Historically, GHC has used a curious grammar to perform this parse (here, semicolons represent newlines): decls [&#8230;]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2016/12/left-recursive-parsing-of-haskell-imports-and-declarations/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Suppose that you want to parse a list separated by newlines, but you want to automatically ignore extra newlines (just in the same way that &lt;tt class="docutils literal"&gt;import&lt;/tt&gt; declarations in a Haskell file can be separated by one or more newlines.) Historically, GHC has used a curious grammar to perform this parse (here, semicolons represent newlines):&lt;/p&gt;
&lt;pre class="literal-block"&gt;
decls : decls ';' decl
      | decls ';'
      | decl
      | {- empty -}
&lt;/pre&gt;
&lt;p&gt;It takes a bit of squinting, but what this grammar does is accept a list of decls, interspersed with one or more semicolons, with zero or more leading/trailing semicolons.  For example, &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;;decl;;decl;&lt;/span&gt;&lt;/tt&gt; parses as:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
{- empty -}                             (rule 4)
{- empty -} ';' decl                    (rule 1)
{- empty -} ';' decl ';'                (rule 2)
{- empty -} ';' decl ';' ';' decl       (rule 1)
{- empty -} ';' decl ';' ';' decl ';'   (rule 2)
&lt;/pre&gt;
&lt;p&gt;(Rule 3 gets exercised if there is no leading semicolon.)&lt;/p&gt;
&lt;p&gt;This grammar has two virtues: first, it only requires a single state, which reduces the size of the parser; second, it is left-recursive, which means that an LALR parser (like Happy) can parse it in constant stack space.&lt;/p&gt;
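&lt;p&gt;To see why the second virtue matters, consider the obvious right-recursive alternative (a sketch for contrast, not code from GHC). It accepts the same strings, but an LALR parser must shift the entire list onto the stack before it can perform any reduction, so stack usage grows with the length of the input:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
decls : decl ';' decls   -- the recursion is on the right, so no
      | ';' decls        -- reduction can fire until the end of
      | decl             -- the list is reached
      | {- empty -}
&lt;/pre&gt;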
&lt;p&gt;This code worked quite well for a long time, but it finally fell over in complexity when we added annotations to GHC. Annotations are a feature which tracks the locations of all keywords/punctuation/whitespace in source code, so that we can reconstruct the source code byte-for-byte from the abstract syntax tree (normally, this formatting information is lost in abstract syntax). With annotations, we needed to save information about each semicolon; for reasons that I don't quite understand, we were expending considerable effort to associate each semicolon with the preceding declaration (leading semicolons were propagated up to the enclosing element.)&lt;/p&gt;
&lt;p&gt;This led to some very disgusting parser code:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
importdecls :: { ([AddAnn],[LImportDecl RdrName]) }
        : importdecls ';' importdecl
                                {% if null (snd $1)
                                     then return (mj AnnSemi $2:fst $1,$3 : snd $1)
                                     else do
                                      { addAnnotation (gl $ head $ snd $1)
                                                      AnnSemi (gl $2)
                                      ; return (fst $1,$3 : snd $1) } }
        | importdecls ';'       {% if null (snd $1)
                                     then return ((mj AnnSemi $2:fst $1),snd $1)
                                     else do
                                       { addAnnotation (gl $ head $ snd $1)
                                                       AnnSemi (gl $2)
                                       ; return $1} }
        | importdecl             { ([],[$1]) }
        | {- empty -}            { ([],[]) }
&lt;/pre&gt;
&lt;p&gt;Can you tell what this does?! It took me a while to understand what the code is doing: the null test is to check if there is a &lt;em&gt;preceding&lt;/em&gt; element we can attach the semicolon annotation to: if there is none, we propagate the semicolons up to the top level.&lt;/p&gt;
&lt;p&gt;The crux of the issue was that, once annotations were added, &lt;strong&gt;the grammar did not match the logical structure of the syntax tree.&lt;/strong&gt; That's bad. Let's make them match up. Here are a few constraints:&lt;/p&gt;
&lt;ol class="arabic"&gt;
&lt;li&gt;&lt;p class="first"&gt;The leading semicolons are associated with the &lt;em&gt;enclosing&lt;/em&gt; AST element. So we want to parse them once at the very beginning, and then not bother with them in the recursive rule. Call the rule to parse zero or more semicolons &lt;tt class="docutils literal"&gt;semis&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
semis : semis ';'
      | {- empty -}
&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;If there are duplicate semicolons, we want to parse them all at once, and then associate them with the preceding declarations. So we also need a rule to parse one or more semicolons, which we will call &lt;tt class="docutils literal"&gt;semis1&lt;/tt&gt;; then when we parse a single declaration, we want to parse it as &lt;tt class="docutils literal"&gt;decl semis1&lt;/tt&gt;:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
semis1 : semis1 ';'
       | ';'
&lt;/pre&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Then, we can build up our parser in the following way:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
-- Possibly empty decls with mandatory trailing semicolons
decls_semi : decls_semi decl semis1
           | {- empty -}

-- Non-empty decls with no trailing semicolons
decls : decls_semi decl

-- Possibly empty decls with optional trailing semicolons
top1 : decls_semi
     | decls

-- Possibly empty decls with optional leading/trailing semicolons
top : semis top1
&lt;/pre&gt;
&lt;p&gt;We've taken care not to introduce any shift-reduce conflicts. It was actually a bit non-obvious how to make this happen, because in Haskell source files, we need to parse a list of import declarations (&lt;tt class="docutils literal"&gt;importdecl&lt;/tt&gt;), followed by a list of top-level declarations (&lt;tt class="docutils literal"&gt;topdecl&lt;/tt&gt;). It's a bit difficult to define the grammar for these two lists without introducing a shift-reduce conflict, but this seems to work:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
top : importdecls_semi topdecls_semi
    | importdecls_semi topdecls
    | importdecls
&lt;/pre&gt;
&lt;p&gt;It looks so simple, but there are a lot of plausible looking alternatives which introduce shift/reduce conflicts. There's an important meta-lesson here, which is that when trying to work out how to do something like this, it is best to experiment on a smaller grammar, where re-checking is instantaneous (Happy takes quite a bit of time to process all of GHC, which made the edit-recompile cycle a bit miserable.)&lt;/p&gt;
&lt;p&gt;I'd love to know if there's an even simpler way to do this, or if I've made a mistake and changed the set of languages I accept. Let me know in the comments. I've attached below a simple Happy grammar that you can play around with (build with &lt;tt class="docutils literal"&gt;happy filename.y; ghc &lt;span class="pre"&gt;--make&lt;/span&gt; filename.hs&lt;/tt&gt;).&lt;/p&gt;
&lt;pre class="literal-block"&gt;
{
module Main where

import Data.Char
}

%name parse
%expect 0
%tokentype { Token }
%error { parseError }

%token
      import          { TokenImport }
      decl            { TokenDecl }
      ';'             { TokenSemi }

%%

top     : semis top1                        { $2 }
top1    : importdecls_semi topdecls_semi    { (reverse $1, reverse $2) }
        | importdecls_semi topdecls         { (reverse $1, reverse $2) }
        | importdecls                       { (reverse $1, []) }

id_semi : importdecl semis1                 { $1 }
importdecls
        : importdecls_semi importdecl       { $2:$1 }
importdecls_semi
        : importdecls_semi id_semi          { $2:$1 }
        | {- empty -}                       { [] }

topdecls
        : topdecls_semi topdecl             { $2:$1 }
topdecls_semi
        : topdecls_semi topdecl semis1      { $2:$1 }
        | {- empty -}                       { [] }

semis   : semis ';'                         { () }
        | {- empty -}                       { () }

semis1  : semis1 ';'                        { () }
        | ';'                               { () }

importdecl
        : import                            { &amp;quot;import&amp;quot; }
topdecl : decl                              { &amp;quot;decl&amp;quot; }

{
parseError :: [Token] -&amp;gt; a
parseError p = error (&amp;quot;Parse error: &amp;quot; ++ show p)

data Token
      = TokenImport
      | TokenDecl
      | TokenSemi
 deriving Show

lexer :: String -&amp;gt; [Token]
lexer [] = []
lexer (c:cs)
      | isSpace c = lexer cs
      | isAlpha c = lexVar (c:cs)
lexer (';':cs) = TokenSemi : lexer cs

lexVar cs =
   case span isAlpha cs of
      (&amp;quot;import&amp;quot;,rest) -&amp;gt; TokenImport : lexer rest
      (&amp;quot;decl&amp;quot;,rest) -&amp;gt; TokenDecl : lexer rest

main = print . parse . lexer $ &amp;quot;import;;import;;decl&amp;quot;
}
&lt;/pre&gt;
&lt;/div&gt;
</content>
			<link rel="replies" type="text/html" href="http://blog.ezyang.com/2016/12/left-recursive-parsing-of-haskell-imports-and-declarations/#comments" thr:count="3" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2016/12/left-recursive-parsing-of-haskell-imports-and-declarations/feed/atom/" thr:count="3" />
		<thr:total>3</thr:total>
		<feedburner:origLink>http://blog.ezyang.com/2016/12/left-recursive-parsing-of-haskell-imports-and-declarations/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[The problem of reusable and composable specifications]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/PDiB4XkVYzY/" />
		<id>http://blog.ezyang.com/?p=9793</id>
		<!-- ezyang: omitted update time -->
		<published>2016-12-17T10:54:31Z</published>
		<category scheme="http://blog.ezyang.com" term="Backpack" />		<summary type="html"><![CDATA[It's not too hard to convince people that version bounds are poor approximation for a particular API that we depend on. What do we mean when we say &#62;= 1.0 &#38;&#38; &#60; 1.1? A version bound is a proxy some set of modules and functions with some particular semantics that a library needs to be [&#8230;]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2016/12/the-problem-of-reusable-and-composable-specifications/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;It's not too hard to convince people that version bounds are a poor approximation for the particular API that we depend on. What do we mean when we say &lt;tt class="docutils literal"&gt;&amp;gt;= 1.0 &amp;amp;&amp;amp; &amp;lt; 1.1&lt;/tt&gt;? A version bound is a proxy for some set of modules and functions with some particular semantics that a library needs in order to be built. Version bounds are imprecise; what does a change from 1.0 to 1.1 mean? Clearly, we should instead write down the actual specification (either types or contracts) of what we need.&lt;/p&gt;
&lt;p&gt;This all sounds like a good idea until you actually try to put it into practice, at which point you realize that version numbers had one great virtue: they're very short. Specifications, on the other hand, can get quite large: even just writing down the types of all the functions you depend on can take pages, let alone executable contracts describing more complex behavior.  To make matters worse, the same function will be depended upon repeatedly; the specification must be provided in each case!&lt;/p&gt;
&lt;p&gt;So we put on our PL hats and say, &amp;quot;Aha! What we need is a mechanism for &lt;em&gt;reuse&lt;/em&gt; and &lt;em&gt;composition&lt;/em&gt; of specifications. Something like... a &lt;em&gt;language&lt;/em&gt; of specification!&amp;quot; But at this point, there is disagreement about how this language should work.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Specifications are code.&lt;/strong&gt;  If you talk to a Racketeer, they'll say, &amp;quot;Well, &lt;a class="reference external" href="https://docs.racket-lang.org/reference/contracts.html"&gt;contracts&lt;/a&gt; are just &lt;a class="reference external" href="https://docs.racket-lang.org/guide/Building_New_Contracts.html"&gt;code&lt;/a&gt;, and we know how to reuse and compose code!&amp;quot; You have primitive contracts to describe values, compose them together into contracts that describe functions, and then further compose these together to form contracts about modules.  You can collect these contracts into modules and share them across your code.&lt;/p&gt;
&lt;p&gt;There is one interesting bootstrapping problem: you're using your contracts to represent versions, but your contracts themselves live in a library, so should you version your contracts? Current thinking is that you shouldn't.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;But maybe you shouldn't compose them the usual way.&lt;/strong&gt; One of the things that stuck out to me when I was reading the frontmatter of Clojure's spec documentation is that &lt;a class="reference external" href="http://clojure.org/about/spec#_map_specs_should_be_of_keysets_only"&gt;map specs should be of keysets only&lt;/a&gt;, and &lt;a class="reference external" href="http://clojure.org/about/spec#_global_namespaced_names_are_more_important"&gt;how they deal with it&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The core principle of spec's design is that specifications for records should NOT take the form &lt;tt class="docutils literal"&gt;{ name: string, age: int }&lt;/tt&gt;. Instead, the specification is split into two pieces: a set of keys &lt;tt class="docutils literal"&gt;{ name, age }&lt;/tt&gt;, and a mapping from keys to specifications which, once registered, apply to all occurrences of a key in all map specifications. (Note that keys are all namespaced, so it is not some insane free-for-all in a global namespace.)  The justification for this:&lt;/p&gt;
&lt;blockquote&gt;
In Clojure we gain power by dynamically composing, merging and building up maps. We routinely deal with optional and partial data, data produced by unreliable external sources, dynamic queries etc. These maps represent various sets, subsets, intersections and unions of the same keys, and in general ought to have the same semantic for the same key wherever it is used. Defining specifications of every subset/union/intersection, and then redundantly stating the semantic of each key is both an antipattern and unworkable in the most dynamic cases.&lt;/blockquote&gt;
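&lt;p&gt;As a rough Haskell rendering of this split (all of the names below are hypothetical illustrations, not Clojure's actual API): the semantics of each namespaced key lives in a single global registry, while a map spec is nothing more than a keyset checked against that registry:&lt;/p&gt;

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- A sketch of spec's split design. All names here are
-- hypothetical illustrations, not Clojure's actual API.

type Key = String

data Value = VStr String | VInt Int deriving (Show, Eq)

-- The semantics of each (namespaced) key is registered exactly
-- once, globally; every map spec that mentions the key reuses it.
registry :: Map.Map Key (Value -> Bool)
registry = Map.fromList
  [ ("person/name", isStr)
  , ("person/age",  isInt) ]
  where
    isStr (VStr _) = True
    isStr _        = False
    isInt (VInt _) = True
    isInt _        = False

-- A map spec is merely a keyset: which keys must be present.
-- A map conforms when the required keys are present and every
-- key it carries satisfies that key's registered semantics.
conforms :: Set.Set Key -> Map.Map Key Value -> Bool
conforms required m = and
  ( Set.isSubsetOf required (Map.keysSet m)
  : map checkOne (Map.toList m) )
  where
    checkOne (k, v) = maybe True (\p -> p v) (Map.lookup k registry)

main :: IO ()
main = do
  let person = Map.fromList
        [ ("person/name", VStr "Rich"), ("person/age", VInt 30) ]
  print (conforms (Set.fromList ["person/name"]) person)   -- True
  print (conforms (Set.fromList ["person/email"]) person)  -- False
```

&lt;p&gt;Because the registry is shared, every subset, union, or intersection of keysets automatically agrees on the semantics of each key, which is exactly the property the quotation above is after.&lt;/p&gt;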
&lt;p&gt;&lt;strong&gt;Back to the land of types.&lt;/strong&gt; Contracts can do all this because they are code, and we know how to reuse code. But in (non-dependently) typed languages, the language of types tends to be far more impoverished than the language of values. To take Backpack as an (unusually expressive) example, the only operations we can perform on signatures are to define them (with full definitions for types) and to merge them together. So Backpack signatures run headlong into the redundancy problem identified by spec: because the signature of a module includes the signatures of its functions, you end up having to repeat these function signatures whenever you write slightly different iterations of a module.&lt;/p&gt;
&lt;p&gt;To adopt the Clojure model, you would have to write a separate signature per module (each in their own package), and then have users combine them together by adding a &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;build-depends&lt;/span&gt;&lt;/tt&gt; on every signature they wanted to use:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
-- In Queue-push package
signature Queue where
  data Queue a
  push :: a -&amp;gt; Queue a -&amp;gt; Queue a

-- In Queue-pop package
signature Queue where
  data Queue a
  pop :: Queue a -&amp;gt; Maybe (Queue a, a)

-- In Queue-length package
signature Queue where
  data Queue a
  length :: Queue a -&amp;gt; Int

-- Putting them together (note that Queue is defined
-- in each signature; mix-in linking merges these
-- abstract data types together)
build-depends: Queue-push, Queue-pop, Queue-length
&lt;/pre&gt;
&lt;p&gt;In our current implementation of Backpack, this is kind of insane: to write the specification for a module with a hundred methods, you'd need a hundred packages. The ability to concisely define multiple public libraries in a single package might help, but this involves design that doesn't exist yet. (Perhaps the cure is worse than the disease. The package manager-compiler stratification rears its ugly head again!) (Note to self: signature packages ought to be treated specially; they really shouldn't be built when you instantiate them.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conclusions.&lt;/strong&gt; A lot of my thinking here did not crystallize until I started reading about how dynamic languages like Clojure were grappling with the specification problem: I think this just goes to show how much we can learn by paying attention to other systems, even if their context is quite different. (If Clojure believed in data abstraction, I think they could learn a thing or two from how Backpack mix-in links abstract data declarations.)&lt;/p&gt;
&lt;p&gt;In Clojure, the inability to reuse specs is a deal breaker, which led them to spec's current design.  In Haskell, the inability to reuse type signatures flirts on the edge of unusability: types are &lt;em&gt;just&lt;/em&gt; short enough and copy-pasteable enough to be tolerable. Documentation for these types, less so; this is what led me down my search for better mechanisms for signature reuse.&lt;/p&gt;
&lt;p&gt;Although Backpack's current design is &amp;quot;good enough&amp;quot; to get things done, I still wonder if we can't do something better. One tempting option is to allow for downstream signatures to selectively pick out certain functions from a larger signature file to add to their requirements. But if you require &lt;tt class="docutils literal"&gt;Queue.push&lt;/tt&gt;, you had better also require &lt;tt class="docutils literal"&gt;Queue.Queue&lt;/tt&gt; (without which, the type of &lt;tt class="docutils literal"&gt;push&lt;/tt&gt; cannot even be stated: the avoidance problem); this could lead to a great deal of mystery as to what exactly is required in the end. Food for thought.&lt;/p&gt;
&lt;/div&gt;
</content>
			<link rel="replies" type="text/html" href="http://blog.ezyang.com/2016/12/the-problem-of-reusable-and-composable-specifications/#comments" thr:count="3" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2016/12/the-problem-of-reusable-and-composable-specifications/feed/atom/" thr:count="3" />
		<thr:total>3</thr:total>
		<feedburner:origLink>http://blog.ezyang.com/2016/12/the-problem-of-reusable-and-composable-specifications/</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Edward Z. Yang</name>
						<uri>http://ezyang.com</uri>
					</author>
		<title type="html"><![CDATA[Thoughts about Spec-ulation (Rich Hickey)]]></title>
		<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/ezyang/~3/MAbEgNhYPns/" />
		<id>http://blog.ezyang.com/?p=9781</id>
		<!-- ezyang: omitted update time -->
		<published>2016-12-17T00:26:00Z</published>
		<category scheme="http://blog.ezyang.com" term="Backpack" /><category scheme="http://blog.ezyang.com" term="Programming Languages" />		<summary type="html"><![CDATA[Rich Hickey recently gave a keynote at Clojure/conj 2016, meditating on the problems of versioning, specification and backwards compatibility in language ecosystems. In it, Rich considers the &#34;extremist&#34; view, what if we built a language ecosystem, where you never, ever broke backwards compatibility. A large portion of the talk is spent grappling with the ramifications [&#8230;]]]></summary>
		<content type="html" xml:base="http://blog.ezyang.com/2016/12/thoughts-about-spec-ulation-rich-hickey/">
&lt;div class="document"&gt;


&lt;!-- -*- mode: rst -*- --&gt;
&lt;p&gt;Rich Hickey recently gave a &lt;a class="reference external" href="https://www.youtube.com/watch?v=oyLBGkS5ICk"&gt;keynote&lt;/a&gt; at Clojure/conj 2016, meditating on the problems of versioning, specification and backwards compatibility in language ecosystems. In it, Rich considers the &lt;a class="reference external" href="http://blog.ezyang.com/2012/11/extremist-programming/"&gt;&amp;quot;extremist&amp;quot; view&lt;/a&gt;, &lt;em&gt;what if we built a language ecosystem, where you never, ever broke backwards compatibility.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;A large portion of the talk is spent grappling with the ramifications of this perspective. For example:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;Suppose you want to make a backwards-compatibility breaking change to a function. Don't &lt;em&gt;mutate&lt;/em&gt; the function, Rich says; give the function another name.&lt;/li&gt;
&lt;li&gt;OK, but how about if there is some systematic change you need to apply to many functions? That's still not an excuse: create a new namespace, and put all the functions there.&lt;/li&gt;
&lt;li&gt;What if there's a function you really don't like, and you really want to get rid of it? No, don't remove it, create a new namespace with that function absent.&lt;/li&gt;
&lt;li&gt;Does this sound like a lot of work to remove things? Yeah. So don't remove things!&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In general, Rich wants us to avoid breakage by turning all changes into &lt;em&gt;accretion&lt;/em&gt;, where the old and new can coexist. &amp;quot;We need to bring functional programming [immutability] to the library ecosystem,&amp;quot; he says, &amp;quot;dependency hell is just mutability hell.&amp;quot; And to do this, there need to be tools for you to make a commitment to what it is that a library provides and requires, and to avoid accidentally breaking this commitment when you release new versions of your software.&lt;/p&gt;
&lt;p&gt;He says a lot more in the talk, so I encourage you to give it a watch if you want to hear the whole picture.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;In general, I'm in favor of this line of thinking, because my feeling is that a large amount of the breakage associated with software change is just a product of negligence: breakage not for any good reason, breakage that could have been avoided if there was a little more help from tooling.&lt;/p&gt;
&lt;p&gt;That being said, I do have some thoughts about topics that are not so prominently featured in his talk.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Accretion is not a silver bullet... if you believe in data hiding.&lt;/strong&gt; In his talk, Rich implies that backwards compatibility can be maintained simply by committing not to &amp;quot;remove things&amp;quot;. As a Haskeller, this sounds obviously false to me: if I change the internal representation of some abstract type (or even the internal invariants), I &lt;em&gt;cannot&lt;/em&gt; just load up both old and new copies of the library and expect to pass values of this type between the two. Indeed, the typechecker won't let you do this even if the representation hasn't changed.&lt;/p&gt;
&lt;p&gt;But, at least for Clojure, I think Rich is right. The reason is this: &lt;a class="reference external" href="http://codequarterly.com/2011/rich-hickey/"&gt;Clojure doesn't believe in data hiding&lt;/a&gt;! The &lt;a class="reference external" href="http://clojure.org/reference/datatypes"&gt;prevailing style&lt;/a&gt; of Clojure code is that data types consist of immutable records with public fields that are passed around. And so a change to the representation of the data is potentially a breaking change; non-breaking representation changes are simply not done. (I suspect a similar ethos explains why &lt;a class="reference external" href="http://stackoverflow.com/questions/25268545/why-does-npms-policy-of-duplicated-dependencies-work"&gt;duplicated dependencies in node.js&lt;/a&gt; work as well as they do.)&lt;/p&gt;
&lt;p&gt;I am not sure how I feel about this. I am personally a big believer in data abstraction, but I often admire the pragmatics of &amp;quot;everything is a map&amp;quot;. (I &lt;a class="reference external" href="https://twitter.com/ezyang/status/809704816150597633"&gt;tweeted&lt;/a&gt; about this earlier today, which provoked some thoughtful discussion.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Harmful APIs.&lt;/strong&gt; At several points in the talk, Rich makes fun of developers who are obsessed with taking away features from their users. (&amp;quot;I hate this function. I hate it, I hate it, I hate that people call it, I just want it out of my life.&amp;quot;) This downplays the very real, very important reasons why infinite backwards compatibility has been harmful to the software we write today.&lt;/p&gt;
&lt;p&gt;One need look no further than the &lt;a class="reference external" href="https://youtu.be/oyLBGkS5ICk?t=1h8m18s"&gt;systems with decades of backwards compatibility&lt;/a&gt; that Rich cites: the Unix APIs, Java and HTML. In all these cases, backwards compatibility has led to harmful APIs sticking around far longer than they should: &lt;a class="reference external" href="https://randomascii.wordpress.com/2013/04/03/stop-using-strncpy-already/"&gt;strncpy&lt;/a&gt;, &lt;a class="reference external" href="http://stackoverflow.com/questions/1694036/why-is-the-gets-function-so-dangerous-that-it-should-not-be-used"&gt;gets&lt;/a&gt;, legacy parsers of HTML (XSS), &lt;a class="reference external" href="http://www.odi.ch/prog/design/newbies.php"&gt;Java antipatterns&lt;/a&gt;, etc. And there are examples galore in Android, C libraries, everywhere.&lt;/p&gt;
&lt;p&gt;In my opinion, library authors should design APIs in such a way that it is easy to do the right thing, and hard to do the wrong thing. And yes, sometimes that means you need to stop people from using insecure or easy-to-get-wrong library calls.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Semantic versioning doesn't cause cascading version bumps; lack of version ranges is the cause.&lt;/strong&gt; In the slide &lt;a class="reference external" href="https://youtu.be/oyLBGkS5ICk?t=13m49s"&gt;&amp;quot;Do deps force Versioning?&amp;quot;&lt;/a&gt;, Rich describes a problem in the Clojure ecosystem: when following semantic versioning, a new release of a package often causes cascading version bumps in the system.&lt;/p&gt;
&lt;p&gt;While the problem of cascading version bumps is &lt;a class="reference external" href="https://github.com/mojombo/semver/issues/148"&gt;a real question&lt;/a&gt; that applies to semantic versioning in general, the &amp;quot;cascading version bumps&amp;quot; Rich is referring to in the Clojure ecosystem stem from a much more mundane source: best practice is to &lt;a class="reference external" href="https://nelsonmorris.net/2012/07/31/do-not-use-version-ranges-in-project-clj.html"&gt;specify a specific version of a dependency&lt;/a&gt; in your package metadata. When a new version of that dependency comes out, you need to bump your package's version so that you can update the recorded version of the dependency... and so forth.&lt;/p&gt;
&lt;p&gt;I'm not saying that Clojure is &lt;em&gt;wrong&lt;/em&gt; for doing things this way (version ranges have their own challenges), but in his talk Rich implies that this is a failure of semantic versioning... which it's not. If you use version ranges and aren't in the habit of reexporting APIs from your dependencies, updating the version range of a dependency is not a breaking change. If you have a solver that picks a single copy of a library for the entire application, you can even expose types from your dependency in your API.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;p&gt;Overall, I am glad that Clojure is thinking about how to put backwards compatibility first and foremost: often, it is in the most extreme applications of a principle that we learn the most. Is it the end of the story? No; but I hope that all languages continue slowly moving towards explicit specifications and tooling to help you live up to your promises.&lt;/p&gt;
&lt;/div&gt;
</content>
			<link rel="replies" type="text/html" href="http://blog.ezyang.com/2016/12/thoughts-about-spec-ulation-rich-hickey/#comments" thr:count="8" />
		<link rel="replies" type="application/atom+xml" href="http://blog.ezyang.com/2016/12/thoughts-about-spec-ulation-rich-hickey/feed/atom/" thr:count="8" />
		<thr:total>8</thr:total>
		<feedburner:origLink>http://blog.ezyang.com/2016/12/thoughts-about-spec-ulation-rich-hickey/</feedburner:origLink></entry>
	</feed>
