Imperative Functional Programs that Explain their Work
Program slicing provides explanations that illustrate how program outputs were produced from inputs. We build on an approach introduced in prior work by Perera et al., where dynamic slicing was defined for pure higher-order functional programs as a Galois connection between lattices of partial inputs and partial outputs. We extend this approach to imperative functional programs that combine higher-order programming with references and exceptions. We present proofs of correctness and optimality of our approach, a proof-of-concept implementation, and an experimental evaluation.

Dynamic slicing answers the following question: if I only care about this specific part of the trace of my program execution, which parts of the source program do I need to look at? For example, if the output of the program is a pair, can you show me the parts of the source that impacted the computation of the first component? If a part of the code is not involved in the trace, or not in the part of the trace that you care about, it is removed from the partial code returned by slicing.

What I like about this work is that there is a very nice algebraic characterization of what slicing is (the Galois connection), which guides you in how you implement your slicing algorithm, and also serves as a specification to convince yourself that it is correct -- and "optimal": it actually removes all the program parts that are irrelevant. This characterization already existed in previous work (Functional Programs that Explain Their Work, Roly Perera, Umut Acar, James Cheney, Paul Blain Levy, 2012), but it was done in a purely functional setting. It wasn't clear (to me) whether the nice formulation was restricted to this nice language, or whether the technique itself would scale to a less structured language. This paper extends it to effectful ML (mutable references and exceptions), and there it is much easier to see that it remains elegant and yet can scale to typical effectful programming languages.

The key to the algebraic characterization is to recognize two order structures, one on source program fragments, and the other on traces. Program fragments are programs with holes, and a fragment is smaller than another if it has more holes. You can think of a hole as "I don't know -- or I don't care -- what the program does in this part", so the order is "being more or less defined". Traces are likewise partial traces with holes, where a hole means "I don't know -- or I don't care -- what happens in this part of the trace". The double "don't know" and "don't care" nature of the ordering is essential: the Galois connection specifies a slicer (which goes from the part of a trace you care about to the parts of the program you should care about) by relating it to an evaluator (which goes from the part of the program you know about to the parts of the trace you can know about). This specification is simple because we are all familiar with what evaluators are.
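To make the "programs with holes" ordering concrete, here is a minimal sketch in OCaml -- an illustration of the idea only, not the paper's formalism or implementation; the expression type and the hand-written slice below are made up for the example:

```ocaml
(* Program fragments with holes, and the "is less defined than" order on
   them.  A hole reads as "I don't know -- or don't care -- what this
   part is"; a fragment is below another when it is obtained from it by
   punching more holes. *)
type expr =
  | Hole                                   (* don't know / don't care *)
  | Var of string
  | App of expr * expr
  | Pair of expr * expr

(* leq p q: p can be obtained from q by replacing subterms with holes. *)
let rec leq (p : expr) (q : expr) : bool =
  match p, q with
  | Hole, _ -> true
  | Var x, Var y -> x = y
  | App (f, a), App (g, b) -> leq f g && leq a b
  | Pair (a, b), Pair (c, d) -> leq a c && leq b d
  | _, _ -> false

(* If we only care about the first component of the output of (f a, g b),
   a backward slice can blank out everything that only fed the second. *)
let program = Pair (App (Var "f", Var "a"), App (Var "g", Var "b"))
let slice   = Pair (App (Var "f", Var "a"), Hole)
let () = assert (leq slice program)
```

The Galois connection of the paper relates this order on partial programs to the corresponding order on partial traces: in the forward direction, knowing more of the program lets you know more of the trace; in the backward direction, caring about more of the trace requires caring about more of the program.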
Databases from finite categories
Spivak and Kent (2011). Ologs: A categorical framework for knowledge representation.
Ologs are essentially RDF extended to encompass commuting diagrams, so a visual little language. The paper talks about how database schemas can be extracted automatically from ologs.
By Charles Stewart at 2017-05-24 06:43 | Category Theory
Type Systems as Macros
Type Systems as Macros, by Stephen Chang, Alex Knauth, Ben Greenman.
This looks pretty awesome considering it's not limited to simply-typed languages, but extends all the way to System F and F-omega! Even better, they can reuse previous type systems to define new ones, thereby reducing the effort needed to implement more expressive type systems. All code and further details are available here, and here's a blog post where Ben Greenman further discusses the related "type tailoring"; of course, both are directly related to Active Libraries.

Taken to its extreme: why not have an assembler with a powerful macro system of this sort as your host language, with every high-level language built on top of it? I'm not sure this approach would extend that far, but it's an interesting idea. You'd need a highly portable, cpp-like macro tool, and porting to a new platform would consist of writing architecture-specific macros for some core language, like System F. This work may also conceptually dovetail with another thread discussing fexprs and compilation.
By naasking at 2017-04-19 23:38 | DSL | Functional | Lambda Calculus | Meta-Programming | Type Theory
Idris 1.0 Released
Idris version 1.0 corresponds to the language as described in Type-Driven Development with Idris, published last week by Manning.

kdb+ 3.5 released last month
kdb+ is a real-time time series database, known in the financial services universe as the fastest tick database on the market. It was first conceived by Arthur Whitney at Morgan Stanley as a prototype, and over the last 35+ years has grown to add many features. The database makes such aggressive use of the POSIX mmap() function for mapping file chunks into main memory that it has exposed issues with the implementation of mmap itself. Recently, the company now behind kdb+ has also built Kx for DAAS (Data-as-a-Service), which is basically a cloud-based, massively clustered version of kdb+ that deals with the curious oddity that kdb+ is effectively single-threaded. For those interested in reading more about kdb+'s unique cloud architecture (as compared to "big data" solutions like Hadoop), you can read the whitepapers Kx publishes as suggestive guidelines for how the q community thinks about truly "big data", several orders of magnitude faster and larger than most Hadoop data sets.
While I don't suggest these papers are a blueprint for copying or mimicking the DAAS product, they do help the LtU reader imagine a "different world" of data processing than the often-cited MapReduce paper and other more mainstream approaches. What is particularly striking is how tiny q.exe (the program that runs kdb+ and provides a CLI for q scripting) is. Language researchers are looking at provably correct C compilers, and it is not a huge leap to imagine the world soon seeing provably correct real-time time series databases, with kdb+ as an inspiration.

Another curiosity, relevant to us here at LtU, is that kdb+ has its own programming language, q. q is a variant of APL with a special library for statistics. Most "big data" solutions don't have native implementations of weighted average, which is a fairly important and frequently used function in quantitative finance, useful for computing volume weighted average price (VWAP) as well as tilt and weighted spread. q is itself implemented in another language, k. The whole of each language is just a couple of lines of (terse) code.
By Z-Bo at 2017-03-25 15:45 | DSL | Parallel/Distributed | Scientific Programming
Prolog vs mini-Kanren
There's an interesting Q&A on Stack Overflow.
By Charles Stewart at 2017-03-10 18:37
The complexity of abstract machines
I previously wrote about a line of research by Guy Blelloch on cost semantics for functional languages, which lets us make precise claims about the complexity of functional programs without leaving their usual and comfortable programming model (beta-reduction).

While the complexity behavior of weak reduction strategies, such as call-by-value and call-by-name, is by now relatively well understood, the lambda-calculus has a much richer range of reduction strategies, in particular those that can reduce under lambda-abstractions, whose complexity behavior is noticeably more subtle and was, until recently, not very well understood. (This has become a practical concern since the rise of proof assistants, which must implement reduction under binders and care deeply about the complexity of their reduction strategy, as it consumes a lot of time during type- and proof-checking.) Beniamino Accatoli, who has co-authored a lot of the work in this area, recently published on arXiv a new paper of survey quality that is a good introduction to this line of work, with pointers to the rest of the literature.
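To give a flavor of what a cost semantics measures, here is a minimal sketch in OCaml -- not taken from the paper, and far simpler than the abstract machines it studies: a substitution-based call-by-value evaluator for the untyped lambda-calculus that counts beta-steps, the kind of abstract cost that machine implementations are then asked to realize with reasonable overhead.

```ocaml
(* Untyped lambda-calculus terms. *)
type term =
  | Var of string
  | Lam of string * term
  | App of term * term

(* Naive substitution; we assume bound variables are distinct, so no
   capture-avoiding renaming is needed in this sketch. *)
let rec subst x v = function
  | Var y -> if x = y then v else Var y
  | Lam (y, b) -> if x = y then Lam (y, b) else Lam (y, subst x v b)
  | App (f, a) -> App (subst x v f, subst x v a)

(* Call-by-value evaluation, returning the value together with the
   number of beta-reductions performed: the "cost" of the run. *)
let rec eval (t : term) : term * int =
  match t with
  | Var _ | Lam _ -> (t, 0)
  | App (f, a) ->
      let vf, n1 = eval f in
      let va, n2 = eval a in
      (match vf with
       | Lam (x, body) ->
           let v, n3 = eval (subst x va body) in
           (v, n1 + n2 + n3 + 1)          (* one more beta-step *)
       | _ -> failwith "stuck term")

(* (fun x -> x) (fun y -> y) costs exactly one beta-step. *)
let () =
  let id = Lam ("x", Var "x") in
  let _, steps = eval (App (id, Lam ("y", Var "y"))) in
  Printf.printf "beta-steps: %d\n" steps
```

The subtle question surveyed in the paper is whether such counts remain a reasonable cost measure once reduction goes under binders, and at what overhead an abstract machine can implement them.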
By gasche at 2017-01-12 01:09 | Lambda Calculus
Stroustrup's Rule and Layering Over Time
Dave Herman is the voice of the oppressed: syntax is important, contrary to what you have been told! To illustrate, he discusses what he calls Stroustrup's Rule.
Do Be Do Be Do
Monads and algebraic effects are general concepts that give a definition of what a "side-effect" can be: an instance of a monad, or an instance of an algebraic effect, is a specific realization of a side-effect. While most programming languages provide a fixed family of built-in side-effects, monads or algebraic effects give a structured way to introduce a new notion of effect as a library.

A recent avenue of programming language research is how to locally define several realizations of the same effect interface/signature. There may be several valid notions of "state" or "non-determinism" or "probabilistic choice", and different parts of a program may not require the same realization of those -- a typical use-case would be mocking an interaction with the outside world, for example. Can we let users locally define a new interpretation of an effect, or write code that is generic over the specific interpretation? There are several existing approaches, such as monad transformer stacks, free monad interpreters, monad reification and, lately, effect handlers, as proposed in the programming language Eff.

Frank, presented in the publication below, is a new language with user-defined effects that makes effect handling a natural part of basic functional programming, instead of a separate, advanced feature. It is a significant advance in language design, simplifying effectful programming. Functions, called operators, do not just start computing a result from the value of their arguments; they interact with the computation of those arguments by having the opportunity to handle any side-effects arising during their evaluation. It feels like a new programming style, a new calling convention that blends call-by-value and effect handling -- Sam Lindley suggested the name call-by-handling.

Frank also proposes a new type-and-effect system that corresponds to this new programming style. Operators handle some of the effects raised by their arguments, and silently forward the rest to the computation context; their argument types indicate which effects they handle. In other words, the static information carried by an argument's type is the delta/increment between the effects permitted by the ambient computation and the effects of evaluating this argument. Frank calls this an adjustment over the ambient ability. This delta/increment style results in lightweight types for operators that can be used in different contexts (a form of effect polymorphism) without explicit quantification over effect variables. This design takes part in a research conversation on how to make type-and-effect systems usable, which is the major roadblock for their wider adoption -- Koka and Links are also promising in that regard, and had to explore elaborate conventions to elide their polymorphic variables.

Another important part of Frank's type-system design is the explicit separation between values that are and computations that do. Theoretical works have long made this distinction (for example with Call-By-Push-Value), but programmers are offered the dichotomy of either having only effectful expressions or expressing all computations as values (Haskell's indirect style). Frank puts that distinction in the hands of the user -- this is different from distinguishing pure from impure computations, as done in F* or WhyML.

If you wish to play with the language, a prototype implementation is available.
Do Be Do Be Do, by Sam Lindley, Conor McBride, and Craig McLaughlin (arXiv).
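Frank has its own syntax, but the flavor of locally handled effects can be sketched with OCaml 5's native effect handlers (this assumes OCaml >= 5.0 and is not Frank code): the same direct-style client performs an Ask effect and is run under two different interpretations of it, the mocking use-case mentioned above.

```ocaml
open Effect
open Effect.Deep

(* A user-defined effect: ask the ambient context for a string. *)
type _ Effect.t += Ask : string Effect.t

(* Client code is written in direct style and simply performs the effect. *)
let greet () = "hello, " ^ perform Ask

(* A handler giving one local interpretation of Ask: answer a constant. *)
let with_user (user : string) (f : unit -> 'a) : 'a =
  try_with f ()
    { effc = fun (type b) (eff : b Effect.t) ->
        match eff with
        | Ask -> Some (fun (k : (b, _) continuation) -> continue k user)
        | _ -> None }

let () =
  (* The same greet runs under two different interpretations of Ask. *)
  print_endline (with_user "world" greet);
  print_endline (with_user "mock-user" greet)
```

In Frank, by contrast, the handler is not a separate construct: an operator's argument types say which of the argument's effects it handles, and the rest are silently forwarded to the ambient ability.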
Contextual Isomorphisms
Contextual Isomorphisms, by Paul Blain Levy.
This paper is based on a very simple idea that everyone familiar with type systems can enjoy. It then applies this idea in a technical setting where it makes a useful contribution. I suspect that many readers will find that second part too technical, but they may still enjoy the idea itself. To facilitate this, I will rephrase the paper's abstract in a way that I hope makes it accessible to a larger audience.

The problem that the paper solves is: how do we know what it means for two types to be equivalent? For many languages there are reasonable definitions of equivalence (such as: there exists a bijection between these two types in the language), but for more advanced languages these definitions break down. For example, in the presence of hidden mutable state, one can build a pair of functions from the one-element type to some other type, and back, that compose to the identity in both directions, even though the two types should clearly not be considered equivalent.

To define what it means for two program fragments to be equivalent, we have a "gold standard", which is contextual equivalence: two program fragments are contextually equivalent if we can substitute one for the other in any complete program without changing its behavior. This is simple to state, it is usually clear how to instantiate this definition for a new system, and it gives you a satisfying notion of equivalence. It may not be the most convenient one to work with, so people define other, more specific notions of equivalence (typically beta-eta-equivalence or logical relations); it is fine if those are more sophisticated, and their definitions harder to justify or understand, because they can always be compared to this simple definition to gain confidence.

The simple idea in the paper above is to use this exact same trick to define what it means for two types to be equivalent. Naively, one could say that two types are equivalent if, in any well-typed program, one can replace some occurrences of the first type by occurrences of the second type, all other things being unchanged. This does not quite work, as changing the types that appear in a program without changing its terms would create ill-typed terms. So instead, the paper proposes that two types are equivalent when we are told how to transform any program using the first type into a program using the second type, in a way that is bijective (invertible) and compositional -- see the paper for details. The author then validates this definition by showing that, when it is instantiated for languages (simple or complex) where existing notions of equivalence have been proposed, this new notion of equivalence corresponds to the previous notions.

(Readers may find that even the warm-up part of the paper, namely sections 1 to 4, pages 1 to 6, is rather dense, with a compactly exposed general idea and arguably a lack of concrete examples that would help understanding. Surely this terseness is in large part a consequence of strict page limits -- conference articles are the tweets of computer science research. A nice side-effect (no pun intended) is that you can observe a carefully chosen formal language at work, designed to expose the setting and perform relevant proofs in minimal space: category theory, and in particular the concept of naturality, is the killer space-saving measure here.)
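As a concrete instance of the naive, bijection-based notion discussed above (deliberately not the paper's contextual definition), here is a small OCaml sketch: bool and unit option are isomorphic in the simple sense that there is a pair of mutually inverse functions between them. The paper's point is precisely that for richer languages this simple criterion stops being trustworthy, and a contextual definition is what replaces it.

```ocaml
(* A type isomorphism exhibited the naive way: two maps that compose to
   the identity in both directions. *)
let to_opt (b : bool) : unit option =
  if b then Some () else None

let of_opt (o : unit option) : bool =
  match o with Some () -> true | None -> false

(* Round-trip checks witnessing that the two maps are mutually inverse. *)
let () =
  assert (List.for_all (fun b -> of_opt (to_opt b) = b) [true; false]);
  assert (List.for_all (fun o -> to_opt (of_opt o) = o) [Some (); None])
```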