The blog post is disingenuous. We tried many times to contribute upstream fixes to Terraform providers, but HashiCorp would never accept them. So we've had to maintain forks. They lost their OSS DNA a long time ago, and this move just puts the final nail in the coffin.
Thankfully over time, they already pushed responsibility for most Terraform providers back onto their partners, so I'm hopeful the ecosystem of providers can still stay vibrant and open.
We are deep believers in open source---heck, my last project at Microsoft was to take .NET open source and cross-platform, our CTO helped found TypeScript, and Pulumi is an Apache open source project---but it seems HashiCorp no longer is.
If they think we'll go crawling back to their 100x more expensive 6-7 figure Terraform Enterprise garbage just because we can't use Spacelift anymore, then I'll show them the team of engineers we can hire for the same dollars to move the whole stack to pure Pulumi or Crossplane or the various CDKs.
The bald-faced disingenuous nature of this change is wild. They can't compete at their pricing because their pricing is absolutely insane, far beyond what the market can bear, and they refuse to accept it.
They are going out of their way to make it less expensive to stop using Terraform altogether, right as so many new options have entered the market.
I believe it's a bit too early to make this call, but based on the experience of interacting with Terraform the binary, it would be absolutely amazing for the community if Terraform could be turned into a library that could become a building block for higher-level services.
I don't know the details of BSL, but can HashiCorp now require compensation/$$$ from Spacelift, Scalr, Env0, etc? In that case, these products can be forced to offer similar pricing as Terraform Cloud.
IANAL but I believe the BSL restrictions only apply to new upstream code versions. All HashiCorp repos can and always will be usable under MPL as they were up until the moment immediately before the license change.
In other words: if you hard fork now, you don't need to pay.
>We tried many times to contribute upstream fixes to Terraform providers, but HashiCorp would never accept them. So we've had to maintain forks. They lost their OSS DNA a long time ago, and this move just puts the final nail in the coffin.
OSS doesn't mean that you have to accept any PRs that showed up in your repo, nor does it mean that you have to let a competitor steer your project simply because you're building in the open. Without further elaboration, what you're calling "upstream fixes" may have been considered "working as intended" at HashiCorp. As I'm sure you're well aware, every contribution has to be maintained and each increasing contribution comes with an additional burden. Responsible maintainers on large scale OSS projects must be selective about the code they let in.
You have to acknowledge that all these OSS projects officially backed by a corporation don't want you to contribute certain features that are part of their enterprise offering. As soon as there's an "enterprise" tier, contributions are not only based on their merit, but also evaluated as a threat to their business model.
Sometimes it's not even obvious for external contributors, but there may be some small overlap with other paid features that are part of their product roadmap.
If a project on GitHub only has maintainers from the corporate side, you can be certain that they will ultimately drive the product solely in their own interest.
We should always pay close attention to the governance model of projects we depend on or that we wish to contribute to.
Of course, but the major difference is that if I don't like the maintainers, I can create a fork, build a community around it and happily run it in production. Imagine where nodejs would be right now if they had been BSL licensed in the iojs times.
Sure, OSS doesn't mean you have to take all PRs, but if your claim is that others are just taking your code and not giving anything back, one of the alleged leeches showing up to talk about how they've tried to give back is very much pertinent.
Having been on the upstream side of things, it's very he-said-she-said. HashiCorp code could be a PITA to contribute to, or maybe this particular provider was bad at contributing.
The only way to know for sure is to dive into the merged and unmerged PRs and see how they were handled.
I'm not affiliated in any way with one of their competitors. Co-workers and I sent bug fix PR's to for example Vault. The last couple of years almost none of them were merged. These were small bug fixes, not (large) feature additions.
I’m sorry, but no. These are usually simple bugs like “forgot to set a field during refresh”. They almost always correspond to one or more Terraform issues too, often ones that have been open for 4-5 years or have been “marked as stale” by some infuriating bot.
Then don't complain about people not contributing to your projects. You reserve the right to reject my PR, and I reserve the right not to contribute any more.
I don't know if it is the case for the fixes Pulumi sent, but for PRs I've made to Terraform providers it can take a very long time for them to be looked at, and even longer to get merged. And I think it is mostly from not having enough resources to approve and merge PRs. Although that could possibly be fixed by inviting developers outside the company to help with approval and merging, especially for providers.
That's a lot of assumptions you're making here. From my little use of Terraform, it did have a bunch of issues that were plainly bugs and lay unfixed for a long time.
For example, the widely used 'count' anti-pattern is still present, and no action has been taken up to this day. This topic has persisted for 5 years. 5 YEARS!!! That's what triggered my decision to migrate to Pulumi.
isn't the simpler explanation that they would in effect lose the ability to relicense the project and therefore lose control of their baby?
To not lose control you need to have people assign copyright, which is generally a headache. I've only heard of the FSF doing that (not sure why this hasn't been streamlined electronically somehow).
Can I ask where Pulumi gets revenue from? (Honest question; this is the first time I've heard of you. A quick look suggests something like a CentOS for HashiCorp?)
I love the ethos of open source and have spoken at and helped run conferences, and had the pleasure of being paid to develop it - but the productivity I had when paid ten hours a day to work on OSS compared to whenever I get a chance between work family and everything else, well, it's better for everyone to get paid and release code, than not get paid and not write the code.
I see these semi-commercial licenses as the equivalent of a legal "just don't take the piss".
Would be interested in your side of the question. How do we keep on developing the code as well as keeping it open?
I am a paying Pulumi user. Their tool integrates with a cloud platform and we pay per resource managed by Pulumi.
Pulumi is one of several products where I like that it’s open source in case I need to move off their cloud, but hope that I don’t have to (Plausible is another).
Said well (and thank you for being a customer and valuable member of our community!)
The analogy I draw sometimes is that our open source infrastructure as code SDK ("Pulumi") is like Git, and our commercial offering ("Pulumi Cloud") is like GitHub.
Like GitHub, the Pulumi Cloud offers valuable features that go beyond the open source project for teams looking to manage lots of projects securely at scale, but we definitely love our open source community and want folks to have the choice to use Pulumi however makes the most sense for them.
This approach also has the nice consequence that we can be fully transparent with our community at all times while also building a strong, long-term business. If a new feature is part of the infrastructure as code SDK, it's open source and free; if it's part of the Pulumi Cloud SaaS, it's part of our commercial offering. This avoids needing to do things like artificially hold back features (like open core) or violating our commitment to the open source community (like Hashi's new license).
So here's my perspective on these two competing models:
1. I can read all of the code, modify it, and self-host it for my own purposes, but the license disallows me from re-selling it.
2. I can read, modify, self-host, and commercialize a subset of the code, and the rest is an opaque SaaS.
To me, as a customer with no interest in re-selling this code, I don't see how #2 is better than #1 in any way. And I find it incredibly mystifying that I keep seeing companies ragged on here for doing #1 while the model in #2 is somehow held up as the paragon of ethical virtue or something.
Can you help me understand? Why is it better for me to be able to read less of the code I'm running?
Convenience and reliability from a business perspective
For #2, taking the GitHub model in good faith: sure, there's Git and GitHub, but there's also sourcehut, Google Cloud Source Repositories, or any other managed git service. Electing for software-as-a-service vs. self-hosting comes down to a choice:
1. Spend the compute, resources, maintenance, and time to run the product (or a fork of it) myself, and get any missing functionality from other open source projects or a provider.
2. Pay GitHub's licensing cost, use the service, and be OK with magic abstractions operating the software for me. (Which, admittedly, has been rough lately.)
Also, frame it this way: if the Git project had elected from the beginning to build all the same parity features, would it be the same tool, the product that exists, or have the mind share it has today? Maybe not.
I may be misunderstanding you here, but in these cases the opaqueness is part of your trade-off: you offload something fairly complex for a marginal cost, and that's a decision you make.
What I like best is to use a fully managed service that I can contribute changes or even self host a modified version if that's what I need to do to get what I need. But I don't want to self-host. But I highly value the option. And I highly value the ability to go read the code when I wonder "hmmm why is it doing that?" and maybe contribute a bug fix if it shouldn't be.
All of this works great with model #1 - with full "source available" with a license that limits re-sale - but is limited with model #2, where it only works for the "open core" portions of the product, but not the proprietary SaaS portion.
It’s also fair to say this isn't black and white; open source is vast, and there are certain software projects and companies where opting for #2 definitely feels like a big rug pull, money grab, and smack in the face to the community that supported them. (Red Hat, in my opinion.)
If you want to build something with a bunch of smart people over a long period of time, the outcome is raising venture capital, paying people competitive salaries, answering to shareholders, and not imploding. The BSL is a consequence of that, but in this case it is a rule to guard against the few bad apples.
What's ethical or virtuous or perfect is very nuanced.
Usually the two are not mutually exclusive. Terraform Cloud (the HC equivalent of the aforementioned "opaque SaaS") is afaik not open source and never has been, you can't read the code for it or self-host it. Not opining on the broader issue, just clarifying this point.
Is the Pulumi Individual Edition open for use by solo founders operating as a sole proprietorship? I can't find anything clarifying whether it's individual (as in hobbyist, nonprofessional) or individual, as in, one person not collaborating with anyone else.
Yes, it is open to individual commercial use (companies much larger than sole proprietorships may have only one person doing infrastructure). They also have a version for nonprofits: https://www.pulumi.com/pricing/open-source-free-tier/
Plausible is amazing, I love it. I moved off of another platform that started as open source and then went closed source, but Plausible ended up being a better platform over all anyways.
I'm not sure if open sourcing .NET is the best bit to put on your resume when Microsoft has been sabotaging the developer ecosystem to keep VS relevant. [1]
Not that I don't appreciate the effort. I'm sure what has been achieved involved a fair share of convincing too.
I really don’t think that was ever in doubt. You only need to use it for a very short time to find that the ergonomics are infinitely nicer than Terraform.
My biggest issue with Terraform is the concept of the state file. It seems that Pulumi continues with this model. I wish someone came up with an innovative way of not needing a state file.
> this is difficult for me to think about in the same way as trying to picture a 4th dimension.
Why? There are many possibilities, just not that efficient. One of them would involve tagging. The problem with that approach is that not all resources are taggable, and it would take longer to query them.
tagging provider resources? not speaking for every provider or api, but at least on AWS not everything that can be managed supports tagging.
also this requires N api calls each time you check the state for an update. scaling that to a larger team may be difficult due to rate limits.
infra as code tools have to operate within the parameters that the provider apis allow. so when i say it’s hard to imagine, i’m thinking about limitations of the underlying apis.
totally possible i'm missing something obvious, but there's a good reason that centralized state exists today. genuinely curious about how we could ditch it in a reasonable way with the existing big cloud APIs.
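To make the earlier tagging idea concrete, here is a minimal sketch of what "state-file-less" discovery by tag could look like. Every name here is invented (`FakeCloud`, the tag key, the resource shapes); it exists only to illustrate the two limitations raised above: one list call per resource type on every refresh, and untaggable resources being invisible to the tool.

```python
# Hypothetical sketch: discover "managed" resources by tag instead of
# reading a state file. All names and shapes here are invented.

MANAGED_KEY, MANAGED_VALUE = "managed-by", "my-iac-tool"

class FakeCloud:
    """Stand-in for a cloud API; a real one is paginated and rate-limited."""
    def __init__(self, inventory):
        self.inventory = inventory

    def list(self, rtype):
        return self.inventory.get(rtype, [])

def discover_managed(client, resource_types):
    found = {}
    for rtype in resource_types:              # N list calls per refresh
        for resource in client.list(rtype):
            tags = resource.get("tags") or {} # some types have no tags at all
            if tags.get(MANAGED_KEY) == MANAGED_VALUE:
                found[resource["id"]] = resource
    return found

cloud = FakeCloud({
    "vpc": [{"id": "vpc-1", "tags": {"managed-by": "my-iac-tool"}}],
    "eip": [{"id": "eip-1"}],  # stand-in for a resource type with no tag support
})
print(discover_managed(cloud, ["vpc", "eip"]))
# vpc-1 is found; eip-1 is invisible to the tool -- the core limitation
```

A state file sidesteps both problems by recording exactly which IDs the tool owns, at the cost of the file itself becoming something you must store and lock.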
From my preliminary search, Bicep does utilize a state file, but it is completely hidden from the end users. It seems to be managed in Azure directly and automagically.
I think you might be misunderstanding this page. Bicep's "state file" is actual Azure state. Bicep is a nicer way to write ARM (Azure Resource Manager) templates, which are pretty much the Azure API. If you are familiar with Kubernetes, ARM represents Azure internal state in the same way Kubernetes YAML represents actual Kubernetes objects stored in the cluster.
> No state or state files to manage: All state is stored in Azure. Users can collaborate and have confidence their updates are handled as expected.
Not sure how I could "misunderstand" this quote to mean anything else.
By that logic, CloudFormation is also the same; end users don't have access to the state so therefore the state file is the state of the environment itself? We, as users, have no idea what intermediate step exists between the configuration yaml and actual representation of an account's resources.
It sounds like a Terraform alternative, but looking at the website it doesn't really convey if it's a Terraform fork or ground-up re-write, or something else?
Not sure why you're so insistent on this when Pulumi folks themselves on this post have said that they contribute back up to the AWS provider source repo (1).
Maybe not all their providers utilize Terraform, but it was definitely how their product started out. When you run Pulumi using their AWS provider and get errors at runtime, some of those are verbatim Terraform errors.
Their docs and site are careful to muddle this connection, but if you've ever used their API, you'd know this.
It's definitely possible. I patched the AWS Terraform provider. It took three months to merge the two-line bugfix, though. Terraform's biggest weakness may be that it's too ambitious for its own good: 1.7k issues on Terraform itself and another 3.7k on the AWS provider. Ended up using boto3 to build out my CD platform.
This anecdote is a lot less interesting, both because of the separation (you know some people vs they run a company with direct exposure) and lack of detail. I'm sure you do know some people who contribute, but you haven't given any details about their experience that would contradict OP's claim that contributing is hard.
I work at AWS in Professional Services until tomorrow. I worked with the SA and had meetings with the service team responsible for the AWS Service in question to discuss the API shortcomings that we needed for automations. He contributed to Terraform once the APIs became available from our service team and I wrote the equivalent CloudFormation custom resources for a project (and open sourced on the public AWS Samples GitHub repo after going through the approval process) as part of a larger project.
I found a bug in the underlying service API that affected both my implementation and the Terraform provider my coworker wrote. I posted a sample in our internal Slack channel where the developers of the AWS service hang out and they fixed the API bug relatively quickly.
My coworker had no problem getting his TF code merged in. I know the code he wrote, and I went to the GitHub page before posting this; it is in there.
Later as native CloudFormation support was added, I replaced my custom resources with the native equivalent as part of the larger project I was working on.
There's probably not much contribution from AWS to the core of Terraform, but there's very little contribution there from anybody outside HashiCorp, because contributions happen on the providers.
AWS is definitely involved with their provider though. AWS ProServ built out a whole account vending machine thing that was in Terraform (the name escapes me atm), and various other service teams and SAs are regularly involved in contributing to and growing the Terraform ecosystem.
It would be super disingenuous to imply AWS is not contributing to the success and growth of Terraform.
I am very much wondering this too. I've used Pulumi and like it a lot; it has a great UX in general. But the ecosystem for Terraform is orders of magnitude bigger, e.g. searching for help on Terraform is going to give a lot more results than Pulumi. As someone who can dig into details, this is not a big deal and I can use Pulumi on personal projects, but I can't in good faith recommend it for team projects, simply because the ability to find resources matters more there.
I don't know if the license change actually means providers will not be able to work with Pulumi, but if it does, it seems risky to use Pulumi even for personal projects if newer provider versions (i.e., versions that work with newer products released by the cloud provider) will not work with Pulumi, it's a dead end. And that's not to mention the useful providers that aren't cloud and completely community developed that will not have the resources to maintain two codebases in any case (I'm thinking of Sendgrid).
I looked at the terraform-sdk license; it still seems to be MPL. I think this means all providers can continue to be open and work with both platforms, but it will be important for Pulumi to clarify this to prevent a death spiral. Given the negative feedback toward the HashiCorp blog post from Pulumi employees in this thread, I'm somewhat skeptical: if everything were fine, complaining would only have a negative effect, so we users have to assume HashiCorp really is stomping them out. And if that's the case, then sorry, but in good faith to everyone else who may need to work on infrastructure I build, I'll have to be complicit in the stomping.
Pulumi is arguably the worst software I’ve ever used in my 15y career. I’d rather pay Hashicorp than use that dogshit.
On top of that, whether or not an OSS project accepts your PR means nothing about its quality or utility.
This change appears to have very little or nothing to do with most of us engineers and everything to do with companies wrapping and reselling. As far as I’m concerned it’s a good change.
Anyone who’s thinking about it: stay away from Pulumi unless you’re okay moving from declarative IaC to some bullshit imperative Python or Node constructors and for loops, and everything else that comes with writing OOP. I don’t care about the HashiCorp brand. I care about writing quality IaC, and Pulumi is not it.
As a concrete example of what this enables, the Pulumi Automation API lets you embed IaC straight into a bigger program. Folks have used this to create infrastructure-oriented SaaS products, self-service portals, and new higher level abstraction frameworks and tools, for instance, often spanning multiple cloud resource types (AWS, Azure, Kubernetes, Cloudflare, others) -- https://www.pulumi.com/automation/. Transpiling to YAML and handing it over to CloudFormation is clumsy and wouldn't work for these cases among others.
Pulumi is an open source infrastructure as code platform enabling you to program the cloud in your favorite languages. We help developers and infrastructure teams build better software together. Pulumi is polyglot in many ways, so you'll have the opportunity to work with Go, Node.js, Python, and other language ecosystems, on a daily basis; Pulumi is also multi-cloud, meaning you'll also get to work with Kubernetes, serverless, AWS, Azure, and much, much more. We have open roles at all layers of our stack, including our open source platform in addition to our SaaS application. We are a remote team, growing fast, and also hiring engineering managers. If this sounds fun, apply online, or email me at joe AT pulumi DOT com!
Pulumi is fully functional in open source form. The analogy I like to draw is Git and GitHub. You can use Git fully independent of GitHub, or you can choose to use them together, for a seamless experience within a team. (Not a perfect analogy since we built both Pulumi open source and the Pulumi SaaS, which causes this very confusion!) We don't hold anything back, if it's in the SDK, it's open.
We recently added concurrency control to the alternative backends. I'm sorry the docs are confusing on this matter -- we will get that fixed up. We also have many large customers in production on the open source alone. It's easier with the SaaS just because we handle security, reliability, and sharing with your team along with access controls, auditing, etc. But if you prefer to roll your own there we are entirely happy to have you in the community and help out. Admittedly our marketing materials aren't super clear here and we are working to fix this.
Hope this helps to clear things up and again apologies for the confusion.
You are right that it's not easy. Thankfully the cloud providers themselves have moved in the direction of auto-generation for their own SDKs, documentation, etc., which has forced this to get better over time. This is motivated by much the same reason we've done it -- keeping one's own sanity, ensuring high quality and timeliness of updates, and ensuring good coverage and consistency, when supporting so many cloud resource types across many language SDKs.
Microsoft, for instance, created OpenAPI specifications very early on, and standardized on them (AFAIK, they are required for any new service). Those API specifications contain annotations to describe what properties are immutable (as you say, the need to "re-create" the resource rather than update it in place). Google Cloud's API specifications similarly contain such information but it's based on the presence of PATCH support for resources and their properties. Etc.
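As a hedged illustration of how an engine might consume such annotations: the sketch below is loosely modeled on Azure's `x-ms-mutability` OpenAPI extension, but the property schemas and the default behavior are simplified inventions, not an actual provider implementation.

```python
# Sketch: deciding "update in place" vs. "replace" from an OpenAPI-style
# mutability annotation. Schemas below are simplified for illustration.

def requires_replace(prop_schema, changed):
    """A change to a property that cannot be updated forces re-creation."""
    mutability = prop_schema.get("x-ms-mutability", ["create", "read", "update"])
    return changed and "update" not in mutability

spec = {
    "location": {"type": "string", "x-ms-mutability": ["create", "read"]},
    "tags":     {"type": "object"},  # no annotation: assume updatable in place
}

print(requires_replace(spec["location"], changed=True))  # create-only -> replace
print(requires_replace(spec["tags"], changed=True))      # updatable -> in place
```

The point is that once the API specification itself carries this information, tooling no longer has to hand-maintain a list of which properties force resource re-creation.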
The good news is that we've partnered with the cloud providers themselves to build these and we expect this to be increasingly the direction things go.
This is good news. This will reduce the barrier to entry to all sorts of software leveraging Cloud APIs, not only Pulumi. (I believe crossplane.io might be following a similar approach).
A lot of sweat and tears have gone into hand-writing Terraform provider code. The vast majority of it has come from, and continues to be maintained by, volunteers.
To have had to replicate this manual effort all over again just to create a competitor would have been silly.
There will surely be a lot of wrinkles to iron out with this 100% automated approach. But indisputably this is a positive development.
Indeed what you say is true of many other "multi-language" platforms. I was an early engineer on .NET at Microsoft, and although it was multi-language from the outset (COBOL.NET was a thing!), the reality is most folks write in C# these days. And yet, you still see a lot of excitement for PowerShell, Visual Basic, and F#, each of which has a rich community, but uses that same common core. A similar phenomenon has happened in the JVM ecosystem with Java dominating most usage until the late 2000s, at which point my impression is that Groovy, Scala, and now Kotlin won significant mindshare.
I have reasons to be optimistic the infrastructure language domain will play out similarly. Especially as we are fundamentally serving multiple audiences -- we see that infrastructure teams often elect Python, while developers often go with TypeScript or Go, because they are more familiar with it. For those scenarios, the new multi-language package support is essential, since many companies have both sorts of engineers working together.
A "default language for IaC" may emerge with time, but I suspect that is more likely to be Python than, say, HCL. (And even then, I'm not so sure it will happen.) One of the things I'm ridiculously excited about, by the way, is bringing IaC to new audiences -- many folks learn Python at school, not so much for other infrastructure languages. Again, I'm biased. But, even if a default emerges, I guarantee there will be reasons for the others to exist. I for one am a big functional language fan and especially for simple serverless apps, I love seeing that written in F#. And we've had a ton of interest in PowerShell support since many folks working with infrastructure for the Microsoft stack know it. And Ruby due to the Chef and Puppet journeys.
I also won't discount the idea of us introducing a cloud infrastructure-specific language ;-). But it would be more general purpose than not. I worked a lot on parallel computing in the mid-2000s and that temptation was always there, but I'm glad we resisted it and instead just added tasks/promises and await to existing languages.
As to the Pulumi schema, you're right, that's a step we aim to remove soon. For TypeScript, we'll generate it off the d.ts files; for Go we'll use struct tags; and so on. Now that the basic runtime is in place, we are now going to focus there. This issue tracks it: https://github.com/pulumi/pulumi/issues/6804. Our goal is to make this as ridiculously easy as just writing a type/class in your language of choice.
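In the spirit of "just write a type/class in your language of choice", here is a hedged sketch of what deriving a schema from plain annotations could look like. The output shape is invented for illustration; it is not the actual Pulumi schema format, and a real generator would handle optional fields, nesting, docs, and more.

```python
# Hedged sketch: derive a schema-like description from a dataclass's
# annotations. The output format here is invented for illustration.
from dataclasses import dataclass, fields

TYPE_NAMES = {str: "string", int: "integer", bool: "boolean"}

@dataclass
class Bucket:
    name: str
    versioned: bool
    retention_days: int

def to_schema(cls):
    return {
        "type": cls.__name__,
        # unknown annotation types fall back to a generic "object"
        "properties": {f.name: TYPE_NAMES.get(f.type, "object") for f in fields(cls)},
    }

print(to_schema(Bucket))
```

The author's point stands regardless of the exact mechanics: the type declaration the user already writes carries enough information that a separate schema file becomes redundant.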
> There might be "a default language" but I suspect that is more likely to be Python than, say, HCL. One of the things I'm ridiculously excited about is bringing IaC to new audiences -- many folks learn Python at school, not so much for other infrastructure languages.
As someone that has worked largely on backend systems and infra for a long time, are colleges actually teaching these infra skills? Some app engineers at a place I consult for were just having this conversation last week, interested in how to train infra skills, and pretty much everyone had sort of fallen into infra roles over their careers, with no formal training in it. Most of us were just *nix hackers as kids, learned to program at some point, and now we’re here. Working on infra is quite a lot more than just knowing the language.
I’m not mad at having something other than HCL — years ago when I worked at Engine Yard we developed a cross platform cloud library that let us write things in Ruby which was nice. But when thinking of solving infra problems I’ve never once thought “you know, if I could just write this in Python these problems would go away”. Actually I personally hate Python as a language, I’d much prefer to write Go or Rust or TypeScript, and it does feel like a bonus that everyone touching infra sort of just has to use HCL which removes a lot of bike shedding.
Totally open to improvements! More isn’t always better though.
The defining characteristic about Pulumi compared to other tools is that it's not a transpiler, in fact. It's a multi-language runtime written in Go that can host many language plugins (Node.js, Python, .NET, Go, etc), as well as many resource provider plugins (native ones, OpenAPI-based, Terraform-based). So although yes it can use Terraform providers -- great for coverage across many infrastructure providers as well as easy portability if you're coming from Terraform/HCL -- it's not correct to say that it's "just using Terraform" or is a "transpiler".
So they are implementing their own N language runtimes? That sounds like trouble waiting to happen. How do they have the manpower to support N languages as a single company?
(edit)
Apparently they shell out to other language runtimes. So my original question about needing N runtimes available still applies. I'm not interested in supporting N runtimes for my IaC, TF seems good enough. Put Cuelang on top of that and you have a much better system. There are people already doing this.
As a Pulumi employee, what is your response to the N runtime problem and sharing of modules in an org?
Pulumi is a cloud engineering startup whose flagship infrastructure as code platform helps infrastructure teams and developers create modern cloud architectures. Our open source SDK uses general-purpose languages and ecosystems of tools to deliver apps and infrastructure across any cloud -- including AWS, Azure, Google Cloud, and Kubernetes -- with software engineering practices like testing, sharing and reuse, and more. We offer a SaaS for teams and enterprises using this SDK at scale.
If you enjoy working at the intersection of cloud infrastructure and developer platforms, you'd love it here!
We've worked with a lot of end users to migrate from Terraform, and we honestly do see a lot of copy-and-paste. I agree that it's not as rampant as with YAML/JSON, however, in practice we find a lot of folks struggle to share and reuse their Terraform configs for a variety of reasons.
Even though HCL2 introduced some basic "programming" constructs, it's a far cry from the expressiveness of a language like Python. We frequently see not only better reuse but a significant reduction in lines of code when migrating. Being able to create a function or class to capture a frequent pattern, easily loop over some data structure (e.g., for every AZ in this region, create a subnet), or use conditionals for specialization (e.g., maybe your production environment is slightly different from development, us-east-1 is different, etc.) makes a big difference. And linters, test tools, IDEs, etc. just work.
For comparison, this Amazon VPC example may be worth checking out:
It's common to see a 10x reduction in LOCs going from CloudFormation to Terraform and a 10x reduction further going from Terraform to Pulumi.
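The AZ-loop example mentioned above can be sketched in a few lines. This is illustration only: plain dicts stand in for real resource objects, and the CIDR math is naive. The point is the reuse that a function, a loop, and a conditional give you compared to HCL2's count/for_each.

```python
# Illustration only: dicts stand in for resources; names are invented.

def make_subnets(azs, env):
    subnets = []
    for i, az in enumerate(azs):  # one subnet per availability zone
        subnets.append({
            "name": f"{env}-subnet-{az}",
            "az": az,
            "cidr": f"10.0.{i}.0/24",  # naive sequential CIDR allocation
            # conditional specialization, e.g. production gets an extra tag
            "tags": {"env": env, **({"tier": "critical"} if env == "prod" else {})},
        })
    return subnets

print(make_subnets(["us-east-1a", "us-east-1b"], env="prod"))
```

The same pattern in HCL requires `count`/`for_each` plus interpolation tricks, and the function boundary (a reusable, testable unit) has no direct equivalent short of writing a module.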
A key aspect of how Pulumi works is that everything centers around the declarative goal state. You are shown previews of this (graphically in the CLI), you can serialize that as a plan, and you always have full diffs of what the tool is doing and has done. This helps to avoid some of the "danger" of having a Turing-complete language. Plus, I prefer having a familiar language with familiar control constructs, rather than learning a proprietary language that the industry generally isn't supporting or aware of (schools teach Python -- they don't teach HCL).
In any case, we appreciate the feedback and discussion -- all great and valid points to be thinking about -- HTH.
> It's common to see a 10x reduction in LOCs going from CloudFormation to Terraform and a 10x reduction further going from Terraform to Pulumi.
I don't see this as such a terrible problem. The configurations may have more LOCs, but there are not as many surprises. The predictability of declarative configuration makes it rock solid and favored among operations teams who need to make these kinds of changes all the time.
> A key aspect of how Pulumi works is that everything centers around the declarative goal state. You are shown previews of this (graphically in the CLI), you can serialize that as a plan, and you always have full diffs of what the tool is doing and has done. This helps to avoid some of the "danger" of having a Turing-complete language. Plus, I prefer having a familiar language with familiar control constructs, rather than learning a proprietary language that the industry generally isn't supporting or aware of (schools teach Python -- they don't teach HCL).
I understand the reason to want this. Having worked closely with developers, lack of familiarity with HCL makes it much less accessible. However, from an operations perspective, I am GLAD that HCL is a very limited language. No imports of libraries all over the place (in your infrastructure configurations, no less!).
> I don't see this as such a terrible problem. The configurations may have more LOCs, but there are not as many surprises. The dependability of declarative configuration makes it rock solid and favorable among operations teams who need to make these kinds of changes all the time.
The issue is that your static configs often have lots of boilerplate sections that have to be kept in sync. Further, you can use an imperative language like Python, JS, etc. and still write in a completely declarative fashion (or you can use a functional language, which tends to be declarative out of the box). Conversely, you can model an AST in YAML (which is what CloudFormation is trending toward) and get the worst of all worlds. Bottom line: don't conflate "reusability" with "imperative" or "static" with "declarative".
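"Imperative language, declarative output" is easy to demonstrate: a pure Python function can emit the same static JSON you would otherwise hand-maintain per environment, so the shared boilerplate exists in exactly one place. A minimal sketch (the config keys are made up for illustration):

```python
import json

# Shared boilerplate that would otherwise be copied into three
# nearly identical hand-written config files.
BASE = {"instance_type": "t3.micro", "monitoring": True}

def render(env, overrides):
    # Pure function: same inputs always yield the same document,
    # so the output is as predictable as a static file.
    doc = {**BASE, **overrides, "environment": env}
    return json.dumps(doc, indent=2, sort_keys=True)

configs = {
    env: render(env, overrides)
    for env, overrides in {
        "dev": {},
        "staging": {},
        "production": {"instance_type": "m5.large"},
    }.items()
}
```

The rendered documents can be committed to git and diffed like any hand-rolled config, which preserves the "no surprises" property while removing the sync burden.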
> The issue is that your static configs often have lots of boilerplate sections that have to be kept in sync.
Yes, I agree with this. However, it's predictable. As an operations person, I value predictability and am willing to pay the price of keeping static configs in sync.
> Further, you can use an imperative language like Python, JS, etc. and still write in a completely declarative fashion (or you can use a functional language, which tends to be declarative out of the box). Conversely, you can model an AST in YAML (which is what CloudFormation is trending toward) and get the worst of all worlds. Bottom line: don't conflate "reusability" with "imperative" or "static" with "declarative".
Hold on, I'm not conflating anything. Saying that "you can write terrible things in any language" isn't anything new. We choose to use languages that provide certain guarantees that we need for the domain that we're working in. For infrastructure, declarative languages are a lot more suitable for the properties they provide (i.e. no surprises, limited functionality, etc.). It's "possible" to use static types in Python, but how many do that?
> Yes, I agree with this. However, it's predictable. As an operations person, I value predictability and am willing to pay the price of keeping static configs in sync.
I think there's wisdom in this at small scales, but as the volume and complexity of your boilerplate grows, I think you lose any advantages. I also think this threshold is quite low (as an ops person and a dev person) since it's not much harder to look at/read the YAML generated by a script vs that which is hand-rolled and committed to git.
> Hold on, I'm not conflating anything.
Are you sure? Because you just said "I am willing to pay the price of keeping static configs sync" and then "For infrastructure, declarative languages are a lot more suitable for the properties they provide" and then you started to talk about "static types" in Python, which is different than "static" in the YAML sense (YAML isn't statically typed, but it is static in that it isn't evaluated or executed).
I'm not trying to be a jerk, it just sounds like a lot of concepts are being confused. I also wasn't making the argument "you can write terrible things in any language" (not sure if you were attributing that argument to me or if that was a point you were trying to make).
It's fully declarative, but it does evaluate, so it's not static in the YAML sense. It outputs a JSON CloudFormation template (but it could easily output in YAML) which you could inspect visually before passing onto CloudFormation.
It's also statically typed, although that's not evident from this file since all types are inferred (there are annotations in the imported libraries). And while the static typing is a very useful property, it's not what I've been talking about in this thread.
In my opinion, this is no less readable than the equivalent YAML; however, it's capable of doing much more (albeit if your infrastructure is just one S3 bucket, then this is overkill--to really understand the power of dynamic configuration, you would want a more complex example).
I can’t trust my teammates to write code that doesn’t use raw eval()’s all over the place.
Getting them, nevermind relying on them to write Python/JS in the correct way is straight up out of the question.
At least I know in Terraform/HCL they can’t map a config change over the 1000 new instances they spun up because they happened to write their for loop wrong.
> I can’t trust my teammates to write code that doesn’t use raw eval()’s all over the place. Getting them, nevermind relying on them to write Python/JS in the correct way is straight up out of the question.
> At least I know in Terraform/HCL they can’t map a config change over the 1000 new instances they spun up because they happened to write their for loop wrong.
To be clear, the proposal is to use a programming language to generate your HCL-equivalent configs, not to imperatively modify infrastructure. Consequently, you can inspect the generated "HCL" (or whatever the output is) and make sure it looks like the code they would write manually. Further, you can even write automated tests.
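Because the generated config is just data, it can also be guarded by ordinary tests before anything touches real infrastructure -- including a guard against exactly the "wrong for loop spins up 1000 instances" failure mode raised above. A sketch, with hypothetical resource names:

```python
import json

def generate_config(instance_count):
    # The program's only job is to produce a goal-state document.
    return {
        "resources": [
            {"type": "instance", "name": f"web-{i}"}
            for i in range(instance_count)
        ]
    }

def validate(config, max_instances=50):
    # Automated sanity check: refuse to emit a plan that creates
    # an absurd number of instances, no matter how it was generated.
    n = sum(1 for r in config["resources"] if r["type"] == "instance")
    if n > max_instances:
        raise ValueError(f"plan creates {n} instances; limit is {max_instances}")
    return config

plan = validate(generate_config(3))
print(json.dumps(plan, indent=2))  # inspect the generated output before applying
```

The validation step is just one example; any invariant you can state about the goal state (naming conventions, tagging, CIDR overlap) can be asserted the same way.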
So, things need to be comprehensible by the humans that work with them. A 10x reduction in LoC / 10x increase in expressibility may or may not be a good thing, but if it captures intent better and with less ceremony and cruft, then it most decidedly is a FANTASTIC thing. Whereas a 10x LoC improvement that makes it harder to glean intent would be DISASTROUS.
Then again, code has to be run in order to analyze its output -- that or code has to be data you can analyze (like a Lisp), but that can be very difficult to reason about.
So my preference would be to have libraries for constructing configuration data. Then you can execute a program to generate the configuration, and that you can use without further ado. The output may not be easy for a human to understand, though it should be possible to write code to analyze it.
So as a user, can I configure this Pulumi VPC stack before it's instantiated? Or do I have to use the defaults first and then use the CLI to change things? Do these CLI changes then get placed into code, or just into state? Does that mean I'm now in a situation where the code doesn't match the state?
Personally I find the Terraform configuration much easier to reason about, I see exactly where resources are declared just by scanning the file. (But I've also used Terraform a lot).
Edit: Ah, maybe I have to configure it via this config.py file [1]? I appreciate what Pulumi is trying to accomplish, but that is certainly not a config format I'd like to be using. Maybe you could use HCL or YAML for it? ;)
Edit 2: Another last thought, I think a lot of the mindset in Terraform comes from Go, where the proverb "A little copying is better than a little dependency" is pretty well adopted. Before I started writing Go as my main language I didn't appreciate that mindset, but after 5 years with Go I've found it more and more appropriate [2].
You're right, the Pulumi example is a project, not a reusable module. There are a few approaches to making it modular:
1) The project does support config. So if you want to change (e.g.) the number of AZs, you can say
    $ pulumi config set numberOfAvailabilityZones 3
    $ pulumi up
And Pulumi will compare the current infrastructure with the new goal state, show you the diff, and then let you deploy the minimal set of changes to bring the actual state in line with the new goal state. This works very much like Terraform, CloudFormation, Kubernetes, etc.
2) You can make this into a library using standard language techniques like classes, functions, and packages. These can use a combination of configuration as well as parameterization. If you wrote it in Python, you can publish it on PyPI, or JavaScript on NPM, or Go on GitHub -- or something like JFrog Artifactory for any of them. This makes it easy to share it with the community or within your team.
3) We offer some libraries of our own, like this one: https://github.com/pulumi/pulumi-awsx/tree/master/nodejs/aws.... That includes an abstraction that's a lot like the Terraform module you've shown, and cuts down even further on LOC to spin up a properly configured VPC.
I am a big Go fan too, so I very much know what you're saying. (In fact, we implemented Pulumi in Go.) Even with Go, though, you've got funcs, structs, loops, and solid basics. Simply having those goes a long way -- as well as great supporting tools -- and you definitely do not need to go overboard with abstraction to get a ton of benefit right out of the gate.
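The compare-then-deploy step in (1) operates on plain data, so the core idea can be sketched in a few lines of Python. This is a toy model of goal-state diffing, not Terraform's or Pulumi's actual algorithm:

```python
def diff(current, goal):
    """Compare two {name: properties} maps and report the minimal changes."""
    changes = []
    for name in goal:
        if name not in current:
            changes.append(("create", name))      # in goal, not yet deployed
        elif current[name] != goal[name]:
            changes.append(("update", name))      # deployed, but properties differ
    for name in current:
        if name not in goal:
            changes.append(("delete", name))      # deployed, no longer wanted
    return sorted(changes)

current = {"subnet-a": {"cidr": "10.0.0.0/24"},
           "subnet-b": {"cidr": "10.0.1.0/24"}}
goal    = {"subnet-a": {"cidr": "10.0.0.0/24"},
           "subnet-b": {"cidr": "10.0.1.0/25"},
           "subnet-c": {"cidr": "10.0.2.0/24"}}
```

Here `diff(current, goal)` reports one update and one create while leaving the untouched subnet alone, which is the "minimal set of changes" behavior described above.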
"The project does support config. So if you want to change (e.g.) the number of AZs, you can say..."
Cool, is it possible to do that without having to use the CLI? Are you doing any sort of state locking here? I've seen ops teams get saved from potentially horrible situations by Terraform's dynamodb state locking.
"You can make this into a library using standard language techniques like classes, functions, and packages."
That's pretty nice and it seems like it'll get you the same functionality as a Terraform module. Do you have any plans of releasing something like the Terraform Registry to help with discoverability?
Also, do you have any docs on writing providers? I've had to do that a few times for Terraform and getting up and running with that was pretty easy as a Go developer. I wouldn't really want to do that for every supported language though (no offense C#).
I'm seeing that some of this is using codegen to read the equivalent Terraform provider and generate the Pulumi provider from that schema. Is that the preferred workflow here for providers that already exist in the Terraform ecosystem?
> is it possible to do that without having to use the CLI? Are you doing any sort of state locking here?
Yeah it's just a file if you prefer to edit it. By default, Pulumi uses our hosted service so you don't need to think about state or locking. That said, if you don't want to use that, you can manage state on your own[1]. At this time, you also need to come up with a locking strategy. Most of our end users pick the hosted service -- it's just super easy to get going with.
> Do you have any plans of releasing something like the Terraform Registry to help with discoverability?
I expect us to do that eventually, absolutely. For us it'll be more of an "index" of other package managers since you already have NPM and PyPI, etc. But definitely get that it's helpful to find all of this in one place -- as well as knowing which ones we bless and support.
> Also, do you have any docs on writing providers?
We have boilerplate repos that help you get started:
These packages are inherently multi-language and our code-generator library will generate the various JavaScript, Python, Go, C#, etc, client libraries after you've authored the central Go-based provider schema.
> Is that the preferred workflow here for providers that already exist in the Terraform ecosystem?
Yes. We already have a few dozen published (check the https://github.com/pulumi org when in question). In general, we will support any Terraform-backed provider, so if you have one that's missing that you'd like help with, just let us know. We have a Slack[2] where the team hangs out if you want to chat with us or the community.
I would point out that Dhall solves these problems with one simple fundamental construct: the function.
And it still keeps everything terminating.
You can do it, but it means doing more cognitive engineering than "just throw Python at it".
Another point: you can have a declarative Turing-complete language. I would really like to see people bring Prolog-like languages to things like Pulumi and Terraform.
That would also allow convergent, concurrent application of changes, which means we could get proper collaboration. That would be a strong step forward for devops.
> We've worked with a lot of end users to migrate from Terraform, and we honestly do see a lot of copy-and-paste. I agree that it's not as rampant as with YAML/JSON, however, in practice we find a lot of folks struggle to share and reuse their Terraform configs for a variety of reasons.
I would venture to say that it's not Terraform that makes people copy/paste. It's the people. Call it lack of knowledge, not enough time, laziness, tight schedules...
Once your customers are on their own and new people join with no knowledge of Pulumi, as resources get added / moved / evolve, there will be copy/paste in their Pulumi code too.
Not defending Terraform here. Just adding a point to the discussion.
Some of this is truly on terraform. The for construct (and looping in general) was only added in TF 12, released in May 2019. Older codebases didn't have a real way to support looping so there's more copy paste there. TF supports ternary conditionals, but not true if statements, which makes adding more complicated if logic difficult.
The reality is that all programming languages have significant copy-paste codebases, but there are features which help reduce the amount of it. Terraform is missing some of those features, and many of the features it does have were only introduced in TF 12, which is less than a year old.
Yes. But Terraform (hcl) is not a programming language.
It's interesting that some people bring up sbt as an example of how to use a "programming language" for configuration. The reason why sbt became dominant was the weight of Lightbend (Typesafe). There was no way to get away from it. Frankly, sbt can be an awful mashup of copy/paste too. sbt is so much magic, I would not be surprised to discover that the majority of the folks who use sbt have no actual clue why stuff works the way it works.
I haven't tried Pulumi yet; I will when I get the chance. I am eagerly waiting for an opportunity to use it. Hopefully it will surprise me in a positive way and deliver on what it promises. I have very fond memories of Chef and cookbooks in Ruby, so it can be done.
Edit: personally, Chef solo (with the right tooling to eliminate the server) was the best experience so far. If Pulumi can improve on that (no agent), I'm looking forward to taking it for a test drive.
> I would risk to say that it’s not the Terraform that makes the people to copy / paste. It’s the people. Call it lack of knowledge, not enough time, laziness, tight schedules...
Well, the problem is that a majority of people don't want to / don't have the time to learn HCL, because it's not the most effective use of their time / not worth the "investment" to do so.
Learning HCL is not very rewarding, unless you are an ops person.
Learning a general-purpose language like Python, TypeScript, or whatever language your company uses is rewarding both for ops and dev people (or devops people, if you like that term) and typically can be used for a much wider set of use cases.
When introducing a new language, the pros and cons of doing so should always be carefully considered. Unfortunately, for devops tools, new languages like HCL, Jsonnet, Starlark, and zillions of YAML pseudo-programming DSLs are often introduced very lightly, mentioning a handful of use cases where the new language shines but ignoring the cons and intrinsic costs (learning curve, new tools, editor integrations, package manager, etc. all need to be built).
Terraform works great for teams where you have a strict separation between ops and dev people. The ops people will spend their time learning HCL, the dev people will learn Python, TypeScript or whatever that is.
However if you are trying to truly embrace a "DevOps" model Terraform shows its flaws. Developers will either still heavily rely on ops people to "help them" even for trivial infra changes or they will write sub-par copy pasta HCL code that tends to be verbose.
TF 0.12 may have a bunch of new constructs which make it easier to reduce duplication, but the boilerplate required to create an actual reusable module with variables and import it (and the overall awkwardness of the module system/syntax compared to any other language) vs. the simplicity of creating a reusable function/file in Python/TS is like night and day.
Furthermore the subpar editor support for TF makes it actually hard to follow references between modules and safely refactor code, so there is a much lower threshold at which an abstraction appears "magic"/incomprehensible in HCL, compared to typed TS/Python where you can easily follow references.
Source: ~2 years' worth of Terraform (incl. 0.12) and ~1 year's worth of Pulumi use within multiple companies and teams.
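The "reusable function vs. module boilerplate" contrast described above can be shown with an ordinary Python function. The resource shape and names here are hypothetical, chosen only to sketch the pattern:

```python
# In HCL this reuse would be a module directory, variable declarations,
# and a module block at every call site. In a general-purpose language,
# the same reuse is one function definition and a call.

def web_service(name, env, prod_replicas=2):
    """One parameterized definition shared by every environment."""
    return {
        "service": name,
        "environment": env,
        "replicas": prod_replicas if env == "production" else 1,
        "dns": f"{name}.{env}.example.internal",
    }

services = [web_service("api", env) for env in ("dev", "staging", "production")]
```

Following a reference here is just "go to definition" in any editor, which is the refactoring/navigation advantage the parent comment describes.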
Looking at your Terraform and Python scripts I see two different scripts doing different things with different abstraction levels and different configuration toggles.
It's ironic that you sell it as a plus that Python allows you to easily loop over data structures and make resources conditional, because pretty much all your Terraform resources there are conditional (with a few looping over lists for DRY purposes), while few of the Python resources are.
Many of the lines saved for declaring identical resource types are just because either the Terraform resource is declared with unnecessary values or because the Python one has a default value, which can be provided as well in Terraform.
But yeah, the bulk of the difference is that the scripts are doing different things by declaring different sets of resources.
> Plus, I prefer having a familiar language with familiar control constructs, rather than learning a proprietary language that the industry generally isn't supporting or aware of (schools teach Python -- they don't teach HCL).
Which comes back to my point about inexperienced developers (or the "10x" ones that cut corners until the table is round and then leave) preferring familiarity over using a specialized tool that takes into account common pain points, further fragmenting the space through "worse is better". I am certain I will die employed cleaning up ORM messes left by developers that didn't want to learn SQL despite its having a whole field of mathematics backing it. So if you're successful, odds are I will also end up fixing, some day, the "declarative output" a Pulumi script produced on a developer's computer that is not reproducible anywhere else, because it makes a request to his home server and mutates an array of resources somewhere depending on that response, the current time, the system locale, and the latest tweet by Donald Trump.
"Many of the lines saved for declaring identical resource types..."
Yeah, it seems a bit silly to say that a benefit is saved lines of code, yet the Terraform example is set up to do quite a lot more than the Pulumi example. The resources are just there and turned off with the "count" configuration. The Pulumi example isn't doing any of the RDS, Redshift, Elasticache, Database ACL, VPN gateway, etc. things. This example is a pretty substantial module, and I'd guess the LOC would be pretty similar between the two if the functionality were closer.
My belief is that we've been slowly building up to using general purpose languages, one small step at a time, throughout the infrastructure as code, DevOps, and SRE journeys this past 10 years. INI files, XML, JSON, and YAML aren't sufficiently expressive -- lacking for loops, conditionals, variable references, and any sort of abstraction -- so, of course, we add templates to it. But as the author (IMHO rightfully) points out, we just end up with a funky, poor approximation of a language.
I think this approach is a byproduct of thinking about infrastructure and configuration -- and the cloud generally -- as an "afterthought," not a core part of an application's infrastructure. Containers, Kubernetes, serverless, and more hosted services all change this, and Chef, Puppet, and others laid the groundwork to think differently about what the future looks like. More developers today than ever before need to think about how to build and configure cloud software.
We started the Pulumi project to solve this very problem, so I'm admittedly biased, and I hope you forgive the plug -- I only mention it here because I think it contributes to the discussion. Our approach is to simply use general purpose languages like TypeScript, Python, and Go, while still having infrastructure as code. An important thing to realize is that infrastructure as code is based on the idea of a goal state. Using a full blown language to generate that goal state generally doesn't threaten the repeatability, determinism, or robustness of the solution, provided you've got an engine handling state management, diffing, resource CRUD, and so on. We've been able to apply this universally across AWS, Azure, GCP, and Kubernetes, often mixing their configuration in the same program.
Again, I'm biased and want to admit that, however if you're sick of YAML, it's definitely worth checking out. We'd love your feedback:
This is a great analysis, but it's missing a fundamental point: why do we have a problem with these approximations of a programming language or just using a programming language to template stuff?
Because your build then becomes an actual program (i.e. Turing complete) and you have to refactor and maintain it! This is the common problem of using a "programming language as configuration" (e.g. gulp?)
It has the same premises as Pulumi, but without the Turing completeness (I don't know if/how Pulumi avoids that, but if it does, it should be part of the pitch), so you cannot shoot yourself in the foot by building an abstraction castle in your build system/infrastructure config.
We use it at work to generate all the Infra-as-Code configurations from a single Dhall config: Terraform, Kubernetes, SQL, etc.
> We use it at work to generate all the Infra-as-Code configurations from a single Dhall config
This is the key bit and not something which is pitched well enough from the Dhall landing pages: using straight YAML forces you to repeat yourself in multiple areas for each Individual tool being used, and these repetitions have to stay consistent across multiple tools. What Dhall does is allow you to write a single config and use it to derive the correct configurations for each tool that you use. So you can write a single configuration file from which, eventually, every single part of your system is derived - Terraform infrastructure, Kubernetes objects, application config, everything. When you pull it off, it's simply magical.
You can think of it like this: JavaScript is a horrible, no-good, very bad language, and yet all browser programming is done in JavaScript because every browser supports it - so too, are JSON and YAML horrible configuration languages. But JavaScript gave rise to abstractions like TypeScript which are much better languages which compile down to JavaScript for compatibility. TypeScript is to JavaScript what Dhall is to JSON and YAML - the fact is, pretty much everything is configured with JSON and YAML, and Dhall makes it much, much easier to live in that world, with no need for the systems being configured to support it.
Considering the relative obscurity of Dhall, it's basically the best-kept secret in the DevOps world right now, and it's a shame more people don't know about it.
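The single-source-of-truth idea is independent of Dhall itself; here is a rough Python analogue that derives two tool-specific documents from one shared config. The output shapes are deliberately simplified and are not real Terraform or Kubernetes schemas:

```python
# One shared description of the service...
APP = {"name": "checkout", "port": 8080, "replicas": 3}

# ...from which each tool's config is derived, so values like the
# port can never drift apart between the two tools.
def to_terraform_like(app):
    return {"resource": {"lb_target": {app["name"]: {"port": app["port"]}}}}

def to_kubernetes_like(app):
    return {
        "kind": "Deployment",
        "metadata": {"name": app["name"]},
        "spec": {"replicas": app["replicas"], "containerPort": app["port"]},
    }

tf = to_terraform_like(APP)
k8s = to_kubernetes_like(APP)
```

Dhall provides this same derivation with strong typing and guaranteed termination; the Python version trades those guarantees for familiarity.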
Dhall appears to be expressive enough that I can't see why you wouldn't have to refactor and maintain the Dhall code?
Writing Dhall code looks exactly like programming to me, and the programmer must possess the necessary programming skills to produce good Dhall code. A random guy with a text editor will make an equal mess in Dhall as they would with a "real" programming language.
I don't see how the restrictions in Dhall really help much in this regard. Turing completeness feels like a red herring to me.
Not a user of Dhall, just a fan, but refactoring of Dhall configuration should be extremely easy. You make a change, and your configuration stays the same, which is easy to verify. (Thanks to https://en.wikipedia.org/wiki/Normalization_property_(abstra... )
For TC languages, comparing if two programs (original and refactored) do the same thing is not solvable in general. If the language is not TC then it is more feasible.
You can do more than just compare the output of two programs in Dhall. You can verify using a semantic integrity check that two programs are the same for all possible inputs. For example:
Actually with Dhall, you should be able to compare the programs themselves, even without full "input" (there is even example on the Dhall page, see "You can reduce functions to normal form, even when they haven't been applied to all of their arguments").
So you can for example leave some parameters out of your config and still validate the correctness of refactoring.
If you use general purpose programming language, then even comparing just output might be difficult - most languages allow to do I/O, so it's possible that the configuration is dependent on some side channel.
I would say that if you are only using a general-purpose language "sensibly" for configuration, then you are effectively restricting yourself in the same way that Dhall does.
I don't get the problem with using a turing complete language to generate configuration. There's nothing wrong with maintaining and refactoring a program, that's a natural process for any program. If you don't want an infinite loop, don't write one, as you wouldn't in any other program. You can choose as much or as little abstraction as you so wish.
Give me a real language any day over dhall or jsonnet.
What's so bad about Turing completeness? I haven't had a decent look at Dhall, but I'm betting I could probably write an exponential Dhall program that won't terminate in the lifetime of the universe.
The real reason for giving up Turing equivalence was probably to get dependent types. This gives very powerful static guarantees, including the presence/absence of fields under non-trivial record operations such as merge. In using dependent types, they have also had to give up significantly on type inference, which is really going to annoy the average JavaScript/Ruby programmer.
> My belief is that we've been slowly building up to using general purpose languages, one small step at a time, throughout the infrastructure as code, DevOps, and SRE journeys this past 10 years.
I think that you’re right, and I think it’s great, because we have a programming model in which code is data and data is code: Lisp & S-expressions.
It’d be downright awesome to have a Lisp-based system which used dynamic scoping to meld geographical & environmental (e.g. production/development) configuration items. But then, it’d be downright awesome if the world had seriously picked up Lisp in the 80s & 90s, and had spent the last twenty years innovating, rather than reïnventing the wheel, only this time square-shaped. But then, the same thing could be said about Plan 9 …
I’ve not yet had the time to take a look at Pulumi, but I hope to have time soon.
Seriously, this has happened again and again and again. You have software, so you configure it via a clean and simple text syntax, then the configuration needs to be generated and the syntax becomes more complicated, then the next system you do has an "API" instead so you can configure it via programming, which is too complicated so the next time you Do it Right and go with a simple text file, which is then outgrown when the configuration it stores becomes too complicated...
I think the parts of Lisp that tended to be rebuilt have mostly been incorporated into the newer languages. (At least, it's been a very long time since I've had to rewrite a fundamental data structure, etc.)
You don’t need code-is-data for what your parent is describing. All you need is code that outputs data. Or even better, code that initiates contact with other code.
The only requirement is a commitment to doing things imperatively in a real programming language. It’s hard to resist the temptation to do things declaratively (because it’s easier to imagine a declarative interface that describes your problem than an abstraction of the procedure which will solve it) but you are never forced to.
As the kids say: stop trying to make Lisp happen, it's not going to happen.
It has become yet another community that's fighting a struggle that everyone else ended years ago, like the few Japanese in jungles who refused to surrender. I'm not entirely sure why it's not been adopted, but I suspect it's because most people strongly prefer (a) visually semantically different scope delimiters and (b) function-outside-brackets syntax ie f(a, b) rather than (f a b).
Or you could go the other way and say that JSON is s-exps with curly brackets so it should be made executable as such, and build that language.
> As the kids say: stop trying to make Lisp happen, it's not going to happen.
That's probably true, but I think it's useful to fight the good fight regardless. Even if Lisp & s-expressions don't, in fact, take over the world (and I think they will), arguing in their favour might help increase the chance that whatever inferior technology does end up getting adopted is better than it could have been.
> Or you could go the other way and say that JSON is s-exps with curly brackets so it should be made executable as such, and build that language.
The problem is that without symbols, that ends up being hideously ugly. This:
    ["if",
      ["<", 1, 2],
      "less than",
      "greater than or equal to"]
is appreciably worse than:
    (if (< 1 2)
        "less than"
        "greater than or equal to")
And alternatives like:
    {"if": [[1, "<", 2], "less than", "greater than or equal to"]}
are so much worse that I don't think anyone could seriously expect to use them.
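For what it's worth, the "executable JSON" idea upthread is only a few lines to prototype. A toy evaluator for the bracketed form above might look like this (an entirely hypothetical mini-language, sketched to show the shape of the idea):

```python
def evaluate(expr):
    """Evaluate a tiny JSON-encoded s-expression language."""
    if not isinstance(expr, list):
        return expr  # literals (numbers, strings) evaluate to themselves
    op, *args = expr
    if op == "if":
        cond, then_branch, else_branch = args
        # Only the taken branch is evaluated, like a real `if` special form.
        return evaluate(then_branch) if evaluate(cond) else evaluate(else_branch)
    if op == "<":
        a, b = (evaluate(x) for x in args)
        return a < b
    raise ValueError(f"unknown operator: {op!r}")

result = evaluate(["if", ["<", 1, 2], "less than", "greater than or equal to"])
```

It works, but the readability gap the parent describes is real: the symbols ("if", "<") have to live inside string quotes, which is exactly what makes the JSON encoding so much noisier than the Lisp one.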
> It has become yet another community that's fighting a struggle that everyone else ended years ago, ... like the few Japanese in jungles who refused to surrender.
Nice imagery, but the wrong point.
Except for the syntax, everybody else joined Lisp.
"We were not out to win over the Lisp programmers; we were after the C++ programmers. We managed to drag a lot of them about halfway to Lisp." --Guy Steele
Flash back to the mid-1980's (when the mainstream was C, Pascal, BASIC, FORTRAN, COBOL, etc.) and it's Lisp/Scheme (and Smalltalk) that have features like Garbage Collection, interactive development, lexical closures, decent built-in data structures, dynamic typing.
The fact that all of this is commonplace today, both justifies a lot what Lisp did in the first half of its existence and undermines its (technical) competitive advantages now.
> but I suspect it's because most people strongly prefer (a) visually semantically different scope delimiters and (b) function-outside-brackets syntax ie f(a, b) rather than (f a b).
It's not technical. I don't think it ever was. So much of it is around social concerns: a performance stigma dating back to the 1970's, fear of being able to hire people to do the work, fear of what VC's will think, worries that the language will still be available... And then at the end of the day, the problems whatever language will solve are a tiny fraction of the overall problem of doing something relevant and lasting and useful to others.
> As the kids say: stop trying to make Lisp happen, it's not going to happen.
Life is too short and the world is too big to try to confine other people's ideas of how they should think or work.
The point of the market economy and of the scientific process is that people get to try what they think is going to be useful and then let the world decide. The fact that Lisp is still in the conversation at all, when its contemporaries (Autocoder, Fortran) either aren't or are highly specialized, says a lot that we can learn from.
I think what you're doing with Pulumi is the right answer, and it's only a matter of time before this becomes the norm. The author's examples could easily be done with plain ol' JS/ES/TS, with far more extensibility and customization when the need arises.
I also feel this is where JSX got it right. Instead of creating yet-another-templating-language (looking at you Angular!), they used JavaScript and did a great job of outlining how interpolation works. Any new templating language is always going to be missing some key feature you expect out of a general programming language and your customers will continue to ask for more features.
Paired with Typescript, we would have the clearness of a declarative language, with the power and flexibility of a real language that is also easy to extend and navigate.
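As a sketch of what that buys you: plain TypeScript data, with ordinary loops and conditionals standing in for a templating language's constructs. All names below are made up for illustration:

```typescript
// Shape of the thing being configured; the compiler checks every instance.
interface BucketSpec {
  name: string;
  versioning: boolean;
}

const environments = ["dev", "staging", "prod"];

// A map() replaces the copy/paste (or template loop) a YAML file would need.
const buckets: BucketSpec[] = environments.map(env => ({
  name: `assets-${env}`,
  // A plain conditional replaces a templating-language `if`.
  versioning: env === "prod",
}));
```

The output is still just declarative data -- an array of objects -- but the logic that produces it is real, type-checked code the editor can navigate.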
In ROS we have these XML launch files that are just awful. They have enough features to be a really bad programming language for configuring and launching (often conditionally) numerous robot software nodes.
In ROS2 the launchfile can now just be a Python script. Very much learned all this the hard way and the solution was to just support Python. I think it's brilliant.
- the Django-like situation: the configuration is pure code, and it's a mistake. It was not necessary, and it brought plenty of problems. I wish they had gone with a templated TOML file.
- the Ansible-like situation: the configuration is templated static text. But with something as complex as deployment, they ended up adding more and more constructs, until they had created a monstrous DSL on top of their implementation language, with zero benefits compared to it and plenty of pitfalls. In that case, they should have made a library, with an API and documentation emphasizing best practices.
- and of course a big spectrum between those
The thing is, we see configuration as one big problem, but it's not. Not every configuration scenario has the same constraints and goals. Maybe you need to accept several sources of data. Maybe you need validation. Maybe you need generation. Maybe you need to be able to change settings live. Maybe you need to enforce immutable settings. Maybe you need to pub/sub your settings. Maybe you need to share them in a central place. Maybe they are just for you. Maybe you want them to be distributed. Maybe you need logic. Maybe you want to be protected from logic. Maybe the user can input settings. Maybe you just read conf. Maybe you generate it.
So many possibilities. And that's why there is not a single configuration tool.
What you would need is a configuration framework, dealing with things like merging conf, parsing files, getting conf from the network, expressing constraints, etc.
But if you recreate a DSL for your config, it's probably wrong.
In defence of Django, the way settings.py works has been very stable for the entire lifetime of Django.
It may have its problems (I don't have many issues with it) but it doesn't seem to have this problem of attracting ever more layers of abstraction on top of it. It works.
Actually, I think settings.py is not a bad idea, but it's half-baked.
There should be a schema checking the settings file. There should be a better way to extend settings, and to make different settings according to context, such as prod, staging, or dev.
There should be a linter preventing stupid mistakes like missing a comma in a tuple, resulting in implicit string concatenation.
There should be variables giving you basic stuff like current dir, log dir, var dir, etc. We all make them anyway.
And there should be a better way to debug settings-import problems.
But all in all, it's quick and easy to edit, and very powerful.
> There is already a mechanism to validate the settings.py file inside django.
It's not exposed, and it's very limited.
> The different context stuff can be handled by using env vars, and a nice python wrapper, like python-decouple.
It's just one of the ways to do it. Go to a new project, and they'll use a different way. The main benefit of Django is that a Django project is well integrated, and you find similar conventions and structure from project to project, allowing you to reuse the skills you've learned and build an ecosystem of pluggable apps.
Has anybody here personally suffered from the problems that Turing-complete Django configuration creates? (I mean, not the ones caused by lack of completeness checks or good library support, but the ones caused by too much power.)
Now that you say it, it's true I didn't have problems with too much power.
I never had an untrusted party editing my config, nor did I use data from any.
Also, you can make the same mistakes in the settings file as in any code file, but they're no more or less serious there.
In fact, all the problems I had could have been solved by better integration: solving the import problem, making composition easy, adding checks, allowing data to be loaded from several sources and merged, and presenting it all in a unified interface.
If I'm being honest, the problem with settings.py may not have been that it's Python, but that it's a flat file with no strong conventions, tooling, or best practices.
I could raise the issue that you can't read the config from another language, but I never had to, and good tooling would allow a synced export or an API to consume the settings.
After years of working with cfengine and then ansible, I finally went to a bespoke BSD-ports work-alike with optional client/server and JSON configuration components. Never looked back.
RCS-stored, directory-based modules with tasks in subdirectories. Make- or shell-script-style module execution as part of each task dir, plus variable files containing settings for the install task. JSON configuration files that define all necessary module params (e.g. log, task selection, stop on error, initialization, build command per task, etc.), and remote scheduling of module/task execution via a per-agent SysV IPC command queue serviced by a JSON-RPC microservice, which allows both serialized and non-blocking task scheduling by queue priority.
I owned the majority of the configuration system and ecosystem for Borg, Google's internal cluster management and application platform.
Unfortunately, what's described here is good on many levels, but not excellent on any.
If you are OK with describing the complexity of your infrastructure in a general-purpose programming language, then well-abstracted APIs built on the cloud providers' original APIs are more familiar to devs, and will be more reliable, performant, and flexible.
If you want a config experience, something like kustomize is leaner and more compatible with the text config model.
I also cannot see how this interoperates with other tools, which will seriously limit its appeal to people using other tools.
The problem with code as configuration is that the config file is nondeterministic and it takes longer to extract information from the file.
This has long been a problem in the Python/pip community, as it's basically impossible for the build tools to determine the dependencies of a package without fully downloading and running the setup.py file.
Unless you call rand(), your code should be deterministic. You're right about needing to run the thing to get the data (that's the point), but there is a middle ground between pure literals and fully side-effecting code. For example, you could impose pure functions (no side effects).
That's what Haskell already does. Dhall is optimizing along different dimensions (making sure the script's execution terminates, making the scripts statically verifiable, making it convenient to merge files, making it convenient to centralize your configuration).
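A sketch of that middle ground: config as code, but restricted to a pure function of explicit inputs, so evaluation is deterministic and side-effect free. The names and fields below are invented for illustration:

```typescript
// Explicit inputs: no reads from env vars, files, network, or the clock.
interface Inputs {
  env: "dev" | "prod";
  region: string;
}

interface Config {
  replicas: number;
  logLevel: string;
  endpoint: string;
}

// Pure: calling this twice with the same inputs always yields the same
// config, so tools can safely evaluate it to extract information.
function makeConfig(inputs: Inputs): Config {
  return {
    replicas: inputs.env === "prod" ? 3 : 1,
    logLevel: inputs.env === "prod" ? "warn" : "debug",
    endpoint: `https://api.${inputs.region}.example.com`,
  };
}
```

The point is that "you have to run it" stops being scary once running it is guaranteed to be repeatable and effect-free.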
As a happy pulumi user, I have to say I am very impressed with the experience. An order of magnitude improvement on maintainability over our old terraform code base. Highly recommended.
This is my experience, and it's clearly biased by maybe one bad example, but... SCons is an example of code over configuration, and from what I could tell I never met someone who truly understood it. Because it was code over configuration, every programmer added their own interpretation of what was supposed to happen; no programmer truly understood what was really going on, and it turned into one giant mess of trying to understand different programmers' hacks and code just to get the build to work. I'm sure some SCons expert will tell me I'm full of crap, but I'm just saying, that's my experience.
So, what's my point? My point is that configuration languages help in that they push "the one true way" and help to enforce it. Sure, there are times you end up having to work around the one true way, but giving programmers the very powerful tools of a full language for configuration leads to chaos, or at least that's my experience. Instead of being able to glance at the configuration and understand what's happening because it follows the one true way, you instead end up with a configuration language per programmer, since every programmer will code things up a different way.
For what it's worth--I've been using Pulumi on a couple of different projects and, today, I couldn't imagine starting a cloud-based project on anything else. The Pulumi team has spent more time than almost anybody I know on understanding how to attack these problems; I guess I have a bit of an understanding of just how much work that is, as I've tried to do the same thing and their solution is better.
I appreciate that their revenue model doesn't require making the open-source version frustrating or stupid and I appreciate that they're incredibly responsive. And some of the stuff you'll see around cloud functions/Lambdas and the deployment thereof will fucking blow your mind.
I have been using ksonnet, but that is now officially dead. Working with jsonnet seemed unnecessarily painful when coming from TypeScript. This information is quite timely and welcome; I'll look further at the TS example.
We have ksonnet expats on the team (we're all in cloud city -- Seattle), and I've been keeping an eye on that project myself, since I think it got a lot of things right and frankly many of the ideas for Pulumi were inspired by early chats with the Heptio team. But, as you say, why create a new language when an existing one will do -- that was our original stance and it's working great in practice.
Oh! I don’t know where I got that impression from then! perhaps I just thought that we couldn’t use the free tier because of the number of licenses we’d need, but you’re right, it’s still there!
Build files (e.g. makefiles and their various descendants like SCons, rake, etc.) seem to be in the same general boat, except that very early on, mixing in "real languages" (or at least shell scripting) was obviously allowed, so they've always leaned far more towards the "yes, it is a general purpose language" end of the spectrum.
> My belief is that we've been slowly building up to using general purpose languages, one small step at a time, throughout the infrastructure as code, DevOps, and SRE journeys this past 10 years. INI files, XML, JSON, and YAML aren't sufficiently expressive -- lacking for loops, conditionals, variable references, and any sort of abstraction -- so, of course, we add templates to it. But as the author (IMHO rightfully) points out, we just end up with a funky, poor approximation of a language.
This is why I prefer to use a JS file for configuration instead of a native JSON or YAML file when those options are available.
I still don't know how to get it to do exactly what I want. There is far too much magic involved, and experience has long demonstrated that magic is bad (Webpack confirms that belief).
That being said, the concept of defining a function in, essentially, a config file seems like a step in the right direction. I don't think I'd trust that functionality outside of builds or infra-as-code, though.
What's magic about webpack? The online documentation provides quite a lot of insight into how it all fits together.
It probably only seems like magic because you didn't build a fundamental understanding of how it works before using it. I use some massive webpack configurations and I understand them all quite thoroughly thanks to well-written, modularized configuration files.
Javascript is a scripting language without native module support. That isn't Webpack's fault.
Webpack also handles much, much more than just Javascript. It handles CSS, HTML, images, files, pretty much any kind of asset. Java/Scala doesn't have anything like that. Asset management is completely different due to the nature of how assets are transferred to the client.
And Android? Give me a break. The moment you stray from the strict layout of an Android app you run into a wall and have to learn how Gradle operates. This strict layout is good for some but others hate when an environment forces particular constraints upon them.
Webpack is completely configurable at every stage, works with plugins (which compilers don't do) and again, isn't magic. Not knowing how something works doesn't make it magic. That's not what magic means with respect to code.
Besides... Maybe if you just like getting by, you can program in C/Java/etc without learning about compilers. Web dev is fucked and transpiler knowledge is basically required, but sure you can get by in other domains without it. But if you want to be a good programmer, an expert at what you do, someone who lives and breathes and understands computer science, someone who will excel in his career and not remain a code monkey forever... You have to learn about how your compilers work just like you should know how the silicon in your computer is doing its own "magic".
It was very successful. Complicated projects require complicated build config. Parcel does fine for simple projects, but lacks the raw power & configurability of webpack.
Webpack now does simple config as well with the 'mode: "production"' and 'mode: "development"' presets.
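The `mode` presets (webpack 4+) really do collapse the common case down to a few lines: "production" turns on minification and related optimizations, "development" favors fast rebuilds and readable output. A minimal sketch, with everything else left to webpack's defaults:

```typescript
// In a real build, this flag would come from the CLI (--mode) or an env var.
const isProduction = true;

const config = {
  // The preset does the heavy lifting: minification, chunk naming, etc.
  mode: isProduction ? ("production" as const) : ("development" as const),
  entry: "./src/index.ts",
  // No loaders, plugins, or output config needed for the simple case.
};

export default config;
```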
Having dealt with puppet, cloudformation, ansible and other solutions that have gone in and out of fashion and also dealing regularly with Kotlin, Java, Javascript, and recently typescript, my view is that configuration files are essentially DSLs.
DSLs ought to be type safe and type checked, since getting things wrong means all kinds of trouble. E.g. with CloudFormation I've wasted countless hours googling for all sorts of arcane weirdness that the Amazon people managed to come up with in terms of property names and their values. Getting that wrong means having to dig through tons of obscure errors and output. Debugging broken CloudFormation templates is a great argument against how that particular system was designed. It basically requires you to know everything ever listed in the vastness of its documentation hell, and somehow be able to produce thousands of lines of JSON/YAML without making a single mistake, which is about as likely as it sounds. Don't get me started on puppet. Very pleased to not have that in my life anymore.
On a positive note, Kotlin recently became a supported language for defining Gradle build files. Awesome stuff. Those used to be written in Groovy. The main difference: Kotlin is statically compiled, and tools like IntelliJ can now tell you when your build file is obviously wrong and autocomplete both the standard stuff and any custom things you hooked up. Makes the whole thing much easier to customise, and it just removes a whole lot of uncertainty around the "why doesn't this work" kind of stuff that I regularly experience with Groovy-based Gradle files.
Not that I'm arguing for using Kotlin in place of JSON/YAML. But TypeScript seems like a sane choice. JSON is actually valid JavaScript, which in turn is valid TypeScript. Add some interfaces and boom, you suddenly have type safety. Now using a number instead of a boolean or string is obviously wrong. Also, TypeScript can do multi-line strings, comments, etc., and it supports embedding expressions in strings. No need to reinvent all of that and template JSON when you could just be writing TypeScript.
I recently moved a yaml based localization file to typescript. Only took a few minutes. This resulted in zero extra verbosity (all the types are inferred) but I gained type safety. Any missing language strings are now errors that vs code will tell me about and I can now autocomplete language strings all over the code base which saves me from having to look them up and copy paste them around. So no pain, plenty of gain.
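The move described above is small enough to sketch: the messages stay "just data", but the compiler now catches a missing or typo'd key, and the editor autocompletes message ids. The names below are illustrative, not from any particular project:

```typescript
// Formerly a YAML file; now a plain object literal. `as const` keeps the
// exact keys and string values as literal types.
const messages = {
  greeting: "Hello, {name}!",
  farewell: "Goodbye!",
} as const;

// Every message id becomes a checked type: t("greting") is a compile error.
type MessageId = keyof typeof messages;

function t(id: MessageId): string {
  return messages[id];
}
```

Zero extra verbosity (all the types are inferred from the data), but lookups across the code base are now checked and autocompleted.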
And yes, people are ahead of me and there are actually several projects out there offering typescript support for cloudformation as well.
To go with your general line of thought, see how many JS-based projects are increasingly moving towards a JS file with a default export as a config file.
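The shape of that pattern is simple: the config file is an ordinary module whose default export is the config object, so tools import it and the author gets comments, computed values, and reuse for free. The field names below are invented for illustration:

```typescript
// A real config would typically read this from the environment or CLI.
const isCI = false;

const config = {
  cache: !isCI,                             // computed instead of hand-duplicated
  reporters: isCI ? ["junit"] : ["pretty"], // conditional values, no templating
  timeoutMs: 30_000,                        // numeric separators and comments: things JSON can't do
};

export default config;
```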
Definitely. I was a part of C# in the early days, so little else would make me happier than awesome first-class .NET support. This'll be great for Azure folks -- who knows, PowerShell too?
We are actively working on https://github.com/pulumi/pulumi/issues/2430, which will make it easier for our small team to manage multiple languages. Once that lands, I would expect this to be high priority.
> Definitely. I was a part of C# in the early days, so little else would make me happier than awesome first-class .NET support. This'll be great for Azure folks -- who knows, PowerShell too?
Powershell would be great, it has nice support for building DSLs.