08 Jul 2021
Stop me if you’ve heard this one before …
Here is a three-tier web stack. It has lots of web and app servers but only one database box. You can substitute this with cloud things but the principles are the same. I bet your infrastructure looks really similar. For the remainder of the post, assume I mean a traditional RDBMS when I say database.
Why is there always only db01? Your box might be called prod-db1 or mysql-01. You know the one I mean. Maybe you have a db02 but I bet it’s running in a special mode. Db02 is another exception to the rule (maybe a read-only replica for reporting). Whatever the case, I bet there’s only one db01 but there are so many of the other things. Why?
We can summarize scaling each tier in this whole stack like this:
Every tier is easy to reason about and scale out horizontally except for the database. What is going on here? I’m going to go over a few good ideas and why they die on the database tier.
Load balancer pools work great for tiers without state. You can even use tricks like sticky sessions when you have some state. But the request is short. A database resists these ideas because connections are long and the entirety of state is local. You can’t put database volumes on a shared drive and expect it to work (well). So the problem is at least state but let’s keep chatting about some other ideas.
Docker works great for tiers without state. You can dockerize your database but you don’t get the scaling and uniformity advantages of the other tiers. Docker is great for production and deployment but you are not deploying your database that often without a lot of fancy uptime tooling. In addition, you have footguns with non-obvious behavior around volumes. You can do it but it’s the exception when the app and web tiers are so easy to explain and reason about.
There are few threads or debates about dockerizing the other tiers. Dockerizing the database layer, on the other hand, is endlessly debated and googled.
Horizontal scaling doesn’t work on the database tier. You can’t easily have a read/write (active/active) pair. There are alternate daemons and methods (NewSQL) but here I mean common relational SQL databases.
What about NixOS? Or some other hot and trendy new idea? My first concern and question when I heard about NixOS was about the database layer. I have asked this question about NixOS and apparently it can be done. I don’t completely grok the answer, but I guess that’s part of my point. The database tier is a special case again.
You definitely can’t do the cattle thing on the database tier because you can’t put it behind a load balancer. You can only do the cattle/pets thing in the app tier because there you have a load balancer with a health check.
During unit testing you might want your tests not to hit an API. You can mock out the HTTP interface and test against a mock response (or even better, ignore the response entirely). This is basically mocking out someone else’s (or your own) app server. So why don’t people do this with the database? Is it because the response is so important? It’s more of a language and state engine than a simple message passing metaphor?
You can find fakeredis adapters in Python, fake caches in Ruby and in-memory databases in C#. But it’s all surrounded by caveats. It’s just easier to create a test database because databases ruin all good ideas. At least database tech enables a workaround.
There is so much state and back-and-forth protocol in a relational database that treating it like client/server message passing is too simple. All the state and data lives in the database. Even triggers and internals would be too complicated to account for. It’s just easier to create a test database because database namespaces/collections are very easy to create. Databases also have the advantage of rolling back in a transaction which works great for unit testing.
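As a minimal sketch of that rollback trick (Go only because I need to pick a language; the Postgres driver, connection string and users table are all made up), the whole test runs inside a transaction and the deferred rollback throws the data away when the test ends:

// a throwaway test that leaves the test database exactly as it found it
package myapp

import (
    "database/sql"
    "testing"

    _ "github.com/lib/pq" // assumed driver choice
)

func TestCreateUser(t *testing.T) {
    db, err := sql.Open("postgres", "postgres://localhost/app_test?sslmode=disable")
    if err != nil {
        t.Fatal(err)
    }
    defer db.Close()

    tx, err := db.Begin()
    if err != nil {
        t.Fatal(err)
    }
    defer tx.Rollback() // everything below is undone, no cleanup scripts needed

    if _, err := tx.Exec(`INSERT INTO users (name) VALUES ($1)`, "chris"); err != nil {
        t.Fatal(err)
    }
    var count int
    if err := tx.QueryRow(`SELECT COUNT(*) FROM users WHERE name = $1`, "chris").Scan(&count); err != nil {
        t.Fatal(err)
    }
    if count != 1 {
        t.Fatalf("expected 1 user, got %d", count)
    }
}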
So your project might have fake adapters but not for mysql/postgres. Or maybe you use sqlite in dev/tests and something bigger in prod. But you don’t change entire products for your caches/queues based on the environment do you? See what I mean?
Renting large boxes usually doesn’t make sense financially. You’d be better off just buying. The same is true for performance clusters and GPUs. The scaling and pooling problems from above don’t change. Even a SaaS has the same issue. In this case the singular db01 box just moves to the cloud.
Very long ago, I worked on an Oracle cluster that required a ton of specialized software, hardware, admin and configuration. Almost the entire idea was about availability and performance. The CEO just couldn’t stand the fact that half of the system was wasting money being read-only. He wanted read-write on both nodes. Active-active. This was a long time ago but the CAP theorem isn’t going to change. I learned a ton about split-brain, mostly through trauma.
At the time, you couldn’t just download a relational database that would do horizontal scaling. You had to buy all these vendor options and stuff. It was super expensive. I forget the price, probably $40k for the db license and $20k for the clustering addon. And then you needed specialized disk and volume software. The hardware was really pricey too because it was Sun at the time.
During cluster install it tells you to plug in a crossover cable to a dedicated NIC. Like, you had eth1 just sitting there free or you had to buy a NIC for it. I think we bought a NIC. The install isn’t going to work unless you do this crossover thing. In addition, you need to set up a quorum disk on your SAN to act as a tiebreaker (more on that later). All the traffic over this crossover cable is SSH. All it’s doing is relational database agreement over SSH. There’s no data sharding or splitting you have to do, so it’s all or nothing. Full-on ACID agreement, all the time. This is why you need the dedicated NIC: the network load.
So you finally beat the CAP theorem. You got your active-active database and you didn’t have to change your app at all. Now come the trade-offs, the devil’s details. ACID means we have to 100% agree on every query. That means all nodes, all the time. This is why scaling nodes was so bad. You got about +50% from the second node and then +25% from the third. It stopped scaling after 4. Remember, each node is incredibly expensive. Also, your nervous system is this crossover cable (actually a pair). What happens if I take some scissors to it?
Well, db01 thinks it’s up. And db02 thinks it’s up. But db02 thinks db01 is gone. And db01 thinks db02 is gone. So, now what? What happens if a write comes in to both db01 and db02?
db01: foo=bar
db02: foo=ohno
What’s foo supposed to be? s p l i t b r a i n
So this is why you configured a quorum disk. When the cluster loses quorum, there’s a race to the quorum disk. It writes a magic number to the start of the disk sector (not even in the normal part of the disk, iirc) and whoever arrives 2nd panics on purpose. Now you have survived split-brain. But you needed crazy shared disk technology to even do this, for arbitrary reasons.
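To make the tiebreak concrete, here is a toy sketch (nothing like the real clusterware; the shared SAN sector is faked with an atomic flag): both survivors race to claim a marker, and whoever gets there second takes itself down rather than accept writes.

// toy quorum race, for illustration only
package main

import (
    "fmt"
    "os"
    "sync/atomic"
)

var quorumMarker int32 // stand-in for the magic number on the quorum disk

func onInterconnectLost(node string) {
    if atomic.CompareAndSwapInt32(&quorumMarker, 0, 1) {
        fmt.Printf("%s: claimed quorum, keeps serving\n", node)
        return
    }
    fmt.Printf("%s: arrived second, panicking on purpose to avoid split-brain\n", node)
    os.Exit(1)
}

func main() {
    onInterconnectLost("db01") // first to the disk wins
    onInterconnectLost("db02") // the other survivor shoots itself
}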
It was a crazy time and I should share this as a production horror story sometime later. A lot of the technology in this story is super old. But some of it hasn’t changed. When I learned Mongo, I had a high degree of context from this horror and I didn’t have to ask “yeah but why” a lot.
Way back when, our CEO couldn’t stand to have half the hardware sitting around doing nothing. He wanted it involved. It’s not like it’s a “dumb idea”. It was a good idea! A lot of people have good ideas around the database. To me though, databases ruin all good ideas. This is how I chunk it. I know it’s cute but it keeps coming up.
07 Jul 2021
There’s this popup thing happening. I’m not sure if it’s because GDPR cookie banners I really don’t care about are making me exhausted. Either way, there’s a business optimization thing that I want to talk about.
Imagine we have a company. We’ve been in business for a while and we’re public. But along comes automation and analytics and we find that if you pull this lever, you get a few more hits on the website. If we send an email, we make X. If we send an SMS, we make Y. If we put a banner on the site, we make Z. On and on. And since we have scale, dashboards and reports, this is almost like a noise function through a filter. We’re tracking our lever pulls and our knob twists. This is what we wanted all this information for. We wanted to optimize and act.
So we make our site, our cart, our onboarding, our existing users’ experiences all have some options to randomly upsell or increase revenue. Not on purpose from the start but iteratively through many small changes. Why wouldn’t we? If someone finishes checking out, we send an email making sure that everything went fine and that email has more product links. When we do this, we notice that we make +X%. Just on random noise from sampling.
# sample all users as some_users
# send marketing to some_users
Flipper.enable_percentage_of_actors :youtube_red_popup, 1
But now in this (long established) digital world everything is like this. I get sampled so often that I get popups as not occasional crackles and pops but as constant noise. This is aggregate personalized noise across all the services I use. I get the random sampling so often that I approach the constant random noise that the feature flags were trying to avoid from their perspective. But this is the problem, it’s just one perspective.
If I am 1% sampled on the many services I use, I experience annoyance beyond what each service by themselves expected.
The particular numbers don’t matter. My point is, it’s not 1% sampling to me. I’m a part of many things but the single things think that they are everything.
This is what they think their sampling is like. From their perspective the annoyance, call to action, popups, upsells are rare.
But this is how it is from my perspective when I’m a user of many services.
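To put rough, made-up numbers on it: each service prompts me at 1% and thinks that is practically never, but the odds of getting prompted somewhere climb fast with the number of services I use.

// back-of-the-envelope math with invented numbers
package main

import (
    "fmt"
    "math"
)

func main() {
    perService := 0.01 // each service: "only 1%, that's rare"
    for _, services := range []int{1, 10, 50, 100} {
        atLeastOne := 1 - math.Pow(1-perService, float64(services))
        fmt.Printf("%3d services: %4.1f%% chance of at least one prompt\n", services, atLeastOne*100)
    }
}

At 50 services that’s roughly a 40% chance of getting nagged, even though every individual dashboard still says 1%.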
In Fast and the Furious everything is cars. Cars solve all problems. There are no bikes. So A/B testing which car produces the most click-through makes all the sense in the world. But you can’t consider bikes. Bikes don’t exist and certainly not bike click-throughs or bike prompt exhaustion. “It’s only 1% car prompts, that’s not that annoying.”
Ok, back to the youtube red popup. Even if we could design a popup with memory (this absolutely could be a thing), no for-profit company will use it. We could absolutely design a popup component that has memory. “How many times has Chris dismissed me? Maybe I’m annoying!”. No one would use it. Certainly not at scale. At scale, 1% is amazing. It enables projects, it destroys worry.
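If it helps to picture it, the hypothetical component is tiny (names and thresholds invented here):

// a popup that notices it is being dismissed and backs off
package main

import "fmt"

type Prompt struct {
    dismissals int
    maxNags    int
}

// ShouldShow answers "how many times has Chris dismissed me? Maybe I'm annoying!"
func (p *Prompt) ShouldShow() bool {
    return p.dismissals < p.maxNags
}

func (p *Prompt) Dismiss() {
    p.dismissals++
}

func main() {
    upsell := Prompt{maxNags: 3}
    for i := 0; i < 5; i++ {
        if upsell.ShouldShow() {
            fmt.Println("Try YouTube Red?")
            upsell.Dismiss()
        } else {
            fmt.Println("(stays quiet)")
        }
    }
}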
There’s this great talk from Velocity NY 2013 where Richard Cook explains that businesses never know where the failure line is. This isn’t really in the same domain as reliability but I think it applies. It’s a great talk, you should watch it.
You fiddle with these knobs and see the profits coming in but who is going to represent the users? It’s only after you have a negative revenue impact that you’d have ammunition to argue against money. The feature flags continue.
05 Jul 2021
I’ve sometimes seen people asking about dependency management, hooks, tracking bugs and other sort of higher level (to me) things than git provides. You can see this if you look at stackoverflow questions about submodules. What’s wrong with submodules? Well, compared to what exactly? When I do a clone of a project and run yarn install, it gives me a list of CVEs that match. When I do a bundle exec, it loads my project and has an opportunity (with a very high level of context) to tell me that I’ve forgotten to run migrations or haven’t run yarn update in a while. You don’t get this with git. Maybe these examples are too web-tech specific. But I’d like to suggest that this pattern will probably apply to Go, Rust and whatever else. Git is below your project and your project is trying to get better stuff done. So stop trying to solve your problem with Git and listen to how a few other communities do their thing.
I’ll also say that every project is different and as much as I want there to be universal truths, project differences really put some of this stuff into a spin. A lot of this is “to me”. But, I’ve also seen people doing “weird stuff” with Git and when I probe, they haven’t seen or felt success and so they are turning to Git as the tool they already have in place.
Git is really dumb (I mean the cli utility, not the “wrapper frontends” like Github or Gitlab). It’s in the name. It’s in the slogan: “stupid content tracker”. It only really knows how to work with text. You can teach it to understand machine learning binaries and possibly image assets but this is fighting the default. Game developers know this well (I don’t). So it’s interesting when I see other people trying to do it anyway. What I see is a lack of tooling in the language they are using.
Let me give some concrete examples.
In each case, the details don’t matter too much. Someone is trying to trick Git into doing something. It’s almost like a challenge. “If I can sneak past the guard then I can …”. Just stop for a minute. Listen to other projects and how they are doing it. Explore other languages. You don’t have to learn the whole thing. If you are stuck in Java or C++, learn about Yarn/Cargo/Bundler. Look at what Go went through.
But most of all, move up a level. Instead of hatching a Git plot, move toward your language, your IDE, your framework or your engine. Git is this plumbing bringing you water and you need to add the Kool-Aid packet for your Kool-Aid. It’s so much closer to what you are trying to make.
Let me give you two more examples while sharing a couple of neat tips.
You are working on a team and two people modify package.json at the same time. Your project is using yarn. This means the machine-generated file yarn.lock is going to conflict. What should you do? Do some git cherry-picking wizardry? If we follow our above rules, we will use Git eventually but we want to lean in to higher level tools, in this case yarn.
# dealing with a yarn conflict
git checkout origin/master -- yarn.lock
yarn install
git add yarn.lock
git rebase --continue
We’re keeping our package.json changes but letting yarn do the work of resolving the graph. Easy, and it’s higher level than text.
You have many clients on many versions. You have concurrent support. You want to make a change but you don’t know if you are going to break anyone. Should you create a complex system of tags, SHAs or feature flags? Maybe. You could track where you’ve deployed your code and on what version with a spreadsheet (maybe automating later), but what about this particular problem of “did I break someone?”. With Git alone, the best you can do is react: track all these concurrent versions so that you can do the whole backporting and parallel support thing (which is expensive).
If you have a web app, you could use contract testing with pact to handle the “can-i-deploy” question (it even has this as a feature). But what if you have a CLI? Well, can’t we see the pattern here? Look beyond Git and see how Pact is approaching the problem. It’s parallel specs and you want to know if your change is going to break anything.
Of these things involved:
Only Backporting Code is related to Git and it’s really not that interesting.
The thing with git is: it’s almost always better to move toward your language tooling. A lot of communities have different values and different strengths. What is obvious in one is not so obvious in another. Tour around a little bit and sample. Bring back what you’ve learned.
31 Mar 2020
My new year’s resolution was not to do advice for one fiscal quarter. That ends tomorrow. I thought I’d share what I’ve learned and what happened generally.
Why? Advice to me is very uninvolved. It’s fire and forget.
“I won the lottery, use the numbers I did.”
This is advice. Give me a break. In addition to this, advice is exhausting to me because I worry the most after delivering advice. I continue to think about what we both said. Advice is inaccurate because I actually don’t know everything and your project or your life really is unique. I’ve learned this the most when answering the question “what’s the best git workflow?”. Advice is in demand; there is so much information and tooling available but very little filtering and very few recommendation engines. People want advice because they want to save themselves the time to find out. Advice is almost always sought ahead of time when we haven’t tried anything. Every once in a while, we want advice for when we are stuck but this just doesn’t seem to happen that much.
So my new year’s resolution was to not do advice of any kind for 3 months. How did it go? I failed miserably. I think I broke the rule 5 times. Even though I said no advice and explained it very plainly to people, I ended up telling stories as plain facts but would wander into summaries that were advice actions. You’d think advice would be easy to avoid: just never say “you should”. For example, an intern asked me if they should switch to CS in college. There’s no answer here but “you should” advice. I avoided it. I told them my early career stories and how I had very bad jobs in the early Internet / IT age. But ultimately this conversation and conversations like this would have me slip up with “you should study CS because it sounds like you’re into it”. Whoops.
But it was nice having this goal. I chimed in a lot less on Slack. I have essentially quit twitter so that’s that. I ignored flamewar bait of any kind, even pre-flame-war bait: “it’s going to turn into advice”. It was good practice. “My opinion doesn’t matter” was the general mindset and that was good.
Unfortunately, in the abstract world of software and sometimes being in a senior position, advice is what it comes down to. People are looking for optimization. They want lessons learned and stories. You have something to say and they know it. You can’t just be quiet. They want advice because they want to pre-learn. So do I! I want advice. I want to pre-learn instead of bumbling my way through first-hand experiences. But I also know that these questions are so insanely hard to answer without a few paragraphs of context and history.
These are questions that Quora will accept but Stackoverflow won’t. People want to know these things. But they are valid and terrible questions to me. Mostly because the answer is so long it would be a book. Not because the asker is stupid/evil but because software is too abstract and too rapid. If we could measure anything then there wouldn’t be bias and guesses, and we all wouldn’t have to give advice that ultimately changes with time and perspective.
Advice bit-rots and it wasn’t very good content to begin with. A real mystery is whether this experiment taught me anything. Will I chime in less? Will I avoid opinion threads? Will I try to die on hills / represent? I don’t know. It was a good experiment either way. More mindfulness.
04 Jun 2019
Take any decision, question, concept, idea, team, project or plan and sprinkle the concept of time on it.
It might seem obvious or silly but I’ll explain with some concrete tech/project examples in a bit. Almost everything exists within the concept of time. Only a few things don’t. Constants sometimes don’t. PI is a constant. It’s not obvious how time relates to PI. Except when you consider that no one has yet found its end. The final value of PI (if there is one) is constrained by time. The answer of what PI actually is (if it can have a complete identity) is constrained by time. The speed of light is another constant. It might seem like it is unrelated to time. But speed is a function of time. Speed is meaningless without time. You can sprinkle time on the speed of light but this isn’t really what I mean.
What I mean is, you should consider time if you are building something, asking questions, trying to figure out what is best or doing what I’d consider everyday engineering things. Sprinkle time on the thing. It’s always there. So add it back.
Let’s look at some common questions:
Someone: What programming language should I learn?
The question is too open-ended in many ways. Let’s put time related things back into the original question. We’ll ask the person for more detail.
Sprinkle Time on it:
- How experienced are you?
- What year is it right now?
- What has happened recently that is influencing the possible future?
- When do you need a job if that’s what you mean?
All those time details should always be packaged with that question. Without them, we are missing a dimension of the question and it barely makes sense. You can see this context/setup from well-phrased and experienced question-askers on stackoverflow. They set up the situation (present time) and maybe frame their constructed reality (their past).
Someone: What’s the point of a CI/CD pipeline? Doesn’t that slow down everything?
Maybe the person asking doesn’t even understand what CI is about. Add time to what they are doing now.
Sprinkle Time on it:
- What’s your employee turnover rate?
- How long does it take to onboard someone and have them deploy to prod?
- How many deploys do you do per day?
- How would you notice and guard against drift between tests and code over time?
We could do the same thing with discussions about waterfall, big design up front, feedback loops, bit rot and just change in general. Change is time. Time is change. Nothing exists outside of time. It’s a dimension in our very universe. And sometimes/somehow it is excluded or forgotten about in decisions, thinking and conversations. It’s easy to add back in. You just sprinkle it like a salt shaker. It’s like a seasoning. Sprinkle time on that thing!
Someone: I hate javascript.
I know what they mean to say but … what if it changes? Software can change. Software will change. Culture is forever but software usually changes.
Sprinkle Time on it:
- When’s the last time you used javascript?
- How long did you work at it? We tend to like things we understand (experience/time).
- What version are you talking about? Software can change.
So should you hate javascript forever? Should I bake my criticisms of any language into my brain forever? I understand chunking. But you need to tag these chunks as possibly out of date or have an expiration date. That memory chunk or bias you have is probably going to expire. Especially in software. Software’s whole advantage is that it can change. And that’s going to be change over time. I have a few niggles with Go 1.x (right now) but I know it could change (in the future).
Even this blog post needs time sprinkled on it. This very post will go out of date and bit rot. I have a blog post titled “Why Aren’t You Running Gentoo?”. A very low empathy start of a post from 2004. It’s not a relevant question to me anymore (time) and the tech landscape has changed (time). I have blog posts with the date of the post and no other information for this reason.
But to me, the biggest mistake I see is not considering time and ignoring that things can change. Take git history. You have 15 years of git commits on a web app going back to 2004. You have archived releases saved just in case you need to roll back. But why? Sprinkle time on that idea. Heartbleed came out in 2014. All your commits from before 2014 are useless now. What are you going to do? Boot an unpatched system, compile against vulnerable openssl and deploy it? The world has changed. You can’t go back. This is true of most security patches. It’s not that security flaws are being found. It’s that the world is changing. The state of the world does not stand still even for your core technologies. It’s not standing still for your business or features either. You can’t roll back your database that far. Feature flags, sure. But deleting whole columns or tables? Breaking the user experience? It’s forward-only. And that’s how time works.
Ok, that’s enough about that (time). I think you get the point (time).
08 Nov 2018
I’m reading the n-th article where someone mentions TDD (test-driven development) as a magic word that means “doing testing” or something else and I thought I’d write down a few things as a note. There have been nearly infinity plus one articles and discussions about testing and TDD. I don’t mean to pile on the old, dead and old-dead horse here but I just keep hearing language that makes me want to pull out a tobacco pipe near a fireplace and puff Well Actually … and that’s not great without helpful context and more detail.
So, let me TLDR this real quick. There is no right or wrong way to test if you’ve tried many different types and flavors and have your own personal preference. There probably is a wrong way to test your project if you have never had time, don’t care that much or someone sent you this link and this is just another opinion man.
The TLDR here is:
Me: Do you have tests?
Someone: Hehe, I know. We didn’t really have time for this project. Look, I joined late and …
No one does “No Testing” but people think they do. This is typing the code, never running it and shipping it to production or scheduling it to run for real. You never observe it running. No one does this but they think they have no tests when they only have manual tests.
Think about this with “hello world”. You would type hello world code, save it and put it somewhere as production. You would dust off your hands, say “another job well done!” and go home. No one does this. From here on out, this isn’t what I mean by testing vs no testing. By testing, I mean automated testing and that includes your local machine.
Pros:
Cons:
No one does No Testing. What they mean is Manual Testing. And this is the point about time. They didn’t have time then for automated tests and they are running manual tests now. Do they have time for manual tests now? Maybe. I’m fine with it as long as it’s an informed decision and it’s not causing bugs/outages/delays.
This is what most people do when they don’t have a test suite. You type hello world, run it, look at the output.
$ ./some_program
Hellow orld
"Whoops! Let me fix that real quick. Ok. I think I fixed it, let me see …"
$ ./some_program
Hello world
Great. It’s “done”. It worked right?
Pros:
Cons:
- You have to run ./some_program a lot

This is when there are tests but maybe only small coverage or one type (like unit tests only).
There’s a huge inflection between partial testing and manual testing. The manual testing project has never had time, doesn’t deeply care (or deeply know) and has had little positive experience (if any) with testing. There is a huge gap here in culture and minds. It could be developer minds, it could be boss minds, who knows. This is the mind-gap where you have to explain what testing is. This is the mind-gap where you try to tell stories and impart wisdom. This is where you try to explain the feelings of having tests at your back when you are trying to refactor or understand legacy code. Sometimes, you might sound crazy or religious.
Cutting the team some slack, maybe there are constraints. Usually there are constraints. Constraints can keep a project or team from making their tests better. Maybe the domain, language, history or some other constraint is keeping the tests from becoming better. But there is a test suite and someone cares or understands at least some aspect of testing.
Maybe people are trying their best. But I would argue that partial test teams haven’t deeply felt tests helping them stay on track and ship quality projects. If they can explain this blog post then I believe them. If they can’t, they haven’t had time yet and maybe they will. It’s not their fault but they also aren’t treating testing as a required tool of the trade.
Pros:
Cons:
Excuses can live here too. And some products are hard to test. But have these options been tried?
I would argue some of these things can be mitigated and you really need to reach for a lot of tooling and language features to fake some of this stuff. And maybe it’s hard to test everything. But that’s why you don’t need 100% test coverage. But yes, some projects are hard to test. Some code is hard to test too but sometimes that can be fixed and the developer learns a ton about refactoring, their language and design.
I had a project where I thought AWS services were hard to test and my app was breaking in weird ways every time I pushed it out. Then I researched a bit and found some tricks and my app wasn’t so different between my laptop tests and public reality.
Some form of complete testing where units are tested and the system is tested (like UI testing or regression tests).
Engineers on the team have a culture of including tests with work estimates and expectations. The organization supports this kind of work either explicitly or implicitly. It doesn’t really matter which. This is a purely practical decision and there is limited value in “abstract testing values”. Tests exist, that’s good enough.
This probably means the project runs many different test suites, where regression tests run occasionally but some other group of tests runs more often as code happens.
Your repo:
src/... # the source
test/... # unit tests (fast)
integration/... # end-to-end tests (slow)
This is trickier than it seems. That means that git commits, pushes, CI and other tools all have this culture baked in. You aren’t going to run integration tests all the time. Everyone has to know that to be the quickest they can be. Scripts to separate tests have to exist. You can’t just run all tests that match under test/*.
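As one concrete illustration (Go shown here, but every ecosystem has an equivalent), slow tests can flag themselves so the quick suite skips them:

// integration/checkout_test.go: skipped when the fast suite runs with -short
package integration

import "testing"

func TestFullCheckoutFlow(t *testing.T) {
    if testing.Short() {
        t.Skip("skipping end-to-end test in -short mode")
    }
    // ...drive the real system here...
}

Then go test -short ./... is the loop developers run all day, and the full un-flagged run is what CI does before anything ships.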
Where this ends is in philosophy and world view. There’s no perceived value (at least as far as schedule, job, work, too busy etc) in doing anything differently. As long as tests exist, that’s good enough. It stops production bugs.
Pros:
Cons:
You simply write your tests before you start coding the implementation. There’s a key difference from TDD that I will get to.
You can start top-down or bottom up. You can start a top-down test first:
# top-down starts with a story
When a user logs into the site
Then they see a logout button
Or you can start a bottom-up test first:
# bottom-up starts with low level details
describe "login controller returns a token on success"
It doesn’t matter. The thing you don’t do is write any code in src or lib.
You don’t even start. You don’t even spike. You write a test. Hopefully your
test is in a reaction to a conversation you just had with someone who is paying
you. Hopefully your quickness to writing a test captures a conversation in an
executable format that is checked in and lives and acts. Compare this with an
email or a Slack message which sits and rots.
Pros/Cons: I don’t know many projects doing purely this. I guess the pro is not being religious about letting tests drive the design.
You let the tests lead design decisions. This is hands-off-the-wheel dogmatic stuff. You limit your thinking really. Requirements are captured early and tests are written first. But more so, you let go of the wheel. If something is hard to test, that means the design is wrong. If a test can’t be written then you don’t start the feature. If you can’t start the test, you ask someone a question. See how testing is the thing?
You don’t really need to do TDD to have confidence before deployment. But it’s trying to fight human behavior. Almost every step is trying to fight some historically problematic behavior (except for manual and no testing).
You probably need a tight feedback loop, tooling and automation to make this happen. It’s also not the best way to test just because it’s at the end of this list.
Cons:
Pros:
Let’s say your flow is something like this:
While working, you might just run your tests or tests that are relevant to your feature/work. But before you ship it to production, you’re going to make sure you don’t break everything right? So, just have a tool that runs your tests. Have the tool tell you when it passes. Don’t deploy until it passes. Call this continuous integration (CI).
Now something else happens when you have continous testing. You can have continous deployment. So, tests passed and you have a build that those tests ran against right? Then that build is good to go! Why throw it away? Why deploy from your laptop? Deploy it! This is continous deployment (CD). Note that you don’t need to do CI or CD but testing is enabling you to do so.
After everything beyond manual testing, a non-obvious thing happens. Automation and tooling.
Let’s say your flow is something like this:
Would you have refactored and tried one more thing if you didn’t have tests?
Boss: “Hey, right now our calculator only has numbers on it. Could we put a plus sign on there and call it addition?” You: “Sure!”
You go to the repo and start work. You add some code to handle the addition, in this contrived world life is simple.
Ready to deploy? Great! Oops. Someone noticed that you forgot to add a test in case an emoji character shows up. Ok, write a test for that. What is your confidence like now? You have a lot of edge cases covered. Is it going to work?
How does this code work? What is it supposed to do? Maybe you have code comments (but they bit-rot), maybe you have language features like annotations, typespecs or something. Maybe you don’t. But how do you use the calculator from the previous example? Can it handle numbers that are strings? It can’t! Did you write that down? Can you yourself remember in a few months? Tests really aren’t docs but they are executable and they can stand in as docs, especially as API usage. So until you write docs (and maybe you won’t), tests can act as capability documentation, a la “What was the developer thinking”.
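A tiny, made-up illustration of that (the Add function and its limits are hypothetical):

// this test is the closest thing to documentation of what Add accepts
package calculator

import "testing"

func Add(a, b int) int { return a + b } // stand-in so the example is self-contained

func TestAddTakesIntsOnly(t *testing.T) {
    if got := Add(2, 2); got != 4 {
        t.Fatalf("Add(2, 2) = %d, want 4", got)
    }
    // There is deliberately no Add("2", "2") case: the calculator does not
    // take numbers as strings. Future-you learns that here, not from memory.
}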
You have this testing habit now, why not add some libraries?
When you run your tests:
Notice all these enablers and multipliers happening now that you have a test step.
Let me head back to the start of the spectrum. “We don’t have tests”
Compare the workflow where someone just types a program and copies it to a server and then closes their laptop. It seems insane from the very dogmatic land of TDD because their point of view is beyond just “having tests”. But that doesn’t mean it’s “best”. There is a knowledge gap between each of the points on the spectrum. I’m ok with ignoring parts of the spectrum for a project if it’s understood. If you can explain this blog post to me then we’re cool. If you cannot then I feel like there is a blind spot and any pain points the project is feeling are fair game to improve. If you can explain this post then I’ll chalk it up to semantics. If you cannot explain the spectrum to me, then the term TDD is being misused as “testing” and I’d like to explain and help because there is some wisdom to be shared.
There’s very little right or wrong in all this. I am trying to communicate that “no tests” does not mean “no testing”. You are probably doing manual testing. And for certain projects, who cares? Do you care? Do you see yourself hating the manual testing? Then automate it. Are you manually typing in a test user and password, clicking a login button and then clicking around to see if anything broke? Does that annoy you even to talk about it? Then automate it. Are you hand-crafting test data and sharing it with colleagues? Then automate the creation of your test data.
There’s no such thing as no testing and TDD is not completely required although I have enjoyed TDD or Test-First quite a bit when I’ve had the opportunity to use it as an approach. It’s not required to go all the way to TDD because testing is a gradient spectrum.
23 May 2017
Gb is a fantastic tool for Golang that lets you define dependencies but more importantly (to me) it lets you work out of a normal [1] src directory wherever you want. You don’t have to mess with $GOPATH and you don’t have to put your own creations next to libraries. You could even code directly in Dropbox if you wanted to be super lazy about source control and sharing. Overall, I really like gb for projects. It’s more normal compared to other languages and I don’t have to have Go be the exception to my project backups / paths / scripts / everything.
But I think examples are lacking. The gb docs are great, I’m not saying that. I just wanted to walk through growing a project from small to medium to large and see how organization changes. First, we’ll start by building a fake calculator with no working pieces so it doesn’t need a lot of organization. Then as we add features, we’ll pretend that it needs lots of separation and structure for future expansion and work.
You’ll need to install gb with go get. You probably already have it installed and you know how to google so I’ll just skip that stuff.
I’m going to use the terms small / medium / large but please note that doesn’t mean stupid / insignificant / important. These size terms are just for labeling and explanation, don’t read anything else into it. If you make a small project, it’s not “stupid” just as a large project is not automatically “important” [2].
First, a gb project is really just a directory with a src
directory in it. Of course, nothing will work
without some files for it to build. Below is the same error you’ll get even if you make gb_project/src
(which gb looks under for source files).
$ mkdir gb_project && cd gb_project
$ gb build
FATAL: command "build" failed: no packages supplied
So, delete that directory and let’s do something more useful.
Gb wants a subdirectory for a package under src to tell it what to build. For our examples let’s make a pretend calculator.
Our working directory is going to be pretend_calculator. This can be anywhere. Under your home, tmp or Desktop. Put it wherever you want. Just assume we’re in pretend_calculator as the project root after this point.
$ mkdir -p pretend_calculator/src/calculator
Let’s write minimal code for this to build.
// src/calculator/calculator.go
package main
func main() {
}
$ cd pretend_calculator
$ gb build
calculator # showing us gb built the pkg, I'm going to omit this output from here on out
So our project tree looks like this:
.
├── src
└── calculator
└── calculator.go
When you gb build, it will create a binary ./bin/calculator that doesn’t print anything (not surprising, our main is empty). This project layout isn’t that great because the main is really a cmd. If we wanted to add more than one executable, we’d have to change where the main() is and rename a few directories and files. So this isn’t great even for an equivalent of Hello World: it’s hard to tell where func main() is if you just look at the filesystem.
$ tree -I pkg
.
├── bin
│ └── calculator
└── src
└── calculator
└── calculator.go
So let’s make this more obvious. Let’s create the start of a simple gb project with a command entry point.
In this case, we want some actual code that runs something. We’ll have everything in one file under cmd/. Later, we’ll move some code out to a package as the project examples grow in size. The cmd folder in gb projects tells gb to build binaries of the same name as the file or the package. It’s the executable we’re going to run from ./bin.
Now this is a bit tricky. If you name your source file src/cmd/calculator.go then you’ll get a binary called cmd. So what I’d do is name it something like src/cmd/calculator/main.go just to show that this is where the main lives for this binary. You can name the file something other than main.go but it needs to be in a subdirectory. The gb docs are a bit vague in their example tree output describing this. Also, note that binaries will always show up in ./bin. So I’m skipping that output in the tree listings.
// src/cmd/calculator/main.go
package main
import "fmt"
func main() {
fmt.Println("Calculator Fun Time™")
fmt.Printf("2 + 2 = %d\n", 2+2)
}
.
└── src
└── cmd
└── calculator
└── main.go # <-- file can be named anything, needs .go extension
$ gb build && ./bin/calculator
Calculator Fun Time™
2 + 2 = 4
So this is a nice layout for a small CLI app with not too much logic that would be ok to put into a single file under cmd. If I wanted to break it apart more where the entry point (the main) and the app logic and functions were separated and kept organized, I’d use the medium project layout which we’re going to talk about next.
You could also just add functions to main.go to keep that file clean and then move the functions around to different packages later.
Let’s move some of the app logic to another file and package. This can be super confusing and yet it’s the most common thing to do (in my opinion) when working with Go projects. We’re going to make an add function in a new file and a new package called calculator. Note that this package is sort of arbitrary, it doesn’t need to be your project folder name or anything. Packages are subfolders under src. This will be more clear in the next gb project examples.
// src/cmd/calculator/main.go
package main
import (
"fmt"
"calculator" // <- this is really our local package in src/calculator/*
)
func main() {
fmt.Println("Calculator Fun Time™")
result := calculator.Add(2, 2)
fmt.Printf("2 + 2 = %d\n", result)
}
// src/calculator/calculator.go
package calculator
// Let's not name them num1 and num2 if we can :)
func Add(number int, addend int) int {
return number + addend
}
.
└── src
├── calculator
│ └── calculator.go
└── cmd
└── calculator
└── main.go
$ gb build && ./bin/calculator
Calculator Fun Time™
2 + 2 = 4
Note that we would be planning on putting all functions into src/calculator/calculator.go here. If we wanted to only put the Add function into src/calculator/add.go, we could do that. In the context of a medium sized Go project, we might not want to do that. Also note that the main.go needs to import calculator. This refers to the package we created. If we want sub-packages and more sub-division, we can do that but we’ll get to that in a bit.
Just a reminder, my label of large is very arbitrary.
Ok, now what if we want to add more functions and packages? We can continue to do so across files and packages. Let’s add subtraction and the concept of memory storage (you know the MR button?).
Adding subtraction is the same as addition. We just add a Subtract function to src/calculator/calculator.go with a capital letter to export it. It’s the same as Add. We could split this out to different files if we wanted. Maybe that’s more interesting. We’ll do that in the next example.
// src/calculator/calculator.go
package calculator
// Nothing changes here.
func Add(number int, addend int) int {
return number + addend
}
// Subtract 1 from 4 is 3.
func Subtract(from int, number int) int {
return number - from
}
Let’s add memory storage. We need to create a struct to store stuff in. So our memory.go code is going to have a struct initializer in it. The function naming is just Go convention, nothing here is specific to gb.
// src/calculator/memory.go
package calculator
type memory struct {
register int
}
func NewMemory() memory {
return memory{
register: 0,
}
}
// MR means memory recall, it returns the contents of a number in memory
func (m *memory) MR() int {
return m.register
}
// MS means memory store, it stores a number (normally would be the screen)
func (m *memory) MS(number int) {
m.register = number
}
We only export the NewMemory function to keep people from creating structs themselves.
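From outside the package the unexported struct is unreachable, which is the point. A quick hypothetical scratch command shows it:

// src/cmd/scratch/main.go (hypothetical)
package main

import "calculator"

func main() {
    // m := calculator.memory{} // won't compile: cannot refer to unexported name
    m := calculator.NewMemory() // the only way to get one
    m.MS(42)
    _ = m.MR()
}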
Using this struct in main.go for the command goes like this:
package main
import (
"fmt"
"calculator"
)
func main() {
fmt.Println("Calculator Fun Time™")
add_result := calculator.Add(2, 2)
fmt.Printf("2 + 2 = %d\n", add_result)
subtract_result := calculator.Subtract(1, add_result)
fmt.Printf("%d - 1 = %d\n", add_result, subtract_result)
fmt.Println() // spacing
// memory functions
memory := calculator.NewMemory()
fmt.Println("Storing the result in memory ...")
memory.MS(subtract_result) // store
fmt.Printf("Memory has <%d> in it.\n", memory.MR()) // recall
}
Running this now produces:
$ gb build && ./bin/calculator
Calculator Fun Time™
2 + 2 = 4
4 - 1 = 3
Storing the result in memory ...
Memory has <3> in it.
Our current tree structure looks like this. We are doing file organization at this point but we really still have one package (other than main).
.
└── src
├── calculator
│ ├── calculator.go
│ └── memory.go
└── cmd
└── calculator
└── main.go
What does a more complicated project look like?
Let’s split out every function into a file to make the project very easy to navigate. Intuition should drive file search. Add and subtract will go in their own files. We’ll add the concept of a tape to display information that will also have the opportunity to save state that will make the memory feature more realistic to how physical calculators work.
I said we’d break out functions into intuitive files. Let’s put Add() into add.go.
// src/calculator/function/add.go
package function
func Add(number int, addend int) int {
return number + addend
}
And the same for Subtract
// src/calculator/function/subtract.go
package function
func Subtract(from int, number int) int {
return number - from
}
Next, let’s make a tape.
// src/calculator/tape/tape.go
package tape
import "fmt"
// represents an empty memory instead of using nil which does not communicate well
const emptyRegister = 0
// For simplicity's sake, the calculator tape is essentially the entire electronics
// of this fake calculator. A tape probably wouldn't care about current previous
// values for undo functionality.
type tape struct {
lastNumber int
CurrentNumber int
}
func NewTape() tape {
return tape{
lastNumber: emptyRegister,
}
}
func (s *tape) Clear() {
s.CurrentNumber = emptyRegister
}
// Updates the internal state of the tape
func (s *tape) Update(number int) {
s.lastNumber = s.CurrentNumber
s.CurrentNumber = number
}
// Displays the current number
func (s *tape) Display(message string) {
fmt.Printf("| %-22s|%7.1f|\n", message, float32(s.CurrentNumber))
}
// Just print a blank line like the calculator tape is advancing
func (s *tape) Advance() {
fmt.Printf("|%31s|\n", "")
}
// Roll the tape back, behaves kind of like one-time undo
func (s *tape) Rollback() {
s.CurrentNumber = s.lastNumber
s.lastNumber = emptyRegister
}
func formatNumber(number int) string {
if number == emptyRegister {
return " "
} else {
return fmt.Sprintf("%d", number)
}
}
It’s very similar to the last examples, just more code. We have some types and structs in this file but you can see that anything Capitalized is expected to be used externally. The package has no hierarchy but later when we use it, we’ll need to alias it.
The changes from the last project are simpler than they seem. All we did was:
- Create src/calculator/function. The package is now calculator/function.
- Split Add and Subtract into files named add.go and subtract.go. We don’t explicitly need to care about this when importing.
- Declare package function at the top of those files. You can’t declare package calculator/function at the top. Doing that won’t even pass go fmt, it will error.
- Memory.go stays the same, it’s in the root calculator package just because.
// src/cmd/calculator/main.go
package main
import (
"fmt"
"strings"
"calculator"
fn "calculator/function"
tape "calculator/tape"
)
func main() {
fmt.Println("Calculator Fun Time™")
fmt.Println(strings.Repeat("-", 32))
tape := tape.NewTape()
tape.Update(fn.Add(2, 2))
tape.Display("2 + 2")
tape.Update(fn.Subtract(1, tape.CurrentNumber))
tape.Display("Subtract 1")
tape.Advance()
// memory functions
memory := calculator.NewMemory()
memory.MS(tape.CurrentNumber) // store
tape.Display("Hit Memory Store")
tape.Clear()
tape.Display("Cleared screen")
tape.Update(fn.Add(10, memory.MR()))
tape.Display("Add 10 to memory")
tape.Advance()
// rollback feature
tape.Clear()
tape.Update(fn.Add(1, 1))
tape.Display("1 + 1")
tape.Update(fn.Add(1, tape.CurrentNumber))
tape.Display("+ 1")
tape.Rollback()
tape.Display("Rolled the tape back")
}
Our main file has expanded dramatically as we try to exercise the new packages and files we’re making.
We need to give calculator/function an alias (in this case fn) to use a hierarchical package. It’s very arbitrary. We still are using Memory out of the calculator package so we need to import that explicitly like we were before. If you wanted to break memory out, you’d follow what we did with add & subtract.
Our tree now looks like this:
.
└── src
├── calculator
│ ├── function
│ │ ├── add.go
│ │ └── subtract.go
│ ├── memory.go
│ └── tape
│ └── tape.go
└── cmd
└── calculator
└── main.go
Running it shows how main.go works. It might be easier to skim the code and just read the output. It’s a very contrived example but more “real”.
$ gb build && ./bin/calculator
Calculator Fun Time™
--------------------------------
| 2 + 2 | 4.0|
| Subtract 1 | 3.0|
| |
| Hit Memory Store | 3.0|
| Cleared screen | 0.0|
| Add 10 to memory | 13.0|
| |
| 1 + 1 | 2.0|
| + 1 | 3.0|
| Rolled the tape back | 2.0|
I hope this was interesting. I’ve been wanting an article like this to exist ever since I started using gb as a tool. I’ve found example gb projects on github that were useful examples but believe me when I say I’m blogging all this for myself as a future reference. Shoot me a note on twitter if you liked this or would like to see something else, it’s nice to know who’s reading.
[1] Nothing is normal
[2] I prefer better/worse over good/bad. In this case it'd be smaller and larger, which is awkward. The only rock we have to stand on in C.S. is metrics, everything else is opinion (like this very statement!).
10 Apr 2017
I wrote about setting up a Dev Log about two years ago. At this point, I’ve been using this setup for two years so I thought I’d write a little bit about it as a follow up. After all, I hate uninvolved advice.
I’ve learned that the dev log works as a pensieve. It’s a dumping ground for code snippets and dreams. I found it a good outlet for frustrations too. But most importantly, it’s like an archeology site. Let me give you the best payoff of the dev log as a small story.
We use an external API for mobile stats tracking. It will track installs and other things
from the app store. It’s wired up to our own API through a webhook. This webhook has a
URL configuration. Originally, it was something like http://api.example.com/...
and it had
a payload and other such details. We hadn’t received data from them in many, many months.
I started looking at this but basically had no context other than this.
Of course, the first question in debugging is what has changed. So what did change? I didn’t think anyone had messed with the config in months because essentially this service was a set and forget kind of thing. We also hadn’t received Android or iOS data on the same day. Too suspicious, too convenient. So, I knew the day it stopped working. Let’s go to the dev log!
What did we find on the day that it stopped working:
Switch to temporary SSL redirects by default
Later, there are some clues that we were working on making the temporary redirects into permanent redirects. There are notes all around this timeframe that we were working on making the site SSL-only. Ah.
Change the webhook to https and bam, we start getting rows in the database of payloads.
The URL didn’t jump out to me as wrong. It’s obvious now but the dev log helped trigger
some clues around this. The clues were also in the git log but not the surrounding context
that we were working on making the site SSL-only.
It especially didn’t seem wrong as redirects are supported. It turns out the service we are using
doesn’t handle redirects (or at least seems to not). Just looking at the URL as http://
doesn’t
seem wrong at all. But with the dev log context, it does. This is what changed.
Just as when you don’t have a pair and you need to be the “high level person” and the “low level person” all at once, I’ve found that my complaints and frustrations come off TO MYSELF as whining. This is amazing. Let me say this again.
Logging frustrations in my dev log comes off as “whining” to myself later.
I still think this is good if it’s a healthy outlet. It’s not good if it lets you polish your whining so it can be delivered as a pithy zinger to an unsuspecting listener. The dev log is about capturing your thoughts. Be careful what your thoughts are, you might get what you want. I still like to capture task changes as this represents time lost or spent. Maybe this sounds like whining in the log but that’s ok.
Don’t tag or organize your thoughts into an ontology or fancy structure. The idea is to get in and get out. One friend is good enough with org-mode that he was able to structure his log more than me. That’s fine. Make it your own. But don’t start making per-project logs, I think it would just self-destruct under ceremony burdens. The dev log is something I write in during context switches. Get in and get out. See my previous post for shortcut keys.
I however would leave clues for myself like LEARNED: or TIL which could be used for retros. Or PR: S3 refactor if I opened a pull request. The idea is to capture what you have been doing or what your time is being spent on. I capture interruptions or helping someone too. That’s a great thing to jot down when you first come back to your desk or switch from Slack.
Helped Dumbledore with Docker
10435 lines of text and two years. My intention or goal was never length. It was always the pensieve.
Reading back on it is a massive log of bugs, TILs, tech gotchas and a frustration heat-sink. There are
face-palm mistakes, logs of miscommunications, “this library doesn’t do that” notes and details.
2017-02-28 - Tuesday
Trying to do a deploy, S3 goes down hard and breaks the Internet.
There are rabbit hole results with fantastic details right before you come back from the rabbit hole:
Envconsul won’t work for us because of our combination of unicorn zero downtime deployment configs, how we want to handle ENV restarts and a limitation of Go. Envconsul won’t work because it does in fact restart the app correctly if -once is passed and you -QUIT the worker. But since -once has been passed, you can’t reload the environment.
https://github.com/hashicorp/envconsul/issues/52
This is the detail I wanted to capture so I can chunk it later as “we can’t use envconsul” and then I can just text search for this later. This is how it actually worked many times.
25 Feb 2017
So let’s say I have some CLI I want to exist …
The concrete example I’m going to use is my previous blog post about Slop where I demonstrated how to use the slop gem. The code in that post is slightly contrived and certainly not clean but I think it demonstrates how to test CLI scripts which suffer from some testability problems (how do you capture STDOUT?). The thing that it does not demonstrate is long term maintenance problems that happen after it’s written once for a blog post.
Code review aside, this desire to have a binary CLI was inspired by a very real work situation where we had a CLI utility and not surprisingly it was damaged by some gem and dependency problems. Mainly, if I use (consume as a user) the slop gem it’s in my bundle. If my list of gems grows forever, eventually I might want to develop another gem that uses slop as a dev. So now I need to use RVM’s gemsets or gem_home or otherwise keep my gems and projects sandboxed. Because (as it did happen) pry uses slop, and when pry stayed behind it caused slop problems between projects. Distributing this gem to our team was problematic because different people used different gem isolation tools.
So … uh … what if I just want a CLI? Why can’t I just live and die in /usr/local/bin like “normal” unix-y utilities do?
So for the past few years I’ve experimented with Go as a tool in the toolbelt for the above problem. It has fast compile times, can cross-compile to other cpu types and you can get a binary even for a web service. Shipping a binary for an api service sounds pretty neat! However, it lacks high-level density (usually called expressiveness). So without starting a language war, what if I want something in-between loose shell scripts and strict compiled C (not that I’m specifically talking about shell or C)?
Ruby is so close to shell script sometimes and then you can drop into the “real stuff” for the heavy lifting and then just continue in happy script land. I feel like a lot of shell script problems align with this flow. Looping over images and doing mass conversion for example. It’s just a little bit of heavy algorithm surrounded by a lot of shell stuff, which is great. So Ruby has been fine in that way. But then not fine for it to live in $PATH.
Go as an experiment has been fine while I’ve sought a panacea for $PATH. Go has a lot of interesting things in it and I’m not giving up on it. But porting isn’t real. Rewriting is real. Porting Ruby to Go is a rewrite. You really need to go back to requirements / thinking and you will feel tempted to refactor. It’s closer to rewriting I mean. It works the other way too. I’ve seen “Java in Ruby” in a lot of libraries.
There’s no such thing as porting. Only rewriting.
I’ll show otherwise later.
So if I make a hello world CLI in Ruby called utility, how do I share it?
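For concreteness, here's a minimal sketch of what that hypothetical utility might look like (my illustration, not from the original post; it assumes the slop gem from the earlier post and a made-up --name flag):
#!/usr/bin/env ruby
# utility - a hypothetical hello-world CLI (sketch)
require 'slop' # a gem dependency, which is exactly why my environment suddenly matters

opts = Slop.parse do |o|
  o.string '--name', 'who to greet', default: 'world'
end

puts "hello #{opts[:name]}"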
Here I list the dependencies that are implied in the top box. In ruby there are many. Many times they aren't listed or described. If you are a ruby dev, you just know that things start with bundle exec; you probably have it aliased. If you aren't, you are confused and probably don't use the thing because the README didn't work.
Maybe this above in the middle is the source code I'm trying to share. Scripts can be committed with file permissions, so the chmod on the left isn't entirely needed. What is definitely needed is some path setup, which may or may not already be configured. I suppose you could put utility into /usr/local/bin but then it's like an oddball exception: brew list won't show it and it'll never be updated. You'll just have to remember you installed utility as a one-off? Uhh …
Basically it boils down to this:
“Please install a dev environment” vs “Please use a package manager.”
You can see that on the left I’m basically asking a user to install a dev environment for a Ruby program. And then as time progresses, what happens to that dev environment? Does it bit-rot? Does homebrew break it?
And maybe you might say "just gem publish". Phusion used to do this for passenger, and so did logstash, but then they stopped. Using rubygems to distribute ruby code is sometimes done, but it's also sometimes frowned upon. I'm not sure exactly why and I don't have a source, although codegangsta kinda hints at it.
This isn’t a ruby problem. The same thing happens with node & python. But when I run into a utility written in Go, I breathe a sigh of relief.
It’s written in golang. Woo! This should be easy to install and run.
Worst case it's a go get. Sometimes it's a brew install. I think these mechanics keep people from packaging ruby utilities into homebrew. I know there are packages that help with this, like Phusion's tool and FPM, but I just don't see that a lot. Most of the time the README just says gem install but skips all the context that I diagrammed up there. Even my own projects blow up on me. Sometimes I have to reset bundler and ruby (OSX upgrades). Then I'm missing a gem.
# My own goddamn project
~ > whatthefi
~/bin/whatthefi:80:in `main':
uninitialized constant Slop::Parser (NameError)
from ~/bin/whatthefi:136:in `<main>'
The fix: gem install slop. I had already done bundle before, but cleaning out gems, upgrading homebrew, upgrading to Sierra or switching from rbenv/chruby/rvm and back and forth can leave this script "broken".
So, what to do? I just want a command in my path. Do I have to switch languages?
25 Nov 2016
The Enthusiast Trench is a metaphor for a topic/hobby/community/pastime that can't easily be observed and understood by outsiders without a similar amount of interest or involvement from the curious party.
There isn't just one Enthusiast Trench. There are many trenches and they are easy to find if you are walking on the surface of the earth. It's like the concept of rabbit holes, except that rabbit holes (or rabbit holing) is usually a pejorative about wasting time. Enthusiast Trenches are about interest, enthusiasm and the hidden nature of the payoff in these things until you spend enough time to appreciate them. At that point, you are in the trench and now you are unable to explain to outsiders what you have learned and witnessed in the Enthusiast Trench. The trench in The Enthusiast Trench metaphor isn't a pejorative. It isn't related to dirt or digging. Enthusiast Trenches aren't good or bad.
Anything where you can't explain why it is fun is probably an Enthusiast Trench. When a person has to resort to metaphors, they are trying to think of things that surface people have seen and use those as stand-ins for things they have seen underground in the Trench.
If you listened to someone talk about why they built a life-sized Lost in Space blinking computer replica they might tell you “it was fun” but if you probe “why” then they are going to have a rough time explaining it. The raw answer in their head is probably something like:
I didn’t think I’d be able to get the neon bulbs refresh time to be precise enough to look like the original Burrows props. But, after I did some tests and talked with some friends that I met (and have become good friends with since), I knew I could get the full scale version working. Then it was just a matter of time …
The Trench isn’t this project or this person. It’s the whole community of people doing projects like this. The Trench hides the real “why” behind a time and interest wall.
A community where mods, hacks or extensions are plentiful is a strong indicator of an Enthusiast Trench. The important thing about an Enthusiast Trench is not whether something is or isn't one; it's that it can't be easily appreciated from the outside.
I can think of a lot of examples but some of the biggest trenches are the ones that are abstract and not physical. Photography is one but it can be demonstrated physically (maybe not the process but the product). The abstract trenches are really tricky. So, naturally, being a software person I can think of a lot of software trenches.
A working irc client in minecraft using mods.
A raid-proof base in Rust (a survival/building game), designed in an external CAD program with mods.
A development board with the PCB shape of a Lego minifig
These examples pictured above are easily demonstrable because they are visual or physical. Abstract things are not.
This is true for software libraries in every language I can think of. Maybe I'm not in some of these communities. Maybe I haven't been in the communities for a long time. I might ask the question "what are modern libraries to use in Java these days?" This is like calling down to someone in the trench after you have left. People are extending tunnels that can't easily be explained.
Python Trench: “oh, nobody uses urllib2, everyone uses requests and there’s this great requests addon that makes uploads so easy, it really ..” (etc etc).
Maybe software libraries aren’t purely fun. But people can be enthusiastic about them because they are amazing in their eyes. If you are an outsider, you won’t be able to see the fun in the interior tunnels of their trench.
There is definitely a relation to the fear of missing out (FOMO). You could feel bad about not being in all trenches and many times I do. I don’t want to encourage FOMO. I don’t want to give FOMO any more fuel. I don’t really have a solution to FOMO and really that’s a different topic.
I follow the Cities: Skylines subreddit but I don't play the game. I know people are having fun. I sort of understand the game mechanics and the game loop. But there are a lot of mods and deep mechanics I don't get. This is true of a lot of games with "mods". The community is digging its own trenches from within a trench by extending the game. But I really don't grok the fun.
Sometimes, I just let the weight of the trenches flow over me and appreciate the complexity. Like looking at a landscape from really far away. It’s beautiful to me because I am missing the details.
19 Apr 2016
If you are working on a gem that itself uses slop (your gem depends on slop), then you might run into this error when adding pry, because the latest published pry gem uses slop 3.6 but you are probably using slop 4. Slop 4 and 3 aren't the same API.
require 'my_cool_gem_im_working_on'
Gem::ConflictError: Unable to activate my_cool_gem_im_working_on-0.2.0,
because slop-3.6.0 conflicts with slop (~> 4.2)
from .../rubygems/specification.rb:2284:in `raise_if_conflicts'
On bundle install you'll probably get a different error.
Resolving dependencies...
Bundler could not find compatible versions for gem "slop":
In snapshot (Gemfile.lock):
slop (= 4.2.1)
In Gemfile:
my_cool_gem_im_working_on was resolved to 0.2.0, which depends on
slop (~> 4.2)
pry (= 0.10.1) was resolved to 0.10.1, which depends on
slop (~> 3.4)
Running `bundle update` will rebuild your snapshot from scratch, using only
the gems in your Gemfile, which may resolve the conflict.
This is true for pry 0.10.2 too. There are two options I've found that work:
tl;dr Do this
Install 0.10.3 or newer. Make sure your bundle is resolving to that exact version. This means
# your Gemfile
gem "pry", "= 0.10.3"
in your Gemfile. If you are working on a gem and don’t really have a Gemfile but have a gemspec file then put this dev dependency in your gemspec.
# your .gemspec file
spec.add_development_dependency "pry", '= 0.10.3'
You could also install pry from github master. This might show up as 0.10.3 depending on when you are reading this. Version numbers only increment when pry does a release. I found that pry git master did not have this issue.
Now the problem here is: if you are working on a gem yourself, you don't have a Gemfile. Afaik, you can't point a gemspec at github source (that wouldn't make sense because you are going to distribute a gem!). But perhaps you want pry temporarily in your gemspec like this:
# your_gem.gemspec
spec.add_development_dependency "pry", '=0.10.3'
Here’s how you can install a gem from source in a gemspec temporarily.
# do what you want here but I clone into a global place called ~/src/vendor
mkdir -p ~/src/vendor
cd ~/src/vendor
git clone https://github.com/pry/pry
cd pry
gem build pry.gemspec
# it will spit out a pry gem with a version on it
gem install pry-0.10.3.gem # or whatever .gem file was created
Now we have pry 0.10.3. Bundler doesn't care that it came from pry master, so when it picks up on the spec.add_development_dependency it will install the version you already have. BUT BIG PROBLEM: you probably don't want to commit this, because people will get the same error you got on bundle install if that version doesn't resolve. As far as I can tell, this pry version works with slop, so perhaps you just want to use 0.10.3 and be done with it. I just wanted to illustrate how you can manipulate bundler.
The reason this is happening is the Slop namespace. Pry fixed this in a commit associated with that issue. It's fixed because they inlined the gem as Pry::Slop, so now Slop (your version) doesn't conflict or get activated.
Hope this saves someone’s day! :)
06 Apr 2016
I had an older post about ruby and slop but that's with Slop 3, which is basically locked to Ruby 1.9. No doubt this post will bitrot too, so please pay attention to the post date. The current ruby is about 2.3.0, slop 4.3 is current, it's 2016 and the US election cycle is awful.
Update: this CLI has since been ported to Crystal as an example of that process: Porting a Rubygem to Crystal.
I think the most confusing thing about slop is that it has great examples and documentation, but when you try to break this apart in a real app with small methods and single responsibilities, some things sort of get weird. I think this is because of exception handling as flow control, but I'm not sure enough to say slop is doing something wrong that makes this weird. In my example I refer back to MY OWN BLOG quite often for slop examples, so it's ok if you need help.
Let's look at the example from the README.
opts = Slop.parse do |o|
o.string '-h', '--host', 'a hostname'
o.integer '--port', 'custom port', default: 80
o.bool '-v', '--verbose', 'enable verbose mode'
o.bool '-q', '--quiet', 'suppress output (quiet mode)'
o.bool '-c', '--check-ssl-certificate', 'check SSL certificate for host'
o.on '--version', 'print the version' do
puts Slop::VERSION
exit
end
end
I disagree with -h here for hosts. I think -h should always be help. This is especially true when switching contexts. When I switch to java or node or go or python, I have no idea what those communities' standards are. I rely on what unix expects: dash aitch. I also disagree with this example because figuring out how to handle -h for help is the most confusing thing about using slop, since you have to use exceptions as flow control (sort of an anti-pattern).
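To make that concrete, handling help and bad flags tends to end up looking roughly like this (a sketch, assuming Slop 4's Slop::UnknownOption and Slop::MissingArgument error classes; this isn't from the README):
require 'slop'

begin
  opts = Slop.parse do |o|
    o.string '--host', 'a hostname'
    o.on '--help' do
      puts o # printing the option set prints the usage text
      exit
    end
  end
rescue Slop::UnknownOption, Slop::MissingArgument => e
  # the errors end up doubling as flow control
  puts e.message
  exit 1
end

puts opts.to_hash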
15 Mar 2016
Thoughtbot has an excellent and much desired article on getting Docker + Rspec + Serverspec wired up, but I couldn't find anything about images generated from Packer. Packer generates its own images, so we can't just build_from_dir('.'). Our images are already built at that point. We're using Packer to run Chef and other things beyond what vanilla Docker can do.
The fix is really simple after I was poking around in pry looking at the serverspec API.
First of all, what am I even talking about? Serverspec is like rspec for your server. It has matchers and objects like
describe file('/etc/passwd') do
it { should exist }
end
describe file('/var/run/unicorn.sock') do
it { should be_socket }
end
So although we have application tests of varying styles and application monitors, serverspec allows us to test our server just like an integration test before we deploy. I had previously tried to go down this route with test kitchen to test our chef recipes but it was sort of picky about paths. Additionally, going with serverspec and docker doesn’t even require Chef. Chef has already been run at this point! What this means is fast tests. Just booting a docker image and running a command is fast.
# single test
$ time bundle exec rspec
1.415 total
Nice!
So how does this work? Well, like I said, the thoughtbot article is really good but I wanted to add to the 'net about packer specifically. The critical piece to make Serverspec work with a Docker image created from Packer is in your spec itself (spec/yer_image_name/yer_image_name_spec.rb).
# spec_helper and a lot of spec/ came from `serverspec-init`
require 'spec_helper'
require "docker"
describe "yer_packer_image" do
before(:all) do
image = Docker::Image.get("yer_packer_image")
set :os, family: :debian # this can be set per spec
# describe package('httpd'), :if => os[:family] == 'redhat' do
# it { should be_installed }
# end
set :backend, :docker
set :docker_image, image.id
end
it "has bash installed" do
expect(package("bash")).to be_installed
end
end
See that image = Docker::Image.get("yer_packer_image") bit in the before block? This is the difference between building my image (what the thoughtbot article does) and running an existing image. Since packer builds the image, we can just reuse the one we have from our local store. Then set :docker_image, image.id sets the image to use during the test. It knows about docker because of require "docker" from serverspec. I'll mention what versions of these gems I'm using at the time of this post since this might bit-rot.
docker-api (1.26.2)
rspec (3.4.0)
serverspec (2.31.0)
specinfra (2.53.1) # from serverspec
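If you wanted to pin roughly those versions in a Gemfile, it might look like this (a sketch; the constraints just mirror the versions listed above):
# Gemfile (sketch)
gem 'docker-api', '~> 1.26'
gem 'rspec', '~> 3.4'
gem 'serverspec', '~> 2.31'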
Ok this is cool! How about we have packer run our tests after the packer build? Unfortunately this is mostly useless. :( The tests will run, but failing tests won't do anything.
Here's the post-processor bit of our packer config. It just tells Packer to do things after it's done building. The first bit tags our image so we can push it out to our registry.
"post-processors": [
[
{
"type": "docker-tag",
"repository": "your-company/yer_packer_image",
"tag": "latest",
"force": true
},
{
"type": "shell-local",
"inline": ["bundle exec rspec ../../spec/yer_packer_image"],
"_useless": "don't do this"
}
]
]
The path structure above is arbitrary. We have a project we're currently working on that I'll explain in another blog post or talk. The only specific thing about this file structure is that typically you'd want to do something like require 'spec_helper', but if you are building an image from a subdirectory and then running tests from another nested subdirectory, you'll need require_relative 'spec_helper'. I actually don't know why this isn't the default anyway.
But like I said, running tests with Packer as a post processor doesn’t do anything. You could run it with PACKER_DEBUG or something but I don’t like any of that. I’ll be following up with a more complete workflow as we figure this out. So you don’t need to do this last bit with the post-processors. I just wanted to leave a breadcrumb for myself later.
12 Feb 2016
Sidekiq Enterprise has a rate limiting feature. Note that this is not throttling. The perfect use case is the exact one that’s mentioned in the wiki: limit outbound connections to an API. We had a need for this between two of our own services. I spiked a little bit and I thought the behavior was interesting so I thought I’d share.
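I won't reproduce the wiki here, but the limiter's shape is roughly this (a sketch from memory, assuming Sidekiq Enterprise's Sidekiq::Limiter.concurrent API; the worker, name and numbers are made up):
class OutboundApiWorker
  include Sidekiq::Worker

  # allow at most 10 concurrent calls to the downstream service (hypothetical limit)
  LIMITER = Sidekiq::Limiter.concurrent('outbound-api', 10, wait_timeout: 5)

  def perform(payload_id)
    LIMITER.within_limit do
      # call the other service here
    end
  end
end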
20 Jan 2016
At one point a while back, I had a config file outside a rails app and what I wanted was something like this:
Given this mapping definition
/order/:meal/:cheese
how can I turn these strings into parsed hashes? /order/hotdog/cheddar -> {meal:'hotdog', cheese:'cheddar'}
I knew that something in Rails was doing this. I just didn’t know what. I also didn’t know what assumptions or abstraction level it was working at.
The gem that handles parsing the routes file and creating a tree is journey. Journey used to be (years ago) a separate gem but is now integrated into action_dispatch, which itself is a part of actionpack. So to install it you need to gem install actionpack (or use bundler), but to include it in your program you need to require 'action_dispatch/journey'. If you have any rails 4+ gem installed on your system, you don't need to install anything. Action pack comes with rails.
require 'action_dispatch/journey'
# reorganize pattern matches into hashes
def hashify_match matches
h = {}
matches.names.each_with_index do |key, i|
h[key.to_sym] = matches.captures[i]
end
h
end
pattern = ActionDispatch::Journey::Path::Pattern.from_string '/order/(:meal(/:cheese))'
matches = pattern.match '/order/hamburger/american'
puts hashify_match matches
matches = pattern.match '/order/hotdog/cheddar'
puts hashify_match matches
# {:meal=>"hamburger", :cheese=>"american"}
# {:meal=>"hotdog", :cheese=>"cheddar"}
We have to have hashify_match reorganize our objects because this is what pattern.match returns:
irb(main):001:0> matches = pattern.match '/order/hamburger/american'
=> #<ActionDispatch::Journey::Path::Pattern::MatchData:0x007f9d4d527aa0
@match=#<MatchData "/order/hamburger/american" 1:"hamburger" 2:"american">,
@names=["meal", "cheese"],
@offsets=[0, 0, 0]>
So we have to turn these ordered matches into a hash.
irb(main):001:0> matches.names
=> ["meal", "cheese"]
irb(main):002:0> matches.captures
=> ["hamburger", "american"]
We could also zip the results together but we wouldn’t have symbolized keys.
irb(main):001:0> Hash[matches.names.zip(matches.captures)]
=> {"meal"=>"hamburger", "cheese"=>"american"}
You could symbolize them easily within a rails app or by including active support.
require 'active_support'
require 'active_support/core_ext'
Hash[matches.names.zip(matches.captures)].symbolize_keys
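# => {:meal=>"hamburger", :cheese=>"american"}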