I am thrilled to announce a preview release of Tertestrial, an open-source tool that makes running automated tests as part of developing software more natural and seamless.
why we need to run tests a lot
In TDD and BDD (test-driven and behavior-driven development) our tests don’t just specify and verify that code works; they drive the entire development workflow. Instead of starting with code, we write tests first. Then we let the error messages from those failing tests tell us the next thing to do with our code base. We implement the smallest amount of code that makes that error go away (this is often just a few lines of code) and run the test again to verify that this actually worked. If there is a new error message, we make that one go away via another code change. If the test finally passes, we write more tests and start over. When there are no more tests to write, we are done coding this feature.
Letting tests drive development splits up the work between us and the computer. We do what humans are good at and enjoy doing (i.e. creatively solving problems and designing elegant solutions), and the computer takes over what it is good at: repetitive verification of correctness and conformance with specifications, never missing a single detail. With that much support from our test suite we can code with more certainty and focus, and make bold strides with confidence. Our co-pilot, the test runner, confirms the correctness of every small or large change, right when we make it. Bugs have mere seconds to live before they are recognized and squashed.
This means we have to run tests a lot: several times per minute, dozens or even hundreds of times per hour! And we don’t want to run the full test suite all the time. Ideally, we run only the one test that describes the currently missing behavior of our code base, so that we get a single useful signal on what to do next, as quickly as possible.
Let’s look at the various ways to run tests during development and compare their pros and cons.
running tests manually
Most developers do TDD by running tests manually. The steps to do that look somewhat like:
- determine the file path and line number of the test you want to run.
- [cmd]-[tab] to the terminal
- type something like rspec spec/foo/bar_spec.rb:27 and hit Enter
- [cmd]-[tab] back to the editor
Even with auto-completion by your shell for each segment of the file path, that’s quite a bit of technical data to remember and a lot of typing on the command line. While our thoughts are deep in code, thinking about how dozens of variables and algorithms fit together, we don’t want to distract ourselves with test file paths and line numbers, but prefer to keep getting stuff done in our editor!
Running the test a second time is much simpler:
- [cmd]-[tab] to go to the terminal
- [cursor up], then [Enter]
- [cmd]-[tab] back to the editor
A lot less, but still a sequence of 4 keystrokes — all pretty far away from the home row on our keyboard.
auto-running tests on file save
Some developers use auto-runners like Guard to run the “appropriate” test on each file save. That’s a good step towards embedding testing more tightly into the development workflow, but it comes with a number of shortcomings:
- This always runs all the tests in a file, even if only one test is currently driving the development (the one red test that we try to make green).
- Tests run too frequently this way: on every file save, no matter what.
- The tests that are running change frequently, depending on which file is modified.
- This can trigger slow tests that you don’t want or need executing right now, delaying your ability to run other tests and thereby your TDD flow.
- It isn’t straightforward to run a particular test.
- This requires sophisticated configuration (mappings of regular expressions of file paths to logic that determines the corresponding test file and how to execute it).
- In order for the auto-runner to find the corresponding test for a code file, tests must be structured in exactly the same way as the production code base. This pattern is useful but falls apart for end-to-end tests.
Overall, I rarely get a good signal-to-noise ratio out of such tools in real-world scenarios, i.e. I find them at least as annoying as they are useful.
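To give a feel for the configuration burden described above, here is a minimal sketch of the kind of path mapping an auto-runner needs. It is not Guard's actual configuration format; the RULES table, the directory layout, and the command_for helper are hypothetical and only illustrate the idea of mapping regular expressions over file paths to test commands.

```python
import re
from typing import Optional

# Hypothetical rules: a pattern for the saved file, and a template for the
# test command to run. The first matching rule wins.
RULES = [
    # A spec file was saved: run that spec file itself.
    (re.compile(r"^(spec/.+_spec\.rb)$"), "rspec {0}"),
    # A production file was saved: run the spec that mirrors its path.
    (re.compile(r"^lib/(.+)\.rb$"), "rspec spec/{0}_spec.rb"),
]


def command_for(saved_file: str) -> Optional[str]:
    """Return the test command for a saved file, or None if no rule matches."""
    for pattern, template in RULES:
        match = pattern.match(saved_file)
        if match:
            return template.format(*match.groups())
    return None


print(command_for("lib/foo/bar.rb"))
print(command_for("spec/foo/bar_spec.rb"))
print(command_for("features/checkout.feature"))
```

The last call returns None: an end-to-end test has no production file that mirrors its path, so a rule table like this has nothing to map it to.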
Another way to run only relevant tests is to configure your test runner to only run tests marked with a tag like :focus in the source code. This can work but requires a good amount of typing before a test can run, as well as adding and removing these tags in a lot of places. Let's say you start with an end-to-end test that you run using this tag. To make it pass, you need to TDD some code using unit tests. Before you can run these unit tests, you have to go back to your end-to-end test, remove the "focus" tag from it, go back to your unit tests, add the "focus" tag there, then trigger another test run. When your unit tests pass, you have to remove the focus tag from them and add it back to the end-to-end test. If you count this all up, it doesn't save you many keystrokes compared to simply running tests manually. Plus, not every test runner supports this technique.
You can also use solutions that are more specialized for particular editors like vimux-ruby-test, vim-rubytest, Emacs-runtests, or SublimeSBT. They work well, but only for one type of editor and a few test frameworks and languages. Configuring or expanding them is often complex and requires using your editor’s scripting language. And since they run inside your code editor, they can block it while running the tests, or change how it displays your code in distracting ways.
After trying all the options mentioned above to no satisfaction, we built a set of open-source tools that takes the next step on the journey towards seamless test-driven development. It is called Tertestrial and offers a way to run a test (or a set of tests) with zero or one keystroke from within your editor as you work on your code. Tertestrial is based on the following principles:
- There is always a test that is driving your development. That is the test you want to run. As you drill deeper, you run more specific tests, but at any time there is always exactly one test that is telling you the next thing to do.
- Good developers often have the relevant test open in the editor, right next to the code that it describes. Chances are that your cursor is also somewhere inside that test!
With Tertestrial you can run the test you are currently working on in your editor via a hotkey:
Compared to running tests manually, you save the hassle of switching to the terminal and entering the test to run. Now it’s just one keystroke to run a test the first time, and then 0 or 1 keystrokes to run it again! You always run just the one test that is relevant, and only when you want it run.
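The idea can be sketched in a few lines. This is not Tertestrial's actual protocol or configuration: the RunRequest message, the COMMAND_TEMPLATES table, and the Session class are hypothetical names, and the sketch only builds command strings instead of executing them. It assumes the editor reports the current file and, optionally, the cursor line.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class RunRequest:
    """Hypothetical message from an editor plugin: current file and cursor line."""
    filename: str
    line: Optional[int] = None


# Hypothetical mapping from file extension to a test command template.
COMMAND_TEMPLATES = {
    ".rb": "rspec {filename}",
    ".feature": "cucumber {filename}",
}


def build_command(request: RunRequest) -> Optional[str]:
    """Turn an editor request into a test command, or None for unknown files."""
    suffix = PurePosixPath(request.filename).suffix
    template = COMMAND_TEMPLATES.get(suffix)
    if template is None:
        return None
    command = template.format(filename=request.filename)
    if request.line is not None:
        # Both rspec and cucumber accept "path:line" to select a single test.
        command += f":{request.line}"
    return command


class Session:
    """Remembers the last command so it can be repeated without new input."""

    def __init__(self) -> None:
        self.last_command: Optional[str] = None

    def run(self, request: RunRequest) -> Optional[str]:
        command = build_command(request)
        if command is not None:
            self.last_command = command
        return command

    def repeat(self) -> Optional[str]:
        return self.last_command


session = Session()
print(session.run(RunRequest("spec/foo/bar_spec.rb", 27)))
print(session.repeat())
```

The first call stands for the hotkey pressed with the cursor inside a test; repeat stands for re-running that same test later, even when the cursor has moved on to production code.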
who is the winner?
Let’s tally up the different solutions against each other:
The table visualizes how Tertestrial combines the advantages of the different existing approaches without incurring significant disadvantages. It thereby helps you to really lean on your tests while developing code, and let them drive out feature after feature from the outside in, with very little debugging. There are plugins for the most popular code editors. And since Tertestrial performs almost all business logic on the server, editor plugins are very simple and more editors can be supported relatively easily. Configuration happens via a YAML file that can be scaffolded by a setup assistant. Please give it a try!
- Tertestrial server: https://github.com/Originate/tertestrial-server
- Vim client: https://github.com/Originate/tertestrial-vim
- Emacs client: https://github.com/dmh43/emacs-tertestrial
- Atom client: https://github.com/charlierudolph/tertestrial-atom
next steps for Tertestrial development
Tertestrial is already a pretty useful co-pilot for any software developer the way it is right now. The next steps in this area could include more intelligent selection and triggering of test runs based on which changes have been made (“oh, you just changed a comment, no point in running the test again”), which files have uncommitted Git changes, test coverage for the current line, which tests other developers have run when working on that line/function/class, etc.
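As a rough illustration of the first idea (skipping a test run when only a comment changed), here is a minimal sketch. It is not an existing Tertestrial feature. It assumes the changed file is Python source and uses the standard tokenize module; the function names are hypothetical.

```python
import io
import tokenize
from typing import List, Tuple


def code_tokens(source: str) -> List[Tuple[int, str]]:
    """Return the tokens of Python source, ignoring comments and blank-line tokens."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # COMMENT is the comment itself; NL is the non-logical newline that
        # ends a comment-only or blank line.
        if tok.type in (tokenize.COMMENT, tokenize.NL):
            continue
        tokens.append((tok.type, tok.string))
    return tokens


def only_comments_changed(before: str, after: str) -> bool:
    """True if the text differs but the code tokens are identical."""
    return before != after and code_tokens(before) == code_tokens(after)


old = "x = 1  # old note\n"
print(only_comments_changed(old, "x = 1  # new note\n"))
print(only_comments_changed(old, "x = 2  # old note\n"))
```

A test runner could consult such a check before re-running tests on save. Whitespace-only edits inside a line also compare as equal here, because only token types and strings are compared.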