User-level Feature Specs With Cucumber

Kevin Goslar
Mar 10, 2019



Applications are not just collections of technology. They are designed to provide meaningful functionality within the user’s domain of experience. To achieve that, they encapsulate complex technical implementations under intuitive, human-friendly user interfaces.

Accordingly, the specifications for that functionality should also be written at the level of user experience, with the underlying technical implementation encapsulated.

Cucumber is often misunderstood by software developers as an unnecessary detour from expressing feature specs more directly in code. This blog post shows that Cucumber’s code and language patterns emerge naturally when organizing/refactoring complex feature specs.

This substantiates the understanding of Cucumber as a set of patterns, tools, and programming languages specialized for expressing feature specs on the same semantic level as the functionality they describe, the level of user experience.


Feature specifications (aka functional or integration tests) are an essential part of TDD. To verify that our application as a whole works, we fire up the complete application stack as well as a scriptable interaction device (a browser or a mobile application simulator). Using the latter, we simulate users interacting with our app (clicking links and buttons, filling out forms, etc.) and check that our application, treated as a black box, exhibits the correct behaviors (displays the correct responses, sends the right messages to other apps, etc.). These feature specs can even drive the development of their features.

For simple feature specs, we often don’t need anything beyond a fixture and mocking library together with a UI driver. As feature specs grow in size, however, expressing complex user interactions using only these intentionally low-level tools becomes increasingly cumbersome. Here is a representative example: the feature spec for changing the password of a user account in a typical web application. We use Ruby, RSpec, Capybara, and Factory Bot.
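A spec of this kind might look roughly like the sketch below. So that it runs with plain Ruby, a small in-memory `FakeBrowser` (hypothetical, written only for this illustration) stands in for the application, Capybara, and Factory Bot. Only the method names `visit`, `fill_in`, `click_button`, and `click_link` are borrowed from Capybara's DSL; the paths, labels, messages, and the `change_password_spec` name are assumptions.

```ruby
# Hypothetical in-memory stand-in for the app plus a Capybara-like driver.
class FakeBrowser
  attr_reader :text

  def initialize(users)
    @users = users # email => password, playing the role of fixtures
    @fields = {}
    @path = nil
    @current_email = nil
    @text = ''
  end

  def visit(path)
    @path = path
    @fields = {}
  end

  def fill_in(label, with:)
    @fields[label] = with
  end

  def click_link(_label)
    @current_email = nil
    @text = 'Logged out'
  end

  def click_button(_label)
    if @path == '/login'
      email = @fields['Email']
      ok = @users.key?(email) && @users[email] == @fields['Password']
      @current_email = ok ? email : nil
      @text = ok ? 'Welcome' : 'Invalid credentials'
    elsif @path == '/password' && @current_email
      @users[@current_email] = @fields['New password']
      @text = 'Password changed'
    end
  end
end

# The low-level spec: every click and keystroke is spelled out.
def change_password_spec(page)
  page.visit '/login'
  page.fill_in 'Email', with: 'jane@example.com'
  page.fill_in 'Password', with: 'old-secret'
  page.click_button 'Log in'
  raise 'expected welcome page' unless page.text == 'Welcome'
  page.visit '/password'
  page.fill_in 'New password', with: 'new-secret'
  page.click_button 'Save'
  raise 'expected confirmation' unless page.text == 'Password changed'
  page.click_link 'Log out'
  page.visit '/login'
  page.fill_in 'Email', with: 'jane@example.com'
  page.fill_in 'Password', with: 'old-secret'
  page.click_button 'Log in'
  raise 'old password still works' unless page.text == 'Invalid credentials'
  page.visit '/login'
  page.fill_in 'Email', with: 'jane@example.com'
  page.fill_in 'Password', with: 'new-secret'
  page.click_button 'Log in'
  raise 'new password rejected' unless page.text == 'Welcome'
  :passed
end

puts change_password_spec(FakeBrowser.new({ 'jane@example.com' => 'old-secret' }))
```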

Did you understand what this quite massive and cumbersome spec verifies? How does changing the password work? How long did it take you to understand all that? How much low-level source code did you have to read, parse, and execute in a virtual browser in your head in order to derive how the application is supposed to behave here? And that was still a relatively small, simple, and straightforward feature!

Although the spec nicely lists all the individual steps for changing a user’s password, it is too low-level. It is hard to see how the product actually works, and I am not confident from just looking at this that we didn’t forget to check something. This is merely what a developer thought the product should do, expressed in ways only a developer understands. But like all people, developers occasionally misunderstand requirements or translate them incorrectly into code.

commented groups

As a start, let’s group related steps together and add some comments.

Great, this has already made it clearer what we actually do here! But comments in front of blocks of code are an indicator that a method does too much (more than one thing), and that new methods want to emerge here. Also, this method is too long, and this code is not reusable. For example, when testing other scenarios, we don’t want to duplicate the code for logging in.

extracting reusable methods

Let's extract reusable methods. Doing so also gives us a chance to remove a now unnecessary comment, because the respective code piece is now self-describing.
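As a sketch of this step, the login, password-change, and logout sequences can each move into their own method. `login_with` and `change_my_password_to` are names used later in this post; `logout`, the `RecordingPage` stand-in (which only records driver calls so the example runs without a browser), and all paths and labels are assumptions.

```ruby
# Hypothetical stand-in that records driver calls instead of driving a browser.
class RecordingPage
  attr_reader :calls

  def initialize
    @calls = []
  end

  def visit(path)
    @calls << [:visit, path]
  end

  def fill_in(label, with:)
    @calls << [:fill_in, label, with]
  end

  def click_button(label)
    @calls << [:click_button, label]
  end

  def click_link(label)
    @calls << [:click_link, label]
  end
end

# Reusable helpers: each one names a piece of user interaction.
def login_with(page, email:, password:)
  page.visit '/login'
  page.fill_in 'Email', with: email
  page.fill_in 'Password', with: password
  page.click_button 'Log in'
end

def change_my_password_to(page, new_password)
  page.visit '/password'
  page.fill_in 'New password', with: new_password
  page.click_button 'Save'
end

def logout(page)
  page.click_link 'Log out'
end

# The scenario body shrinks to a handful of helper calls.
page = RecordingPage.new
login_with page, email: 'jane@example.com', password: 'old-secret'
change_my_password_to page, 'new-secret'
logout page
login_with page, email: 'jane@example.com', password: 'new-secret'
puts page.calls.length
```

Four readable helper calls expand into twelve driver calls, and `login_with` can now be reused by any other scenario that needs a logged-in user.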

The scenario is now more concise and reads better. And the extracted methods make sense. But it feels like we aren’t quite there yet, and there is more we can do here.

I bet most of my feature specs have to create a user and then log in as that user. Let’s combine those steps into one.

Also, our spec now contains two separate levels of abstraction: comments describe the higher-level end-user perspective, i.e. what people want to do with the product, and the corresponding code blocks represent the respective technical implementation, i.e. how to do these things. Our current feature spec mixes these levels inconsistently:

  • Comments and methods like change_my_password_to are on the high-level end-user perspective.
  • Code like create :user is on the technical implementation level.
  • Methods like login_with are in between: they already encapsulate pieces of end-user interaction, but need to be combined with other steps to form full end-user interactions.

All of that smells bad, so let’s keep refactoring.

separate product perspective from implementation

Let’s make it so that our scenario solely describes the high-level end-user perspective, and all the technical implementations are encapsulated in helper methods.
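A minimal sketch of that separation follows. The scenario at the bottom contains only end-user intent, while a helper class holds the technical details. `FakeApp` is a hypothetical in-memory stand-in for the real application stack, and every method name except `change_my_password_to` is an assumption chosen for illustration.

```ruby
# Hypothetical in-memory stand-in for the application under test.
class FakeApp
  attr_reader :current_user

  def initialize
    @passwords = {} # email => password
    @current_user = nil
  end

  def create_user(email, password)
    @passwords[email] = password
  end

  def login(email, password)
    ok = @passwords.key?(email) && @passwords[email] == password
    @current_user = ok ? email : nil
    ok
  end

  def logout
    @current_user = nil
  end

  def change_password(new_password)
    raise 'not logged in' unless @current_user
    @passwords[@current_user] = new_password
  end
end

# Helper layer: the technical implementation lives here.
class PasswordScenario
  EMAIL = 'jane@example.com'

  def initialize(app)
    @app = app
  end

  def create_user_and_login_with_password(password)
    @app.create_user(EMAIL, password)
    raise 'login failed' unless @app.login(EMAIL, password)
  end

  def change_my_password_to(new_password)
    @app.change_password(new_password)
  end

  def log_out
    @app.logout
  end

  def verify_i_can_no_longer_log_in_with(password)
    raise "#{password} still accepted" if @app.login(EMAIL, password)
  end

  def verify_i_can_log_in_with(password)
    raise "#{password} rejected" unless @app.login(EMAIL, password)
  end
end

# The scenario itself now reads as end-user intent only.
scenario = PasswordScenario.new(FakeApp.new)
scenario.create_user_and_login_with_password 'old-secret'
scenario.change_my_password_to 'new-secret'
scenario.log_out
scenario.verify_i_can_no_longer_log_in_with 'old-secret'
scenario.verify_i_can_log_in_with 'new-secret'
puts 'scenario passed'
```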

Some parts of our scenario try to sound a bit too much like English to be actual method names. They are too long. This isn’t well-factored, technically sound source code. We shouldn’t start naming our methods like that in the rest of the code base.

And it still doesn’t really come together. It doesn’t form a cohesive user story. It’s not clear why we do all these steps, and what we are actually testing here. That creating users works? That passwords can be changed? That logging in still works after a password has been changed?

Part of that is because such concepts have to be explained, but this is still nowhere near real intuitive English. Trying to make a general-purpose programming language sound like a natural language only gets us so far. In my experience, it will always feel like putting lipstick on a robot, and there is no good solution here.

describing the product part in plain English

Ultimately, it is questionable whether a general-purpose programming language is the most appropriate tool here altogether. Feature specs don’t contain complex algorithms, loops, code paths, or inheritance. They don’t even require functions or variables per se. Feature specs just express a number of linear user interactions with an application, expressed from a non-technical human perspective.

We only described our scenario in code because its underlying implementation is technical, and as developers, code is our hammer. But not everything requires code. Let’s try something closer to natural language: Gherkin.
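A Gherkin version of the password scenario could read as in the sketch below. The wording of the feature is illustrative, not the original spec. In a Cucumber project this text would live in its own `.feature` file; it sits in a Ruby heredoc here only so the example stays in one language. The hypothetical `steps_of` helper pulls out the step lines to show that a scenario is a plain linear list.

```ruby
# Illustrative feature text (assumed wording).
FEATURE = <<~GHERKIN
  Feature: Changing the password

    As a user who suspects that my password has leaked
    I want to be able to change it
    So that my account stays under my control

    Scenario: changing my password
      Given I am logged in with the password "old-secret"
      When I change my password to "new-secret"
      And I log out
      Then I can no longer log in with "old-secret"
      But I can log in with "new-secret"
GHERKIN

# Matches a step line: a Gherkin step keyword followed by the step text.
STEP_LINE = /\A\s*(Given|When|Then|And|But)\s+(.+)\z/

# Returns [keyword, text] pairs for every step line in the feature text.
def steps_of(feature_text)
  pairs = feature_text.each_line.map do |line|
    match = STEP_LINE.match(line.chomp)
    [match[1], match[2]] if match
  end
  pairs.compact
end

steps_of(FEATURE).each { |keyword, text| puts "#{keyword.ljust(5)} #{text}" }
```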

Wow, that feels like a breath of fresh air. We expressed our interactions with the application in perfect English. For the first time, it’s absolutely clear what we are actually doing and verifying here, and why.

Gherkin is part of Cucumber. Let’s see how the corresponding step definitions look. If you wonder about Kappamaki below, it converts textual lists into collections.
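To make the mechanism concrete, here is a toy sketch of how step definitions bind English sentences to code: each definition pairs a regular expression with a block, and captured groups become block arguments. The `Given`, `When`, and `Then` methods below are hand-written stand-ins that only mimic the shape of Cucumber-Ruby step definitions; `run_step`, `ScenarioContext`, `FakeApp`, and the step wording are assumptions, and Kappamaki is left out.

```ruby
# Hypothetical in-memory stand-in for the application under test.
class FakeApp
  attr_reader :current_user

  def initialize
    @passwords = {} # email => password
    @current_user = nil
  end

  def create_user(email, password)
    @passwords[email] = password
  end

  def login(email, password)
    ok = @passwords.key?(email) && @passwords[email] == password
    @current_user = ok ? email : nil
    ok
  end

  def logout
    @current_user = nil
  end

  def change_password(new_password)
    raise 'not logged in' unless @current_user
    @passwords[@current_user] = new_password
  end
end

# Toy registry: a list of [pattern, block] pairs (not Cucumber's own code).
STEP_DEFINITIONS = []
def Given(pattern, &body); STEP_DEFINITIONS << [pattern, body]; end
def When(pattern, &body);  STEP_DEFINITIONS << [pattern, body]; end
def Then(pattern, &body);  STEP_DEFINITIONS << [pattern, body]; end

# Runs the first definition whose pattern matches the step text.
def run_step(context, text)
  STEP_DEFINITIONS.each do |pattern, body|
    match = pattern.match(text)
    return context.instance_exec(*match.captures, &body) if match
  end
  raise "undefined step: #{text}"
end

ScenarioContext = Struct.new(:app) # shared state for one scenario
EMAIL = 'jane@example.com'

Given(/^I am logged in with the password "(.+)"$/) do |password|
  app.create_user(EMAIL, password)
  raise 'login failed' unless app.login(EMAIL, password)
end

When(/^I change my password to "(.+)"$/) do |new_password|
  app.change_password(new_password)
end

When(/^I log out$/) do
  app.logout
end

Then(/^I can no longer log in with "(.+)"$/) do |password|
  raise 'old password still accepted' if app.login(EMAIL, password)
end

Then(/^I can log in with "(.+)"$/) do |password|
  raise 'new password rejected' unless app.login(EMAIL, password)
end

context = ScenarioContext.new(FakeApp.new)
[
  'I am logged in with the password "old-secret"',
  'I change my password to "new-secret"',
  'I log out',
  'I can no longer log in with "old-secret"',
  'I can log in with "new-secret"'
].each { |text| run_step(context, text) }
puts 'all steps passed'
```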

These are the same high-level product-perspective methods we had before, just with more descriptive English names. The bodies are almost identical to the ones written in Ruby. The reusable helper files don’t change at all.

As we can see, Cucumber provides facilities to represent the abstractions that naturally emerge in well-factored, complex feature specs. And it allows us to represent them in a more appropriate format than a general-purpose programming language can. Other advantages are:

  • Product experts can verify that feature specs describe the correct application behavior, resulting in better team play between the product and development departments.
  • User stories can be written directly in Gherkin. This means one less conversion step from product description to code, which means one less opportunity for things to get lost in translation. And fewer meetings.
  • Feature specs can be understood and executed by both machines and humans. Automation allows catching bugs and regressions earlier, thereby making everybody’s life easier. Knowing that this happens, Quality Assurance (QA) personnel no longer have to do the boring and repetitive task of re-verifying already-tested functionality, but can instead focus on finding new issues and ensuring that the product looks correct.

I hope this makes it clearer that Cucumber, as a platform for intuitive, user-level feature specifications, provides value to the entire agile organization, including the development team. It allows for better functional testing than general-purpose programming languages and should be a part of most serious agile projects.

Robust and mature Cucumber implementations are available for Ruby, JavaScript, the JVM, Python, .NET, and many other platforms. You can even develop cross-platform Android and iOS specs with it.

No more low-level Gherkin that merely wraps individual interaction steps. That’s what Capybara is for. Cucumber is a high-level specification layer written from the end-user perspective, on top of the underlying technical implementation.

The future is green, friends!