Front End Testing: Integration vs Unit
- Dale Fukami

- Oct 31, 2024
- 7 min read
- Updated: Nov 12, 2024
Recently we were asked for our thoughts on how we approach front end testing. This is, of course, a vast topic that's difficult to cover completely, but this series will demonstrate some of the ideas and principles that guide our decisions when testing on the front end. We'll use React in our examples, as that's where most of our experience lies, but the underlying fundamentals behind why we choose certain tactics should apply in most front end environments.
Integrated testing versus unit testing
This is the third post in the series and discusses some of the tradeoffs involved in deciding whether to write integrated tests or unit tests. Much like the previous post, the topic itself isn't specific to front end testing, but we'll look at it through the lens of a front end project.
This is a topic with a lot of nuance, starting with the difficulty of defining "integrated" and "unit". For the purposes of this discussion I like the definition of an integrated test provided by J. B. Rainsberger:
> any test whose result (pass or fail) depends on the correctness of the implementation of more than one piece of non-trivial behavior.
Of course, even that definition leaves a lot of room for variance. What counts as "non-trivial", for example? We could nit-pick definitions forever, so for now keep this definition in mind and know that much of this judgment comes from experience. You don't need to agree with it in exact terms; the important thing is the conversation and the trade-offs.
With all that out of the way, let's dive into the examples. We'll start by continuing our exploration of the todo list project shown in the previous posts. Our starting point can be found here. In particular, note that we're now showing the `TodoList` component, which allows for creating new tasks as well as rendering the list of existing tasks.
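If you haven't read the previous posts, here's a rough sketch of the shape the rest of this post assumes: a `Task` that renders a single item and publishes `onComplete`/`onIncomplete` when its name is clicked, and a `TodoList` that owns the task state. This is not the real project code; the prop names (`initialTasks`), the `complete` class, and the combined listing are my own stand-ins, and the task-creation form is omitted.

```tsx
// A simplified sketch of the starting point, shown as one listing for brevity.
import React, { useState } from 'react';

export interface TaskModel {
  id: string;
  name: string;
  complete: boolean;
}

export interface TaskProps {
  task: TaskModel;
  onComplete: (id: string) => void;
  onIncomplete: (id: string) => void;
}

// Task.tsx -- renders one task; clicking the name publishes onComplete or onIncomplete.
export function Task({ task, onComplete, onIncomplete }: TaskProps) {
  const handleClick = () =>
    task.complete ? onIncomplete(task.id) : onComplete(task.id);
  return (
    <li className={task.complete ? 'complete' : ''} onClick={handleClick}>
      {task.name}
    </li>
  );
}

// TodoList.tsx -- owns the task state and renders the list (task creation omitted here).
export function TodoList({ initialTasks = [] }: { initialTasks?: TaskModel[] }) {
  const [tasks, setTasks] = useState(initialTasks);
  const setComplete = (id: string, complete: boolean) =>
    setTasks((ts) => ts.map((t) => (t.id === id ? { ...t, complete } : t)));
  return (
    <ul>
      {tasks.map((t) => (
        <Task
          key={t.id}
          task={t}
          onComplete={(id) => setComplete(id, true)}
          onIncomplete={(id) => setComplete(id, false)}
        />
      ))}
    </ul>
  );
}
```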
Initial Tests
We'll start with an initial integrated test:
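The original test isn't reproduced here, so here's a minimal sketch of what it might look like, assuming React Testing Library and Jest, the component shapes from the sketch above, and made-up sample data:

```tsx
// TodoList.test.tsx -- integrated: renders the real Task and clicks its real HTML.
import React from 'react';
import { render, screen, fireEvent } from '@testing-library/react';
import { TodoList } from './TodoList';

test('clicking a task name marks it complete', () => {
  render(<TodoList initialTasks={[{ id: '1', name: 'Buy milk', complete: false }]} />);

  // Click the actual node that the Task component renders.
  fireEvent.click(screen.getByText('Buy milk'));

  // Assert against the rendered output of the whole component tree.
  expect(screen.getByText('Buy milk').className).toContain('complete');
});
```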
This test is integrated because it dives into the implementation details of the `Task` component in order to accomplish its goal.
Now more of a "unit" test:
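Again, the original isn't shown, so here's a sketch of one way to write it: mock the `Task` module with Jest, capture the props that `TodoList` passes to it, and drive the `onComplete` callback directly. The prop names and data are assumptions carried over from the sketch above.

```tsx
// TodoList.test.tsx -- unit: Task is mocked, and we drive its interface directly.
import React from 'react';
import { render, act } from '@testing-library/react';
import { TodoList } from './TodoList';
import { Task } from './Task';

jest.mock('./Task', () => ({ Task: jest.fn(() => null) }));
const mockTask = Task as unknown as jest.Mock;

test('marks a task complete when Task reports completion', () => {
  render(<TodoList initialTasks={[{ id: '1', name: 'Buy milk', complete: false }]} />);

  // Grab the props TodoList passed to the mocked Task and fire its event.
  const props = mockTask.mock.calls[0][0];
  act(() => {
    props.onComplete('1');
  });

  // After the re-render, the Task should receive the updated task.
  const lastProps = mockTask.mock.calls[mockTask.mock.calls.length - 1][0];
  expect(lastProps.task.complete).toBe(true);
});
```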
Note that the difference here is that we're mocking the `Task` component and triggering events directly through the `Task` component's interface rather than clicking HTML nodes that the `Task` would render.
At this point it's pretty easy to argue that the integrated test is preferable:

- The unit test version that mocks dependencies requires two sets of tests to provide a similar level of coverage. Both the `TodoList` tests and the `Task` tests are required to ensure everything is working properly.
- The integrated test performs actions more similar to how a user truly interacts with the page. A user doesn't "fire an `onComplete` event"; they click the task name.
Let's take a look at a few changes that might occur as the app grows.
Add a checkbox to the Task
A new feature has been requested: render a checkbox with each task, and have clicking that checkbox also change the status of the task.
First, the integrated test version:
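A sketch of what the addition to the integrated test file from earlier might look like. It finds the checkbox via an aria-label, which I'm assuming here is simply the task name (more on the aria-label below):

```tsx
// Added to TodoList.test.tsx -- drive the new checkbox through the real DOM.
test('clicking the checkbox changes the status of the task', () => {
  render(<TodoList initialTasks={[{ id: '1', name: 'Buy milk', complete: false }]} />);

  // The checkbox is located via its aria-label (assumed to be the task name).
  fireEvent.click(screen.getByRole('checkbox', { name: 'Buy milk' }));

  expect(screen.getByText('Buy milk').className).toContain('complete');
});
```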
Now the unit test version:
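And a sketch of the corresponding `Task` unit tests: the checkbox behavior, including its rendered state, is covered at this level while the `TodoList` unit tests stay untouched. As before, React Testing Library, the prop names, and the sample data are assumptions.

```tsx
// Task.test.tsx -- the new behavior is covered at the Task level.
import React from 'react';
import { render, screen, fireEvent } from '@testing-library/react';
import { Task } from './Task';

test('clicking the checkbox fires onComplete for an incomplete task', () => {
  const onComplete = jest.fn();
  render(
    <Task
      task={{ id: '1', name: 'Buy milk', complete: false }}
      onComplete={onComplete}
      onIncomplete={jest.fn()}
    />
  );

  fireEvent.click(screen.getByRole('checkbox', { name: 'Buy milk' }));

  expect(onComplete).toHaveBeenCalledWith('1');
});

test('renders the checkbox checked for a complete task', () => {
  render(
    <Task
      task={{ id: '1', name: 'Buy milk', complete: true }}
      onComplete={jest.fn()}
      onIncomplete={jest.fn()}
    />
  );

  const checkbox = screen.getByRole('checkbox', { name: 'Buy milk' }) as HTMLInputElement;
  expect(checkbox.checked).toBe(true);
});
```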
Comparing the two changes, we can see a few things:

- The changes to the tests are reasonably similar, just in different locations.
- The changes to the `Task` component would be pretty much the same in either case. One difference is that I added the aria-label so that the integrated test could find the right node to click. I believe there are probably better ways to do this with proper HTML label usage or something; pardon my ignorance here.
- The integrated test file is starting to get a little large now, at 126 lines long.
- What if we had another high level component that was rendering tasks? We'd have to update that test as well.
- The unit test version didn't require any changes to the `TodoList` tests. It is blissfully unaware that functionality changed underneath it because the actual interface (in this case the events) remained the same. So long as the `Task` component still publishes those events, there are no changes required in the `TodoList` tests.
[Edit] I realized while editing that I had forgotten some tests in my integration test. If you compare the unit changes to the integration changes you'll notice that I missed the tests verifying that the checkbox state was rendered correctly in the integrated tests. I wish I could prove that this wasn't contrived.
Event Refactoring
Some changes come from developers themselves. In this case, I don't like the `onComplete` and `onIncomplete` events published by the `Task` component. One of my design philosophies is that components at that level should only show information and publish user events to parent components, but not contain any "business logic". At first glance it seems like that's what it's doing, but the fact that the events are named `onComplete` is, to me, business logic. We are presuming that clicking on the task means we want to change the state a certain way. That is, in my opinion, a form of business logic. So instead I'd like to consolidate those two events into `onActioned`. That is, "hey, the user explicitly tried to perform an action on this task". That could be clicking, hitting enter after tabbing to it, etc.
Sidebar: I considered `onClick` but that feels too specific to the mouse. `onSelect` was another idea but it doesn't quite seem to fit what the user intends, IMO.
Here are the code changes (view all the changes for this section here):
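The full diff isn't reproduced here, but the heart of the change to `Task` might look something like the sketch below: the two events collapse into one `onActioned`, and the decision about what an action means (toggling the status) moves up into `TodoList`. The markup and prop names are assumptions consistent with the earlier sketches.

```tsx
// Task.tsx -- after the refactoring: one event, no presumption about what it means.
import React from 'react';

export interface TaskModel {
  id: string;
  name: string;
  complete: boolean;
}

export interface TaskProps {
  task: TaskModel;
  // "The user explicitly tried to perform an action on this task."
  onActioned: (id: string) => void;
}

export function Task({ task, onActioned }: TaskProps) {
  return (
    <li>
      <input
        type="checkbox"
        aria-label={task.name}
        checked={task.complete}
        onChange={() => onActioned(task.id)}
      />
      <span
        className={task.complete ? 'complete' : ''}
        onClick={() => onActioned(task.id)}
      >
        {task.name}
      </span>
    </li>
  );
}

// In TodoList, the handler now owns the "business logic" of what an action means:
//   <Task key={t.id} task={t} onActioned={(id) => toggleTask(id)} />
```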
Now the integrated test changes:
None. There are no changes. This is because all we really did from the perspective of the entire stack of components is a straight refactoring.
And the unit test changes:
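Here's a sketch of what those changes might look like, using the mocked-`Task` style from earlier; the `Task` unit tests change in the same way, asserting `onActioned` instead of `onComplete`/`onIncomplete`.

```tsx
// TodoList.test.tsx -- the consumer has to speak the new interface.
import React from 'react';
import { render, act } from '@testing-library/react';
import { TodoList } from './TodoList';
import { Task } from './Task';

jest.mock('./Task', () => ({ Task: jest.fn(() => null) }));
const mockTask = Task as unknown as jest.Mock;

test('toggles a task when Task reports an action', () => {
  render(<TodoList initialTasks={[{ id: '1', name: 'Buy milk', complete: false }]} />);

  // onComplete/onIncomplete no longer exist; the mock is driven via onActioned.
  const props = mockTask.mock.calls[0][0];
  act(() => {
    props.onActioned('1');
  });

  const lastProps = mockTask.mock.calls[mockTask.mock.calls.length - 1][0];
  expect(lastProps.task.complete).toBe(true);
});
```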
As we can see, there are changes to both the `Task` unit tests and the `TodoList` unit tests. Any time the API of a component changes, it requires changes in the consumers.
Comparing the two changes for this refactoring:
- From the high level perspective we haven't changed behavior at all. It doesn't matter what the events are called; when one fires, the status of the item changes. This means no changes to the integrated test. This is where people claim that unit tests make refactoring harder, and in this type of change that's evident.
- In the unit test scenario, if there were multiple higher level components, they would all have to change.
Comparing Advantages
In favor of unit testing:

- Changes to the internal behavior of subcomponents don't require modifications to consumers. Tweaking the functionality of the `Task` component in ways that didn't change its API was confined to the component and the component's tests. If there are multiple consumers, this advantage is compounded.
- Test files tend to be smaller because they're more focused and don't require exercising every case a subcomponent handles.
- Setup for each test is usually trivial.
- Individual tests run faster. You only have to run the tests related to the subcomponent you're changing, and because they're smaller and render fewer things, the feedback loop while working is faster.
- Personally, I find that I am more diligent in thinking about edge cases when working at this level: "What if the list is empty? What if this value is null?". While I can only speak to my own experience on this one, I do, anecdotally, find it to be true for most.
In favor of integrated testing:

- Changes to the API of a subcomponent don't require any changes to the tests at all. This advantage compounds if there are multiple consumers.
- These tests tend to exercise the system a little more closely to how a user perceives and interacts with it. The higher you go, the closer the tests get to the true user experience.
- You may have higher confidence that the components all work together as expected because you've tested them as a larger system.
Other considerations
At some point we have to pick a level of testing. What about the component that renders our `TodoList` component? Maybe the tests should have been at that level? If we did that, then even more refactorings could be completed without changing tests, but we'd trade off the same disadvantages: our test files get larger, slower, and more difficult to set up. How high should we test? Should we write only end-to-end tests?
The final, and probably most important, factor is confidence. Which level gives you the confidence that your system behaves correctly? Which gives you the confidence to refactor frequently without losing it?
Conclusion
So, which style is better?
When I approach front end testing (and backend testing too) I tend towards unit testing for the following reasons:

- Focus - I love working on a small component in a very focused way. It helps me maintain my mental separation of the intent of each type of component.
- Speed - As a TDDer, the more focused the tests, the faster the feedback. It keeps me flowing.
- Common changes - I find that the majority of my code changes are similar to the first example, where there's some sort of user-facing change that can be done in a smaller subcomponent.
- Reusability - It's very common for me to build components in this style and end up "accidentally" having a number of components that can slot themselves into other areas of the app with very little work. This is more about the design of the system, but I find that when I'm focused on what to test, and where, my decisions naturally lead to structure that's simple to manipulate.

Unit tests also align with the general qualities we like in our tests, as outlined in the previous post. While integrated tests can adhere to those qualities in various ways, I find that doing so requires more effort.
I think the strongest argument for integrated tests is, "You may have higher confidence that the components all work together as expected because you've tested them as a larger system." In practice I really haven't experienced this problem. The bugs I see in the systems I've worked on tend to be even higher-level system issues, such as race conditions acting on data.
In the end, it's up to you and your team to decide. Take some time to experiment with both styles and choose which best suits your team's work. Track which types of issues occur while you're actively writing code, when CI exercises your system, when QA files reports, and in production. Is your level of testing helping to reduce these classes of issues? Do you need to change levels, or can you change your approach at the current level to catch more issues earlier?
