
Integration Tests: Pros + Cons of Doubles v. Trace-Based Testing

Jul 13, 2022 · 7 min read

Matheus Nogueira, Software Engineer, Tracetest

A look at using test doubles, full system integration tests, and trace-based tests when developing test suites, and the pros and cons of each technique.



## Complexities of Building Tests Across Microservices

Writing tests is not generally an easy task. As engineers, we want to test every important aspect of our application and ensure that it works consistently. To achieve that goal, we write automated tests and run them against our application to ensure it works as expected.

Usually, we start testing small pieces of code with unit tests, and then, when we start having more moving parts, we write integration tests to ensure every piece of software works well with the other components. As our products grow larger and more complex, we start considering breaking their scope into smaller microservices. That’s when testing becomes a challenge. What is the best strategy to test a feature when it involves more than one service? Do we use a double for all dependencies or wire everything together and execute all services as we would in a production environment?

In this post, I will present the pros and cons of each approach and introduce trace-based testing as a new alternative that combines the strengths of both.

## Using Test Doubles

This is the simplest way of testing your application. You create a double of an external dependency that mimics the behavior of that dependency. Let’s use a code example to show its strengths and weaknesses.

Postgres implementation for fetching Users from Database:
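The original snippet isn't reproduced in this copy, so here is a minimal Go sketch of what it might look like; the `User` type and package layout are assumptions for illustration.

```go
package users

import (
	"context"
	"database/sql"
)

// User is a hypothetical domain type shared by the examples below.
type User struct {
	ID   int64
	Name string
}

// PostgresUserRepository fetches users from a real Postgres database.
type PostgresUserRepository struct {
	db *sql.DB
}

// ListUsers runs an actual SQL query. A typo in this query can only be
// caught by a test that exercises this real implementation.
func (r *PostgresUserRepository) ListUsers(ctx context.Context) ([]User, error) {
	rows, err := r.db.QueryContext(ctx, "SELECT id, name FROM users")
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var users []User
	for rows.Next() {
		var u User
		if err := rows.Scan(&u.ID, &u.Name); err != nil {
			return nil, err
		}
		users = append(users, u)
	}
	return users, rows.Err()
}
```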

Implementation for fetching users from data storage using the UserRepository abstraction:
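Again as a hedged sketch, continuing the same hypothetical package:

```go
// UserRepository is the abstraction the service depends on; both the
// Postgres implementation above and a test double can satisfy it.
type UserRepository interface {
	ListUsers(ctx context.Context) ([]User, error)
}

// UserService holds the business logic and delegates storage access to
// whichever UserRepository implementation is injected.
type UserService struct {
	repository UserRepository
}

func NewUserService(repository UserRepository) *UserService {
	return &UserService{repository: repository}
}

func (s *UserService) ListUsers(ctx context.Context) ([]User, error) {
	return s.repository.ListUsers(ctx)
}
```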

Testing of the ListUsers method using a test double as the user repository:
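A sketch of such a test, using a hand-written fake (the original may have used a mocking library instead):

```go
package users

import (
	"context"
	"testing"
)

// fakeUserRepository is a test double: it returns canned data and
// records how many times it was called.
type fakeUserRepository struct {
	users []User
	calls int
}

func (f *fakeUserRepository) ListUsers(ctx context.Context) ([]User, error) {
	f.calls++
	return f.users, nil
}

func TestListUsers(t *testing.T) {
	repo := &fakeUserRepository{users: []User{{ID: 1, Name: "Alice"}}}
	service := NewUserService(repo)

	users, err := service.ListUsers(context.Background())
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if len(users) != 1 || users[0].Name != "Alice" {
		t.Errorf("unexpected users: %+v", users)
	}
	// Call-count expectations like this are exactly what breaks when the
	// service later stops calling the repository on every request.
	if repo.calls != 1 {
		t.Errorf("expected one repository call, got %d", repo.calls)
	}
}
```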

This test strategy works, but as our product evolves and gets more complex, maintaining those tests might become as hard as maintaining the code itself. It happened to me a couple of years back, and it has probably happened to you at some point in your career as well.

How could that become a problem, you may ask? Suppose we introduce a caching mechanism to our UserService, so the database is queried only when there is no valid cache entry. That could break our test: the mock is no longer called, yet the test expects it to be called exactly once. Or worse: your application breaks in production because a change to a database query introduced a syntax error. Your test would never detect it, because it never runs the real implementation!
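As a minimal sketch of that first failure mode, assuming the same hypothetical types as above, a naive in-memory cache is enough to invalidate the call-count expectation:

```go
// CachedUserService is a hypothetical variant that consults an
// in-memory cache before touching the repository.
type CachedUserService struct {
	repository UserRepository
	cache      []User
}

func (s *CachedUserService) ListUsers(ctx context.Context) ([]User, error) {
	if s.cache != nil {
		// Cache hit: the repository is never called, so a test double
		// expecting exactly one call now fails even though the
		// user-facing behavior is unchanged.
		return s.cache, nil
	}
	users, err := s.repository.ListUsers(ctx)
	if err != nil {
		return nil, err
	}
	s.cache = users
	return users, nil
}
```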

### Pros of Test Doubles

- Easy to set up small tests.

- Tests run faster.

- If your dependency charges you by API call, doubles can save you some money.

- Easy to add granular assertions.

### Cons of Test Doubles

- A double’s behavior might differ from the real implementation.

- You have to sync your doubles with the changes in the real dependency.

- Your tests might break if you refactor code, even without changing the behavior of your application.

- You wouldn’t test the dependency itself. What happens if you mock your database and your SQL query has a typo?

## Test Doubles - Use Only When Necessary

Doubles are good for testing stable APIs and dependencies that charge you per call, but they can lead to a maintenance nightmare if overused: even small changes in your code can break your tests when no behavior has changed. Use them only when strictly necessary; in particular, I wouldn’t recommend this approach for your integration tests.

## Full Integration Tests: Wiring All Components Together

Another way of testing our services is to wire all dependencies together and then run the function we want to test to see if everything works end to end. Compared with test doubles, this approach gives you a more realistic test, as your services work together just as they would in production.

When our services are small and self-contained, using this approach is simple. You can, for example, call an endpoint to create a user and then call the endpoint to get that user. The test passes if the user is returned and is the same one you created. But what happens if your service communicates with other microservices? How do you ensure everything was created as expected? For example, let's say we are testing a feature for buying a product. What would that test look like?

1. Create a new user.

2. Create a new product and set its availability.

3. Add the product to the cart.

4. Trigger the purchase.

5. Assert the purchase action returned a successful status code.

That’s not that bad, but is it enough to guarantee our code is working? Probably not. How can I know the product stock was updated to prevent overselling that product? How can I ensure our shipping service received the request to ship that specific product to our user’s address? A more realistic test would keep the same setup, but assert far more after the purchase:

1. Trigger the purchase.

2. Check if our product stock was updated using our Product API.

3. Check if a ship request was created in our shipping API.

4. Check if the cart is now empty again.

As you can see, setting everything up for those tests might be complex. You might have to connect to multiple APIs to assert deeply that your application worked as expected. Imagine the number of credentials you would have to manage in your CI environment to make that possible.
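To make that concrete, here is a minimal Go sketch of such a test. All endpoints, payloads, and expected values are hypothetical, and only two of the four checks are shown; the shipping and cart checks would each need their own client and credentials:

```go
package purchase_test

import (
	"encoding/json"
	"net/http"
	"strings"
	"testing"
)

func TestPurchaseFlow(t *testing.T) {
	// Trigger the purchase (setup of user, product, and cart omitted).
	resp, err := http.Post(
		"http://purchase.example.internal/purchases",
		"application/json",
		strings.NewReader(`{"userId": "user-1", "productId": "42"}`),
	)
	if err != nil {
		t.Fatalf("purchase request failed: %v", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		t.Fatalf("expected status 200, got %d", resp.StatusCode)
	}

	// Check that the stock was updated, via a second service's API.
	stockResp, err := http.Get("http://product.example.internal/products/42")
	if err != nil {
		t.Fatalf("product request failed: %v", err)
	}
	defer stockResp.Body.Close()

	var product struct {
		Stock int `json:"stock"`
	}
	if err := json.NewDecoder(stockResp.Body).Decode(&product); err != nil {
		t.Fatalf("decoding product: %v", err)
	}
	if product.Stock != 9 {
		t.Errorf("expected stock 9 after purchase, got %d", product.Stock)
	}

	// The shipping and cart assertions would follow the same pattern,
	// each requiring access (and credentials) for yet another API.
}
```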

### Pros of Full Integration Tests

- Realistic results.

### Cons of Full Integration Tests

- Slow.

- Very complex setup.

- Hard to deeply assert your test results.

## Full Integration Tests are Realistic but Complex

Testing against an environment where all your services are connected can provide realistic feedback, but the cost of maintaining those tests is high, especially because you have to maintain clients and credentials for every dependency you need to query to confirm that everything worked as expected.

This is complex, but it is the best we could do to ensure that our application works as expected. UNTIL NOW!

## Trace-Based Testing: A Better Way of Doing Full Integration Tests?

Let’s start small. What is trace-based testing? It is an integration testing strategy that consists of using the telemetry generated by your applications to ensure they are working properly.

### How Trace-Based Testing Works

You trigger a function in your application and wait until its execution completes. After that, you retrieve the trace generated by that operation and use it as the source of truth for what happened during execution. Good telemetry can tell a detailed story of how your application reached its result: which database queries were executed, which HTTP requests were sent to other applications, whether caches were hit, and so on.

## What a Trace of the Purchase Operation Would Look Like

[Diagram: the trace generated by the purchase operation, showing spans across services and their database calls]
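As a rough textual stand-in for the diagram, the purchase trace might form a span tree like this (service names and timings are invented for illustration):

```
POST /purchases                        purchase-service         350ms
├─ SELECT cart items                   purchase-service → db     15ms
├─ POST /products/42/stock             product-service           60ms
│  └─ UPDATE products SET stock = ...  product-service → db      12ms
├─ POST /shipments                     shipping-service          80ms
│  └─ INSERT INTO shipments ...        shipping-service → db     10ms
└─ DELETE cart items                   cart-service → db          9ms
```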

## Key Concepts Behind Trace-Based Testing

To validate the behavior of your application by looking at its traces, you must be able to do two things: choose which spans are relevant to your test, and define what you want to assert on them.

We at Kubeshop have been developing [Tracetest](https://github.com/kubeshop/tracetest) for a few months now, and its goal is to make it easier to [run those assertions](https://kubeshop.github.io/tracetest/adding-assertions/). The key concepts behind Tracetest are selectors and checks.

[Selectors](https://kubeshop.github.io/tracetest/advanced-selectors/) are queries that choose which spans of a trace will be asserted against a set of checks. They are quite powerful: they allow you to filter spans by attribute values, parent-child relationships, and span order. I’ll provide some examples later, so it will be easier to understand their purpose.

With the ability to select which spans matter to you, you can write checks to ensure those spans have certain characteristics based on their attributes: for example, that a span took less than 500ms to complete, or that the HTTP status code recorded in the span is a success code.

To make it simpler to understand, let’s have a look at some examples of selectors and checks:

1. All HTTP endpoints return a successful status code.

2. All database statements take less than 1s.

3. All database insert statements take less than 200ms.
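Expressed in Tracetest’s selector language, those three examples could look roughly like the sketch below. The attribute names follow OpenTelemetry semantic conventions, and the exact assertion syntax may vary between Tracetest versions, so treat this as illustrative:

```yaml
# 1. All HTTP spans have a successful status code
- selector: span[tracetest.span.type="http"]
  assertions:
    - http.status_code >= 200
    - http.status_code < 300

# 2. All database statements take less than 1s
- selector: span[tracetest.span.type="database"]
  assertions:
    - tracetest.span.duration < 1s

# 3. All database insert statements take less than 200ms
- selector: span[tracetest.span.type="database" db.operation="insert"]
  assertions:
    - tracetest.span.duration < 200ms
```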

This is a powerful way of pinpointing exactly what you want to test and running checks on each selected span to enforce the right behavior and SLOs.

## Matching Our Integration Test Via Our Trace-Based Test

When instrumenting your application, you can attach useful information to each span as attributes, and we can use those attributes to validate that the values match what we expect. Let’s take a look at what a Tracetest test might look like:
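The original test definition isn’t reproduced in this copy, so the following is a hypothetical one for the purchase flow. URLs, service names, and span attributes are assumptions for illustration; see the Tracetest documentation for the exact schema:

```yaml
name: Purchase flow
trigger:
  type: http
  httpRequest:
    method: POST
    url: http://purchase.example.internal/purchases
    body: '{"userId": "user-1", "productId": "42"}'
specs:
  # The purchase endpoint itself responded successfully
  - selector: span[tracetest.span.type="http" name="POST /purchases"]
    assertions:
      - http.status_code = 200
  # The product service updated the stock to prevent overselling
  - selector: span[tracetest.span.type="database" service.name="product-service" db.operation="update"]
    assertions:
      - db.statement contains "UPDATE products"
  # The shipping service recorded a shipment for this purchase
  - selector: span[tracetest.span.type="database" service.name="shipping-service" db.operation="insert"]
    assertions:
      - db.statement contains "INSERT INTO shipments"
  # The cart service emptied the user's cart quickly
  - selector: span[tracetest.span.type="database" service.name="cart-service" db.operation="delete"]
    assertions:
      - tracetest.span.duration < 500ms
```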

This test covers everything we wanted to verify with the fully wired integration test, without the complex setup. The only requirement is that the machine running the test has access to the purchase API and to the trace storage holding your application traces (Jaeger, for example).

Tracetest will trigger the purchase operation for you, wait until your trace is complete, and then execute the assertions against your trace.

### Pros of Trace-Based Testing

- You are running your actual application without any mocks, and getting realistic feedback from it.

- Because you are testing the trace, and traces are data, the test stays the same regardless of the environment you run it in (production, staging, dev, etc.).

- It allows you to test your OpenTelemetry traces before they are even deployed to production.

### Cons of Trace-Based Testing

- If your application is not instrumented, you cannot use traces to assert its behavior. (If you are not instrumenting your application, you should: instrumentation helps you identify bottlenecks and provides useful information when debugging problems.)

## Trace-Based Testing Empowers Deep Integration Testing via Your OpenTelemetry Traces

When we write integration tests, we want the most realistic feedback possible about whether our application will work when a real user starts using it. Test doubles usually don’t help much toward that goal, so only use them to simulate dependencies that are expensive to replicate in your tests (payment gateways, cloud providers, etc.).

That leaves us with two options: execute tasks in your application against its real dependencies, or use trace-based testing to ensure your application is working properly. The thing is, these options don’t compete with each other: trace-based testing is essentially a normal integration test that lets you execute extra assertions that would be impossible with regular integration tests.

Trace-based testing is a new testing technique that can help you deeply inspect what your application is doing while reusing your existing integration test infrastructure. If you already have full integration tests, you can start using trace-based testing with almost no effort. You only need to ensure your application is instrumented and its telemetry data is being stored.

We encourage you to try Tracetest, as you can explore your application through its Web UI and understand what’s happening before you write any test. You can start by taking a look at our [GitHub repository](https://github.com/kubeshop/tracetest) and reading our [documentation](https://kubeshop.github.io/tracetest/). If you still have questions, reach out on our [Slack community](https://dub.sh/tracetest-community) and we will be more than happy to help you.