Are Your Selenium Tests Fragile? How To Make Your Tests Stable 

Selenium tests have the reputation of easily becoming fragile tests. We’ll look at some common causes of fragile Selenium tests, how you can alleviate some of these issues, and how Testim can provide extra value and robustness to your UI tests.

What’s a Fragile Test?

A Selenium test can be fragile just like any other automated test. A test is fragile when changes that seemingly shouldn't influence its result do so anyway. An example most of us have encountered: we change one piece of code and break a test that we believed had nothing to do with it. But other factors can have the same effect: test data, the current date and time, other pieces of context, or even other tests. Whenever such factors influence the test outcome, we have a fragile test.

Fragile Selenium Tests

Selenium is a tool to automate the browser, giving us the power to write automated tests for our web UI. This means each test exercises more or less the entire application. With that many moving parts, Selenium tests can become fragile far more easily than simple unit tests.

Let's look at some common causes of fragile Selenium tests and how we can address them to make our tests more robust. The causes we'll cover are:

  • tight coupling to the implementation
  • external services
  • asynchrony
  • fragile data
  • context

Tight Coupling to the Implementation

Just like unit or integration tests, Selenium tests can be written so that they're tightly coupled to the implementation. When the implementation changes, our tests then need to change too, even if the public or visual behavior hasn't changed.

For example, let’s say we’ve written our test to click on a button with ID “login-button.” When this ID is changed to something else, for whatever reason, our test will fail, even if the test case still works fine when performed manually.

This is because the test is tightly coupled to the specific implementation. How can we decouple our test from the implementation?

In this specific example, we could improve the test by having the test call a helper method that knows about the implementation. So instead of clicking the button with the ID “login-button,” we can make it call a method named “Login” and pass the necessary parameters. If all of our tests use this helper method, we’ll only have to change one piece of code when the implementation changes.
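For illustration, here's what such a helper might look like using Selenium's Python bindings; the element IDs and form layout are hypothetical:

```python
from selenium.webdriver.common.by import By

# Hypothetical helper: the only place that knows how login is implemented.
def login(driver, username, password):
    driver.find_element(By.ID, "username").send_keys(username)
    driver.find_element(By.ID, "password").send_keys(password)
    driver.find_element(By.ID, "login-button").click()

# A test now reads in terms of behavior rather than locators:
# login(driver, "alice", "s3cret")
```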

Now, let’s take this a step further and group our helper methods inside a class per page. This is the Page Object design pattern. If something on a page changes, we know where to update the implementation for our tests—in the corresponding page object.
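A minimal sketch of a page object, building on the same hypothetical login page:

```python
from selenium.webdriver.common.by import By

class LoginPage:
    """Encapsulates everything tests need to know about the login page."""

    def __init__(self, driver):
        self.driver = driver

    def login(self, username, password):
        # If a locator changes, only this class needs updating.
        self.driver.find_element(By.ID, "username").send_keys(username)
        self.driver.find_element(By.ID, "password").send_keys(password)
        self.driver.find_element(By.ID, "login-button").click()

# Usage in a test:
# LoginPage(driver).login("alice", "s3cret")
```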

Can we do better? Yes, we can. Thanks to Dynamic Locators and AI, Testim can identify the page element we need, even if its attributes have changed.


This means that the QA team can record tests in their browser and they will still run fine if some underlying implementation detail changes.

External Services

Another common cause of fragile tests, especially when testing a complete application, is that we rely on external services to work as we expect. These can be particularly hard to control. But when they don’t behave as they should for the test, or if they aren’t available, the test fails. And this can happen even if we haven’t changed or broken anything on our side.

In this case, we could make our tests more robust by creating a mock service and having our application call that service. Otherwise, we’re also testing the external service and setting ourselves up for fragile tests.

By creating a mock service, we have full control over the responses. We can define what data is returned, but we can also simulate error responses, timeouts, or slow responses. These are things that are harder to imitate when we’re using the real external service.
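Here's a minimal sketch of such a mock using only Python's standard library; the endpoints and responses are made up for illustration:

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from threading import Thread

class MockServiceHandler(BaseHTTPRequestHandler):
    """Fakes the external service so the test fully controls its behavior."""

    def do_GET(self):
        if self.path == "/slow":
            time.sleep(5)            # simulate a slow response
        if self.path == "/error":
            self.send_response(500)  # simulate a server error
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"status": "ok"}).encode())

# Run the mock on a local port and point the application under test at it.
server = HTTPServer(("localhost", 8081), MockServiceHandler)
Thread(target=server.serve_forever, daemon=True).start()
```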

Asynchrony

When we’re testing an application, we often have test cases where we need to wait for a certain event. For example, we might have an application that makes an AJAX call and renders the result on the page. We now have to tell Selenium to wait until the result is rendered on the page. But for how long?

Luckily, Selenium has explicit and implicit waits. They allow us to tell Selenium to wait until a certain condition is met. In our example, we would wait until the result of our AJAX call has been rendered. When that has happened, we can continue our test.
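In Selenium's Python bindings, an explicit wait looks like this (assuming a driver instance; the "ajax-result" element ID is hypothetical):

```python
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 seconds for the AJAX result to appear, polling as it goes.
# The test continues as soon as the element is rendered.
result = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.ID, "ajax-result"))
)
```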

Fragile Data

Most applications that we test with Selenium will require some data storage. Our Selenium tests can become fragile because of this data for two reasons: either the data changes, or tests run against data left behind by previous tests.

Data Changes

When we set up a database for our tests, the amount of data we pump into it can become quite large. Consequently, when we make a small change to this test data for one test, we might break another test that depended on this data. This is especially true for larger test suites where it has become unclear which pieces of data are necessary for which tests.

A possible solution is to populate the database up front with separate data for each test. However, that can lead to a large amount of test data that becomes difficult to maintain over time.

A better option is to populate your database as part of each test. You could start with basic data that doesn’t change, and then have each test add the data that it needs.
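As a sketch of this idea, a pytest fixture can seed and clean up per-test data; here, db_connection is a hypothetical fixture providing a database connection, and the schema is made up:

```python
import pytest

@pytest.fixture
def product(db_connection):
    """Each test seeds exactly the data it needs on top of stable base data."""
    db_connection.execute(
        "INSERT INTO products (id, name) VALUES (?, ?)", ("P-1", "Widget")
    )
    db_connection.commit()
    yield "P-1"
    # Clean up so the next test starts from the shared baseline.
    db_connection.execute("DELETE FROM products WHERE id = ?", ("P-1",))
    db_connection.commit()
```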

Interacting Tests

Another potential issue with data is when one test changes the data of another test. Take a simple example where a test changes some identifier of a record (a product ID for example). At the end of the test, it changes it back so that another test can find that same record. However, if the first test fails to revert the change, the second test won’t find the record and will also fail.

This is a case of interacting tests: a test fails because it has been influenced by another test, even though the failing test runs fine when it’s run individually.

The best solution is to give each test its own set of data. We can achieve this by spinning up a new database for each test. This might sound like a complex solution, but we could leverage containers for this: start a container with a database, fill it with the necessary data, run the test and end by shutting down the container. Rinse and repeat for other tests.
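One way to sketch this in Python is with the testcontainers package, assuming it's installed and Docker is running; the image choice and seeding details are illustrative:

```python
import pytest
from testcontainers.postgres import PostgresContainer  # assumes testcontainers is installed

@pytest.fixture
def fresh_database():
    """Spin up a throwaway database per test; the container dies afterwards."""
    with PostgresContainer("postgres:16") as postgres:
        # Seed the data this test needs, then hand the connection URL to the test.
        yield postgres.get_connection_url()
```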

If a separate database for each test isn’t an option, we could look at solutions like in-memory databases or libraries that reset our database to the state before the test (like Respawn for .NET).
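For instance, SQLite's in-memory mode, available in Python's standard library, gives each test a pristine, throwaway database:

```python
import sqlite3

# Each connection to ":memory:" is a brand-new, empty database
# that vanishes when the connection closes.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE products (id TEXT, name TEXT)")
```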

Context

The typical example of a fragile Selenium test is one that depends on context, such as the current date or time. This applies to every kind of test, so it applies to Selenium tests too.

When your tests depend on certain dates or times, they might suddenly fail when the conditions (temporarily) don’t apply. This is often the case on leap days, but depending on your business logic, other dates may cause your test to fail too.

In unit tests, the date and time can easily be mocked out, but it’s often much more difficult in Selenium tests. To make matters worse, you’re dealing with two clocks: the “server” clock (where your back-end application is running) and the browser’s clock.

There are many options here, and the solution for you depends on your specific case.

On the server-side, you could wrap calls to the system clock in a service and inject a mock service anywhere you need it in your tests. In production, you would then inject the service that uses the real clock. There’s also libfaketime for Linux and Mac OS X, which might be an easier route.
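A minimal sketch of the clock-wrapping approach in Python; the class and method names are illustrative:

```python
from datetime import datetime

class SystemClock:
    """Production implementation: delegates to the real clock."""
    def now(self):
        return datetime.now()

class FixedClock:
    """Test implementation: always returns the moment the test chooses."""
    def __init__(self, fixed_moment):
        self.fixed_moment = fixed_moment
    def now(self):
        return self.fixed_moment

# Application code depends on the injected clock, never on datetime directly:
def is_leap_day(clock):
    today = clock.now()
    return today.month == 2 and today.day == 29

# In a test:
assert is_leap_day(FixedClock(datetime(2020, 2, 29)))
```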

On the browser side, you can mock the date and time using TimeShift.js and inject it into your pages from your Selenium tests.
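To illustrate the underlying mechanism, here's a rough sketch that pins the browser's clock by overriding Date through execute_script (TimeShift.js packages this idea up more robustly); the timestamp is arbitrary and a driver instance is assumed:

```python
FIXED_TIME_MS = 1582934400000  # 2020-02-29T00:00:00Z

# Run after each page load, before the logic under test fires.
driver.execute_script(
    """
    const fixed = arguments[0];
    const RealDate = Date;
    RealDate.now = () => fixed;
    window.Date = class extends RealDate {
        constructor(...args) {
            if (args.length === 0) { super(fixed); } else { super(...args); }
        }
    };
    """,
    FIXED_TIME_MS,
)
```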

Is It Really That Hard?

All of these problems have solutions, but they’re often non-trivial to implement, especially in legacy applications. Also, many of these solutions require time and effort from developers, which you may not be able to spare. But if the tester (a QA department, the product owner, or the end-user) can work together with the developers, then you can achieve solid tests more easily.

Also, take a look at Testim. It can cover more than Selenium does out of the box and it provides a useful platform for developers, the QA team, and end-users to collaborate.

What if you need to verify the contents of a downloaded Word or Excel file? Or if your application has email or SMS integration? And wouldn’t you like to have the tester create your UI tests instead of taking time from your developers?

Testim provides an easy recording function for non-developers to write tests. This also allows non-developers to report a bug and immediately include a test to reproduce the issue. Traditionally, similar recording tools have created fragile tests because they’re too close to the implementation. But as we mentioned, Testim takes a different approach and leads to more robust tests.

For QA professionals, there are powerful features, like running tests with different data sets, conditional steps, and screenshots. Testim also learns about your tests over time, making them more stable the more you run them, even when implementation details change.

Testim allows you to create robust UI tests for your application that need minimal maintenance and provide maximum value to developers, QA professionals, and end-users alike.

Author bio: This post was written by Peter Morlion. Peter is a passionate programmer who helps people and companies improve the quality of their code, especially in legacy codebases. He firmly believes that industry best practices are invaluable when working towards this goal, and his specialties include TDD, DI, and SOLID principles.