Your users are a test suite; you should run them

Any code that uses a service assumes things about the behavior of that service. If the service violates those assumptions, the user code will fail. Each piece of user code is therefore equivalent to a test for the service; user code, in aggregate, forms a test suite for the service.

This is why "testing in production" is so powerful and tempting. Production is where your users run, and if they all work, you know that your implementation is correct, or at least correct enough.

But you don't have to test in production to take advantage of this.

Some examples:

If you can run your system, including user code, you can incorporate running user code into your normal development and testing practices. You can drive events through the system, with saved production data or with techniques from fuzzing and property-based testing to trigger known edge cases, and assert that no errors occur.
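As a rough illustration of that loop, here is a minimal sketch in Python. Everything in it is hypothetical: `KeyValueService` stands in for your service, `user_counter` and `user_cache` stand in for real user code, and the hard-coded `saved_events` list stands in for saved production data. The random generator is a crude substitute for a real fuzzing or property-based testing tool.

```python
import random
from typing import Dict, List, Tuple

Event = Tuple[str, int]


class KeyValueService:
    """Hypothetical service under development: a tiny in-memory store."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def put(self, key: str, value: int) -> None:
        self._data[key] = value

    def get(self, key: str, default: int = 0) -> int:
        return self._data.get(key, default)


# Hypothetical user code. Each function relies on some behavior of the service.
def user_counter(service: KeyValueService, key: str, value: int) -> None:
    before = service.get(key)
    service.put(key, before + value)
    # Assumes a read after a write returns the written value.
    assert service.get(key) == before + value


def user_cache(service: KeyValueService, key: str, value: int) -> None:
    service.put(key, value)
    # Assumes a missing key falls back to the default instead of raising.
    service.get(key + "-missing")


USER_CODE = [user_counter, user_cache]


def random_events(rng: random.Random, count: int) -> List[Event]:
    """Generate events that include awkward keys and large values."""
    keys = ["", "a", "b", "a" * 64]
    return [(rng.choice(keys), rng.randint(-10**6, 10**6)) for _ in range(count)]


def run_users_as_tests(events: List[Event]) -> List[str]:
    """Drive every event through every piece of user code; collect failures."""
    failures = []
    for user in USER_CODE:
        service = KeyValueService()  # fresh service instance per user
        for key, value in events:
            try:
                user(service, key, value)
            except Exception as exc:
                failures.append(f"{user.__name__}({key!r}, {value}): {exc!r}")
    return failures


saved_events = [("a", 1), ("a", 2), ("b", -5)]  # stand-in for saved production data
generated_events = random_events(random.Random(0), 200)  # fixed seed for repeatability
failures = run_users_as_tests(saved_events + generated_events)
assert not failures, failures
```

The fixed seed keeps the run reproducible, so a failure found once can be replayed while you fix it.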

User code can be treated as a test suite, which can be optimized and improved like any other, and which can be run repeatedly by each developer as they make changes. This gives the service's developers an incentive to improve the quality of user code, which works especially well when the developer and the user are employed by the same entity, or when an open source developer is acting out of altruism.

One type of user code that is particularly useful for this purpose: services developed to monitor and detect production errors. Such monitoring services depend on your service just like any other user, and are naturally suited to detect many kinds of bugs.
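A monitoring probe can be reused as a test with very little ceremony. The sketch below assumes a hypothetical `Store` service and a hypothetical `canary_probe` check of the kind a monitoring service might run in production; `BuggyStore` is an invented regression used only to show the probe catching a bug during development.

```python
from typing import Dict, List


class Store:
    """Hypothetical service: a string key-value store."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> str:
        return self._data[key]


class BuggyStore(Store):
    """Invented regression: a put never overwrites an existing key."""

    def put(self, key: str, value: str) -> None:
        self._data.setdefault(key, value)


def canary_probe(store: Store) -> List[str]:
    """Hypothetical monitoring check: write a canary twice, read it back each time.

    Returns a list of alert messages; an empty list means the store looks healthy.
    """
    alerts = []
    for value in ("first", "second"):
        store.put("canary", value)
        try:
            seen = store.get("canary")
        except KeyError:
            alerts.append(f"canary missing after writing {value!r}")
            continue
        if seen != value:
            alerts.append(f"canary read {seen!r}, expected {value!r}")
    return alerts


healthy_alerts = canary_probe(Store())
buggy_alerts = canary_probe(BuggyStore())
print(healthy_alerts)
print(buggy_alerts)
```

Run against a local instance, the same probe that would page someone in production becomes an ordinary failing test on the developer's machine.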

These techniques are most effective in a large system with multiple users, each with users of their own, all of which can be included to test more invariants.

The more code that depends on you and runs successfully, the more confident you can be that your implementation is correct.

But a larger, more complex system also makes it harder to write realistic stand-alone test cases.

At some point, these trends cross over, and writing real code that uses the system productively is an easier way to test than trying to write realistic stand-alone test cases.

That's not a problem: your users are a test suite ready for use. You just need to run it.