The inconvenience of encapsulation
When writing unit tests, the most ugly & painful & stupid thing you might have to do is to expose some internal implementation of your module to the testing code so that:
- It is easier to enumerate all possible conditional branches to cover some edge cases; (although if you have a complex design, you can always achieve this by mocking things, however sometimes it’s an overkill)
- It is possible to inspect side effects of your code; (accept the dark side of the reality, you cannot totally avoid side effects)
Sometimes, you might even find yourself wasting a lot of precious time thinking about solutions to avoid writing such smelly code. But, hey! It’s testing code! Shouldn’t we have fun and enjoy writing it with all the power and flexibility we ought to have rather than struggling in these design holes? Why do we have to break core code for testing code?
Recently I found an excellent library rewire that neatly solves this problem. Moreover, the idea behind it is really creative so that I’d like to share it in this post.
Nothing is more valuable than an example. Let’s try to make a really simple one. The code logic might not be very reasonable, just made for showing the problems and how rewire can solve them. You can find the whole example here.
Given the following scenario: You are implementing a log appender, which writes aggregated log messages to some database in batch. And somehow the database has throttling control so that you cannot write to it more than once per second. Here is something you might write.
Now you might want to write some test cases to test your code. The three basic cases might be:
- The appender shouldn’t flush until batch size is reached.
- The appender shouldn’t flush twice within one second.
- The appender should flush when batch size is reached and more than 1 second is elapsed from last flush.
Here comes to the problems:
- How do I know the batch size? It’s not exposed anywhere. And probably it’s a private information that need to be hidden in the module.
- How do I know if the
flushToDBmethod is called?
- How do I change to a mock db for unit test?
- How do I easily control the
durationso that the
ifbranch can be tested without using some fancy thing to mock the datetime or wasting your test time for setTimeout?
The fundamental challenge is that we have hidden states in the module which are not supposed to be exposed.
Here comes the rewire
rewire is a replacement for
require for testing purpose. When it requires the module it also magically modifies your source code so that variables hidden in the module can be accessed and modified. Here is how you can use rewire to solve the above problems without touching the core module.
And yeah, it runs successfully!
The idea behind the scene
I’m extremely interested how it implements the
__get__ function. How can they bind to the hidden variables in your module? So I step into the source code. It turns out that
eval and closure does the magic. Let’s take a close look at how
__set__ is implemented.
__get__ is very similar.
Key point 1: It modifies your source code to adding
__set__ functions to your code before requiring your module.
This is important, in this way, the added function is in the same lexical scope of your module so that it can access the module’s internal variables.
Key point 2:
eval function to execute the variable replacing code.
Variables in the scope of
__set__ function are closed in the closure created by
eval statement, thus in the
eval function you can not only access the hidden variables in your module (inherited from the
__set__ closure), but also access the variables in the
__set__ function. In this way, you can build a general function that can replace any declared variables in the module’s scope.
Literally it append the following function to your module’s source code:
This technique really opens my mind. It’s so powerful and I believe it has a lot more practical applications. Hope you will also enjoy this library.