Testing DB interaction
In PHPDBUnit - Testing DB interaction with PHPUnit - Digital Sandwich, Mike Lively announces a port of Java’s DBUnit to PHP. I gave a short(ish) reply on the blog but after posting it I realized I wanted to expand on the topic a bit
.
Let me start by quoting my reply on his site. It’s Sunday morning and I am lazy -
Might be nice for acceptance testing, but I don’t see why you would want to talk to a real db during unit testing. It’s just too slow, and if you need anything resembling a biggish dataset for a *unit* test, your unit tests aren’t fine-grained enough.
Mocking the database is really where you want to go. Redirect the connection logic to give a mock connection object back that basically does string compares to pick which of a handful of precooked result sets (in-memory arrays) to return. It’ll cost you an hour or so to rig up some baseclasses, and keeping everything in memory (and not having to truncate tables) makes sure that your unit tests keep flying. Also, having your datasets in the code will force you to think about your unit tests once they get large - usually that’s a refactoring smell
![]()
I used database-based unit testing once, when I just started with XP 10 years ago, and I’ll never go back there.
If you check the original post, the example he gives would be exceedlingly simple to write a unit test for that doesn’t use a database. Roughly it goes like:
function testDelete() {
$pdo = new MockPDO(/* callback = */ $this);
$this->seenDelete = false;
$foo = new Foo($pdo);
$foo->deleteByCol1('value1');
$this->assertTrue($this->seenDelete);
}
function pdoCallback($query) {
if ($query == "DELETE FROM foo WHERE col1 = 'value1'") {
$this->seenDelete = true;
}
return null; // no resultset needed
}
Look, Ma - no Database! This is a Unit Test. It tests a Unit of code. Not a whole stack of it.
Think of using a real live database as a very strong refactoring smell because you are having a lot of test coverage duplication - every test you do using DBUnit activates the same bits of code - like MySQL drivers, etcetera - that hopefully should be tested elsewhere. I assume that when I feed the above query to MySQL, it will do what I told it. When writing the test, I might even copy and paste it to the MySQL command line to make sure that the query I am comparing against works, and in real life you probably want a “sloppy compare” that ignores whitespace and character case.
By the way - this code has one callback per test class. This was the simplest thing when we started using it at work, but what also might work is a callback per test, especially using the quite cool mock class generation facilities of SimpleTest (which is what we use at the moment). However, we don’t tend to do a lot of database testing anyway: most CRUD code is generated, so we have a single test set to test the generator and exercise what the generated code is spitting out; live classes just get the additional (hand-coded) methods tested.
If you wonder why I am so adamant against using something like DBUnit in a unit test set (that is, unless you are actually developing the database code - then it will be very useful indeed), apart from the somewhat academic argument about test coverage, a very big issue is performance. Unit tests become more valuable the faster they run, and ideally you want them so fast that every time you press “save”, they run and you don’t even notice the lag (I have seen that demoed once and it was quite cool). As I said, it is sunday morning and I am lazy so I am going to guess that setting up the database will take you something like 250ms. A hundred database tests later just the overhead of setting up the database is 25 seconds. In that 25 seconds, I could have run an additional thousand or so “fast” unit tests. If your test suite gets slow enough that you get up for a coffee after pushing the “run all tests” button, you will not run them often enough - it becomes a ritual to execute before, say, committing your code instead of a friend that helps you guide you along every step of that big refactoring you were tasked with.
So, unit testing must be fast (err… well, relatively fast. Nothing is absolutely fast in PHP, alas). The best way to ensure this is to identify just what code you want to test, and then sandwich it: on the top, there’s the unit test, on the bottom, there is probably mock code, and in between just the code that you scrutinize. Keep the sandwich model in mind and you’ll find that you will write cleaner code (because all your code becomes “pluggable” in a sense, in order to accommodate it being able to run in the test sandwich as well as in the real environment), and your unit tests stay focused and fast. Focused unit tests also means that if you change one thing, only one unit test breaks - clearly that’s a win too.



July 4th, 2007 at 7:50 pm
I am not sure if you checked my reply to your comment on my blog or not. But I thought I would re-iterate my point here.
There are several reasons why I think it is beneficial to use a live database for testing.
1. It allows you to safely refactor your queries. While this is not a problem for smallish projects, when you are dealing with a large amount of data in a performance oriented database design (not necessarily fully normalized) it is very common to alter your queries to ensure that you are not breaking your queries.
2. It allows you to test the output of your data driven units. Testing the queries you send to the database is all fine and good, but that makes the very big assumption that the queries you are writing are returning exactly what you expect. I will be the first to admit that the query in my example was extremely simple and your way of testing that particular query works just fine. However we both know that queries get much more complicated than my example and when they begin getting complicated it helps being able to test the output of them.
3. It provides for real regression testing when you find bugs in queries. If you find a bug in one of your queries, write a test that exposes the bug, hopefully squish that bug for good. If you just use query comparison you will be forced to change your test everytime you alter your query. This means that the next change you make you open the door for all of your old bugs to sneak back in.
4. Provides a very simple means to test your code against different database platforms. If you want to keep your code working on several different database platforms then using something like DBUnit makes that very easy to test. If you want to see where your code breaks in Postgres you simply need to change the database your tests use and you’ll have a very good idea of what works and what doesn’t.
I can understand the desire to keep tests slim, but you always have the option of segregating database test cases into their own suites that can be ran only after the affected code has been changed and once a night otherwise. There is usually a way to work around slow tests. Personally I would rather have slow tests that test all of the critical parts of my application than quick tests that just do the bare minimum. It basically boils down to testing what is important to your application. Your application may be more concerned with complicated business logic that is driven soley by php. The project I am working on is a financial application that has extremely intense data interaction that has to be 100% correct all the time. So it is very important to me to ensure that all of our queries are correct
.
May 21st, 2008 at 8:30 pm
In my opinion, testing the database is essential. However, mocking the database has its place. A case of the ‘Wife Swap’ solution, where the answer lies somewhere between both extremes.
Here is someone who decided that always mocking the database is wrong > http://weblogs.asp.net/rosherove/archive/2004/12/10/279258.aspx