{"id":10139,"date":"2012-09-16T22:28:00","date_gmt":"2012-09-16T21:28:00","guid":{"rendered":"http:\/\/touk.pl\/blog\/?guid=122e11276819b142ef450f3a4d0180dd"},"modified":"2023-03-23T10:58:48","modified_gmt":"2023-03-23T09:58:48","slug":"test-driven-traps-part-1","status":"publish","type":"post","link":"https:\/\/touk.pl\/blog\/2012\/09\/16\/test-driven-traps-part-1\/","title":{"rendered":"Test Driven Traps, part 1"},"content":{"rendered":"<p>Have you ever been in a situation, where a simple change of code, broke a few hundred tests? Have you ever had the idea that tests slow you down, inhibit your creativity, make you afraid to change the code. If you had, it means you\u2019ve entered the Dungeon-of-very-bad-tests, the world of things that should not be.<\/p>\n<p>I\u2019ve been there. I\u2019ve built one myself. And it collapsed killing me in the process. I\u2019ve learned my lesson. So here is the story of a dead man. Learn from my faults or be doomed to repeat them.<\/p>\n<h2 id=\"the-story\">The story<\/h2>\n<p>Test Driven Development, like all good games in the world, is simple to learn, hard to master. I\u2019ve started in 2005, when a brilliant guy named Piotr Szarwas, gave me the book \u201cTest Driven Development: By Example\u201d (Kent Beck), and one task: creating a framework.<\/p>\n<p>These were the old times, when the technology we were using had no frameworks at all, and we wanted a cool one, like Spring, with Inversion-of-Control, Object-Relational Mapping, Model-View-Controller and all the good things we knew about. And so we created a framework. Then we built a Content Management System on top of it. Then we created a bunch of dedicated applications for different clients, Internet shops and what-not, on top of those two. We were doing good. We had 3000+ tests for the framework, 3000+ tests for the CMS, and another few thousand for every dedicated application. We were looking at our work, and we were happy, safe, secure. These were good times.<\/p>\n<p>And then, as our code base grew, we came to the point, where a simple anemic model we had, was not good enough anymore. I had not read the other important book of that time: \u201cDomain Driven Design\u201d, you see. I didn\u2019t know yet, that you can only get so far with an anemic model.<\/p>\n<p>But we were safe. We had tons of tests. We could change anything.<\/p>\n<p>Or so I thought.<\/p>\n<p>I spent a week trying to introduce some changes in the architecture. Simple things really: moving methods around, switching collaborators, such things. Only to be overwhelmed by the number of tests I had to fix. That was TDD, I started my change with writing a test, and when I was finally done with the code under the test, I\u2019d find another few hundred tests completely broken by my change. And when I go them fixed, introducing some more changes in the process, I\u2019d find another few thousand broken. That was a butterfly effect, a chain reaction caused by a very small change.<\/p>\n<p>It took me a week to figure out, that I\u2019m not even half done in here. The refactoring had no visible end. And at no point my code base was stable, deployment-ready. I had my branch in the repository, one I\u2019ve renamed \u201cLasciate ogne speranza, voi ch\u2019intrate\u201d.<\/p>\n<p>We had tons and tons of tests. Of very bad tests. Tests that would pour concrete over our code, so that we could do nothing.<\/p>\n<div class=\"separator\" style=\"clear: both;text-align: center\"><a style=\"margin-left: 1em;margin-right: 1em\" href=\"http:\/\/1.bp.blogspot.com\/-ZT-AejDUZnQ\/UFY3WWXFiDI\/AAAAAAAAD5A\/3w8nm9Sdtq4\/s1600\/christ05.JPG\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/1.bp.blogspot.com\/-ZT-AejDUZnQ\/UFY3WWXFiDI\/AAAAAAAAD5A\/3w8nm9Sdtq4\/s640\/christ05.JPG\" alt=\"\" width=\"559\" height=\"420\" border=\"0\" \/><\/a><\/div>\n<p>The only real options were: either to leave it be, or delete all tests, and write everything from scratch again. I didn\u2019t want to work with the code if we were to go for the first option, and the management would not find financial rationale for the second. So I quit.<\/p>\n<p>That was the Dungeon I built, only to find myself defeated by its monsters.<\/p>\n<p>I went back to the book, and found everything I did wrong in there. Outlined. Marked out. How could I skip that? How could I not notice? Turns out, sometimes, you need to be of age and experience, to truly understand the things you learn.<\/p>\n<p>Even the best of tools, when used poorly, can turn against you. And the easier the tool, the easier it seems to use it, the easier it is to fall into the trap of I-know-how-it-works thinking. And then BAM! You\u2019re gone.<\/p>\n<div class=\"separator\" style=\"clear: both;text-align: center\"><a style=\"margin-left: 1em;margin-right: 1em\" href=\"http:\/\/4.bp.blogspot.com\/-sW2TqEQ8XBU\/UFY4h4yY-0I\/AAAAAAAAD5I\/W3j0TdrlWvI\/s1600\/BadDesignGun1.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/4.bp.blogspot.com\/-sW2TqEQ8XBU\/UFY4h4yY-0I\/AAAAAAAAD5I\/W3j0TdrlWvI\/s640\/BadDesignGun1.jpg\" alt=\"\" width=\"559\" height=\"399\" border=\"0\" \/><\/a><\/div>\n<h2 id=\"the-truth\">The truth<\/h2>\n<p>Test Driven Development and tests, are two completely different things. Tests are only a byproduct of TDD, nothing more. What is the point of TDD? What does TDD brings? Why do we do TDD?<\/p>\n<p>Because of three, and only those three reasons.<\/p>\n<h4 id=\"1-to-find-the-best-design-by-putting-ourselves-into-the-users-shoes\">1. To find the best design, by putting ourselves into the user\u2019s shoes.<\/h4>\n<p>By starting with \u201chow do I want to use it\u201d thinking, we discover the most useful and friendly design. Always good, quite often that\u2019s the best design out there. Otherwise, what we get is this:<\/p>\n<div class=\"separator\" style=\"clear: both;text-align: center\"><a style=\"margin-left: 1em;margin-right: 1em\" href=\"http:\/\/4.bp.blogspot.com\/-wjbCjjbbWvE\/UFY4qvaRJ8I\/AAAAAAAAD5Q\/aXEfb6sBfT8\/s1600\/howdoesthiswork.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/4.bp.blogspot.com\/-wjbCjjbbWvE\/UFY4qvaRJ8I\/AAAAAAAAD5Q\/aXEfb6sBfT8\/s640\/howdoesthiswork.jpg\" alt=\"\" width=\"559\" height=\"461\" border=\"0\" \/><\/a><\/div>\n<p>And you don\u2019t want that.<\/p>\n<h4 id=\"2-to-manage-our-fear\">2. To manage our fear.<\/h4>\n<p>It takes balls, to make a ground change in a large code-base without tests, and say \u201cit\u2019s done\u201d without introducing bugs in the process, doesn\u2019t it? Well, the truth is, if you say \u201cit\u2019s done\u201d, most of the time you are either ignorant, reckless, or just plain stupid. It\u2019s like with concurrency: everybody knows it, nobody can do it well.<\/p>\n<p>Smart people are scared of such changes. Unless they have good tests, with high code coverage.<\/p>\n<p>TDD allows to manage our fears, by giving us proof, that things work as they should. TDD gives us safety<\/p>\n<h4 id=\"3-to-have-fast-feedback\">3. To have fast feedback.<\/h4>\n<p>How long can you code, without running the app? How long can you code without knowing whether your code works as you think it should?<\/p>\n<p>Feedback in tests is important. Less so for frontend programming, where you can just run the shit up, and see for yourselves. More for coding in the backend. Even more, if your technology stack requires compilation, deployment, and starting up.<\/p>\n<p>Time is money, and I\u2019d rather earn it, than wait for the deployment and click through my changes each time I make them.<\/p>\n<p>And that\u2019s it. There are no more reasons for TDD whatsoever. We want Good Design, Safety, and Feedback. Good tests are those, which give us that.<\/p>\n<p>Bad tests? All the other tests are bad.<\/p>\n<h2 id=\"the-bad-practice\">The bad practice<\/h2>\n<p>So how does a typical, bad test, look like? The one I see over and over, in close to every project, created by somebody who has yet to learn how NOT to build an ugly dungeon, how not to pour concrete over your code base. The one I\u2019d write myself in 2005.<\/p>\n<p>This will be a Spock sample, written in groovy, testing a Grails controller. But don\u2019t worry if you don\u2019t know those technologies. I bet you\u2019ll understand what\u2019s going on in there without problems. Yes, it\u2019s that simple. I\u2019ll explain all the not-so-obvious parts.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">def \"should show outlet\"() {\r\n  given:\r\n    def outlet = OutletFactory.createAndSaveOutlet(merchant: merchant)\r\n    injectParamsToController(id: outlet.id)\r\n  when:\r\n    controller.show()\r\n  then:\r\n    response.redirectUrl == null\r\n}<\/pre>\n<p>So we have a controller. It\u2019s an outlet controller. And we have a test. What\u2019s wrong with this test?<\/p>\n<p>The name of the test is \u201cshould show outlet\u201d. What should a test with such a name check? Whether we show the outlet, right? And what does it check? Whether we are redirected. Brilliant? Useless.<\/p>\n<p>It\u2019s simple, but I see it all around. People forget, that we need to:<\/p>\n<h4 id=\"verify-the-right-thing\"><strong>VERIFY THE RIGHT THING<\/strong><\/h4>\n<p>I bet that test was written after the code. Not in test-first fashion.<\/p>\n<p>But verifying the right thing is not enough. Let\u2019s have another example. Same controller, different expectation. The name is: \u201cshould create outlet insert command with valid params with new account\u201d<\/p>\n<p>Quite complex, isn\u2019t it? If you need an explanation, the name is wrong. But you don\u2019t know the domain, so let me put some light on it: when we give the controller good parameters, we want it to create a new OutletInsertCommand, and the account of that one, should be new.<\/p>\n<p>The name doesn\u2019t say what \u2018new\u2019 is, but we should be able to see it in the code.<\/p>\n<p>Have a look at the test:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">def \"should create outlet insert command with valid params with new account\"() {\r\n  given:\r\n    def defaultParams = OutletFactory.validOutletParams\r\n    defaultParams.remove('mobileMoneyAccountNumber')\r\n    defaultParams.remove('accountType')\r\n    defaultParams.put('merchant.id', merchant.id)\r\n    controller.params.putAll(defaultParams)\r\n  when:\r\n    controller.save()\r\n  then:\r\n    1 * securityServiceMock.getCurrentlyLoggedUser() >> user\r\n    1 * commandNotificationServiceMock.notifyAccepters(_)\r\n    0 * _._\r\n    Outlet.count() == 0\r\n    OutletInsertCommand.count() == 1\r\n    def savedCommand = OutletInsertCommand.get(1)\r\n    savedCommand.mobileMoneyAccountNumber == '1000000000000'\r\n    savedCommand.accountType == CyclosAccountType.NOT_AGENT\r\n    controller.flash.message != null\r\n    response.redirectedUrl == '\/outlet\/list'\r\n}<\/pre>\n<p>If you are new to Spock: n*mock.whatever(), means that the method \u201cwhatever\u201d of the mock object, should be called exactly n times. No more no less. The underscore \u201c_\u201d means \u201ceverything\u201d or \u201canything\u201d. And the >> sign, instructs the test framework to return the right side argument when the method is called.<\/p>\n<p>So what\u2019s wrong with this test? Pretty much everything. Let\u2019s go from the start of \u201cthen\u201d part, mercifully skipping the oververbose set-up in the \u201cgiven\u201d.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">1 * securityServiceMock.getCurrentlyLoggedUser() >> user<\/pre>\n<p>The first line verifies whether some security service was asked for a logged user, and returns the user. And it was asked EXACTLY one time. No more, no less.<\/p>\n<p>Wait, what? How come we have a security service in here? The name of the test doesn\u2019t say anything about security or users, why do we check it?<\/p>\n<p>Well, it\u2019s the first mistake. This part is not, what we want to verify. This is probably required by the controller, but it only means it should be in the \u201cgiven\u201d. And it should not verify that it\u2019s called \u201cexactly once\u201d. It\u2019s a stub for God\u2019s sake. The user is either logged in or not. There is no sense in making him \u201clogged in, but you can ask only once\u201d.<\/p>\n<p>Then, there is the second line.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">1 * commandNotificationServiceMock.notifyAccepters(_)<\/pre>\n<p>It verifies that some notification service is called exactly once. And it may be ok, the business logic may require that, but then\u2026 why is it not stated clearly in the name of the test? Ah, I know, the name would be too long. Well, that\u2019s also a suggestion. You need to make another test, something like \u201cshould notify about newly created outlet insert command\u201d.<\/p>\n<p>And then, it\u2019s the third line.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">0 * _._<\/pre>\n<p>My favorite one. If the code is Han Solo, this line is Jabba the Hut. It wants Hans Solo frozen in solid concrete. Or dead. Or both.<\/p>\n<p>This line, if you haven\u2019t deducted yet, is \u201cYou shall not make any other interactions with any mock, or stubs, or anything, Amen!\u201d.<\/p>\n<p>That\u2019s the most stupid thing I\u2019ve seen in a while. Why would a sane programmer ever put it here? That\u2019s beyond my imagination.<\/p>\n<p>No it isn\u2019t. Been there, done that. The reason why a programmer would use such a thing is to make sure, that he covered all the interactions. That he didn\u2019t forget about anything. Tests are good, what\u2019s wrong in having more good?<\/p>\n<p>He forgot about sanity. That line is stupid, and it will have it\u2019s vengeance. It will bite you in the ass, some day. And while it may be small, because there are hundreds of lines like this, some day you gonna get bitten pretty well. You may as well not survive.<\/p>\n<p>And then, another line.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">Outlet.count() == 0<\/pre>\n<p>This verifies whether we don\u2019t have any outlets in the database. Do you know why? You don\u2019t. I do. I do, because I know the business logic of this domain. You don\u2019t because this tests sucks at informing you, what it should.<\/p>\n<p>Then there is the part, that actually makes sense.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">OutletInsertCommand.count() == 1\r\ndef savedCommand = OutletInsertCommand.get(1)\r\nsavedCommand.mobileMoneyAccountNumber == '1000000000000'\r\nsavedCommand.accountType == CyclosAccountType.NOT_AGENT<\/pre>\n<p>We expect the object we\u2019ve created in the database, and then we verify whether it\u2019s account is \u201cnew\u201d. And we know, that the \u201cnew\u201d means a specific account number and type. Though it screams for being extracted into another method.<\/p>\n<p>And then\u2026<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">controller.flash.message != null\r\nresponse.redirectedUrl == '\/outlet\/list'<\/pre>\n<p>Then we have some flash message not set. And a redirection. And I ask God, why the hell are we testing this? Not because the name of the test says so, that\u2019s for sure. The truth is, that looking at the test, I can recreate the method under test, line by line.<\/p>\n<p>Isn\u2019t it brilliant? This test represents every single line of a not so simple method. But try to change the method, try to change a single line, and you have big chance to blow this thing up. And when those kinds of tests are in the hundreds, you have concrete all over you code. You\u2019ll be able to refactor nothing.<\/p>\n<p>So here\u2019s another lesson. It\u2019s not enough to verify the right thing. You need to<\/p>\n<h4 id=\"verify-only-the-right-thing\"><strong>VERIFY ONLY THE RIGHT THING.<\/strong><\/h4>\n<p>Never ever verify the algorithm of the method step by step. Verify the outcomes of the algorithm. You should be free to change the method, as long as the outcome, the real thing you expect, is not changed.<\/p>\n<p>Imagine a sorting problem. Would you verify it\u2019s internal algorithm? What for? It\u2019s got to work and it\u2019s got to work well. Remember, you want good design and security. Apart from this, it should be free to change. Your tests should not stay in the way.<\/p>\n<p>Now for another horrible example.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">@Unroll(\"test merchant constraints field #field for #error\")\r\ndef \"test merchant all constraints\"() {\r\n  when:\r\n    def obj = new Merchant((field): val)\r\n\r\n  then:\r\n    validateConstraints(obj, field, error)\r\n\r\n  where:\r\n    field                     | val                                    | error\r\n    'name'                    | null                                   | 'nullable'\r\n    'name'                    | ''                                     | 'blank'\r\n    'name'                    | 'ABC'                                  | 'valid'\r\n    'contactInfo'             | null                                   | 'nullable'\r\n    'contactInfo'             | new ContactInfo()                      | 'validator'\r\n    'contactInfo'             | ContactInfoFactory.createContactInfo() | 'valid'\r\n    'businessSegment'         | null                                   | 'nullable'\r\n    'businessSegment'         | new MerchantBusinessSegment()          | 'valid'\r\n    'finacleAccountNumber'    | null                                   | 'nullable'\r\n    'finacleAccountNumber'    | ''                                     | 'blank'\r\n    'finacleAccountNumber'    | 'ABC'                                  | 'valid'\r\n    'principalContactPerson'  | null                                   | 'nullable'\r\n    'principalContactPerson'  | ''                                     | 'blank'\r\n    'principalContactPerson'  | 'ABC'                                  | 'valid'\r\n    'principalContactInfo'    | null                                   | 'nullable'\r\n    'principalContactInfo'    | new ContactInfo()                      | 'validator'\r\n    'principalContactInfo'    | ContactInfoFactory.createContactInfo() | 'valid'\r\n    'feeCalculator'           | null                                   | 'nullable'\r\n    'feeCalculator'           | new FixedFeeCalculator(value: 0)       | 'valid'\r\n    'chain'                   | null                                   | 'nullable'\r\n    'chain'                   | new Chain()                            | 'valid'\r\n    'customerWhiteListEnable' | null                                   | 'nullable'\r\n    'customerWhiteListEnable' | true                                   | 'valid'\r\n    'enabled'                 | null                                   | 'nullable'\r\n    'enabled'                 | true                                   | 'valid'\r\n}<\/pre>\n<p>Do you understand what\u2019s going on? If you haven\u2019t seen it before, you may very well not. The \u201cwhere\u201d part, is a beautiful Spock solution for parametrized tests. The headers of those columns are the names of variables, used BEFORE, in the first line. It\u2019s sort of a declaration after the usage. The test is going to be fired many times, once for for each line in the \u201cwhere\u201d part. And it\u2019s all possible thanks to Groovy\u2019s Abstract Syntaxt Tree Transofrmation. We are talking about interpreting and changing the code during the compilation. Cool stuff.<\/p>\n<p>So what this test is doing?<\/p>\n<p>Nothing.<\/p>\n<p>Let me show you the code under test.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">static constraints = {\r\n  name(blank: false)\r\n  contactInfo(nullable: false, validator: { it?.validate() })\r\n  businessSegment(nullable: false)\r\n  finacleAccountNumber(blank: false)\r\n  principalContactPerson(blank: false)\r\n  principalContactInfo(nullable: false, validator: { it?.validate() })\r\n  feeCalculator(nullable: false)\r\n  customerWhiteListEnable(nullable: false)\r\n}<\/pre>\n<p>This static closure, is telling Grails, what kind of validation we expect on the object and database level. In Java, these would most probably be annotations.<\/p>\n<p>And you do not test annotations. You also do not test static fields. Or closures without any sensible code, without any behavior. And you don\u2019t test whether the framework below (Grails\/GORM in here) works the way it works.<\/p>\n<p>Oh, you may test that for the first time you are using it. Just because you want to know how and if it works. You want to be safe, after all. But then, you should probably delete this test, and for sure, not repeat it for every single domain class out there.<\/p>\n<p>This test doesn\u2019t event verify that, by the way. Because it\u2019s a unit test, working on a mock of a database. It\u2019s not testing the real GORM (Groovy Object-Relational Mapping, an adapter on top of Hibernate). It\u2019s testing the mock of the real GORM.<\/p>\n<p>Yeah, it\u2019s that stupid.<\/p>\n<p>So if TDD gives us safety, design and feedback, what does this test provide? Absolutely nothing. So why did the programmer put it here? Because his brain says: tests are good. More tests are better.<\/p>\n<p>Well, I\u2019ve got news for you. Every single test which does not provide us safety and good design is bad. Period. Those which provide only feedback, should be thrown away the moment you stop refactoring your code under the test.<\/p>\n<p>So here\u2019s my lesson number three:<\/p>\n<h4 id=\"provide-safety-and-good-design-or-be-gone\"><strong>PROVIDE SAFETY AND GOOD DESIGN, OR BE GONE.<\/strong><\/h4>\n<p>That was the example of things gone wrong. What should we do about it?<\/p>\n<p>The answer: delete it.<\/p>\n<p>But I yet have to see a programmer who removes his tests. Even so shitty as this one. We feel very personal about our code, I guess. So in case you are hesitating, let me remind you what Kent Beck wrote in his book about TDD:<\/p>\n<blockquote><p>\nThe first criterion for your tests is confidence. Never delete a test if it reduces your confidence in the behavior of the system.<br \/>\nThe second criterion is communication. If you have two tests that exercise the same path through the code, but they speak to different scenarios for a readers, leave them alone.<br \/>\n[Kent Beck, Test Driven Development: by Example]\n<\/p><\/blockquote>\n<p>Now you know, it\u2019s safe to delete it.<\/p>\n<p>So much for today. I have some good examples to show, some more stories to tell, so stay tuned for part 2.<\/p>\n","protected":false},"excerpt":{"rendered":"Have you ever been in a situation, where a simple change of code, broke a few hundred tests?&hellip;\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11],"tags":[411],"class_list":{"0":"post-10139","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-development-design","7":"tag-spock"},"_links":{"self":[{"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/posts\/10139","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/comments?post=10139"}],"version-history":[{"count":11,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/posts\/10139\/revisions"}],"predecessor-version":[{"id":15566,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/posts\/10139\/revisions\/15566"}],"wp:attachment":[{"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/media?parent=10139"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/categories?post=10139"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/tags?post=10139"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}