
Thursday, September 3, 2009

Philosophy of Software Test

I have been a test software engineer professionally for three years (going on four). I have done everything from writing automated GUI control test scripts in Visual Basic to embedded test systems engineering in C for a Wind River operating system running Ada with C interfaces. I know that software test is important for any application. I have seen products fail test evaluation that would otherwise have gone to the customer.

Unfortunately, not everyone shares my opinion on the matter. Some feel that test software is wasted code, providing no economic gain over the long term of the project. These people tend to be the ones who write the checks for your project, so you usually have to do some serious convincing to get them on board. Once you do, you still need a good method of approaching your test system; otherwise you will be wasting money, a lot of money.

Here I will try to lay out some basic principles and guidelines to follow when writing test code for any project.

Step 1: Determine your system's primary method of outputting debug information
Many applications use the standard console window as their primary destination for debug output. I do not recommend this, the reason being that once you have 100+ modules that you are trying to integrate and test, the debug information spewing to your console will be unreadable.

A better solution is to output all of your debug data to a common log file for the system. I will elaborate in a bit on why a single log beats one log per module.
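To give a concrete picture, here is a minimal sketch of what a centralized log might look like in C++. The Logger class, the file name, and the module-tag scheme are my own illustration, not any particular framework:

// logger.h -- a minimal, hypothetical centralized debug log.
// Every module writes through the same Logger instance so all
// runtime output lands in one file, tagged with the module name.
#include <fstream>
#include <mutex>
#include <string>

class Logger {
public:
    static Logger& instance() {
        static Logger log("system_debug.log");    // single log file for the whole system
        return log;
    }

    void write(const std::string& module, const std::string& message) {
        std::lock_guard<std::mutex> lock(mutex_);  // safe if several threads log at once
        file_ << module << ": " << message << '\n';
        file_.flush();                             // keep the log usable after a crash
    }

private:
    explicit Logger(const std::string& path) : file_(path, std::ios::app) {}
    std::ofstream file_;
    std::mutex mutex_;
};

// Usage from any module:
//   Logger::instance().write("Dictionary", "Error, key lookup failed for id 42");

Every module then calls into the same instance, so all of the system's runtime output ends up interleaved in one searchable file.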

Step 2: Design your Test Strategy Early
This is probably the biggest problem for people. They know what type of software they want to write, so they go for it. Then, approximately 50-70% of the way through development, they realize that their system is poorly designed and half of their code doesn't work. The reason is that people generally do not take into consideration that it takes a while to get a system into a testable state. By the time you have gotten to that point, much of your code may be wrong and no longer true to the original design of the system. Trust me on this one: I have worked on major projects that have gone astray from their original design. No matter how well the system functions after that, you are always bending your own rules to get things to work right. This rule bending is painful and costly; sometimes whole portions of the system need to be redesigned to accommodate some minor tweak made to fix an issue, which results in a poorly structured architecture/test combination.

The key to success is to decide how you are going to systematically test each component (software or hardware) before integrating it into the entire system. This is called a functional test or a unit-driven test, depending on which camp you come from. The idea is that each module has a set of tests that exercise its basic functions as they would be used in the actual system. I do not mean to say that you need to write a whole platform to test the module; that would defeat the purpose of testing. What I mean is that you create a harness to place your module in that exercises the function's inputs and outputs to verify that your internal algorithms behave appropriately. It is probably a good idea for this to be a centralized harness that lets you turn specific modules' tests on and off, so that as you incorporate more modules into your system, they are not only easy to add but can also be tested together within the same test structure, as sketched below.
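Here is a minimal sketch of what such a centralized harness might look like, assuming a simple registry of named self-test functions; the TestHarness name and its methods are illustrative rather than any particular framework:

// test_harness.h -- a hypothetical central harness that registers each
// module's self-test and lets you enable or disable tests by name, so new
// modules can be added and exercised together under one structure.
#include <functional>
#include <iostream>
#include <map>
#include <string>

class TestHarness {
public:
    void add(const std::string& module, std::function<bool()> test) {
        tests_[module] = test;
        enabled_[module] = true;
    }

    void enable(const std::string& module, bool on) { enabled_[module] = on; }

    // Run every enabled self-test and report pass/fail per module.
    int runAll() {
        int failures = 0;
        for (const auto& entry : tests_) {
            if (!enabled_[entry.first]) continue;
            bool passed = entry.second();
            std::cout << entry.first << ": " << (passed ? "PASS" : "FAIL") << '\n';
            if (!passed) ++failures;
        }
        return failures;
    }

private:
    std::map<std::string, std::function<bool()>> tests_;
    std::map<std::string, bool> enabled_;
};

As modules come online, each one registers its self-test with the harness, and disabling a module's test becomes a one-line change rather than a rework of the test platform.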

An example of this is a custom dictionary. Say your custom dictionary is going to be used by two separate modules within the system, for example as a means of passing data between two separate threads. If you write your dictionary with a set of tests that cover each of its functions by supplying artificial data for each of its fringe cases, then you can be confident that the dictionary itself is functional. Once you start implementing the worker threads that are going to access this dictionary, you will obviously want to test each of their functions (including the functions that write or read data from the dictionary) in the same manner as you tested the dictionary in the first place. This all seems to check out, and all is going well. Then, once we start integrating, we see a problem: for some reason we are getting corrupt data from one module in the other module. We can't just rerun our tests, because they only prove out their individual components. However, if we design our system in such a way that a fully tested module can be used to write data to the dictionary while the test that exercises the dictionary's read function verifies the result, we have now eliminated one third of the potential problem area of the system. The same can then be done against the other module.
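As a rough illustration of the first half of that example, here is what a fringe-case self-test for the dictionary might look like. The Dictionary class shown is a hypothetical stand-in for whatever custom container you actually wrote:

// dictionary_test.cpp -- sketch of a self-test for a hypothetical Dictionary
// class, feeding it artificial data for its fringe cases so the container can
// be trusted before the worker threads that share it are integrated.
#include <map>
#include <string>

// Stand-in for the custom dictionary under test.
class Dictionary {
public:
    bool write(const std::string& key, int value) { data_[key] = value; return true; }
    bool read(const std::string& key, int& value) const {
        std::map<std::string, int>::const_iterator it = data_.find(key);
        if (it == data_.end()) return false;
        value = it->second;
        return true;
    }
private:
    std::map<std::string, int> data_;
};

bool testDictionaryFringeCases() {
    Dictionary dict;
    int out = 0;

    // Reading a key that was never written must fail cleanly, not return garbage.
    if (dict.read("missing", out)) return false;

    // A written value must read back unchanged.
    if (!dict.write("health", 100)) return false;
    if (!dict.read("health", out) || out != 100) return false;

    // Overwriting an existing key must return the newest value.
    dict.write("health", 50);
    dict.read("health", out);
    return out == 50;
}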

This type of approach will help you narrow down the possible problem areas of your system. A useful strategy that I have implemented in the past for checking for memory leaks within a data structure was to create a separate test module that simply queried my memory manager to print out its allocation table. I then set up this test module to run between integration components and modules' self-tests. This provided a quick way to identify whether a section of the system was using more memory than expected. It was also useful because it exercised the functions of my memory manager as well as the memory manager's integration with the other components in the system.
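A sketch of that idea, assuming a hypothetical MemoryManager that can report its current allocation count, might look like the following; the placeholder bodies would be replaced with calls into your real allocation table:

// memory_check.h -- sketch of the "run between tests" memory check described
// above. Snapshots are compared before and after each module's self-test to
// flag unexpected growth.
#include <cstddef>
#include <iostream>
#include <string>

// Stand-in interface for the project's memory manager.
struct MemoryManager {
    // Placeholder bodies -- swap in the real allocation-table queries.
    static std::size_t bytesAllocated() { return 0; }
    static void printAllocationTable() {}
};

class MemoryCheck {
public:
    MemoryCheck() : baseline_(0) {}

    void before(const std::string& phase) {
        phase_ = phase;
        baseline_ = MemoryManager::bytesAllocated();
    }

    void after() {
        std::size_t now = MemoryManager::bytesAllocated();
        if (now > baseline_) {
            std::cout << "MemoryCheck: " << phase_ << " grew the heap by "
                      << (now - baseline_) << " bytes\n";
            MemoryManager::printAllocationTable();  // show which allocations are still live
        }
    }

private:
    std::string phase_;
    std::size_t baseline_;
};

Running before() and after() around each module's self-test points a finger at the exact phase where the heap unexpectedly grew.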


Step 3: Let your Architects/Designers Identify your tests
The reason for this one is that unit tests should be defined before any system code is written, and who better knows what data will be required for the inputs and outputs of your functions than your architects?

Step 4: Identify WHAT to test and HOW MUCH to test it
Of course, if you are the architect, then this section is for you.

This is actually a tricky question. Many times your test team will want to test every component and module in the system. In a perfect world, this would always be the case. In the real world, however, cost and schedule demand that you use agile practices and identify key areas of testing. If you were to test everything, writing the test code would take more time than writing the actual code. Performing modular testing is part of agile development practice because it allows you to release stable code and integrate new features into your software more quickly and with fewer bugs, yet it still generally takes too much time to perform modular testing on every component in the system. The trick is to rely on your foundation classes.

APIs such as the .NET Framework and the Apple OS are expected to be stable. Their vendors perform their own unit testing and provide their APIs to test developers to work with and exercise before releasing them to the general public. By utilizing their work, you avoid having to write most of your own data structures, and as a result you save both development time and test development and execution time. In my experience, many companies seem reluctant to rely on these APIs. I am not sure why, but if faced with the dilemma of either using a pre-built table class or rolling your own, think twice about rolling your own. When you are writing software for yourself or for your company, it's not about academics; it's about money, maintainability, and rapid, stable development.
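As a small illustration of leaning on a foundation class, the standard library's map can stand in for a hand-rolled table, leaving only your own integration point to test (the names below are hypothetical):

// Rather than writing (and then having to unit-test) a custom table class,
// lean on the already-exercised standard container and spend your test
// budget on the code that is unique to your product.
#include <map>
#include <string>

std::map<std::string, int> scoreTable;  // foundation class, already proven by its vendor

void recordScore(const std::string& player, int score) {
    scoreTable[player] = score;  // only this integration point needs your own test
}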

Also, it is unfortunate, but anytime your company runs into a budget or schedule crunch, test will be the first thing to go. Managers' priorities lie with meeting their deadlines, not with making sure the product is 100% quality assured. This is even more the case when it comes to writing test software to test out hardware. Generally, a computer engineer designing a board for some device will perform a flying-probe or functional board test to verify that the module works (similar to my description of unit-driven testing of software above). As a result, if you cannot finish your testing in time for the product release, more than likely the computer engineer/manager will decide that the functional board test was sufficient to show the product is working... This is wrong. The reasoning is the same as the reasoning behind why a module can pass a unit-driven test 100% yet fail once it is integrated into the rest of the software system: it is impossible to fully understand how two devices are going to behave together before actually hooking them up and making them communicate with each other.

Step 5: Debug Information - Better Information is Better Information
As I stated before, you really should centralize your debug data. It makes sorting through your runtime output much faster than if it were spread across individual files for each module. I know this seems counterintuitive, but realize that text editors, along with processors and RAM, have improved over the years, and searching a text file is no longer a computationally heavy task. That is assuming you don't have a hundred megs of output, which may happen at times, but most of the time it won't.

Now, what I mean by "Better Information is Better Information" is that the quantity of information is not necessarily proportional to its quality. Sometimes having too much information is more detrimental than having too little; on the other hand, having too little information can also mask your problem. The best bet is to decide on a format for your debug output early. This will allow you to quickly identify the major components of each debug message. My preferred format is:

[module debug information is coming from]:[message (error, recovery, status)], [message specific data], [expected results], [actual results] [optional time stamp].

This format is uniform and quickly identifiable. For example, if you have a player class that checks collision and you want to identify during runtime whether you collided with other objects in the scene, it might look like this:

"Player: Error, Collision encountered with [object] at (x, y, z), Collision flag = true, [object] Type should not be collidable." 

Now here we can see that there was an error thrown by our error manager framework. It was output to the log, and it contains all the data we need to see what is going on. It appears that the problem lies in the Player's collision function (at least at first glance) and that the [object]'s collision type was not set to collidable. If this is not the case, the only other possibility is that the [object]'s collision mode read function is not passing data back to the player class correctly for some reason: either the player is overriding the data it receives, or the data is corrupt by the time it gets there. I personally think this is much better than something that looks like this:

t: 0
Player: (x, y, z)
Object: (x2,y2,z2)
t: 1
Player: (x, y, z)
Object: (x2,y2,z2)
t: 2
Player: (x, y, z)
Object: (x2,y2,z2)
t: 3
Player: (x, y, z)
Object: (x2,y2,z2)
Collision: (x, y, z)

Or even worse, no data at all. The difference is that in the better version we only report on a failure or critical status. This culls data that is unimportant to us, making it much easier to find what we are looking for.
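To tie this back to Step 1, here is a sketch of a small helper that emits messages in the format above; the Severity names and the formatReport signature are my own illustration, and the result would be handed to whatever centralized log you use:

// debug_report.h -- sketch of a helper that builds messages in the format
// described above: [module]:[severity], [specific data], [expected results],
// [actual results] [timestamp].
#include <ctime>
#include <sstream>
#include <string>

enum Severity { kError, kRecovery, kStatus };

inline std::string severityName(Severity s) {
    switch (s) {
        case kError:    return "Error";
        case kRecovery: return "Recovery";
        default:        return "Status";
    }
}

inline std::string formatReport(const std::string& module, Severity severity,
                                const std::string& detail,
                                const std::string& expected,
                                const std::string& actual) {
    std::ostringstream out;
    out << module << ": " << severityName(severity) << ", " << detail
        << ", " << expected << ", " << actual
        << " [t=" << std::time(0) << "]";   // optional time stamp
    return out.str();
}

// Example: the player collision message from above.
//   formatReport("Player", kError,
//                "Collision encountered with [object] at (x, y, z)",
//                "Collision flag = true",
//                "[object] type should not be collidable");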

I could continue to talk about this topic all day, but this post is getting a bit long, so I will cut this session short for now. I will probably add more short discussions about testing philosophy here, as that is what I do for a living. I am starting to implement a new game idea, and I will be applying some of these practices in that title; I will share my experience doing so here. If you have any comments on this, I am always willing to be persuaded toward better methods, so don't feel shy about telling me you think I am wrong. I hope this has helped some of you identify the weaknesses in your testing strategies. Remember, it's O.K. to be OCD when it comes to software perfection.
