Christin's blog on testing: 2011

16 December, 2011

BBST Test Design

Short summary for the restless and easily bored:

Take all courses that the Association for Software Testing (AST) is offering; they are worth every minute and cent you spend on them. Just trust me on this.

For those of you with too much time and nothing better to do:

I feel a deep respect for people who are good at what they do, and take pride in their work. Exactly what they do is irrelevant (as long as it is legal), but I enjoy watching a skilled worker do a good job. In fact, I like it just as much as I dislike watching an unskilled or uninterested person do a lousy job. Of course pride and skill go together - how can you feel pride unless you do a good job? And in order to do a good job I think two things are more important than anything else: a genuine interest and a drive to constantly progress and improve.

I am very interested in software testing, and I always want to do a good job. I take pride in what I do, and I constantly try to learn more and get better by reading, writing, listening, talking and discussing. And by taking courses.

I recently took the AST course BBST (Black-Box Software Testing) Test Design, which was the best software testing course I have ever taken. Why?

To start with I like the format of the BBST courses. Taking the courses online means I can do the work when it fits my schedule, and also when I am the most focused and effective. Furthermore, the BBST courses are very interactive, which is crucial for my (and most other people's) learning. I especially like that all courses include peer review. Having to review the work of a fellow student means that you need to know the topic well yourself and be able to go beyond reciting the lecture notes. And, you always learn something (usually a lot) by how someone else answered a question or solved an assignment.

Test Design is a topic I have given courses on myself, and care very much about. It seems that in general, focus is often on test execution, whereas test design is sometimes, unfortunately, neglected. The BBST Test Design course gives the best overview of test techniques that I have ever encountered, and a lot of concepts that used to be vague to me have now been sorted out and clearly defined. I enjoyed how the course presented a great variety of techniques, but focused on a few and really went through them in detail with hands-on exercises. The beauty of the course is that it gives a skill set that is ready to be applied immediately. I learnt practical things that I will start using right now.

I am also impressed by the course material, which is extensive, well structured and of very high quality. Knowing how much work developing a course is, I am humbled.

As with all other BBST courses I have taken, a bonus is the other students. Having so many smart, interested and enthusiastic testers with such different backgrounds to discuss testing with is a rare pleasure.

In short, I enjoyed the course and learnt a lot, and I recommend that you take BBST Test Design too.

10 November, 2011

Follow-up on xBTM

Background

At STARWest 2011, I gave a talk about xBTM together with Michael Albrecht. Jon Bach was in the audience, and he gave us some very valuable feedback on our idea and how we presented it. This blog post is a follow-up on our conversation in Anaheim.

Recently I was testing version 1.5 of AddQ's SBTM reporting tool SBTExecute. The team wanted to release the new version as soon as possible, and I was travelling so there was not so much time for testing. We agreed that I would spend one morning - four hours - testing the final version. My method of choice was of course xBTM. In this blog post I will go through how I spent those four hours, and what was the outcome. For more background on xBTM and the tool, please refer to my earlier post.

Test plan

My initial step was to make a mind map test plan. My preferred (free) mind mapping tool is XMind. I opened a new mind map and placed my application under test (SBTExecute) as the central topic. Next I identified my key areas (also known as function areas). I spent a couple of minutes thinking about some obvious groups and came up with:

Configuration
Documentation
Running tool
Import
Generation
Report

I added these six key areas as subtopics in the mind map. Then I decided I also wanted to do some stress testing and added that as a subtopic too. See Figure 1.

Figure 1: Mind map test plan. The central topic (SBTExecute 1.5) is the software under test. The subtopics are key areas, or test techniques, and grouped under the key areas are the test threads.

Then I spent about twenty minutes thinking about test ideas, or test threads, and writing them down in the mind map under the appropriate key area. Since this was not the first time I tested the tool, I already had some ideas of things to test. After a total of maybe half an hour or less, I had my test plan, see Figure 1.

Testing

I decided to start by testing Import, since that is the key feature of the tool: if you cannot import data from the session reports, nothing else is really worth testing. I looked at the threads I had listed under the import node and figured that I could probably test all of them in one 45 min session. To show this in the mind map, I changed the colour of all those threads to blue, see Figure 2.

Figure 2: Testing all threads under the key area Import in one session.

I write my session reports using the SBTExecute Excel template, and I use the mind map note functionality to connect the session report to the mind map. The yellow paper icon on the subtopic Import shows that there is a note, which can be viewed and edited by clicking the icon, see Figure 3.

Figure 3: Adding notes in the mind map program. The note refers to the corresponding session report.

I ran the session, making notes in the session report. When I felt I was "done" testing a specific thread in that session, I visualised that in the mind map by adding a green check icon to that thread. Once in session, I realised that I didn't really want to test the thread Incorrect data, since there is data validation in the Excel template. I decided to pause that thread for now, and maybe come back to it later if I had time. Hence I added the pause icon to that thread. I also added some notes on why I paused the thread, see Figure 4.

Figure 4: Pausing the thread Incorrect data.

Then I realised I had forgotten to prioritise the key areas, so I did that using the colourful number icons in the mind map program. The highest priority is 1 and the lowest 6. See Figure 5.

Figure 5: Adding priority.

It turned out that the tool only reads files in xlsx format, and not old xls files. I wasn't sure if this was a feature or a bug, so I marked the thread with a question mark and made a note of it, see Figure 6.

Figure 6: When there are questions, the thread is marked with a question mark and a note is added.

I ended the session, but the fact that xls files were not read made me curious so instead of starting a new session I decided to look at the configuration key area for a while. I spent a few minutes trying to create session reports in Office 97-2003 and Open Office and have the tool read them. These two threads were tested as threads rather than in a session. I only spent a few minutes on them since it was a low-priority area, made a short note, then paused the threads and decided to come back later. If I were to resume these threads (in the end I never did), I would keep making notes in the notes window, see Figure 7.

Figure 7: Testing two threads for a short time, then pausing them.

Next I decided to test all threads under the key area Generation in a session, the same way as I did with Import, see Figure 8.

Figure 8: Testing all threads under the key area Generation in a session.

Here I found a few defects, which is illustrated by the red X in the mind map. Whenever I found a bug, I made a note in the mind map of the ID number and added a short description. Of course this information was also added to the session report, see Figure 9.

Figure 9: Defects found are marked by a red X, and a note of the ID number is added.

After completing the Generation session, I started looking at the Report key area. Here I decided that the two threads Iteration Reports and Summary Reports were extensive enough to make up a session, see Figure 10.

Figure 10: Testing two threads in a session.

After running three sessions, and testing two threads separately, I had the following status in my mind map, see Figure 11.

Figure 11: Test status after running three sessions and testing two threads separately.

The documentation for the tool consists of three manuals, and I felt they were best tested as threads. At this point I was running out of time, and decided to simply quickly skim through the manuals. I used the partially filled square icons to visualise how far I felt I had gotten with the manuals and made short notes in the mind map (no session report since I tested them as threads). The final couple of minutes I decided to spend on testing running the tool with correct parameters, which I also tested as a thread since there was not enough time for a session, see Figure 12.

Figure 12: Testing threads, using the partially filled squares to visualize progress.

Test report

Finally my four hours were up and I stopped all testing activities. At this stage I had a test report in the shape of the latest version of the mind map, see Figure 13.

Figure 13: Test report.

I also had three session reports (Import, Generation and Report) and an error list.

Given more time, I would have returned to the paused or partially tested threads and continued, adding notes in the notes window. I would also have spent some time on the previously untouched threads. Due to the very limited time in this case, the above story is not a very good example of thread-based testing, but I hope to have one soon.

08 November, 2011

The Return Of Dendrograms

What is Dendrogram-Based Testing? Well, what is a dendrogram to start with?

A dendrogram is a tree diagram that visualises hierarchical clustering. If that didn't help, a dendrogram basically groups objects in a tree view based on how similar they are. The closer the objects are drawn, the more similar they are.

Thanks for the maths lesson, but how is that useful in testing?

Good question. I'll come back with a final conclusion later in this post, but I can think of two uses for dendrograms:

Clustering defects: Visually show how similar the defects previously found are.

Clustering test charters (test cases): Visually show how similar planned test charters (or test cases) are.

In order to create dendrograms we need the objects, e.g. defects, to have such properties that we can measure distances between them. This is where it starts getting tricky - how do we measure the distance between two defects? The simplest thing to do is to think of properties we believe to be important and then assign them numeric values.

One example could be the property "User" (P1) and we could assign a defect a value between 0 and 5 for this property depending on how affected we think the user is by this bug. Another property could be "Performance" (P2) or "Business" (P3). Imagine we are testing a web shop and have two defects:

B1: The "This is a gift" checkbox is missing in the GUI.
B2: Memory issue that slows shopping down when you have more than 10 items in your cart.

Each of the two bugs has the properties P1, P2 and P3, and we might assign values as follows:

B1: P1=5, P2=0, P3=2
(the user is affected, the performance is not but the business flow is also affected)
B2: P1=3, P2=5, P3=0
(some users will be affected, the performance is affected, the business flow is not affected)
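To make "distance" concrete, here is a minimal sketch in Python using the B1 and B2 values from above. It assumes the Euclidean distance is the measure; other measures would work too, and the function name is just my own choice for illustration.

```python
import math

# Property values copied from the example above: (P1 User, P2 Performance, P3 Business)
defects = {
    "B1": (5, 0, 2),
    "B2": (3, 5, 0),
}

def euclidean(a, b):
    """Straight-line distance between two equally long tuples of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

distance = euclidean(defects["B1"], defects["B2"])
print(f"Distance B1-B2: {distance:.2f}")
```

With only two defects there is nothing to cluster yet, but the same function gives every pairwise distance once more defects are added.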

Based on these properties we can now see how similar the defects are in a dendrogram. In my earlier post I explained how to create a defect dendrogram with a simple example, and I'm not going to repeat that.

Similarly, we can assign test charters or test cases properties and create dendrograms. Here properties could be which actors, functions or areas are involved, and the dendrogram shows a kind of test coverage. If all test charters are grouped together, they test very similar things.
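Properties like "which areas are involved" are sets rather than numbers, so they need a different distance. One possible choice (my assumption, not something prescribed here) is the Jaccard distance, sketched below in Python. The charters C1 to C3 and their areas are made up purely for illustration.

```python
# Hypothetical charters, each described by the set of areas it touches.
charters = {
    "C1": {"login", "cart"},
    "C2": {"login", "cart", "payment"},
    "C3": {"search"},
}

def jaccard_distance(a, b):
    """1 minus (shared items / all items): 0 means identical, 1 means nothing shared."""
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)

names = sorted(charters)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        d = jaccard_distance(charters[first], charters[second])
        print(f"{first}-{second}: {d:.2f}")
```

In this made-up data C1 and C2 overlap heavily and would be drawn close together, while C3 shares nothing with either and would sit on its own branch.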

So how do we base our testing on dendrograms?

A defect dendrogram would of course be used to decide where to focus testing. I think isolated defects would be my priority. A single defect far away from all other defects seems too unlikely, maybe there are more hiding that need to be discovered. Then again, if a large number of defects are very similar there is reason to believe that area requires special attention.

A test charter dendrogram would of course be used to help decide which charters to add. A single isolated test charter might be ok for a low-risk area, but might also be a warning flag.

Is this useful?

I have some serious doubts. Firstly, we need to find useful properties and assign them subjective values. The dendrogram will be based on those values and nothing else, so there is a huge risk of bias. Secondly, I have yet to find a good tool to use to draw dendrograms. With more than three variables (defects/test charters) and two or more properties it cannot be done by hand. Of course, writing your own tool would not be too complicated.
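As a rough idea of what such a home-made tool could look like, here is a minimal Python sketch of agglomerative clustering. It assumes numeric properties, Euclidean distance and single linkage (clusters are compared by their closest members); all of those are my choices for the sketch, and the items T1 to T4 with their values are invented. It lists the merges rather than drawing anything.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Straight-line distance between two equally long tuples of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster(points):
    """Single-linkage agglomerative clustering.

    points: dict mapping a label to a tuple of numeric property values.
    Returns the merges as (cluster_a, cluster_b, distance) in the order
    they happen. Each cluster is a tuple of labels.
    """
    clusters = [(label,) for label in sorted(points)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Single linkage: distance between the two closest members.
            d = min(euclidean(points[p], points[q]) for p in a for q in b)
            if best is None or d < best[2]:
                best = (a, b, d)
        a, b, d = best
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges

# Made-up values for three properties, purely for illustration.
example = {
    "T1": (5, 0, 2),
    "T2": (4, 1, 2),
    "T3": (0, 5, 0),
    "T4": (1, 4, 1),
}

for a, b, d in cluster(example):
    print(f"merge {'+'.join(a)} with {'+'.join(b)} at distance {d:.2f}")
```

The list of merges is exactly the information a dendrogram draws: each merge is a join in the tree, and the distance is the height of that join. For actual drawing, SciPy's `scipy.cluster.hierarchy` module has `linkage` and `dendrogram` functions, although I have not evaluated how well they fit this use.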

Right now I don't think the value gained outweighs the effort needed. I'm very interested in hearing arguments that I'm wrong though.

23 August, 2011

Schools of testing?

CAST 2011 hosted an interesting debate between Doug Hoffman and James Bach on the topic of schools of software testing. The question under discussion was not whether there are different schools of thought within the testing community or not, but rather whether naming the schools and associating people with them is a good - or really bad - idea.

The debate was energetic, and clearly provoked a strong reaction in a lot of the attendees, which was only expected. The core issue is of course if it is ok to categorize people without bothering with their opinion. Most people categorize others, but hate when they themselves are put in a category that they do not approve of, or think they should belong to. It is a very touchy subject.

Personally, I like it.

To me, the fact that someone is associated with a school of thought corresponds to me being provided with a table of contents of a book. Let me try to explain. If person A says to me "- Person B belongs to the XYZ school", it provides me with a limited amount of information about person B, just like browsing a table of contents tells me something, but not everything, about the book. Immediately - without having to read the whole book (i.e. without having to have a deep discussion with the person) - I get a rough idea of the contents (i.e. the person's views and ideas). The same way I do not mind being associated with a school, or associating myself with a school. I find it helpful because I do not have to explain my general views over and over again, I just need to state which school(s) I consider myself belonging to. Sometimes I will disagree when others associate me with a certain school, but that on the other hand gives me valuable clues as to how I am perceived. And it might even make me change my behaviour.

However, I do assume that people are mature and intelligent enough to realize that a table of contents can be misleading, and in order to get the full story you actually have to read the book. You cannot know a person without having talked to them and having formed your own opinion.

I think the concept of schools of testing is helpful, and in all honesty - even if it was rejected people would still categorize each other 'secretly'. I would rather have it done openly so you at least can have a discussion.

18 August, 2011

CAST 2011

I'm back from CAST 2011, and I've had some time to digest the experience and think about what thoughts I want to share. There have been many excellent write-ups that give a detailed account of what transpired at the conference, and there is no need for another (worse) one. Instead, I'll make some short remarks on my impressions.

I had very high expectations, but I'm still amazed.

What drives me in life in general and as a tester in particular is a continuous strive forwards and a desire to learn and progress. I have no sympathy whatsoever for people who seem to consider testing to be nothing more than a way to pass time and earn your paycheck. What made CAST 2011 such a fantastic experience was that it was a gathering of enthusiastic, engaged, creative and ambitious testers. All willing both to learn and to teach. Everyone was friendly and approachable and willing to share. I was in awe of all the experience and knowledge that was surrounding me.

Even though there were of course different opinions on various topics on a smaller scale, it was fantastic to see such a large body of people all strive in the same general direction, sharing the same goal. And I find it very comforting to see people (testers) take such pride in their work.

I learned a lot and got a bunch of new ideas to try out, but mainly I was just soaking up the joy and energy. Thank you everyone who attended and thereby contributed to making CAST 2011 one of the best conferences I've ever been to.

I'm proud to be a tester.

17 July, 2011

Dendrogram-Based Testing

Friday afternoon I was looking through the latest tweets when my eye was caught by the phrase Dendrogram-Based Testing. I like all words that have a Greek origin and sound like science, so I had a closer look, and of course it was James Bach introducing a new concept. One that - as far as I understand - is still pretty much missing a definition. No reason to let a small detail like that stop you, right?

After reading up on dendrograms I realized that I have actually used them before, but didn't know they were called dendrograms. A dendrogram is basically a way to take data points and cluster them based on their properties. They are commonly used in computational biology, and that's where I encountered them. About a year ago I spent a week of my vacation making dendrograms from genome data.

In testing, one way to use dendrograms would be to cluster defects. In order to do this you would need to define a set of properties for each defect, and based on these properties it would be possible to calculate distances between the defects and cluster them in a dendrogram. I will save the discussion on whether this is useful or not for later.

Example: The android game SuperTester

To the best of my knowledge this game does not exist, but if it did it would be a game in which the player has to find critical bugs in imaginary applications under time pressure. Just for the record, I haven't thought too much about this so I'm just making it up as I go along.

Let's say five bugs have been found when testing the actual SuperTester game:

1. Can't save game (D1)
2. Can't change sound volume (D2)
3. It is possible to register the same bug twice (D3)
4. Application crashes if you find exactly 13 bugs (D4)
5. Application crashes if you play for more than 59 minutes and 59 seconds (D5)

Now we need to assign these defects properties in order to cluster them. This is the tricky part and requires some careful thinking. Which properties you pick decides what information you will get out of the dendrogram. For now I'm just going to pick two properties for the sake of creating an example:

1. Frequency of occurrence on a scale 1-10, where 1 is rarely and 10 often (P1)
2. Severity of defect on a scale 1-10, where 1 is not severe and 10 is very severe (P2)

Time to make a table:

Defects with assigned properties P1 and P2.

Ok, the values might not make so much sense but let's ignore that for now. We have everything we need to calculate the distances. Note that all this assumes that the properties are numeric. If you have other properties such as "red" or "green", you need to decide how to calculate the distance between "red" and "green". For numbers we use the Euclidean distance. I'm not going to go through all the boring details, but the distance table will look like:

Distance table.
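For those who do want the boring details, here is a minimal Python sketch of how such a distance table can be computed. Note that the (P1, P2) values below are invented for the sketch, since the original table is not reproduced in this text; I picked them so that D2 and D4 come out closest.

```python
import math
from itertools import combinations

# Invented (P1 frequency, P2 severity) values, NOT the values from the original table.
values = {
    "D1": (8, 6),
    "D2": (2, 3),
    "D3": (5, 1),
    "D4": (3, 4),
    "D5": (1, 9),
}

def euclidean(a, b):
    """Euclidean distance between two (P1, P2) pairs."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

labels = sorted(values)

# Print the full symmetric distance table.
print("     " + "".join(f"{name:>6}" for name in labels))
for row in labels:
    cells = "".join(f"{euclidean(values[row], values[col]):6.2f}" for col in labels)
    print(f"{row:<5}{cells}")

# The pair with the smallest distance is the first one to be clustered.
closest = min(
    combinations(labels, 2),
    key=lambda pair: euclidean(values[pair[0]], values[pair[1]]),
)
print("Closest pair:", closest)
```

The table is symmetric with zeros on the diagonal, so only the pairs on one side of the diagonal need to be compared when looking for the closest pair.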

D2 and D4 are closer to each other - that is, more similar - than any other defects. Hence we cluster them in cluster A. And so it goes on. What we end up with is the following dendrogram:

Dendrogram. Defects are clustered based on properties.

That's it! Two important points here are i) you need a tool because you definitely do NOT want to do this by hand and ii) how about non-numeric properties, how do you measure distances? I definitely think dendrograms can be useful, but a tool needs to be found and then there must be some thought on which properties to use - what kind of information do we want from the dendrogram?

This post is just a collection of initial thoughts and is focused on what a dendrogram is. I will now crawl back under the rock I came from and think more about how to actually use it for testing.

30 June, 2011

Craving conferences

On beautiful summer days like these, I have to admit that the main thing on my mind is being out in the sunshine, preferably eating ice-cream! However, I am also starting to feel keen on going to another test conference. Luckily, the late summer and fall look very promising.

For starters there is CAST 2011 which is held in Seattle by the Association for Software Testing in early August. The theme is "Context-Driven Testing", and I am of course going together with some colleagues - how could I resist the opportunity to hear about, and discuss, context-driven testing with like-minded people? It will be a blast!

Then I am proud to announce that I am speaking together with Michael Albrecht at the STARWEST conference, October 2-7, in Anaheim, California. I am especially happy to be able to provide North America with some Swedish test thinking. Why don't you join us? Register using special promo code SKWS and save up to $600 if you register by Super Early Bird August 5th! Click here to register online.

Like most people around the office, I cannot wait for my vacation to start, but I am really looking forward to coming back to an exciting fall!

25 May, 2011

Sometimes what tastes like mold actually is mold

Last night I enjoyed a test-oriented after work with Simon Morley, Oscar Cosmo and Daniel Berggren. Quite a bit of time was actually spent on group dynamics and the value of working in a team where the members have mixed backgrounds and experiences. We also talked about the importance of having opportunities to discuss testing outside of your team in order to get fresh ideas as well as feedback on your own ideas.

Then came the mold discussion. We got on to the subject of gut feeling – sometimes you just know something is not quite right but you do not have any hard evidence. Or you have a weird incident that only happened once, and cannot be reproduced, but you know that it is important and should be investigated, and still no one can be bothered.

Recently I was in Portugal on vacation, eating good food and drinking great wines. One evening I ordered a piece of blueberry pie for dessert. The pie arrived, beautifully covered in blueberries and nicely presented on the plate, and I dug in. It tasted a bit funny though, sort of like…mold. But I was in a nice restaurant, recommended by a local whom I was having dinner with, the main course had been fantastic and the slice of pie looked delicious so of course I kept eating it even though I could not get rid of that nagging feeling that something was not quite right.

Finally, when I only had a piece of crust left, my eye was caught by something bluish and fuzzy. Of course the crust was moldy! Probably the entire pie bottom had been moldy, and I had eaten it all up. Not once had I stopped to question if that funny flavour really should be present, nor had I stopped to examine the pie more carefully. I was fooled by the fact that I had been told that it was a good restaurant, and that the pie looked good.

This happens in testing too. You might be testing a third party product that ‘is known to be stable’, or asked to ‘just have a quick look because we know nothing has changed’. And it looks so good! But still, deep down, you know something is wrong. Trust your gut feeling, be courageous and be persistent. Sometimes what tastes like mold actually is mold.

…oh, I survived the pie just fine, no unpleasant after effects. Still, I learned my lesson.

03 March, 2011

The return of the context-driven physicist

I signed up for CAST 2011 as soon as registration opened since Henrik Andersson had told me it was The Conference To Attend. I didn't think too much about the topic - Context-Driven Testing - until I realised "everyone" was discussing whether they were context-driven testers or not. It's even the current poll on the AST homepage! It was time to do some fact finding followed by serious thinking - am I a context-driven tester?

The physicist in me (still going strong three years down the road) is nonplussed. It has never really occurred to me that it is possible to not be context-driven. Physicists are trained to be context-driven. In physics there is no such thing as one theory or formula that applies under all circumstances - on the contrary everything is highly context dependent. Like speed - when objects move fast enough relativistic effects have to be taken into account.

The (somewhat more quiet) statistician that also lurks at my inner core agrees. In order to interpret your data you have to know the context. Without context you can't know whether the data is best described by the normal distribution, or maybe the chi-square distribution.

As a scientist my approach is to first evaluate the context and then try to find the most suitable technique for solving the problem. You won't get far if you have a favourite formula that you insist on always using. Nature has no intention to adapt to you.

I did not give TBTM a try because I thought it was cool (it is) and it would make a good blog post (it did), but because I thought it would suit my context. My focus is always on solving the task at hand, not on the methodology or technique. I do try to learn as many techniques and methodologies as possible, but not to have a nice CV but rather to really be able to be context-driven. If my tool-box only contains one tool it’s darn difficult to adapt to circumstances.

I am a context-driven tester and proud of it.

28 February, 2011

Going to the extreme - xBTM

Now that the project has finished it is time to sum up my experiences of adapting Thread-Based Test Management (TBTM). Since I generally do not believe in rigorously adhering to a protocol, I ended up not using TBTM strictly, but instead embraced a hybrid of Session-Based Test Management (SBTM) and TBTM. Naturally this hybrid will be denoted xBTM.

The project

In order for this text to make sense, a few words on the project are needed. The customer runs an application that generates output which is stored daily in a single XML file, and later used by the customer’s own applications to derive vital business statistics. There were defects that needed correction, and the customer also asked for new features regarding how log on/log off was recorded. The corrections as well as the new requirements should be implemented in a new file that was supposed to be generated in parallel with the old file for a transition period.

The team

It was a small project. I acted as project manager, test manager and tester. Later in the project I was joined by a second tester. There was one single developer.

TBTM

Most of the time my working environment in general is simply too hectic and borderline chaotic for SBTM to be suitable. There tends to be a lot of interruptions and distractions, and it is rarely possible to sit down and focus on a single test task for a given time period. Therefore I decided to give TBTM a try when a new project started.

My first step was to make a mind map containing all my test ideas as threads. Since I am very fond of open source products I used the software FreeMind. In the mind map below, e.g. File Name is a thread. XML file is the actual product. ID denotes a defect, and CR denotes a new change request. The test threads are grouped in two ways. Most groups represent key areas, e.g. Generation which means file generation. Other groups are formed based on the type of testing, e.g. Stress testing. In some cases I felt short notes were needed to explain the thread, and in those cases I attached text files – knots – to the thread. These files are shown as red arrows in the mind map. You click on the arrow to open the text file.

Test plan.

I tried using colours and icons to make the mind map easier to read. The warning signs mark key areas that I judged to be high risk and especially important. The stop light marks a thread that was tied off. That particular change request was retracted by the customer just before the project started. The initial mind map as it was before I started testing made up my test plan and was sent to the customer.

During the test period I would constantly be updating the mind map and it would always give me an accurate picture of the current status of the testing. As soon as I started working on a thread I would mark it with a smiley, see image below. Threads where I found defects were marked with a red cross, and threads where I felt sufficient testing for delivery had been done (not the same thing as claiming to be done testing!) were checked off in green.

Snapshot of mind map in the middle of the test period.

Initially my goal was to write daily status reports containing a few short notes on what I had done. In reality I only kept this up for four days. With my constantly updated mind map and the occasional session report (see below) I really did not feel a need for it.

When there was no more time for testing, I took the current status of my mind map and used it as my test report, see below. This test report was sent to the customer.

Test report.

SBTM

I do still like SBTM and I find the session reports as well as the time-boxing very useful, so when circumstances would allow, I created test charters and ran time-boxed test sessions. Typically a couple of threads would make one session, but in some cases one thread deserved a session of its own. For example, looking at the test plan the two threads File name and File header belonging to the key area File generation would be tested in the same session, whereas Install from scratch and Install upgrade belonging to the key area Installation were tested in separate sessions.

In most cases I would draw a simple activity diagram – pattern – for each test charter, see example below. I prefer using yEd for my diagrams.

Test charter.

I wrote short session reports according to a new AddQ session report template, see image below.

SBTM session report template.

Using the AddQ tool SBTExecute I could then derive metrics such as Total session time (for all sessions), Number of test charters, Number of test charters per key area, and so on from my session reports. However, since I was still experimenting in this project, as well as working on the template and the tool as I went along, I could not obtain any useful statistics. Also note that any metrics derived would only be valid for my SBTM sessions – and a considerable part of the testing was done according to TBTM.

Summary report generated by SBTExecute.
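To illustrate the kind of metrics mentioned above, here is a minimal Python sketch. It is not how SBTExecute works internally; the record layout, the field names and the session lengths are all assumptions made up for the example, with only the charter and key area names taken from the test plan described earlier.

```python
from collections import Counter

# Hypothetical session records. Field names and minutes are invented;
# this is not the SBTExecute report format.
sessions = [
    {"charter": "File name and File header", "key_area": "File generation", "minutes": 45},
    {"charter": "Install from scratch", "key_area": "Installation", "minutes": 60},
    {"charter": "Install upgrade", "key_area": "Installation", "minutes": 45},
]

# Total session time (for all sessions).
total_minutes = sum(s["minutes"] for s in sessions)

# Number of distinct test charters.
charter_count = len({s["charter"] for s in sessions})

# Number of test charters per key area.
charters_per_area = Counter(s["key_area"] for s in sessions)

print(f"Total session time: {total_minutes} min")
print(f"Number of test charters: {charter_count}")
for area, count in sorted(charters_per_area.items()):
    print(f"  {area}: {count}")
```

Anything tested as a thread without a session report never shows up in numbers like these, which is why they only describe the SBTM part of the work.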

Summary and Conclusions

What is the main difference to previous projects I have worked on? Well...the first thing that comes to mind is that this time I actually used the test plan! And I have read and come back to the test report. Keeping the mind map up to date has been much easier than updating any other kind of status document, which means that it actually has been updated. As it turned out, even the developers liked it – they preferred looking at the mind map over using our bug tracking tool!

I immediately liked the combination of TBTM and SBTM – it is a simple matter of applying the methodology that is best suited for the task at hand. Some threads were fairly isolated and straightforward to test, making them suitable for SBTM. Other threads were very convoluted and had to be explored together with the developer, making them better suited for TBTM.