Sunday, April 20, 2008

Software Cost Estimating with Cocomo II

I'm reading the book Cocomo II. It describes a revised cost estimation model that comes mostly from Barry Boehm. As you can gather from previous posts, I'm sceptical of these models of process improvement and of managing projects through "numbers" and forms. In an ideal world, we would know everything at the start of the project and everyone would know how to do things most efficiently.

If you read the book only literally and pay attention just to the math, it won't get you far. The power of the book comes from interpreting the ideas behind the numbers and formulas. Only then should you use the numbers, and you should use them anyway, since they're the only factual foothold you'll get in a real-life situation. The rest are figments of our imagination or personal recalibrations that are influenced by our own ideas of project management.

First you should remember what the word "estimate" means. It's an expectation of cost, time or effort based on the information known at the time the estimate is produced. One of the strongest and most interesting statements is made in Chapter 1, where it becomes clear that Cocomo II doesn't give you drop-dead fixed deadlines or cost figures, even when your input numbers are correct. Instead it provides a guideline or foothold on a potential track for your project, and that track is subject to a potentially large deviation. The size of the deviation is determined by the quality and quantity of information at the outset of the project. So, the more you know beforehand, the more accurate your estimate will be. That is a logical given. Maybe that is also why you should consider redoing the estimate at different stages throughout the project.
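To make the "numbers" a bit more concrete, here is a minimal sketch of the effort equation as I understand it from the Cocomo II material: PM = A × Size^E × product of the effort multipliers, with E = B + 0.01 × the sum of the scale factors. The defaults A = 2.94 and B = 0.91 are the COCOMO II.2000 calibration as far as I know, so check them against the book. The function name and all input values in the example are my own assumptions and not taken from any real project.

```python
from math import prod


def cocomo_effort(ksloc, scale_factors, effort_multipliers, a=2.94, b=0.91):
    """Nominal effort in person-months: PM = A * Size^E * product(EM).

    ksloc              -- size in thousands of source lines of code
    scale_factors      -- ratings that drive the exponent E
    effort_multipliers -- cost driver ratings (1.0 means nominal)
    a, b               -- calibration constants (assumed COCOMO II.2000 values)
    """
    exponent = b + 0.01 * sum(scale_factors)
    return a * ksloc ** exponent * prod(effort_multipliers)


# Hypothetical inputs, purely for illustration.
effort = cocomo_effort(
    ksloc=40.0,
    scale_factors=[3.0, 3.0, 4.0, 3.0, 5.0],
    effort_multipliers=[1.1, 0.9],
)
print(f"Estimated effort: {effort:.1f} person-months")
```

The result is a single number, which is exactly the danger: nothing in it tells you how far off it may be.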

However, things go further than cost estimates. In estimates we often make assumptions and tend to think along a positive trail of project execution. That is, we generally like to disregard risks and things that will go wrong and just don't take them into account. Or we think we can squeeze the effort anyway and do more in the same time than initially envisaged. In order to be correct in the estimate, even if your boss won't like what you're telling him, you'll also need to factor the negative issues into the equation.

So, there must be a number of factors that negatively impact the project outcome as envisaged in the estimates. Consider these factors for a future project, since they increase the potential deviation significantly:
  • Incomplete definition of scope at the start of the project.
  • Unclear development process or not living up to that process.
  • A development team that doesn't communicate well or otherwise faces challenges in its communication.
  • Scope creep in future stages of the project.
Some people understand that projects above a certain size or cost have no chance of succeeding. That is because the environmental factors keep changing and, due to the size, the need for communication increases along with all the other factors. Simply put, you can't reliably estimate those projects, since there may be a deviation of a factor of 4-8.

If you think ahead to the completion of the project, it would be very easy to compile an estimate at that stage. You go through the project history, estimate how much was lost on each event (without looking at the real numbers) and chances are you'll arrive at a number that is about 90% accurate. But we can only do that because at the completion of a project, we have all the information we need to produce that estimate.

Now think towards the start of the project. What information do we have available and what are risks or issues that we should foresee?

From cost estimation, I think we can learn that reducing project risk and improving the chances of success comes down to increasing the amount of information available for developing the project. Think of methods like software prototyping, iterative deployment cycles, showing things early, etc. It's not yet proven that these give the desired results, because showing things early may also give your client the feeling that things can change at any stage in the process.

So, from all of this, I can conclude that the estimate itself is not the most valuable thing produced in the cycle; the accuracy of that estimate is. How much deviation can we expect overall? And how do we express it? Since we can't arrive at a reliable single estimate when initiating a project, what range can we give to project decision makers, our sponsors, so that we can inform them beforehand whether something should be done or not?
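One simple way to express such a range, as a sketch only: take the point estimate and divide and multiply it by a deviation factor. That the deviation is symmetric in this multiplicative sense is my own assumption, and the function name and the 100 person-month figure are hypothetical. The factors 4 and 8 are the deviation mentioned earlier for very large projects.

```python
def estimate_range(point_estimate, deviation_factor):
    """Return (low, high) bounds around a point estimate.

    Assumes the real outcome may be off by up to `deviation_factor`
    in either direction, so low = estimate / factor and
    high = estimate * factor.
    """
    if deviation_factor < 1:
        raise ValueError("deviation_factor must be at least 1")
    return point_estimate / deviation_factor, point_estimate * deviation_factor


# Hypothetical point estimate of 100 person-months.
for factor in (4.0, 8.0):
    low, high = estimate_range(100.0, factor)
    print(f"factor {factor:.0f}: between {low:.1f} and {high:.1f} person-months")
```

Handing a sponsor the pair of bounds instead of the single number makes the quality of the underlying information visible, which is the point of the argument above.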

Probably, this conclusion should result in a whole new way of software development. Something based on measuring the quality of the information, scope and specification available at the outset, plus a measure of the risk involved if the project proceeds on so little information.

2 comments:

jaco said...

With CPU and memory power you are not even close to producing or imitating a neural network. Bear in mind, computer languages are only able to process loops and if statements. A little if is out of the question.

Gerard Toonstra said...

Must be Friday. You posted this under software cost estimation.

I sort of agree. It's not about executing ifs or loops. ANNs are computation too, and you can express a 'match' when you measure the excitation of a certain cell assembly. How to measure and so on is a different question, as is the question of whether computation and simulation of behaviour produce actual consciousness.

The idea is to find out the limitations and constraints in different directions around the theory. So different perspectives need to be applied to the theory in order to map out the area.