Sunday, June 05, 2005

Software Estimation (Part 2) - Nuts and Bolts

I think I should preface this post with the notion that what we're trying to do here is come up with a very rough, gross estimate of a project very early in its life-cycle. Even the most elaborate software estimation process will have an accuracy of -50% to +75%, so we're really just trying to get in the ballpark. We don't want to spend thousands of dollars on the early estimating process, especially if this is being done before an agreement or contract has been signed.

Use Case Point counting is similar to the Function Point counting technique in that you basically:

  1. Count certain key aspects of your collected requirements.
  2. Answer a series of questions about the nature of your project and your development environment.
  3. Process the questions based on a pre-defined formula to come up with an adjustment figure (i.e., your fudge factor).
  4. Multiply your raw count by your fudge factor to arrive at an adjusted size figure.
  5. Multiply the resulting figure by a number of man-hours/point to arrive at your expected level of effort.

Counting

Here is a simple use case diagram that I'm going to use as an example. (UML 2 Primer)


Actors

The first thing we count are the actors of our system and assign them a rough difficulty.

Actor Weights

| Difficulty | Rule | Use Case Points |
|---|---|---|
| Easy | A machine with a programmable API | 1 |
| Medium | A user with a command line interface or a system without a published API | 2 |
| Difficult | A user with a GUI | 3 |


In my example, we have 1 easy and 1 difficult actor. Doing the math, we have 1 easy actor x 1 point + 1 difficult actor x 3 points = 4 Raw Actor Points (RAP).
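
The actor count is just a weighted sum, so a few lines of Python can make it concrete. This is only a sketch: the weights come from the actor table above, while the names `ACTOR_WEIGHTS` and `raw_actor_points` are made up for illustration.

```python
# Actor weights from the table above: easy = 1, medium = 2, difficult = 3.
ACTOR_WEIGHTS = {"easy": 1, "medium": 2, "difficult": 3}


def raw_actor_points(actor_counts):
    """Sum (number of actors x weight) over each difficulty class."""
    return sum(ACTOR_WEIGHTS[difficulty] * count
               for difficulty, count in actor_counts.items())


# The example diagram: 1 easy actor and 1 difficult actor.
rap = raw_actor_points({"easy": 1, "difficult": 1})
print(rap)  # 4
```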

Use Cases (good book on Use Cases)
Next, we do the same sort of thing for the Use Cases we have identified in our project. In this case, the weights for easy, medium and difficult are 5, 10 and 15 points respectively. To determine the complexity, we count (or predict) the number of scenarios the Use Case will have. Each use case will almost always have at least 2 scenarios: your basic success scenario (where everything goes right) and at least one alternate scenario that deals with errors. You can certainly have many more that deal with alternate logic paths.

Use Case Weights

| Difficulty | Rule | Use Case Points |
|---|---|---|
| Easy | 3 or fewer execution paths or scenarios | 5 |
| Medium | 4 to 8 execution paths or scenarios | 10 |
| Difficult | More than 8 execution paths or scenarios | 15 |

In my example, I've assigned medium difficulty to "View Summary Data", since I expect this Use Case to have several different paths based on search criteria. The rest I've assigned to the "Easy" category since they are relatively simple.

Again, doing the math, I have 4 easy Use Cases x 5 points + 1 medium Use Case x 10 points = 30 Raw Use Case Points (RUP, not to be confused with the Rational Unified Process).
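
Here is a rough Python sketch of the same classification and sum. The thresholds and weights are taken from the use case table above. The scenario counts are assumptions, since the post only says that "View Summary Data" is medium and the other four use cases are easy, and the function names are hypothetical.

```python
# Use case weights from the table above.
USE_CASE_WEIGHTS = {"easy": 5, "medium": 10, "difficult": 15}


def classify_use_case(scenario_count):
    """Map a scenario (execution path) count to a difficulty class."""
    if scenario_count <= 3:
        return "easy"
    if scenario_count <= 8:
        return "medium"
    return "difficult"


def raw_use_case_points(scenario_counts):
    """Sum the weight of each use case, classified by its scenario count."""
    return sum(USE_CASE_WEIGHTS[classify_use_case(n)] for n in scenario_counts)


# Assumed scenario counts: one medium use case (5 paths, standing in for
# "View Summary Data") and four easy ones (2 paths each).
example_scenario_counts = [5, 2, 2, 2, 2]
rup = raw_use_case_points(example_scenario_counts)
print(rup)  # 30
```

Because only the category matters, a miscount of a path or two rarely changes the result unless a use case sits right on a threshold.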

Calculating the Fudge Factor
This is where we adjust our raw total of use case points to account for characteristics that are specific to our system under design (Technical Complexity Factors) as well as characteristics about the environment that the project will be built in (Environmental Complexity Factors).

Technical Complexity Factors

For each of the 13 technical complexity factors, you are going to assign a value between 0 and 5. A zero indicates that a factor does not exist in your project, and a five indicates that the factor has the maximum importance or impact on your system. The weights are pre-defined by the estimating method. I don't really fool with those. I think you should either wait until there is some industry movement to modify those figures, or wait until you have enough historical data collected to prove them wrong. Changing the Value column gives me enough impact on the factors for my taste.

| Metric | Description | Weight | Value | TCF (Weight x Value) |
|---|---|---|---|---|
| TCF01 | Distributed System | 2.00 | 5.00 | 10.00 |
| TCF02 | Response or throughput performance objectives | 1.00 | 4.00 | 4.00 |
| TCF03 | End user efficiency (online) | 1.00 | 2.00 | 2.00 |
| TCF04 | Complex internal processing | 1.00 | 4.00 | 4.00 |
| TCF05 | Code must be re-usable | 1.00 | 2.00 | 2.00 |
| TCF06 | Easy to install | 0.50 | 5.00 | 2.50 |
| TCF07 | Easy to use | 0.50 | 3.00 | 1.50 |
| TCF08 | Portable | 2.00 | 3.00 | 6.00 |
| TCF09 | Easy to change | 1.00 | 3.00 | 3.00 |
| TCF10 | Concurrent | 1.00 | 2.00 | 2.00 |
| TCF11 | Include special security features | 1.00 | 2.00 | 2.00 |
| TCF12 | Provide direct access for third parties | 1.00 | 5.00 | 5.00 |
| TCF13 | Special user training facilities are required | 1.00 | 3.00 | 3.00 |

Unadjusted TCF Value (UTV) total: 47.00

Adding up all the TCF lines gives us an Unadjusted TCF Value (UTV). To arrive at the final Technical Complexity Factor, we apply the following formula:

TCF = TC + (UTV * TWF)
Where
TCF Constant (TC) = 0.60
TCF Weighting Factor (TWF) = 0.01

Applying the formula to our example, we have

TCF = 0.60 + (47.00 * 0.01) = 1.07
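
To double-check the arithmetic, here is a small Python sketch of the TCF calculation. The weights, values and constants are copied from the table and formula above. The function and variable names are hypothetical, and the 0 to 5 range check is an addition of this sketch rather than part of the method.

```python
TCF_CONSTANT = 0.60          # TC in the formula above
TCF_WEIGHTING_FACTOR = 0.01  # TWF in the formula above

# (weight, value) pairs for the 13 technical factors, in table order.
TECHNICAL_FACTORS = [
    (2.0, 5), (1.0, 4), (1.0, 2), (1.0, 4), (1.0, 2), (0.5, 5), (0.5, 3),
    (2.0, 3), (1.0, 3), (1.0, 2), (1.0, 2), (1.0, 5), (1.0, 3),
]


def technical_complexity_factor(factors):
    """Return (UTV, TCF) for a list of (weight, value) pairs."""
    for _, value in factors:
        if not 0 <= value <= 5:
            raise ValueError("factor values must be between 0 and 5")
    utv = sum(weight * value for weight, value in factors)
    return utv, TCF_CONSTANT + utv * TCF_WEIGHTING_FACTOR


utv, tcf = technical_complexity_factor(TECHNICAL_FACTORS)
print(utv, round(tcf, 2))  # 47.0 1.07
```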

Environmental Complexity Factors
The Environmental Complexity Factor is calculated the same way as the Technical Complexity Factor. You just use a different set of attributes and constants.

| Metric | Description | Weight | Value | ECF (Weight x Value) |
|---|---|---|---|---|
| ECF01 | Familiar with Rational Unified Process | 1.50 | 4.00 | 6.00 |
| ECF02 | Application experience | 0.50 | 3.00 | 1.50 |
| ECF03 | Object-oriented experience | 1.00 | 4.00 | 4.00 |
| ECF04 | Lead analyst capability | 0.50 | 4.00 | 2.00 |
| ECF05 | Motivation | 1.00 | 3.00 | 3.00 |
| ECF06 | Stable requirements | 2.00 | 4.00 | 8.00 |
| ECF07 | Part-time workers | -1.00 | 0.00 | 0.00 |
| ECF08 | Difficult programming language | -1.00 | 3.00 | -3.00 |

Unadjusted ECF Value (UEV) total: 21.50

Notice that ECF01 references a specific software life-cycle model. A lot of the work for this method has been done by the people at Rational/IBM, which explains the item. You could probably substitute your own life-cycle process here with similar results, as long as the organization you are doing the work for has a mature process. Most of the organizations I've done work for rank really low in this area. A lot of CF shops seem to add management processes as an afterthought, with little regard for formal life-cycle management.

While I'm on the subject, it seems to me that the CF community spends a huge amount of effort building software frameworks (Fusebox, Mach-II, Model-Glue, etc.). While this is certainly worthwhile, life-cycle processes seem to be almost ignored by the community. About the only work I've seen in the area is FLiP, and it only goes so far. Maybe it's one of those things that everybody is doing but nobody feels is interesting enough to talk about on a regular basis. I get the impression that life-cycle management is handled using the "back of a napkin" or "I do that in my head" approach. I seriously hope that this changes as more CF shops mature. (End of rant.)

Adding up all the ECF lines gives us an Unadjusted ECF Value (UEV). To arrive at the final Environmental Complexity Factor, we apply the following formula:

ECF = EC + (UEV * EWF)
Where
ECF Constant (EC) = 1.40
ECF Weighting Factor (EWF) = -0.03

Applying the formula to our example, we have

ECF = 1.40 + (21.50 * -0.03) = 0.755
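
The same kind of check works for the ECF. In this Python sketch the weights, values and constants are copied from the environmental table and formula above, and the names are again hypothetical. The weighting factor is negative, so a stronger environment (a higher UEV) lowers the ECF and shrinks the estimate.

```python
ECF_CONSTANT = 1.40           # constant term in the ECF formula above
ECF_WEIGHTING_FACTOR = -0.03  # negative: a better environment lowers the ECF

# (weight, value) pairs for the 8 environmental factors, in table order.
ENVIRONMENTAL_FACTORS = [
    (1.5, 4), (0.5, 3), (1.0, 4), (0.5, 4),
    (1.0, 3), (2.0, 4), (-1.0, 0), (-1.0, 3),
]


def environmental_complexity_factor(factors):
    """Return (UEV, ECF) for a list of (weight, value) pairs."""
    uev = sum(weight * value for weight, value in factors)
    return uev, ECF_CONSTANT + uev * ECF_WEIGHTING_FACTOR


uev, ecf = environmental_complexity_factor(ENVIRONMENTAL_FACTORS)
print(uev, round(ecf, 3))  # 21.5 0.755
```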

Finishing Up

We now have all the numbers we need to calculate our total use case points for the project.

Here's the formula:

Use Case Points (UCP) = (RAP + RUP) * TCF * ECF

Applying the formula to our example and rounding, we get:
UCP = (4 + 30) * 1.07 * 0.755 = 27
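
As a quick sanity check on that figure, the final step can be written as a one-line Python function. The inputs are the example's numbers from above, and the function name is just for illustration.

```python
def use_case_points(rap, rup, tcf, ecf):
    """Adjusted size: raw actor plus use case points, scaled by both factors."""
    return (rap + rup) * tcf * ecf


ucp = use_case_points(rap=4, rup=30, tcf=1.07, ecf=0.755)
print(round(ucp, 2), round(ucp))  # 27.47 27
```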

Hours/Use Case Point
Now that we have an estimate of size based on use case points, we can try to convert it to an estimate of effort. The original work by Karner suggested a good starting point would be to use 20 hours per use case point. Work since then has suggested that the figure can fall anywhere between 10 and 30 hours per point. My personal experience puts the number between 12 and 15 hours per point. Unfortunately, this is where you need some project history to calibrate the process for your specific development and estimating processes.
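
To see how much the hours-per-point figure moves the result, this Python sketch applies each rate mentioned above to the example's 27 use case points. The rates are the ones quoted in the text, while the labels and names are only illustrative.

```python
# Hours-per-point figures quoted in the paragraph above.
HOURS_PER_POINT = {
    "low end of reported range": 10,
    "author's experience (low)": 12,
    "author's experience (high)": 15,
    "Karner's starting point": 20,
    "high end of reported range": 30,
}


def effort_hours(ucp, hours_per_point):
    """Convert a size in use case points to an effort in hours."""
    return ucp * hours_per_point


ucp = 27  # rounded result from the example
estimates = {label: effort_hours(ucp, rate)
             for label, rate in HOURS_PER_POINT.items()}
for label, hours in estimates.items():
    print(f"{label}: {hours} hours")
```

For the example this spans 270 to 810 hours, with 324 to 405 hours at the 12 to 15 hours per point range, which is why calibrating against your own project history matters so much.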

Here's an Excel spreadsheet to help you crunch the numbers.

Coming Up Next: Working with the factors

Part 1: What Method to Use
Part 2: Nuts and Bolts
Part 3: Working with the Factors

8 comments:

  1. Hi Mike, I guess the use cases would need to be at the user goal level. In the case of a CRUD-type use case, would you expect this to be split into several smaller use cases for estimation purposes, or would you count it as a single use case?

    Regards

    Russell

  2. Hi Russell,

    There are several ways you can go with this, but there really is no substitute for experience here. Calibrating your estimating process always requires a few projects.

    You're correct in that I tend to do my use case estimation at the user goal level. I find that it gives me a good level of detail that I can share with other non-technical people and they will have an easy time understanding what I'm talking about.

    You could certainly go either way with your CRUD cases. I tend to just use one and maybe adjust the complexity if I know it's going to be a beast. Most of the time, I don't think a simple delete operation warrants a whole use case, but you'll have to judge for yourself.

    Keep in mind that there is a LOT of abstraction built into this method of estimating. I guess that can be one of its shortcomings.

  3. Hi Mike, I think you are right and that the best way to understand the techniques is to try them on a number of projects and calibrate if necessary. Thanks for a very informative blog.

    Regards

    Russell

  4. Hi Mike,

    Very good info. I compared the model with a real existing project in my company, and the results were close.

    I would like to know if this estimate of work includes the testing investments (for example the testers man-hours)

    Thank you,

    Ricardo.

  5. Hi Ricardo,
    There is nothing specifically in the estimation method regarding amounts of testing time. That would fall into the hours/use case point calculation which requires some personal calibration. The really important point here is that you remain consistent when you calibrate. If you include some amount for testing per use case point, see if you can use the same number each time or better yet, a weighted average. I'd probably consider throwing out the upper and lower extremes as well.

  6. Hi Mike,

    At my company we capture requirements in use case text format. Use cases are written to a fair level of detail.

    While using Use Case Point Estimation, we need to classify the use cases into categories like Simple, Average and Complex.

    I have come across criteria like transactions, analysis classes and scenarios/flows to classify the use case. But I am not sure which is the correct way.

    Even in the case of transactions, it is not very clear what constitutes a transaction.

    A use case consists of basic and alternate flows, and these flows have steps.
    I have found it difficult to have common criteria to decide the number of transactions in a given flow.
    I have come across different criteria like: Transaction is
    - all or nothing
    - exchange between user and system
    - adds business value
    - step in goal-level use case

    I would highly appreciate it if you could advise me on how to calculate the number of transactions.

    Thanking you in advance.

    Sachin Raverkar

  7. Hi Sachin,

    Estimating sizes of use cases is definitely one of the biggest challenges we face using systems like this. I've read a lot about what constitutes a use case transaction and this is generally what I use.

    A transaction is a unique path through your use case. So in its simplest sense, you might be able to just count your main success scenario and all your alternate flows. Unfortunately, I constantly find use case drafts with conditional logic in an alternate path. In that case I count each outcome as a separate transaction.

    One of the nice things about Use Case Estimation is that the precision of your count can be pretty coarse. All you're really looking to do is get your estimate clearly into one of the categories, so you can miss a few or add a few "transactions" and not blow your estimate.

    For what it's worth, I really dislike the use of the word "transaction" when describing use cases. It's used so many other places in programming that it always introduces some confusion into the discussion.

    Good Luck!

  8. Thank you Mike for this post. I'm using Enterprise Architect too for my use cases.
    I'm working on PHP projects.
    I would like to know what the TCF constant (0.6) and the weighting factor (0.01) are.

    My problem is: I have 4 easy use cases
    the estimated effort result is 80 hours.
    8 hours a day
    10 UCP
    unadjusted TCF: 14.9 (the values estimated between 0 and 5)
    -> TCF = 0.749
    unadjusted ECF: 24.25
    EWF: -0.03
    EConstant: 1.4
    -> ECF = 0.6725

    For these use cases my developer experience tells me I would take 40 hours to manage this.

    Should I change the constants or slightly reduce my TCF and ECF values?

    thanks
    Psy
