My father, Woody, is a risk analyst. So I asked him, knowing my math skills, where I should start learning how to analyse and assess risk. With the personal commentary removed, here’s his answer.
Math is not very important, at least not at the beginning. Risk assessment is really just thinking hard about answers to the three fundamental questions: what can go wrong, how likely is it, and what are the consequences?
Look at what you do at work. How can good answers to the three questions mitigate the (bad) consequences of poor decisions?
Do a pilot study with an up-coming decision.
Remember that “what can go wrong?” means an analysis, starting from a choice or initiating event (like a 3-day power black-out in Chicago), of the sequence of events and failures of systems to control the events, bad human decisions, etc. Each sequence ends up in a bad situation or an ok situation. “How likely is it?” is just the likelihood of that sequence occurring, usually measured by a probability for each event in the sequence, either through data or expert judgement. “What are the consequences?” means, for every bad ending of a sequence, what are the consequences of that bad state.
Make a decision-tree or event tree to enumerate the sequences. Each branching point (or top event) can have a fault tree to represent how that branch point fails or succeeds, or just expert judgement.
Represent the likelihood of failure as a number between 0 and 1 (then success will equal 1-failure).
Choose an end state for each sequence. Multiply the numbers for each branch point to get the likelihood of the sequence.
Add up all the sequence likelihoods for the same end-state.
That’s all there is to it.
When you put it that way, it does look pretty simple.
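To check that I had the arithmetic straight, here is a minimal sketch of those last few steps in Python. Everything in it is an assumption on my part rather than anything Dad gave me: the two branch points (a backup generator and a VPN), the probabilities, and the end-state names are all made up for illustration. It only shows the mechanics: multiply the numbers along each sequence, then add up the sequences that share an end state.

```python
from collections import defaultdict
from math import prod

# Hypothetical branch points for a blackout scenario. These numbers are
# invented for illustration, not taken from data or expert judgement.
P_GENERATOR_FAILS = 0.1
P_VPN_FAILS = 0.3

# Each sequence is (branch probabilities along the path, end state).
# Success at a branch point is 1 - failure.
sequences = [
    ([1 - P_GENERATOR_FAILS], "ok"),                       # generator works
    ([P_GENERATOR_FAILS, 1 - P_VPN_FAILS], "ok"),          # generator fails, VPN holds
    ([P_GENERATOR_FAILS, P_VPN_FAILS], "office offline"),  # both fail
]


def end_state_likelihoods(sequences):
    """Multiply along each sequence, then sum sequences with the same end state."""
    totals = defaultdict(float)
    for branch_probs, end_state in sequences:
        totals[end_state] += prod(branch_probs)
    return dict(totals)


totals = end_state_likelihoods(sequences)
for end_state, likelihood in totals.items():
    print(f"{end_state}: {likelihood:.2f}")
```

Because every branch point splits into failure and success, the end-state totals should add up to 1, which is a handy sanity check on a tree like this.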
So I went through a proof of concept process. This is my first time making a fault tree, and I didn’t bounce it off my father.

As you can see, this is pretty basic. What can go wrong? A lot actually, and I wasn’t really doing more than picking the common problems. But this is a fault tree, not a decision tree. Are they different? They are! A fault tree is basically what you use to suss out why things go wrong. With a decision tree, though, we make a list of decisions and spin out what the likelihood of a failure is. So my decision here is “How should I upgrade WP? Stable or Pre-Release?”

Here you’ll see this is similar enough, but wait! I have funny numbers! That’s my guesstimate at how likely these are to cause problems. See, if you don’t have high tech skills, using SVN to upgrade is higher risk. In this world, you want a lower number. Like if you look at the stable release, you’d see that it adds up to a .4 failure, or a 1% chance that it’ll fail because of the upgrade tool or the user’s tech skills, but a higher 2% for ‘breaks’ (by which I mean you have a crappy plugin or theme).
Now I left off things like the fact that with SVN/Nightly/Beta/RC you get the cool toys early, mostly for space and since this is a proof of concept. It’s clear that SVN is something only experienced people should play with, but it’s very possible I’ve scored Beta/RC too high. They’re sort of a break-even point, though. While Stable will always be recommended, I did a quick revamp of Nightly and Beta/RC. Nightlies are more risky because you run a risk of getting an incomplete build (that is, some of my bored maniac friends may be checking in code and not be 100% done when you run your update, a common weird issue with SVN and why I always svn up before I consider reporting a bug). But a Beta/RC is a ‘very nearly done’ cake, just missing the icing.

Version two is, you can see, very similar. Personally I consider this a ‘start’ to understanding the risks inherent in a WordPress upgrade. If you held a gun on me and demanded I explain where I got the numbers, I would call them educated guesses, based on the forums, the mailing lists and my personal experience. Dad would say ‘Expert Judgement.’
My next steps are to read up more on the process of using decision trees, specifically in relation to software. While I certainly will also be looking into how a tornado in downtown Chicago would impact my office (can I get to work? No? Okay, so VPN. Can it take 5,000 people at once? Based on Snowmageddon last year, no. etc etc and so on), understanding the logic trees behind the forms is always my first step.
To my WordPress friends, please let me know if I scored things too high or low in this one! To the rest of you, let me know if you use these sorts of things in your jobs and, if so, how. I’d love to see some real-world applications outside the financial world!