
Measuring chances: Wai-Ching Leung explains odds and probabilities


Introduction

In my latest article in the studentBMJ, I noted that statistics are useful when events are not entirely predictable.1 An important use of medical statistics is to measure this level of certainty and estimate how likely certain events are to happen. One clinical example is our attempt to estimate from our clinical assessment how likely it is that a patient is suffering from a given disease. If a 70 year old woman is admitted to an emergency department with central chest pain but with a normal electrocardiogram, how likely is it that she has had a heart attack? Another example relates to hypothesis testing. Suppose we wish to find out whether brain tumours are associated with use of mobile phones. No matter how many people we collect data on, we can never be absolutely certain. Instead we measure the level of certainty by estimating how likely it is that the data we obtained would have occurred if brain tumours were not associated with mobile phone use.

I will discuss these two specific examples in more detail in future articles. In this article, I will look at how chance can be measured, some examples and one pitfall in interpretation that should be avoided.

Probability and odds

Chance can be measured in two ways. Probability is the more commonly used measure, and almost all medical students will have come across it. We can think of it as the number of ways an event can occur compared with the total number of possible outcomes. If we throw a dice, the probability that it will land on a "1" is 1/6, that it will land on an even number is 3/6=1/2, and that it will land on a number greater than 2 is 4/6=2/3. Sometimes we estimate probability from observations, as the ratio of the number of events to the total number of trials. For example, if we throw a dice 600 times and it lands on an odd number 306 times, we estimate the probability of an odd number to be 306/600=51%. Probability can range from 0 (impossible) to 1 (certain to happen).
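The dice calculations above can be checked by simple counting. The following is a minimal sketch in Python, assuming a fair six-sided dice; the helper name `probability` is introduced here purely for illustration.

```python
from fractions import Fraction

# The six equally likely faces of a fair dice (an assumption of this sketch).
FACES = range(1, 7)

def probability(event):
    """Count the faces for which `event` is true and divide by all faces."""
    favourable = sum(1 for face in FACES if event(face))
    return Fraction(favourable, len(FACES))

p_one = probability(lambda face: face == 1)        # 1/6
p_even = probability(lambda face: face % 2 == 0)   # 3/6 = 1/2
p_gt2 = probability(lambda face: face > 2)         # 4/6 = 2/3

# An estimate from observation: 306 odd results in 600 throws.
p_odd_observed = Fraction(306, 600)                # 51/100, that is 51%

print(p_one, p_even, p_gt2, float(p_odd_observed))
```

Exact fractions are used rather than decimals so that the results can be compared directly with the values in the text.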

In the past, people tended to be more familiar with the concept of odds than probability. Perhaps this is why we hear "the odds are against you" rather than "the probabilities are against you." Odds are the ratio of the number of events to the number of non-events. So the odds of a dice landing on "1" are 1/5, the odds that it lands on an even number are 3/3=1, and the odds that it lands on a number greater than 2 are 4/2=2. Odds can range from 0 (impossible) to infinity (certain to occur). Odds of 1 mean that the event is equally likely to occur or not to occur. Though less often used than probability, odds are occasionally more useful in medical statistics, and I shall illustrate this later in this article.
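The same counting approach gives the odds. This is a small sketch, again assuming a fair six-sided dice; the function name `odds` is hypothetical, and a certain event is represented by `math.inf` because it has no non-events to divide by.

```python
import math
from fractions import Fraction

# The six equally likely faces of a fair dice (an assumption of this sketch).
FACES = range(1, 7)

def odds(event):
    """Number of faces where `event` occurs divided by the number where it does not."""
    events = sum(1 for face in FACES if event(face))
    non_events = len(FACES) - events
    if non_events == 0:
        return math.inf  # certain to occur: odds are infinite
    return Fraction(events, non_events)

odds_one = odds(lambda face: face == 1)        # 1/5
odds_even = odds(lambda face: face % 2 == 0)   # 3/3 = 1
odds_gt2 = odds(lambda face: face > 2)         # 4/2 = 2
odds_certain = odds(lambda face: face >= 1)    # infinity
odds_impossible = odds(lambda face: face > 6)  # 0

print(odds_one, odds_even, odds_gt2, odds_certain, odds_impossible)
```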

Diagram 1

You will see from the equation above that for a given value of probability, you can work out from first principles the corresponding value of the odds and vice versa. For example, a probability of 3/4 means that out of a total of four possible outcomes, the event occurs on average three times but does not occur once. So, odds=3/1=3. Fig 1 shows the relation between probability and odds.
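For reference, the standard relation between a probability and the corresponding odds can be written as follows. The symbol p for the probability is introduced here for illustration and is not the author's own notation.

```latex
\[
\text{odds} = \frac{p}{1-p},
\qquad
p = \frac{\text{odds}}{1+\text{odds}}
\]
% Worked check with the example in the text, p = 3/4:
\[
\text{odds} = \frac{3/4}{1-3/4} = \frac{3/4}{1/4} = 3,
\qquad
p = \frac{3}{1+3} = \frac{3}{4}
\]
```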

Subjective factors in interpreting chances

Probability and odds are objective measures of chances, but it is important to note that we do not make decisions from such values alone. For example, should we prescribe a drug which has a 10% probability of inducing a side effect? It depends on several factors. We are more willing to prescribe if the side effect is minor (for example, skin rashes) than if it is serious (for example, death). Conversely, we are more willing to prescribe if the underlying disease is serious (for example, cancer) than if it is trivial (for example, a cold). Furthermore, some patients are more willing to accept a given chance of a side effect than others. If we tell our patients with a throat infection that there is a 10% chance that an antibiotic may cause nausea and vomiting, some patients may wish to take the risk whereas others do not.

Adjustment of our estimate of chance

Rather than an isolated event, we usually deal with a sequence of events in medical decision making. Suppose a woman presents with chest pain. When we take a history, she may or may not have associated clinical features of a heart attack. The electrocardiogram may or may not be abnormal. Subsequent cardiac enzymes may or may not be elevated. Clearly, our estimates of her chance of having suffered a heart attack will be adjusted and improved on with the extra information at each stage.

Example 1

Let us start off with a simple example using probability as a measure of chance. Suppose you pick one card from a deck of cards at random. The probability that it is the king of hearts is 1/52. Now, suppose your friend tells you that it is a red card. The probability that it is the king of hearts is now increased to 1/26, since there are only 26 red cards. Similarly, if she now tells you that it is a heart, the probability is further adjusted to 1/13. With the extra information at each stage, you are able to improve your estimate. If we use KH, R, and H to represent king of hearts, red card, and hearts, respectively, we can simply write our results as:

P(KH)=1/52
P(KH/R)=1/26
P(KH/H)=1/13

The symbol "/" stands for "given that." For example, P(KH/R) stands for "the probability that it is a king of hearts given that it is a red card."
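The card example can be reproduced by listing the deck and restricting it to the cards that are still possible after each piece of information. This is a minimal sketch assuming a standard 52 card deck without jokers; the names `DECK` and `conditional_probability` are illustrative only.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs

def conditional_probability(event, given=lambda card: True):
    """P(event given that `given` is true), by counting cards."""
    possible = [card for card in DECK if given(card)]
    matching = [card for card in possible if event(card)]
    return Fraction(len(matching), len(possible))

def is_king_of_hearts(card):
    return card == ("K", "hearts")

def is_red(card):
    return card[1] in ("hearts", "diamonds")

def is_heart(card):
    return card[1] == "hearts"

p_kh = conditional_probability(is_king_of_hearts)                   # 1/52
p_kh_given_red = conditional_probability(is_king_of_hearts, is_red)      # 1/26
p_kh_given_heart = conditional_probability(is_king_of_hearts, is_heart)  # 1/13

print(p_kh, p_kh_given_red, p_kh_given_heart)
```

Each extra piece of information shrinks the list of possible cards, which is why the probability rises from 1/52 to 1/26 to 1/13.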

Probability trees

Probability trees are useful for slightly more complicated examples.

Example 2

Let’s imagine at the end of your first year, 90% of all students pass all their summary examinations in May. You know from experience that among these successful students, about 80% leave the university city for their summer holidays and only 20% stay behind. Among those who are unsuccessful in one or more of their examinations, 90% stay in their university city to prepare for their resits, and only 10% leave. Suppose you stay behind in summer to continue your part time job and happen to see a fellow student in your year in your university town. What is the probability that he was unsuccessful in one or more of his first year examinations?

There are two sequential events, the examination results and whether the student stays behind. The probability tree is as follows.

All students can be placed into one of these four categories. The proportion of students in each of these categories can be calculated by multiplying the values of the two probabilities. For example, in the first category, out of 100 students, 90% (90 students) pass all papers. Out of the 90 students, 20% (20% x 90=18 students) stay behind.

Fig 1

How can diagram 1 help you to estimate the probability that your fellow student was unsuccessful in one or more of his papers? Before you meet him in the city, his probability of being unsuccessful was 10%, since this is the overall proportion of students who are unsuccessful. But you meet him in the city. This information should increase your estimate of the probability that he has been unsuccessful. From the probability tree, the fact that he stays over summer means that he can only be in one of the two categories highlighted in yellow. Out of 100 students, he can only be one of the 27 students (18 students who passed and 9 students who were unsuccessful). Therefore, your revised estimate that he was unsuccessful is 9/27=33.3%.
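The four categories of the probability tree, and the revised estimate, can be computed directly from the figures given in example 2. The sketch below is one way to lay the calculation out; the variable names are illustrative.

```python
from fractions import Fraction

# Figures from example 2.
p_pass = Fraction(90, 100)             # pass all examinations
p_fail = 1 - p_pass                    # unsuccessful in one or more
p_stay_given_pass = Fraction(20, 100)  # successful students who stay
p_stay_given_fail = Fraction(90, 100)  # unsuccessful students who stay

# Each category of the tree is the product of the two probabilities on its branch.
tree = {
    ("pass", "stay"): p_pass * p_stay_given_pass,         # 18 in 100
    ("pass", "leave"): p_pass * (1 - p_stay_given_pass),  # 72 in 100
    ("fail", "stay"): p_fail * p_stay_given_fail,         # 9 in 100
    ("fail", "leave"): p_fail * (1 - p_stay_given_fail),  # 1 in 100
}

# A student seen in the city must be in one of the two "stay" categories.
p_stay = tree[("pass", "stay")] + tree[("fail", "stay")]  # 27 in 100
p_fail_given_stay = tree[("fail", "stay")] / p_stay       # 9/27 = 1/3

print(p_stay, p_fail_given_stay)
```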

Example 3

Can example 2 be applied to clinical decision making?

Imagine you are in general practice. From past experience, you know that out of all men presenting with clinical features of depression, about 10% have an underlying alcohol problem. You were taught to use a simple questionnaire (similar to the cut down, annoyed, guilty, eye-opener (CAGE) questionnaire) to identify patients with an alcohol problem. Previous research shows that of all patients with alcohol problems, 90% are correctly picked up by the questionnaire. Of all patients without alcohol problems, 80% are correctly excluded by the questionnaire. You see a man with depression and his questionnaire results are positive. What is the probability that he has an underlying alcohol problem?

Diagram 2

In fact this clinical example is almost identical to example 2. There are two events: whether the patient has an alcohol problem and the questionnaire results.

Before the questionnaire, your estimate of his probability of having an underlying alcohol problem is 10%. But with a positive questionnaire result, he can only be in one of two categories highlighted in yellow. Therefore, his probability has increased to 9/27=33.3%.
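The same arithmetic can be wrapped in a small reusable function. This is a sketch rather than a clinical tool. The parameter names `prevalence`, `sensitivity`, and `specificity` are the usual statistical terms for the three figures quoted in example 3; they are not used in the article itself.

```python
from fractions import Fraction

def probability_given_positive(prevalence, sensitivity, specificity):
    """Probability of having the condition, given a positive result.

    prevalence:  proportion with the condition before testing (10% here)
    sensitivity: proportion with the condition correctly picked up (90% here)
    specificity: proportion without the condition correctly excluded (80% here)
    """
    true_positive = prevalence * sensitivity               # 9 in 100
    false_positive = (1 - prevalence) * (1 - specificity)  # 18 in 100
    return true_positive / (true_positive + false_positive)

p_alcohol_given_positive = probability_given_positive(
    prevalence=Fraction(10, 100),
    sensitivity=Fraction(90, 100),
    specificity=Fraction(80, 100),
)

print(p_alcohol_given_positive)  # 9/27 = 1/3
```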

Usefulness of odds as a measure of chance

I will now illustrate how odds can occasionally be very useful. In example 3, we have shown using a probability tree that our initial estimated probability of 1/10 that the patient has an alcohol problem increases to 1/3 with the subsequent knowledge of the questionnaire result.

If we had used odds instead of probability to measure chance in the first place, we could have obtained the same result without the trouble of having to draw a probability tree. One useful property of odds is that if we want to know the combined odds given several pieces of information, we can simply multiply the odds contributed by each piece of information. In example 3, the initial odds of the patient having an alcohol problem from the knowledge that he has depression are 1/9 (out of 10 patients, one has an alcohol problem and nine do not). The second piece of information we have is that he has a positive questionnaire result (diagram 3). Now, since 90% of patients with an alcohol problem have a positive questionnaire result but only 20% of patients without an alcohol problem do, a positive questionnaire result contributes odds of 90%/20%=9/2 in favour of an alcohol problem (this ratio is usually called the likelihood ratio). Therefore the combined odds with the knowledge that the questionnaire result is positive are 1/9 x 9/2=1/2. In other words, for every one patient who has an alcohol problem, two do not. This result is precisely the same as the probability of 1/3 we found earlier using a probability tree.
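The odds calculation for example 3 can be sketched in a few lines. The helper names `to_odds` and `to_probability` are introduced here for illustration, and "likelihood ratio" is the usual statistical name for the 90%/20% ratio.

```python
from fractions import Fraction

def to_odds(probability):
    """Convert a probability to odds: events divided by non-events."""
    return probability / (1 - probability)

def to_probability(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)

# Initial odds from the knowledge that the patient has depression.
prior_odds = to_odds(Fraction(10, 100))                   # 1/9

# Positive results are seen in 90% of patients with an alcohol
# problem and in 20% of patients without one.
likelihood_ratio = Fraction(90, 100) / Fraction(20, 100)  # 9/2

# Combine the two pieces of information by multiplication.
posterior_odds = prior_odds * likelihood_ratio            # 1/2
posterior_probability = to_probability(posterior_odds)    # 1/3

print(prior_odds, likelihood_ratio, posterior_odds, posterior_probability)
```

Converting the combined odds of 1/2 back to a probability gives 1/3, which matches the result from the probability tree.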

A potential pitfall: the prosecutor's fallacy

In example 2, it is not uncommon to make the mistake that, since 90% of students who are unsuccessful in their exams stay behind, a student who stays behind has a 90% probability of having been unsuccessful. We note in example 2 above that, in fact, the correct probability is only 33.3%. This error is to confuse P(unsuccessful/stay) with P(stay/unsuccessful).

Diagram 3

Similar mistakes have been made in criminal courts relating to DNA evidence presented in rape cases. Scientists sometimes give evidence to the effect that if the defendant is innocent, the chance of obtaining the DNA match is one in a million. The prosecuting lawyer then invites the jury to draw the conclusion that since the DNA match is obtained, the probability that the defendant is innocent is only one in a million. This conclusion is not valid, and the probability of the defendant being innocent depends heavily on why the test was carried out in the first place. This is to confuse P(DNA match/innocent) with P(innocent/DNA match). This type of mistake is now known as the prosecutor's fallacy.
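A rough numerical sketch may help to show why the two probabilities differ. Only the one in a million figure comes from the text; everything else is a simplifying assumption made for illustration: exactly one person in a pool of possible suspects is the true offender, the true offender always matches, and before the test every person in the pool is equally likely to be the offender. The pool sizes are hypothetical.

```python
from fractions import Fraction

# From the text: chance of a DNA match if the person is innocent.
P_MATCH_GIVEN_INNOCENT = Fraction(1, 1_000_000)

def probability_innocent_given_match(pool_size):
    """P(innocent/DNA match) under the assumptions stated above.

    pool_size is a hypothetical number of people who could have
    committed the offence, exactly one of whom did.
    """
    innocent_people = pool_size - 1
    expected_innocent_matches = innocent_people * P_MATCH_GIVEN_INNOCENT
    # One guilty match (by assumption) plus the expected innocent matches.
    return expected_innocent_matches / (1 + expected_innocent_matches)

# Defendant identified only by trawling a very large pool (hypothetical size).
p_large_pool = probability_innocent_given_match(1_000_001)  # 1/2

# Defendant was already one of a handful of suspects (hypothetical size).
p_small_pool = probability_innocent_given_match(11)         # 1/100001

print(p_large_pool, p_small_pool)
```

Under these assumptions the same one in a million match probability is compatible with a probability of innocence of one half or of about one in 100 000, depending on how the defendant came to be tested.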

Wai-Ching Leung, locum general practitioner, Norwich
Email: wai_chingleung@hotmail.com


studentBMJ 2002;10:259-302 August ISSN 0966-6494

  1. Leung, WC. Why and when do we need medical statistics. studentBMJ 2002;10:227-8 (July).

