Data handling: Use experiments, simulation and probability distribution to set and explore probability models

Unit 4: Complete contingency tables to solve probability problems

Natashia Bearam-Edmunds

Unit outcomes

By the end of this unit you will be able to:

• Draw and complete contingency tables.
• Use contingency tables to solve probability problems.

What you should know

Before you start this unit, make sure you can:

Introduction

A two-way contingency table is also called a two-way table or a cross tabulation table. Contingency tables are used to examine relationships between categorical data. ‘Contingency’ means ‘possibilities’. Contingency tables help you work out all the possible outcomes of combined events.

Constructing contingency tables

Contingency tables are especially helpful for figuring out whether events are dependent or independent.

In a two-way contingency table one event is written down the side of the table and the other event is written along the top of the table. The results are written in the cells of the table. The combined result of the activities is found by working two ways: across and then down.

When working with two-way contingency tables, we count the number of outcomes for two events and their complements, making four events in total. A two-way contingency table always shows the counts for the four possible combinations of these events, as well as the totals for each event and its complement.
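
As a rough illustration of this layout, the Python sketch below stores the four counts of a two-way table and works out the totals for each event and its complement. The function name `two_way_table` and the counts passed to it are made up for this illustration; they do not come from any example in this unit.

```python
def two_way_table(a_and_b, a_and_not_b, not_a_and_b, not_a_and_not_b):
    """Return the row totals, column totals and grand total of a 2x2 table.

    Rows are event B and its complement; columns are event A and its complement.
    """
    row_totals = {
        "B": a_and_b + not_a_and_b,
        "not B": a_and_not_b + not_a_and_not_b,
    }
    col_totals = {
        "A": a_and_b + a_and_not_b,
        "not A": not_a_and_b + not_a_and_not_b,
    }
    grand_total = sum(row_totals.values())
    return row_totals, col_totals, grand_total


# Made-up counts, used only to show the layout.
rows, cols, total = two_way_table(12, 8, 18, 22)
print(rows)   # totals for B and not B
print(cols)   # totals for A and not A
print(total)  # grand total
```

The row totals and the column totals must each add up to the same grand total, which is a useful check when you complete a table by hand.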

Example 4.1

A coin is tossed and a die is rolled simultaneously. Draw a table to show all possible pairs of outcomes. From your table calculate the probability of getting a head and a two and state if these events are independent.

Solution

When you toss a coin there are two possible outcomes: heads (H) or tails (T). When you roll a die there are six possible outcomes: $\scriptsize \{1;\ 2;\ 3;\ 4;\ 5;\ 6\}$.

We will draw up the contingency table with the event ‘roll a die’ along the top of the table and the event ‘toss a coin’ down the side.

| Toss a coin (rows) / Roll a die (columns) | $\scriptsize 1$ | $\scriptsize 2$ | $\scriptsize 3$ | $\scriptsize 4$ | $\scriptsize 5$ | $\scriptsize 6$ |
|---|---|---|---|---|---|---|
| $\scriptsize \text{H}$ | $\scriptsize \text{H; 1}$ | $\scriptsize \text{H; 2}$ | $\scriptsize \text{H; 3}$ | $\scriptsize \text{H; 4}$ | $\scriptsize \text{H; 5}$ | $\scriptsize \text{H; 6}$ |
| $\scriptsize \text{T}$ | $\scriptsize \text{T; 1}$ | $\scriptsize \text{T; 2}$ | $\scriptsize \text{T; 3}$ | $\scriptsize \text{T; 4}$ | $\scriptsize \text{T; 5}$ | $\scriptsize \text{T; 6}$ |

In the cells of the table we list the possible combinations of outcomes. There are $\scriptsize 12$ outcomes in total.

To find the probability of getting a head and a two, we find the row with $\scriptsize \text{H}$ and the column with $\scriptsize 2$. Reading across from $\scriptsize \text{H}$ and down from $\scriptsize 2$, the cell where they meet shows the outcome $\scriptsize \text{H; 2}$. This outcome occurs once.

$\scriptsize P(\text{H; 2})=\displaystyle \frac{1}{{12}}$

For independent events:

$\scriptsize \displaystyle P(\text{A and B})=P(A)\cdot P(B)$

From the table we see that:

\scriptsize \begin{align*}P(\text{H)}&=\displaystyle \frac{6}{{12}}\\&=\displaystyle \frac{1}{2}\end{align*}

\scriptsize \begin{align*}P(2\text{)}&=\displaystyle \frac{2}{{12}}\\&=\displaystyle \frac{1}{6}\end{align*}

\scriptsize \begin{align*}P(\text{H)}\cdot P(2\text{)}&=\displaystyle \frac{1}{2}\cdot \displaystyle \frac{1}{6}\\&=\displaystyle \frac{1}{{12}}\end{align*}

We have shown that $\scriptsize P(\text{H; 2})=\displaystyle \frac{1}{{12}}$.

Since $\scriptsize P(\text{H; 2})=P(\text{H})\cdot P(2)$, the events are independent.
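
The reasoning in Example 4.1 can also be checked with a short Python sketch. It lists every cell of the table and counts outcomes, assuming a fair coin and a fair die so that all $\scriptsize 12$ outcomes are equally likely. The helper name `probability` is chosen here for illustration only.

```python
from fractions import Fraction
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Every cell of the table: one (coin, die) pair per outcome.
sample_space = list(product(coin, die))


def probability(event):
    """Probability of an event, assuming all outcomes are equally likely."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))


p_head = probability(lambda o: o[0] == "H")
p_two = probability(lambda o: o[1] == 2)
p_head_and_two = probability(lambda o: o == ("H", 2))

print(len(sample_space))                 # 12
print(p_head, p_two, p_head_and_two)     # 1/2 1/6 1/12
print(p_head_and_two == p_head * p_two)  # True, so the events are independent
```

Using `Fraction` keeps the answers as exact fractions, which matches the way the probabilities are written in the worked solution.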

Example 4.2

Example taken from Siyavula Maths Grade 12

The table below shows the results of testing two different treatments on $\scriptsize \displaystyle ~240$ fruit trees which have a disease causing the trees to die. Treatment A involves the careful removal of infected branches and treatment B involves removing infected branches as well as spraying the trees with antibiotics.

| | Tree dies within four years | Tree lives for more than four years | TOTAL |
|---|---|---|---|
| Treatment A | $\scriptsize 70$ | $\scriptsize 50$ | |
| Treatment B | | | |
| TOTAL | $\scriptsize 90$ | $\scriptsize 150$ | |
1. Fill in the missing values on the table.
2. What is the probability a tree received treatment B?
3. What is the probability that a tree will live beyond four years?
4. What is the probability that a tree is given treatment B and lives beyond four years?
5. Of the trees that were given treatment B, what is the probability that a tree lives beyond four years?
6. Are a tree given treatment B and living beyond four years independent events? Justify your answer with a calculation.

Solutions

1. Since each column has to add up to its total, we can work out the number of trees which fall into each category for treatments A and B. Then, we can add each row to get the totals on the right-hand side of the table.
| | Tree dies within four years | Tree lives for more than four years | TOTAL |
|---|---|---|---|
| Treatment A | $\scriptsize 70$ | $\scriptsize 50$ | $\scriptsize 120$ |
| Treatment B | $\scriptsize 20$ | $\scriptsize 100$ | $\scriptsize 120$ |
| TOTAL | $\scriptsize 90$ | $\scriptsize 150$ | $\scriptsize 240$ |
2. The probability that treatment B is given to a tree is the number of trees that received treatment B divided by the total number of trees.
\scriptsize \begin{align*}P(\text{B})&=\displaystyle \frac{{120}}{{240}}\\&=\displaystyle \frac{1}{2}\end{align*}
3. To find the probability that a tree lives beyond four years, we use the total of the column ‘tree lives for more than four years’ and divide that by the total number of trees.
\scriptsize \begin{align*}P(\text{tree lives more than 4 years })&=\displaystyle \frac{{150}}{{240}}\\&=\displaystyle \frac{5}{8}\end{align*}
4. To determine the probability that a tree receives treatment B and lives beyond four years, we find the cell where the ‘Treatment B’ row meets the ‘Tree lives for more than four years’ column and divide that value by the total number of trees.
\scriptsize \begin{align*}P(\text{B and tree lives more than 4 years })&=\displaystyle \frac{{100}}{{240}}\\&=\displaystyle \frac{5}{{12}}\end{align*}
5. Here, we consider only the trees that received treatment B and ask how many of them lived beyond four years. The trees given treatment A are no longer included, so the denominator becomes the total number of trees that received treatment B.
\scriptsize \begin{align*}P(\text{lives beyond 4 years having received B})&=\displaystyle \frac{{100}}{{120}}\\&=\displaystyle \frac{5}{6}\end{align*}
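6. For independent events, the product of the individual probabilities equals the probability of both events occurring together, so we compare the two values.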
\scriptsize \begin{align*}P(\text{tree lives more than 4 years })\times P(\text{B)}&=\displaystyle \frac{5}{8}\times \displaystyle \frac{1}{2}\\&=\displaystyle \frac{5}{{16}}\end{align*}
$\scriptsize P(\text{B and tree lives more than 4 years })=\displaystyle \frac{5}{{12}}$
So we see that $\scriptsize P(\text{B and tree lives more than 4 years })\ne P(\text{B})\cdot P(\text{lives more than 4 years})$. Therefore, receiving treatment B and living beyond four years are dependent events.
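
The steps in Example 4.2 can be sketched in Python as follows. The counts are the ones given in the table; the variable names are chosen here for illustration only.

```python
from fractions import Fraction

# Known values from the table in Example 4.2.
total_trees = 240
a_dies, a_lives = 70, 50
dies_total, lives_total = 90, 150

# Each column must add up to its total.
b_dies = dies_total - a_dies      # 20
b_lives = lives_total - a_lives   # 100
a_total = a_dies + a_lives        # 120
b_total = b_dies + b_lives        # 120

p_b = Fraction(b_total, total_trees)
p_lives = Fraction(lives_total, total_trees)
p_b_and_lives = Fraction(b_lives, total_trees)
p_lives_given_b = Fraction(b_lives, b_total)  # denominator is only the B trees

print(p_b, p_lives, p_b_and_lives, p_lives_given_b)  # 1/2 5/8 5/12 5/6
print(p_b * p_lives)                                  # 5/16
print(p_b_and_lives == p_b * p_lives)                 # False, so the events are dependent
```

Note how the only difference between questions 4 and 5 is the denominator: all the trees in question 4, but only the trees that received treatment B in question 5.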

Exercise 4.1

1. Use the contingency table below to answer the following questions:
| | Brown eyes | Not brown eyes | TOTAL |
|---|---|---|---|
| Black hair | $\scriptsize 50$ | $\scriptsize 30$ | $\scriptsize 80$ |
| Red hair | $\scriptsize 70$ | $\scriptsize 80$ | $\scriptsize 150$ |
| TOTAL | $\scriptsize 120$ | $\scriptsize 110$ | $\scriptsize 230$ |
1. What is the probability that someone with black hair has brown eyes?
2. What is the probability that someone has black hair?
3. What is the probability that someone has brown eyes?
4. Are having black hair and having brown eyes dependent or independent events?
5. What is the probability of having brown eyes or red hair?
2. You are given the following information:
• Events A and B are independent.
• $\scriptsize P(\text{not A})=0.3$
• $\scriptsize P(\text{B})=0.4$

Complete the contingency table below.

| | A | Not A | TOTAL |
|---|---|---|---|
| B | | | |
| Not B | | | |
| TOTAL | | | $\scriptsize 50$ |
3. A new treatment for influenza (the flu) was tested on a number of patients to determine if it was better than a placebo (a pill with no therapeutic value). The table below shows the results three days after treatment:
| | Flu | No flu | TOTAL |
|---|---|---|---|
| Placebo | $\scriptsize 228$ | $\scriptsize 60$ | |
| Treatment | | | |
| TOTAL | $\scriptsize 240$ | $\scriptsize 312$ | |
1. Complete the table.
2. Calculate the probability of a patient receiving the treatment.
3. Calculate the probability of a patient having no flu after three days.
4. Calculate the probability of a patient receiving the treatment and having no flu after three days.
5. Using a calculation, determine whether a patient receiving the treatment and having no flu after three days are dependent or independent events.
6. Calculate the probability that a patient receiving treatment will have no flu after three days.
7. Calculate the probability that a patient receiving a placebo will have no flu after three days.
8. Comparing your answers to questions 6 and 7, would you recommend the use of the new treatment for patients suffering from influenza?

The full solutions are at the end of the unit.

Summary

In this unit you have learnt the following:

• How to construct a two-way contingency table.
• How to complete a two-way contingency table.
• How to calculate probabilities from a contingency table.

Unit 4: Assessment

Suggested time to complete: 25 minutes

1. Researchers conducted a study to test how effective a certain inoculation is at preventing malaria. Part of their data is shown below:
| | Malaria | No malaria | TOTAL |
|---|---|---|---|
| Male | A | B | $\scriptsize 216$ |
| Female | C | D | $\scriptsize 648$ |
| TOTAL | $\scriptsize 108$ | $\scriptsize 756$ | $\scriptsize 864$ |
1. Calculate the probability that a randomly selected study participant will be female.
2. Calculate the probability that a randomly selected study participant will have malaria.
3. If being female and having malaria are independent events, calculate the value C.
4. Using the value of C, fill in the missing values on the table.
2. A rare kidney disease affects only one in $\scriptsize \displaystyle 1\text{ }000$ people and the test for this disease has a $\scriptsize \displaystyle 99\%$ accuracy rate.
1. Draw a two-way contingency table showing the results if $\scriptsize \displaystyle 100\text{ }000$ of the general population are tested.
2. Calculate the probability that a person who tests positive for this rare kidney disease is sick with the disease, correct to two decimal places.
3. The Clueless Club consists of $\scriptsize \displaystyle 500$ members. In order to be part of this club you have to be a lawyer, a teacher or an engineer. Given below is an incomplete contingency table that shows the distribution of $\scriptsize \displaystyle 500$ members, in terms of their profession and the type of hot drinks they drink.
| | Tea | Coffee | Hot chocolate | TOTAL |
|---|---|---|---|---|
| Lawyer | $\scriptsize 52$ | $\scriptsize 41$ | $\scriptsize 80$ | $\scriptsize 173$ |
| Teacher | $\scriptsize 48$ | $\scriptsize 69$ | A | $\scriptsize 137$ |
| Engineer | $\scriptsize 100$ | B | $\scriptsize 10$ | $\scriptsize 190$ |
| TOTAL | $\scriptsize 200$ | $\scriptsize 190$ | $\scriptsize 110$ | $\scriptsize 500$ |
1. Determine the values of A and B.
2. If a member is selected at random, what is the probability of selecting a teacher who drinks coffee?
3. If a member is selected at random, what is the probability of selecting an engineer or a person who drinks hot chocolate?

The full solutions are at the end of the unit.

Unit 4: Solutions

Exercise 4.1

1. .
1. .
\scriptsize \begin{align*}P(\text{with black hair has brown eyes})&=\displaystyle \frac{{50}}{{80}}\\&=\displaystyle \frac{5}{8}\end{align*}
2. .
\scriptsize \begin{align*}P(\text{black hair})&=\displaystyle \frac{{80}}{{230}}\\&=\displaystyle \frac{8}{{23}}\end{align*}
3. .
\scriptsize \begin{align*}P(\text{brown eyes})&=\displaystyle \frac{{120}}{{230}}\\&=\displaystyle \frac{{12}}{{23}}\end{align*}
4. .
\scriptsize \begin{align*}P(\text{black hair})\cdot P(\text{brown eyes})&=\displaystyle \frac{8}{{23}}\cdot \displaystyle \frac{{12}}{{23}}\\&=\displaystyle \frac{{96}}{{529}}\\P(\text{black hair and brown eyes})&=\displaystyle \frac{{50}}{{230}}\\&=\displaystyle \frac{5}{{23}}\\P(\text{black hair and brown eyes})&\ne P(\text{black hair})\cdot P(\text{brown eyes})\end{align*}
The events are dependent.
5. The probability of having brown eyes or red hair is the union of the two events.
\scriptsize \begin{align*}P\text{(brown eyes or red hair})&=P\text{(brown eyes})+P(\text{red hair})-P\text{(brown eyes and red hair})\\&=\displaystyle \frac{{120}}{{230}}+\displaystyle \frac{{150}}{{230}}-\displaystyle \frac{{70}}{{230}}\\&=\displaystyle \frac{{200}}{{230}}\\&=\displaystyle \frac{{20}}{{23}}\end{align*}
2. .
\scriptsize \begin{align*}P(\text{not A})&=0.3\\\therefore n(\text{not A})&=0.3\times 50\\&=15\end{align*}
\scriptsize \displaystyle \begin{align*}P(\text{B})&=0.4\\\therefore n(\text{B})&=0.4\times 50\\&=20\end{align*}
\scriptsize \displaystyle \begin{align*}P(\text{A and B})&=P(\text{A})\cdot P(\text{B})\\&=\displaystyle \frac{{35}}{{50}}\times 0.4\\&=\displaystyle \frac{7}{{25}}\\\therefore n(\text{A and B})&=\displaystyle \frac{7}{{25}}\times 50\\&=14\end{align*}

| | A | Not A | TOTAL |
|---|---|---|---|
| B | $\scriptsize 14$ | $\scriptsize 6$ | $\scriptsize 20$ |
| Not B | $\scriptsize 21$ | $\scriptsize 9$ | $\scriptsize 30$ |
| TOTAL | $\scriptsize 35$ | $\scriptsize 15$ | $\scriptsize 50$ |
3. .
1. .
| | Flu | No flu | TOTAL |
|---|---|---|---|
| Placebo | $\scriptsize 228$ | $\scriptsize 60$ | $\scriptsize 288$ |
| Treatment | $\scriptsize 12$ | $\scriptsize 252$ | $\scriptsize 264$ |
| TOTAL | $\scriptsize 240$ | $\scriptsize 312$ | $\scriptsize 552$ |
2. .
\scriptsize \begin{align*}P(\text{receiving the treatment})&=\displaystyle \frac{{264}}{{552}}\\&=\displaystyle \frac{{11}}{{23}}\end{align*}
3. .
\scriptsize \begin{align*}P(\text{no flu})&=\displaystyle \frac{{312}}{{552}}\\&=\displaystyle \frac{{13}}{{23}}\end{align*}
4. .
\scriptsize \begin{align*}P(\text{receiving the treatment and having no flu})&=\displaystyle \frac{{252}}{{552}}\\&=\displaystyle \frac{{21}}{{46}}\end{align*}
5. .
\scriptsize \displaystyle \begin{align*}P(\text{receiving the treatment})\cdot P(\text{no flu})&=\displaystyle \frac{{11}}{{23}}\cdot \displaystyle \frac{{13}}{{23}}\\&=\displaystyle \frac{{143}}{{529}}\\P(\text{receiving the treatment and having no flu})&=\displaystyle \frac{{21}}{{46}}\end{align*}
$\scriptsize \displaystyle P(\text{receiving the treatment and having no flu})\ne P(\text{receiving the treatment})\cdot P(\text{no flu})$
Therefore, a patient receiving the treatment and having no flu after three days are dependent events.
6. .
\scriptsize \begin{align*}P(\text{having no flu if received the treatment})&=\displaystyle \frac{{252}}{{264}}\\&=\displaystyle \frac{{21}}{{22}}\end{align*}
7. .
\scriptsize \begin{align*}P(\text{having no flu if received the placebo})&=\displaystyle \frac{{60}}{{288}}\\&=\displaystyle \frac{5}{{24}}\end{align*}
8. Yes, I would recommend the use of the new treatment for patients with the flu, as there is a $\scriptsize 95.5\%$ chance of having no flu after three days if they have been treated, compared to only a $\scriptsize 20.8\%$ chance of having no flu if they were given the placebo.
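
Looking back at question 2 of this exercise, the table can also be completed with a short Python sketch. It assumes, as the question states, that A and B are independent, so each cell is the product of the two matching probabilities multiplied by the grand total of $\scriptsize 50$. The helper name `count` is chosen here for illustration only.

```python
from fractions import Fraction

grand_total = 50
p_not_a = Fraction(3, 10)   # P(not A) = 0.3
p_b = Fraction(2, 5)        # P(B) = 0.4
p_a = 1 - p_not_a


def count(probability):
    """Convert a probability into a whole-number cell count."""
    value = probability * grand_total
    assert value.denominator == 1, "expected a whole number of outcomes"
    return int(value)


# Independence: each joint probability is the product of the two marginals.
table = {
    ("B", "A"): count(p_b * p_a),
    ("B", "not A"): count(p_b * p_not_a),
    ("not B", "A"): count((1 - p_b) * p_a),
    ("not B", "not A"): count((1 - p_b) * p_not_a),
}
print(table)
```

The four counts should match the completed table in the solution to question 2 and add up to $\scriptsize 50$.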


Unit 4: Assessment

1. .
1. .
\scriptsize \begin{align*}P(\text{female})&=\displaystyle \frac{{648}}{{864}}\\&=\displaystyle \frac{3}{4}\end{align*}
2. .
\scriptsize \begin{align*}P(\text{malaria})&=\displaystyle \frac{{108}}{{864}}\\&=\displaystyle \frac{1}{8}\end{align*}
3. .
\scriptsize \begin{align*}P(\text{female and malaria})&=P(\text{female})\cdot P(\text{malaria})\text{ since these are independent events }\\&=\displaystyle \frac{3}{4}\cdot \displaystyle \frac{1}{8}\\&=\displaystyle \frac{3}{{32}}\\\therefore \text{C}&=\displaystyle \frac{3}{{32}}\times 864\\&=81\end{align*}
4. .
| | Malaria | No malaria | TOTAL |
|---|---|---|---|
| Male | $\scriptsize 27$ | $\scriptsize 189$ | $\scriptsize 216$ |
| Female | $\scriptsize 81$ | $\scriptsize 567$ | $\scriptsize 648$ |
| TOTAL | $\scriptsize 108$ | $\scriptsize 756$ | $\scriptsize 864$ |
2. .
1. .
\scriptsize \begin{align*}\text{Number of people with disease }&=\displaystyle \frac{1}{{1\text{ }000}}\times 100\text{ }000\\&=100\end{align*}
Test for this disease has $\scriptsize \displaystyle 99\%$ accuracy rate:
\scriptsize \begin{align*}\text{Number of people with disease and test is positive }&=0.99\times 100\\&=99\end{align*}
\scriptsize \begin{align*}\text{Number of people with no disease and test is negative }&=0.99\times 99\text{ }900\\&=98\text{ }901\end{align*}

| | Disease | No disease | TOTAL |
|---|---|---|---|
| Test is positive | $\scriptsize 99$ | $\scriptsize 999$ | $\scriptsize 1\text{ }098$ |
| Test is negative | $\scriptsize 1$ | $\scriptsize 98\text{ }901$ | $\scriptsize 98\text{ }902$ |
| TOTAL | $\scriptsize 100$ | $\scriptsize 99\text{ }900$ | $\scriptsize 100\text{ }000$ |
2. .
\scriptsize \begin{align*}P(\text{have the disease if test is positive})&=\displaystyle \frac{{99}}{{1\text{ }098}}\\&=\displaystyle \frac{{11}}{{122}}\\&\approx 0.09\end{align*}
3. .
1. .
\scriptsize \begin{align*}\text{A}&=137-48-69\\&=20\end{align*}
\scriptsize \begin{align*}\text{B}&=190-41-69\\&=80\end{align*}
2. .
$\scriptsize P(\text{teacher and coffee)}=\displaystyle \frac{{69}}{{500}}$
3. .
\scriptsize \begin{align*}P(\text{engineer or hot chocolate})&=P(\text{engineer})+P(\text{hot chocolate})\\&-P(\text{engineer and hot chocolate})\\&=\displaystyle \frac{{190}}{{500}}+\displaystyle \frac{{110}}{{500}}-\displaystyle \frac{{10}}{{500}}\\&=\displaystyle \frac{{290}}{{500}}\\&=\displaystyle \frac{{29}}{{50}}\end{align*}
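
Looking back at question 2 of this assessment, the table and the final probability can be reproduced with the Python sketch below. It assumes, as the solution does, that the $\scriptsize 99\%$ accuracy rate applies both to people who have the disease and to people who do not. The variable names are chosen here for illustration only.

```python
from fractions import Fraction

population = 100_000
prevalence = Fraction(1, 1000)
accuracy = Fraction(99, 100)   # assumed to apply to both sick and healthy people

sick = int(population * prevalence)         # 100
healthy = population - sick                 # 99 900

true_positive = int(sick * accuracy)        # 99
false_negative = sick - true_positive       # 1
true_negative = int(healthy * accuracy)     # 98 901
false_positive = healthy - true_negative    # 999

positives = true_positive + false_positive  # 1 098
p_sick_given_positive = Fraction(true_positive, positives)

print(p_sick_given_positive)                   # 11/122
print(round(float(p_sick_given_positive), 2))  # 0.09
```

Even with a very accurate test, most positive results here come from the much larger group of people who do not have the disease, which is why the probability is so low.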
