A sample space contains all the possible outcomes of an experiment. Sometimes we obtain some additional information about an experiment that tells us that the outcome comes from a certain part of the sample space. In this case, the probability of an event is based on the outcomes in that part of the sample space. A probability that is based on a part of a sample space is called a conditional probability. We explore this idea through some examples.
In Example 2.6 (in Section 2.1) we discussed a population of 1000 aluminum rods.
For each rod, the length is classified as too short, too long, or OK, and the diameter is classified as too thin, too thick, or OK. These 1000 rods form a sample space in which each rod is equally likely to be sampled. The number of rods in each category is presented in Table 2.1. Of the 1000 rods, 928 meet the diameter specification. Therefore, if a rod is sampled, P(diameter OK) = 928/1000 = 0.928. This probability is called the unconditional probability, since it is based on the entire sample space. Now assume that a rod is sampled, and its length is measured and found to meet the specification. What is the probability that the diameter also meets the specification? The key to computing this probability is to realize that knowledge that the length meets the specification reduces the sample space from which the rod is drawn. Table 2.2 presents this idea.
Once we know that the length specification is met, we know that the rod will be one of the 942 rods in the sample space presented in Table 2.2.
TABLE 2.1 Sample space containing 1000 aluminum rods

                          Diameter
Length        Too Thin      OK      Too Thick
Too Short         10          3          5
OK                38        900          4
Too Long           2         25         13
TABLE 2.2 Reduced sample space containing 942 aluminum rods that meet the length specification

                          Diameter
Length        Too Thin      OK      Too Thick
Too Short         —          —          —
OK                38        900          4
Too Long          —          —          —
Of the 942 rods in this sample space, 900 of them meet the diameter specification.
Therefore, if we know that the rod meets the length specification, the probability that the rod meets the diameter specification is 900/942. We say that the conditional probability that the rod meets the diameter specification given that it meets the length specification is equal to 900/942, and we write P(diameter OK | length OK) = 900/942 = 0.955.
Note that the conditional probability P(diameter OK | length OK) differs from the unconditional probability P(diameter OK), which was computed from the full sample space (Table 2.1) to be 0.928.
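These two probabilities can be reproduced directly from the counts in Table 2.1. The following Python sketch is illustrative only; the dictionary of counts and the variable names are ours, not part of the example.

```python
# Cell counts from Table 2.1; keys are (length, diameter) classifications.
counts = {
    ("too short", "too thin"): 10, ("too short", "ok"): 3,   ("too short", "too thick"): 5,
    ("ok",        "too thin"): 38, ("ok",        "ok"): 900, ("ok",        "too thick"): 4,
    ("too long",  "too thin"): 2,  ("too long",  "ok"): 25,  ("too long",  "too thick"): 13,
}

total = sum(counts.values())                                               # 1000 rods
diam_ok = sum(n for (length, diam), n in counts.items() if diam == "ok")   # 928
len_ok = sum(n for (length, diam), n in counts.items() if length == "ok")  # 942
both_ok = counts[("ok", "ok")]                                             # 900

print(diam_ok / total)    # unconditional probability, 0.928
print(both_ok / len_ok)   # conditional probability, about 0.955
```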
Example 2.17 Compute the conditional probability P(diameter OK | length too long). Is this the same as the unconditional probability P(diameter OK)?
Solution
The conditional probability P(diameter OK | length too long) is computed under the assumption that the rod is too long. This reduces the sample space to the 40 outcomes in the "Too Long" row of the following table.
                          Diameter
Length        Too Thin      OK      Too Thick
Too Short         10          3          5
OK                38        900          4
Too Long           2         25         13
Of the 40 outcomes, 25 meet the diameter specification. Therefore

P(diameter OK | length too long) = 25/40 = 0.625

The unconditional probability P(diameter OK) is computed on the basis of all 1000 outcomes in the sample space and is equal to 928/1000 = 0.928. In this case, the conditional probability differs from the unconditional probability.
Let’s look at the solution to Example 2.17 more closely. We found that

P(diameter OK | length too long) = 25/40

In the answer 25/40, the denominator, 40, represents the number of outcomes that satisfy the condition that the rod is too long, while the numerator, 25, represents the number of outcomes that satisfy both the condition that the rod is too long and that its diameter is OK. If we divide both the numerator and denominator of this answer by the number of outcomes in the full sample space, which is 1000, we obtain

P(diameter OK | length too long) = (25/1000) / (40/1000)

Now 40/1000 represents the probability of satisfying the condition that the rod is too long. That is,

P(length too long) = 40/1000

The quantity 25/1000 represents the probability of satisfying both the condition that the rod is too long and that the diameter is OK. That is,

P(diameter OK and length too long) = 25/1000

We can now express the conditional probability as

P(diameter OK | length too long) = P(diameter OK and length too long) / P(length too long)
This reasoning can be extended to construct a definition of conditional probability that holds for any sample space:
Definition
Let A and B be events with P(B) ≠ 0. The conditional probability of A given B is

P(A | B) = P(A ∩ B) / P(B)     (2.14)
Figure 2.5 presents Venn diagrams to illustrate the idea of conditional probability.
FIGURE 2.5 (a) The diagram represents the unconditional probability P(A). P(A) is illustrated by considering the event A in proportion to the entire sample space, which is represented by the rectangle. (b) The diagram represents the conditional probability P(A | B). Since the event B is known to occur, the event B now becomes the sample space. For the event A to occur, the outcome must be in the intersection A ∩ B. The conditional probability P(A | B) is therefore illustrated by considering the intersection A ∩ B in proportion to the entire event B.
Example 2.18 Refer to Example 2.8 (in Section 2.1). What is the probability that a can will have a flaw on the side, given that it has a flaw on top?
Solution
We are given that P(flaw on top) = 0.03, and P(flaw on side and flaw on top) = 0.01.
Using Equation (2.14),
P(flaw on side | flaw on top) = P(flaw on side and flaw on top) / P(flaw on top)
                              = 0.01 / 0.03
                              = 0.33
Example 2.19 Refer to Example 2.8 (in Section 2.1). What is the probability that a can will have a flaw on the top, given that it has a flaw on the side?
Solution
We are given that P(flaw on side) = 0.02, and P(flaw on side and flaw on top) = 0.01. Using Equation (2.14),

P(flaw on top | flaw on side) = P(flaw on top and flaw on side) / P(flaw on side)
                              = 0.01 / 0.02
                              = 0.5
The results of Examples 2.18 and 2.19 show that in most cases, P(A | B) ≠ P(B | A).
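A short computation makes the asymmetry concrete. The sketch below simply redoes the arithmetic of Examples 2.18 and 2.19 (the variable names are ours): both conditional probabilities share the numerator P(flaw on side and flaw on top) but divide by different probabilities.

```python
p_top = 0.03    # P(flaw on top)
p_side = 0.02   # P(flaw on side)
p_both = 0.01   # P(flaw on side and flaw on top)

print(p_both / p_top)    # P(flaw on side | flaw on top) ≈ 0.33
print(p_both / p_side)   # P(flaw on top | flaw on side) = 0.5
```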
Independent Events
Sometimes the knowledge that one event has occurred does not change the probability that another event occurs. In this case the conditional and unconditional probabilities are the same, and the events are said to be independent. We present an example.
Example 2.20 If an aluminum rod is sampled from the sample space presented in Table 2.1, find P(too long) and P(too long | too thin). Are these probabilities different?
Solution
P(too long) = 40/1000 = 0.040

P(too long | too thin) = P(too long and too thin) / P(too thin)
                       = (2/1000) / (50/1000)
                       = 0.040
The conditional probability and the unconditional probability are the same. The information that the rod is too thin does not change the probability that the rod is too long.
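The same check can be written out from the marginal counts of Table 2.1. This is only a sketch; the variable names are ours.

```python
n_total = 1000     # all rods in Table 2.1
n_too_long = 40    # row total for "too long" (2 + 25 + 13)
n_too_thin = 50    # column total for "too thin" (10 + 38 + 2)
n_both = 2         # too long and too thin

p_too_long = n_too_long / n_total               # unconditional: 0.040
p_too_long_given_thin = n_both / n_too_thin     # conditional:   2/50 = 0.040
print(p_too_long, p_too_long_given_thin)        # equal, so the events are independent
```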
Example 2.20 shows that knowledge that an event occurs sometimes does not change the probability that another event occurs. In these cases, the two events are said to be independent. The event that a rod is too long and the event that a rod is too thin are independent. We now give a more precise definition of the term, both in words and in symbols.
Definition
Two events A and B are independent if the probability of each event remains the same whether or not the other occurs.
In symbols: If P(A) ≠ 0 and P(B) ≠ 0, then A and B are independent if

P(B | A) = P(B)     or, equivalently,     P(A | B) = P(A)     (2.15)

If either P(A) = 0 or P(B) = 0, then A and B are independent.
If A and B are independent, then the following pairs of events are also independent: A and B^c, A^c and B, and A^c and B^c. The proof of this fact is left as an exercise.
The concept of independence can be extended to more than two events:
Definition
Events A1, A2, ..., An are independent if the probability of each remains the same no matter which of the others occur.
In symbols: Events A1, A2, ..., An are independent if for each Ai, and each collection Aj1, ..., Ajm of events with P(Aj1 ∩ ··· ∩ Ajm) ≠ 0,

P(Ai | Aj1 ∩ ··· ∩ Ajm) = P(Ai)     (2.16)
The Multiplication Rule
Sometimes we know P(A | B) and we wish to find P(A ∩ B). We can obtain a result that is useful for this purpose by multiplying both sides of Equation (2.14) by P(B). This leads to the multiplication rule.
If A and B are two events with P(B) ≠ 0, then

P(A ∩ B) = P(B)P(A | B)     (2.17)

If A and B are two events with P(A) ≠ 0, then

P(A ∩ B) = P(A)P(B | A)     (2.18)

If P(A) ≠ 0 and P(B) ≠ 0, then Equations (2.17) and (2.18) both hold.
When two events are independent, then P(A | B) = P(A) and P(B | A) = P(B), so the multiplication rule simplifies:
If A and B are independent events, then

P(A ∩ B) = P(A)P(B)     (2.19)

This result can be extended to any number of events. If A1, A2, ..., An are independent events, then for each collection Aj1, ..., Ajm of events

P(Aj1 ∩ Aj2 ∩ ··· ∩ Ajm) = P(Aj1)P(Aj2) ··· P(Ajm)     (2.20)

In particular,

P(A1 ∩ A2 ∩ ··· ∩ An) = P(A1)P(A2) ··· P(An)     (2.21)
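For independent events, Equation (2.21) turns a joint probability into a simple product. A minimal sketch (the function name and the example probabilities are ours, chosen only for illustration):

```python
from math import prod

def p_all_occur(probabilities):
    """P(A1 ∩ ... ∩ An) for independent events A1, ..., An, by Equation (2.21)."""
    return prod(probabilities)

# Three hypothetical independent events with probabilities 0.9, 0.8, and 0.95:
print(p_all_occur([0.9, 0.8, 0.95]))   # 0.684
```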
Example 2.21 A vehicle contains two engines, a main engine and a backup. The engine component fails only if both engines fail. The probability that the main engine fails is 0.05, and the probability that the backup engine fails is 0.10. Assume that the main and backup engines function independently. What is the probability that the engine component fails?
Solution
The probability that the engine component fails is the probability that both engines fail. Therefore
P(engine component fails) = P(main engine fails and backup engine fails)

Since the engines function independently, we may use Equation (2.19):

P(main engine fails and backup engine fails) = P(main fails)P(backup fails)
                                             = (0.10)(0.05)
                                             = 0.005
Example 2.22 A system contains two components, A and B. Both components must function for the system to work. The probability that component A fails is 0.08, and the probability that component B fails is 0.05. Assume the two components function independently.
What is the probability that the system functions?
Solution
The probability that the system functions is the probability that both components function. Therefore
P(system functions) = P(A functions and B functions)

Since the components function independently,

P(A functions and B functions) = P(A functions)P(B functions)
                               = [1 − P(A fails)][1 − P(B fails)]
                               = (1 − 0.08)(1 − 0.05)
                               = 0.874
Example 2.23 Of the microprocessors manufactured by a certain process, 20% are defective. Five microprocessors are chosen at random. Assume they function independently. What is the probability that they all work?
Solution
For i = 1, ..., 5, let Ai denote the event that the ith microprocessor works. Then

P(all 5 work) = P(A1 ∩ A2 ∩ A3 ∩ A4 ∩ A5)
              = P(A1)P(A2)P(A3)P(A4)P(A5)
              = (1 − 0.20)^5
              = 0.328
Example 2.24 In Example 2.23, what is the probability that at least one of the microprocessors works?
Solution
The easiest way to solve this problem is to notice that
P(at least one works) = 1 − P(all are defective)

Now, letting Di denote the event that the ith microprocessor is defective,

P(all are defective) = P(D1 ∩ D2 ∩ D3 ∩ D4 ∩ D5)
                     = P(D1)P(D2)P(D3)P(D4)P(D5)
                     = (0.20)^5
                     = 0.0003
Therefore P(at least one works) = 1 − 0.0003 = 0.9997.
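Since the five microprocessors behave independently, both results reduce to a single power each (the variable names are ours):

```python
p_defective = 0.20
n = 5

p_all_work = (1 - p_defective) ** n       # 0.8**5 ≈ 0.328
p_at_least_one = 1 - p_defective ** n     # 1 - 0.2**5 ≈ 0.9997
print(p_all_work, p_at_least_one)
```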
Equations (2.19) and (2.20) tell us how to compute probabilities when we know that events are independent, but they are usually not much help when it comes to deciding whether two events really are independent. In most cases, the best way to determine whether events are independent is through an understanding of the process that produces the events. Here are a few illustrations:
■ A die is rolled twice. It is reasonable to believe that the outcome of the second roll is not affected by the outcome of the first roll. Therefore, knowing the outcome of the first roll does not help to predict the outcome of the second roll. The two rolls are independent.
■ A certain chemical reaction is run twice, using different equipment each time. It is reasonable to believe that the yield of one reaction will not affect the yield of the other. In this case the yields are independent.
■ A chemical reaction is run twice in succession, using the same equipment. In this case, it might not be wise to assume that the yields are independent. For example, a low yield on the first run might indicate that there is more residue than usual left behind. This might tend to make the yield on the next run higher. Thus knowing the yield on the first run could help to predict the yield on the second run.
■ The items in a simple random sample may be treated as independent, unless the population is finite and the sample comprises more than about 5% of the population (see the discussion of independence in Section 1.1).
The Law of Total Probability
The law of total probability is illustrated in Figure 2.6. A sample space contains the events A1, A2, A3, and A4. These events are mutually exclusive, since no two overlap. They are also exhaustive, which means that their union covers the whole sample space. Each outcome in the sample space belongs to one and only one of the events A1, A2, A3, A4.
The event B can be any event. In Figure 2.6, each of the events Ai intersects B, forming the events A1 ∩ B, A2 ∩ B, A3 ∩ B, and A4 ∩ B. It is clear from Figure 2.6 that the events A1 ∩ B, A2 ∩ B, A3 ∩ B, and A4 ∩ B are mutually exclusive and that they cover B. Every outcome in B belongs to one and only one of the events A1 ∩ B, A2 ∩ B, A3 ∩ B, A4 ∩ B. It follows that

B = (A1 ∩ B) ∪ (A2 ∩ B) ∪ (A3 ∩ B) ∪ (A4 ∩ B)
FIGURE 2.6 The mutually exclusive and exhaustive events A1, A2, A3, A4 divide the event B into mutually exclusive subsets.
which is a union of mutually exclusive events. Therefore
P(B) = P(A1 ∩ B) + P(A2 ∩ B) + P(A3 ∩ B) + P(A4 ∩ B)

Since P(Ai ∩ B) = P(B | Ai)P(Ai),

P(B) = P(B | A1)P(A1) + P(B | A2)P(A2) + P(B | A3)P(A3) + P(B | A4)P(A4)     (2.22)

Equation (2.22) is a special case of the law of total probability, restricted to the case where there are four mutually exclusive and exhaustive events. The intuition behind the law of total probability is quite simple. The events A1, A2, A3, A4 break the event B into pieces. The probability of B is found by adding up the probabilities of the pieces.
We could redraw Figure 2.6 to have any number of events Ai. This leads to the general case of the law of total probability.
Law of Total Probability
If A1, ..., An are mutually exclusive and exhaustive events, and B is any event, then

P(B) = P(A1 ∩ B) + ··· + P(An ∩ B)     (2.23)

Equivalently, if P(Ai) ≠ 0 for each Ai,

P(B) = P(B | A1)P(A1) + ··· + P(B | An)P(An)     (2.24)
Example 2.25 Customers who purchase a certain make of car can order an engine in any of three sizes. Of all cars sold, 45% have the smallest engine, 35% have the medium-sized one, and 20% have the largest. Of cars with the smallest engine, 10% fail an emissions test within two years of purchase, while 12% of those with the medium size and 15% of those with the largest engine fail. What is the probability that a randomly chosen car will fail an emissions test within two years?
Solution
Let B denote the event that a car fails an emissions test within two years. Let A1 denote the event that a car has a small engine, A2 the event that a car has a medium-size engine, and A3 the event that a car has a large engine. Then

P(A1) = 0.45     P(A2) = 0.35     P(A3) = 0.20

The probability that a car will fail a test, given that it has a small engine, is 0.10. That is, P(B | A1) = 0.10. Similarly, P(B | A2) = 0.12, and P(B | A3) = 0.15. By the law of total probability (Equation 2.24),
P(B) = P(B | A1)P(A1) + P(B | A2)P(A2) + P(B | A3)P(A3)
     = (0.10)(0.45) + (0.12)(0.35) + (0.15)(0.20)
     = 0.117
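The computation generalizes directly: given the probabilities of the mutually exclusive and exhaustive events Ai and the conditional probabilities P(B | Ai), Equation (2.24) is a sum of products. A sketch with the emissions data from this example (the function name is ours):

```python
def total_probability(p_a, p_b_given_a):
    """P(B) by the law of total probability, Equation (2.24)."""
    return sum(pa * pb for pa, pb in zip(p_a, p_b_given_a))

p_engine = [0.45, 0.35, 0.20]              # P(A1), P(A2), P(A3)
p_fail_given_engine = [0.10, 0.12, 0.15]   # P(B | A1), P(B | A2), P(B | A3)
print(total_probability(p_engine, p_fail_given_engine))   # 0.117
```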
Sometimes problems like Example 2.25 are solved with the use of tree diagrams.
Figure 2.7 presents a tree diagram for Example 2.25. There are three primary branches on the tree, corresponding to the three engine sizes. The probabilities of the engine sizes are listed on their respective branches. At the end of each primary branch are two secondary branches, representing the events of failure and no failure. The conditional probabilities
of failure and no failure, given engine size, are listed on the secondary branches. By multiplying along each of the branches corresponding to the event B = fail, we obtain the probabilities P(B | Ai)P(Ai). Summing these probabilities yields P(B), as desired.

FIGURE 2.7 Tree diagram for the solution to Example 2.25. The three primary branches carry P(A1) = 0.45, P(A2) = 0.35, and P(A3) = 0.20; the secondary branches carry the conditional probabilities of failure, P(B | A1) = 0.10, P(B | A2) = 0.12, P(B | A3) = 0.15, and of no failure, P(B^c | A1) = 0.90, P(B^c | A2) = 0.88, P(B^c | A3) = 0.85. Multiplying along the failure branches gives P(B ∩ A1) = 0.045, P(B ∩ A2) = 0.042, and P(B ∩ A3) = 0.030.
Bayes' Rule
If A and B are two events, we have seen that in most cases P(A | B) ≠ P(B | A). Bayes' rule provides a formula that allows us to calculate one of the conditional probabilities if we know the other one. To see how it works, assume that we know P(B | A) and we wish to calculate P(A | B). Start with the definition of conditional probability (Equation 2.14):
P(A | B) = P(A ∩ B) / P(B)

Now use Equation (2.18) to substitute P(B | A)P(A) for P(A ∩ B):

P(A | B) = P(B | A)P(A) / P(B)     (2.25)
Equation (2.25) is essentially Bayes' rule. When Bayes' rule is written, the expression P(B) in the denominator is usually replaced with a more complicated expression derived from the law of total probability. Specifically, since the events A and A^c are mutually exclusive and exhaustive, the law of total probability shows that

P(B) = P(B | A)P(A) + P(B | A^c)P(A^c)     (2.26)

Substituting the right-hand side of Equation (2.26) for P(B) in Equation (2.25) yields Bayes' rule. A more general version of Bayes' rule can be derived as well, by considering a collection A1, ..., An of mutually exclusive and exhaustive events and using the law of total probability to replace P(B) with the expression on the right-hand side of Equation (2.24).
Bayes' Rule
Special Case: Let A and B be events with P(A) ≠ 0, P(A^c) ≠ 0, and P(B) ≠ 0. Then

P(A | B) = P(B | A)P(A) / [P(B | A)P(A) + P(B | A^c)P(A^c)]     (2.27)

General Case: Let A1, ..., An be mutually exclusive and exhaustive events with P(Ai) ≠ 0 for each Ai. Let B be any event with P(B) ≠ 0. Then

P(Ak | B) = P(B | Ak)P(Ak) / Σ_{i=1}^{n} P(B | Ai)P(Ai)     (2.28)
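Equation (2.28) translates into a short function: the numerator is the term for Ak, and the denominator is the law of total probability. This is only a sketch; the function name, argument layout, and the sample numbers are ours.

```python
def bayes(k, p_a, p_b_given_a):
    """P(Ak | B) for mutually exclusive, exhaustive events A1, ..., An (Equation 2.28).

    p_a[i] is P(A_{i+1}), p_b_given_a[i] is P(B | A_{i+1}), and k is a 0-based index.
    """
    numerator = p_b_given_a[k] * p_a[k]
    denominator = sum(pb * pa for pb, pa in zip(p_b_given_a, p_a))
    return numerator / denominator

# Hypothetical two-event case (A and its complement): P(A) = 0.3, P(B|A) = 0.5, P(B|A^c) = 0.2.
print(bayes(0, [0.3, 0.7], [0.5, 0.2]))   # 0.15 / (0.15 + 0.14) ≈ 0.517
```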
Example 2.26 shows how Bayes’ rule can be used to discover an important and surprising result in the field of medical testing.
Example 2.26 The proportion of people in a given community who have a certain disease is 0.005.
A test is available to diagnose the disease. If a person has the disease, the probability that the test will produce a positive signal is 0.99. If a person does not have the disease, the probability that the test will produce a positive signal is 0.01. If a person tests positive, what is the probability that the person actually has the disease?
Solution
Let D represent the event that the person actually has the disease, and let + represent the event that the test gives a positive signal. We wish to find P(D | +). We are given the following probabilities:

P(D) = 0.005     P(+ | D) = 0.99     P(+ | D^c) = 0.01

Using Bayes' rule (Equation 2.27),

P(D | +) = P(+ | D)P(D) / [P(+ | D)P(D) + P(+ | D^c)P(D^c)]
         = (0.99)(0.005) / [(0.99)(0.005) + (0.01)(0.995)]
         = 0.332
In Example 2.26, only about a third of the people who test positive for the disease actually have the disease. Note that the test is fairly accurate; it correctly classifies 99%
of both diseased and nondiseased individuals. The reason that a large proportion of those who test positive are actually disease-free is that the disease is rare—only 0.5% of the population has it. Because many diseases are rare, it is the case for many medical tests that most positives are false positives, even when the test is fairly accurate. For this reason, when a test comes out positive, a second test is usually given before a firm diagnosis is made.
Example 2.27 Refer to Example 2.25. A record for a failed emissions test is chosen at random. What is the probability that it is for a car with a small engine?
Solution
Let B denote the event that a car failed an emissions test. Let A1 denote the event that a car has a small engine, A2 the event that a car has a medium-size engine, and A3 the event that a car has a large engine. We wish to find P(A1 | B). The following probabilities are given in Example 2.25:
P(A1)=0.45 P(B|A1)=0.10
P(A2)=0.35 P(B|A2)=0.12
P(A3)=0.20 P(B|A3)=0.15
By Bayes’ rule,
P(A1 | B) = P(B | A1)P(A1) / [P(B | A1)P(A1) + P(B | A2)P(A2) + P(B | A3)P(A3)]
          = (0.10)(0.45) / [(0.10)(0.45) + (0.12)(0.35) + (0.15)(0.20)]
          = 0.385
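The same arithmetic in code, using the data given in Example 2.25 (the variable names are ours); note that the denominator is exactly the P(B) = 0.117 found earlier.

```python
p_engine = [0.45, 0.35, 0.20]              # P(A1), P(A2), P(A3)
p_fail_given_engine = [0.10, 0.12, 0.15]   # P(B | A1), P(B | A2), P(B | A3)

numerator = p_fail_given_engine[0] * p_engine[0]
denominator = sum(pb * pa for pb, pa in zip(p_fail_given_engine, p_engine))
print(numerator / denominator)             # P(A1 | B) ≈ 0.385
```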
Application to Reliability Analysis
Reliability analysis is the branch of engineering concerned with estimating the failure rates of systems. While some problems in reliability analysis require advanced mathematical methods, there are many problems that can be solved with the methods we have learned so far. We begin with an example illustrating the computation of the reliability of a system consisting of two components connected in series.
Example 2.28 A system contains two components, A and B, connected in series as shown in the following diagram.

──[A]──[B]──
The system will function only if both components function. The probability that A functions is given by P(A)=0.98, and the probability that B functions is given by P(B)=0.95. Assume that A and B function independently. Find the probability that the system functions.
Solution
Since the system will function only if both components function, it follows that

P(system functions) = P(A ∩ B)
                    = P(A)P(B)     by the assumption of independence
                    = (0.98)(0.95)
                    = 0.931
Example 2.29 illustrates the computation of the reliability of a system consisting of two components connected in parallel.
Example 2.29 A system contains two components, C and D, connected in parallel as shown in the following diagram.

   ┌──[C]──┐
───┤       ├───
   └──[D]──┘
The system will function if either C or D functions. The probability that C functions is 0.90, and the probability that D functions is 0.85. Assume C and D function independently. Find the probability that the system functions.
Solution
Since the system will function so long as either of the two components functions, it follows that

P(system functions) = P(C ∪ D)
                    = P(C) + P(D) − P(C ∩ D)
                    = P(C) + P(D) − P(C)P(D)     by the assumption of independence
                    = 0.90 + 0.85 − (0.90)(0.85)
                    = 0.985
The reliability of more complex systems can often be determined by decomposing the system into a series of subsystems, each of which contains components connected either in series or in parallel. Example 2.30 illustrates the method.
Example 2.30 The thesis “Dynamic, Single-stage, Multiperiod, Capacitated Production Sequencing Problem with Multiple Parallel Resources” (D. Ott, M.S. thesis, Colorado School of Mines, 1998) describes a production method used in the manufacture of aluminum cans. The following schematic diagram, slightly simplified, depicts the process.
[Schematic diagram of the production line: Cup → Wash → Print (three printers in parallel) → Depalletize → Fill (parallel fill lines). The individual stations are labeled A through H in the original figure.]
The initial input into the process consists of coiled aluminum sheets, approximately 0.25 mm thick. In a process known as “cupping,” these sheets are uncoiled and shaped into can bodies, which are cylinders that are closed on the bottom and open on top.
These can bodies are then washed and sent to the printer, which prints the label on the can. In practice there are several printers on a line; the diagram presents a line with three printers. The printer deposits the cans onto pallets, which are wooden structures that hold 7140 cans each. The cans next go to be filled. Some fill lines can accept cans