For each of the following problems, identify the sample space and the events described in set notation.
win (w) or lose (l) each game (assume ties are not allowed).
Let \(A\) and \(B\) be events in a sample space \(S\). Complete the following definitions and write an example of each using the contexts above.
Note that element and outcome can be used interchangeably.
The most common kind of picture to make to describe sample spaces and events within sample spaces is a Venn Diagram. A Venn diagram uses overlapping circles or other shapes to illustrate the logical relationships between two or more sets of items.
Use Venn Diagrams to visualize definitions in 2.5.
Say 3 roommates are deciding on a pet. They use a Venn Diagram to determine which pet might be the best pick for them.
What pet should they choose? Cat or Goat
A single card is drawn from a standard deck of cards. (Not sure what that looks like? See here: https://en.wikipedia.org/wiki/Standard_52-card_deck). Let \(A\) be the event that an ace is selected, and let \(B\) be the event that a heart is drawn.
R
We can also rely on R to perform set calculations. The functions union, intersect, and setdiff compute unions, intersections, and set differences in R. Each function takes exactly two vectors.
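Because each of these functions accepts exactly two vectors, combining three or more sets needs either nested calls or a fold. The following is a minimal sketch using base R's Reduce(); the vectors x, y, and z are made-up values for illustration and are not part of the examples in these notes.

```r
# Three small illustrative sets (hypothetical values)
x <- c(1, 2, 3, 4)
y <- c(3, 4, 5)
z <- c(4, 5, 6)

# Two-set operations
union(x, y)       # elements in x or y
intersect(x, y)   # elements in both x and y
setdiff(x, y)     # elements in x but not in y

# More than two sets: nest the calls ...
nested <- intersect(intersect(x, y), z)

# ... or fold the two-argument function over a list with Reduce()
folded    <- Reduce(intersect, list(x, y, z))
all_three <- Reduce(union, list(x, y, z))
```

Nesting is fine for three sets; the Reduce() form scales more comfortably when the number of sets grows.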
Let's revisit the deck of cards problem from above: A single card is drawn from a standard deck of cards. Let \(A\) be the event that an ace is selected, and let \(B\) be the event that a heart is drawn.
First create the sample space and event vectors. I recommend that when you do this on your own you print the vector to ensure that what’s being created is what is intended. Trust, but verify your code.
numbers <- rep(c(2:10, "J", "Q", "K", "A"), 4)
suits <- rep(c("H", "C", "D", "S"), each = 13)
deck <- paste0(numbers, suits) # Sample Space
aces <- c("AH", "AC", "AD", "AS") # Event A
hearts <- paste0(c(2:10, "J", "Q", "K", "A"), "H") # Event B (hearts)
Then we can use R functions to find the following events.
(aces.and.hearts <- union(aces, hearts))
## [1] "AH" "AC" "AD" "AS" "2H" "3H" "4H" "5H" "6H" "7H" "8H" "9H"
## [13] "10H" "JH" "QH" "KH"
(ace.of.hearts <- intersect(aces, hearts))
## [1] "AH"
(no.hearts <- setdiff(deck, hearts))
## [1] "2C" "3C" "4C" "5C" "6C" "7C" "8C" "9C" "10C" "JC" "QC" "KC"
## [13] "AC" "2D" "3D" "4D" "5D" "6D" "7D" "8D" "9D" "10D" "JD" "QD"
## [25] "KD" "AD" "2S" "3S" "4S" "5S" "6S" "7S" "8S" "9S" "10S" "JS"
## [37] "QS" "KS" "AS"
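One way to follow the "trust, but verify" advice is to count outcomes. The sketch below rebuilds the deck so that it runs on its own, checks the construction, and then confirms that the set sizes satisfy the counting identity \(|A \cup B| = |A| + |B| - |A \cap B|\). The names n_union and n_check are introduced here only for the check.

```r
# Rebuild the deck and events so this block is self-contained
numbers <- rep(c(2:10, "J", "Q", "K", "A"), 4)
suits   <- rep(c("H", "C", "D", "S"), each = 13)
deck    <- paste0(numbers, suits)
aces    <- paste0("A", c("H", "C", "D", "S"))
hearts  <- paste0(c(2:10, "J", "Q", "K", "A"), "H")

# Sanity checks on the construction
length(deck)                # number of cards in the sample space
anyDuplicated(deck) == 0    # TRUE when every card appears once
all(aces %in% deck)         # TRUE when event A is a subset of the deck
all(hearts %in% deck)       # TRUE when event B is a subset of the deck

# Counting identity: |A or B| = |A| + |B| - |A and B|
n_union <- length(union(aces, hearts))
n_check <- length(aces) + length(hearts) - length(intersect(aces, hearts))
n_union == n_check
```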
Suppose that one card is to be selected from a deck of 20 cards that contains 10 red cards numbered from 1 to 10 and 10 blue cards numbered from 1 to 10. Let \(A\) be the event that a card with an even number is selected, let \(B\) be the event that a blue card is selected, and let \(C\) be the event that a card with a number less than 5 is selected.
Define the sample space and each event in R.
S <- c(paste0(1:10, "R"), paste0(1:10, "B"))
A <- c(paste0(seq(2,10, by=2), "R"), paste0(seq(2,10, by=2), "B"))
B <- paste0(1:10, "B")
C <- c(paste0(1:4, "R"), paste0(1:4, "B"))
# print objects
S;A;B;C
## [1] "1R" "2R" "3R" "4R" "5R" "6R" "7R" "8R" "9R" "10R" "1B" "2B"
## [13] "3B" "4B" "5B" "6B" "7B" "8B" "9B" "10B"
## [1] "2R" "4R" "6R" "8R" "10R" "2B" "4B" "6B" "8B" "10B"
## [1] "1B" "2B" "3B" "4B" "5B" "6B" "7B" "8B" "9B" "10B"
## [1] "1R" "2R" "3R" "4R" "1B" "2B" "3B" "4B"
Alternatively, this code has the same result.
numbers <- rep(1:10, 2)
colors <- rep(c("R", "B"), each = 10)
deck <- paste0(numbers, colors)
A <- deck[seq(from=2, to=20, by=2)]
B <- deck[11:20]
C <- deck[c(1:4,11:14)] # numbers less than 5, so positions 1-4 (red) and 11-14 (blue)
Another alternative method:
sample_space <- c("1R","2R","3R","4R","5R","6R","7R","8R","9R","10R",
                  "1B","2B","3B","4B","5B","6B","7B","8B","9B","10B")
A <- c("2R", "4R", "6R", "8R", "10R", "2B", "4B", "6B", "8B", "10B")
B <- c("1B", "2B", "3B", "4B", "5B", "6B", "7B", "8B", "9B", "10B")
C <- c("1R", "1B", "2R", "2B", "3R", "3B", "4R", "4B")
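Whether two constructions really give the same result can be checked directly. Element order can differ between approaches, so the sketch below compares with setequal(), which ignores order, rather than identical(). The names C_v1 and C_v2 are introduced here only for the comparison.

```r
# Event C (number less than 5) built two ways
C_v1 <- c(paste0(1:4, "R"), paste0(1:4, "B"))               # by pasting
C_v2 <- c("1R", "1B", "2R", "2B", "3R", "3B", "4R", "4B")   # typed by hand

identical(C_v1, C_v2)   # FALSE: same elements, different order
setequal(C_v1, C_v2)    # TRUE: same set
```

A mismatch reported by setequal() usually points to a typo or an off-by-one index in one of the constructions.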
(a_and_b <- intersect(A,B))
## [1] "2B" "4B" "6B" "8B" "10B"
(a_and_b_and_c <- intersect(a_and_b,C))
## [1] "2B" "4B"
C_complement <- setdiff(S,C)
(b_or_Cc <- union(B, C_complement))
## [1] "1B" "2B" "3B" "4B" "5B" "6B" "7B" "8B" "9B" "10B" "5R" "6R"
## [13] "7R" "8R" "9R" "10R"
b_or_c <- union(B, C)
a_and_b_or_c <- intersect(A, b_or_c)
A_complement <- setdiff(S,A)
B_complement <- setdiff(S,B)
(Ac_and_Bc <- intersect(A_complement,B_complement))
## [1] "1R" "3R" "5R" "7R" "9R"
(Ac_and_Bc_and_Cc <- intersect(Ac_and_Bc, C_complement))
## [1] "5R" "7R" "9R"
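The last two results can be cross-checked with De Morgan's law, \(A^{c} \cap B^{c} = (A \cup B)^{c}\), which is a standard set identity rather than something stated in these notes. The sketch below rebuilds the sets so that it runs independently; lhs, rhs, lhs3, and rhs3 are names chosen for illustration.

```r
# Sample space and events from the 20-card problem
S <- c(paste0(1:10, "R"), paste0(1:10, "B"))
A <- c(paste0(seq(2, 10, by = 2), "R"), paste0(seq(2, 10, by = 2), "B"))
B <- paste0(1:10, "B")
C <- c(paste0(1:4, "R"), paste0(1:4, "B"))

# Two events: intersect the complements vs. complement the union
lhs <- intersect(setdiff(S, A), setdiff(S, B))
rhs <- setdiff(S, union(A, B))
setequal(lhs, rhs)

# Three events: Reduce() folds the two-argument functions over a list
lhs3 <- Reduce(intersect, list(setdiff(S, A), setdiff(S, B), setdiff(S, C)))
rhs3 <- setdiff(S, Reduce(union, list(A, B, C)))
setequal(lhs3, rhs3)
```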
The probability of an event describes the proportion of times we expect the event to occur if we observed the random process an infinite number of times.
Let \(S\) be a sample space. A valid probability of an event \(A\) is a number \(P(A)\) between 0 and 1 (inclusive), so \(0\leq P(A)\leq 1\), that satisfies the following probability axioms:
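For reference, the standard (Kolmogorov) axioms are usually stated as below. This is the common textbook form; the wording and numbering in the course textbook may differ.

```latex
\[
\begin{aligned}
&\text{1. } P(A) \geq 0 \text{ for every event } A \subseteq S, \\
&\text{2. } P(S) = 1, \\
&\text{3. } P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)
  \text{ for every sequence of pairwise disjoint events } A_1, A_2, \ldots
\end{aligned}
\]
```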
These are some important rules to memorize that follow from the axioms above. Here are a few; there are more in the textbook.
Let \(A\) and \(B\) be events in the sample space \(S\).
\[ \begin{aligned} P(A \cap B) & = P(A) + P(B) - P(A \cup B) \\ & = 1 - P(A^{c}) + 1 - P(B^{c}) - P(A \cup B) \\ & = [1 - P(A^{c})- P(B^{c})] + [1 - P(A \cup B)] \end{aligned} \]
Since \(0 \leq P(A \cup B) \leq 1\), we have \(1 - P(A \cup B) \geq 0\).
Then \(P(A \cap B) = [1 - P(A^{c})- P(B^{c})] +\) [something greater than or equal to 0], so \(P(A \cap B) \geq 1 - P(A^{c})- P(B^{c})\).
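The inequality can be checked numerically on the card example. This sketch assumes all 52 cards are equally likely, so the probability of an event is its size divided by the size of the sample space. The helper prob() is introduced here for illustration and is not part of the original notes.

```r
# Deck, aces (event A), and hearts (event B)
numbers <- rep(c(2:10, "J", "Q", "K", "A"), 4)
suits   <- rep(c("H", "C", "D", "S"), each = 13)
deck    <- paste0(numbers, suits)
aces    <- paste0("A", c("H", "C", "D", "S"))
hearts  <- paste0(c(2:10, "J", "Q", "K", "A"), "H")

# Assumption: equally likely outcomes, so P(event) = |event| / |S|
prob <- function(event, S = deck) length(event) / length(S)

p_both <- prob(intersect(aces, hearts))   # P(A and B)
p_Ac   <- prob(setdiff(deck, aces))       # P(A^c)
p_Bc   <- prob(setdiff(deck, hearts))     # P(B^c)
bound  <- 1 - p_Ac - p_Bc                 # lower bound from the derivation

p_both >= bound

# The gap between the two sides is exactly 1 - P(A or B)
gap <- p_both - bound
isTRUE(all.equal(gap, 1 - prob(union(aces, hearts))))
```

In this example the bound is negative, so the inequality holds trivially; the bound only carries information when \(P(A^{c}) + P(B^{c}) < 1\).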
Sometimes Venn diagrams can be helpful for solving problems.
Suppose 50 percent of the families in a certain city subscribe to the morning newspaper, 65 percent subscribe to the afternoon newspaper, and 85 percent subscribe to at least one of the two newspapers. Draw a Venn diagram to represent this situation.
\[ P(AM \cap PM) = P(AM) + P(PM) - P(AM \cup PM) \]
.50 + .65 - .85
## [1] 0.3
\(P(PM \cap AM^{c})\) =
.65 - .3
## [1] 0.35
\(P((AM \cup PM)^{c})\) =
1 - .85
## [1] 0.15
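The three calculations can also be written with named values, which makes each step traceable to its formula. The variable names below are chosen for illustration. The last check confirms that the four regions of the Venn diagram add up to 1.

```r
# Given information
p_am       <- 0.50   # P(AM)
p_pm       <- 0.65   # P(PM)
p_am_or_pm <- 0.85   # P(AM or PM)

# Derived probabilities
p_both    <- p_am + p_pm - p_am_or_pm   # P(AM and PM)
p_pm_only <- p_pm - p_both              # P(PM and not AM)
p_neither <- 1 - p_am_or_pm             # P(neither newspaper)

c(both = p_both, pm_only = p_pm_only, neither = p_neither)

# The four disjoint regions of the Venn diagram should sum to 1
p_am_only <- p_am - p_both
isTRUE(all.equal(p_am_only + p_both + p_pm_only + p_neither, 1))
```

The comparison uses all.equal() rather than == because decimal fractions such as 0.65 are not stored exactly in floating point.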
David Diez was interested in exploring the factors that contribute to an email being flagged as spam by Gmail’s system. So they downloaded all their emails for a few months in 2012 and noted certain characteristics, such as whether each email was flagged as spam (0 means no, and 1 means yes) and what size of number it contained (none, small, or big). A two-way table of emails with these two characteristics is shown below.
##      Size of number
## Spam  none small  big  Sum
##   0    400  2659  495 3554
##   1    149   168   50  367
##   Sum  549  2827  545 3921
If you were to randomly select an email from this pool, calculate the following probabilities:
367/3921
## [1] 0.09359857
545/3921
## [1] 0.1389952
2659/3921
## [1] 0.6781433
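The three ratios above are, in order, a row total, a column total, and a single cell, each divided by the grand total. As a sketch, the same numbers can be obtained by entering the table as a matrix and letting R do the bookkeeping. The object names emails and joint are chosen for illustration; the counts are copied from the table.

```r
# Two-way table of counts: rows are spam status, columns are number size
emails <- matrix(
  c(400, 2659, 495,
    149,  168,  50),
  nrow = 2, byrow = TRUE,
  dimnames = list(Spam = c("0", "1"), Number = c("none", "small", "big"))
)

addmargins(emails)            # reproduces the Sum row and column

joint <- prop.table(emails)   # each cell divided by the grand total

p_spam           <- sum(joint["1", ])     # row total / grand total
p_big            <- sum(joint[, "big"])   # column total / grand total
p_not_spam_small <- joint["0", "small"]   # single cell / grand total
```

Working from the proportions table avoids retyping totals by hand, which is where copying errors tend to creep in.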
The following data table describes the sex by species breakdown for 333 observed penguins on islands in the Palmer Archipelago, Antarctica.
##            Sex
## Species     female male Sum
##   Adelie        73   73 146
##   Chinstrap     34   34  68
##   Gentoo        58   61 119
##   Sum          165  168 333
If you were to select a penguin at random from these islands, what is the estimated probability that,
165/333
## [1] 0.4954955
119/333
## [1] 0.3573574
34/333
## [1] 0.1021021