NOTE: We are going out of order from the textbook in Chapter 3.

Expected Value (Speegle Ch 3.2)

A probability mass function gives a complete description of a random variable’s behavior. Often we don’t need to know everything about a variable and would rather summarize it. One feature of a distribution we are often interested in is its central tendency. One measure of central tendency is the expected value, or mean, of the random variable. The terms expected value and mean are used interchangeably.

Definition: Expected Value

For a discrete random variable $X$ with pmf $p(x)$, the expected value of $X$ is

$$E(X) = \sum_x x \, p(x)$$

where the sum is taken over all possible values of the random variable $X$.
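
The definition translates almost directly into R. The sketch below is only an illustration: `expected.value` is a hypothetical helper name (not from the textbook or these notes), and the fair-die pmf is an assumed example chosen because its mean is easy to check by hand.

```r
# Hypothetical helper, not part of the original notes:
# the expected value of a discrete random variable with values x and pmf p
expected.value <- function(x, p) {
  stopifnot(length(x) == length(p))        # one probability per value
  stopifnot(all(p >= 0))                   # probabilities are non-negative
  stopifnot(isTRUE(all.equal(sum(p), 1)))  # and sum to 1 (up to rounding)
  sum(x * p)                               # sum of value times probability
}

# Assumed example: a fair six-sided die, values 1 to 6 with probability 1/6 each
die.mean <- expected.value(1:6, rep(1/6, 6))
die.mean
```

The checks at the top are a design choice: a pmf whose probabilities do not sum to 1 would otherwise silently give a wrong mean.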

Example

Two books are assigned for a statistics class: a textbook costing $137 and its corresponding study guide costing $33. The university bookstore determined 20% of enrolled students do not buy either book, 55% buy the textbook only, and 25% buy both books, and these percentages are relatively constant from one term to another.

Let $X$ be a random variable that denotes how much a single student will spend on their course materials. The pmf is:

Textbook only: $137
Textbook + Study guide: $137 + $33 = $170
Neither: $0

x	0	137	170
p(x)	.20	.55	.25

Calculate E(X) and interpret this value in context:

0 * .20 + 137 *.55 + 170*.25

## [1] 117.85

On average, a single student will spend $117.85 on their course materials.
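
Written out term by term, the calculation above applies the definition of expected value to the three possible spending amounts:

```latex
\begin{aligned}
E(X) &= \sum_x x \, p(x) \\
     &= 0(0.20) + 137(0.55) + 170(0.25) \\
     &= 0 + 75.35 + 42.50 \\
     &= 117.85
\end{aligned}
```

Note that $117.85 is not an amount any single student can actually spend; it is the long-run average over many students.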

Confirm your results using simulation.

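# Possible amounts spent (in dollars) and their probabilities
dollars <- c(0, 137, 170)
prob <- c(.2, .55, .25)

# Simulate the spending of 10,000 students and average it
spend <- sample(dollars, size=10000, prob=prob, replace=TRUE)
mean(spend)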

## [1] 117.5886
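
The simulated mean is close to, but not exactly, the theoretical value of 117.85. A rough sketch of why: the gap tends to shrink as the number of simulated students grows. The seed and the particular sample sizes below are arbitrary choices made for this illustration, so no specific output is shown.

```r
set.seed(1)  # arbitrary seed, used only so the sketch is reproducible

dollars <- c(0, 137, 170)
prob <- c(.2, .55, .25)

# Simulated mean spending for increasingly large samples
sizes <- c(100, 1000, 10000, 100000)
sim.means <- sapply(sizes, function(n) {
  mean(sample(dollars, size = n, prob = prob, replace = TRUE))
})

# Compare each simulated mean with the theoretical expected value
data.frame(size = sizes,
           sim.mean = sim.means,
           abs.error = abs(sim.means - 117.85))
```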

You try it:

A retirement portfolio’s value increases by 18% during a financial boom and by 9% during normal times. It decreases by 12% during a recession. What is the expected return on this portfolio if each scenario is equally likely?

Define a random variable.

Let $X$ be the percent change in portfolio value.

Write down the pmf.

x	18	9	-12
p(x)	1/3	1/3	1/3

Calculate the theoretical expected value. Write your answer in a full sentence in context of the problem.

18*1/3 + 9*1/3  - 12*1/3

## [1] 5

In the long run this portfolio’s value is expected to increase by 5%.

Confirm using simulation.

value <- c(18, 9, -12)

# With no prob argument, sample() draws each value with equal probability
portfolio.change <- sample(value, size=10000, replace=TRUE)
mean(portfolio.change)

## [1] 4.8567

Variance and standard deviation (Speegle Ch 3.5)

Although the mean is a useful descriptive statistic, it only tells us where the center of the distribution is located. For instance, the following table gives the monthly temperatures (in degrees Fahrenheit) of New York City and San Francisco:

months	J	F	M	A	M	J	J	A	S	O	N	D
NYC	32	34	42	53	63	72	77	76	68	57	48	37
SF	49	52	53	56	58	62	63	64	65	61	55	49

The mean temperature for San Francisco is about 57 degrees and the mean temperature for New York City is about 55 degrees, so their mean yearly temperatures are about the same. Do you notice anything different about the two cities with regard to monthly temperatures?

NYC has higher highs and lower lows than SF: its monthly temperatures run from 32 to 77 degrees, while SF’s stay between 49 and 65 degrees. SF therefore has a much narrower range of temperatures than NYC.
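
The comparison can be checked directly from the table. This is a minimal sketch; the vector names `nyc` and `sf` are introduced here for illustration, and the values are copied from the table above.

```r
# Monthly temperatures from the table, January through December
nyc <- c(32, 34, 42, 53, 63, 72, 77, 76, 68, 57, 48, 37)
sf  <- c(49, 52, 53, 56, 58, 62, 63, 64, 65, 61, 55, 49)

# The centers are close to each other
mean(nyc)         # 54.91667
mean(sf)          # 57.25

# The spreads are not: width of the range (max minus min)
diff(range(nyc))  # 45
diff(range(sf))   # 16
```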

To distinguish between two distributions with similar means, it is useful to have a statistic that measures how spread out each distribution is. The variance and standard deviation are such measures.

Definition: Variance

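Suppose $X$ is a random variable with mean $\mu$. The variance of $X$, denoted by $\text{Var}(X)$ or $\sigma^2$, is defined as follows:

$$\text{Var}(X) = E\left((X - \mu)^2\right) = \sum_x (x - \mu)^2 \, p(x)$$

where the sum form applies to a discrete random variable with pmf $p(x)$.

The variance of a distribution provides a measure of the spread or dispersion of the distribution around its mean $\mu$.

The standard deviation of a random variable $X$ is the square root of the variance. We denote the standard deviation by $\sigma$ and the variance by $\sigma^2$. E.g.:

$$\sigma = \sqrt{\sigma^2} = \sqrt{\text{Var}(X)}$$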

Which of the two distributions below has the larger variance?

par(mfrow=c(1,2))
plot(proportions(table(sample(1:5, size=1000, replace=TRUE))), ylab="probability")
plot(proportions(table(sample(1:10, size=1000, replace=TRUE))), ylab="probability")

Example

Let’s return to the statistics book example and calculate the variance $\sigma^2$ and the standard deviation $\sigma$ of $X$. Recap: The textbook costs $137 and the study guide costs $33. 20% of students don’t buy either book, 55% buy the textbook only, and 25% buy both books. Confirm your results using simulation.

x <- c(0, 137, 170)
p.x <- c(.2, .55, .25)
mu <- sum(x*p.x)

Theoretical

(x.minus.mu <- x - mu)

## [1] -117.85   19.15   52.15

(var.dollars <- sum(x.minus.mu^2 * p.x))

## [1] 3659.327

sqrt(var.dollars)

## [1] 60.49238
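
As a cross-check on the theoretical value, the variance can also be computed with the standard identity Var(X) = E(X^2) - mu^2. These notes do not state that identity, so treat the sketch below as an optional aside; the names `ex2` and `var.shortcut` are made up for this example.

```r
x <- c(0, 137, 170)
p.x <- c(.2, .55, .25)
mu <- sum(x * p.x)

# E(X^2): square each value first, then weight by its probability
ex2 <- sum(x^2 * p.x)

# Shortcut form of the variance
var.shortcut <- ex2 - mu^2

# Agrees with the definition-based calculation, up to floating-point error
all.equal(var.shortcut, sum((x - mu)^2 * p.x))
```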

Simulation

spend <- sample(x, size=10000, prob=p.x, replace=TRUE)
var(spend)

## [1] 3595.947
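
One detail worth knowing when comparing against simulation: R’s `var()` computes the sample variance, which divides by n - 1 rather than n. With 10,000 draws the difference is negligible, as the sketch below suggests. The seed is arbitrary and `var.n` is a name introduced only for this illustration.

```r
set.seed(1)  # arbitrary seed, used only so the sketch is reproducible

x <- c(0, 137, 170)
p.x <- c(.2, .55, .25)
spend <- sample(x, size = 10000, prob = p.x, replace = TRUE)
n <- length(spend)

# Sample variance as computed by R (denominator n - 1)
var(spend)

# Average squared deviation from the simulated mean (denominator n)
var.n <- mean((spend - mean(spend))^2)
var.n

# The two differ only by the factor (n - 1) / n
all.equal(var.n, var(spend) * (n - 1) / n)
```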

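Bonus: You may have noticed that the formula for $\sigma^2$ has the same form as a dot product: a sum of products of corresponding elements of two vectors. You can perform this kind of vector multiplication in R using the %*% operator. The result is returned as a 1 x 1 matrix rather than a plain number.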

x.minus.mu^2 %*% p.x

##          [,1]
## [1,] 3659.327

You try it:

Return to the retirement portfolio question (Recap: the value increases by 18% during a financial boom and by 9% during normal times, and decreases by 12% during a recession. Each scenario is equally likely). Calculate the variance and standard deviation.

value.chg <- c(18, 9, -12)
# All three scenarios are equally likely, so a single 1/3 is recycled across the values
p.chg <- 1/3

Theoretical

(mean.chg <- sum(value.chg * p.chg))

## [1] 5

(var.chg <- sum((value.chg-mean.chg)^2 * p.chg))

## [1] 158

(sd.chg <- sqrt(var.chg))

## [1] 12.56981
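
The same theoretical values can be worked out by hand, which makes it easier to see where each number in the R output comes from:

```latex
\begin{aligned}
\mu      &= \tfrac{1}{3}(18 + 9 - 12) = 5 \\
\sigma^2 &= \tfrac{1}{3}\left[(18 - 5)^2 + (9 - 5)^2 + (-12 - 5)^2\right] \\
         &= \tfrac{1}{3}(169 + 16 + 289) = \tfrac{474}{3} = 158 \\
\sigma   &= \sqrt{158} \approx 12.57
\end{aligned}
```

Because the values are percent changes, the standard deviation is in percentage points, while the variance is in squared percentage points.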

Simulation

portfolio.change <- sample(value.chg, size=10000, replace=TRUE)
var(portfolio.change)

## [1] 157.2113

sd(portfolio.change)

## [1] 12.53839

Section 3.2: Expected Value & Variance
