Probability of a Revenue Threshold


A retailer’s website purchases have an average order size of \$100 and a standard deviation of \$75. What is the probability of 10 orders generating over \$1,250 in Revenue?

mean = \$100.00
stdev = \$75.00

avg_order_needed = \$1250/10 = \$125.00
standard_error = \$75/sqrt(10) = \$23.72
z-score = (125.00 - 100.00)/23.72 = 1.05

We are looking for the upper-tail area under the standard normal curve, to the right of z = 1.05. Looking up 1.05 in a z-table gives 0.1469, so there is about a 14.7% probability of obtaining over \$1,250 in revenue from 10 random orders.
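The same steps can be reproduced in R without a z-table. This is a minimal sketch: the variable names are chosen here for illustration, and it assumes, as the z-score calculation does, that the average of 10 orders is approximately normal.

```r
# Inputs from the example (variable names are illustrative)
mean_order     <- 100    # average order size
sd_order       <- 75     # standard deviation of order size
n_orders       <- 10
revenue_target <- 1250

avg_order_needed <- revenue_target / n_orders        # 125
standard_error   <- sd_order / sqrt(n_orders)        # about 23.72
z_score <- (avg_order_needed - mean_order) / standard_error

# Upper-tail probability; lands slightly below the table value of 0.1469
# because z is not rounded to 1.05 here
p_over_target <- pnorm(z_score, lower.tail = FALSE)
print(round(c(z = z_score, p = p_over_target), 4))
```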

Web Traffic using Linear Modeling


Here is a simple example of using linear regression to understand how web traffic changes over time. My data is daily web traffic hits for the past 8 months; here are the first few rows:

date ,visits
10/11/14 ,37896
10/12/14 ,24098
10/13/14 ,35550
10/14/14 ,38610
10/15/14 ,35739
10/16/14 ,30316
…. through May 2015

First, I want to plot the data and add a line of best fit:
```
plot(data\$date, data\$visits, pch=19, col="blue", main="Web Traffic", xlab="Date", ylab="Visits")
lm1 <- lm(data\$visits ~ data\$date)   # regress daily visits on date
abline(lm1, col="red", lwd=3)          # overlay the line of best fit
lm1
#Coefficients:
#(Intercept) data\$date
#-2404.5259 148.9
```

To interpret this model: the slope says we see about 149 additional hits each day, on average.
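To see why the slope reads as "visits gained per day", here is a small sketch on made-up data. Everything in it is hypothetical (the `traffic` data frame, the 30-day window, the built-in growth of 150 visits per day); it is not the traffic data from this post.

```r
# Hypothetical data: traffic grows by 150 visits/day with a fixed +/-500 wobble
day     <- 0:29
visits  <- 30000 + 150 * day + rep(c(-500, 500), 15)
traffic <- data.frame(day = day, visits = visits)

fit   <- lm(visits ~ day, data = traffic)
slope <- coef(fit)[["day"]]   # estimated absolute change in visits per day

# The estimate should land close to the 150 visits/day built into the data
print(round(slope, 1))
```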

That model works well for the absolute increase, but what if we want the average percentage increase? To get that, we can run the linear regression on the log of visits and exponentiate the coefficients:

`round(exp(coef(lm(I(log(data\$visits+1))~data\$date))),4)`
(Intercept) data\$date
0.00000 1.00322

To interpret: the exponentiated slope of 1.00322 corresponds to roughly a 0.3% increase in web traffic per day.
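The reason exponentiating works: a slope b on the log scale means each extra day multiplies traffic by exp(b), so exp(b) - 1 is the daily percentage growth. A minimal sketch on hypothetical data (the 0.3% growth rate and starting level are assumptions built into the example, not the post's data):

```r
# Hypothetical data: traffic compounding at exactly 0.3% per day
day     <- 0:99
visits  <- 30000 * 1.003^day
traffic <- data.frame(day = day, visits = visits)

# Same model form as above: log(visits + 1) regressed on time
log_fit       <- lm(log(visits + 1) ~ day, data = traffic)
growth_factor <- exp(coef(log_fit)[["day"]])   # multiplicative change per day
pct_per_day   <- (growth_factor - 1) * 100     # daily growth in percent

# Should recover a value very close to the 0.3% built into the data
print(round(pct_per_day, 3))
```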

Another way we could look at the change per day is a generalized linear model with a Poisson family.
```
plot(data\$date, data\$visits, pch=19, col="green", xlab="Date", ylab="Visits")
glm1 <- glm(data\$visits ~ data\$date, family="poisson")
abline(lm1, col="red", lwd=3)                    # linear model line
lines(data\$date, fitted(glm1), col="blue", lwd=3) # Poisson GLM fitted curve
confint(glm1, level=0.95)                        # 95% confidence intervals
#2.5 % 97.5 %
#(Intercept) -55.999943551 -45.190626728
#data\$date 0.002976299 0.003632503
```

To interpret: we are 95% confident that the daily slope on the log scale falls between about 0.0030 and 0.0036, i.e. web hits grow by roughly 0.30% to 0.36% per day, which is right in line with the previous method of using linear regression on the log.
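Because the Poisson model uses a log link, the interval bounds can be converted to percent growth per day by exponentiating them. A short sketch using the two bounds reported by `confint` and the 1.00322 estimate from the log-linear model (the variable names are illustrative):

```r
# Interval bounds for the date coefficient, copied from the confint output
ci_log <- c(lower = 0.002976299, upper = 0.003632503)

# Log-link coefficients become percent change per day via exp(b) - 1
ci_pct <- (exp(ci_log) - 1) * 100
print(round(ci_pct, 3))

# The log-linear estimate (1.00322, i.e. about 0.322% per day) sits inside
loglinear_pct <- (1.00322 - 1) * 100
inside <- loglinear_pct > ci_pct[["lower"]] && loglinear_pct < ci_pct[["upper"]]
print(inside)
```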

Probability of having 7 Girls of 8 Total Children


Here is a simplified example of calculating a probability when each outcome has a 50% chance. It puts the binomial distribution to good use.

# chance of having exactly 7 girls out of 8 children

probability mass function:
`dbinom(7, size=8, prob=0.5)`
# 0.03125, or about 3.1%
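Note that `dbinom` returns the probability of exactly 7 girls. If the question is "7 or more", the cumulative function `pbinom` is the tool. A minimal sketch contrasting the two (variable names are illustrative):

```r
n_children <- 8
p_girl     <- 0.5

# Exactly 7 girls: probability mass function, 8/256
p_exactly_7 <- dbinom(7, size = n_children, prob = p_girl)

# At least 7 girls: upper tail P(X > 6) = P(X = 7) + P(X = 8), 9/256
p_at_least_7 <- pbinom(6, size = n_children, prob = p_girl, lower.tail = FALSE)

print(c(exactly_7 = p_exactly_7, at_least_7 = p_at_least_7))
```

The two answers differ only by the probability of 8 girls out of 8, which is 1/256.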

Probability of Web Clicks in a Day


Below is a simplified example in R of estimating the probability that a day has a certain number of visits. The web visits are approximately normally distributed, and we want to know the probability of getting fewer than 50 visits in a day.

```
# web traffic for the last seven days
web_visits <- c(64, 34, 55, 47, 52, 59, 77)
visits_day <- mean(web_visits)      # mean = 55.4
sd_visits_day <- sd(web_visits)     # standard deviation = 13.5
goal_visits <- 50

# P(visits < 50) under a normal with the sample mean and sd
pnorm(goal_visits, mean=visits_day, sd=sd_visits_day)
# 0.344, or a 34.4% probability you'll have fewer than 50 web visits
```

source: Statistical Inference, Johns Hopkins University/Coursera, by Brian Caffo