Today’s exercises are intended as a review of basic R features and operations. Remember that discussion attendance is completely optional but highly recommended. Also, if you finish the material early, don’t be afraid to leave early.
Remember that in R, if an operation works on a single number, it will usually also work across a vector. For example, if you multiply a vector by a number, each element of the vector is multiplied by that number. If you multiply two vectors of the same length, the first elements of both vectors are multiplied together, the second elements are multiplied together, and so on. This also works for functions like exp() or pnorm().
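As a minimal sketch of this vectorized behaviour (the vectors x and y are made-up illustration values, not part of the exercises), something like the following could be run in the console:

```r
# Hypothetical example vectors, chosen only for illustration
x <- c(1, 2, 3)
y <- c(10, 20, 30)

2 * x      # scalar times vector: 2 4 6
x * y      # element-wise product: 10 40 90

# Functions such as exp(), pnorm() and qnorm() are applied element by element
exp(x)
pnorm(c(-1.96, 0, 1.96))
qnorm(c(0.025, 0.5, 0.975))

# Comparisons are vectorized too and return a logical vector
is_big <- x >= 2   # FALSE TRUE TRUE
sum(is_big)        # TRUE counts as 1, so this counts the matches: 2
mean(is_big)       # proportion of matches
```

Because TRUE and FALSE are treated as 1 and 0 in arithmetic, sum() on a logical vector counts the TRUE values and mean() gives their proportion.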
Using >=, compare this vector with 0 to show whether each number is greater than or equal to 0. Use sum() on the resulting vector to count how many numbers satisfy this criterion (or alternatively, use mean() to get the proportion; think about why this works!).
Use qnorm() to get a vector of 15 points evenly spaced out along the quantiles of the normal distribution. Note: this is how you obtain the theoretical quantiles for a QQ-plot.
Functions are a useful way of creating a tool that can be used over and over again. Good functions usually (but not necessarily always) satisfy the following:
Write a function for each of the following parts:
Given n and k, computes the binomial coefficient. You can use the factorial function for simplicity.
Rolls n 6-sided dice and gives the average of the outcomes. n should have a default value of 2.
(Without using sd()) computes the sample standard deviation of a vector.
Note: functions in R have a different scope than the global environment. Read this for a helpful guide about this. Also note that declaring or updating a global variable from inside a function is considered bad practice, since it can easily introduce bugs that are very difficult to detect and fix. Avoid this if you can!
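One possible sketch of the three functions described in this section is shown here. The names binom_coef, roll_mean and sample_sd are hypothetical, and other designs are equally valid. Each function relies only on its arguments and local variables, in line with the note about scope.

```r
# Binomial coefficient "n choose k" via factorials.
# Assumption: n and k are small; factorial() overflows for large inputs.
binom_coef <- function(n, k) {
  factorial(n) / (factorial(k) * factorial(n - k))
}

# Roll n fair six-sided dice and return the average outcome (n defaults to 2)
roll_mean <- function(n = 2) {
  rolls <- sample(1:6, size = n, replace = TRUE)
  mean(rolls)
}

# Sample standard deviation without sd():
# square root of the sum of squared deviations divided by (length - 1)
sample_sd <- function(x) {
  sqrt(sum((x - mean(x))^2) / (length(x) - 1))
}

binom_coef(5, 2)                 # 10
roll_mean()                      # random value between 1 and 6
sample_sd(c(2, 4, 4, 4, 5, 5, 7, 9))
```

Comparing the results against the built-in choose() and sd() is a quick way to check such functions.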
It’s important to be able to write clear and effective conditionals (if, else, etc…) in R. It’s often very useful to check if a condition is satisfied and then do different things depending on the outcome.
For this exercise, briefly review sections 7.3-7.5 of this page.
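As a small illustration of conditionals (the function classify_sign is a hypothetical example, not taken from the referenced sections), an if / else if / else chain might look like this:

```r
# Return a label describing the sign of a single number
classify_sign <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

classify_sign(-3)   # "negative"
classify_sign(0)    # "zero"

# if() expects a single TRUE/FALSE; for whole vectors, ifelse() is the usual tool
ifelse(c(-1, 0, 2) >= 0, "non-negative", "negative")
```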
For loops are a useful way of repeating a step a set number of times.
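A minimal sketch of a for loop wrapped in a function with a default number of repetitions follows. The function name simulate_means and the repeated step (averaging two dice rolls) are assumptions made purely for illustration.

```r
# Repeat a step n times and collect the results (n defaults to 1000)
simulate_means <- function(n = 1000) {
  results <- numeric(n)          # preallocate the output vector
  for (i in seq_len(n)) {
    # the repeated step: average of two fair six-sided dice
    results[i] <- mean(sample(1:6, size = 2, replace = TRUE))
  }
  results
}

set.seed(1)                      # make the simulation reproducible
out <- simulate_means()
length(out)                      # 1000
```

Preallocating the results vector and iterating over seq_len(n) avoids growing the vector inside the loop and behaves correctly even when n is 0.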
[...] n times, with a default n=1000:
For each of the following, identify one or more random variables that can be used to model the outcome.
Choose a type of random variable that has a finite mean (e.g. normal, binomial, Poisson, geometric, exponential, uniform, etc…) and choose some parameters. Write down the theoretical mean of this particular distribution (you can use Wikipedia to get the expected value for your random variable).
Randomly generate at least 1000 observations of the variable you chose (if your computer can generate more, go ahead!). Then, use the running.mean() function defined below to compute a running mean (i.e. each number in the output is the mean of all the input values up to and including that position). Plot this running mean using the plot() function, and use abline() to add a horizontal red line at your previously computed theoretical mean.
Explain what is happening here. (Hint: is this consistent with the Law of Large Numbers? Why or why not?).
# define running average function
# can be specified as cumulative sum / index of element
running.mean = function(vec) cumsum(vec) / seq_along(vec)
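To show how the pieces fit together, here is a sketch that assumes an exponential distribution with rate 2 (theoretical mean 1/2). The distribution, the rate, the seed and the sample size are arbitrary choices for illustration; any distribution with a finite mean could be substituted.

```r
# Running mean: cumulative sum divided by the index of each element
running.mean <- function(vec) cumsum(vec) / seq_along(vec)

set.seed(42)                     # arbitrary seed, for reproducibility
rate <- 2                        # assumed parameter of the exponential
theoretical_mean <- 1 / rate     # mean of an Exponential(rate) variable

x <- rexp(10000, rate = rate)    # 10000 simulated observations
rm_x <- running.mean(x)

# Plot the running mean and mark the theoretical mean with a red line
plot(rm_x, type = "l",
     xlab = "Number of observations", ylab = "Running mean")
abline(h = theoretical_mean, col = "red")
```

The last element of the running mean equals mean(x), which makes a convenient sanity check.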