In this exercise, you will simulate probability distributions and use the probability distributions available in scipy.
Solve the tasks described below. Write a short report containing your answers, including the plots. Send the report and your Python code by email to the course instructor (richard.johansson -at- gu.se). If you wish, your solution to the first question can be submitted on paper directly.
NB: submit your answers individually. You are allowed to discuss with your fellow students, but not write code together.
Deadline: January 30
scipy
scipy's statistical functions and random variables.
Think of the following scenarios and draw the probability mass function (pmf) of the corresponding random variables. A rough sketch on paper is enough.
Make a new Python file that starts with the following imports:
from matplotlib import pyplot as plt
import random
import scipy
import scipy.stats
We define a function that tosses an uneven coin and returns 'heads' or 'tails' depending on the outcome:
def coin_toss(p_heads):
    if random.random() <= p_heads:
        return 'heads'
    else:
        return 'tails'
Next, we make another function that simulates an experiment where we toss the coin a number of times, and count how many times we got 'heads'.
def count_heads(p_heads, n_toss):
    tosses = [ coin_toss(p_heads) for _ in range(n_toss) ]
    return tosses.count('heads')
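As a quick sanity check of the two functions above, the following sketch fixes the random seed so that repeated runs give the same numbers, and tries a coin that always lands heads. The seed value and the variable names always and typical are arbitrary choices for illustration, not part of the exercise.

```python
import random

def coin_toss(p_heads):
    if random.random() <= p_heads:
        return 'heads'
    else:
        return 'tails'

def count_heads(p_heads, n_toss):
    tosses = [coin_toss(p_heads) for _ in range(n_toss)]
    return tosses.count('heads')

random.seed(0)  # arbitrary seed, only to make the run repeatable

always = count_heads(1.0, 20)   # a coin that always lands heads
typical = count_heads(0.7, 20)  # one experiment with the uneven coin
print(always, typical)
```

If the first number is not equal to n_toss, something is wrong with the functions.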
Write a function that calls count_heads several times and collects the results of all the calls in a list. Then print the mean and standard deviation of the results, and plot a histogram of them (using first plt.hist and then either plt.show or plt.savefig).
Run your function four times, where you call count_heads 10, 100, 1000, and 10000 times, respectively. Use p_heads=0.7 and n_toss=20.
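One possible way to get the mean and standard deviation of a list of numbers is the standard library's statistics module, sketched below. The list results is made-up data standing in for the output of a few experiments. Note that there are two common definitions of the standard deviation (dividing by N or by N - 1); for a large number of experiments the difference is small, but it is good to say in the report which one you used.

```python
import statistics

# Hypothetical results of six experiments (number of heads in each).
results = [14, 15, 12, 16, 13, 14]

mean = statistics.mean(results)
std_population = statistics.pstdev(results)  # divides by N
std_sample = statistics.stdev(results)       # divides by N - 1
print(mean, std_population, std_sample)
```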
Hint 1: If your histogram is ugly, increase the parameter bins in the plt.hist function. I used a value of 100.
Hint 2: To make the plots easier to compare, you can adjust the x and y axes:
plt.axis([-1, n_toss+1, 0, n_experiments])
Here, n_experiments is the number of times you have called count_heads.
Hint 3: You can use plt.xlabel and plt.ylabel to add text to the x and the y axis, respectively.
plt.xlabel('Number of heads')
plt.ylabel('Frequency')
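To show how the three hints fit together, here is a minimal sketch that plots a histogram for a different experiment: repeated rolls of a fair die. The data, the seed, the axis limits, and the file name die_rolls_hist.png are assumptions made for this illustration. The Agg backend is selected only so that the script also runs on a machine without a display.

```python
import random

import matplotlib
matplotlib.use('Agg')  # draw off-screen so the script runs without a display
from matplotlib import pyplot as plt

random.seed(1)  # arbitrary seed, only to make the run repeatable
n_experiments = 500
rolls = [random.randint(1, 6) for _ in range(n_experiments)]

# plt.hist returns the bar heights and the bin edges, which can be inspected.
counts, bin_edges, _ = plt.hist(rolls, bins=100)
plt.axis([0, 7, 0, n_experiments])
plt.xlabel('Die roll')
plt.ylabel('Frequency')
plt.savefig('die_rolls_hist.png')
plt.close()
```

With many narrow bins, each possible outcome gets its own thin bar, which is why a large value of bins tends to look better for data that only takes integer values.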
Make a binomially distributed random variable using the same parameters n_toss and p_heads as above. This r.v. is a mathematical model of the coin-tossing experiment.
rv = scipy.stats.binom(n_toss, p_heads)
We will now plot the pmf for the coin-tossing experiment. This is similar to what I did for the die roll r.v., on slide 9 in my lecture.
First, what are the possible outcomes we could get in an experiment where we toss a coin 20 times and count the number of times we get 'heads'? Make a list of all these possible outcomes.
Then compute the pmf for all the possible outcomes of the coin-tossing experiment. Finally plot the result using a bar plot:
outcomes = (... a list of all the possible results of a coin-tossing experiment ...)
pmf_for_outcomes = (... the probability for each of those possible results ...)
plt.bar(outcomes, pmf_for_outcomes, width=0.1)
plt.axis([-1, n_toss+1, 0, 1])
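For comparison, the same recipe applied to a fair die might look roughly like the sketch below. This is not necessarily the code from the lecture: it assumes the die is modeled with scipy.stats.randint, and the names die, die_outcomes, and die_pmf as well as the file name are made up for the illustration.

```python
import matplotlib
matplotlib.use('Agg')  # draw off-screen so the script runs without a display
from matplotlib import pyplot as plt
import scipy.stats

# Uniform distribution over 1, ..., 6 (the upper bound 7 is excluded).
die = scipy.stats.randint(1, 7)

die_outcomes = list(range(1, 7))
die_pmf = [die.pmf(k) for k in die_outcomes]

plt.bar(die_outcomes, die_pmf, width=0.1)
plt.axis([0, 7, 0, 1])
plt.savefig('die_pmf.png')
plt.close()
```

A useful check for any pmf is that the probabilities of all possible outcomes sum to 1.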
In addition, print the mean and standard deviation of this random variable. Did your simulations in Task 2 give you reasonable results compared to what you get now?
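A frozen scipy.stats random variable has mean and std methods. The sketch below uses them on a hypothetical fair coin tossed 10 times (deliberately not the parameters of the exercise) and compares them with the textbook formulas for a binomial distribution, mean n*p and standard deviation sqrt(n*p*(1-p)).

```python
import math

import scipy.stats

n, p = 10, 0.5  # hypothetical parameters, not the ones from the exercise
fair = scipy.stats.binom(n, p)

# Values reported by scipy ...
print(fair.mean(), fair.std())
# ... and the same quantities from the closed-form expressions.
print(n * p, math.sqrt(n * p * (1 - p)))
```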
Compute the following probabilities. Use the binomial random variable rv to do the calculations, but to better understand what you are doing, it can also be useful to explain the calculations in terms of the plot you made in Task 3.
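As a hedged illustration of the methods that are typically useful for this kind of question, the sketch below computes a few probabilities for a hypothetical fair coin tossed 10 times. The events and variable names are invented for the example. In terms of a pmf bar plot, each of these is the total height of a group of bars.

```python
import scipy.stats

fair = scipy.stats.binom(10, 0.5)  # hypothetical: 10 tosses of a fair coin

p_exactly_3 = fair.pmf(3)              # P(X = 3): one bar
p_at_most_3 = fair.cdf(3)              # P(X <= 3): bars 0 to 3
p_at_least_4 = fair.sf(3)              # P(X > 3) = P(X >= 4): bars 4 to 10
p_between = fair.cdf(6) - fair.cdf(3)  # P(4 <= X <= 6): bars 4 to 6
print(p_exactly_3, p_at_most_3, p_at_least_4, p_between)
```

Note the off-by-one detail for a discrete variable: cdf(k) includes the outcome k, while sf(k) excludes it.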
Finally, compute the 5th percentile of the coin-tossing experiment: the smallest number N such that in at least 5% of all experiments, the number of heads is N or lower.
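Percentiles can be read off with the ppf method (the inverse of cdf) of a frozen random variable. The sketch below again uses a hypothetical fair coin tossed 10 times rather than the coin from the exercise, and checks the result against the cdf.

```python
import scipy.stats

fair = scipy.stats.binom(10, 0.5)  # hypothetical: 10 tosses of a fair coin

n_low = fair.ppf(0.05)  # smallest k with P(X <= k) >= 0.05
print(n_low)

# The cdf at n_low reaches 5%, while the cdf one step below does not.
print(fair.cdf(n_low), fair.cdf(n_low - 1))
```

Because the distribution is discrete, the cdf at the returned value is usually somewhat larger than 5% rather than exactly 5%.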