I gave a lightning talk at the SF R-Ladies meetup about a problem with R’s sampling algorithm. Check out my slides here!
As part of my dissertation, I dug into the pseudo-random number generators and sampling algorithms used by common statistical packages. Along the way, I found an issue with the way R generates pseudo-random integers using the sample() function. I’ll give an example where we’d like to generate integers uniformly on an interval, but sample() produces twice as many odd numbers as even ones. Our short paper on the problem is on arXiv.