Jonathan Olmsted

Deviates from a Truncated Normal Distribution via Rcpp

| Comments

Some edits on April 24 2013

Motivation

Much of the code that I use for my dissertation project is in C++. I need to generate random variables from a Truncated Normal Distribution and I could not find anything in the Boost libraries or the like. So, I had to implement my own generator. This led me an old article: Robert (1995).

After implementing the sampler described in the paper, I decided to compare its performance to that provided by the truncnorm package in R. While this involved me wrapping some functions around my C++ code and creating some function definitions in R, it wasn’t much additional work.

Benchmark

Below are the results of a benchmark of the performance between my code, rtnRcpp (which uses Rcpp heavily) and truncnorm.

Deviates from a Truncated Normal Distribution with mean 0 and standard deviation 1 are pulled from both packages. Each point on the graph reflects the relative performance of the functions from the two packages, each being called 1,000 times. I ask for three different sample sizes: 102, 104, and 106. And, I consider 6 different intervals. The specific reason for these intervals is motivated by the different cases considered in the algorithm. Beyond that, the details do not matter here.

As a note, I checked to make sure that code for both packages was compiled with the same flags.

As you can see, my code (i.e. rtnRcpp) outperforms truncnorm in every case. The function rtruncnorm() from the truncnorm package is anywhere from 18-80% slower. Sample size doesn’t seem to matter in any set way.

Whatmore, as best I can tell, the authors of the truncnorm package are implementing the same algorithm from Robert (1995), so the performance difference is driven only by the C++ code and not the underlying logic. I have every reason to believe that it is the Rcpp-foundation which leads to my realized speed. The truncnorm package definitely implements a different algorithm. When implemented in straight C, the Robert (1995) algorithm fares less well than it had when implemented in C++. Still, for most regions in the parameter space it is faster or as fast.

Release

I’m happy to share details of my implementation if anyone wants to see it.

Comments