Dear R-Users,

A question: I am the author of the ‘qpcR’ package. It includes a function, ‘propagate’, that performs error propagation based on Monte Carlo simulation, permutation-based confidence intervals and Taylor expansion. For the latter I recently implemented a second-order Taylor expansion term that can correct for nonlinearity. The formulas are quite complex, and I had quite a hard time transforming the second-order term, which includes covariance, into matrix form. The first-order expansion can be represented by GRAD * SIGMA * t(GRAD), with GRAD being the gradient matrix and SIGMA the covariance matrix. The second-order term is 0.5 * trace(HESS * SIGMA * HESS * SIGMA), with HESS being the Hessian matrix of second partial derivatives.
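To make the two terms concrete, here is a minimal sketch under invented assumptions (the example function f(x, y) = x * y, the means and the covariance values are all made up for illustration; this is not the actual ‘propagate’ internals):

```r
## Hypothetical sketch of first- and second-order Taylor error propagation
## for f(x, y) = x * y -- NOT the actual 'propagate' implementation.
mu    <- c(x = 5, y = 3)                          # input means (invented)
SIGMA <- matrix(c(0.1^2, 0.005,
                  0.005, 0.2^2), nrow = 2)        # covariance matrix of (x, y)

GRAD <- matrix(c(mu["y"], mu["x"]), nrow = 1)     # (df/dx, df/dy) = (y, x)
HESS <- matrix(c(0, 1,
                 1, 0), nrow = 2)                 # only d2f/dxdy = 1 is nonzero

var.first  <- GRAD %*% SIGMA %*% t(GRAD)          # first-order variance
var.second <- var.first +
  0.5 * sum(diag(HESS %*% SIGMA %*% HESS %*% SIGMA))  # + second-order correction
```

Here `sum(diag(...))` plays the role of the trace. For this mildly nonlinear f the correction is small; for strongly nonlinear functions it can matter considerably.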

So here is the problem: being a molecular biologist, my statistical skills have limitations. I am pretty sure my implementation is right (I checked the examples in the GUM, the “Guide to the Expression of Uncertainty in Measurement”, and got the same results). BUT: being sure says nothing. It could well be that I’m on the wrong track and people use a function that gives essentially wrong (maybe only slightly wrong, which may still be fatal…) results because they rely on me doing things right.

Now let’s suppose I’m not the only one with an uneasy feeling when submitting packages to CRAN in the hope that everything I did is right.

Which brings me to the question: what are the merits of a peer-review system for R packages? I would suppose (if the paradigm is right that peer review DOES increase quality…) that if packages were independently peer-reviewed, with a handful of other people inspecting the code and checking for detrimental bugs that are enigmatic to the package’s author, this would significantly increase the credibility of a package’s functions.

Now I know that there are the “R Journal” and the “Journal of Statistical Software”, which regularly publish papers in which new R packages are explained in detail. However, these are restricted to the statistical background of the implementations, and I’m not sure whether the lines of code are actually checked.

An idea would be that such a review system be voluntary, but if a package has been reviewed and declared “passed”, this would serve as a quality criterion. For scientists like me who need to write grant applications, this would give more weight to the work one has dedicated to developing a “good” package, which could then perhaps be cited as being “peer-reviewed”.

Pros:

* increase of quality

* maybe citable

* beneficial for grant applications or C.V.

Cons:

* reviewers must meticulously inspect the code (a significant time commitment)

* longer time to CRAN submission

* no platform yet for publishing the reviews in a citable, journal-like form

Comments welcome!

Cheers,

Andrej

The Boost libraries for C++ (http://www.boost.org/) are peer-reviewed. It might be worth looking at how that works.

Hello dear,

While this is a nice post and topic, I have removed it from R-bloggers.com.

As mentioned in the “add your blog” page, r-bloggers should NOT(!!) be used as a forum (which is what this post is).

You should post this to r-help for smart comments. Still, to answer your question from my POV:

1) There was an attempt to have a review system for packages, it is at http://crantastic.org/

Sadly, that project never seems to have reached any critical mass of users or uses.

2) The R journal could never handle the flow of new R packages.

3) If you want your code reviewed, probably the best thing you can do for everyone is to create a battery of tests to verify that your code does what you think it should do. There is even a package for that:

http://cran.r-project.org/web/packages/testthat/index.html
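To illustrate, a minimal testthat sketch (the helper `prop_sum_var` and the numbers are invented for this example; only `test_that` and `expect_equal` are testthat’s real API) that checks a first-order propagation against the closed-form variance of a sum, Var(x + y) = Var(x) + Var(y) + 2 Cov(x, y):

```r
library(testthat)

## Hypothetical helper: first-order propagated variance of f(x, y) = x + y
prop_sum_var <- function(SIGMA) {
  GRAD <- matrix(c(1, 1), nrow = 1)   # gradient of x + y is (1, 1)
  as.numeric(GRAD %*% SIGMA %*% t(GRAD))
}

test_that("propagation matches the analytic variance of a sum", {
  SIGMA <- matrix(c(1.0, 0.3,
                    0.3, 2.0), nrow = 2)
  expect_equal(prop_sum_var(SIGMA), 1.0 + 2.0 + 2 * 0.3)
})
```

Cases with known analytic answers like this make good regression tests: once they are in place, any refactoring that breaks the math fails loudly at check time.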

Good luck

With regards,

Tal

OK, understood…

r-help is probably also not the right place for a forum-like discussion.

Maybe Stackoverflow?

Greets,

Andrej

I think you raise valid points. Pragmatically, however, in-depth reviewing does take a long time, especially when reviewing code.

Note that the Bioconductor project has a reviewing procedure [1], which is more technical and concerned with package interoperability than scientific or statistical in nature. You could always get in touch with an expert in the field and ask whether they would be willing to provide mentoring/support, granting proper acknowledgement in the package and/or derivative work.

Hope this helps,

Laurent

http://www.bioconductor.org/developers/package-guidelines/

Hmm, good point.

However: when we talk about code, does that imply these experts always need to be proficient in R in order to judge the credibility of the package? I would assume that for “niche” statistics it may be hard to find people who cover both worlds (the stats and the R)…

In some cases, the statistics experts might very well be knowledgeable in R, which is software written for/by statisticians in the first place. But the point (and sometimes difficulty) of interdisciplinary science is synergistic interaction between experts with different skills.

I think this is a good question to ask… I tend to trust the packages I install, but maybe there could be a sign-off process of some sort that lets you know which functions have been independently tested and “proven” to work.

A related point in the academic world is citability; in that case, lodging libraries with something like figshare may also be worth considering?

Didn’t know about figshare…

Thanks for pointing it out!