Getting ready to take off for ECVP in
Leuven. I have a poster to present that includes some of the last work
Christie Haskell Marsh did before she completed her PhD (she now works as
a senior data scientist with Johnson & Johnson in their Baby Products
Division). In addition to Christie's data, there is a second data set,
collected with virtually the same procedure as a replication. I recorded in the readme for the repo
of my Haskell parsing code all the pain I caused myself, but rarely
is the easy way the fun way, or the interesting way, or the educational
way or the "moral" way. So, I am back on another painful path trying to
make a scientific poster that is reproducible.

This isn't about replicability as in "crisis"; this is about
making a scientific document that clearly documents what you did and how
you did it, and that allows others to repeat your analyses exactly as
you did them to produce the poster, manuscript, or blog. Unfortunately,
the tools that you need to do this require you to work at it, which makes
it less likely people will do it. If you like making your poster or
figures in PowerPoint or Illustrator, I do not know how you will be able
to do this. But if you are willing to spend some time, have some
patience, and are willing to compromise a bit on your aesthetic vision,
it is not too hard to achieve this goal right now.

For the poster in Leuven I wrote the poster as an Rnw file. This is a
combination of R and noweb format that lets me use R to conduct the
analyses and generate the figures, and then use LaTeX tools to produce
the document I will display. I have done other posters starting with an
org file and using org-babel for including the code, but if you are going
to have to write a bunch of LaTeX and R anyway, the convenience of
org-mode is largely absent.

Here in short is the basic production line. Do
whatever you want to do in RStudio or elsewhere until you have a pretty
good idea of the workflow that the poster will need (you can actually
include code blocks from other documents, but I did not do that here).
Then get to writing your Rnw file. When ready, you move over to R and
run library(knitr), then knit("yourFileName.Rnw"). My current draft of
this file is here. The result of this will be a file: yourFileName.tex.
You can change that output if you want, but tex is the default. Then you
LaTeX the file as many times as you need to, with the tools you have set
up, to get a PDF version. The
benefit of this approach is that if you have my data (and I will be
posting this somewhere publicly soon) you can start with my raw file
and reconstruct the poster. Don't like my analysis? Do your own. You
will have the exact code I used available to change.

So, this doesn't sound so hard. What is the problem? Well, getting your
tools set up. And then there is the fact that if you want to deviate from
the established templates or default mode, there can be a lot of time on
Stack Overflow trying to get the tweaks just so. But it is the right way.
Our analyses
and our choices should be transparent. Throw away your programs of
oppression and free yourself to code your posters. Reproducible
scientists of the world unite!
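
For anyone who wants to try the production line described above, here is a minimal sketch in R. It is not my poster file: the toy data, the chunk name, and the use of a temporary directory are all assumptions made up for illustration, and the only real calls are knitr's knit() and base R's file helpers. It writes a tiny Rnw document, knits it to a tex file, and leaves the LaTeX step as a comment because that depends on the tools you have set up.

```r
# A minimal sketch of the Rnw -> tex production line.
# Everything named here (toy data, chunk label, temp paths) is hypothetical.

# An Rnw file is LaTeX with R chunks delimited by <<...>>= and @.
rnw_lines <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<toy-summary, echo=FALSE>>=",
  "x <- c(1, 2, 3, 4)   # stand-in for the real data",
  "m <- mean(x)",
  "@",
  "The mean of the toy data is \\Sexpr{m}.",
  "\\end{document}"
)

# Write the Rnw source to a scratch location.
rnw_file <- file.path(tempdir(), "yourFileName.Rnw")
writeLines(rnw_lines, rnw_file)

# Knit only if knitr is installed; knit() runs the chunks and
# returns the path of the tex file it wrote.
if (requireNamespace("knitr", quietly = TRUE)) {
  tex_file <- knitr::knit(
    rnw_file,
    output = sub("\\.Rnw$", ".tex", rnw_file),
    quiet = TRUE
  )
  # Next step, assuming a working LaTeX installation:
  # tools::texi2pdf(tex_file)  # reruns LaTeX as often as needed
}
```

The point of keeping the analysis inside the chunks is that the numbers and figures in the document are regenerated from the data every time you knit, so the tex file never drifts away from the code that produced it.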