I liked the results from comparing the perl and module releases being tested over time. Most of the hoops to jump through involved release dates and getting R to render the plot the way I wanted. Here are those hoops, along with how I jumped through them.
CPAN Testers has a results page for each module. On there, a nice friendly JSON button links to all the test results:

    [
      {
        "status" : "PASS",
        "osvers" : "2.11",
        "state" : "pass",
        "osname" : "solaris",
        "platform" : "i86pc-solaris-thread-multi",
        "version" : "0.615",
        "distribution" : "XML-Writer",
        "perl" : "5.10.0",
        "fulldate" : "201201182135",
        ...
      },
      ...
    ]

This is really easy to parse with JSON.pm.
Once we have clones of the Perl git repository and XML::Writer we can get the times at which those releases were tagged. Perl’s tags have used a number of conventions so, in essence:

    while read p; do
        ...
        elif git rev-parse --quiet --verify perl-"$p" >/dev/null; then
            t="perl-$p"
        else
            t="v$p"
        ...
        echo "$p,$t","`git show "$t" --pretty=format:Stamp,%ct | grep ^Stamp, | cut -f2 -d,`"

to get output like:
ID | Tag | Stamp |
---|---|---|
XML-Writer-0.3 | xml-writer-0.3 | 944761768 |
XML-Writer-0.4 | xml-writer-0.4 | 954899991 |
... | | |
5.5.3 | perl-5.005_03 | 922659709 |
5.6.0 | perl-5.6.0 | 953789951 |
... | | |
We then process all this using a perl script to get CSV output with the corresponding release dates of the software used in each test.
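
The Perl script itself isn't shown here. As a rough stand-in, the join it performs can be sketched in shell with awk: look up each test's XML::Writer version and perl version in the tag table and emit the two timestamps. The file names `releases.csv` and `tests.csv`, their column layouts, and the test rows are all assumptions made up for illustration; only the release rows come from the table above, and the output column names are taken from the `d$xmlwriter` and `d$perl` references in the R code.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch leaves nothing behind.
cd "$(mktemp -d)" || exit 1

# releases.csv (assumed layout): release ID, git tag, tag time in epoch seconds.
cat > releases.csv <<'EOF'
XML-Writer-0.3,xml-writer-0.3,944761768
XML-Writer-0.4,xml-writer-0.4,954899991
5.5.3,perl-5.005_03,922659709
5.6.0,perl-5.6.0,953789951
EOF

# tests.csv (assumed layout, made-up rows): distribution, version, perl.
cat > tests.csv <<'EOF'
XML-Writer,0.4,5.6.0
XML-Writer,0.4,5.5.3
XML-Writer,0.3,5.5.3
XML-Writer,0.3,5.7.0
EOF

# First file: remember the stamp for each release ID.
# Second file: print both stamps for each test, skipping any test
# whose module or perl release has no entry in the tag table.
awk -F, '
  BEGIN { OFS = ","; print "xmlwriter", "perl" }
  NR == FNR { stamp[$1] = $3; next }
  {
    dist = $1 "-" $2
    if ((dist in stamp) && ($3 in stamp))
      print stamp[dist], stamp[$3]
  }
' releases.csv tests.csv > data.csv

cat data.csv
```

Dropping unmatched tests silently keeps the sketch short; for real data it would probably be worth logging them instead, since a missing tag usually means another naming convention to handle.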
Get the data in with `d <- read.csv('data.csv')` and then start to plot.
A simple `plot(x$key, x$value)` is a great start when analysing data.
If we want control over how the results look, R is happy to give us
that control. Too much? Maybe!
First, plot with the default axes suppressed (`plot(d$xmlwriter, d$perl, ..., xaxt='n', yaxt='n')`), then add in axes for dates:

    dates <- ISOdate(1999:2012, 1, 1)
    axis(1, at=dates, labels=format(dates, "%Y"))
    dates <- ISOdate(seq(2000, 2012, 2), 1, 1)
    axis(2, at=dates, labels=format(dates, "%Y"))

Use `abline` to show guidelines for specific releases:

    # Perl releases:
    abline(h=1305397133, col='darkgray')
    abline(h=1271077269, col='darkgray')
    abline(h=1198315389, col='darkgray')
    abline(h=1027029998, col='darkgray')
    abline(h=953789951, col='darkgray')
    axis(4, at=c(1305397133, 1271077269, 1198315389, 1027029998, 953789951),
         c('v5.14.0', 'v5.12.0', 'v5.10.0', 'v5.8.0', 'v5.6.0'),
         tick=FALSE, las=2)

For each XML::Writer release, I wanted to show the mean release date of the perls it had been tested with. We can use `aggregate` across the tests to group the perl release dates (`d$perl`) by the corresponding XML::Writer version, then summarise by taking the mean and plot it as a line:

    points(aggregate(d$perl, list(d$xmlwriter), mean), type='l', col='blue')

This shows that the test perls are getting more recent; it’s nice to show and to quantify this kind of thing.
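
As a cross-check outside R, the per-release means that `aggregate` computes can be reproduced with a few lines of awk. This is only a sketch: `sample.csv` is a made-up three-row sample in the two-column layout assumed for `data.csv` (XML::Writer release stamp, perl release stamp), not real test data.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch leaves nothing behind.
cd "$(mktemp -d)" || exit 1

# Made-up sample: header, then one row per test with two epoch stamps.
cat > sample.csv <<'EOF'
xmlwriter,perl
954899991,953789951
954899991,922659709
944761768,922659709
EOF

# Sum and count the perl stamps per XML::Writer stamp, then print the means.
# awk's for-in order is unspecified, so sort numerically on the first field.
awk -F, '
  NR > 1 { sum[$1] += $2; n[$1]++ }
  END { for (k in sum) printf "%s,%.0f\n", k, sum[k] / n[k] }
' sample.csv | sort -t, -k1,1n > means.csv

cat means.csv
```

Each output row pairs an XML::Writer release stamp with the mean stamp of the perls it was tested on, which is what the blue line traces.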