Publication-quality plots with Gnuplot

After careful consideration of the alternatives on different platforms, I’ve concluded that the best software for generating publication-quality plots is Gnuplot. The R suite is much better suited for statistical analysis, but its plot generation capabilities aren’t as flexible and, to my eye, the results don’t look as good. (Full disclosure: I am mainly speaking of the base plotting library; I have not extensively used other libraries such as lattice or ggplot2.) One can generate publication-quality graphs with Excel, but it is a chore and, most importantly, one cannot easily set the exact sizes of diagrams or generate high-resolution images of graphs.

In some ways, the Gnuplot language syntax seems old-fashioned, but once you learn it (or better yet, learn to write programs that generate it), it is very capable. While the PNG driver in Gnuplot doesn’t generate terribly good-looking output, both the PostScript and SVG drivers work very well. I find them particularly useful for publication-quality scatter diagrams. Generally, you create a file called something.plt containing the commands to be passed to ‘gnuplot’ (‘wgnuplot.exe’ on WinXP). To use the postscript terminal to generate an EPS file (best for images around 3″ square), use these commands:

  set terminal postscript eps enhanced size 3in,3in
  set output 'file.eps'

To generate a PostScript file (best for images around 6″-6.5″ square):

  set terminal postscript enhanced size 6in,6in
  set output 'file.ps'

To generate an SVG file (best for images around 4″-6″ square):

  set terminal svg enhanced size 500,500
  set output 'file.svg'

The svg terminal doesn’t seem to accept the ‘in’ unit suffix in the size option, but in SVG-land 100=1in. I like svg the most, because the resulting graph may be further annotated with the open-source SVG editor Inkscape. The remaining instructions therefore assume the svg driver.
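Because the three setups differ only in terminal name and size, a script-generating program can produce them from one function. The following is a minimal Python sketch; the function name terminal_commands and its parameters are hypothetical helpers invented for illustration, and the svg branch simply applies the 100-units-per-inch rule of thumb stated above.

```python
def terminal_commands(fmt, inches, basename="file"):
    """Return the gnuplot lines selecting a terminal and an output file.

    fmt is "eps", "ps" or "svg"; inches is the side length of a square plot.
    This is a hypothetical helper, not a gnuplot feature.
    """
    if fmt == "eps":
        term = f"set terminal postscript eps enhanced size {inches:g}in,{inches:g}in"
    elif fmt == "ps":
        term = f"set terminal postscript enhanced size {inches:g}in,{inches:g}in"
    elif fmt == "svg":
        # Assumption taken from the text: 100 svg size units per inch.
        units = round(inches * 100)
        term = f"set terminal svg enhanced size {units},{units}"
    else:
        raise ValueError(f"unsupported format: {fmt!r}")
    return [term, f"set output '{basename}.{fmt}'"]


if __name__ == "__main__":
    # Print the two lines for a 5-inch square SVG plot.
    for line in terminal_commands("svg", 5):
        print(line)
```

The returned lines can be written at the top of a .plt file ahead of the title, axis and plot commands.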

The ‘enhanced’ option to the svg terminal is important. It enables markup in titles and axis labels, such as superscripts and character codes:

  set encoding iso_8859_1
  set title 'This is the title of the graph'
  set xlabel 'Molecular weight (kDa)'
  set ylabel 'Solvent-inaccessible surface area x 1000 ({\305}^2)'

‘^’ produces a superscript and ‘_’ a subscript, as in LaTeX. Non-ASCII characters may be included as well, by giving their octal code in the chosen encoding:

{\305}  Å (angstrom; octal 305 in ISO 8859-1)

Controlling the legend (key):

  set nokey          # turn off the legend, or
  set key top left   # place it in the top left corner

Controlling the axis ranges:

  set xrange [0:10]
  set xtics 0,2,10   # set an increment of 2
  set yrange [0:1000]

Defining a function to plot, such as a regression line:

  f(x) = 0.039440 * x + 0.678467
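The coefficients in such a definition have to come from somewhere. One option is to compute them in the program that generates the script. The sketch below does an ordinary least-squares fit of a straight line in plain Python and formats the result as a Gnuplot function definition. The names linear_fit and gnuplot_function are hypothetical, and nothing here implies that the coefficients shown above were obtained this way.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def gnuplot_function(slope, intercept):
    """Format the fitted line as a gnuplot function definition."""
    return f"f(x) = {slope:.6f} * x + {intercept:.6f}"


if __name__ == "__main__":
    slope, intercept = linear_fit([0, 1, 2, 3], [1, 3, 5, 7])
    print(gnuplot_function(slope, intercept))
```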

Now that all of the parameters are set, we use the plot command to actually generate the graph:

  plot 'data1.dat' using 1:2 w p, \
       'data2.dat' using ($2/1000):($3/1000) w lp pt 13 ps 1.5 lt 1 lc -1 lw 1, \
       f(x)
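When a plot command spans several lines, every line except the last must end with a backslash, and forgetting one is an easy mistake to make by hand. A generating program can take care of it. This is a minimal Python sketch; plot_command is a hypothetical helper name, not something Gnuplot provides.

```python
def plot_command(elements):
    """Join plot elements into one gnuplot plot command.

    Each element goes on its own line; every line but the last ends with
    a comma and a backslash so gnuplot reads them as a single command.
    """
    if not elements:
        raise ValueError("at least one plot element is required")
    separator = ", \\\n     "
    return "plot " + separator.join(elements)


if __name__ == "__main__":
    print(plot_command([
        "'data1.dat' using 1:2 w p",
        "'data2.dat' using ($2/1000):($3/1000) w lp pt 13 ps 1.5 lt 1 lc -1 lw 1",
        "f(x)",
    ]))
```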

‘w’ stands for ‘with’. Its first argument is the style, which determines how the data are plotted. Some possible styles are:

points(p)       - unconnected points
linespoints(lp) - points connected by lines

Some of the other arguments are:

pointtype(pt)  - controls the appearance of points
pointsize(ps)  - controls the size of points
linetype(lt)   - controls the appearance of lines (solid, dashed, etc.)
linecolor(lc)  - controls the line (and point) color
linewidth(lw)  - controls the width of lines

Unfortunately, the arguments for these options are specified as arbitrary numbers, which are defined by the specific driver. For the SVG driver, these are the pointtypes:

0   dot
1   +
2   x
3   *
4   open square
5   closed square
6   open circle
7   closed circle
8   open upward-pointing triangle
9   closed upward-pointing triangle
10  open downward-pointing triangle
11  closed downward-pointing triangle
12  open diamond
13  closed diamond

The SVG linecolors:

-1  black
0   gray
1   red
2   green
3   blue
4   cyan
5   dark green
6   dark blue
7   orange
8   teal?
9   light green
10  purple
11  light orange
12  magenta
13  yet another green
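Since the numeric codes are hard to remember, a generating program can look them up by name. The Python sketch below copies the SVG tables above into dictionaries and builds the style part of a plot element. The dictionary keys and the function plot_style are names invented here, and the color table leaves out the entries whose names are uncertain.

```python
# Point types for the svg terminal, as listed in the table above.
SVG_POINTTYPES = {
    "dot": 0, "plus": 1, "cross": 2, "star": 3,
    "open square": 4, "closed square": 5,
    "open circle": 6, "closed circle": 7,
    "open up triangle": 8, "closed up triangle": 9,
    "open down triangle": 10, "closed down triangle": 11,
    "open diamond": 12, "closed diamond": 13,
}

# Line colors for the svg terminal (a subset of the table above).
SVG_LINECOLORS = {
    "black": -1, "gray": 0, "red": 1, "green": 2, "blue": 3, "cyan": 4,
    "dark green": 5, "dark blue": 6, "orange": 7, "light green": 9,
    "purple": 10, "light orange": 11, "magenta": 12,
}


def plot_style(style, pointtype=None, pointsize=None, linetype=None,
               linecolor=None, linewidth=None):
    """Build a 'w <style> ...' string, looking up names in the tables."""
    parts = ["w", style]
    if pointtype is not None:
        parts += ["pt", str(SVG_POINTTYPES[pointtype])]
    if pointsize is not None:
        parts += ["ps", f"{pointsize:g}"]
    if linetype is not None:
        parts += ["lt", str(linetype)]
    if linecolor is not None:
        parts += ["lc", str(SVG_LINECOLORS[linecolor])]
    if linewidth is not None:
        parts += ["lw", f"{linewidth:g}"]
    return " ".join(parts)


if __name__ == "__main__":
    print(plot_style("lp", pointtype="closed diamond", pointsize=1.5,
                     linetype=1, linecolor="black", linewidth=1))
```

Because the codes are driver-specific, a separate pair of tables would be needed for the postscript terminal.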

There do not appear to be any linetypes other than solid for the SVG driver (though that can be corrected in Inkscape). If that is an issue, the postscript terminal does have many kinds of dashed lines. An easy way to see the capabilities of a terminal is to issue the ‘test’ command.

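The images can then be exported with Inkscape at arbitrary resolution to a lossless PNG file. This can also be done at the command line (the options shown are those of Inkscape 0.x; Inkscape 1.0 and later replace ‘--export-png’ with ‘--export-filename’):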

  $ inkscape --export-area-drawing --export-png=file.png \
             --export-dpi=300 file.svg

Using the open-source program ‘convert’ from the ImageMagick suite of tools, the PNG can then be converted to TIFF or another format, optionally removing the transparency and adding a border:

  $ convert -background "#ffffff00" -flatten -bordercolor "#ffffff00" \
            -border 50x50 input.png output.tiff

‘convert’ may also be used to convert an EPS image to a raster format:

  $ convert -density 300 input.eps -geometry 900x900 output.tiff

It can also convert SVG to raster formats, but I find its output inferior to that of Inkscape.
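In the EPS conversion above, the 900x900 geometry is presumably just the 3″ plot size multiplied by the 300 dpi density. A small Python sketch makes that arithmetic explicit and builds the matching argument list; raster_pixels and eps_to_raster_command are hypothetical helper names, and the ‘convert’ options are only the ones already used above.

```python
def raster_pixels(inches, dpi):
    """Pixel length of a side that is `inches` long at `dpi` dots per inch."""
    return round(inches * dpi)


def eps_to_raster_command(src, dst, inches, dpi=300):
    """Argument list for the ImageMagick EPS conversion shown above.

    Assumes a square image whose side is `inches` long.
    """
    side = raster_pixels(inches, dpi)
    return ["convert", "-density", str(dpi), src,
            "-geometry", f"{side}x{side}", dst]


if __name__ == "__main__":
    # Print the command line for a 3-inch square EPS at 300 dpi.
    print(" ".join(eps_to_raster_command("input.eps", "output.tiff", 3)))
```

The list form can be passed directly to subprocess.run if the conversion is to be launched from the same program.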