24 August 2010

Org-mode, gnuplot, and The Fall [2 of 2]

This is part two of a two-part post on The Fall and org-mode/gnuplot. Part 1 dealt with The Fall; this post will focus on the methods used to make the plots and some other info.

I ended up writing up what I learned in the process of making my graphs for The Fall in a more comprehensive document. It looks like it will make it onto Worg, the org-mode wiki site. Rather than rehash exactly what I did for The Fall graphs, then, I thought I'd just post a modified version of my write-up instead. While the data tables and specific points are different, these three examples cover how I made the graphs for The Fall.

The examples below assume at least some familiarity with org-mode and/or gnuplot. While the code is written in org-mode babel blocks, the same lines would work from a gnuplot terminal as well. For a quick-start, check out:

- Org-mode manual on gnuplot
- A gnuplot tutorial
- The gnuplot homepage

One last note: for quality purposes, I have exported to .eps and then used ImageMagick to convert to .png images. If you are not familiar with ImageMagick or don't want to use it, remove the
set terminal postscript color solid eps enhanced 20
line from the examples below and change all instances of
:file file-name.eps
to
:file file-name.png
and that should export directly to .png files for you.

Here we go!

-----

I have asked about various gnuplot tips and tricks on the org-mode mailing list, and I thought I would put together a short piece on what I've come to use and understand so that it's all in one place.

On that note… this information is all available elsewhere: the org-mode mailing list, blogs, the gnuplot manual and so on. All I'm doing is putting together some things I consider handy right in one place for ease of use. I'll try to make reference to original sources where possible.

To get gnuplot up and running, follow the guide on worg.1

Once babel and gnuplot are working in org-mode, the setup is generally like this:

Data Table (if pulling from a table and not a formula)
#+tblname: data-table
| x | y1 | y2 |
|---+----+----|
| 0 | 3 | 6 |
| 1 | 4 | 7 |
| 2 | 5 | 8 |


Gnuplot Source Block
#+begin_src gnuplot :var data=data-table :file output.png

gnuplot code goes here

#+end_src
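
As a minimal sketch of what could go in place of the placeholder line, assuming the data-table above (the axis labels and line widths are just illustrative choices of mine):

#+begin_src gnuplot :var data=data-table :file output.png
reset

set xlabel "x"
set ylabel "y"

# Column 1 holds x; columns 2 and 3 hold the two y series
plot data using 1:2 with lines lw 2 title 'y1',\
data using 1:3 with lines lw 2 title 'y2'
#+end_src

Babel writes the table to a temporary file and substitutes its name for data, which is why data can be used wherever gnuplot expects a file name.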


On to examples!

Named X-Tics



Summary
This option allows one to have named tics on the x-axis (I'm not sure if named y-tics are possible). Named tics are already possible if the x value column contains names, but this option allows for placing them wherever one wants. The "normal" way (just setting the x index to a column with names) evenly spaces the names along the x-axis.6 I wanted non-uniform spacing.7

Make a table with a column for the value of the x-tic (location is another way to think of it), another column with the name (label) for each x-tic, and then add whatever subsequent y values should correspond.

Example
#+tblname: x-tics
|----------+-------+------|
| tic name | x-loc | Dead |
|----------+-------+------|
| Civil | 1861 | 0.62 |
| WWI | 1914 | 9.8 |
| WWII | 1939 | 24 |
| Nam | 1955 | 1.5 |
| Gulf | 1990 | 0.04 |
|----------+-------+------|

And here is the gnuplot code we'll use:

#+begin_src gnuplot :var data=x-tics :file x-tics.eps :results silent
reset
set terminal postscript color solid eps enhanced 20

set yrange [0:25]
set xtics ("1850" 1850, "2010" 2010)
set xrange [1850:2010]
set ylabel "Deaths (MM)"
set xlabel "Wars in Time"
set title 'War Deaths'

plot data using 2:3:xticlabels(1) w p lw 3 notitle
#+end_src




Notes
For more than one set of y values, just do something like this:
plot data u 2:3:xticlabels(1) w lines title 'Set1',\
data u 2:4:xticlabels(1) w lines title 'Set2',\
data u 2:5:xticlabels(1) w lines title 'Set3'
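
On the named y-tics question from the summary: gnuplot also has a yticlabels() specifier that mirrors xticlabels(). The following is only a sketch that flips the war-deaths example onto its side; I have not run it against the table above, and the output file name is made up.

#+begin_src gnuplot :var data=x-tics :file y-tics.eps :results silent
reset
set terminal postscript color solid eps enhanced 20

set xrange [0:25]
set yrange [1850:2010]
set xlabel "Deaths (MM)"
set ylabel "Wars in Time"

# Column 3 is now the x value, column 2 the y position, column 1 the tic label
plot data using 3:2:yticlabels(1) w p lw 3 notitle
#+end_src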


Different Scales



Summary
Different scales are accomplished through multiplot by making small graphs and then adjusting their sizes, origins (position), and margins in order to place them side-by-side and create the appearance of a single graph.8 9

Example
#+tblname: world-pop
|----------+--------+------|
| tic name | x-loc | Pop |
|----------+--------+------|
| 10k BC | -10000 | 1 |
| | -9000 | 3 |
| | -8000 | 5 |
| | -7000 | 7 |
| | -6000 | 10 |
| | -5000 | 15 |
| | -4000 | 20 |
| | -3000 | 25 |
| | -2000 | 35 |
| | -1000 | 50 |
| | -500 | 100 |
| AD 1 | 1 | 200 |
| 1000 | 1000 | 310 |
| 1750 | 1750 | 791 |
| 1800 | 1800 | 978 |
| 1850 | 1850 | 1262 |
| 1900 | 1900 | 1650 |
| \'50 | 1950 | 2519 |
| | 1955 | 2756 |
| | 1960 | 2982 |
| | 1965 | 3335 |
| | 1970 | 3692 |
| \'75 | 1975 | 4068 |
| | 1980 | 4435 |
| | 1985 | 4831 |
| | 1990 | 5263 |
| | 1995 | 5674 |
| | 2000 | 6070 |
| 2005 | 2005 | 6454 |
|----------+--------+------|

The code is as follows:

#+begin_src gnuplot :var data=world-pop :file world-pop.eps :results silent
reset
set terminal postscript color solid eps enhanced 20

set xrange [ -10000 : 1 ]
set yrange [ 0 : 7000 ]
set xlabel "Time"
set multiplot

set size 0.275,1
set origin 0.0,0.0
set lmargin 10
set rmargin 0
set ylabel "Population (MM)"
plot data using 2:3:xticlabels(1) with lines lw 3 notitle

set origin 0.275,0.0
set size 0.15,1
set format y ""
set lmargin 0
set rmargin 0
set xrange [2 : 1750]
set ylabel ""
plot data using 2:3:xticlabels(1) with lines lw 3 notitle

set origin 0.425,0.0
set size 0.575,1
set format y ""
set lmargin 0
set rmargin 2
set xrange [1751 : 2005]
set ylabel ""
plot data using 2:3:xticlabels(1) with lines lw 3 notitle

unset multiplot
#+end_src




Notes
  • Size sets the width/height of the piece
  • Origin sets where the plot begins: left at (0,0), middle at (left-size,0), and right at (left-size + middle-size,0)
  • Margins determine the border spacing. Left has enough for the Y-axis
    title (lmargin) and 0 for rmargin, middle has 0 for both (to
    seamlessly fit with left and right), and right has 0 lmargin and a
    little rmargin to make things look nice
  • I use xranges like so: [x1 : x2], [x2+1 : x3], [x3+1 : x4]
  • The set ylabel "" is there on the middle and right pieces to keep the labels from repeating on each y axis
  • It's possible to remove the y axes by using "set border" options for each piece
    • Left would want "set border 1+2+4", mid = "set border 1+4", and right = "set border 1+4+8"
    • Use "unset ytics" (older syntax: "set noytics") to remove the floating tic lines
    • I found this visually appealing but potentially confusing since no y-axes for each slice might give the illusion that the x-axis is the same scale. If the y axes are there, it makes one realize that there is something else going on…
  • Note that there is no title and there are three x-axis labels. Each "piece" gets its own – I suppose I could set it to null using 'set xlabel ""' but wanted to illustrate the behavior. Setting a title would create three titles so I left it off. Perhaps there's a way to manually place the title but I'm too unfamiliar to be aware of it right now.
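
The border and title notes above could be sketched roughly as follows. This is an untested variation on the example, not the code behind the original graph: the label tag, the screen position (0.5, 0.96), the tmargin value, and the output file name are my own guesses and will likely need tuning. A label placed in screen coordinates is drawn by every plot in the multiplot, so it is removed after the first piece. The x-axis label is left out here.

#+begin_src gnuplot :var data=world-pop :file world-pop-borders.eps :results silent
reset
set terminal postscript color solid eps enhanced 20

set yrange [ 0 : 7000 ]
# Leave room above the plots for the manually placed title
set tmargin 3
set multiplot

# One title for the whole figure, in screen coordinates (0 to 1)
set label 1 "World Population" at screen 0.5, 0.96 center

# Left piece: bottom + left + top borders, y tics kept
set size 0.275,1
set origin 0.0,0.0
set lmargin 10
set rmargin 0
set border 1+2+4
set ytics nomirror
set xrange [ -10000 : 1 ]
set ylabel "Population (MM)"
plot data using 2:3:xticlabels(1) with lines lw 3 notitle

# Remove the label so the next two plots do not draw it again
unset label 1

# Middle piece: bottom + top borders only, no y tics
set origin 0.275,0.0
set size 0.15,1
set lmargin 0
set border 1+4
unset ytics
set xrange [2 : 1750]
set ylabel ""
plot data using 2:3:xticlabels(1) with lines lw 3 notitle

# Right piece: bottom + top + right borders
set origin 0.425,0.0
set size 0.575,1
set rmargin 2
set border 1+4+8
set xrange [1751 : 2005]
plot data using 2:3:xticlabels(1) with lines lw 3 notitle

unset multiplot
#+end_src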


Broken Axis



Summary
One can use arrows to break up axes very cleverly.10 11 12 The general method is to draw 6 arrows to break the x-axes both at the top and the bottom: 4 diagonal and 2 white (to create the illusion of a break).

While the following is not really to scale, I think the example of a far distant date with a broken line and then some recent dates shows how this can work in an esthetically pleasing way. We'll just use the same world population data, but modify it a tad.

Example
#+tblname: broken-axis
|-----------+-------+-----+------|
| tic name | x-loc | Pre | Post |
|-----------+-------+-----+------|
| 10,000 BC | 1600 | 1 | |
| | 1650 | 15 | |
| AD 1 | 1700 | 200 | |
| 1750 | 1750 | | 791 |
| 1800 | 1800 | | 978 |
| 1850 | 1850 | | 1262 |
| 1900 | 1900 | | 1650 |
| \'50 | 1950 | | 2519 |
| | 1955 | | 2756 |
| | 1960 | | 2982 |
| | 1965 | | 3335 |
| | 1970 | | 3692 |
| \'75 | 1975 | | 4068 |
| | 1980 | | 4435 |
| | 1985 | | 4831 |
| | 1990 | | 5263 |
| | 1995 | | 5674 |
| | 2000 | | 6070 |
| 2005 | 2005 | | 6454 |
|-----------+-------+-----+------|

The code to be used:

#+begin_src gnuplot :var data=broken-axis :file broken-axis.eps :results silent
reset
set terminal postscript color solid eps enhanced 20

A=1725
B=1600
C=2010
D=0
E=6500

xoff=.005*(C-B)
yoff=.02*(E-D)

set arrow 1 from A-xoff, D to A+xoff, D nohead lw 2 lc rgb "#ffffff" front
set arrow 2 from A-xoff, E to A+xoff, E nohead lw 2 lc rgb "#ffffff" front
set arrow 3 from A-xoff-xoff, D-yoff to A+xoff-xoff, D+yoff nohead front
set arrow 4 from A-xoff+xoff, D-yoff to A+xoff+xoff, D+yoff nohead front
set arrow 5 from A-xoff-xoff, E-yoff to A+xoff-xoff, E+yoff nohead front
set arrow 6 from A-xoff+xoff, E-yoff to A+xoff+xoff, E+yoff nohead front

set xrange [B:C]
set yrange [D:E]

set xlabel 'Time'
set ylabel 'Population (MM)'
set title 'World Population'

plot data u 2:3:xticlabels(1) w l lw 3 notitle,\
data u 2:4:xticlabels(1) w l lw 3 lc 1 notitle
#+end_src




Notes
In any case, from the above:
  • A->E are just used to set the break location (A) and the xrange (B,C) and yrange (D,E)
  • xoff/yoff have to do with the break. xoff is the gap created in the x-axis and yoff is the height above and below the scale. The multipliers work for this example but may need to get tinkered with for others.
  • The arrows draw the 4 diagonal lines and a white line in between them to create the actual break
  • I used two sets of y values and two plot commands to create the break between AD 1 and 1750. This is not always needed. See footnote 4 for how to do this with a continuous function (the site uses sin x) and an "offset" variable to bump the whole thing over a tad.
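
To see where the arrows actually land, the offsets can be worked out by hand: xoff = .005*(2010-1600) = 2.05 and yoff = .02*(6500-0) = 130. The white arrows therefore cover x = 1722.95 to 1727.05 on each axis, the left slash runs from x = 1720.9 to 1725, the right slash from 1725 to 1729.1, and each slash reaches 130 above and below its axis. A small sketch for checking these numbers from a gnuplot prompt (the print lines are my addition, not part of the original code):

#+begin_src gnuplot
A=1725
B=1600
C=2010
D=0
E=6500

xoff=.005*(C-B)
yoff=.02*(E-D)

# Extent of the white arrows that blank out the axis (arrows 1 and 2)
print "white gap:   ", A-xoff, " to ", A+xoff
# Horizontal extent of the two diagonal slashes (arrows 3/5 and 4/6)
print "left slash:  ", A-2*xoff, " to ", A
print "right slash: ", A, " to ", A+2*xoff
# Vertical extent of the slashes around the bottom axis (y = D)
print "slash height:", D-yoff, " to ", D+yoff
#+end_src

Changing the two multipliers and re-running this is a quick way to judge a new break size before redrawing the whole plot.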


The example at gnuplot-tricks uses a continuous function (sin x), which probably works best since the x-axis scale stays the same on both sides of the break. In this case, though, the axis is "cheated": it is not only broken, but its scale is artificially manipulated. In the data table, the early population values belong at 10,000 BC, 5,000 BC, and AD 1. Instead I put them at 1600, 1650, and 1700 AD. The spacing is proportionate, but compressed by 100x (5,000 years vs. 50). Compared to the plot from 1750-2005, it's obviously not the same x-axis scale. While not technically correct, I think it's perhaps more visually appealing, especially where scale is not too important. It gets the point across very well: left of the break there was not much growth; then, over a much shorter time span to the right of the break, a great deal of population growth occurred. The multiple scales in the previous section represent the scale more correctly, but I think this example is cleaner.
