R base plotting without wrappers

Base plotting is as old as R itself, yet for most users it remains mysterious. They may call plot() regularly, or even know the full list of its parameters, but few ever understand it fully. This article attempts to demystify base graphics with a friendly introduction for the uninitiated.

Deconstructing a plot

Soon after learning R, users start producing various figures by calling plot(), hist(), or barplot(). Then, when faced with a complicated figure, they start stacking those plots on top of one another using various hacks, like add=TRUE, ann=FALSE, cex=0. For most, this marks the end of their base plotting journey, and they leave with the impression that it is an ad-hoc bag of tricks that has to be learned and remembered but is otherwise hard, inconsistent, and unintuitive. Nowadays even experts who write about base graphics(1) or compare it with other systems(2) share the same opinion. However, those initial functions everyone starts with are only wrappers on top of smaller functions that do all the work. Many would be surprised to learn that, under the hood, base plotting follows the paradigm of a set of small functions that each do one thing and work well with one another.

Let’s start with the simplest example.

plot(0:10, 0:10, xlab = "x-axis", ylab = "y-axis", main = "my plot")

The plot() function above is really just a wrapper that calls an array of lower-level functions.

plot.new()
plot.window(xlim = c(0,10), ylim = c(0,10))
points(0:10, 0:10)
axis(1)
axis(2)
box()
title(xlab = "x-axis")
title(ylab = "y-axis")
title(main = "my plot")

Written like this, all the elements comprising the plot become clear. Every new function call draws a single object on top of the plot produced up until that point. It becomes easy to see which line should be modified in order to change something on the plot. As an example, let’s modify the above plot by: 1) adding a grid, 2) removing the box around the plot, 3) removing the axis lines, 4) making the axis tick labels bold, 5) turning the annotation labels red, and 6) shifting the title to the left.

plot.new()
plot.window(xlim = c(0,10), ylim = c(0,10))
grid()
points(0:10, 0:10)
axis(1, lwd = 0, font.axis=2)
axis(2, lwd = 0, font.axis=2)
title(xlab = "x-axis", col.lab = "red3")
title(ylab = "y-axis", col.lab = "red3")
title(main = "my plot", col.main = "red3", adj = 0)

In each case only a single line had to be added, removed, or modified to achieve the desired effect. The function names are also very intuitive. A person without any R experience would have little trouble saying which element of the plot is added by which line, or changing some of the parameters.
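The same building blocks can be rearranged freely. The sketch below is an illustration that is not part of the original examples: it uses made-up data (x and y are hypothetical) and draws a line with markers on top of a grid. The order of the calls is the drawing order, so the grid ends up behind the data.

```r
x <- 0:10
y <- x^2                        # hypothetical data for illustration

plot.new()
plot.window(xlim = range(x), ylim = range(y))
grid()                          # drawn first, so it stays behind the data
lines(x, y, col = "grey40")     # connects the points in sequence
points(x, y, pch = 19)          # markers drawn on top of the line
axis(1)
axis(2)
title(main = "lines and points", xlab = "x", ylab = "x squared")

# limits of the plot region in user coordinates: c(xmin, xmax, ymin, ymax)
usr <- par("usr")
```

With the default axis style, plot.window() pads each requested range by 4% on both ends, which is why par("usr") reports slightly wider limits than the ones passed in.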

So, in order to construct a plot, various functions are called one by one. But where do we get the names of all those functions? Do we need to remember hundreds of them? It turns out that the set of things you might need to do on a plot is pretty limited.

par()          # specifies various plot parameters
plot.new()     # starts a new plot
plot.window()  # adds a coordinate system to the plot region

points()       # draws points
lines()        # draws lines connecting a sequence of points
abline()       # draws infinite lines throughout the plot
arrows()       # draws arrows
segments()     # draws line segments between pairs of points
rect()         # draws rectangles
polygon()      # draws complex polygons

text()         # adds written text within the plot
mtext()        # adds text in the margins of a plot

title()        # adds plot and axis annotations
axis()         # adds axes
box()          # draws a box around a plot
grid()         # adds a grid over a coordinate system
legend()       # adds a legend

The above list covers the majority of the functionality needed to recreate almost any plot. For a quick demonstration, example() can be used to see what each of those functions does, e.g. example(rect). R also has some other helpful functions, like rug() and jitter(), that make certain situations easier, but they are not crucial and can be implemented using the ones listed above.
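As a rough illustration of that last point, here is a minimal sketch of rug-like tick marks drawn with segments(). It is a simplified stand-in under stated assumptions, not how rug() itself is implemented: the data in x are made up, and the tick height of 3% of the plot height is an arbitrary choice.

```r
set.seed(1)
x <- runif(20, min = 0, max = 10)    # hypothetical sample data

plot.new()
plot.window(xlim = c(0, 10), ylim = c(0, 1))
axis(1)
box()

# rug-like ticks: short vertical segments rising from the bottom edge
usr  <- par("usr")                   # c(xmin, xmax, ymin, ymax) of the plot region
tick <- 0.03 * (usr[4] - usr[3])     # tick height: 3% of the plot height (arbitrary)
segments(x0 = x, y0 = usr[3], x1 = x, y1 = usr[3] + tick)
```

Because segments() is vectorized, all twenty ticks are drawn by a single call.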

Function names are quite straightforward, but what about their arguments? Indeed, some argument names, like cex, can seem quite cryptic. But the argument name is always an abbreviation for a property of the plot(3). For example, col is shorthand for “color”, lwd stands for “line width”, and cex means “character expansion”. The good news is that, in general, the same arguments stand for the same properties across all base R plotting functions. And for a specific function, help() can always be used to get the list of all arguments and their descriptions.
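To make that consistency concrete, the sketch below passes the same three arguments to several different functions. The coordinates and the label are arbitrary values chosen for illustration.

```r
plot.new()
plot.window(xlim = c(0, 10), ylim = c(0, 10))

# col, lwd, and cex keep their meaning from one function to the next
points(2, 8, col = "red3", cex = 2)               # color and size of a point
lines(c(1, 9), c(5, 5), col = "red3", lwd = 3)    # color and width of a line
text(5, 2, "some text", col = "red3", cex = 2)    # color and size of text
box(col = "red3", lwd = 3)                        # color and width of the box
axis(1, col = "red3", lwd = 3)                    # color and width of the axis line

# arguments passed to a drawing function affect only that call;
# the session-wide defaults stored in par() are left untouched
cex_default <- par("cex")
lwd_default <- par("lwd")
```

Setting the same properties through par() instead would change the defaults for every subsequent call on the device.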

To further illustrate the consistency between arguments, let’s return to the first example. By now it should be pretty clear, with one exception - the axis(1) and axis(2) lines. Where do the numbers 1 and 2 come from? The numbers specify positions around the plot: they start from 1, which refers to the bottom of the plot, and go clockwise up to 4, which refers to the right side. The picture below demonstrates the relationship between the numbers and the four sides of the plot.

plot.new()
box()
mtext("1", side = 1, col = "red3")
mtext("2", side = 2, col = "red3")
mtext("3", side = 3, col = "red3")
mtext("4", side = 4, col = "red3")

The same position numbers are used throughout many different functions. Whenever a parameter of some function needs to specify a side, chances are it will do so using the numeric notation described above. Below are a few examples.

par(mar = c(0,0,4,4))        # margins of a plot: c(bottom, left, top, right)
par(oma = c(1,1,1,1))        # outer margins of a plot
axis(3)                      # side where axis will be displayed
text(x, y, "text", pos = 3)  # pos selects the side the "text" is displayed at
mtext("text", side = 4)      # side specifies the margin "text" will appear in
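Put together, these calls might be used as in the following sketch, which moves the axes to the top and right sides. The margin sizes, the labels, and the line offsets are arbitrary choices made for illustration.

```r
# c(bottom, left, top, right): leave room only on the top and right sides
old <- par(mar = c(1, 1, 4, 4))

plot.new()
plot.window(xlim = c(0, 10), ylim = c(0, 10))
points(0:10, 0:10)
box()
axis(3)                                     # axis on the top side
axis(4)                                     # axis on the right side
mtext("top label", side = 3, line = 2.5)    # text in the top margin
mtext("right label", side = 4, line = 2.5)  # text in the right margin
text(5, 5, "above the point", pos = 3)      # pos = 3 puts the text above (5, 5)

mar_used <- par("mar")                      # margins in effect for this plot
par(old)                                    # restore the previous margins
mar_after <- par("mar")
```

Calling par() with new values returns the old ones, which makes it easy to restore the previous settings once the plot is done.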

Another important point is vectorization. Almost all the arguments of base plotting functions are vectorized. For example, when plotting rectangles, the user does not have to add each rectangle one by one within a loop. Instead, all the related objects can be drawn with one function call while specifying different positions and parameters for each.

plot.new()
plot.window(xlim = c(0,3), ylim = c(0,3))

rect(xleft = c(0,1,2), ybottom = c(0,1,2), xright = c(1,2,3), ytop = c(1,2,3),
     border = c("pink","red","darkred"), lwd = 10
     )

Here is another example, producing a checkerboard pattern.

plot.new()
plot.window(xlim = c(0,10), ylim = c(0,10))

xs <- rep(1:9, each = 9)
ys <- rep(1:9, times = 9)

rect(xs-0.5, ys-0.5, xs+0.5, ys+0.5, col = c("white","darkgrey"))
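The checkerboard works because the two colors are recycled across all 81 rectangles. The sketch below spells that recycling out without drawing anything; cols and board are helper names introduced here for illustration only.

```r
xs <- rep(1:9, each = 9)     # column of each cell: 1,1,...,1,2,2,...
ys <- rep(1:9, times = 9)    # row of each cell:    1,2,...,9,1,2,...

# the color each rectangle receives once the two values are recycled
cols <- rep_len(c("white", "darkgrey"), length(xs))

# arrange the colors as they appear on the board: one column per x position
board <- matrix(cols, nrow = 9)
board[1:3, 1:3]
```

The alternation holds only because the grid has an odd number of rows: every new column then starts on the opposite color. With an even number of rows, such as 8, the same call would produce stripes instead of a checkerboard.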

Constructing a plot

One of the strengths of base R graphics is its flexibility and potential for customization. It really shines when the user needs to follow a particular style found in an existing example or a template(4). Below are a few illustrations demonstrating how different base functions can work together to reconstruct various types of common figures from scratch.
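As one possible illustration, here is a minimal sketch of a bar chart assembled from rect(), axis(), and text() rather than barplot(). The data, the bar width, and the styling are all assumptions made for the sake of the example.

```r
counts <- c(a = 3, b = 7, c = 5)    # hypothetical data
n <- length(counts)

plot.new()
plot.window(xlim = c(0.5, n + 0.5), ylim = c(0, max(counts)))

# one rectangle per bar, centered on 1, 2, 3, ... and 0.8 units wide
rect(xleft = seq_len(n) - 0.4, ybottom = 0,
     xright = seq_len(n) + 0.4, ytop = counts,
     col = "grey80", border = NA)

axis(1, at = seq_len(n), labels = names(counts), lwd = 0)     # category labels, no axis line
axis(2, las = 1)                                              # horizontal tick labels
text(seq_len(n), counts, labels = counts, pos = 3, xpd = NA)  # value above each bar
title(main = "counts per group", adj = 0)

usr <- par("usr")    # plot region limits in user coordinates
```

Every visual decision here (bar width, label placement, which axis lines to keep) sits on its own line, so matching an existing style is a matter of editing individual calls.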

Summary

The R base plotting system has several polished and easy-to-use wrappers that are sometimes convenient but, in the long run, only confuse and hide things. As a result, most R users are never properly introduced to the real functions behind the base plotting paradigm and are left confused by many of its perceived idiosyncrasies. However, if inspected properly, base plotting can become powerful, flexible, and intuitive. Under the hood of all the wrappers, the heavy lifting is done by a small set of simple functions that work in tandem with one another. Often a few lines of code are all it takes to produce an elegant and customized figure.


  1. “Why I don't use ggplot2” by Jeff Leek
  2. “Why I use ggplot2” by David Robinson
  3. “Graphics parameter mnemonics” by Paul Murrell
  4. “Reproducing the style of a histogram plot in R” on stackoverflow.com
  5. Answer about plotting the legend in the margins with base R on stackoverflow.com
  6. “corrplot” package on CRAN