FAQ: How to order the (factor) variables in ggplot2



When you make a bar plot of a categorical (i.e., factor) variable, you will probably want to control the order of its levels.


The obvious approach is to build the data.frame with the rows in the order you want. This does not work, however, because the levels are sorted alphabetically. This is not specific to ggplot2; it is the general behavior of factor variables in R.

ggplot2 uses the order of a factor's levels to determine the order of the categories.
In R, there are several ways to specify that order.
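
As a quick check that the alphabetical ordering comes from factor() itself and not from ggplot2, here is a small base R sketch that prints the levels produced by three common approaches. The names teams, wins, f_default, f_appearance and f_by_wins are made up for illustration; the values are the same as in the sample data used in this post.

```r
# Illustrative vectors (hypothetical names, values taken from the sample data)
teams <- c("Cowboys", "Giants", "Eagles", "Redskins")
wins  <- c(20, 13, 9, 12)

# Default: the levels are the sorted unique values
f_default <- factor(teams)
levels(f_default)     # "Cowboys" "Eagles" "Giants" "Redskins"

# Explicit levels: keep the order of first appearance
f_appearance <- factor(teams, levels = unique(teams))
levels(f_appearance)  # "Cowboys" "Giants" "Eagles" "Redskins"

# reorder(): sort the levels by another variable (ascending)
f_by_wins <- reorder(teams, wins)
levels(f_by_wins)     # "Eagles" "Redskins" "Giants" "Cowboys"
```

No plotting is involved here, so whatever order levels() reports is the order ggplot2 will later pick up.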

library(ggplot2)

# sample data
d <- data.frame(Team1=c("Cowboys", "Giants", "Eagles", "Redskins"), Win=c(20, 13, 9, 12))

# basic layer and options
# theme() and element_text() replace the defunct opts() and theme_text()
p <- ggplot(d, aes(y=Win)) + theme(axis.text.x=element_text(angle=90, hjust=1))

# default plot (left panel)
# the levels are ordered alphabetically
p + geom_bar(aes(x=Team1), stat="identity") + ggtitle("Default")

# re-order the levels in the order of appearance in the data.frame
d$Team2 <- factor(d$Team1, levels=as.character(d$Team1))
# same as
# d$Team2 <- factor(d$Team1, levels=c("Cowboys", "Giants", "Eagles", "Redskins"))

# plot on the re-ordered variable (Team2) (middle panel)
# data=d is needed because p was built before Team2 was added
p + geom_bar(aes(x=Team2), data=d, stat="identity") + ggtitle("Order by manual")

# re-order by the variable Win
# the levels are re-ordered by increasing number of wins
d$Team3 <- reorder(d$Team1, d$Win)

# plot on the re-ordered variable (Team3) (right panel)
p + geom_bar(aes(x=Team3), data=d, stat="identity") + ggtitle("Order by variable")
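
If the bars should run from most wins to fewest instead, one option is to negate the sorting variable inside reorder(). The following is only a sketch on a fresh copy of the sample data; the names d2 and TeamDesc are made up here.

```r
library(ggplot2)

# Fresh copy of the sample data (d2 is a hypothetical name)
d2 <- data.frame(Team1 = c("Cowboys", "Giants", "Eagles", "Redskins"),
                 Win = c(20, 13, 9, 12))

# Negating Win sorts the levels from most wins to fewest
d2$TeamDesc <- reorder(d2$Team1, -d2$Win)
levels(d2$TeamDesc)  # "Cowboys" "Giants" "Redskins" "Eagles"

# Bars now appear in descending order of Win
ggplot(d2, aes(x = TeamDesc, y = Win)) +
  geom_bar(stat = "identity") +
  ggtitle("Order by variable (descending)")
```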

Note that re-ordering the factor levels is also useful for specifying the order of facets and so on.
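
To illustrate the facet case, here is a minimal sketch that draws one panel per team and lets the level order decide the panel order. It assumes a current ggplot2 with facet_wrap(); the names d3, TeamByWin and p_facet are made up for this example, and the empty string mapped to x is just a placeholder so that each panel holds a single bar.

```r
library(ggplot2)

# Fresh copy of the sample data (d3 is a hypothetical name)
d3 <- data.frame(Team1 = c("Cowboys", "Giants", "Eagles", "Redskins"),
                 Win = c(20, 13, 9, 12))

# Panels follow the level order, so re-order the levels by Win first
d3$TeamByWin <- reorder(d3$Team1, d3$Win)

# One panel per team, laid out left to right in level order
p_facet <- ggplot(d3, aes(x = "", y = Win)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ TeamByWin, nrow = 1) +
  ggtitle("Facets ordered by Win")
p_facet
```

Faceting on Team1 instead of TeamByWin would put the panels back in alphabetical order.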

31 thoughts on “FAQ: How to order the (factor) variables in ggplot2”

  1. You made my day! I was looking for the same topic on a boxplot, and it works the same way. Very well explained. Thank you so much, Sascha


  4. Sorry that last comment got F’ed up for some reason. Here’s how to do it:

    > iris$Species2 ggplot(iris, aes(Sepal.Length, Sepal.Width, color = factor(Species2))) + geom_point(size = 3) + geom_line() + theme(legend.position = c(0.84, 0.91), legend.background = element_rect(colour = "black")) + scale_colour_manual(values = c("red", "blue", "green"))

  5. This is awesome! 2nd generation question: is it possible to reorder based on a subset of data?
    You would need three variables
    1 Success – we have an outcome we want to order — success
    2 Task – then we have a bunch of tasks —- tasks
    * we could order the tasks based on success, but…
    3 time – the tasks happened at time 1 and time 2, and I want to order it based on task success on time 1.

    I dunno. I might just do this manually. Thanks though!

  6. Hello, the second line throws this error for me.
    I guess the syntax changed with an update. Does anyone have a solution? Thanks.

    > p <- ggplot(d, aes(y=Win)) + opts(axis.text.x=theme_text(angle=90, hjust=1))
    Error: 'opts' is deprecated. Use 'theme' instead. (Defunct; last used in version 0.9.1)

  7. Really appreciated this post. I had this problem last night with the labels I was using appearing on different values unexpectedly. I’m glad other people have experienced the same problem. It just took a simple reorder command to fix it. Thanks!
