Associated Material
Zoom notes: Zoom Notes 05 - Communicate
Readings:
Introducing RMarkdown
So far in this course we’ve been using Rscripts for analysis,
which let us save and run our R code, including comments about what
we’re doing along the way. We’re now going to introduce RMarkdown
documents - which are like Rscripts on steroids!
RMarkdown is a framework that enables the creation of reproducible
documents which are a combination of text, R code, and the evaluated
output from the code all embedded in a single document. Not only that,
but from a single RMarkdown source document, multiple different output
formats can be produced such as HTML, PDF, and Word docs.
In fact this entire course has been written using RMarkdown! At the
top right of each page is a Code button that will let you
download the RMarkdown code that created the page.
Below is an example of an RMarkdown source document:
---
title: "Abridged Gapminder Analysis"
date: 2022-04-13
output: html_document
---
```{r setup, include = FALSE}
library(tidyverse)
```
# Introduction
Load in the Gapminder dataset so that it is ready for analysis
```{r read.csv}
# Save an imported data frame into a named variable
gapminder_data <- read_csv("gapminder_data_2007.csv")
```
There are `r nrow(gapminder_data)` rows to the dataset.
## Visualise Life Expectancy
This is a histogram of the life expectancy.
```{r hist}
# Histogram of life expectancy values from gapminder
gapminder_data %>%
  ggplot(aes(x = lifeExp)) +
  geom_histogram()
```
There are three main components to this document:
- The YAML header, which is surrounded by lines of three dashes (---) and provides information for the compiling process
- R code chunks, which are surrounded by lines of three backticks (```)
- Text, which can be formatted using the Markdown language.
A reference guide of RMarkdown syntax can be found through
Help -> Cheat Sheets -> R Markdown Reference Guide in the RStudio menu.
Example RMarkdown
Before we delve into explaining each part of the RMarkdown file we’re
going to create our own from the included template that comes with
RStudio.
To do this, go to File -> New File -> R Markdown. You’ll then be
presented with a dialog window for the new document.
Take the opportunity to fill in your name and title then click OK.
You should now have a document that looks like the following:
---
title: "My First Rmd"
author: "Murray"
date: '2022-04-13'
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r cars}
summary(cars)
```
## Including Plots
You can also embed plots, for example:
```{r pressure, echo=FALSE}
plot(pressure)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
RStudio Visual Editor
From RStudio v1.4 there has been the inclusion of a live-preview
visual editor that can be turned on, which provides a graphical
point-and-click method of editing Markdown.
Documentation for how to use the editor and its functionality can be
found at https://rstudio.github.io/visual-markdown-editing/
Knitting
In order to get our output document, we need to do a compiling step
or knit the document - behind the scenes the text
portions are formatted based on the markdown syntax, the R code is run
and the results generated, and then the formatted text, code, and
results are “knitted” together as a single output.
One of the key benefits for reproducibility is that an RMarkdown
document is evaluated from top to bottom, separately from your current
R session. It therefore needs to be self-contained and include every
command, from reading your data in and processing it through to making
your awesome tables and plots like in the Visualisation Module.
To knit the document look for the Knit button in the top
left of the “source” panel. The keyboard shortcut is Ctrl +
Shift + K on PC or Cmd + Shift + K on macOS.
You will then be prompted to save this script; call it
“r_markdown_example.Rmd” and save it in your scripts/
directory within your project directory. Once you have knitted, a window
should pop up containing your brand new analysis document!
RMarkdown scripts generally have the file extension .Rmd.
Take a few minutes to read from top to bottom through your script and
identify the same features in your outputted HTML document.
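Knitting can also be triggered from the R console rather than with the Knit button. The sketch below assumes the rmarkdown package and Pandoc are available (both come with a typical RStudio setup). The document contents and the temporary file path are made up for illustration, and the fence is built with strrep() only to avoid writing literal backticks inside the example.

```r
library(rmarkdown)

# Three backticks, built programmatically so this example stays readable
fence <- strrep("`", 3)

# Write a tiny, hypothetical R Markdown document to a temporary file
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  'title: "Render demo"',
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_path)

# render() knits the file and returns the path of the output document
html_path <- render(rmd_path, quiet = TRUE)

# The same source can be rendered to another format, e.g. a Word document
docx_path <- render(rmd_path, output_format = "word_document", quiet = TRUE)
```

Here html_path and docx_path point to the generated files, which sit next to the temporary .Rmd file.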
Markdown syntax
Markdown is a simplified language that uses symbols to encode
formatting of text in a compiled document. Markdown documents can be
converted to HTML or LaTeX (used for PDF) through Pandoc (which comes
bundled with RStudio).
Headings
Headings use # for the largest heading (level 1) through
to ###### for the smallest heading (level 6), and correspond to the h1 to h6
heading tags in HTML.
# Level 1 heading
## level 2 heading
### level 3 heading
#### level 4 heading
##### level 5 heading
###### level 6 heading
We’ll cover some more of the common text formatting now, where you’ll
see the rendered paragraph followed by the markdown syntax that was used
to generate it:
Bold/Italics
Italics is encoded by surrounding word(s) with a single
asterisk (*) or underscore (_), bold uses double
asterisks ** or underscores __. To superscript something,
surround it with carets (^), and to subscript surround it
with tilde (~). Surrounding with double tildes will
strikethrough.
*Italics* is encoded by surrounding word(s) with a single asterisk (\*) or underscore (_), **bold** uses double asterisks ** or underscores __. To ^super^script something, surround it with carets (^), and to ~sub~script surround it with tilde (~). Surrounding with double tildes will ~~strikethrough~~.
Lists
Unordered lists can be made by starting a line with either a dash (-)
or an asterisk (*) and if you want to nest items use a tab or two spaces
to indent per layer.
- item 1
- item 2
- item 3
- item 4
- item 1
- item 2
- item 3
  - subitem 1
  - subitem 2
    - sub sub item 1
- item 4
Ordered lists start the line with a number followed by a full stop. It
is possible to nest unordered and ordered lists within the same list.
- item 1
- item 2
- item 3
1. item 1
2. item 2
3. item 3
Block quotes
Block quotes are a way of including blocks of text from someone else.
To use these begin the line with a > angle bracket
> block quotes are a way of including blocks of text from someone else. To use these begin the line with a > angle bracket
Links
Links can be done as either the full url e.g. https://www.google.com, or
you can link words by surrounding them with [] followed immediately by
the url in parentheses.
Links can be done as either the full url e.g. https://www.google.com, or you can [link words](https://www.google.com) by surrounding them with [] followed immediately by the url in parentheses.
Verbatim code
If you want to include code in your document, the use of verbatim blocks will stop the symbols being interpreted as markdown, and the code will be reproduced as is in the document.
These blocks are started and ended with three backticks ```
```
If you want to include code in your document as has been done to demonstrate the markdown code that generated each of the example paragraphs, the use of verbatim blocks will stop the symbols being interpreted for markdown and will be reproduced as is in the document.
These blocks are started and ended with three backticks ```
```
You can also do inline verbatim by surrounding the text with a single backtick
You can also do `inline verbatim` by surrounding the text with a single backtick
Code Chunks
Markdown provides verbatim code chunks, however where RMarkdown
really comes into its own is the ability to have the included code
evaluated and the results embedded directly below the code
that created them. While it’s called RMarkdown, you’re not
limited to R: other languages can be included and run (so long as the
underlying engines are set up).
A code chunk takes this format, similar to the verbatim code chunk,
but the first three backticks are followed by curly braces containing the
name of the language in lower case - in this case “r”
```{r}
1 + 2
```
Would produce
1 + 2
#> [1] 3
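To see which language engines knitr knows about, the registered engines can be listed from the console. This is a small sketch assuming the knitr package is installed. An engine appearing in the list does not guarantee that the language itself (for example Python) is installed on your machine.

```r
library(knitr)

# knit_engines$get() returns a named list of the registered language engines
engines <- names(knit_engines$get())

# Show the engine names alphabetically
sort(engines)

# Check whether particular engines are registered
c("python", "bash") %in% engines
```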
Working directory
The working directory or location that R is going to start looking
for specified files (e.g. a csv to read in) for an RMarkdown will
default to the location the RMarkdown file is saved. This can be a
common source of errors in compiling an RMarkdown document if your
RMarkdown is saved in a subdirectory and you don’t have your file paths
correct.
Don’t use setwd() in an RMarkdown. It will cause issues.
If you are using an RStudio project and structure as introduced in Introducing R and Rstudio you can make use of
the here package, which provides a nice way of dealing with relative file paths as
if you were navigating from the top of your project directory.
For instance given the following project setup:
my_project/
|- data/
|  \- my_csv.csv
|- docs/
|- outputs/
|- scripts/
|  \- my_rmd.Rmd
\- my_project.Rproj
If we were working on the file my_rmd.Rmd without the use of here
we would need to use relative paths from scripts/
(we want to use relative paths within our project
because they aren’t dependent on any particular computer, making our
project transferable) and the command to read data in would look like
this:
my_data <- read_csv("../data/my_csv.csv")
Using here, everything is relative to the directory containing the
.Rproj file. This can be easier to think about, since paths follow
the structure of the project rather than being relative
to where the file you’re currently working on lives - here
works all that out for you:
library(here)
my_data <- read_csv(here("data/my_csv.csv"))
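As a further sketch (assuming the here package is installed and that R is started inside a project laid out like the one above), path components can also be passed to here() as separate arguments, which avoids typing path separators by hand:

```r
library(here)

# Path of the project root that here has detected
here()

# Build the path from separate components;
# this gives the same result as here("data/my_csv.csv")
csv_path <- here("data", "my_csv.csv")
csv_path
```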
Code Chunk Options
The behaviour of the code chunks can be modified with options. These
options are provided inside the {}’s of the code chunk and are comma
separated.
The defaults for a chunk are:
```{r, eval=TRUE, echo=TRUE, message=TRUE, include=TRUE, warning=TRUE}
1 + 2
```
- echo=TRUE will “echo” the code that is run above the results
- eval=TRUE means the code inside the chunk will be evaluated (run)
- include=TRUE means the code and the results will be included in the document
- warning=TRUE will include any warnings as output in the document
- message=TRUE will include messages as output in the document
These can individually be specified and set to FALSE to
disable the specific behaviour.
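Rather than repeating options on every chunk, defaults for the whole document can be set once, as the template's setup chunk does with knitr::opts_chunk$set(). The sketch below assumes the knitr package is installed; the particular options chosen are only an example.

```r
library(knitr)

# Usually placed in the setup chunk at the top of the document:
# hide code, messages, and warnings in every chunk unless overridden
opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)

# Inspect the current default for a single option
opts_chunk$get("echo")
```

An individual chunk can still override a default by setting the option inside its own curly braces, for example echo=TRUE.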
Citations
Citations can be inserted into an RMarkdown document. This
document from RStudio goes through how to do it using either Markdown
or the visual editor, which can be linked with a citation manager
such as Zotero, or by searching DOIs and more.
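As one small illustration on the R side, knitr can write BibTeX entries for installed R packages. These can then be cited with Pandoc's [@key] syntax once the file is listed under bibliography in the YAML header. This is a sketch assuming the knitr and rmarkdown packages are installed; the temporary file path is arbitrary.

```r
library(knitr)

# Write BibTeX entries for two packages to a (temporary) .bib file
bib_path <- tempfile(fileext = ".bib")
write_bib(c("knitr", "rmarkdown"), file = bib_path)

# Entries are keyed with an "R-" prefix by default, e.g. R-knitr,
# so the package could be cited in the text as [@R-knitr]
bib_lines <- readLines(bib_path)
head(bib_lines)
```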
Quarto
Pre-requisite: In order to use Quarto you will need to install the
Quarto program, which RStudio can then use to compile Quarto documents. See https://quarto.org/docs/get-started/
RMarkdown is an extremely useful format for creating reproducible
reports. However, some key features are missing
(without additional packages and tweaking) that you will find you need
if you want to use it for making documents like theses or manuscripts.
The easiest to point to is cross-referencing figures and tables in
your text (the packages bookdown and thesisdown do add
this functionality to RMarkdown).
Quarto is the next iteration of RMarkdown, and includes much of the
functionality that the extra packages added to RMarkdown
right from the get-go. Not only that, but Quarto
has been designed from the start to have multi-language support, so if
you find yourself working in another language such as Python, then this
same document publishing system is still available to you.
By and large, Quarto and RMarkdown are extremely similar - they both
share the three main components:
- YAML header
- Markdown blocks
- Code chunks
Where they differ is there are some slight syntax changes, largely in
the YAML header and how options are given to code chunks.
The first difference is that instead of being saved with a
.Rmd file extension, a Quarto document has the extension
.qmd. And instead of the Knit button, it’s
called Render.
Code chunks
The code part of a code chunk is exactly the same in Quarto as
in RMarkdown. Where they differ is how the chunk options are provided.
Instead of being placed within the curly braces, they can be listed
inside the block at the top, each following a #| and using the
key: value syntax like in the YAML header, instead of using =.
```{r}
#| eval: false
1 + 2
```
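Several options can be stacked this way, one per line. The sketch below assumes Quarto's label and fig-cap chunk options: giving a figure chunk a label that starts with fig- lets it be cross-referenced in the text as @fig-pressure. The label and caption text are made up for illustration.

```{r}
#| label: fig-pressure
#| echo: false
#| fig-cap: "Vapour pressure of mercury as a function of temperature"
plot(pressure)
```

In the surrounding text, writing @fig-pressure would then produce a numbered, linked reference to this figure.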
Conclusion
This module has only scratched the surface of what is possible with
the highly versatile format that is RMarkdown. The main benefit of
RMarkdown is that it provides a mechanism to create reproducible
analysis documents that include prose, code, and generated outputs.
Make sure to check out RMarkdown - the definitive
guide for a comprehensive introduction and guide to the
possibilities of RMarkdown. There are also packages for creating
multi-document RMarkdown outputs such as entire websites
(pkgdown, distill), blogs (blogdown), and books (bookdown).
