LaTeX is a document preparation system for high-quality typesetting (http://www.latex-project.org) that’s freely available for Windows, Mac, and Linux platforms. An author creates a text document that includes markup code for formatting the content. The document is then processed through a LaTeX compiler, producing a finished document in PDF, PostScript, or DVI format.
The Sweave package allows you to embed R code and output (including graphs) within the LaTeX document. This is a multistep process:
1 A special document called a noweb file (typically with the extension .Rnw) is created using any text editor. The file contains the written content, LaTeX markup code, and R code chunks. Each R code chunk starts with the delimiter <<>>= and ends with the delimiter @.
2 The Sweave() function processes the noweb file and generates a LaTeX file. During this step, the R code chunks are processed, and depending on options, replaced with LaTeX-formatted R code and output. This step can be accomplished from within R or from the command line.
Within R, the format is
Sweave("infile.Rnw")
By default, Sweave("example.Rnw") would input the file example.Rnw from the current working directory and output the file example.tex to the same directory.
Alternatively, you can use
Sweave("infile.Rnw", syntax="SweaveSyntaxNoweb")
Specifying this syntax option can help avoid some common parsing errors, as well as conflicts with the R2HTML package.
Execution from the command line will depend on the operating system. For example, on a Linux system, this might look like
$ R CMD Sweave infile.Rnw
3 The LaTeX file is then run through a LaTeX compiler, creating a PDF, PostScript, or DVI file. Popular LaTeX distributions, which bundle the compiler, include TeX Live for Linux, MacTeX for Mac, and proTeXt for Windows.
The complete process is outlined in figure D.1.
[Figure: example.Rnw (text file with LaTeX markup and R code chunks) -> run through Sweave() function in R -> example.tex (LaTeX (TeX) file) -> run through LaTeX compiler -> example.pdf (PDF file), example.ps (PostScript file), or example.dvi (DVI file)]
Figure D.1 Process for generating a publication-quality report using Sweave
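The steps above can be sketched from within R as follows. This is a minimal sketch rather than the book's own example: the file name minimal.Rnw, the use of a temporary directory, and the chunk contents are assumptions made for illustration. The compile step is left as a comment because it requires a LaTeX installation; tools::texi2pdf() is one way to run it from R.

```r
# Step 1: create a small noweb file (the name minimal.Rnw is illustrative).
work_dir <- tempdir()
rnw_file <- file.path(work_dir, "minimal.Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mean of 1, 2, and 3 is computed below.",
  "<<echo=TRUE>>=",
  "mean(c(1, 2, 3))",
  "@",
  "\\end{document}"
), rnw_file)

# Step 2: Sweave() writes its output to the current working directory,
# so switch to work_dir for the duration of the call.
old_dir <- setwd(work_dir)
Sweave("minimal.Rnw", quiet = TRUE)
setwd(old_dir)

tex_file <- file.path(work_dir, "minimal.tex")
tex_lines <- readLines(tex_file)

# Step 3 (needs a LaTeX installation, so it is not run here):
# tools::texi2pdf(tex_file)
```

After step 2, minimal.tex contains the original LaTeX markup with the code chunk replaced by Sweave's Schunk environment.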
As indicated earlier, each chunk of R code is surrounded by <<>>= and @. You can add options to each <<>>= delimiter in order to control the processing of the corresponding R code chunk. For example
<<echo=TRUE, results=hide>>=
summary(lm(Y~X, data=mydata))
@
would output the code, but not the results, whereas
<<echo=FALSE, fig=TRUE>>=
plot(A)
@
wouldn’t print the code but would include the graph in the output. Common delimiter options are described in table D.1.
Table D.1 Common options for R code chunks
Option Description
echo Include the code in the output (echo=TRUE) or not (echo=FALSE). The default is TRUE.
eval Use eval=FALSE to keep the code from being evaluated/executed. The default is TRUE.
fig Use fig=TRUE when the output is a graph. The default is FALSE.
results Include R code output (results=verbatim), suppress the output (results=hide), or include the output and assume that it contains LaTeX markup (results=tex). The default is verbatim. Use results=tex when the output is generated by the xtable() function in the xtable package or the latex() function in the Hmisc package.
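The effect of the echo and results options can be checked by weaving a small file and inspecting the generated LaTeX. The sketch below is illustrative only: the file name options.Rnw and the chunk contents are assumptions, and the grepl() checks simply look for the code or its printed output in the .tex file.

```r
# Two chunks with opposite settings (file name and contents are illustrative).
work_dir <- tempdir()
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<echo=TRUE, results=hide>>=",
  "answer <- 6 * 7",
  "answer",
  "@",
  "<<echo=FALSE, results=verbatim>>=",
  "100 + 23",
  "@",
  "\\end{document}"
), file.path(work_dir, "options.Rnw"))

old_dir <- setwd(work_dir)
Sweave("options.Rnw", quiet = TRUE)
setwd(old_dir)
tex <- readLines(file.path(work_dir, "options.tex"))

# First chunk: code is echoed, but its printed result is suppressed.
first_code_shown   <- any(grepl("answer <- 6 * 7", tex, fixed = TRUE))
first_output_shown <- any(grepl("[1] 42", tex, fixed = TRUE))

# Second chunk: code is hidden, but its printed result is included.
second_code_shown   <- any(grepl("100 + 23", tex, fixed = TRUE))
second_output_shown <- any(grepl("[1] 123", tex, fixed = TRUE))
```

Under these assumptions, first_code_shown and second_output_shown should be TRUE, and the other two FALSE.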
By default, Sweave will add LaTeX markup code to attractively format data frames, matrices, and vectors. Additionally, R objects can be embedded inline using a \Sexpr{} statement. Note that lattice graphs must be embedded in a print() statement to be processed properly.
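As a small illustration of inline embedding, the sketch below weaves a sentence containing a \Sexpr{} statement and reads back the result. The file name inline.Rnw and the use of the built-in mtcars data set are assumptions chosen for the example.

```r
# A noweb file with one inline \Sexpr{} statement (names are illustrative).
work_dir <- tempdir()
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mtcars data set has \\Sexpr{nrow(mtcars)} rows.",
  "\\end{document}"
), file.path(work_dir, "inline.Rnw"))

old_dir <- setwd(work_dir)
Sweave("inline.Rnw", quiet = TRUE)
setwd(old_dir)

# Sweave replaces the \Sexpr{} statement with the value of the expression.
tex <- readLines(file.path(work_dir, "inline.tex"))
sentence <- grep("mtcars data set", tex, value = TRUE)
print(sentence)
```

The woven sentence carries the computed row count in place of the \Sexpr{} statement, so the text stays in sync with the data.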
The xtable() function in the xtable package can be used to format data frames and matrices more precisely. In addition, it can be used to format other R objects, including those produced by lm(), glm(), aov(), table(), ts(), and coxph(). Use methods(xtable) to view a comprehensive list. When formatting R output using xtable(), be sure to include the results=tex option in the code chunk delimiter.
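To see what xtable() produces, it can help to run it outside of Sweave and look at the LaTeX it prints. The following is a sketch under assumptions: the regression of mpg on wt in the built-in mtcars data set, the caption, and the label are all illustrative, and the code does nothing if the xtable package isn't installed.

```r
# Print the LaTeX that xtable() generates for a fitted linear model.
if (requireNamespace("xtable", quietly = TRUE)) {
  fit <- lm(mpg ~ wt, data = mtcars)            # illustrative model
  tab <- xtable::xtable(fit,
                        caption = "Regression of mpg on wt",
                        label = "table:fit")
  # capture.output() collects the lines that print() would write
  latex_lines <- capture.output(print(tab, caption.placement = "top"))
  cat(latex_lines, sep = "\n")
} else {
  latex_lines <- character(0)
  message("The xtable package is not installed.")
}
```

Inside a noweb file, the same print() call would go in a chunk with results=tex so that the table markup is passed through to LaTeX unchanged.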
It’s easier to see how this all works with an example. Consider the noweb file in listing D.1. This is a reworking of the one-way ANOVA example in section 8.3. LaTeX markup code begins with a backslash (\). The exception is \Sexpr{}, which is a Sweave addition. R related code is presented in bold italics.
Listing D.1 A sample noweb file (example.Rnw)
\documentclass[12pt]{article}
\title{Sample Report}
\author{Robert I. Kabacoff, Ph.D.}
\date{}
\begin{document}
\maketitle
<<echo=false, results=hide>>=
library(multcomp)
library(xtable)
attach(cholesterol)
@
\section{Results}
Cholesterol reduction was assessed in a study that randomized \Sexpr{nrow(cholesterol)} patients to one of \Sexpr{length(unique(trt))} treatments.
Summary statistics are provided in Table \ref{table:descriptives}.
<<echo = false, results = tex>>=
descTable <- data.frame("Treatment" = sort(unique(trt)),
                        "N" = as.vector(table(trt)),
                        "Mean" = tapply(response, list(trt), mean, na.rm=TRUE),
                        "SD" = tapply(response, list(trt), sd, na.rm=TRUE))
print(xtable(descTable,
             caption = "Descriptive statistics for each treatment group",
             label = "table:descriptives"),
      caption.placement = "top", include.rownames = FALSE)
@
The analysis of variance is provided in Table \ref{table:anova}.
<<echo=false, results=tex>>=
fit <- aov(response ~ trt)
print(xtable(fit, caption = "Analysis of variance", label = "table:anova"), caption.placement = "top")
@
\noindent and group differences are plotted in Figure \ref{figure:tukey}.
\begin{figure}
\begin{center}
<<fig=TRUE,echo=FALSE>>=
par(mar=c(5,4,6,2))
tuk <- glht(fit, linfct=mcp(trt="Tukey"))
plot(cld(tuk, level=.05),col="lightgrey",xlab="Treatment", ylab="Response")
box("figure")
@
\caption{Distribution of response times and pairwise comparisons.}
\label{figure:tukey}
\end{center}
\end{figure}
\end{document}
Sample Report
Robert I. Kabacoff, Ph.D.
1 Results
Cholesterol reduction was assessed in a study that randomized 50 patients to one of 5 treatments. Summary statistics are provided in Table 1.
Table 1: Descriptive statistics for each treatment group
Treatment  N   Mean   SD
1time      10   5.78  2.88
2times     10   9.22  3.48
4times     10  12.37  2.92
drugD      10  15.36  3.45
drugE      10  20.95  3.35
The analysis of variance is provided in Table 2.
Table 2: Analysis of variance
Df Sum Sq Mean Sq F value Pr(>F)
trt 4 1351.37 337.84 32.43 0.0000
Residuals 45 468.75 10.42
and group differences are plotted in Figure 1.
Figure D.2 Page 1 of the report created from the sample noweb file in listing D.1. The noweb file was processed through the Sweave() function in R and the resulting TeX file was processed through a LaTeX compiler to produce a PDF document.
After processing the noweb file through the Sweave() function in R and processing the resulting TeX file through a LaTeX compiler, the PDF document in figures D.2 and D.3 is generated.
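[Plot of Response (vertical axis, ticks at 5, 10, 15, 20, 25) by Treatment (horizontal axis: 1time, 2times, 4times, drugD, drugE), with grouping letters a through d displayed above the treatment groups]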
Figure 1: Distribution of response times and pairwise comparisons.
Figure D.3 Page 2 of the report created from the sample noweb file in listing D.1.
To learn more about Sweave, visit the Sweave home page (www.stat.uni-muenchen.de/~leisch/Sweave/). An excellent presentation is also provided by Theresa Scott (http://biostat.mc.vanderbilt.edu/TheresaScott). To learn more about LaTeX, check out the article “The Not So Short Introduction to LaTeX 2e,” available on the LaTeX home page (www.latex-project.org).