Peter Ralph
September 29, 2020
Communicate your thoughts to your collaborators and your future self,
combining text, math, documented code,
and no fuss.
No fussing about layout:
Markdown aims to be readable as-is,
but has methods to produce beautiful output.
A Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions. – John Gruber
Don’t even think about the layout, just the content.
[…] it is better to leave document design to document designers, and to let authors get on with writing documents – intro to LaTeX
Today’s goal:
Paragraphs begin and end with empty lines, and are not indented.
inline_code($fixed.width)Lists must be preceded by an empty line,
Indenting subsequent content
will continue the list
Indenting means one tab, or four spaces.
After using markdown for a bit, go read pandoc’s documentation.
Math goes between `$`, single ($\pi$), or double:
$$ \frac{\pi}{4} = \sum_{n=0}^\infty \frac{(-1)^n}{2k+1} .$$
Math goes between $, single (\(\pi\)), or double: \[\frac{\pi}{4} = \sum_{n=0}^\infty \frac{(-1)^n}{2k+1} .\]
Even math environments inside double dollar signs:
\[\begin{align} x &= (a+b)^2 - (a-b)^2 \\ &= 4ab \end{align}\]
```{.r}
msg <- "Hello, world."
print(msg)
```
produces
> I also dream about a modern replacement for LaTeX
> designed from the ground up to target multiple output formats
> (at least PDF, HTML, EPUB). -- [John MacFarlane](http://john.macfarlane.usesthis.com/)
produces
I also dream about a modern replacement for LaTeX designed from the ground up to target multiple output formats (at least PDF, HTML, EPUB). – John MacFarlane
Images are the same but with a ! in front.
Setting width and height are optional.
Directions / in Rstudio.
notes.Rmd.To compile, either:
Open R, run
and open notes.html in your web browser.
or in Rstudio, save the file (with suffix .Rmd) and click on knit.
Where is it? Use getwd() to tell you.
Remember,
[…] it is better to leave document design to document designers, and to let authors get on with writing documents.
If you must, then:
For more info see the documentation.
At the top of your document, add
… the YAML metadata,
delimited by exactly three dashes.
Note: Besides setting the title, you can control the output in many ways here.
For example: add
and render with rmarkdown::render("notes.md").
or even:
Goal: add R code to the document, along with its output.
Just add a chunk of R code, wrapped in
```{r}
# PUT ARBITRARY R CODE HERE
```
Try it!
Powers of two?
```{r}
2^(0:10)
```
How about this?
$$ \lim_{n \to \infty} 4 \sum_{k=1}^n \frac{ (-1)^n }{ 2n+1 } = \pi , $$
```{r}
cumsum( 4 * (-1)^(0:20) / (2*(0:20)+1) )
```
How about this? \[ \lim_{n \to \infty} 4 \sum_{k=0}^n \frac{ (-1)^n }{ 2n+1 } = \pi , \]
## [1] 4.000000 2.666667 3.466667 2.895238 3.339683 2.976046 3.283738 3.017072 3.252366 3.041840 3.232316 3.058403 3.218403 3.070255 3.208186 3.079153 3.200366 3.086080 3.194188 3.091624 3.189185
```{r}
plot(cumsum( 4 * (-1)^(0:20) / (2*(0:20)+1) ))
abline(h=pi, col='red')
```
Make a short Rmarkdown document that
checks that \[1 + 2 + \cdots + n = n(n+1)/2\] for every \(n\) between 1 and 100
shows these on a plot
explains what’s being computed
Useful: x = cumsum(1:100) and plot(x) and lines(y).
knitr uses a regular expression to find code chunks
pandoc renders the resulting markdown file
Name each chunk, and set options for what gets printed
```{r my_chunk_name, fig.height=4, echo=FALSE}
echo=(TRUE|FALSE)include source code in the output?
results="(markup|asis)"style the output or not?
include=(FALSE|TRUE)include anything in the output?
Set document defaults up top:
```{r, include=FALSE}
fig.dim <- 5
library(knitr)
opts_chunk$set(
fig.height=fig.dim,
fig.width=2*fig.dim,
fig.align='center'
)
```
One option: use pander.
```{r}
library(pander)
bases <- table( sample( c("A","C","G","T"), 300, replace=TRUE ) )
pander(t(bases))
```
note: the transpose t( )
| A | C | G | T |
|---|---|---|---|
| 70 | 85 | 83 | 62 |
You can
`r paste(letters[c(9,14,19,5,18,20)],collapse='')`
code anywhere.
You can insert code anywhere.
Even in the YAML header.
Go change yours!
---
title: "My notes"
author: "Peter Ralph"
date: "`r date()`"
---
Goal: Write a function that will generate all sequences of A/C/G/T of length \(n\) for which no two adjacent letters are the same.
Here is a pre-written solution.
Download the iris dataset to a new directory.
or just do
Read in the data.
Describe the dataset: number of observations, variables, etcetera.
R code (`r nrow(iris)`)Make a table of the number of observations for each species.
pander()results="asis" and print.xtable(xtable( ),type='html')Plot the flower dimensions against each other,
pairs(), and colored by species.Set up some fake data: each has 50 observations of two quantitative variables (age and height) and a categorical variable (type):
dir.create("examples/thedata")
owd <- setwd("examples/thedata")
for (samp in LETTERS[1:8]) {
dir.create(samp)
xy <- data.frame(
age=exp(rnorm(50)),
type=sample(letters[1:3],50,replace=TRUE)
)
xy$height <- 5 + runif(1)*xy$age + 3*runif(1)*as.numeric(xy$type) + rnorm(50)
write.table(xy,file=paste0(samp,"/data.tsv"))
}
setwd(owd)We now have 10 datasets, each in a file like A/data.tsv. Here’s what one looks like:
## age type height
## 1 1.4796488 c 9.361211
## 2 1.0046351 b 7.809438
## 3 0.8169789 c 8.509819
## 4 0.1572326 a 5.702507
## 5 0.2543326 a 6.924387
## 6 8.1443866 b 13.790307
## 7 0.8223393 a 7.054007
## 8 6.1442730 b 13.426498
## 9 3.0191242 b 9.207488
## 10 0.7728761 c 7.260055
## 11 2.7337382 a 8.775505
## 12 4.9572850 b 10.785370
## 13 1.1893305 b 7.997365
## 14 7.4601762 a 11.033910
## 15 0.2412250 b 5.188430
## 16 1.7532513 c 9.365491
## 17 0.5922570 c 7.953115
## 18 1.1926213 b 10.210615
## 19 1.0236192 b 6.526443
## 20 5.1199085 b 8.518106
## 21 0.9541798 a 5.253971
## 22 0.3123665 b 5.506721
## 23 1.0759528 a 5.045323
## 24 1.4650504 a 6.649536
## 25 1.6650421 a 7.217283
## 26 1.1198742 b 6.362583
## 27 1.2415980 b 7.193038
## 28 2.4319417 b 8.180893
## 29 0.5862522 b 6.385537
## 30 0.8680723 b 7.183052
## 31 0.6600719 b 6.006921
## 32 0.7879503 b 6.422339
## 33 2.5323571 c 9.426129
## 34 0.2440238 c 7.841699
## 35 0.7925729 a 5.619829
## 36 3.0982004 c 9.460839
## 37 1.2647280 b 6.505928
## 38 2.3627868 a 8.699527
## 39 0.4884444 a 5.547974
## 40 0.4585299 a 6.704829
## 41 0.3537712 c 7.097977
## 42 3.4127451 b 8.615107
## 43 0.7935832 a 6.186167
## 44 0.5354499 c 6.376454
## 45 1.0126319 c 6.952713
## 46 0.3659778 c 6.663840
## 47 0.9081240 c 8.437199
## 48 0.4860253 b 6.594441
## 49 0.3530557 a 7.768926
## 50 0.5318298 a 5.175261
We would like to visualize each, like this:
The template: examples/simple-template.Rmd
---
title: "Visualization for `r getwd()`"
date: "`r date()`"
---
```{r setup, echo=FALSE}
input.file <- "data.tsv"
xy <- read.table(input.file)
```
The file `r normalizePath(input.file)`
has `r nrow(xy)` observations:
```{r}
plot( height ~ age, col=type, data=xy )
legend( "topleft", pch=1, col=1:nlevels(xy$type) )
```
Input: this looks for the file data.tsv in the current directory.
Render it:
Option 1: copy the template into each of the ten directories, and render them there.
Option 2: use my templater package.
library(devtools)
install_github("petrelharp/templater")
library(templater)
dir.names <- file.path("examples/thedata", LETTERS[1:8])
for (input.dir in dir.names) {
output.file <- file.path(input.dir, "visualization.html")
render_template("examples/simple-template.Rmd", output=output.file,
change.rootdir=TRUE, quiet=TRUE)
}Look at them:
```{r make_links, results="asis"}
output.files <- file.path("examples/thedata", LETTERS[1:8], "visualization.html")
links <- paste("[",dir.names,"](",output.files,")",sep='')
cat( paste("- ", links, "\n"), "\n" )
```
Goal: Compare different \(k\) with \(k\)-means on the iris dataset.
iris/k, for \(1 \le k \le 5\),kmeans with the appropriate k.Example: