R in Eighty Days

How much R can I learn from scratch in eleven-and-a-half weeks?


Day 21: Wednesday 4th September 2019

So, having completed the R for the Rest of Us Getting Started with R online course, it's now time to try the next online course. This is Fundamentals of R. This looks, judging by the number of headings and subheadings on the contents listing, as if it's going to be a longer course than the first one, and it begins by introducing me to something called R Markdown.

Before I get into the actual tutorial, it's worth saying that this is a classic example of the type of problem you encounter when you are learning something that's completely new. This thing I'm about to learn about, so far all I know is its name. I don't know what it is. I don't know what it does. I don't know if it will be useful to me. I don't know what category to put it in. David Keyes has said so far that R Markdown is a 'tool' and that it can be great for workflow, but I still don't know what that means.

So anyway, I'm watching the first video in the tutorial, and David says there are three elements to R Markdown: YAML, code chunks and text. And it looks as if these three elements are somehow 'visible' within the RStudio interface but at this stage, it's quite difficult to ascertain...

Ah, right, OK, so I just watched that first video and now I can see what it is, and it is exciting! It is a tool that allows you to blend together text and data into a report or a document. This is a tool that Edward Tufte would like! Data/text integration is one of his Grand Principles of Information Design.

And now I'm reading the blogpost-type article that David has written about R Markdown and it is making things a lot clearer. Yes, it's this integration of data and text, a subject so close to my heart! It also seems to contain some echoes of the way CSS and HTML interact with one another, but the main thing appears to be that you've got everything you need for an actual report or document all in one place, with all the code there as well, so you have the audit trail and everything.

So it's all totally great, but...

...in that blog, he says: "There's a Learning Curve. Learning R and R Markdown does not happen overnight. It takes time to learn how to use these tools. Throw in git/GitHub and the process is even longer. Know that moving to this workflow is good in the long term, but likely to be a bit painful in the short term."

Painful in the short term. Hmmmm. Don't much like the sound of that! Anyway—undeterred—onwards and upwards...

Back to the tutorial, and I like the way David does these screenshots where he colour codes sections of the screen so that you can see which bit is which. So here he has colour-coded the YAML, the text and the code chunks, and this does make the whole thing a lot clearer, and now I'm finding out about YAML. YAML seems pretty straightforward, it doesn't frighten me, I can see that it's just basic metadata, and I can see that we can edit it. I also like the idea that we can just output to Word or HTML or whatever, and it looks as if we might be able to output to PowerPoint, too.
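As a rough sketch of what that YAML block can look like at the top of an R Markdown file (the title, author and date below are made-up placeholders, not taken from David's course):

```yaml
---
# Basic metadata for the report; all three values are hypothetical placeholders
title: "My First R Markdown Report"
author: "Your Name"
date: "2019-09-04"
# Output format: html_document here; word_document gives a Word file,
# and powerpoint_presentation gives slides (in recent versions of rmarkdown)
output: html_document
---
```

Changing the output line and knitting again should be all it takes to switch formats, assuming the software needed for that format is installed.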

After YAML, it's now time to look at the text part of R Markdown. Text looks as if it's basically a crude word processor: something in between the version of WordStar I remember from circa 1987 and HTML referring to a CSS stylesheet for how to interpret different content elements like headers, bold, italics and bullet points.

Third, code chunks. This is what code chunks look like (and I actually created this code chunk myself! You have absolutely no idea how pleased that has made me!):

```{r cars_skimmed, echo=TRUE}
library(skimr)
skim(cars)
```

And yes, before you ask, I do not know what those three weird characters are before the curly brackets, except I think David referred to them as 'back-ticks', so now I just have to locate them on my laptop keyboard, oh right, I think I see them, just to the left of the number 1 key on my laptop!

But anyway, I like the way we can integrate text and code chunks, and we have the chunk options echo = FALSE and include = FALSE if we want to suppress stuff.
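Here is a small sketch of how suppressing a chunk might look. The chunk name cars_setup and the variable average_speed are my own inventions for illustration, and it uses only the built-in cars dataset, so no extra packages are assumed. As I understand it, echo = FALSE hides the code but still shows its output, while include = FALSE goes further and hides both:

```{r cars_setup, include=FALSE}
# include=FALSE: this chunk still runs when the document is knitted,
# but neither the code nor its output appears in the finished report.
# average_speed is a hypothetical name, stored for use later in the document.
average_speed <- mean(cars$speed)
```

The value is still available afterwards, so a sentence in the text part could pull it in with inline code such as `r round(average_speed, 1)`.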
