Go to http://rstudio.calvin.edu and enter your user name and password to log in. If you need help, see the RStudio help on our "from class" page.
You can do things in R by typing commands in the Console panel.
As a simple example, you can use the console as a calculator. Type
2 + 5
in the console and then hit enter. Or try something more advanced like
log(2) + 5^2 - 10
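A quick aside on that second calculation, as a small sketch: in R, log() is the natural logarithm (base e). The lines below use only base R functions and show how to ask for other bases. Nothing here is required for the class.

```r
log(2) + 5^2 - 10   # log() is the natural log, so this is about 15.69
log10(100)          # base-10 logarithm
log(8, base = 2)    # logarithm with any base you choose
exp(1)              # e, the base of the natural logarithm
```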
Working that way makes it hard to keep a record of your work (and hard to redo things if anything changes or if a mistake was made). For this class, you will mostly want to work in R markdown files, which can contain text, R code, and R output (such as figures). We'll get to that in a little bit.
You can save things in R using an assignment statement. The assignment statement can use either <- (the traditional form in R) or = (which may be more familiar to you).
x <- 2 + 8    # save this computation with the name x
y = log(10)   # save this computation with the name y
x # show me what is in x
## [1] 10
x + y # do a computation with x and y
## [1] 12.30259
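Two details about assignment tend to trip people up, so here is a minimal sketch (the names a and b are only for illustration): nothing is printed when you assign, and assigning to a name that already exists silently replaces the old value.

```r
a <- 3 * 4        # stored, but nothing is printed
b = 3 * 4         # same effect using =
identical(a, b)   # TRUE: both names hold the same value
a <- a + 1        # reusing a name replaces its old value
a                 # now 13
```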
Many data sets and functions are stored in packages. These must be loaded before you can use them. There are a couple of ways to do this.
1. In the Packages pane, find the package you want and check the box next to its name.
2. Use the library() command. For example, to load the mosaic package, use
library(mosaic)
You will notice that if you use method 1, the code for method 2 will appear in the console, just as if you had typed it yourself.
Important packages for this class: mosaic and openintro (data sets for IMS).
The mosaic package will automatically load mosaicData (some data sets) and ggformula (our plotting system).
It's good to know how to get help. Here are three ways.
Search in the HELP pane.
Look in the HELP menu.
To get help with a function or data set, use ?
?Gestation
Give some of these a try to find out about the Gestation data set.
library(mosaic)     # make sure the package is loaded
names(Gestation)    # names of the variables
dim(Gestation)      # number of rows (cases) and columns (variables)
head(Gestation)     # first few cases
glimpse(Gestation)  # first few entries for each variable
?Gestation          # long form documentation with details about the data
In RStudio, navigate to File -> New File -> R Markdown..., or click on the white rectangle with a green + and select R Markdown from the drop-down menu.
Choose "From Template" and select the "ggformula fancy" template. (For other things you do in this class, you will probably want to switch to the simpler "Stat 145" template.)
You will see that there are a number of other templates here as well. Mostly, you won't need them, but feel free to explore.
These slides were created in RMarkdown using the xaringan template.
There are other slide templates under Presentation.
Save your file by clicking on Save in the File menu or clicking on the disk icon in the File pane (maybe give it a useful file name like FirstRmarkdownTry.Rmd).
The file will be saved to the server, not to your computer.
All your files will be accessible in the RStudio Files pane whenever you log into RStudio, regardless of which computer you are using.
How do Rmd files actually work? What's so cool about them?
Click on the small black arrow next to the word "Knit" (and the ball of yarn icon) at the top of the file window. Select "Knit to PDF". Check out the compiled PDF result, and compare it to the original markdown file. Wow!
At the top of the markdown file, enter an appropriate title, author(s), and date (within the quotation marks). Knit again to see the effect.
---
title: "Untitled"
author: ""
date: ""
Note: It is important that you don't mess up this section of the file, or knitting won't work. In particular, be sure you don't accidentally delete any quotation marks.
The R Markdown file is a text file where you save all the R commands you want to use, plus any text commenting on the work you are doing and the results you get. Parts of the file with a plain white background are normal text.
You can format the text using special codes. Here are a couple of examples.
what you type | what you get |
---|---|
**bold** | bold |
*italics* | italics |
$x^2 - 2$ | x² − 2 |
You can also make bulleted lists, numbered lists, section headers, simple tables, web links, mathematical formulas, and more.
Check out the R Markdown cheat sheet for more.
An Rmd file can also contain one or more R code chunks. These sections of the file have a grey background on screen.
The first and last lines indicate that this is an R chunk. You can type any R commands between these two lines and the commands will be executed when you knit.
To add a code chunk to your file, you can type in the header and footer by hand to start and end the chunk. Or, you can click on the green box with the C inside (at the top of the Rmd file) to insert an empty chunk.
The first R code chunk in an Rmd file is usually used to specify settings. In this chunk, you can also give R permission to use certain packages (software toolkits) with one of the following commands:
# either of these will work
library(packagename)
require(packagename)
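Both commands load a package that is installed, but they behave differently when it is not. The sketch below uses a made-up package name (aPackageThatIsNotInstalled is hypothetical, chosen so that loading fails) to show the difference: require() returns FALSE and lets the code continue, while library() stops with an error.

```r
pkg <- "aPackageThatIsNotInstalled"   # hypothetical name, used only to force a failure

# require() returns FALSE when the package cannot be loaded
loaded <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))
loaded

# library() stops with an error instead; tryCatch() captures the message
msg <- tryCatch(
  {
    library(pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) conditionMessage(e)
)
msg
```

Because an error at the top of the file is easier to notice than a FALSE that scrolls by, library() is often the safer choice in a settings chunk.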
If you look carefully at the output, you will see that the settings chunk does not appear there. That's intentional - for homework, you don't need to show me what settings you used, although I would always like to see the R code you use to solve problems in the output.
Going back to the Rmd file, look at the header of the settings chunk. Notice that in addition to the "r" label, which is followed by an (optional) chunk name, the first line of the hidden settings chunk includes the option
include = FALSE
Another option (useful for reporting to people who don't need to see the code, only the results) is
echo = FALSE
For assignments, please leave your R code visible in the output (except for the setup chunk).
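If you would rather set this once than repeat it in every chunk header, one common approach is to set a document-wide default in the setup chunk. This is a sketch, assuming the knitr package (which R Markdown uses for knitting) is installed. Your template may already contain a line like this, and individual chunks can still override it.

```r
# put this inside the setup chunk to set a default for every chunk
knitr::opts_chunk$set(echo = TRUE)   # show R code in the knitted output

# check what the current default is
knitr::opts_chunk$get("echo")
```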
There are other chunk options, but for this class, you can probably just start from the "Stat 145" template and be in pretty good shape.
If interested, see one of these references
Every R Markdown file (Rmd file) must be completely stand-alone. It doesn't share any information with the Console or the Environment that you see in your RStudio session. All R code that you need must be included in the Rmd file itself!
For example, if you use point-and-click in the RStudio Environment pane to import a data file, that dataset will not be available for use within the Rmd file. Similarly, if you load a package by typing
library(mosaic)
in the Console window, mosaic functions and data will not be available to use within the Rmd file unless you also include this command in the RMarkdown file.
Create a new file
Use the "Stat 145" template instead of the "ggformula fancy" template. That's how you will usually want to start your homework.
Edit the top to include your name(s) and the problem set number (PS 3). Feel free to work with a partner or two. (Include all of your names.)
At the end of these slides are the instructions for what to include in PS 3.
But before we get there, a few more notes.
There are multiple ways to run and test R code from a markdown file. Sometimes you want to knit the whole file and get the PDF; other times you want to run just a specific bit of code to make sure it's working correctly.
Obviously, every time you knit the file, all R code will be run automatically.
But sometimes it is nice to run just a bit of the code to test it out, rather than knitting the whole file.
You can also use shortcuts/buttons to run specific chunk(s). Here is one way to do it (option 1): Use the Run pulldown menu at the top of the markdown file. (Choose the option you want based on what you are trying to do).
Here is another way to use shortcuts/buttons to run only a specific chunk (option 2): Click on the green arrow at the upper right of a code chunk to run the chunk.
Finally, here's a third way. Put your cursor in a code chunk and hit <CTRL>+<ENTER> or <command>+<ENTER>, depending on your operating system.
We already covered this once, but we're covering it again because it's one of the most common student mistakes in Rmd files!
If you run R code in the console or the RStudio GUI (for example, reading in a data set by pasting code into the console or using the Import Dataset button in the Environment pane), you won't be able to use the results in your markdown file. Any and all commands you need, including reading in data, need to be included in the file.
We'll cover how to load your own data later (but it's pretty easy, so feel free to experiment on your own if you like).
The best format for printing seems to be PDF, but printing directly from RStudio's pre-viewer isn't so great.
But when you knit, a PDF of the output is also saved in your Files pane. If you go to the Files pane and double click the PDF, it should open in your computer's normal PDF viewer and you can print it from there...and it will look good.
You can also choose to download (use export) the PDF (or a Word document) to your local machine and print from there.
While PDF is nice for printing, HTML can be nice for creating the file. That's because RStudio can view HTML right inside its Viewer pane -- but you have to tell it to do so.
Choose Tools > Global Options > R Markdown
Choose to show output preview in Viewer Pane.
Instructions.
Write a paragraph or two about anything you like. Include bold, italics, and a list (enumerated or bullets, your choice).
The mosaicData package includes a data set called Gestation which has information about birth in the US. For each of the following questions, create a graphical summary (i.e., a plot) using ggformula that sheds light on the question; create a numerical summary that sheds light on the question; and write a sentence or two addressing the question in light of your graphical and numerical summaries.
a. When were these babies born?
b. How many days does a pregnancy typically last?
c. Are babies born to smokers smaller than babies born to non-smokers?
d. What is the relationship between mother's height and father's height? (Do taller women marry taller men?)
The openintro package includes a data set called cherry that has some measurements of cherry trees.
a. Make scatter plots of volume vs height and volume vs diameter. Which one shows the stronger association?
b. Is the stronger association roughly linear? If yes, compute the correlation coefficient.
c. Is the weaker association roughly linear? If so, compute the correlation coefficient.
Finally, include another short paragraph of text describing your experience with RStudio so far. What made sense or was satisfying, and what was confusing or frustrating? Any problems or questions, suggestions for changes, or tips you'd provide to other students who will do an assignment like this in the future?
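As a pattern for the graphical and numerical summaries requested above, here is a sketch that uses R's built-in mtcars data rather than the assigned data sets. It assumes the mosaic package is installed (which also loads ggformula). The choice of variables is only for illustration, so you will need to pick the plot type and variables that fit each question.

```r
library(mosaic)   # assumed installed; also loads ggformula

# one quantitative variable: histogram plus summary statistics
gf_histogram(~ mpg, data = mtcars)
favstats(~ mpg, data = mtcars)

# quantitative variable compared across groups: boxplots plus group summaries
gf_boxplot(mpg ~ factor(cyl), data = mtcars)
favstats(mpg ~ cyl, data = mtcars)

# two quantitative variables: scatter plot plus correlation coefficient
gf_point(mpg ~ wt, data = mtcars)
r <- cor(mpg ~ wt, data = mtcars)
r
```

Notice that the plots and the numerical summaries share the same formula shape (y ~ x, or ~ x for a single variable) together with data = to name the data set.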