Introduction to R and GLM/GAMs workshop
by Aaron Greenville and Mathew Crowther
This two-day workshop is designed to remove the mystery behind R, pass on the best-practice tips that we have picked up on our journey with R and, lastly, get you started with GLMs/GLMMs and GAMs/GAMMs. There are many ways to use R; here we show you our workflow, which works well for us.
Please bring your own laptop or share one with a friend. This workshop is designed to be very hands-on. If you do not have access to a laptop, then please let us know and we will see if we can organise something for you.
Assumed knowledge is basic statistics.
Before the workshop you will need:
Download and install:
Optional: GLM and GAMs Workshop slides, Model Selection and PseudoR² slides.
Acknowledgements: This workshop draws on material from Software Carpentry (if you see one of their courses advertised, then do it!) and Zuur A.F. et al. (2009). Mixed effects models and extensions in ecology with R. Springer, New York; London.
Outline [pdf]
Basics
- Exercise 1: Set up R, RStudio, GitHub – installing packages, loading packages, a trick for uni proxy settings.
- Data structures – vectors, lists, matrices, data frames, factors and sub-setting (using $ to call columns, how to use data frames and vectors, selecting rows, columns or cells)
- Getting data into R and setting working directory.
- Basic data checks – head(), using RStudio, plot()
- Using and misusing attach()
- Using help
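As a taste of the Basics session, here is a minimal sketch of building a data frame, running a few quick checks and sub-setting it. The `surveys` data and its column names are invented for illustration and are not the workshop data. The last lines show `with()` as one common alternative to `attach()`.

```r
# Hypothetical survey data; the column names are invented for illustration.
surveys <- data.frame(
  site    = factor(c("A", "A", "B", "B", "C")),
  species = c("frog", "rat", "frog", "bird", "rat"),
  count   = c(12, 3, 7, 25, 0),
  stringsAsFactors = FALSE
)

# Basic data checks
head(surveys, 3)   # first three rows
str(surveys)       # column types at a glance

# Sub-setting
surveys$count          # one column as a vector, using $
surveys[2, ]           # second row, all columns
surveys[, "species"]   # one column by name
surveys[2, "count"]    # a single cell

# Rows that match a condition
frogs <- surveys[surveys$species == "frog", ]

# with() looks up columns inside the data frame for one expression only,
# so nothing is left on the search path afterwards (unlike attach()).
mean_count <- with(surveys, mean(count))
```

Typing `?head` or `help("data.frame")` at the console opens the help page for a function.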
Version control
- Why do it?
- Exercise 2: RStudio and GitHub.
Writing functions
- Exercise 3: introduction
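To give a flavour of what writing a function looks like, the sketch below defines a small helper for the standard error of the mean. The name `standard_error` and the example numbers are hypothetical and are not taken from Exercise 3.

```r
# Hypothetical helper: standard error of the mean.
# Missing values are dropped by default.
standard_error <- function(x, na.rm = TRUE) {
  if (na.rm) {
    x <- x[!is.na(x)]
  }
  sd(x) / sqrt(length(x))
}

counts <- c(4, 8, 15, 16, 23, 42, NA)
se_counts <- standard_error(counts)
se_counts
```

Wrapping a calculation in a function means it is written once and reused, which makes scripts shorter and mistakes easier to find.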
Organising a project
- Best practice for organising a project.
- Exercise 4
GLM and GLMM
- Basics of a linear model
- What is a GLM and GLMM?
- Exercise 5: Frog road kill (Poisson, quasi-Poisson and negative binomial)
- Exercise 6: GLMM with temporal confounding – Hawaii bird abundance
- Exercise 7: Binomial GLM – rats
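The sketch below shows roughly what fitting a Poisson GLM and checking for overdispersion can look like. It uses simulated counts, not the frog road kill data, and the variable names (`roadkill`, `distance`) are assumptions made for illustration.

```r
set.seed(42)

# Simulated stand-in for count data; not the workshop data set.
n <- 100
distance <- runif(n, 0, 10)   # hypothetical covariate
roadkill <- rnbinom(n, mu = exp(3 - 0.2 * distance), size = 2)
dat <- data.frame(roadkill, distance)

# Poisson GLM with the default log link
pois_fit <- glm(roadkill ~ distance, family = poisson, data = dat)

# Rough overdispersion check: Pearson chi-square / residual degrees of freedom.
# Values well above 1 suggest the Poisson variance assumption is too strict.
dispersion <- sum(residuals(pois_fit, type = "pearson")^2) / df.residual(pois_fit)
dispersion

# Quasi-Poisson keeps the same coefficients but scales the standard errors
# by the square root of the estimated dispersion.
quasi_fit <- glm(roadkill ~ distance, family = quasipoisson, data = dat)
summary(quasi_fit)$coefficients
```

A negative binomial model, for example via `glm.nb()` in the MASS package, is another common option when counts are overdispersed.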
GAM and GAMM
- What is a GAM and GAMM?
- Exercise 8: GAM – Roadkill
- Exercise 9: GAMM with spatial confounding – Roadkill
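For orientation, a GAM replaces the straight-line term of a GLM with a smooth function of the covariate. The sketch below assumes the mgcv package and uses simulated data with invented variable names; it does not cover the spatial GAMM from Exercise 9.

```r
library(mgcv)   # assumed to be installed; it ships with standard R distributions

set.seed(1)

# Simulated counts with a non-linear response to a hypothetical covariate
n <- 200
distance <- runif(n, 0, 10)
roadkill <- rpois(n, lambda = exp(1 + sin(distance)))
dat <- data.frame(roadkill, distance)

# s(distance) asks for a smooth term; its wiggliness is estimated from the data
gam_fit <- gam(roadkill ~ s(distance), family = poisson,
               data = dat, method = "REML")
summary(gam_fit)

# Predicted mean counts on a regular grid, on the response scale
new_dat <- data.frame(distance = seq(0, 10, length.out = 50))
pred <- predict(gam_fit, newdata = new_dat, type = "response")
```

Calling `plot(gam_fit)` draws the fitted smooth with its confidence band.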
Model selection
- Introduction to Information Theory
- Exercise 10: GLM Model selection and model averaging – Roadkill
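One way to picture information-theoretic model selection is to rank a small set of candidate models by AIC and convert the differences into Akaike weights. The sketch below uses plain AIC from base R on simulated data. The candidate set and variable names are assumptions for illustration, and the exercise itself may use a different criterion or a dedicated package.

```r
set.seed(7)

# Simulated data: counts depend on distance but not on traffic
n <- 150
dat <- data.frame(distance = runif(n, 0, 10), traffic = runif(n, 0, 5))
dat$roadkill <- rpois(n, exp(2 - 0.15 * dat$distance))

# Hypothetical candidate set
models <- list(
  null     = glm(roadkill ~ 1, family = poisson, data = dat),
  distance = glm(roadkill ~ distance, family = poisson, data = dat),
  both     = glm(roadkill ~ distance + traffic, family = poisson, data = dat)
)

aic <- sapply(models, AIC)
delta <- aic - min(aic)   # difference from the best model

# Akaike weights: relative support for each model within this candidate set
akaike_w <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))

aic_table <- data.frame(AIC = aic, delta = delta, weight = akaike_w)
aic_table[order(aic_table$delta), ]
```

The weights sum to 1 across the candidate set, so they only describe support relative to the models that were compared.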
Pseudo R²
- How do you know your top model/s are any good?
- Exercise 11: GLM Pseudo R²
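Several pseudo R² measures exist for GLMs. As a minimal sketch, the code below computes two common ones, a deviance-based version and McFadden's, on simulated data. Which measure the exercise uses is not stated here, so treat the choice as an assumption.

```r
set.seed(3)

# Simulated counts with a hypothetical covariate
n <- 120
dat <- data.frame(distance = runif(n, 0, 10))
dat$roadkill <- rpois(n, exp(2 - 0.2 * dat$distance))

fit <- glm(roadkill ~ distance, family = poisson, data = dat)
null_fit <- update(fit, . ~ 1)   # intercept-only model for comparison

# Deviance-based: proportion of the null deviance explained by the model
r2_deviance <- 1 - deviance(fit) / fit$null.deviance

# McFadden: compares the log-likelihoods of the fitted and null models
r2_mcfadden <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null_fit))

c(deviance = r2_deviance, mcfadden = r2_mcfadden)
```

Neither value is directly comparable to the R² of an ordinary linear model, so they are best used to compare models fitted to the same data.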
Steps in choosing the appropriate analysis
More resources
- Recommended books: anything from the UseR series, any books by Zuur et al.
- Search engines: Google and rseek.org
- Software Carpentry
- Quick intro to R
- Organizing your project
- R-bloggers interpreting residual plots
- UCLA data analysis examples
- GitHub demo website