Analytics 1: Introduction to reproducible analyses in R, 2022

Course content

An increase in the complexity and scale of biological data means biologists are increasingly required to develop the data skills needed to design reproducible workflows for the simulation, collection, organisation, processing, analysis and presentation of data. Developing such data skills requires at least some coding, also known as scripting. Scripting makes your work (everything you do with your raw data) explicitly described, transparent and reproducible. However, learning to code can be a daunting prospect for many biologists! That’s where this Introduction to reproducible analyses in R comes in!

R is a free and open source language especially well-suited to data analysis and visualisation and has a relatively inclusive and newbie-friendly community. R caters to users who do not see themselves as programmers, but then allows them to slide gradually into programming.

Who is this for?

This is mandatory training for first years, i.e. students who started in 2021. This includes core DTP students as well as iCASE and White Rose Network students.

Prerequisites

I recognise that people will enter this training with a diverse range of previous experience in R. This is a challenge to manage, but the aim is for everyone to get something out of the training no matter where they start. There will be tutor-led sessions for those without previous experience, and materials for those with experience to work through with the help of floating tutors. Pre-course instructions for participants are given below.

Computers will be provided. You can bring your own laptop to the workshop if you wish but it will need a working wi-fi connection and you will have to let the tutor know in advance so a temporary York password can be created for you.

Learning outcomes

After these modules the successful learner will be able to:

  • Find their way around the RStudio windows
  • Create and plot data using the base package and ggplot
  • Explain the rationale for scripting analysis
  • Use the help pages
  • Know how to make additional packages available in an R session
  • Understand what is meant by the working directory, absolute and relative paths and be able to apply these concepts to data import
  • Summarise data in a single group or in multiple groups
  • Recognise tidy data format and carry out some typical data tidying tasks
  • Develop highly organised analyses including well-commented scripts that can be understood by future you and others
  • Use R Markdown to produce reproducible analyses, figures and reports

When and where

This course will be held in person on Thursday 24th March 2022 at The University of York. Transport will be provided for Leeds and Sheffield students. Please note that Leeds and Sheffield students will have to get up very early that morning so as not to miss the coaches! Transport details will be issued nearer the time.

Schedule

Click the links for an insight into what will be covered in each module.

09:30 Arrive

10:00 – 10:30 Tutor-led for Everyone: Introduction and Principles of reproducibility

10:45 – 12:45 Tutor-led for Beginners: Introduction to R and working with data, or Supported learning for those with some R experience: Tidying data and the tidyverse including the pipe

13:30 – 14:00 Tutor-led for Everyone: RStudio Projects

14:00 – 16:00 Tutor-led for Everyone: R Markdown for Reproducible Reports

Tutor

The course will be delivered by Emma Rand of The University of York.

Refreshments

Lunchtime refreshments will be provided.

Travel arrangements

Details of coach travel for Leeds and Sheffield students will be issued with detailed joining instructions via email. Please note that you will have to get up very early that morning so you do not miss the coaches! If you choose not to use the coaches provided, you will not be able to claim your travel costs. Car parking on the York campus is very limited and is allocated on a first-come, first-served basis, so there is no guarantee of a space.

Analytics R1 and Analytics R2 are running on the same day, so both year groups will travel together.

COVID regulations

Following the Government announcement on Monday 21 February, which set out the plan for ‘living with Covid’, The University of York has updated its guidance to staff and students:

  • We’re asking our community to continue to protect others by staying at home if you feel unwell, wearing a face covering and getting booster vaccinations.
  • Participants should do a lateral flow test before attending the Analytics course. If it is positive, please stay at home and alert the DTP Co-ordinator (Catherine Liddle, email: WRDTP@leeds.ac.uk) that you will not be attending.
  • Any changes to these measures will be advised prior to the course date.

How to register

Registration closes on Friday 18th February 2022. Please use the following link:

Contact

If you have any queries or you feel you already have sufficient previous experience of a particular module, please discuss this directly with the tutor, Emma Rand, email: emma.rand@york.ac.uk

For general DTP queries, contact Catherine Liddle, White Rose BBSRC Doctoral Training Partnership Co-ordinator, email: WRDTP@leeds.ac.uk