2-6 July 2018
Course content overview
R is an open source programming language and software environment for performing statistical calculations and creating data visualisations. It is rapidly becoming the tool of choice for data analysts, and a growing number of employers are seeking candidates with R programming skills.
This course will provide you with all the tools you need to get started analysing data in R. We will introduce the tidyverse, a collection of R packages created by Hadley Wickham and others which provides an intuitive framework for using R for data analysis. Students will learn the basics of R programming and how to use R for effective data analysis. Practical examples of data analysis on social science topics will be provided.
1. R and the ‘tidyverse’
This session will introduce R and RStudio and cover the basics of R programming and good coding practice. We will also discuss R packages and how to use them, with a particular focus on those that make up the ‘tidyverse’. We will also introduce R Markdown, which will be used to report our analyses throughout the course.
2. Import and Tidy
Data scientists spend about 60% of their time cleaning and organizing data (CrowdFlower Data Science Report 2016: 6). This session will show you how to ‘tidy’ your data ready for analysis in R. In particular, we’ll show you how to take data stored in a flat file, database, or web API, and load it into a dataframe in R. We will also talk about consistent data structures, and how to achieve them.
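As a rough illustration of what importing and tidying can look like, the sketch below writes a small flat file, reads it into a data frame with readr, and reshapes it with tidyr so that each row is one observation. The file contents, the column names (`area`, `year`, `incidents`) and the figures are invented for this example and are not course data.

```r
library(readr)
library(tidyr)

# Hypothetical wide-format flat file: one row per area, one column per year
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "area,2016,2017",
  "North,120,135",
  "South,98,91"
), csv_path)

# Import: read the flat file into a data frame (a tibble);
# "cii" declares one character column followed by two integer columns
wide <- read_csv(csv_path, col_types = "cii")

# Tidy: reshape so that each row is a single area-year observation
tidy <- pivot_longer(
  wide,
  cols = -area,
  names_to = "year",
  values_to = "incidents"
)

# The year values came from column names, so they start out as text
tidy$year <- as.integer(tidy$year)

print(tidy)
```

The reshaped table has one column per variable, which is the consistent structure the rest of the tidyverse expects.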
3. Transform
Together with importing and tidying, transforming data is one of the key elements of data analysis. We will cover subsetting your data (to narrow your focus), creating new variables from existing ones, and calculating summary statistics.
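The three operations mentioned here map closely onto dplyr verbs. The following is a minimal sketch, assuming a small made-up survey table; the variable names (`region`, `age`, `income`) and values are hypothetical and chosen only to keep the example short.

```r
library(dplyr)

# Hypothetical survey data, invented for illustration
survey <- tibble(
  region = c("North", "North", "South", "South", "South"),
  age    = c(34, 51, 27, 45, 62),
  income = c(28000, 41000, 22000, 39000, 33000)
)

summary_by_region <- survey %>%
  filter(age >= 30) %>%                  # subset rows to narrow the focus
  mutate(income_k = income / 1000) %>%   # create a new variable from an existing one
  group_by(region) %>%
  summarise(                             # calculate summary statistics per group
    n = n(),
    mean_income_k = mean(income_k),
    .groups = "drop"
  )

print(summary_by_region)
```

Chaining the steps with the pipe keeps each transformation on its own line, so the analysis reads top to bottom in the order it is applied.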
4. Visualise
Data visualisation is what brings your data to life. This session will provide you with the skills and tools to create effective static and interactive visualisations of your data.
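For a sense of what a static visualisation looks like in code, here is a minimal ggplot2 sketch. It uses `mtcars`, a dataset that ships with R, purely as a stand-in; the choice of variables and labels is an assumption made for illustration, and interactive output is not shown.

```r
library(ggplot2)

# Scatter plot of car weight against fuel efficiency,
# coloured by number of cylinders (mtcars is built into R)
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    colour = "Cylinders"
  )

print(p)
```

The plot is built in layers: the data and aesthetic mappings come first, then a geometry, then the labels.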
5. Bringing it all together
In this last session we review all we have learned on this course, and think about how we can bring it all together in dynamic outputs, such as interactive documents, plots, and Shiny applications.
After this course, users should be able to:
- implement the basic operations of R;
- read data in multiple forms;
- clean, manipulate, explore and visualise data in R.
This course has no prerequisites.
We will be referring to R for Data Science by Garrett Grolemund and Hadley Wickham (2016) throughout the course. The full text of the book is freely available online, and a hard copy can be purchased.
Dr Reka Solymosi is a lecturer in quantitative criminology at the University of Manchester in the United Kingdom. Before that she was a data analyst researching issues around transport crime and policing at Transport for London. Her research interests are around crowdsourced data collection, transport crime, and perception of crime and place. She uses R in both teaching and research, and co-runs the R at University of Manchester (RUM) group.
Dr Henry Partridge is the Manager of the Trafford Data Lab, which supports decision-making in Trafford, Greater Manchester by revealing patterns in data through visualisation. Henry is currently involved in a Horizon 2020 project which promotes the use of open linked statistical data to improve the delivery of public services. Henry has strong research and analytical skills with particular expertise in R programming, data visualisation, and spatial analysis.
To book your place and have an invoice submitted to your institution, submit your details on our online booking form.
To book your place and pay by credit or debit card, please visit our e-store.