Module Overview

PSY6422 Data Analysis and Visualisation was designed and taught by me, Tom Stafford, from 2020 to 2025, as part of the MSc in Psychological Research Methods with Data Science at The University of Sheffield. The module continues as part of the MSc in Psychological Research Methods and the new MSc in Artificial Intelligence.

These pages are now archival. Check out the showcases from previous years:

0.1 Motivation

Psychological science is increasingly reliant on complex computational and statistical methods to make sense of rich behavioural data. This course aims to teach the skills which support creating robust and reproducible analyses with such methods and data.

As well as supporting sophisticated data visualisation, we aim to train you in reproducible workflows, meaning that you can reliably re-create an entire analysis using scripts that automate every step between raw data and the final visualisation.

As well as being reproducible (by you or other researchers), your work should be legible (to Future You, or other researchers) and scalable (it should work as well on 400,000 data points as on 40).

You will need help to do this. Therefore you will use Open Source solutions: analysis tools which have a worldwide community of people using them, and the infrastructure which supports sharing advice and solutions.

In practice, this means you are going to start by using R (you could use Python, but this module is based on R).
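As a rough illustration of what a scripted workflow can look like, here is a minimal sketch in base R. Everything in it is an assumption made for the example, not course material: the file names, the column names (`participant`, `condition`, `rt_ms`), the helper functions, and the simulated data that stands in for a real raw data file. The point is only that each step from raw data to figure is a function call in a script, so the whole analysis can be re-run from scratch.

```r
# Hypothetical example: file names, column names and helper functions are
# illustrative assumptions, not part of the module materials.

set.seed(6422)  # fixed seed so the simulated "raw" data are the same on every run

# Step 0: simulate a raw data file (a stand-in for data you would normally collect)
raw_file <- file.path(tempdir(), "raw_reaction_times.csv")
raw <- data.frame(
  participant = rep(1:20, each = 2),
  condition   = rep(c("congruent", "incongruent"), times = 20),
  # interleave the two conditions so each participant has one value per condition
  rt_ms       = round(c(rbind(rnorm(20, 500, 50), rnorm(20, 560, 50))))
)
raw$rt_ms[3] <- NA  # one missing value for the cleaning step to handle
write.csv(raw, raw_file, row.names = FALSE)

# Step 1: load the raw data
load_data <- function(path) read.csv(path, stringsAsFactors = FALSE)

# Step 2: clean (here: drop rows with a missing reaction time)
clean_data <- function(df) df[!is.na(df$rt_ms), ]

# Step 3: summarise (mean reaction time per condition)
summarise_data <- function(df) aggregate(rt_ms ~ condition, data = df, FUN = mean)

# Step 4: write the final visualisation to a file
plot_summary <- function(summary_df, path) {
  pdf(path, width = 6, height = 4)
  barplot(summary_df$rt_ms,
          names.arg = summary_df$condition,
          ylab = "Mean reaction time (ms)")
  dev.off()
  invisible(path)
}

# Run the whole pipeline, raw data to figure, in one go
cleaned    <- clean_data(load_data(raw_file))
summary_df <- summarise_data(cleaned)
fig_file   <- plot_summary(summary_df, file.path(tempdir(), "mean_rt.pdf"))
```

Because no step is done by hand, the same script should run unchanged whether the raw file has 40 rows or 400,000, and the named functions are there to keep it readable for whoever opens it next.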

0.2 Course Aims

The curriculum is updated each year, but you can get a general idea of the order in which the topics are covered from the left sidebar. By the end of this course you will have:

Been trained in data project management – including fundamentals of data storage, synchronisation and sharing – and the importance of reproducible workflows
Used the statistical programming language R, and RStudio, for data management, analysis and visualisation
Been introduced to fundamental programming concepts
Prepared data project documentation using RMarkdown
Had an introduction to version control using git
Published data projects to the web via github pages

There is also the opportunity to cover advanced topics, either in class or as part of your project. These could include:

Interactive visualisation with Shiny apps
SQL
Web scraping
animated / roll-over visualisation

You may particularly enjoy the Reading list

0.2.1 The Module mini-project

The bulk of the course assessment is to conduct and publish your own analysis project. By doing this you will gain experience of combining all the skills taught on the course within a single project. The project takes a data visualisation from start to finish: from raw data, through data cleaning and documentation, to sharing your code and the resulting visualisation on the web.

The intention with the assessment is to ensure that every student leaves the course with something they are proud to put in their portfolio of work, something which shows what they can do and which helps with future job or course applications.

See here for more on the nature of your module project, and see here for examples from previous years: class of 2020, class of 2021, class of 2022.

0.3 Resources for current students

Google Drive:

Includes slides and other resources, as well as these specific documents

FAQ document which I am adding to as questions come in

FAQ for PSY6422 (including on module project)

Most information is on these pages (hosted on github, no login required)