Advanced Topics
The first ten topics of this course represent the core that I want every student to take away from it. With these topics you should be equipped to focus on your personal data project, which makes up the majority of the grade on the course.
The remaining time in the timetable is available for supervision on this data project, and to cover more advanced topics. For these topics, rather than provide a lecture, we will work in small groups to follow exercises and tutorials which exist outside of the course. The motivation for this is twofold. First, because all your future work will be teamwork, I want students to graduate from this course with experience of working together on technical projects. Whether you are more or less confident in the technical requirements, you will learn a lot from trying to share what you know, or think you know, with a group. Second, a key skill for ongoing development in data science is being able to teach yourself. The world is rich in useful advice, tutorials and examples. Only by discovering how you can use these will you maximise your potential after you have finished this module.
In previous years we have voted on the advanced topics to cover in class. Topics have included:
Please visit these pages for topic-specific resources.
10.5 Machine Learning
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron (2019)
- Notebooks for this book: https://github.com/ageron/handson-ml3
10.6 Miscellaneous topics/resources
- regexone.com: Learn regular expressions with simple, interactive exercises.
- Performance tuning: Code performance in R: Parallelization
- Making Your First R Package
- Web scraping (in Python): Web-Data-Scraping-S2023 from Brian Keegan
- Leon Yin: Finding Undocumented APIs
10.7 Reproducibility
Dependencies
Code Review
- Strand, J. F. (2021, March 31). Error Tight: Exercises for Lab Groups to Prevent Research Mistakes. https://doi.org/10.31234/osf.io/rsn5y
- CODECHECK
10.8 General righteousness
You should use a password manager. Really. LastPass is recommended.
You should know how to use a VPN.
- Advice for Sheffield students here: Working remotely - information for students
- You can test by visiting this page. If you are on the University network (either on campus or via a VPN) it will offer you the chance to download the article PDF. Otherwise it will try to charge you $35 for the privilege.