This is a ‘LIVE COURSE’ – the instructors will be delivering lectures and coaching attendees through the accompanying computer practicals via video link. A good internet connection is essential.
TIME ZONE – UK (GMT+1) local time – however all sessions will be recorded and made available allowing attendees from different time zones to follow.
Please email oliverhooker@prstatistics.com for full details or to discuss how we can accommodate you.
This workshop aims to give novice programmers an introduction to data visualisation using Python for research in evolutionary biology and genomics by using biological examples throughout. We will use example datasets and problems themed around sequence analysis, taxonomy and ecology, with plenty of time for participants to work on their own research data.
Much of the popularity of Python stems from the availability of high quality libraries of existing code that we can use for our own projects. Libraries (“packages” in Python terminology) are even more useful when they are designed to work together. For scientific programming, we are lucky to have a collection of mature packages which work together to form a stack:
In this course we will learn how to use these packages together to quickly explore large biological datasets, find meaningful patterns in the data, and present our results clearly. We will focus on the high-level packages – pandas and seaborn – as this will allow us to do the most work with the smallest amount of code. By concentrating on just two packages for an entire course, we will be able to cover a large part of what these tools can do.
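As a rough illustration of what "the most work with the smallest amount of code" can look like, here is a minimal pandas sketch. The dataset, column names and values are invented for this example and are not taken from the course material.

```python
import pandas as pd

# Hypothetical toy data (invented for illustration): GC content for
# two classes of gene in two species, stored in long form.
records = pd.DataFrame(
    {
        "species": ["E. coli", "E. coli", "B. subtilis", "B. subtilis"],
        "gene_class": ["ribosomal", "metabolic", "ribosomal", "metabolic"],
        "gc_content": [0.52, 0.50, 0.44, 0.42],
    }
)

# One line to summarise: mean GC content for each species.
mean_gc = records.groupby("species")["gc_content"].mean()

# One line to reshape from long form to wide form:
# species become rows, gene classes become columns.
wide = records.pivot(index="species", columns="gene_class", values="gc_content")

print(mean_gc)
print(wide)
```

A wide table like `wide` is the kind of two-dimensional input that a heatmap function such as seaborn's `heatmap` accepts, so a summary and a plot-ready table each take a single line here.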
The course is intended for anyone interested in using Python for analysis and visualisation of biological datasets. Some previous experience of Python IS required: we won’t cover the absolute basics of the language, so you will need to know the basic syntax. The introduction to Python for Biologists course gives a suitable background. If you want to come on this course but have no Python experience, get in touch at martin@pythonforbiologists.com and I can suggest resources to get up to speed.
This course includes plenty of practical time, including opportunities to work on your own datasets, so it might be particularly suitable for people at the start of the data analysis stage of a project.
Delivered remotely
Time zone – UK (GMT+1) local time
Availability – 20
Duration – 4 days, 8 hours per day
Contact hours – Approx. 28 hours
ECTS – Equal to 3 ECTS credits
Language – English
Lectures/discussions of Python code, libraries and techniques delivered using interactive
notebooks. Workshop/practical time for students to tackle carefully designed programming
challenges that use the material from the discussion sessions. Usually followed up by
discussion of solutions, wrap up and summarisation.
Very modest; you should be familiar with basic descriptive statistics and how to read
common chart types like box plots and scatter plots.
This course assumes a background knowledge of Python syntax, so is not suitable for
complete beginners to programming. If you have any questions about whether the course is
suitable, don’t hesitate to email martin@pythonforbiologists.com to chat.
Although not absolutely necessary, a large monitor and a second screen could improve the learning experience. Participants are also encouraged to keep their webcams active to increase their interaction with the instructor and other students.
PLEASE READ – CANCELLATION POLICY
Cancellations are accepted up to 28 days before the course start date, subject to a 25% cancellation fee. Cancellations later than this may be considered; contact oliverhooker@prstatistics.com. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that a course is cancelled due to unforeseen circumstances, a full refund of the course fees will be credited.
If you are unsure about course suitability, please get in touch by email to find out more: oliverhooker@prstatistics.com
Day 1 – Classes from 09:30 – 17:30
Session 1: Environment, packages, data files and data model
Session 2: Series objects and thinking in columns
Day 2 – Classes from 09:30 – 17:30
Session 3: Introducing seaborn
Session 4: Categorical axes with seaborn
Day 3 – Classes from 09:30 – 17:30
Session 5: Grouping and categories with pandas
Session 6: Long vs. wide form data and heatmaps
Day 4 – Classes from 09:30 – 17:30
Session 7: Complex data files with pandas
Session 8: High performance pandas
Martin is a freelance trainer specialising in teaching programming (mostly Python) and Linux skills to researchers in the field of biology. He trained as a biologist and completed his PhD in large-scale phylogenetics in 2007, then held a number of academic positions at the University of Edinburgh, ending in a two-year stint as Lecturer in Bioinformatics. He launched Python for Biologists in 2015 and has been teaching and writing full-time ever since.