ONLINE COURSE – Introduction to generalised linear models using R and RStudio (IGLMPR)

Pre-Recorded

This course provides a comprehensive practical and theoretical introduction to generalized linear models using R. Generalized linear models generalize linear regression to situations where the outcome variable is, for example, a binary, ordinal, or count variable. The specific models we cover include binary, binomial, and categorical logistic regression, Poisson and negative binomial regression for count variables, and extensions for overdispersed and zero-inflated data.

We begin with a brief overview of the normal general linear model. Understanding this model is vital for understanding how it is generalized in generalized linear models. Next, we introduce the widely used binary logistic regression model, which is a regression model for binary outcome variables. We then cover binomial logistic regression and the multinomial case, which is for modelling outcome variables that are polychotomous, i.e., have more than two categorically distinct values. After that, we cover Poisson regression, which is widely used for modelling outcome variables that are counts (i.e., the number of times something has happened). We then cover extensions to accommodate overdispersion, starting with the quasi-likelihood approach and then covering the negative binomial and beta-binomial models for counts and discrete proportions, respectively. Finally, we cover zero-inflated Poisson and negative binomial models, which are for count data with excessive numbers of zero observations.

This course is aimed at anyone who is interested in using R for data science or statistics. R is widely used in all areas of academic scientific research, and also throughout the public and private sectors.

Delivered remotely

Time zone – NA

Availability – NA

Duration – 3 x 1/2 days

Contact hours – Approx. 12 hours

ECTS – Equal to 1 ECTS credit

Language – English

This course will be largely practical, hands-on, and workshop-based. For each topic, there will first be a lecture-style presentation, using slides or a blackboard, to introduce and explain key concepts and theories. Then, we will cover how to perform the various statistical analyses using R. Any code that the instructor produces during these sessions will be uploaded to a publicly available GitHub site after each session.

A basic understanding of statistical concepts, specifically generalised linear regression models, statistical significance, and hypothesis testing.

Familiarity with R, including the ability to import/export data, manipulate data frames, fit basic statistical models, and generate simple exploratory and diagnostic plots.

A laptop computer with a working version of R or RStudio is required. R and RStudio are both available as free and open source software for PCs, Macs, and Linux computers.

Cancellations are accepted up to 28 days before the course start date, subject to a 25% cancellation fee. Cancellations later than this may be considered; contact oliverhooker@prstatistics.com. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that a course is cancelled due to unforeseen circumstances, a full refund of the course fees will be credited.

Topic 1: The general linear model. We begin by providing an overview of the normal, as in normal distribution, general linear model, including using categorical predictor variables. Although this model is not the focus of the course, it is the foundation on which generalized linear models are based and so must be understood to understand generalized linear models.
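
As a rough illustration of why the normal linear model is the starting point, the sketch below fits the same model with `lm` and with `glm` using a Gaussian family and identity link, and the two sets of coefficient estimates should agree. The data are simulated and all variable names are illustrative assumptions, not course materials.

```r
# Simulated data: one continuous and one categorical predictor (illustrative only)
set.seed(101)
n <- 200
x <- rnorm(n)
g <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
y <- 1 + 0.5 * x + ifelse(g == "b", 0.8, 0) + ifelse(g == "c", -0.4, 0) + rnorm(n)

# The normal general linear model, fitted two ways
fit_lm  <- lm(y ~ x + g)
fit_glm <- glm(y ~ x + g, family = gaussian(link = "identity"))

# The coefficient estimates from the two fits should match
print(coef(fit_lm))
print(coef(fit_glm))
```

Seen this way, the normal linear model is the special case of a generalized linear model with a normal outcome distribution and an identity link.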

Topic 2: Binary logistic regression. Our first generalized linear model is the binary logistic regression model, for use when modelling binary outcome data. We will present the assumed theoretical model behind logistic regression, implement it using R’s glm, and then show how to interpret its results, perform predictions, and (nested) model comparisons.
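
The workflow described above might look roughly like the following sketch. It uses simulated data, and the variable names and true parameter values are assumptions chosen for illustration only.

```r
# Simulated binary outcome whose log odds depend linearly on x (illustrative only)
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
df <- data.frame(x = x, y = y)

# Fit the binary logistic regression and a nested intercept-only model
fit      <- glm(y ~ x, data = df, family = binomial(link = "logit"))
fit_null <- glm(y ~ 1, data = df, family = binomial(link = "logit"))

# Interpretation: coefficients are on the log-odds scale; exponentiate for odds ratios
print(summary(fit)$coefficients)
print(exp(coef(fit)))

# Prediction: predicted probabilities at new values of x
pred <- predict(fit, newdata = data.frame(x = c(-1, 0, 1)), type = "response")
print(pred)

# Nested model comparison via a likelihood ratio (deviance) test
print(anova(fit_null, fit, test = "Chisq"))
```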

Topic 3: Binomial logistic regression. Here, we show how the binary logistic regression can be extended to deal with data on discrete proportions. We will also present alternative link functions to the logit, such as the probit and complementary log-log links.
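
A minimal sketch of this extension, assuming simulated data in which each row records a number of successes out of a fixed number of trials. The names `dose`, `size`, and `successes` are hypothetical.

```r
# Simulated discrete proportions: successes out of 20 trials per row (illustrative only)
set.seed(7)
m <- 60
dose <- seq(-2, 2, length.out = m)
size <- rep(20, m)
successes <- rbinom(m, size = size, prob = plogis(0.3 + 1.0 * dose))
df <- data.frame(dose, size, successes)

# Binomial logistic regression: the outcome is a two-column matrix of
# (successes, failures)
fit_logit <- glm(cbind(successes, size - successes) ~ dose,
                 data = df, family = binomial(link = "logit"))

# Same model with alternative link functions
fit_probit  <- update(fit_logit, family = binomial(link = "probit"))
fit_cloglog <- update(fit_logit, family = binomial(link = "cloglog"))

# Compare the three fits, for example by AIC
aic_table <- AIC(fit_logit, fit_probit, fit_cloglog)
print(aic_table)
```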

Topic 4: Categorical logistic regression. Categorical logistic regression, also known as multinomial logistic regression, is for modelling polychotomous data, i.e. data taking more than two categorically distinct values. Like ordinal logistic regression, categorical logistic regression is also based on an extension of the binary logistic regression case.
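
One common way to fit such a model in R is `multinom` from the `nnet` package, which ships with standard R distributions as a recommended package. Whether the course uses this function is an assumption. The data below are simulated and the names are illustrative.

```r
library(nnet)

# Simulated three-category outcome; category "a" is the reference (illustrative only)
set.seed(5)
n <- 600
x <- rnorm(n)
eta_b <- 0.5 + 1.0 * x    # log odds of "b" relative to "a"
eta_c <- -0.5 - 1.0 * x   # log odds of "c" relative to "a"
denom <- 1 + exp(eta_b) + exp(eta_c)
probs <- cbind(a = 1 / denom, b = exp(eta_b) / denom, c = exp(eta_c) / denom)
y <- factor(apply(probs, 1, function(p) sample(c("a", "b", "c"), size = 1, prob = p)))
df <- data.frame(x, y)

# Fit the categorical (multinomial) logistic regression
fit <- multinom(y ~ x, data = df, trace = FALSE)

# One row of coefficients per non-reference category
print(coef(fit))

# Predicted category probabilities at new values of x; each row sums to 1
pred <- predict(fit, newdata = data.frame(x = c(-1, 0, 1)), type = "probs")
print(pred)
```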

Topic 5: Poisson regression. Poisson regression is a widely used technique for modelling count data, i.e., data where the variable denotes the number of times an event has occurred.
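
A minimal sketch of a Poisson regression with the default log link, using simulated counts. The variable names and parameter values are assumptions for illustration.

```r
# Simulated counts whose log expected value depends linearly on x (illustrative only)
set.seed(2024)
n <- 300
x <- runif(n, 0, 2)
y <- rpois(n, lambda = exp(0.2 + 0.7 * x))
df <- data.frame(x, y)

# Poisson regression with a log link
fit <- glm(y ~ x, data = df, family = poisson(link = "log"))

# Exponentiated coefficients are multiplicative effects on the expected count
print(exp(coef(fit)))

# Predicted expected count at x = 1
pred <- predict(fit, newdata = data.frame(x = 1), type = "response")
print(pred)
```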

Topic 6: Overdispersion models. We begin with the quasi-likelihood approach for both the Poisson and binomial models. We then cover negative binomial regression. Like the Poisson regression model, the negative binomial model is used for unbounded count data, but it is less restrictive because it can accommodate overdispersed data. Finally, we cover beta-binomial regression, an overdispersed alternative to the binomial model.
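
The sketch below illustrates the count-data part of this topic on simulated overdispersed counts. It uses `glm.nb` from the `MASS` package (a recommended package shipped with standard R distributions) for the negative binomial fit. This choice of function is an assumption, and the beta-binomial model is not shown because base R has no fitting function for it.

```r
library(MASS)

# Simulated overdispersed counts from a negative binomial distribution (illustrative only)
set.seed(99)
n <- 400
x <- rnorm(n)
y <- rnbinom(n, size = 1.5, mu = exp(1 + 0.5 * x))
df <- data.frame(x, y)

# Ordinary Poisson fit, quasi-Poisson fit, and negative binomial fit
fit_pois  <- glm(y ~ x, data = df, family = poisson)
fit_quasi <- glm(y ~ x, data = df, family = quasipoisson)
fit_nb    <- glm.nb(y ~ x, data = df)

# Pearson dispersion statistic from the Poisson fit; values well above 1
# suggest overdispersion
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)
print(dispersion)

# Quasi-Poisson keeps the Poisson point estimates but rescales the standard errors
print(summary(fit_quasi)$dispersion)

# Negative binomial: estimated dispersion parameter theta
print(fit_nb$theta)
```

In the quasi-likelihood fit the coefficient estimates are unchanged and only the standard errors are inflated, whereas the negative binomial model changes the assumed outcome distribution itself.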

Topic 7: Zero-inflated models. Zero-inflated count data contain more zero counts than a Poisson or negative binomial model can account for. Zero-inflated Poisson and negative binomial models are types of latent variable models.
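
To make the latent variable idea concrete, the sketch below simulates zero-inflated counts in base R. A hidden binary variable decides whether an observation is a structural zero, and otherwise the count is drawn from a Poisson distribution. It then shows that an ordinary Poisson regression expects fewer zeros than were observed. The names and parameter values are assumptions, and the zero-inflated model itself is not fitted here, since that typically requires an add-on package such as `pscl`.

```r
# Simulated zero-inflated Poisson data (illustrative only)
set.seed(11)
n <- 1000
x <- rnorm(n)
is_structural_zero <- rbinom(n, size = 1, prob = 0.3) == 1  # latent variable
counts <- rpois(n, lambda = exp(0.8 + 0.4 * x))
y <- ifelse(is_structural_zero, 0L, counts)

# An ordinary Poisson regression ignores the latent zero-generating process
fit <- glm(y ~ x, family = poisson)

# Proportion of zeros observed versus the proportion the Poisson fit expects
observed_zero <- mean(y == 0)
expected_zero <- mean(dpois(0, lambda = fitted(fit)))
print(c(observed = observed_zero, expected = expected_zero))
```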

**Dr. Rafael De Andrade Moral**

Rafael is an Associate Professor of Statistics at Maynooth University, Ireland. With a background in Biology and a PhD in Statistics from the University of São Paulo, Rafael has a deep passion for teaching and conducting research in statistical modelling applied to Ecology, Wildlife Management, Agriculture, and Environmental Science. As director of the Theoretical and Statistical Ecology Group, Rafael brings together a community of researchers who use mathematical and statistical tools to better understand the natural world. As an alternative teaching strategy, Rafael has been producing music videos and parodies to promote Statistics in social media and in the classroom. His personal webpage can be found **here**

**ResearchGate**

**GoogleScholar**

**ORCID**

**GitHub**