Predict Delays

OpenDataDay 2017 repository


In this project, we fit a simple linear model to predict delays in the arrival times of VBZ public transportation vehicles, using four weeks of data. The accompanying Shiny app can be found here.

To fit the model, we use the predictors 'weekday', 'vehicle type', 'temperature' and 'precipitation'. 'weekday' and 'vehicle type' are categorical predictors. 'temperature' and 'precipitation' are continuous predictors.
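As a rough illustration of what such a model could look like in R, the sketch below fits `lm()` with two categorical and two continuous predictors. It is not the project's actual code: the data are simulated, and the column names (`vehicle_type`, `delay`), the units, and the coefficients used to generate the data are assumptions made for the example.

```r
# Minimal sketch on simulated data; NOT the VBZ data set or the project's code.
set.seed(1)
n <- 210

toy <- data.frame(
  # Categorical predictors, stored as factors so lm() dummy-codes them
  weekday       = factor(rep(c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"),
                             length.out = n)),
  vehicle_type  = factor(rep(c("tram", "bus"), length.out = n)),
  # Continuous predictors (units are assumptions: degrees Celsius and mm)
  temperature   = rnorm(n, mean = 5, sd = 4),
  precipitation = rexp(n, rate = 1)
)

# Simulated delay in seconds; the coefficients below are made up
toy$delay <- 30 +
  10 * (toy$vehicle_type == "bus") -
  0.5 * toy$temperature +
  4 * toy$precipitation +
  rnorm(n, sd = 5)

# Linear model with all four predictors
fit <- lm(delay ~ weekday + vehicle_type + temperature + precipitation,
          data = toy)
print(coef(fit))

# Predict the delay for one new observation
new_obs <- data.frame(
  weekday       = factor("Mon", levels = levels(toy$weekday)),
  vehicle_type  = factor("bus", levels = levels(toy$vehicle_type)),
  temperature   = 2,
  precipitation = 1.5
)
pred <- predict(fit, newdata = new_obs)
print(pred)
```

Because 'weekday' and 'vehicle type' are factors, `lm()` estimates one coefficient per non-reference level, while 'temperature' and 'precipitation' each get a single slope.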

We obtain data for the predictors 'weekday' and 'vehicle type' from Open Data Zurich ( and data for the predictors 'temperature' and 'precipitation' from

We also obtain the delay in arrival times, which is the quantity we want to predict, from Open Data Zurich.

The data set we use for fitting the model contains approximately 6 million data points.

To run our model:

  • Clone the project.
  • Create a directory 'raw' in the project root directory.
  • Move the data into 'raw'. If you have the data on the 'Stadt Zurich Open Data' USB stick, copy it from the stick into 'raw'.
  • Open the RProject in RStudio.
  • Hit Ctrl+Shift+B to start the Makefile-based project build.
  • Enter remake::create_bindings() in the R console to bind to the data object from within R.
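Before starting the build, it can help to confirm that the 'raw' directory exists and actually contains data. The helper below is a hypothetical convenience function, not part of the project; it assumes only the 'raw' directory name from the steps above and makes no assumption about the names of the data files.

```r
# Hypothetical helper (not part of the project): create 'raw' if needed
# and report how many files it currently contains.
check_raw_dir <- function(project_root = ".") {
  raw_dir <- file.path(project_root, "raw")

  # Create 'raw' in the project root if it does not exist yet
  if (!dir.exists(raw_dir)) {
    dir.create(raw_dir, recursive = TRUE)
  }

  # List whatever has been copied into 'raw' so far
  files <- list.files(raw_dir)
  if (length(files) == 0) {
    message("'raw' is empty: copy the data files into ", raw_dir,
            " before building.")
  } else {
    message(length(files), " file(s) found in ", raw_dir)
  }

  invisible(files)
}
```

Calling `check_raw_dir()` from the project root in the R console returns the file names invisibly and prints a short status message.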

Contributed 6 years ago by oleg for Open Data Day Zurich