Jupyter Notebook Docker image with Spark and Delta Lake support

  • Aims to replicate the Databricks Runtime, with additional features from jupyter/docker-stacks.
  • Base image: NVIDIA’s rapidsai/rapidsai.
  • Support for Spark/PySpark 3.2.x and Delta Lake 1.1.0.
  • Monthly cron job updates the image with the latest features from upstream jupyter/docker-stacks.
  • CI/CD automates building the image and pushing it to Docker Hub and ghcr.io.
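
As a rough illustration of what the Spark and Delta Lake support means in a notebook, the sketch below builds a local SparkSession with the Delta Lake extension enabled and writes a small Delta table. It is a minimal sketch under assumptions, not a description of how this image is configured: the helper names (delta_spark_conf, build_session, smoke_test) are hypothetical, the Scala build is assumed to be 2.12, and the image may already ship the Delta jars, in which case the spark.jars.packages entry is unnecessary.

```python
# Hypothetical helpers for trying Delta Lake from a notebook.
DELTA_VERSION = "1.1.0"   # version named in the feature list
SCALA_VERSION = "2.12"    # assumption: Scala build of the Spark 3.2.x install


def delta_spark_conf(delta_version=DELTA_VERSION, scala_version=SCALA_VERSION):
    """Return the Spark settings that enable Delta Lake on a session."""
    return {
        # Maven coordinates of the Delta Lake core package
        "spark.jars.packages": f"io.delta:delta-core_{scala_version}:{delta_version}",
        # Enables Delta SQL commands
        "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
        # Routes the default catalog through Delta
        "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    }


def build_session(app_name="delta-smoke-test"):
    """Create a local SparkSession with the Delta settings applied."""
    from pyspark.sql import SparkSession  # imported lazily; requires PySpark

    builder = SparkSession.builder.appName(app_name).master("local[*]")
    for key, value in delta_spark_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def smoke_test(path):
    """Write a five-row Delta table to `path` and return the row count read back."""
    spark = build_session()
    try:
        spark.range(5).write.format("delta").mode("overwrite").save(path)
        return spark.read.format("delta").load(path).count()
    finally:
        spark.stop()
```

Calling smoke_test with a writable directory (for example a path under /tmp) inside the running container is one way to check that Spark and Delta Lake work together; only delta_spark_conf runs without PySpark installed.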

Docker container for Data Science:

  • Based on the Jupyter Docker Stacks image jupyter/datascience-notebook.