Imagine that you have just coded a Django web application which lets users log in and use various services. You have now decided that you want to give the users the ability to interact with the services you provide using a Jupyter notebook spawned from within the Django web application. How would you do it?
Well, the good folks in the Jupyter community have already given us "JupyterHub" which allows users to spawn their own notebook servers starting from a single starting point - 'the hub' - after authenticating themselves in some way. By default, the JupyterHub system authenticates users against the Linux usernames and passwords for user accounts created on that system. So, the real change we need to make is to somehow tell JupyterHub to authenticate using the user accounts stored in the Django application.
These are the goals of this tutorial:
Let's begin! You will of course gain more insight into the workings if you follow along but if you wish to directly obtain the minimal working source code, check out this Github repository.
We are going to make a Django application from scratch. We are going to assume that you have some basic experience in Django and thus not explain every single step in this process. We want to create a virtual environment in a separate folder, install Django inside it and create a new project which we call "service_provider" since it will provide the user authentication service.
$ mkdir django-oauth-jupyterhub-demo
$ cd django-oauth-jupyterhub-demo
$ python3 -m venv venv/
$ source venv/bin/activate
$ pip3 install django==2.2.7
$ django-admin startproject service_provider
By default, Django is designed to use SQLite as a database. For our current purposes, we want to continue using the same. So, let us now create the database by 'migrating' it and create our first superuser.
$ cd service_provider
$ python manage.py migrate
$ python manage.py createsuperuser
(enter the details prompted)
Let us now test that everything works fine. Start the development server by saying,
$ python manage.py runserver
Point your favorite browser to http://127.0.0.1:8000/ and see if the success page appears. Next, head to http://127.0.0.1:8000/admin and see if you are able to log in using the username and password you specified during the process of creating the superuser.
If you are a regular netizen, you would have done the following many times. You come across a site which requires you to login. You click on Login and you are given an option to use your existing Google or Github or some other login to sign into the web site. Stackoverflow, Evernote etc. are examples of such sites. This is great because it saves you, the user, the trouble of creating a new account and managing it. Instead you allow Google or Github or whatever to authenticate you using the account you created with them and share some information with the service you are trying to use. Here, we say that Google or Github are service providers and applications such as Evernote or Stackoverflow are client applications.
The mechanism by which a client application is able to authenticate a user using account information stored and maintained on the service provider's application or database is called OAuth, short for Open Authorization. Note that this is one popular way to achieve this - there are other methods as well!
In our example, Django will be the service provider since it will store all the user accounts and details while JupyterHub will be the client application.
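To make the first step of that exchange concrete, here is a minimal sketch of how a client application could build the URL it sends the browser to when it wants the service provider to authenticate a user. It uses only the Python standard library. The endpoint, callback address, client ID and state value are placeholders chosen for illustration, and `build_authorize_url` is a hypothetical helper, not part of Django or JupyterHub.

```python
from urllib.parse import urlencode


def build_authorize_url(authorize_endpoint, client_id, redirect_uri, state):
    """Return the URL a client redirects the browser to in order to
    ask the service provider to authenticate the user."""
    query = urlencode({
        "response_type": "code",       # ask for an authorization code
        "client_id": client_id,        # identifies the client application
        "redirect_uri": redirect_uri,  # where the provider sends the user back
        "state": state,                # opaque value echoed back to the client
    })
    return f"{authorize_endpoint}?{query}"


# All values below are placeholders for illustration only.
url = build_authorize_url(
    "http://localhost:8000/o/authorize",
    client_id="example-client-id",
    redirect_uri="http://localhost:8010/hub/oauth_callback",
    state="random-opaque-value",
)
print(url)
```

The service provider shows its own login and consent pages at this URL, so the client application never sees the user's password.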
What does Django need to do in order to support the OAuth mechanism?
Phew! Sounds like hard work! Well, Django comes with an extension which allows this to be mostly automated. Let's set Django up for this.
We begin by installing the Django OAuth Toolkit. This can be done by saying,
$ pip install django-oauth-toolkit==1.2.0
Next, we need to install this app into our Django project. So, open up service_provider/settings.py and under the INSTALLED_APPS list, add oauth2_provider. So, this part of your settings.py file will look something like this:
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'oauth2_provider',
]
And under MIDDLEWARE add the following entry:
'oauth2_provider.middleware.OAuth2TokenMiddleware',
That's it - all the logic needed to handle OAuth is now within your Django application. In order to maintain a list of applications, the access credentials and more, the OAuth toolkit that we just installed needs some tables to exist in the database. Let's run the migrate command and update our database.
$ python manage.py migrate
If you are curious to see what new models have been introduced by this, check out the Django admin console and see what has changed!
Some more settings are needed, by the way! Because we are now going to ask two applications (likely hosted on two separate URLs) to send requests to each other, we are going to trigger some security mechanisms. When a page served from one origin (scheme, host and port) makes a request to something hosted on another origin, it is called a cross-origin request, and Cross-Origin Resource Sharing (CORS) is the mechanism that permits it. Unless the server's HTTP responses carry the appropriate CORS headers, browsers generally block such requests. So, we need to tweak Django to deal with this. Without too much explanation, here is what needs to be done.
$ pip install django-cors-middleware==1.4.0
Open service_provider/settings.py. Under INSTALLED_APPS, add the following line:
'corsheaders',
And under MIDDLEWARE add,
'corsheaders.middleware.CorsMiddleware',
And run a migration.
$ python manage.py migrate
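To see why the two applications count as separate origins even when both run on localhost, here is a small sketch using only the Python standard library. The helpers `origin` and `is_cross_origin` are hypothetical names made up for this illustration, and the URLs are the development addresses used in this tutorial.

```python
from urllib.parse import urlsplit

# Ports assumed when a URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def is_cross_origin(url_a, url_b):
    """True when the two URLs belong to different origins."""
    return origin(url_a) != origin(url_b)


# Same host, different ports: these are different origins.
print(is_cross_origin("http://localhost:8000/userdata",
                      "http://localhost:8010/hub/oauth_callback"))
```

Two URLs on the same scheme, host and port share an origin no matter how their paths differ.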
The final step in terms of Django code changes is to set up the URLs through which the Django app can be told that a client application needs a user to be authenticated. Yes, the client application needs to know the URL or address at which it can ask Django for user information etc.
Now, the most essential URLs mandated by the OAuth framework are already available to us thanks to our installing the oauth2_provider application. All we need to do is to tell Django to expose them to the world.
Open service_provider/urls.py and under the urlpatterns list, add the following entry:
path('o/', include('oauth2_provider.urls', namespace='oauth2_provider')),
At the top of the file, you will need to add the following:
from django.urls import include
Now, we are going to go to our browser and type http://127.0.0.1:8000/o/applications/.
Then click "Click here". Enter the following information.
Please copy the Client ID and Client Secret in a text file somewhere and keep it handy. We are going to need it later!
The final URL we need to setup is the one that returns a JSON containing the currently logged in user! This is needed by JupyterHub and is not a part of the minimal requirements of the OAuth framework.
WARNING: I'm breaking form here! Ideally, you should have a separate Django application with its collection of URL definitions and view functions, all neatly arranged. But for the sake of achieving a bare minimal working example, I'm going to avoid creating any app at all and define a view function inside the main project's urls.py. For your main ready-to-serve application, you should separate out this logic as per your style.
Next, open service_provider/urls.py and add the following code.
from django.http import HttpResponse
from django.contrib.auth.decorators import login_required
import json


@login_required()
def userdata(request):
    user = request.user
    return HttpResponse(
        json.dumps({
            'username': user.username
        }),
        content_type='application/json'
    )
And in the urlpatterns list, add
path('userdata', userdata, name='userdata')
To test if this works, ensure you are logged in (using the admin console) and type http://127.0.0.1:8000/userdata - you should get a JSON data dump containing the key "username".
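A client that consumes this endpoint only needs to parse the JSON body and pick out that one key. The sketch below shows the idea with the standard library. `extract_username` is a hypothetical helper written for this illustration, and "alice" is a made-up account name.

```python
import json


def extract_username(response_body, username_key="username"):
    """Parse a JSON response body and return the value stored under username_key."""
    data = json.loads(response_body)
    if username_key not in data:
        raise KeyError(f"response has no '{username_key}' key")
    return data[username_key]


# A body shaped like the one our userdata view returns ("alice" is made up).
body = json.dumps({"username": "alice"})
print(extract_username(body))
```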
If the user is not logged in and tries to access a page which requires the user to be logged in, Django automatically tries to redirect the user to the login page in a way such that once the user logs in successfully, the user is redirected back to the page they were trying to access initially. By default, the view function responsible for login requires a template to be defined as registration/login.html but instead of creating a page we can ask Django to use the Django admin login page for now.
Open the settings.py file and add the following line:
LOGIN_URL = '/admin/login'
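Django remembers the originally requested page by appending it to the login URL as a query parameter, which is named "next" by default. The following standard-library sketch mimics the shape of that redirect. `login_redirect_url` is a hypothetical helper for illustration and is not Django's own code.

```python
from urllib.parse import urlencode


def login_redirect_url(login_url, requested_path, redirect_field_name="next"):
    """Build the address an anonymous user is sent to, carrying the page
    they originally asked for so they can be returned to it after login."""
    query = urlencode({redirect_field_name: requested_path}, safe="/")
    return f"{login_url}?{query}"


print(login_redirect_url("/admin/login", "/userdata"))
# prints /admin/login?next=/userdata
```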
Finally, our Django application, which by default had only one way of authenticating a user - by the default User model - needs to be told to also recognize those users who have identified themselves via OAuth. So, we add the following lines to settings.py:
AUTHENTICATION_BACKENDS = (
    'oauth2_provider.backends.OAuth2Backend',
    'django.contrib.auth.backends.ModelBackend'
)
That's the last change to be made to Django.
We are now ready to bring in the last piece of the puzzle - JupyterHub itself! Start by installing it and the Jupyterhub extension that supports OAuth framework.
$ pip install jupyter==1.0.0 jupyterhub==1.0.0 oauthenticator==0.9.0
Now, alongside our parent Django project folder service_provider we are going to create a new folder called hub_config where our JupyterHub config files will be kept and from where JupyterHub will be launched.
$ cd <path/to/django-oauth-jupyterhub-demo>
$ mkdir hub_config
$ cd hub_config
Next, we are going to create a file called jupyterhub_config.py which will contain the following code dump which I've explained with inline comments.
# This is how we tell Jupyter to use OAuth instead of the default
# authentication which is done using local Linux user accounts.
c.JupyterHub.authenticator_class = 'oauthenticator.generic.GenericOAuthenticator'

# Where should Django pass the authentication results back to?
c.GenericOAuthenticator.oauth_callback_url = 'http://localhost:8010/hub/oauth_callback'

# What are the client ID and client secret for JupyterHub provided by Django?
# (Replace these with the values you copied earlier.)
c.GenericOAuthenticator.client_id = 'irhIz1p3G8lyiBDWv66LzuwLacyV1i98jJP0qXQx'
c.GenericOAuthenticator.client_secret = 'tidEvFtozIJTTIfmHqkBEnlEtFl0Wd3tB7WnD2EvXDkRkk36Lphr5N3RoPaJhuJBaSuQ2j3WZSF7OrCrdGwG9ejEWty1VNgkjon3EyTdKpeBXVLw8q4nk0szvU3tHUx6'

# Where can JupyterHub get the token from?
c.GenericOAuthenticator.token_url = 'http://localhost:8000/o/token/'

# Where can it get the user name from? What method shall it use?
# What key in the JSON output is the username?
c.GenericOAuthenticator.userdata_url = 'http://localhost:8000/userdata'
c.GenericOAuthenticator.userdata_method = 'GET'
c.GenericOAuthenticator.userdata_params = {}
c.GenericOAuthenticator.username_key = 'username'

# What address will JupyterHub be accessed from?
c.JupyterHub.bind_url = 'http://localhost:8010'

# By default JupyterHub requires that a Linux user exist for every
# authenticated user. For testing, we are going to trick JupyterHub
# to merely pretend that such a user exists and launch notebook servers
# for the same user running the hub process itself!
from jupyterhub.spawner import LocalProcessSpawner


class SameUserSpawner(LocalProcessSpawner):
    """Local spawner that runs single-user servers as the same user as the Hub itself.

    Overrides user-specific env setup with no-ops.
    """

    def make_preexec_fn(self, name):
        """no-op to avoid setuid"""
        return lambda: None

    def user_env(self, env):
        """no-op to avoid setting HOME dir, etc."""
        return env


c.JupyterHub.spawner_class = SameUserSpawner
Wow! That was a lot. Take some time to read the settings and absorb them!
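Hardcoding the client ID and secret is fine for a local demo, but you may prefer to keep them out of the config file. One possible approach, sketched below, is to read them from environment variables. The variable names DJANGO_OAUTH_CLIENT_ID and DJANGO_OAUTH_CLIENT_SECRET and the helper `load_oauth_credentials` are my own inventions for this sketch and are not recognized by JupyterHub or oauthenticator.

```python
import os

# Hypothetical variable names chosen for this sketch.
ID_VAR = "DJANGO_OAUTH_CLIENT_ID"
SECRET_VAR = "DJANGO_OAUTH_CLIENT_SECRET"


def load_oauth_credentials(environ=os.environ):
    """Return (client_id, client_secret) read from the given environment mapping,
    failing early with a clear message if either one is missing."""
    missing = [name for name in (ID_VAR, SECRET_VAR) if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ[ID_VAR], environ[SECRET_VAR]


# Inside jupyterhub_config.py one could then write:
#   client_id, client_secret = load_oauth_credentials()
#   c.GenericOAuthenticator.client_id = client_id
#   c.GenericOAuthenticator.client_secret = client_secret
```

Failing early like this means a forgotten variable shows up when the hub starts rather than as a confusing error in the middle of a login.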
Now, keep the Django server running as is! Next, we are going to launch JupyterHub, but JupyterHub requires some more pieces of info in the form of environment variables: the URL in Django which will authorize JupyterHub as an application and the URL which gives the token. So, inside the hub_config folder we will create a shell script called launch.sh that initializes these variables and then launches the hub.
#! /bin/bash

export OAUTH2_AUTHORIZE_URL="http://localhost:8000/o/authorize"
export OAUTH2_TOKEN_URL="http://localhost:8000/o/token/"

jupyterhub -f jupyterhub_config.py
Let's launch!
$ chmod u+x launch.sh
$ ./launch.sh
And now test! Head to http://localhost:8010. Click on the button "Sign In With GenericOAuth2". If you are logged into the admin console already, then you should be taken to a page where you click Authorize. If you are not already logged in, the Login page will appear first, after which the page where you click Authorize will present itself. Once you click Authorize, your Jupyter notebook should launch.
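Behind that Authorize click, Django sends the browser back to the callback URL with a short-lived authorization code, which the hub then exchanges for a token at the token URL. The sketch below illustrates those two steps with the standard library only. It is a simplified picture of the standard authorization code flow and not the actual oauthenticator implementation, which may send the credentials differently. Both function names and all values are placeholders.

```python
from urllib.parse import urlsplit, parse_qs, urlencode


def parse_callback(callback_url, expected_state):
    """Pull the authorization code out of the callback URL, checking that
    the state value matches the one sent with the original request."""
    params = parse_qs(urlsplit(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch")
    if "code" not in params:
        raise ValueError("no authorization code in callback")
    return params["code"][0]


def token_request_body(code, redirect_uri, client_id, client_secret):
    """Form-encode the fields a client posts to the token URL."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    })


# Placeholder values for illustration only.
callback = "http://localhost:8010/hub/oauth_callback?code=abc123&state=xyz"
code = parse_callback(callback, expected_state="xyz")
body = token_request_body(
    code,
    redirect_uri="http://localhost:8010/hub/oauth_callback",
    client_id="example-client-id",
    client_secret="example-client-secret",
)
print(body)
```

With the token in hand, the hub calls the userdata URL and reads the username from the JSON it returns.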
That's it! JupyterHub has successfully learned how to authenticate a user using user account information stored in your Django application!
Remember: Source Code Available Here.
Note: In the application that I actually built I had to containerize both the Django application as well as JupyterHub and allow the latter to launch per user notebook servers as containers. I'll try to bring this out in a future tutorial.