Using Dask Outside of Saturn

Saturn manages Dask clusters that can be accessed securely from outside of Saturn.

Connecting to Dask Clusters

For security, Saturn tightly controls network traffic to prevent unauthorized access to resources running in Saturn. By default, network traffic from outside Kubernetes cannot reach any Dask clusters.

dask-saturn provides an ExternalConnection object that tells Saturn to open a connection between your Dask cluster and outside network traffic, secured with a TLS handshake.

The Code

from dask_saturn.external import ExternalConnection
from dask_saturn import SaturnCluster
from dask.distributed import Client


user_api_token = ...
base_url = ...
project_id = ...

conn = ExternalConnection(
    project_id=project_id,
    base_url=base_url,
    saturn_token=user_api_token
)

cluster = SaturnCluster(external_connection=conn)
c = Client(cluster)

There are 3 pieces of information needed to connect.

base_url

The base_url is the URL of the Saturn deployment. For example, for Saturn Hosted it is https://app.community.saturnenterprise.io/

user_api_token

user_api_token is an API token that tells Saturn who you are. If you are logged in to Saturn, you can retrieve it in JSON form at ${base_url}/api/user/token. For example, on Saturn Hosted the URL is https://app.community.saturnenterprise.io/api/user/token, and it returns something like:
{"token": "8108a9955a4747559df869a5ce3e1a5f"}

Here, 8108a9955a4747559df869a5ce3e1a5f is my user_api_token.
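
The two steps above (building the token URL, then reading the token out of the JSON) can be sketched as follows. The helper names token_url and parse_token are hypothetical and are not part of dask-saturn or saturn-client. Fetching the URL is left out because it depends on your logged-in session.

```python
import json


def token_url(base_url: str) -> str:
    """Build ${base_url}/api/user/token, tolerating a trailing slash."""
    return base_url.rstrip("/") + "/api/user/token"


def parse_token(body: str) -> str:
    """Extract the token from a JSON body shaped like {"token": "..."}."""
    return json.loads(body)["token"]


# Example with the values shown in the text above
print(token_url("https://app.community.saturnenterprise.io/"))
print(parse_token('{"token": "8108a9955a4747559df869a5ce3e1a5f"}'))
```

Stripping the trailing slash first means the helper works whether or not base_url ends with one.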

project_id

The project_id can be retrieved from the URL of the project details page.

For me, that page is at https://app.internal.saturnenterprise.io/dash/projects/cc231aee63504395a1bd1fa93cd6e5d9, so my project_id is cc231aee63504395a1bd1fa93cd6e5d9.
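
If you would rather not copy the ID by hand, a small helper can pull it out of the URL. This is only a sketch: it assumes the project URL always has the form .../projects/<project_id>, which is inferred from the single example above, and project_id_from_url is a hypothetical name rather than part of any Saturn library.

```python
from urllib.parse import urlparse


def project_id_from_url(url: str) -> str:
    """Return the path segment that follows 'projects' in a project URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    index = segments.index("projects")  # raises ValueError if 'projects' is absent
    return segments[index + 1]  # raises IndexError if nothing follows it


url = "https://app.internal.saturnenterprise.io/dash/projects/cc231aee63504395a1bd1fa93cd6e5d9"
print(project_id_from_url(url))
```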

Can I use this without Jupyter?

Yes! Using saturn-client, you can create Dask clusters from Python running on your laptop or anywhere else. saturn-client will still create a Saturn project, which has a Jupyter instance you can use if you’d like, but if you don’t spin it up, it consumes no resources and costs nothing. More details coming soon.

The Code

We won’t cover all of saturn-client here, but this should give you a sense of how it works.

Creating a new Dask cluster

Saturn works with projects, and a project contains a number of resources, including Dask clusters. To create a new Dask cluster, just create a new project. The code below creates a new project each time it runs: if you run it 5 times, you’ll end up with 5 projects (and 5 Dask clusters).

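from saturn_client import SaturnConnection

# base_url and user_api_token are the values described earlier
saturn_connection = SaturnConnection(base_url, user_api_token)

# Each call creates a new project (and with it a new Dask cluster)
project = saturn_connection.create_project(
    name="just-dask",
    image_uri="saturncloud/saturn:2020.10.23",
)
project_id = project["id"]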

Updating an existing Dask cluster

The code in the previous section creates a project and retrieves its project_id. You’ll need to supply values for base_url and user_api_token. With these values, you can connect to your Dask cluster as shown earlier.

You can also retrieve existing projects via the API and update a project.

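# Retrieve the list of projects
projects = saturn_connection.list_projects()
# Pick the project named 'just-dask' (raises IndexError if there is none)
project_id = [x for x in projects if x['name'] == 'just-dask'][0]['id']

# Update the project's image
saturn_connection.update_project(project_id, image_uri="saturncloud/saturn:2020.11.30")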
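
Because creating a project always makes a new one, it can help to look for an existing project by name first and only create one when none is found. The sketch below uses only the list_projects and create_project calls shown in this document. It assumes list_projects returns dicts with 'name' and 'id' keys and that create_project returns a dict with an 'id' key, as the snippets here suggest. get_or_create_project_id is a hypothetical helper, not part of saturn-client.

```python
def get_or_create_project_id(conn, name, image_uri):
    """Return the id of the project called `name`, creating it if needed.

    `conn` is expected to behave like the saturn_connection object used in
    this document (assumption: list_projects() yields dicts with 'name' and
    'id', and create_project() returns a dict with 'id').
    """
    for project in conn.list_projects():
        if project["name"] == name:
            return project["id"]
    created = conn.create_project(name=name, image_uri=image_uri)
    return created["id"]


# Usage, with a real connection:
# project_id = get_or_create_project_id(
#     saturn_connection, "just-dask", "saturncloud/saturn:2020.10.23"
# )
```

Running this repeatedly should leave you with one project called just-dask instead of several, provided no other process creates a project with the same name between the lookup and the creation.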