Using Dask Outside of Saturn
Connecting to Dask Clusters
For security, Saturn tightly controls network traffic to prevent unauthorized access to resources running in Saturn. By default, network traffic from outside Kubernetes cannot reach any Dask clusters.
dask-saturn has an ExternalConnection object which tells Saturn to open up a connection between your Dask cluster and outside network traffic, secured with a TLS handshake.
The Code
from dask_saturn.external import ExternalConnection
from dask_saturn import SaturnCluster
from dask.distributed import Client
user_api_token = ...
base_url = ...
project_id = ...
conn = ExternalConnection(
    project_id=project_id,
    base_url=base_url,
    saturn_token=user_api_token,
)
cluster = SaturnCluster(external_connection=conn)
c = Client(cluster)
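The three placeholders above need real values before the snippet will run. One way to avoid hard-coding them is to read them from environment variables. The following is only a sketch: the helper `load_saturn_settings` and the variable names `SATURN_DEMO_BASE_URL`, `SATURN_DEMO_TOKEN`, and `SATURN_DEMO_PROJECT_ID` are hypothetical and are not defined by dask-saturn.

```python
import os


def load_saturn_settings(env=os.environ):
    """Collect the three connection values from environment variables.

    The variable names below are hypothetical; use whatever fits your setup.
    """
    names = {
        "base_url": "SATURN_DEMO_BASE_URL",
        "user_api_token": "SATURN_DEMO_TOKEN",
        "project_id": "SATURN_DEMO_PROJECT_ID",
    }
    # Fail early, naming every variable that is unset or empty.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

The returned dictionary can then supply `base_url`, `user_api_token`, and `project_id` in the connection code.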
There are 3 pieces of information needed to connect.
base_url
The base_url is the URL of the Saturn deployment. For example, for Saturn Hosted, it’s https://app.community.saturnenterprise.io/
user_api_token
user_api_token is an API token that tells Saturn who you are. If you are logged in to Saturn, you can retrieve it in JSON form at the following URL: ${base_url}/api/user/token
For example, on Saturn Hosted, the URL is https://app.community.saturnenterprise.io/api/user/token and it returns something like:
{"token": "8108a9955a4747559df869a5ce3e1a5f"}
Here, 8108a9955a4747559df869a5ce3e1a5f is my user_api_token.
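If you copy the JSON response out of the browser, the token can be pulled out with the standard library. This is a minimal sketch that only parses a response body you already have; the helper name `extract_token` is hypothetical, and the sample body is the one shown above.

```python
import json


def extract_token(response_body: str) -> str:
    """Return the value of the "token" field from the JSON response body."""
    payload = json.loads(response_body)
    token = payload.get("token") if isinstance(payload, dict) else None
    if not isinstance(token, str) or not token:
        raise ValueError("response does not contain a 'token' string")
    return token


# Sample body copied from the example response above.
sample = '{"token": "8108a9955a4747559df869a5ce3e1a5f"}'
user_api_token = extract_token(sample)
```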
project_id
The project_id can be retrieved from the URL of the project details page. For me, that page is at https://app.internal.saturnenterprise.io/dash/projects/cc231aee63504395a1bd1fa93cd6e5d9, so my project_id is cc231aee63504395a1bd1fa93cd6e5d9.
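The project_id is the path segment that follows `projects` in that URL, so it can also be extracted programmatically. The sketch below assumes the URL has the same shape as the example; `project_id_from_url` is a hypothetical helper, not part of dask-saturn or saturn-client.

```python
from urllib.parse import urlparse


def project_id_from_url(url: str) -> str:
    """Return the path segment that follows 'projects' in a project details URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "projects" not in parts:
        raise ValueError("URL does not contain a 'projects' path segment")
    idx = parts.index("projects")
    if idx + 1 >= len(parts):
        raise ValueError("URL has no project id after 'projects'")
    return parts[idx + 1]


project_id = project_id_from_url(
    "https://app.internal.saturnenterprise.io/dash/projects/cc231aee63504395a1bd1fa93cd6e5d9"
)
```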
Can I use this without Jupyter?
Yes! Using saturn-client, you can create Dask clusters from Python running on your laptop, or anywhere. saturn-client will still create a Saturn project, which has a Jupyter instance you can use if you’d like, but if you don’t spin it up, it consumes no resources and costs nothing. More details coming soon.
This is Bleeding Edge
You’ll have to grab saturn-client from GitHub in order to test this out. If you’re using this functionality, please let us know so we can work with you as we improve the API.
The Code
We won’t cover all of saturn-client here, but this should give you a sense of how it works.
Creating a new Dask cluster
Saturn works with projects, and a project contains a number of resources, including Dask clusters. To create a new Dask cluster, just create a new project. The code below creates a new project every time it runs: if you run it 5 times, you’ll end up with 5 projects (and 5 Dask clusters).
from saturn_client import SaturnConnection
saturn_connection = SaturnConnection(base_url, user_api_token)
project = saturn_connection.create_project(
    name="just-dask",
    image_uri="saturncloud/saturn:2020.10.23",
)
project_id = project['id']
Updating an existing Dask cluster
The code in the previous section creates a project and retrieves its project_id. You’ll need to supply values for base_url and user_api_token. With these values, you can connect to your Dask cluster using ExternalConnection, as shown earlier.
You can also retrieve existing projects via the API and update a project.
projects = saturn_connection.list_projects()
project_id = [x for x in projects if x['name'] == 'just-dask'][0]['id']
saturn_connection.update_project(project_id, image_uri="saturncloud/saturn:2020.11.30")
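Because every call to create_project makes a new project, several projects can share one name, and the `[0]` lookup above silently picks the first match (or raises IndexError when there is none). A slightly more defensive lookup might look like the sketch below. It assumes, as the snippet above does, that list_projects returns a list of dicts with 'name' and 'id' keys; `find_project_id` is a hypothetical helper, not part of saturn-client.

```python
def find_project_id(projects, name):
    """Return the id of the single project called `name`.

    `projects` is assumed to be a list of dicts with 'name' and 'id' keys.
    """
    matches = [p["id"] for p in projects if p.get("name") == name]
    if not matches:
        raise LookupError(f"no project named {name!r}")
    if len(matches) > 1:
        raise LookupError(f"{len(matches)} projects named {name!r}; pick one by id")
    return matches[0]
```

With a helper like this, `project_id = find_project_id(saturn_connection.list_projects(), "just-dask")` would fail loudly instead of updating an arbitrary project.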