Connect to Dask from Outside Saturn Cloud

Directions for connecting from anywhere outside of Saturn Cloud

What if you’d like to just connect directly from your laptop to a Dask cluster, instead of using a Jupyter server at all? Saturn Cloud lets you do this too!

This is a very new feature for Saturn Cloud, so you’ll have to grab dask-saturn from GitHub in order to test this out. You’ll also need to install the saturn-client library if you want to create projects from your command line.

Create the Conda Environment

In order to use this new feature, you’ll need a specific set of Python packages. To make this easy, we recommend you create a conda environment with the right specifications, as shown below. (Run this set of commands in your terminal.)

conda create -n dask-saturn dask=2.30.0 distributed=2.30.1 python=3.7
conda activate dask-saturn
pip install dask-saturn==0.2.2

Now you have the environment all set to go!

Sometimes we find that people’s local environments also have old versions of pandas, and this can be an issue. Check that your version of pandas is current.
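
One quick way to check is to print the versions that Python actually imports in the active environment. The following is a minimal sketch, not part of dask-saturn: the helper name installed_version is made up for illustration, and it relies on each package exposing a __version__ attribute, as dask, distributed, and pandas do.

```python
import importlib


def installed_version(package_name):
    """Return a package's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


# Print what the active environment would import for each package.
for name in ("dask", "distributed", "pandas"):
    print(f"{name}: {installed_version(name)}")
```

A value of None means the package is not importable from the environment you are running in, which usually means the conda environment is not activated.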

Connect to Saturn Remotely

Because this process doesn’t take you into the Saturn Cloud UI at all, you need to establish your remote connection to the service yourself. You’ll need two pieces of information to do this:

  • url: The URL of your Saturn deployment. For Saturn Hosted, it’s https://app.community.saturnenterprise.io/. If you’re using an enterprise Saturn Cloud installation, it’ll be different.

  • api_token: An API token that tells Saturn who you are. If you are logged in to Saturn, you can retrieve it in JSON form at ${url}/api/user/token. For example, on Saturn Hosted, the URL is https://app.community.saturnenterprise.io/api/user/token, and it returns something like: {"token": "8108a9955a4747559df869a5ce3e1a5f"}
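
If you copy or save that JSON response, the standard json module can pull the token out of it. This is a small sketch using the sample response shown above; the function name extract_token is hypothetical.

```python
import json


def extract_token(response_text):
    """Parse the JSON body returned by the token endpoint and return the token string."""
    payload = json.loads(response_text)
    return payload["token"]


# The sample response from the text above, written with plain ASCII quotes.
sample = '{"token": "8108a9955a4747559df869a5ce3e1a5f"}'
api_token = extract_token(sample)
print(api_token)
```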

Warning: Don’t store your token in a public place, or in plain text on sites like GitHub. Store it securely so that access to your account stays safe!
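
One common way to keep the token out of your scripts is to read it from environment variables at run time. The sketch below shows that pattern; it is not something Saturn Cloud requires, and the variable names MY_SATURN_URL and MY_SATURN_API_TOKEN are placeholders chosen for this example.

```python
import os


def load_saturn_settings(env=None):
    """Read the deployment URL and API token from environment variables.

    MY_SATURN_URL and MY_SATURN_API_TOKEN are hypothetical names for this example.
    """
    env = os.environ if env is None else env
    required = ("MY_SATURN_URL", "MY_SATURN_API_TOKEN")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Strip a trailing slash so the URL has one consistent form.
    return env["MY_SATURN_URL"].rstrip("/"), env["MY_SATURN_API_TOKEN"]


# Demonstration with an in-memory mapping instead of the real environment.
url, api_token = load_saturn_settings({
    "MY_SATURN_URL": "https://app.community.saturnenterprise.io/",
    "MY_SATURN_API_TOKEN": "example-token",
})
print(url)
```

In a real script you would call load_saturn_settings() with no arguments, after exporting both variables in your shell, and pass the results to the connection code.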

If you’re a Hosted user and you retrieved the token above, you’d run this to create your connection:

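from saturn_client import SaturnConnection

# Keep these in variables; the connection code later in this guide reuses them.
url = 'https://app.community.saturnenterprise.io'
api_token = '8108a9955a4747559df869a5ce3e1a5f'

saturn_connection = SaturnConnection(url=url, api_token=api_token)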

Generate Project

Doing this from outside the UI requires the saturn-client library mentioned above.

For this, you need to know which image you plan to use. If you’re not sure, you can view the available images inside the UI. Our documentation on images covers them in full.

project = saturn_connection.create_project(
    name='just-dask',
    image_uri='saturncloud/saturn:2021.01.13',
)
project_id = project['id']

The project_id generated is the unique identifier for your project - hang on to that!

Add Connection Code to Script

Now, we can put it all together! Take your project_id, your url, and your token, and fill them in to the chunk below.

from dask_saturn.external import ExternalConnection
from dask_saturn import SaturnCluster
from dask.distributed import Client

conn = ExternalConnection(
    project_id=project_id,
    base_url=url,
    saturn_token=api_token
)

cluster = SaturnCluster(
    external_connection=conn,
    n_workers=3,
    worker_size='8xlarge',
    scheduler_size='2xlarge',
    nthreads=32,
    worker_is_spot=False
)

c = Client(cluster)

Run the chunk, and soon you’ll see lines like this:
#> INFO:dask-saturn:Starting cluster. Status: pending

This tells you that your cluster is starting up!

Eventually you’ll see something like:
#> INFO:dask-saturn:{'tcp://10.0.23.16:43141': {'status': 'OK'}}

This tells you that your cluster is up and ready to use. Now you can interact with it the same way you would from Jupyter. If you need help with that, check out some of our tutorials, such as Training a Model with Scikit-learn and Dask, or the dask-saturn API documentation.


Optional: Get Project List

If you’re not sure what projects you have running/created, or need to check that a project exists, you can just run the following code to see everything, and get project IDs for all of them. Then, you can look for the specific project we just created.

projects = saturn_connection.list_projects()
projects
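
To pick one project out of that list, you can filter it by name. The sketch below assumes list_projects() returns a list of dictionaries that each carry "name" and "id" keys, mirroring the project['id'] lookup used earlier. That shape is an assumption, so inspect the real output before relying on it. The helper name and the sample data are made up for illustration.

```python
def find_project_ids(projects, name):
    """Return the IDs of all projects whose name matches exactly."""
    return [p["id"] for p in projects if p.get("name") == name]


# Made-up sample standing in for the output of saturn_connection.list_projects().
sample_projects = [
    {"id": "abc123", "name": "just-dask"},
    {"id": "def456", "name": "other-project"},
]
matches = find_project_ids(sample_projects, "just-dask")
print(matches)
```

Returning a list rather than a single ID keeps the helper honest if two projects happen to share a name.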

Places to connect to Saturn Cloud

Not only can you connect to Saturn Cloud from your laptop or local machine, but you can connect from other cloud-based notebooks. Check out instructions for connecting from Google Colab, SageMaker, and Azure.




Need help, or have more questions? Contact us at: We'll be happy to help you and answer your questions!