Fault-Tolerant Data Pipelines with Prefect Cloud
Overview
This tutorial explains how to use Prefect Cloud and Saturn Cloud together.
The tutorial “Scheduled Data Pipelines” introduces how to build data pipelines using prefect, and how to speed them up by executing them on a Saturn Dask Cluster. If you are not familiar with prefect yet, consider reading that article first and then coming back to this one.
If you are not familiar with Prefect Cloud or want a deeper understanding of how the integration between Saturn Cloud and Prefect Cloud works, see “Prefect Cloud”.
For this tutorial, we’ll create a flow that mimics the process of getting a batch of records, using a machine learning model to score on them, and capturing metrics.
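To make the shape of that flow concrete, here is a minimal sketch in plain Python. It is not the notebook's actual code: the function names, the record fields, and the threshold "model" are all hypothetical, and in a prefect flow each step would presumably be a task rather than a bare function.

```python
import random
from statistics import mean


def get_batch(batch_size, seed=0):
    """Hypothetical stand-in for fetching a batch of records."""
    rng = random.Random(seed)
    return [
        {"id": i, "feature": rng.random(), "label": rng.randint(0, 1)}
        for i in range(batch_size)
    ]


def score(records, threshold=0.5):
    """Hypothetical stand-in for a model: predict 1 when the feature exceeds the threshold."""
    return [1 if r["feature"] > threshold else 0 for r in records]


def capture_metrics(records, predictions):
    """Summarize one run: batch size and accuracy against the labels."""
    correct = [int(r["label"] == p) for r, p in zip(records, predictions)]
    return {"n_records": len(records), "accuracy": mean(correct)}


def run_pipeline(batch_size=100):
    """Run the three steps in order, as the flow would on each scheduled run."""
    records = get_batch(batch_size)
    predictions = score(records)
    return capture_metrics(records, predictions)


if __name__ == "__main__":
    print(run_pipeline())
```

Each function depends only on the output of the previous one, which is the property that lets an orchestrator run, retry, and log the steps individually.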
Set Up a Prefect Cloud Account
To begin this tutorial, you’ll need an existing Prefect Cloud account. Prefect Cloud’s free tier allows you to run a limited number of flows, so you can run this tutorial without spending any money on Prefect Cloud!
- Sign up at https://www.prefect.io/cloud/
- Once logged in, create a project. For the purpose of this tutorial, call it dask-iz-gr8.
- Following the Prefect documentation, create a RUNNER token and a USER token. Store these for later.
  - RUNNER token: must be created by an admin. Allows an agent to communicate with Prefect Cloud
  - USER token: allows a user to register new flows with Prefect Cloud
Create a Prefect Cloud Agent in Saturn
Prefect Cloud “agents” are always-on processes that poll Prefect Cloud and ask “want me to run anything? want me to run anything?”. In Saturn Cloud, you can create these agents with a few clicks and let Saturn handle the infrastructure.
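The polling pattern itself can be sketched in a few lines of Python. This is an illustration of the idea only, not how the Prefect agent is implemented: poll_for_work, fetch_ready_runs, and the in-memory queue are hypothetical names, and a real agent loops indefinitely rather than stopping after max_polls.

```python
import time
from collections import deque


def poll_for_work(fetch_ready_runs, execute, poll_interval=0.0, max_polls=3):
    """Ask for ready flow runs a fixed number of times and execute any found.

    A real agent would loop forever; max_polls keeps this sketch finite.
    """
    executed = []
    for _ in range(max_polls):
        for run in fetch_ready_runs():
            execute(run)
            executed.append(run)
        time.sleep(poll_interval)
    return executed


# Hypothetical in-memory queue standing in for the scheduler's list of ready runs.
pending = deque(["flow-run-1", "flow-run-2"])


def fetch_ready_runs():
    """Return at most one pending run per poll, or an empty list."""
    return [pending.popleft()] if pending else []


if __name__ == "__main__":
    done = poll_for_work(fetch_ready_runs, execute=lambda run: print("running", run))
    print(done)
```

Because the agent initiates every request, it only needs outbound access to Prefect Cloud, which is one reason the pattern suits an agent running inside someone else's infrastructure.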
- Log in to the Saturn UI as an admin user.
- Navigate to the “Credentials” page and add a Prefect runner token.
  - Type: Environment Variable
  - Shared With: your user only
  - Name: prefect-runner-token
  - Variable Name: PREFECT_RUNNER_TOKEN
  - Value: the RUNNER token you created during setup
- Navigate to the “Prefect Agents” page. Create a new agent.
  - Name: test-prefect-agent
  - Prefect Runner Token: select the PREFECT_RUNNER_TOKEN you created earlier
- Start that Prefect Agent by clicking the play button.

After a few minutes, your agent will be ready!
Click on the Agent’s status to see the logs for this agent.

In the Prefect Cloud UI, you should see a new KubernetesAgent up and running!

Create and Register a Flow
Now that you’ve created an account in Prefect Cloud and set up an agent in Saturn to run the work there, it’s time to create a flow!
- Return to the Saturn UI.
- Navigate to the “Credentials” page and add a Prefect USER token.
  - Type: Environment Variable
  - Name: prefect-user-token
  - Variable Name: PREFECT_USER_TOKEN
  - Value: the USER token you created during setup
- Navigate to the “Projects” page and create a new Project with the following specs.
  - Name: test-prefect
  - Image: any of the available non-GPU saturncloud/saturn images you want
  - Workspace Settings
    - Hardware, Disk Space, Shutoff After: keep the defaults
  - Environment Variables
    - PREFECT_CLOUD_PROJECT_NAME=dask-iz-gr8
  - Start script
    - pip install --upgrade dask-saturn prefect-saturn
- Once the Project is created, start its Jupyter server by clicking the play button.
- Once that Jupyter server is ready, click “Jupyter Lab” to launch Jupyter Lab.
- In Jupyter Lab, open a terminal and run the code below to fetch the example notebook that accompanies this tutorial.
    cd /home/jovyan/project/
    EXAMPLE_REPO_URL=https://raw.githubusercontent.com/saturncloud/examples/main/examples/examples-cpu/prefect/
    wget ${EXAMPLE_REPO_URL}/prefect-cloud-scheduled-scoring.ipynb
- In the file browser in the left-hand navigation, double-click that notebook to open it. Follow the instructions in it and run the cells in order. Return to this article when you’re done.
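If a notebook cell fails while talking to Prefect Cloud, one thing worth ruling out is a missing environment variable. The sketch below assumes the credential and Project settings configured above are exposed to the Jupyter server as PREFECT_USER_TOKEN and PREFECT_CLOUD_PROJECT_NAME; the missing_variables helper is hypothetical and not part of the notebook.

```python
import os

# Variable names taken from the setup steps in this tutorial.
REQUIRED_VARIABLES = ("PREFECT_USER_TOKEN", "PREFECT_CLOUD_PROJECT_NAME")


def missing_variables(environ=None, required=REQUIRED_VARIABLES):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

The check reports only variable names and never prints their values, so it is safe to run in a notebook whose output might be shared.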
Inspect Flow Runs
Now that your flow has been created and registered with both Saturn Cloud and Prefect Cloud, you can track its progress in the Prefect Cloud UI.
- In the Prefect Cloud UI, go to Flows --> ticket-model-evaluation. Click Schematic to see the structure of the pipeline.

- Click Logs to see logs for this flow run.
  - From this page, you can search the logs, sort them by level, and download them for further analysis.

- In the Saturn Cloud UI, navigate to the Project’s details page. You should see that a new Dask cluster has been created for this flow, with a name like p-c93609. Click the dashboard URL to monitor the activity in the cluster.

- In the Saturn Cloud UI, navigate back to the Prefect Agents page. Click the running status for the test-prefect-agent agent you previously set up. You should see new log messages confirming that the agent has received a flow to run.

Clean Up
The flow created in this tutorial is set to run every 10 minutes. Once you’re done with this tutorial, be sure to tear everything down!
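A 10-minute interval adds up quickly: 6 runs per hour, or 144 per day, each counting against the free tier's limit. The sketch below computes upcoming run times for a fixed interval using only the standard library; the upcoming_runs helper and the start time are hypothetical, and the flow's real schedule is defined in the notebook.

```python
from datetime import datetime, timedelta


def upcoming_runs(start, interval=timedelta(minutes=10), count=3):
    """Return the next `count` run times on a fixed interval after `start`."""
    return [start + interval * i for i in range(1, count + 1)]


if __name__ == "__main__":
    # Arbitrary start time, chosen only for illustration.
    for run_time in upcoming_runs(datetime(2021, 1, 1, 9, 0)):
        print(run_time.isoformat())

    # Number of 10-minute intervals in one day.
    print(timedelta(days=1) // timedelta(minutes=10))
```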
In Prefect Cloud
- Navigate to Flows. Delete the ticket-model-evaluation flow.
In Saturn Cloud
- Logged in as the user who created the flow, navigate to the Project’s details page. Click the delete button on this flow’s Dask cluster to stop and delete it.
- Click the delete button to stop and delete the Jupyter server you used to create the flow.
- Logged in as the user you used to create a Prefect agent, navigate to the Prefect Agents page. Click the delete button to stop and delete the Prefect agent.
- Navigate to the Credentials page. Remove the credentials PREFECT_RUNNER_TOKEN and PREFECT_USER_TOKEN.
Learn and Experiment!
In this tutorial, you learned how to use Prefect Cloud to manage a prefect flow, and how to improve the speed and environment management of that flow using a Saturn Cloud Dask cluster.
To learn more about prefect-saturn, see https://github.com/saturncloud/prefect-saturn.
To learn more about Prefect Cloud, see https://docs.prefect.io/orchestration/.
If you have any other questions or concerns, send us an email at support@saturncloud.io.