What is Dask?

Dask is a flexible library for parallel computing in Python, built on a highly optimized distributed graph execution framework. The community has implemented the tools you love (Pandas, NumPy, Scikit-Learn) on top of this scalable foundation, so you can scale them without having to learn a whole new toolset.

Written in Python

Dask is written in Python and interoperates well with C/C++/Fortran/LLVM or other natively compiled code linked through Python. Spark is written in Scala with some support for Python and R, but in practice that support is a gateway to dealing with a lot of Scala and Java. Some may think this is a good thing. They would be wrong.

PyData Ecosystem

Dask is a component of the larger Python ecosystem. It couples with and enhances other libraries like NumPy, Pandas, and Scikit-Learn. Anything you can do from Python is fairly easy to do within Dask. Dask lets you work at scale with the tools you already use.
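As a rough illustration of that coupling, the sketch below wraps an ordinary NumPy array in a Dask array and computes a mean using the same method name NumPy uses. The array size and chunk size are arbitrary values picked for the example, not recommendations.

```python
import numpy as np
import dask.array as da

# An ordinary in-memory NumPy array (size chosen arbitrarily for illustration).
x_np = np.arange(1_000_000, dtype="float64")

# Wrap it in a Dask array split into 10 chunks of 100,000 elements each.
x = da.from_array(x_np, chunks=100_000)

# Same method name as NumPy; this only builds a lazy task graph.
lazy_mean = x.mean()

# .compute() executes the graph and returns a concrete value.
dask_mean = float(lazy_mean.compute())
numpy_mean = float(x_np.mean())

print(dask_mean, numpy_mean)
```

The point of the sketch is that the calling code looks like NumPy, while the work is split across chunks that Dask can schedule in parallel.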

Use Cases

Spark is more focused on traditional business intelligence operations like SQL and lightweight machine learning. Dask is applied more generally, both to business intelligence applications and to a number of scientific applications, including machine learning and linear algebra. Because Dask supports generic distributed graph evaluation, it isn't limited to what can be done efficiently using Spark's Map-Shuffle-Reduce paradigm.
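To make "generic graph evaluation" a little more concrete, here is a minimal sketch using `dask.delayed` to build a small, irregular task graph that does not follow a map, shuffle, reduce shape. The functions `load`, `clean`, and `combine` are hypothetical stand-ins invented for this example; they are not part of Dask.

```python
from dask import delayed


@delayed
def load(n):
    # Hypothetical stand-in for reading one partition of data.
    return list(range(n))


@delayed
def clean(values):
    # Keep only the even values.
    return [v for v in values if v % 2 == 0]


@delayed
def combine(a, b):
    # Merge two branches that went through different processing steps.
    return sum(a) + sum(b)


# Build the graph: one branch is cleaned, the other is used as loaded.
left = clean(load(10))
right = load(5)
total = combine(left, right)

# Nothing has run yet; .compute() executes the whole graph.
result = total.compute()
print(result)
```

Each call above only records a node in the graph. Dask decides at `.compute()` time which tasks can run in parallel, which is what lets arbitrary dependency shapes be scheduled.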

Pricing

Saturn manages Jupyter, Spark, and Dask deployments so you can focus on data science.

Free Trial

14 Days
  • Use Jupyter in the cloud.
  • Publish Jupyter notebooks to the public.
Sign Up

Basic Plan

$0 plus the cost of compute
  • Use Jupyter in the cloud.
  • Publish Jupyter notebooks to the public or select individuals.
  • Scale up to 8 cores, 64 GB of RAM, and GPUs.
  • In-depth expense reports so you understand your cloud spend.
Sign Up

Enterprise

  • Use Jupyter in the cloud.
  • Publish Jupyter notebooks to the public or select individuals.
  • Scale up to any size you want, GPUs included.
  • In-depth expense reports so you understand your cloud spend.
  • Custom environments and configuration for your entire team.
  • One-click Spark and Dask clusters.
  • Automatic version control means you can always roll back your work to any point in time.
  • On-premises deployment inside your AWS/GCP/Azure account or VPC.
Contact Us