Alleviating Data Science Pain | Saturn Cloud & AWS Solutions for Teams
In this webinar, we’ll show how Saturn Cloud and Amazon Web Services address common data science infrastructure problems for teams with flexible compute resources, collaborative workspaces, and end-to-end tools.
Data Science at Internet Scale with GPUs in the Cloud
Join this webinar to hear from data scientists at NVIDIA, DoorDash, and Saturn Cloud as they discuss how data science on the GPU is a game-changer for your organization. We will discuss RAPIDS, a suite of open-source packages for data science on GPUs, and show a demo of how model inference with RAPIDS provides significant performance gains over scikit-learn on CPUs.
Workshop: Speeding Up Transfer Learning
Join our upcoming workshop to learn best practices for faster, better performance on transfer learning and deep learning modeling tasks, how to train a computer vision model on a multi-machine GPU cluster using PyTorch, and more.
Workshop: Scaling LightGBM training with Dask on Saturn Cloud
In this workshop, attendees will get an introduction to LightGBM, a popular lightweight gradient-boosted decision tree (GBDT) library. This introduction will cover GBDTs generally and LightGBM specifically. It will also describe which parts of a GBDT can be parallelized and how GBDT training works with multiple machines.
Workshop: Introduction to PyTorch with Dask: Batch Inference
In this hands-on workshop, attendees will have the opportunity to see how image classification tasks in PyTorch can be easily parallelized using Dask clusters on Saturn Cloud.
Machine Learning Without Limits: Snowflake and Python Together In The Cloud
In this webinar, we will introduce how data scientists can utilize Snowflake and Saturn Cloud together for their machine learning workloads and more.
Workshop: Escaping MemoryError: Machine Learning on Big Data with Dask
In this hands-on workshop, attendees will be introduced to Dask, a Python-native parallel computing framework. Dask extends traditional Python tools to operate at scale across a cluster of machines, removing memory and compute limitations. Instructors will walk step-by-step through setting up a Dask cluster, processing large datasets efficiently, and performing machine learning model training across the cluster.
Workshop: Introduction to PyTorch with Dask
In this hands-on workshop, attendees will learn how Dask and parallelization can be incorporated into standard PyTorch workflows to speed up inference and training and to improve training results. Instructors will walk step-by-step through how to run two types of computer vision jobs on GPU Dask clusters: large batch inference and transfer learning.
Accelerating XGBoost With Python
Join us for an interactive discussion with Aaron Richter, Senior Data Scientist at Saturn Cloud, and Mike McCarty, Director of Software Engineering at Capital One. We will discuss all things XGBoost, including packages, methods, and tips for accelerating XGBoost performance in Python.
Parallel Processing in Python
This talk covers the current landscape of parallel processing tools in Python, with a focus on which tools are best suited for various workloads such as arrays, dataframes, machine learning, and deep learning.
Workshop: Scaling Machine Learning in Python
In this hands-on workshop, you’ll have the opportunity to see how a standard data science and machine learning workflow, using pandas and scikit-learn, can easily be parallelized using Dask clusters. Instructors will walk step-by-step through how to migrate existing Python code to Dask, an open-source framework enabling parallelization of Python.
Data & AI Accessibility: The Democratization of Data Science
Join Saturn Cloud’s Senior Data Scientist, Aaron Richter, and Travis Oliphant, CEO of OpenTeams and Quansight, for an interactive discussion covering the creation of NumPy and the start of the OSS PyData community and projects, how the data/AI ecosystem has changed over the last 10-20 years, Dask and Numba, how OSS tools will continue to be well-maintained moving forward, and more.
Next-Generation Big Data Pipelines with Prefect and Dask
Data pipelines are crucial to an organization’s data science efforts. They ensure data is collected and organized in a timely and accurate manner, and is made available for analysis and modeling. In this talk, we’ll introduce the next-generation stack for big data pipelines built upon Prefect and Dask, and compare it to popular tools like Spark, Airflow, and the Hadoop ecosystem.
The Future Of High Performance Computing
Meet the engineering leads behind RAPIDS in an interactive discussion, with an opportunity to ask questions during the event.
100x Faster Compute: Scaling Python For Data Science on AWS
Learn how to run data science workloads up to 100x faster in Python with Dask and RAPIDS, and understand the infrastructure necessary to launch high-performance clusters and GPU machines on AWS.