Ponder @ PyData Global 2022

Peter Olson

Nov 30, 2022

Updated after the event: Ponder featured prominently at the PyData Global 2022 sessions on December 2nd! Three Ponder employees (Rehan Durrani, Alejandro Herrera, Doris Lee) and one Ponder advisor (Matt Harrison) presented.

Here are the details with links to the recordings, session by session:

Session 1: Supercharging your Pandas workflows with Modin

Speaker: Alejandro Herrera

Summary: Data practitioners are typically forced to choose between tools that are easy to use (Pandas) and tools that are highly scalable (Spark, SQL, etc.). Modin, an open-source project originally developed by researchers at UC Berkeley, is a highly scalable, drop-in replacement for Pandas. This talk gives an overview of Modin and practical examples of how to use it to effortlessly scale up your Pandas workflows.

Learn more here.
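To make "drop-in replacement" concrete, here is a minimal sketch of the idea, not material from the talk itself. Only the import line changes; the rest is ordinary Pandas code. The `mean_fare_by_city` helper and the toy `trips` data are hypothetical, and the fallback to plain Pandas is just there so the snippet runs even where Modin is not installed. Note that Modin also needs an execution engine such as Ray or Dask to be available.

```python
# Prefer Modin when it is installed; otherwise fall back to plain pandas.
# The code below is identical either way, which is the point of a drop-in API.
try:
    import modin.pandas as pd
except ImportError:
    import pandas as pd


def mean_fare_by_city(df):
    """Hypothetical helper: average fare per city."""
    return df.groupby("city")["fare"].mean()


# Toy data, made up for illustration only.
trips = pd.DataFrame(
    {
        "city": ["NYC", "SF", "NYC", "SF"],
        "fare": [10.0, 20.0, 30.0, 40.0],
    }
)

result = mean_fare_by_city(trips)
print(result)
```

On a four-row frame there is nothing to speed up, of course. The benefit should show on datasets large enough that single-core Pandas becomes the bottleneck.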

Session 2: How to maximally parallelize the entire Pandas API

Speaker: Rehan Durrani

Summary: Pandas has rapidly become one of the most popular tools for data analysis, but it is limited by its inability to scale to large datasets. We developed Modin, a scalable, drop-in alternative to Pandas that preserves the dynamic and flexible behavior of Pandas dataframes while improving scalability. This talk walks you through our team’s research at UC Berkeley, which enabled the development of Modin. We also discuss our latest publication at VLDB, which covers a novel approach to parallelization and metadata management techniques for dataframes.

Learn more here.

Session 3: Testing Pandas: Shoots, leaves, and garbage!

Speaker: Matt Harrison

Summary: How do you structure Pandas code? How do you debug it? How do you test it? In this talk, we use real-world data to explore best practices for writing Pandas code, debugging it, managing data integrity, using pytest, and generating tests with Hypothesis. No more excuses.

Learn more here.
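For a taste of what testing Pandas code can look like, here is a small sketch in the pytest style. It is our own illustration and not taken from Matt's talk: `drop_negative_fares` is a hypothetical cleaning step, and the data is made up. The test functions use `pandas.testing.assert_frame_equal`, and pytest would collect them automatically because their names start with `test_`.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def drop_negative_fares(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning step: keep only rows with a non-negative fare."""
    return df[df["fare"] >= 0].reset_index(drop=True)


def test_drop_negative_fares():
    # The row with a negative fare should be removed and the index reset.
    raw = pd.DataFrame({"fare": [12.5, -3.0, 7.0]})
    expected = pd.DataFrame({"fare": [12.5, 7.0]})
    assert_frame_equal(drop_negative_fares(raw), expected)


def test_input_is_not_mutated():
    # Data integrity check: the cleaning step must not modify its input.
    raw = pd.DataFrame({"fare": [12.5, -3.0]})
    before = raw.copy()
    drop_negative_fares(raw)
    assert_frame_equal(raw, before)
```

Saved in a file such as `test_cleaning.py` (a name chosen here for illustration), these would run with a plain `pytest` command. Property-based tools like Hypothesis build on the same structure by generating the input frames for you.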

Session 4: How to engage in the open-source community

Speaker: Doris Lee

Description: Doris spoke to the PyData Global Impact Scholars, a program that helps people from underrepresented groups in technology and open source develop their careers and increase their professional impact. The talk is available only to this year's Impact Scholars class, so no recording is available.

Ready to level up your Pandas game?

Try Ponder now