Snowflake x Ponder

We're proud to announce that Ponder will join Snowflake to supercharge data science and AI in the Snowflake Data Cloud.

Learn More

Your Python data workflows, now running on your data warehouse

Ponder brings the best of both worlds: a fully pandas-native experience that is familiar and fast to prototype, plus the scalability and reliability of operating in a data warehouse.

Data science, now with scale, security, and governance built-in

Iterate on your pandas workflows quickly, from prototype to deployment, all running securely within your cloud-native data warehouse. Boost your productivity and shorten development cycles with fast, interactive results.

Get insights faster with Ponder, which speeds up the pandas API, including groupby, mean, and hundreds of other functions.

Clean data at scale, with zero effort

Run your pandas workflows at any scale, from megabytes to terabytes, without changing a single line of code. No more painful out-of-memory errors or slow single-threaded execution.

df.merge()

df.pivot()

df.fillna()

df.describe()

df.explode()

# import pandas as pd

import modin.pandas as pd
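As a rough illustration of the import swap above, the sketch below runs a small fillna, merge, and groupby pipeline. The table names, columns, and values are made up for the example. It imports plain pandas so it runs anywhere; the assumption is that switching to the Modin import (with Modin and an engine such as Ray or Dask installed) leaves the rest of the code unchanged.

```python
# Plain pandas keeps this sketch runnable without extra setup.
# Assumed drop-in alternative: import modin.pandas as pd
import pandas as pd

# Hypothetical sample data, invented for illustration only.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [10.0, None, 30.0, 50.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["east", "west", "east"],
})

# Replace the missing amount with 0.0 before aggregating.
orders["amount"] = orders["amount"].fillna(0.0)

# Attach each order's region with a left join on customer_id.
joined = orders.merge(customers, on="customer_id", how="left")

# Average order amount per region.
mean_by_region = joined.groupby("region")["amount"].mean()
print(mean_by_region)
```

Only the import line is meant to differ between the single-machine and scaled-out versions; whether a given operation is accelerated depends on the backend in use.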

Run pandas everywhere

Don’t let your warehouse investments go to waste. Leverage your existing data warehouse as compute. No additional infrastructure setup required.

From the team that brought you the popular open-source library Modin

Ponder was founded by the creators of the popular open-source library Modin, which enables data scientists to run pandas at scale on distributed computing backends, such as Ray or Dask. Modin is embraced by the community and has seen adoption across sectors, including by the world's leading AI companies.

8M+ downloads to date

Used by 40+ companies and organizations

100+ open-source contributors

8k+ GitHub stars

Powering data teams at the Fortune 100 and more!

What others are saying about Ponder

"Data scientists hate dealing with databases, but Ponder's integration with DuckDB is helping to change that. I got early access to try Ponder and it was magical. You just import ponder, connect it to DuckDB, and run pandas. And it just works!"

Jordan Tigani

CEO and Co-Founder of MotherDuck

"Wrangling large data sets is one of my biggest model development bottlenecks - it just takes so much time even with optimized clusters! I gave Ponder a try and I was amazed at how fast it was able to handle large data sets. For instance, simple pandas/pyspark functions like value_counts/countDistinct and count would take several minutes but using Ponder, results were coming back in less than 2 milliseconds - it's mad!"

Henok Yemam

ML Scientist at Expedia Group

"At Intel, we believe that Modin is increasingly a critical component of data science and machine learning workflows. Intel is investing heavily in Modin through our Intel oneAPI toolkit to make accelerated computing accessible to all data science teams."

Areg Melik-Adamyan

Principal Engineer and Data Platform Chief Architect at Intel

"Modin allows you to use the same Pandas script for a 10KB dataset on a laptop as well as a 10TB dataset on a cluster. This is possible due to Modin’s easy to use API and system architecture. This architecture can utilize Ray as an execution engine to make scaling Modin easier."

View article

“What does Modin have to offer you as the end user? [...] it offers a very simple, drop-in replacement for pandas – you just switch your “import pandas as pd” statement with “import modin.pandas as pd” and gain better scalability for a lot of use cases.”

View article

“Data scientists who don’t necessarily want to manage OmniSci as a separate component in their workflow sometimes need the full API surface of pandas, particularly during data shaping and ingestion. Modin [...] aims to provide a drop-in (but also scalable and performant) replacement for pandas that can leverage both Ray and Dask for distributed execution.”

View article

“Data Scientists are increasingly required to do and learn more, but tools have largely lagged supporting all of these new requirements. [...] To improve data science productivity, MindsDB has teamed up with Modin to bring SQL to distributed Modin Dataframes. Now you can run SQL alongside the pandas API without copying or going through your disk.”

View article

“Data infrastructure is already Intel-optimized, and Intel has now streamlined the most popular data science and AI tools, and created new ones that help clear the path forward. [...] Modin is an open source library that accelerates the popular Pandas data library by up to 20 times.”

Watch Video

View article

Devin Petersohn
Doris Lee
Aditya Parameswaran

Grounded in cutting-edge research done at UC Berkeley

Ponder was founded by a professor and PhDs from the UC Berkeley RISE Lab. Ponder's underlying technology is based on decades of academic research and is built by the team that developed Modin, the open-source library for scalable data science.

Learn more

Latest from the blog:

News

Oct 23, 2023

🐼 ❤️ ❄️

We are excited to announce Snowflake’s intent to acquire Ponder to bring Ponder’s Python data science innovations to its customers and to accelerate the growth of the Modin community.

Articles

Oct 3, 2023

Professional Pandas: Handling Missing Data With Pandas Dropna

This is the fifth in a series of blog posts that teach how to write professional-quality pandas code. We start by discussing pandas dropna generally and going over a simple example. Then we talk about identifying missing values, when to drop data, and how to drop entire rows that are missing.

Articles

Sep 19, 2023

How To Use pandas resample on a Database

In this article, we describe pandas resample, provide some examples, and then show how you can use it at scale in your database.

Ready to level up your pandas game?

Try Ponder Now
