data_manipulation

This module benchmarks several data processing libraries.

It compares the performance of Pandas, Polars and DuckDB for a common data aggregation task.

Polars is a Rust-powered DataFrame library designed for speed that brings multi-threaded execution and query optimization to Python.

Key capabilities include:

DuckDB is an embedded SQL database optimized for analytics that brings database-level query optimization to local files.

Key capabilities include:

At the end of the script, a comparison benchmark table summarizes the performance of Pandas, Polars, and DuckDB across various operations.
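One way such a table could be produced is sketched below using only the standard library. The helper names `time_call` and `format_table`, the best-of-N timing strategy, and the millisecond layout are assumptions for illustration, not the script's actual implementation.

```python
import time


def time_call(func, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def format_table(results):
    """Render {operation: {library: seconds}} as a plain-text table."""
    libraries = ["Pandas", "Polars", "DuckDB"]
    header = f"{'Operation':<16}" + "".join(f"{lib:>12}" for lib in libraries)
    lines = [header, "-" * len(header)]
    for operation, timings in results.items():
        # Timings are stored in seconds and displayed in milliseconds.
        cells = "".join(f"{timings[lib] * 1000:>10.2f}ms" for lib in libraries)
        lines.append(f"{operation:<16}{cells}")
    return "\n".join(lines)
```

In use, each cell would come from something like `time_call(lambda: run_pandas_groupby())`, where `run_pandas_groupby` is a hypothetical stand-in for one library's implementation of an operation. Taking the minimum over several runs reduces noise from other processes, at the cost of hiding cold-start effects.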