MotherDuck, known for its innovative cloud data platform that focuses on simplifying data management and analysis, has announced the beta release of pg_duckdb – a PostgreSQL extension that integrates DuckDB’s analytics engine directly into PostgreSQL.
This release is an open-source collaboration with Hydra and DuckDB Labs, bringing together expertise to enhance data analytics capabilities. More specifically, the release aims to enable organizations to run rapid analytical queries alongside traditional transactional workloads without requiring changes to their existing PostgreSQL infrastructure.
MotherDuck claims the integration delivers speedups of up to 1,500x for certain analytical queries and a more typical 10x for many others.
“PostgreSQL excels at transactional workloads but wasn’t specifically designed for analytics,” said Jordan Tigani, CEO and Co-Founder of MotherDuck. “With pg_duckdb, we’re bringing DuckDB’s analytical prowess directly to PostgreSQL users, allowing them to dramatically improve query performance without changing how their data is stored or updated.”
The pg_duckdb extension tackles a key challenge for PostgreSQL users who need to analyze their transactional data. While PostgreSQL excels at transactional operations such as lookups and small updates, it struggles with ad hoc analytical queries as data volumes grow and aggregations become more complex, and users often run into performance limits.
By integrating DuckDB’s analytics capabilities directly into PostgreSQL, the extension enables users to run complex queries without disrupting existing workflows or switching to a different system.
According to MotherDuck, this approach improves data analysis without altering existing systems. A notable feature of the new release is the ability to query data directly from data lakes and lakehouses, including data stored in AWS S3.
The extension lets users work with the Parquet columnar file format and the Iceberg table format, so data stored this way can be queried and analyzed efficiently from within PostgreSQL.
In addition, organizations can scale their analytics workloads using MotherDuck’s cloud resources. This feature enables users to leverage cloud computing capabilities to manage large datasets and complex queries without relying heavily on local infrastructure.
MotherDuck shared performance data showing that the improvement holds when scaling up to larger data sizes on a production machine. The company claims that on EC2 in AWS with 10 times the data, a query that takes approximately 2 hours with the native PostgreSQL engine takes only about 400 milliseconds with the pg_duckdb extension.
According to MotherDuck, even better performance is possible using columnar format instead of PostgreSQL’s row-oriented storage.
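To illustrate why storage layout matters for analytics, here is a minimal Python sketch. It is not pg_duckdb or DuckDB code, and the table, column names, and helper functions are hypothetical. It holds the same three records in a row-oriented and a column-oriented layout and computes the same aggregate over each.

```python
# Illustrative only: hypothetical data, not pg_duckdb or DuckDB internals.

# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Column-oriented layout: one list per column, built from the same records.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_row_oriented(records):
    """Sum "amount" by visiting every full record."""
    return sum(record["amount"] for record in records)


def total_column_oriented(cols):
    """Sum "amount" by reading only the one column the aggregate needs."""
    return sum(cols["amount"])


row_total = total_row_oriented(rows)
col_total = total_column_oriented(columns)
print(row_total, col_total)
```

Both functions return the same total. The difference is how much data each has to touch: the row-oriented version walks every record, while the column-oriented version reads a single column, which is the general reason columnar formats tend to suit aggregations over wide tables.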
MotherDuck’s serverless analytics platform is based on DuckDB, an open-source columnar database that has gained popularity for its user-friendly design and efficient analytical performance. By building on DuckDB’s query engine, MotherDuck allows organizations to perform analytics without the need for extensive infrastructure.
DuckDB Labs is the organization behind the development and support of DuckDB. The co-founder and CEO of DuckDB Labs, Hannes Mühleisen, was named one of BigDataWire’s People to Watch 2024.
With the rollout of the beta version, MotherDuck’s development team is now focusing on creating additional features and improvements. Users can track the progress and milestones of the next release on GitHub.
The post MotherDuck Launches Beta Extension to Enhance PostgreSQL with DuckDB Analytics appeared first on BigDATAwire.