Build a bridge from your data lake to all your databases and data sources.
Conduct two-way universal translation using native SQL or Pig to generate Spark jobs.
Accelerate queries with advanced pushdown and cost-based optimization.
Grunion is a query pushdown optimization and translation framework built on top of Apache Calcite and integrated into Apache Spark. Grunion supercharges your Spark SQL queries and Spark Dataset and DataFrame operations with no code changes. You can connect your applications to Grunion using its JDBC driver or a standalone REST API.
With Grunion, you can accelerate your entire data science and engineering pipeline. Ease the handoff between data scientists and engineers by translating native SQL queries or legacy Pig and Hive scripts into Spark jobs. Avoid expensive ETL by pushing down complex functions, aggregations, and joins into SQL and NoSQL databases. Optimize complicated join orders.
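To make "pushing down" concrete, here is a minimal sketch of the idea. It does not use Grunion's API. An in-memory SQLite database stands in for a remote data source, and the table name `orders`, its columns, and both helper functions are hypothetical names chosen for illustration. The sketch compares aggregating on the client after fetching every row with letting the database run the aggregation and return only the grouped result.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for a remote database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 10), ("west", 20), ("east", 30), ("west", 5), ("north", 7)],
)


def totals_without_pushdown(conn):
    """Fetch every row, then aggregate in the client."""
    rows = conn.execute("SELECT region, amount FROM orders").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals, len(rows)


def totals_with_pushdown(conn):
    """Let the database aggregate; only one row per group is returned."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


client_side, rows_moved_client = totals_without_pushdown(conn)
pushed_down, rows_moved_pushed = totals_with_pushdown(conn)

print(client_side == pushed_down)            # same answer either way
print(rows_moved_client, rows_moved_pushed)  # rows transferred in each case
```

Both approaches produce the same totals, but the pushed-down version transfers one row per group instead of one row per record. On large tables, that difference in data movement is what a pushdown optimizer aims to exploit.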
Grunion builds on top of the best engines and databases available today. Instead of competing with them, it aims to bridge the gaps between them to unlock their full potential. For example, on the TPC Benchmark™ DS (TPC-DS), it can fully push down and parallelize Spark SQL queries against a relational database management system to achieve a 10-30x performance improvement compared to Spark alone.