What are BigQuery and Redshift?
BigQuery and Redshift are two industry-leading examples of Data Warehouses.
Why is PAD built on BigQuery?
CTA believes BigQuery is a better fit for the data in the progressive workspace.
Why do we refer to BigQuery as PAD then?
PAD uses BigQuery to provide data efficiently in a user interface already familiar to many. BigQuery is trusted by many in the tech and adjacent industries and is accustomed to “Google” scale: it will calmly, quietly scale up, so you can trust that your data will be there when you need it most. Since CTA manages the setup and administration of BigQuery on behalf of our partners, we call it PAD (Progressive Action Database).
What are the benefits of BigQuery over Redshift?
BigQuery and Redshift are two industry-leading examples of Data Warehouses, but there are key distinctions that we believe make BigQuery a better fit for the progressive workspace.
These boil down to two overarching points:
- BigQuery has less overhead and scales better: Redshift is built with the expectation that a team of developers will manage resources behind the scenes; that approach is prone to falling behind at the most critical times (like GOTV) or costing too much. BigQuery was created to handle large datasets and data syncs. Because it is serverless, it scales automatically out of the box.
- Ease of Use: BigQuery works out of the box, with rich integrations and user-friendly features that make it easier to use and to manage.
BigQuery has less overhead and scales better
- Redshift expects an AWS expert to provision resources for it. BigQuery manages itself. Redshift is built for those with very predictable, large-scale workflows, where a team of expert developers can provision the exact amount of resources your organization needs ahead of time. Essentially, Redshift lets you "rent" large servers, and then your team uses those servers when they run queries. Redshift is provisioned as clusters of nodes. BigQuery is serverless: it requests computing resources on demand and distributes processing across a large number of machines working in parallel, which lets it scale up and down based on usage.
- Unless you hire an expert on controlling AWS costs, BigQuery is going to be cheaper for faster service. BigQuery scales up when things are busy and scales down when they aren’t. With Redshift, you pay for your server and the expert(s) managing that server, whether you are using it or not.
- During peak usage, Redshift buckles. BigQuery is consistent. When performing large operations (like loading a voter file, refreshing reports, or even running a poorly written query), Redshift becomes slow and even unresponsive. This means any services depending on Redshift - like Civis or your analytics dashboards - can become slow, stale, or experience outages. BigQuery is accustomed to "Google" scale: it will calmly, quietly scale up, so you can trust that your data will be there when you need it most.
- This reliability makes BigQuery much easier to build integrations with: when you or a tool’s API asks BigQuery to run a query, it executes quickly, even when other work is running at the same time. When Redshift becomes overloaded, services that depend on it can be left waiting.
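The cost argument in the bullets above can be sketched as simple arithmetic. This is a toy model under loud assumptions: every rate below is a made-up placeholder, not a real AWS or Google price, and billing by "busy hour" is a simplification that matches neither product's actual pricing. The function names are hypothetical. It only illustrates why paying for idle time matters when usage is bursty.

```python
# Toy cost model. All rates are HYPOTHETICAL placeholders, not real prices.
HOURS_PER_MONTH = 730  # average number of hours in a month


def provisioned_monthly_cost(hourly_rate: float) -> float:
    """An always-on cluster bills for every hour, busy or idle."""
    return hourly_rate * HOURS_PER_MONTH


def pay_per_use_monthly_cost(busy_hours: float, rate_per_busy_hour: float) -> float:
    """A pay-per-use service bills only for the hours in which work runs."""
    return busy_hours * rate_per_busy_hour


def break_even_busy_hours(hourly_rate: float, rate_per_busy_hour: float) -> float:
    """Busy hours per month at which both models cost the same."""
    return provisioned_monthly_cost(hourly_rate) / rate_per_busy_hour


if __name__ == "__main__":
    # Assumed: cluster at 2.0 per hour; pay-per-use at 5.0 per busy hour.
    print(provisioned_monthly_cost(2.0))        # 1460.0
    print(pay_per_use_monthly_cost(100, 5.0))   # 500.0
    print(break_even_busy_hours(2.0, 5.0))      # 292.0
```

With these placeholder numbers, the always-on cluster only wins if queries keep it busy for more than 292 hours a month. Plug in your own quotes and usage estimates before drawing any conclusion.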
BigQuery is Easier to Use
- Easy Integration with Open-source tooling. Because of BigQuery's serverless design, it is very easy to let other tools or systems integrate with the data. Google’s documentation is exceptional, its API is intuitive, and virtually every tool works out of the box with BigQuery. Open-source ETL tools like Airbyte, Airflow, and many more work seamlessly.
- Manage permissions through your Google account. Access is managed through your organization’s Google account, meaning securing BigQuery access is as easy as setting up a Gmail account. No VPNs, bastion servers, or other overhead are required.
- Built-in Analytics Features. BigQuery has built-in features that let you build powerful analytics dashboards (Looker Studio), create beautiful charts and maps, write to and read from any Google Sheet, and easily schedule repeated queries without leaving BigQuery.
- Write SQL for humans, not machines. Because BigQuery has numerous under-the-hood query optimizations, you can write code intended to be read and understood by humans, instead of being forced into performant but confusing SQL patterns. No scale is too big for BigQuery.
- Excellent geographic data support. In the political world, determining which voters fall into which precinct is an incredibly important strategic question. Tools like PostGIS are powerful, but can scale only as far as the server or computer they are hosted on. BigQuery can handle complex shapefile operations at virtually any scale, mind-bogglingly quickly. It can also produce remarkable maps through BigQuery GeoViz and Google Maps visualizations in Looker Studio.
- Cost-effective, with no management overhead. BigQuery is cost-effective for non-technical managers because you only pay for what you use; with Redshift, the balance between allocating too many resources versus too few is something you need to think about on a daily basis.
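To make the voter-to-precinct question from the geographic data bullet above concrete, here is a minimal sketch of the kind of query involved. ST_CONTAINS and ST_GEOGPOINT are BigQuery geography functions; everything else is an assumption for illustration: the table names, the column names (voter_id, longitude, latitude, precinct_id, boundary), and the helper functions themselves. It assumes precinct shapes are already loaded as a GEOGRAPHY column.

```python
def precinct_assignment_sql(voters_table: str, precincts_table: str) -> str:
    """Build a query that matches each voter to the precinct containing them.

    Assumes (hypothetically) that the voters table has voter_id, longitude,
    and latitude columns, and that the precincts table has precinct_id and a
    GEOGRAPHY column named boundary.
    """
    return f"""
SELECT v.voter_id, p.precinct_id
FROM `{voters_table}` AS v
JOIN `{precincts_table}` AS p
  ON ST_CONTAINS(p.boundary, ST_GEOGPOINT(v.longitude, v.latitude))
""".strip()


def run_query(sql: str) -> list:
    """Run the query; needs the google-cloud-bigquery package and credentials."""
    from google.cloud import bigquery  # imported lazily so the builder works offline

    client = bigquery.Client()
    return list(client.query(sql).result())


if __name__ == "__main__":
    # Hypothetical table names; substitute your own project and dataset.
    print(precinct_assignment_sql("my_project.pad.voters", "my_project.pad.precincts"))
```

The join condition reads the way the question is asked ("which precinct boundary contains this voter's point?"), which is the kind of human-readable SQL the previous bullets describe.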
A helpful analogy is that Redshift is like renting a car, whereas BigQuery is a cost-controlled taxi service.
- If all you do is transport one person at predictable intervals, renting a car is great. But in critical moments, like when you need to deliver 12 passengers to different destinations at the same time, it's a poor fit.
- When Redshift is busy with somebody else’s query, you need to wait. When things are quiet, you still pay for the unused computing resources.
- With BigQuery, you can request 12 cars simultaneously, instead of waiting for the same car to finish another job and return.
- BigQuery gets its jobs done quickly by scaling up, then promptly scales back down to zero.
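The analogy above can be sketched as a tiny scheduling model. This is an idealized illustration, not a benchmark of either product: the function names are hypothetical, a "worker" stands in for a rented car (a fixed pool of capacity), and the elastic case assumes every job gets its own capacity immediately.

```python
import heapq


def finish_times_fixed(job_minutes, workers=1):
    """Fixed pool: jobs start in order as soon as a worker is free."""
    free_at = [0.0] * workers  # time at which each worker next becomes free
    heapq.heapify(free_at)
    finishes = []
    for minutes in job_minutes:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + minutes
        finishes.append(end)
        heapq.heappush(free_at, end)
    return finishes


def finish_times_elastic(job_minutes):
    """Idealized elastic pool: every job starts at time zero."""
    return [float(minutes) for minutes in job_minutes]


if __name__ == "__main__":
    jobs = [5] * 12  # 12 passengers, each trip takes 5 minutes
    print(max(finish_times_fixed(jobs, workers=1)))  # 60.0
    print(max(finish_times_fixed(jobs, workers=3)))  # 20.0
    print(max(finish_times_elastic(jobs)))           # 5.0
```

With one shared car the last passenger arrives after 60 minutes; with a car per passenger everyone arrives after 5. Adding more fixed workers narrows the gap, but you pay for them during quiet periods too.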