Thursday, August 18, 2016

PostgreSQL vs Hadoop

One of the clients I work with is moving a large database from PostgreSQL to Hadoop.  The reasons are sound: volume and velocity are major issues for them, and while PostgreSQL is not going away in their data center, their industry has far more Hadoop usage and tooling for life science analytics than it has PostgreSQL tooling (Hadoop is likely to replace both PostgreSQL and, hopefully, a massive amount of data on NFS).  This has provided an opportunity to think about big data problems and solutions and their implications.  At the same time, I have seen as many people move from Hadoop to PostgreSQL as the other way around.  And no, LedgerSMB will likely never use Hadoop as a backend; it is definitely not the right solution to any of our problems.

Big data problems tend to fall into three categories: managing ever-increasing volume of data, managing increasing velocity of data, and dealing with greater variety of data structure.  It's worth noting that these are categories of problems, not specific problems themselves, and the problems within each category are sufficiently varied that there is no one solution for everyone.  Moreover, these solutions are hardly without their own significant costs.  All too often I have seen platforms like Hadoop pushed as a general solution without attention to those costs, and the result is usually something that is overly complex, hard to maintain, possibly slow, and doesn't work very well.

So the first point worth noting is that big data solutions are specialist solutions, while relational database solutions for OLTP and analytics are generalist solutions.  The smart approach is usually to start with the generalist solutions and move to the specialist ones only when you know from the outset that a specialist solution addresses a specific problem you actually have.  And no, Hadoop does not make a great general ETL platform...

One of the key things to note is that Hadoop is built to solve all three problems simultaneously.  This means that you effectively buy into a lot of other costs if you are trying to solve only one of the V problems with it.

The single largest cost comes from the solutions to the variety-of-data issues.  PostgreSQL and other relational data solutions provide very good guarantees on the data precisely because they enforce a lack of variety.  You enforce a schema on write, and if it is violated, you throw an error.  Hadoop applies a schema on read, so you can store data, then try to read it and get a lot of null answers back because the data didn't fit your expectations.  Ouch.  But that flexibility is very helpful when trying to make sense of a lot of unstructured data.
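The contrast can be sketched in a few lines of Python.  The records, field names, and validation rule below are purely hypothetical illustrations, not anything from the actual deployment:

```python
# Hypothetical records; one is malformed and one is missing a field.
records = [
    {"id": 1, "weight_kg": 70.5},
    {"id": 2, "weight_kg": "unknown"},  # wrong type
    {"id": 3},                          # field missing entirely
]

# Schema on write (relational style): bad rows are rejected at insert time,
# so everything that lands in storage is guaranteed to fit the schema.
def insert_with_schema(row):
    if not isinstance(row.get("weight_kg"), (int, float)):
        raise ValueError(f"schema violation: {row}")
    return row

# Schema on read (Hadoop style): storage accepts anything; the schema is
# applied at query time, and anything that doesn't fit comes back as null.
def read_weight(row):
    value = row.get("weight_kg")
    return value if isinstance(value, (int, float)) else None

try:
    for r in records:
        insert_with_schema(r)
except ValueError as e:
    print("rejected at write time:", e)

weights = [read_weight(r) for r in records]
print(weights)  # [70.5, None, None] -- nulls where the data didn't fit
```

Same bad data, two failure modes: the write-time check fails loudly and early, while the read-time check silently degrades to nulls, which is exactly the trade-off described above.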

Now, solutions to check out first if you are faced with volume and velocity problems include Postgres-XL and similar shard/clustering solutions, but these really require good data partitioning criteria.  If your data set is highly interrelated, they may not be a good fit because cross-node joins are expensive.  Nor would you use them for smallish datasets, certainly not under a TB, since the complexity cost of these solutions is not lightly undertaken.
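To make the partitioning-criteria point concrete, here is a minimal sketch of hash-based shard routing.  The node names and the choice of a customer ID as the partitioning key are assumptions for illustration only:

```python
import hashlib

# Hypothetical cluster of four nodes.
SHARDS = ["node0", "node1", "node2", "node3"]

def shard_for(key: str) -> str:
    """Route a row to a shard by a stable hash of its partitioning key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# All rows sharing a key land on the same node, so joins restricted to
# one customer stay local.  Joins *across* keys may touch every node,
# which is where the cross-node join cost comes from.
print(shard_for("customer-42"))
print(shard_for("customer-42") == shard_for("customer-42"))  # True
```

The design choice to note: a good partitioning key is one that most of your joins and filters already use, so queries resolve on a single node.  If your workload joins freely across the whole dataset, no key choice saves you, and that is the "highly interrelated data" caveat above.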

Premature optimization is the root of all evil, and big data solutions have their place.  But don't use them just because they are cool, new, or resume-building.  They are specialist tools, and overuse creates more problems than underuse.


  1. The PostgreSQL 10 roadmap from 2ndQuadrant includes columnar indexes, which may make PostgreSQL suitable for big data:

    1. Certainly for some kinds of big data. And anything that helps certain areas do better is a major win. Having index-oriented tables would also help with certain volume-related issues as well (and there has been talk about that for some time).

    2. But one point worth noting is that "big data" is a bit of a buzzword and poorly defined topic. Columnar stores help certain kinds of aggregation. Index oriented tables help one tune certain kinds of access patterns. Those are important tools in dealing with data volume problems.

      But one of the things which has really been difficult in the deployment I work with that is moving to Hadoop is the fact that TOAST performance overhead is rather difficult to measure and in order to get adequate performance, certain things have had to move to non-1NF designs. Even when you are dealing with structured data, there are access pattern corners where one has to be rather imaginative to keep performance up.

      So it is a mistake to think that all volume problems are the same, or all velocity problems are the same. Once you get into TB of data, attention to detail and awareness of your specific issues become important.

      Today PostgreSQL can deal well with certain kinds of big data problems. And those are expanding. But we should be careful not to be like the folks who think Hadoop is the answer to everything ;-)

  2. The CitusDB extension is an interesting alternative to Postgres-XL, and it's also not a fork. You're correct that if your data doesn't shard well you're still in trouble. Though, doesn't the same apply to Hadoop? Or does it just use a hash of the entire row/document to partition?
