Apache Kudu is a free and open source column-oriented data store of the Apache Hadoop ecosystem.
Customer Reviews
Satheesh N.
Advanced user of Apache Kudu
1. Implementing a Lambda Architecture (both batch and real-time streaming) with Kudu is quite straightforward. We also used StreamSets as the ingestion platform, which integrates well with Kudu.
2. Makes real-time analytics quite straightforward. We used Kudu to run multiple real-time campaigns.
3. Tailor Made for Implementing Data Warehouses within a Big Data Environment.
4. Nice Upsert Functionality.
1. Partition limitation: limited to 2000 tablets per tablet server.
2. Faces random timeouts when approaching the tablet limit (max 2000 per server).
3. Needs a Conformed Schema. No automatic handling of drifting schemas.
4. Needs a Primary Key for every table.
5. #3 & #4 are not really cons per se.
Use it if data warehousing is a strong use case within your Big Data environment, and try to stay within the partition/tablet limit; then you should be good to go.
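As a rough way to check the tablet limit mentioned above before creating tables, the arithmetic can be sketched as follows. This is only an estimate under stated assumptions: the helper names are hypothetical, each table's tablet count is taken as hash buckets times range partitions, replicas are assumed to spread evenly across tablet servers, and the 2000 figure is the one quoted in this review (check the documentation for your Kudu version).

```python
import math

# Limit quoted in this review; treat it as an assumption, not an official constant.
TABLET_LIMIT_PER_SERVER = 2000


def replicas_per_server(tables, num_tablet_servers, replication_factor=3):
    """Estimate tablet replicas hosted by each tablet server.

    tables: iterable of (hash_buckets, range_partitions) pairs, one per table.
    Assumes replicas are spread evenly across servers (an idealization).
    """
    total_tablets = sum(h * r for h, r in tables)
    total_replicas = total_tablets * replication_factor
    return math.ceil(total_replicas / num_tablet_servers)


def within_limit(tables, num_tablet_servers, replication_factor=3,
                 limit=TABLET_LIMIT_PER_SERVER):
    """Return True if the estimated per-server load stays at or under the limit."""
    return replicas_per_server(tables, num_tablet_servers, replication_factor) <= limit


if __name__ == "__main__":
    # Hypothetical schema: two tables with different partition layouts.
    planned = [(16, 12), (8, 24)]
    print(replicas_per_server(planned, num_tablet_servers=4))
    print(within_limit(planned, num_tablet_servers=4))
```

If the estimate comes out close to the limit, fewer hash buckets, coarser range partitions, or more tablet servers would bring it down.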
1. Real Time Campaigns.
2. Real Time Lookups & Transaction Enrichment.
3. Data Warehouse implementation is quite straightforward with Apache Kudu.