Apache Beam is an open source unified programming model designed to define and execute data processing pipelines, including ETL, batch and stream processing.
Customer Reviews
Consultant in Automotive
Advanced user of Apache Beam
I liked the way Beam abstracts the complexity of the distributed processing paradigm through PCollections, transforms, and related concepts.
Its unified approach to both batch and stream processing is also unique and efficient.
During development, I found that performing a join was not simple: we had to use CoGroupByKey for it, which was a bit confusing for us. Providing a simple join operation as an abstraction would help the user community. The developer community may have already added that option, but it was not available when we were working on our use case.
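To illustrate why a join built on CoGroupByKey takes an extra step, here is a minimal plain-Python sketch of the pattern. It is not Beam code: `co_group_by_key` and `inner_join` are hypothetical helpers that only imitate the grouping semantics on in-memory lists, and the sample data is made up. The grouping step collects the values for each key per input, and a second step has to expand those groups into joined rows.

```python
from collections import defaultdict
from itertools import product


def co_group_by_key(**tagged):
    """Group several keyed collections by key, keeping one value list per tag.

    Hypothetical stand-in for the shape of a CoGroupByKey result:
    {key: {tag: [values, ...], ...}}.
    """
    grouped = defaultdict(lambda: {tag: [] for tag in tagged})
    for tag, pairs in tagged.items():
        for key, value in pairs:
            grouped[key][tag].append(value)
    return dict(grouped)


def inner_join(left, right):
    """Expand the grouped result into (key, left_value, right_value) rows."""
    rows = []
    for key, groups in co_group_by_key(left=left, right=right).items():
        # product() yields nothing if either side is empty,
        # so keys present on only one side are dropped (inner join).
        for left_value, right_value in product(groups["left"], groups["right"]):
            rows.append((key, left_value, right_value))
    return rows


# Hypothetical sample data: (customer_id, value) pairs.
orders = [("c1", "order-100"), ("c1", "order-101"), ("c2", "order-200")]
customers = [("c1", "Alice"), ("c3", "Carol")]

joined = inner_join(orders, customers)
for row in joined:
    print(row)
```

Only customer `c1` appears on both sides, so only its two orders survive the join. In a real pipeline the expansion step would typically be a separate transform applied after the grouping, which is the part that felt less direct than a single join call.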
We have been using Beam for batch data processing, with Google Cloud Storage as the source and BigQuery as the destination.
We found it very efficient, as we were also able to apply various transformations on the fly.