AWS Data Pipeline is a web service that helps you process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals.
Customer Reviews
Administrator in Cosmetics
Advanced user of AWS Data Pipeline
A pretty straightforward way to run batch jobs, whether that's ETL or some other cron-based job. The tool is quite customizable (it can run arbitrary shell scripts) and integrates well with S3 and EC2.
There are times when something that should just work is very difficult. For example, the default EC2 instance AMI was very old, so the tools installed on it (like the AWS CLI) all had to be upgraded before use. Everything can be customized, including the image, but this takes a little time to get working and can be confusing when just getting started. This particular example has been resolved, but there are other similar issues.
We are ETL-ing data and running batch transform jobs. AWS Data Pipeline allows us to do all of this, although there are definitely some setup costs involved.