Apache Pig

Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.

7.8/10 (Expert Score) ★★★★★
Product is rated as #21 in category Big Data Analytics Software
Ease of use: 7.6
Support: 7.4
Ease of Setup: 7.1

Customer Reviews

Apache Pig Reviews

Prashant V.

Advanced user of Apache Pig
★★★★★
Apache Pig makes it easy to create efficient data pipelines

What do you like best?

Apache Pig and its query language (Pig Latin) allowed us to create data pipelines with ease. The language is designed to reflect the way data pipelines are designed, so it discards extraneous data, supports user-defined functions (UDFs), and offers a lot of control over the data flow.

What do you dislike?

Pig is lazily evaluated, so it will not process data until the result is actually needed. As a result, errors are not visible unless you actually try to dump/print the data. There is no "debug" functionality to run the code in a dry-run mode.
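
The deferred evaluation described above can be illustrated with a small analogy. The sketch below uses Python generators, not Pig Latin, and every name in it (`load`, `parse_amounts`, `dump`) is a hypothetical stand-in rather than part of Pig. It only shows the general pattern: defining the pipeline runs nothing, so a bad record stays hidden until something forces the output.

```python
# Analogy only: Python generators standing in for a deferred data pipeline.
# All function names here are hypothetical; none of this is Pig's API.

def load(rows):
    """Lazily yield raw records (stands in for a load step)."""
    for row in rows:
        yield row

def parse_amounts(records):
    """Lazily convert the second field to int (stands in for a transform step)."""
    for name, amount in records:
        # Raises ValueError on bad input, but only once the record is consumed.
        yield name, int(amount)

def dump(records):
    """Force evaluation (stands in for a dump/print step)."""
    try:
        return list(records)
    except ValueError as exc:
        return f"error surfaced at dump time: {exc}"

raw = [("a", "10"), ("b", "oops"), ("c", "30")]

# Building the pipeline executes nothing, so the bad record goes unnoticed here.
pipeline = parse_amounts(load(raw))

# Only forcing the output reveals the problem.
result = dump(pipeline)
print(result)
```

In this sketch the malformed `"oops"` value causes no failure when `pipeline` is assembled; the error appears only inside `dump`, which mirrors the reviewer's complaint that mistakes stay hidden until the data is actually requested.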

Recommendations to others considering the product:

Unless you already have implementations of Pig in the company that you are building on top of, you might be better off with other newer technologies with more

What problems are you solving with the product? What benefits have you realized?

I have used Pig for data pipelining and aggregation. The flow of the language reflects the flow of the data, so it is intuitive to understand what each data transformation is doing. However, it hasn't kept up with the latest advances in technology. If you were choosing a language, you would be better off with either Hive or Spark. Pig also has a steeper learning curve since it uses its own dedicated language (Pig Latin).

Review source: G2.com
