Dremio

Dremio is data analysis software. It is a self-service data platform that lets users discover, accelerate, and share data at any time.

Languages supported: English

9.6/10 (Expert Score) ★★★★★
Rated #2 in the Big Data Analytics Software category
Ease of use: 9.2
Support: 9.8
Ease of Setup: 8.9

Maximize the power of your data with Dremio—the cloud data lake engine. Dremio operationalizes your cloud data lake storage and speeds your analytics processes with a high-performance and high-efficiency SQL query engine while also democratizing data access for data scientists and analysts via a governed self-service layer. The result is fast, easy data analytics for data consumers at the lowest cost per query for IT and data lake owners.

Customer Reviews

Dremio Reviews

Mark Z.

Advanced user of Dremio
★★★★★
Made us rethink our whole architecture!

What do you like best?

The ease with which it lets you quickly explore new data sets is impressive. I am always in awe at how quickly we can consume huge data sets (folders full of CSV or Parquet files) and structure them to work as a single data table. This process would typically have required an IT resource to create and apply a script to manipulate and load the data into a database or single file, and we have our "business" users with no IT experience doing it right away. They still rely on IT to write queries against it for them, but they can explore the data right away. With a little training, even our "business" users are writing SQL to explore the data.
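As a rough illustration of the "folder of files as a single table" idea described above, here is a minimal sketch in plain Python. It is not Dremio's API or how Dremio works internally; the function name `read_folder_as_table` and the sample file names are hypothetical, and it assumes every CSV file in the folder has a header row.

```python
import csv
import tempfile
from pathlib import Path


def read_folder_as_table(folder: Path) -> list[dict[str, str]]:
    """Read every CSV file in `folder` and return all rows as one combined table.

    Hypothetical helper for illustration only. Each file is assumed to have
    a header row; values are kept as strings.
    """
    rows: list[dict[str, str]] = []
    # Sort the paths so the row order is deterministic across runs.
    for path in sorted(folder.glob("*.csv")):
        with path.open(newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


if __name__ == "__main__":
    # Build a throwaway folder with two small files, then read it as one table.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "part-1.csv").write_text("id,price\n1,10\n")
        (folder / "part-2.csv").write_text("id,price\n2,20\n")
        for row in read_folder_as_table(folder):
            print(row)
```

The point of the sketch is only that the consumer sees one table regardless of how many files sit in the folder.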

We have a large-scale project to give our entire organization access to the data they need to do their jobs. We had a large-scale ETL process that transformed that data into a data model and combined data generated inside our firm with data provided by our vendors. Adding Dremio into our environment meant that we no longer have to model the data provided by our vendors. We can spend more time modeling our internal data and running additional data quality checks instead of constantly adjusting our data model when we want to onboard new data from external vendors.

With personal spaces, our end users can upload a simple Excel document and join that to the data we have made available in our platform with no assistance from IT. And with the latest tools provided by Dremio Professional Services, we now have reports that show us which users are using which data sets! This allows us to constantly monitor our environment for bottlenecks and stale or unused data sets. This is a massive win for us!

What do you dislike?

While Dremio has been a huge asset to the firm, several things could be improved, and we have seen some scenarios for which it is not the appropriate tool. Our environment has multiple storage accounts in the cloud and several databases that we connect to. We have had several performance issues when we combine data in our data lake with data in the databases. Doing so turns the processing into a single-threaded query and essentially locks up or blocks all access to both the Dremio environment and the database (Synapse in this case). Since we implemented Dremio, they have added Delta Lake support, and by moving to Delta Lake instead of Synapse we have essentially eliminated this issue.

As with any tool, there is a learning curve to the interface. It is rich and has a lot of features but lacks some usability aspects. We have provided feedback to Dremio on this, and they have been attentive to these requests, so I have confidence this will get better. Going from a typical SQL IDE like Management Studio is a bit of an adjustment, but you get used to it.

We use Power BI, and to date Dremio is not a first-party provider for Power BI. You can connect and consume data from Dremio, but I cannot get information such as which user is connecting. I am waiting for Microsoft to make them a first-party provider.

What problems are you solving with the product? What benefits have you realized?

We were trying to solve a data virtualization issue. We wanted to disconnect the data we provide to our end users from the physical data sources we get the data from. Using the best practices of Dremio, we have been able to accomplish this and have already benefited from it. We were able to switch from providing data from a Synapse instance to Delta Lake with zero impact to our end users and did not have to change any of our queries.
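The decoupling described above can be sketched with an ordinary SQL view. This is only an analogy using Python's built-in `sqlite3` module, not Dremio itself; the table names `warehouse_prices` and `lake_prices` are hypothetical stand-ins for the Synapse and Delta Lake sources, and the view name `prices` is likewise an assumption.

```python
import sqlite3

# Hypothetical physical sources; only these names may back the view.
ALLOWED_SOURCES = {"warehouse_prices", "lake_prices"}

# The query end users run. It names only the view, never a physical table.
CONSUMER_QUERY = "SELECT id, price FROM prices ORDER BY id"


def build_demo_database() -> sqlite3.Connection:
    """Create an in-memory database with two physical tables holding the same data."""
    conn = sqlite3.connect(":memory:")
    for table in sorted(ALLOWED_SOURCES):
        conn.execute(f"CREATE TABLE {table} (id INTEGER, price REAL)")
        conn.executemany(
            f"INSERT INTO {table} VALUES (?, ?)", [(1, 10.0), (2, 20.0)]
        )
    return conn


def point_view_at(conn: sqlite3.Connection, table: str) -> None:
    """Redefine the `prices` view so it reads from the given physical table."""
    if table not in ALLOWED_SOURCES:
        raise ValueError(f"unknown source table: {table}")
    conn.execute("DROP VIEW IF EXISTS prices")
    conn.execute(f"CREATE VIEW prices AS SELECT id, price FROM {table}")


if __name__ == "__main__":
    conn = build_demo_database()
    point_view_at(conn, "warehouse_prices")
    before = conn.execute(CONSUMER_QUERY).fetchall()
    # Swap the physical source; the consumer query text stays the same.
    point_view_at(conn, "lake_prices")
    after = conn.execute(CONSUMER_QUERY).fetchall()
    print(before == after)
```

Because consumers only ever reference the view, the physical source behind it can be replaced without touching their queries, which is the benefit the review describes.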

Another side benefit of using Dremio is the time to market of our external data. We are able to quickly onboard sample data from the vendor and allow end users to explore this to determine if this is something they wish to pay for. We can then automate the feed of that data very quickly and make it immediately available.

Review source: G2.com
