Derive insights from unstructured text using Google machine learning
Customer Reviews
User in Computer Software
Advanced user of Google Cloud Natural Language API
The classifications are extremely accurate and the premise behind the service is exciting. I would love this service if not for the absurd pricing.
The pricing is a complete ripoff. They allow you to submit HTML-based content to be classified, so naturally one would want to submit a webpage to be classified.
Unfortunately, the pricing works as follows: it's ≈ $2 per 1M Unicode characters.
An average webpage can easily stretch into the hundreds of thousands of Unicode characters. Most of that is junk that you can strip out on your own, like CSS and HTML tags. But they apparently don't strip this out, as they let me blow through $500 worth of credits by classifying a few hundred webpages.
For reference, this is about as expensive as getting a human to do that work. You could probably pay a human $10/hour to classify webpages for you, and they could probably get through at least 60 webpages an hour, so ≈ $1 per 6 webpages. With Google it's $1 for 5-10 webpages, since each webpage is hundreds of thousands of Unicode characters.
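To make the comparison concrete, here is a rough sketch of the arithmetic in Python. The $2 per 1M characters rate and the $10/hour, 60 pages/hour human figures are the approximate numbers from this review, not official pricing; the function name and the 100,000-character page size are made up for illustration.

```python
# Approximate rate quoted in this review, NOT an official price list.
PRICE_PER_MILLION_CHARS = 2.00

def estimated_cost(num_chars, price_per_million=PRICE_PER_MILLION_CHARS):
    """Estimate the charge in dollars for classifying num_chars characters."""
    return num_chars / 1_000_000 * price_per_million

# Hypothetical raw page of 100,000 characters (markup included).
api_cost_per_page = estimated_cost(100_000)

# Human baseline from the review: $10/hour at 60 pages/hour.
human_cost_per_page = 10 / 60

print(f"API:   ${api_cost_per_page:.2f} per page")
print(f"Human: ${human_cost_per_page:.2f} per page")
```

Under these assumptions the raw page costs about as much as the human does, which is why the character count you actually submit matters so much.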
It's an *API call*. The marginal cost is extremely low, and there is a completely obvious solution, which is to STRIP OUT HTML TAGS. You can do this in a few lines of Python code.
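For anyone who wants to do the stripping themselves before submitting, here is a minimal sketch using only Python's standard-library `html.parser`. The class and function names are my own, and it assumes that dropping tags plus the contents of `<script>` and `<style>` is enough cleanup for your pages; it is not what Google does internally.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the chunks and collapse runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())

def strip_html(html):
    """Return the plain text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()

page = ("<html><head><style>p{color:red}</style>"
        "<script>var x = 1;</script></head>"
        "<body><p>Hello <b>world</b></p></body></html>")
print(len(page), "characters before;", len(strip_html(page)), "after")
```

You would then send the stripped text as plain text instead of HTML. Note that text chunks are joined with spaces, so a word split by an inline tag comes out as two words.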
It would be nice if 1/ they sent you actual billing alerts to let you know that you are quickly running through credits, and 2/ they didn't count obvious 'junk' characters towards your billing.
Be careful with billing, and carefully strip out all HTML/code from any webpage you want classified. Really sad that Google is willing to nickel-and-dime you over something with extremely low marginal cost to them.
Trying to understand the content classification of a set of web pages.