Automatically extract printed text, handwriting, and other data from scanned documents, going beyond simple optical character recognition (OCR) to extract data from forms and tables.
Customer Reviews
Samuel C.
Advanced user of PDF2Spreadsheet
PDF2Spreadsheet makes it easy to extract any table from a PDF; the interface is simple to use and gets the job done. We simply drag and drop the PDF or the pages containing the table, and PDF2Spreadsheet returns an Excel file containing the table. In some cases we also get a CSV file, so we can integrate it into our data pipeline (ETL). In some cases the table was not extracted; we contacted them and they fixed the problem. From my understanding, it's a new product they are launching, and they are responsive in fixing and improving it.
If they provided an API, it would be easier for us to fully integrate their technology into our data processing pipeline (ETL).
Give it a try! They offer a free trial.
As a data scientist, I have to aggregate and mix different sources of data. Some of them are locked in PDFs, and extracting a table by hand is a tedious task. Having a tool that automatically extracts tables from PDFs is a huge benefit. We save over 20 minutes per page by using PDF2Spreadsheet instead of retyping the table by hand.