Optical Character Recognition (OCR) for Invoices

Procurement, and office culture in general, have long been closely tied to paper. From purchase orders and vendor invoices to memos and spreadsheets, paper documents have been at the core of business since the beginning. But as digital transformation reshapes the modern workplace and the dream of a truly paperless office comes closer to reality, the time and labor demanded by old-school document management have become obstacles to profitability, productivity, and competitiveness. A technology known as optical character recognition (OCR) laid the groundwork for modern digital solutions, but it has its own limitations.

Understanding what OCR can do—and what it can’t—is essential when you’re considering implementing an automated software solution to transform your own procurement function and your business as a whole.

Invoice OCR: An Aging Automation Essential

The desire to improve efficiency, accuracy, and productivity through automation is hardly new. In fact, the roots of digital data capture technology as we understand it stretch back to the Age of Disco. In the mid-1970s, Ray Kurzweil developed omni-font OCR technology, capable of reading text in nearly any typeface, as the basis for a reading machine that assisted the visually impaired by scanning printed text and converting it to synthesized speech. But by 1978, companies like Xerox had recognized the benefit of scanning the same information and storing it as an editable virtual document, and so OCR shifted from a reading assistance device to an information management tool.

OCR works by scanning and digitizing information (from printed invoices, for example). Recognition software analyzes the scanned content, performs data validation to catch errors, and then either converts the data into standalone documents (e.g., text or PDF files) or exports it for use in other software, including Enterprise Resource Planning (ERP) packages and office productivity tools such as Microsoft Word and Excel.
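The recognize, validate, and export steps described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's actual pipeline: the sample text, the field labels, and the function names (extract_fields, validate, export_json) are all assumptions made for the example, and the scanning and character recognition themselves are assumed to have already happened.

```python
import json
import re
from decimal import Decimal

# Hypothetical text as it might come back from an OCR engine for one invoice.
SAMPLE_OCR_TEXT = """\
ACME Supplies Ltd.
Invoice No: INV-1042
Date: 2023-03-15
Subtotal: 1,200.00
Tax: 96.00
Total: 1,296.00
"""

# Assumed label patterns; real invoices vary widely from vendor to vendor.
FIELD_PATTERNS = {
    "invoice_number": r"^Invoice No:\s*(\S+)",
    "date": r"^Date:\s*(\d{4}-\d{2}-\d{2})",
    "subtotal": r"^Subtotal:\s*([\d,]+\.\d{2})",
    "tax": r"^Tax:\s*([\d,]+\.\d{2})",
    "total": r"^Total:\s*([\d,]+\.\d{2})",
}


def extract_fields(text):
    """Pull labeled fields out of recognized text; missing fields become None."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text, re.MULTILINE)
        fields[name] = match.group(1) if match else None
    return fields


def to_amount(raw):
    """Convert a string such as '1,200.00' to an exact Decimal."""
    return Decimal(raw.replace(",", ""))


def validate(fields):
    """Return a list of problems; an empty list means the checks passed."""
    problems = [f"missing field: {name}" for name, value in fields.items() if value is None]
    if not problems:
        expected = to_amount(fields["subtotal"]) + to_amount(fields["tax"])
        if expected != to_amount(fields["total"]):
            problems.append("subtotal + tax does not equal total")
    return problems


def export_json(fields):
    """Serialize the fields for hand-off to another system (e.g., an ERP import)."""
    return json.dumps(fields, sort_keys=True)


if __name__ == "__main__":
    invoice = extract_fields(SAMPLE_OCR_TEXT)
    issues = validate(invoice)
    print(export_json(invoice) if not issues else issues)
```

Even in a sketch like this, a simple arithmetic check (subtotal plus tax should equal the total) can catch a misread digit before the data reaches the accounting system.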

For procurement and finance departments, OCR is used to capture invoices and feed invoice data into accounting systems and other software. Once hailed as a game-changer that promised to free office staff from manual data entry, OCR is now recognized as a useful, but limited, component of more comprehensive and effective digital automation toolkits.

Pros and Cons of OCR

The great promise that comes with optical character recognition is freedom: no more time-consuming manual data entry that steals time from more important tasks; no more need to store paper documents; no more manual processing of incoming invoices; and no more need to hire as many in-house staff in the Accounts Payable (AP) department to deal with a mountain of paperwork.

Yet, despite its importance as one of the core components of automation, OCR has clear limits. It still requires a steady flow of paper documents to provide value, and in some cases, adding OCR to your AP department might prove more costly than skipping ahead to a solution without it.

OCR Pros

If your accounts payable department is transitioning from paper to electronic documents, OCR solutions can give you an edge over manual processing through:

  • Flexibility. Modern OCR software is designed to handle a wide range of fonts and even handwriting. Thanks to machine learning and artificial intelligence (AI), it can be trained to parse and correctly transfer information, including tables and other special formatting, from unfamiliar fonts and writing styles. It can recover information even if a document has suffered damage or discoloration, provided the characters to be read are intact.
  • Speed. Compared to the time required to manually key data, OCR scanning is fast, allowing for large numbers of documents to be processed quickly.
  • Cost reductions. Electronic documents don’t require special physical storage. OCR reduces staff demand and improves efficiency, lowering total labor costs.

OCR Cons

Despite its utility, OCR does present a number of challenges and is not a replacement for a fully automated, touchless AP software solution. This is particularly true for invoice data extraction, and OCR can complicate your automation implementation in a variety of ways due to:

  • Formatting Limitations. While it certainly can be trained to recognize unusual fonts and handwritten text, OCR software is at its best with clearly printed documents with simple formatting. Handwriting training can be finicky, and has to be done for the handwriting of every individual using the system, both internal staff and external parties such as vendors, which can make the potential cost greater than the savings that come from avoiding manual entry. Dirty and damaged documents are subject to read errors, and if the OCR system is used mainly for scanning invoices, the different formats submitted by different vendors may cause errors. One-off or infrequent purchases will generate invoices the system can’t readily handle, since those vendors won’t be familiar with the requirements of the invoice recognition system.
  • Data Accuracy and Quality Issues. It’s light years ahead of its earliest iterations, but modern OCR technology still requires manual oversight and review to ensure complete accuracy. In addition, lack of complete buy-in from your suppliers can create a frustrating and time-consuming problem: your suppliers might generate their invoices electronically but submit them on paper, forcing you to run their documents through your OCR system before you can process and pay them. This can lead to missed early-payment discounts and even late-payment fees if your department is struggling to keep up with the flow of printed-out PDFs and other paper invoices.
  • Human Intervention Requirements. The time saved by avoiding manual data entry may be lost again when staff review and correct OCR errors. Staff must be trained on the OCR process, hardware, and software, which takes time away from other tasks. For some systems, transfer to ERP systems and other software is not automatic and may require direct human intervention.
  • Total Cost of Ownership (TCO) and Complexity. In addition to the OCR software itself, the costs of special scanning hardware, training, materials, staffing, and mail handling can add up to a substantial initial investment. Given that more advanced software packages with touchless, paperless performance can offer more features at a comparable price, OCR may not be cost-effective unless your accounting system is still primarily paper-based.
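To make the manual review burden described above concrete, here is a minimal sketch of how OCR output might be routed for human review. It assumes the OCR engine reports a confidence score between 0 and 1 for each extracted field, as many engines do in some form; the OcrField class, the 0.90 threshold, and the split_for_review function are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class OcrField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical OCR engine


# Assumed cutoff; a real deployment would tune this against its own error rates.
REVIEW_THRESHOLD = 0.90


def split_for_review(fields, threshold=REVIEW_THRESHOLD):
    """Separate fields that can be accepted automatically from those needing a human check."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


if __name__ == "__main__":
    scanned = [
        OcrField("invoice_number", "INV-1042", 0.99),
        OcrField("total", "1,296.00", 0.72),  # e.g., a smudged or handwritten amount
    ]
    accepted, needs_review = split_for_review(scanned)
    print("auto-accepted:", [f.name for f in accepted])
    print("needs review:", [f.name for f in needs_review])
```

The lower the quality of the incoming paper, the more fields land in the review queue, which is where the time saved on data entry can be lost again.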

For most modern businesses, the promise OCR once held is being fulfilled by more advanced digital automation solutions. Here in the twilight of the paper-driven office, a hybrid solution that includes OCR as needed, but has a plan to replace it when possible, is where this aging star of automation can really shine.

OCR and the Digital Future

The sun may be setting on OCR as the core of AP automation, but it still has a role to play. The best approach is to find a complete and truly paperless procurement and AP software package that offers versatile and effective OCR when you need it, without relying on OCR to create value or drive your process development.
