13 Tips for Conquering Cognitive Document Automation Challenges to Maximize Productivity
Welcome to part five of our six-part series that takes you on a journey through the latest concepts in multichannel document capture and intelligent OCR. We’re focused on how AI has transformed what’s possible in making your documents and data work for you, not against you.
In part one, we looked at how RPA marked a revolution in empowering businesses to solve problems associated with manual, data-centric tasks, yet was historically ineffective in automating document processing. In part two, we examined the emergence of cognitive document automation (CDA), which does the “head work” of understanding what the document or email is about, what information it contains, and what to do with it. Part three of our series took a deeper dive into what you should look for in a CDA solution. (Hint: You’ll need much more than just OCR functionality.) And part four tackled the question: How do we measure the success of CDA?
In part five, let’s look at some of the challenges RPA customers face when implementing CDA solutions and trying to maximize user productivity (remember, user productivity is defined as accuracy + efficiency—see part four for a refresher).
For all of their benefits, CDA solutions can still exhibit pitfalls and limitations. Below are some of the common challenges we’ve seen, along with advice on how to successfully navigate them before embarking on your CDA journey. We recommend ensuring the CDA solution you choose adequately addresses each.
Image Source
The image source affects the quality of the image and, therefore, the level of classification and extraction accuracy. A fax, for example, will typically have lower image quality than an emailed, born-digital PDF. Scanner hardware also delivers different levels of quality depending on the vendor and model.
Image File Type and Resolution
Some image file types have better inherent quality than others. 300 dpi TIFFs are most common, but companies often can’t control the file type received from external sources. Lower-resolution images will have lower levels of classification and extraction accuracy (300 dpi is considered ideal).
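Because resolution is easy to check at intake, one option is to flag low-resolution images before they reach classification. The sketch below is illustrative only: the record layout (`name`, `dpi`) and the `split_by_resolution` helper are our own assumptions, not part of any particular CDA product.

```python
IDEAL_DPI = 300  # the resolution described above as ideal


def split_by_resolution(images, minimum_dpi=IDEAL_DPI):
    """Separate image records into those that meet the minimum DPI and those that do not.

    Each record is a dict with hypothetical keys "name" and "dpi".
    Records with an unknown DPI (None) are treated as below the minimum.
    """
    meets_minimum, below_minimum = [], []
    for image in images:
        dpi = image.get("dpi")
        if dpi is not None and dpi >= minimum_dpi:
            meets_minimum.append(image["name"])
        else:
            below_minimum.append(image["name"])
    return meets_minimum, below_minimum


# Made-up intake batch for illustration.
incoming = [
    {"name": "invoice_scan.tif", "dpi": 300},
    {"name": "fax_page.tif", "dpi": 200},
    {"name": "mobile_photo.jpg", "dpi": None},
]

ok, low = split_by_resolution(incoming)
print("Meets minimum:", ok)
print("Needs review:", low)
```

Images in the second list could be routed to extra image cleanup or to a human reviewer rather than straight to automated extraction.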
Image Quality
The saying “garbage in, garbage out” also applies to CDA. Images faxed multiple times; mobile images with skew, tilt, blur, bad lighting or a background similar in color to the document; monochrome scans; documents with stamps, scribbles and stains: all of these can reduce classification and extraction accuracy. Images acquired by CDA solutions should be cleaned up with image processing before automated classification and extraction are applied, to ensure the maximum possible accuracy.
Samples
The number of samples and their similarity to the real world also impact accuracy. Generally speaking, the more samples that are “machine-learned” by the CDA solution, the better. The number of samples required ranges from a few to hundreds, depending on the type of document. Samples should reflect as closely as possible what will be seen in the “real world” during production processing.
Structured Forms
Structured forms generally have the highest level of classification and extraction accuracy, and require the fewest trained samples. Nonetheless, the form design will have a significant impact on accuracy, from the proximity of fields to each other, to the use of field boxes vs. letter boxes, to field shading (if any). If your organization has control over the form design, make sure it’s laid out for maximum automation potential.
Semi-Structured Documents
Semi-structured documents (such as invoices, purchase orders, sales orders and bills of lading) generally show lower accuracy than structured forms. Different CDA solutions have different approaches for locating the desired data, and some are more reliable than others at finding the data and extracting it successfully. These documents also tend to have embedded tables (e.g., invoice line items), multiple tables or tables within tables, which may have lower extraction accuracy rates than regular fields.
Unstructured Documents
Unstructured documents such as email bodies, letters and contracts are the most challenging to classify and extract automatically. AI-based technologies such as natural language processing (NLP) have improved extraction accuracy rates for these types of documents in recent years.
Print Type and Language
The type of print on the document also affects extraction accuracy rates. Generally, machine-printed fields have the highest accuracy rates, followed by hand-printed fields and then by cursive fields. For machine print, font type and character spacing matter as well. Document language is another factor: OCR engines used by CDA solutions exhibit varying accuracy depending on the language, with languages written in the Latin alphabet typically claiming the highest accuracy rates.
Barcodes and Checkboxes
Barcode and checkbox fields typically show the highest extraction accuracy on a document. It’s not uncommon for CDA solutions to boast an accuracy percentage in the high 90s for extracting barcode values and checkbox/bubble values. However, there are dozens of barcode types in use, including 1D, 2D and now 3D barcodes (2D with color), so ensure the CDA solution supports the ones you encounter most frequently.
Signatures
One of the primary reasons paper is still in use by many organizations is the requirement for a signature, and the paper signature must be captured, classified and extracted. Moving to electronic signatures can remove the need for paper scanning, which improves the productivity and capacity of your CDA users. Consider whether you simply need signature presence detection, or whether you need signature verification and fraud detection as well.
Databases
A CDA solution’s classification and extraction accuracy rates can significantly improve through the use of databases. By matching extracted values to similar content in databases, minor OCR errors can be tolerated, because the value resolves to the closest known entry. The result? Less human involvement is needed to confirm or correct low-confidence OCR results. Database content can include customer names, account numbers, ERP data such as PO number or vendor name, word dictionaries specific to industries or languages, etc.
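To illustrate the idea of matching against a database, here is a minimal sketch that uses fuzzy string matching from Python’s standard `difflib` module. The vendor list, the `match_to_database` helper and the 0.85 similarity cutoff are assumptions chosen for the example; a real CDA solution may use a different matching technique entirely.

```python
import difflib

# Hypothetical vendor master data; in practice this might come from an ERP system.
KNOWN_VENDORS = ["Acme Industrial Supply", "Globex Corporation", "Initech LLC"]


def match_to_database(ocr_value, known_values, cutoff=0.85):
    """Return the known value closest to the OCR result, or None if none is similar enough.

    The comparison is case-insensitive; the cutoff (0 to 1) is an assumed threshold
    that would need tuning against real documents.
    """
    lookup = {value.lower(): value for value in known_values}
    matches = difflib.get_close_matches(
        ocr_value.lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


# OCR misread "I" as "l" and "l" as "1"; the database match still resolves the name.
print(match_to_database("Acme lndustrial Supp1y", KNOWN_VENDORS))

# Nothing close enough in the database: leave the value for a human to review.
print(match_to_database("Unknown Vendor Inc", KNOWN_VENDORS))
```

A value that returns `None` would be a natural candidate for human confirmation, while a confident match can pass through without review.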
Rules
Rules can also be used to increase the extraction accuracy of a field. For example, checking that subtotal plus tax equals total is a simple rule that can flag any errors, even after a human corrects one of those fields’ values. Formatting rules are also a simple way to ensure high field accuracy (e.g., a social security number should always have the format xxx-xx-xxxx, where x is a digit from 0 to 9). Verifying checksums on field values also increases field extraction accuracy.
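The three kinds of rules described above can be sketched in a few lines of Python. This is a simplified illustration, not how any specific CDA product implements validation. The function names are our own, and the Luhn algorithm is used only as one well-known example of a checksum; the checksum that applies depends on the field in question.

```python
import re
from decimal import Decimal

# Format rule: xxx-xx-xxxx, where each x is a digit from 0 to 9.
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def totals_are_consistent(subtotal, tax, total):
    """Cross-field rule: subtotal plus tax must equal total (amounts given as strings)."""
    return Decimal(subtotal) + Decimal(tax) == Decimal(total)


def is_valid_ssn_format(value):
    """Formatting rule: the whole value must match the xxx-xx-xxxx pattern."""
    return SSN_PATTERN.fullmatch(value) is not None


def passes_luhn(number):
    """Checksum rule, using the Luhn algorithm as an assumed example."""
    digits = [int(ch) for ch in number if ch in "0123456789"]
    if not digits:
        return False
    checksum = 0
    # Walk from the rightmost digit, doubling every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        checksum += digit
    return checksum % 10 == 0


print(totals_are_consistent("100.00", "8.25", "108.25"))  # consistent amounts
print(totals_are_consistent("100.00", "8.25", "103.25"))  # a misread digit in the total
print(is_valid_ssn_format("123-45-6789"))
print(is_valid_ssn_format("123-45-678"))
print(passes_luhn("79927398713"))
print(passes_luhn("79927398710"))
```

Amounts are handled with `Decimal` rather than floating point so that currency values compare exactly. A field that fails any of these checks can be flagged for review instead of being passed downstream.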
Export
CDA solutions aren’t complete without an easy way to send the documents and data to the systems, processes, and people who need them. User productivity decreases immensely if users must manually move document images and data from one system to another. Remember that an RPA robot can automate the process of moving and aggregating data between systems if an out-of-the-box connector for the destination system isn’t available.