Demystifying Forms Processing and Data Capture

Forms Processing is a proven technology that allows organizations of all sizes to improve efficiency and decrease operational costs.  There are many case studies available online to support this.  When implemented properly, the cost of a Forms Processing solution can easily be justified with a tangible 12-18 month return on investment.  With such evidence of decreased operational costs and improved efficiency, a logical question is: why isn’t every business in the world using this technology?  Traditionally, only large organizations with dedicated technical staff and enormous IT budgets could consider implementing a sophisticated Data Capture solution, but times are changing.  It no longer has to take years to realize the benefits of Forms Processing that were once available only to Fortune 1000 type companies.  In this blog post I hope to dispel the myth that this useful technology is only available to Enterprise organizations.

While the concept of automatically extracting information from a hard copy document is not new, the method of implementation is.  Specifically, the “cloud” offers an intriguing opportunity for Data Capture.  Why?  First, Data Capture is a very CPU-intensive process, and the cloud offers enormous processing power within large data centers.  Second, sharing resources and ‘renting’ a cloud service such as ‘Cloud Capture’ lowers the barrier to entry.  The upfront cost of implementing Data Capture no longer has to be an obstacle.  The cost of Data Capture can now be an Operating Expense rather than a Capital Expenditure.

I have written previously about the “No Folder Zone”, and in this blog post I will elaborate on how to avoid using Folders as a cop-out for a truly effective Information Capture solution.  In a traditional on-premise software installation, the Forms Processing system is installed, tuned and tested, and then it is ready for deployment.  This is the point where the Document Capture system Crosses the Chasm and the organization can truly benefit, turning the 80% investment into 80% benefit.

The basics of Forms Processing are quite simple and straightforward.  The idea is to create a template overlay of the form from which you wish to extract information.  As seen in the photo to the left, you draw zones over the image where you want to capture typed text (Optical Character Recognition, or OCR), handwritten text (Intelligent Character Recognition, or ICR) or even check boxes (Optical Mark Recognition, or OMR).  Once the template is created, the next time the system encounters this type of form these fields will be captured automatically, eliminating manual data entry.
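
To make the template idea concrete, here is a minimal sketch in Python.  Everything in it is an assumption for illustration: the zone coordinates, the field names, and the `recognize` callable, which simply stands in for whatever recognition engine a real product would use.

```python
from dataclasses import dataclass
from enum import Enum


class Recognition(Enum):
    OCR = "typed text"
    ICR = "handwritten text"
    OMR = "check box"


@dataclass(frozen=True)
class Zone:
    field_name: str
    recognition: Recognition
    # Zone rectangle in pixels on the scanned image.
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class FormTemplate:
    document_type: str
    zones: tuple


def capture(template, recognize):
    """Apply every zone in the template to one scanned page.

    `recognize` is a placeholder callable (zone -> str); it is not the API
    of any particular capture product.
    """
    return {zone.field_name: recognize(zone) for zone in template.zones}


# Hypothetical template for an application form.
template = FormTemplate(
    document_type="Application",
    zones=(
        Zone("last_name", Recognition.OCR, 120, 200, 400, 40),
        Zone("signature_date", Recognition.ICR, 120, 900, 300, 40),
        Zone("us_citizen", Recognition.OMR, 560, 640, 30, 30),
    ),
)

# Stub engine that only reports which recognition type would run per zone.
fields = capture(template, lambda zone: f"<{zone.recognition.name} result>")
```

The point of the sketch is that the template is just data: once the zones are defined, every later form of the same type runs through the same `capture` step without anyone redrawing anything.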

One of the most important objectives of any data capture system should be the quality of the information being captured, not just the raw speed of the system.  The accuracy of captured information depends on many factors, including original document quality, image enhancement and scan resolution, but a critical step is to validate, or verify, any questionable data BEFORE it enters your information system.  There are many effective methods for capturing highly accurate data, including field-level logic: a Social Security Number field, for example, should contain only numbers, never letters, so the number “5” would not be incorrectly recognized as the letter “S”.  In a perfect world you would hope for no verification at all, but this is simply not reasonable all the time.  A good rule of thumb is that 2% verification is acceptable, which means 98% of the work is done for you quickly and automatically.  This translates into major efficiency gains.
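
The Social Security Number rule above could be sketched roughly as follows.  This is only an illustration under my own assumptions: the look-alike table is a small hypothetical sample, and the function name `validate_ssn` is made up for this post.

```python
import re

# Letters that recognition engines commonly confuse with digits.
# Illustrative only; a real system would tune this list.
LOOKALIKES = str.maketrans({"S": "5", "O": "0", "I": "1", "l": "1", "B": "8"})

SSN_PATTERN = re.compile(r"\d{9}")


def validate_ssn(raw):
    """Return (value, needs_review) for a recognized SSN field.

    Because the field may only contain digits, look-alike letters are
    mapped back to digits. Anything that still fails the nine-digit check
    is returned unchanged and flagged for manual verification.
    """
    cleaned = raw.replace("-", "").replace(" ", "")
    corrected = cleaned.translate(LOOKALIKES)
    if SSN_PATTERN.fullmatch(corrected):
        return corrected, False
    return raw, True


auto_fixed = validate_ssn("123-4S-6789")
flagged = validate_ssn("123-4?-6789")
```

Here `auto_fixed` passes straight through with the “S” read as a “5”, while `flagged` is the kind of value that lands in the small verification queue before it reaches your information system.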

A key misconception about Data Capture, or Forms Processing, is that integration into back-end systems needs to be complicated or costly.  While this can be true, the fact of the matter is that all electronic information systems rely on some flavor of a database.  And basically a database is composed of a bunch of tables with fields.  In the context of Forms Processing, think about a table of Document Types.  In the Document Types table you have the various types of documents you wish to capture, and the Fields are the index values you wish to extract from an image.  So the real magic is “matching” the extracted index values to the fields in the database.  I think the term “Field Mapping” most accurately describes this integration of Data Capture technology with Electronic Information Systems.  Fortunately, new trends in open connectivity such as Web Services and Content Management Interoperability Services (CMIS) are making the connectivity between Capture and Storage much more affordable and less time-consuming than ever.
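
Here is a minimal sketch of what “Field Mapping” can look like, using Python's built-in `sqlite3` module as a stand-in for the back-end system.  The table, the column names and the `FIELD_MAP` dictionary are all hypothetical, and a real integration might go through Web Services or CMIS rather than writing SQL directly.

```python
import sqlite3

# Hypothetical mapping: capture field name -> database column name.
FIELD_MAP = {
    "Invoice Number": "invoice_no",
    "Vendor": "vendor_name",
    "Total": "total_amount",
}


def store(conn, table, extracted, field_map):
    """Insert one captured document into `table` using the field mapping.

    Table and column names come from trusted configuration here; only the
    captured values are passed as bound parameters.
    """
    mapped = [(field_map[name], value)
              for name, value in extracted.items() if name in field_map]
    columns = ", ".join(column for column, _ in mapped)
    placeholders = ", ".join("?" for _ in mapped)
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        [value for _, value in mapped],
    )
    conn.commit()


# In-memory database standing in for the electronic information system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (invoice_no TEXT, vendor_name TEXT, total_amount REAL)"
)

extracted = {"Invoice Number": "INV-1001", "Vendor": "Acme Supply", "Total": 249.50}
store(conn, "invoices", extracted, FIELD_MAP)
row = conn.execute(
    "SELECT invoice_no, vendor_name, total_amount FROM invoices"
).fetchone()
```

The mapping dictionary is the whole integration in this sketch: change the right-hand names and the same captured fields land in a different system.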

As I mentioned earlier in this blog post, all applications have some flavor of a database to store information.  It’s just a fact of how things operate, and if you really think about it, all we have to do is match Data Capture fields with database fields to make a fully integrated Data Capture solution.  Oftentimes we get wrapped around the axle on the technical details, but when we simplify integration to its lowest common denominator we can truly dispel the myth that Forms Processing is too complicated or expensive for everyone to utilize.

Now that I’ve covered the basics of Forms Processing and illustrated that interoperability can be achieved rather easily in certain cases, I hope that we can move out of the stone age of manual data entry and realize a truly efficient organization with Automatic Data Capture.

AIIM has just published a whole suite of educational videos on a collection of interesting topics including one on Information Capture (http://www.aiim.org/Training/Certification/Get-Trained/Videos/Capture-Manage).
