Increase ECM Automation Processes With Higher Resolution Scanning

Source: Business Solutions Magazine


Written by: Kevin Neal, product manager – production scanners, Fujitsu Computer Products of America

When we talk about software automation, it’s safe to say that we truly live in remarkable times. Automation, as it will be referred to in this article, can be defined as allowing a computer to accomplish tasks that traditionally took human intervention and/or action to complete. The rapid adoption of automation via software is driven by several basic technical factors, including high-powered, affordable CPUs (more cycles and lines of code executed per second), drastic increases in memory capacity in conjunction with reduced prices, and the ever-evolving intelligence within software packages. The computing resources behind all of these advancements are helping to reduce costs, improve efficiencies, and assist with compliance and regulation.

Software automation is becoming more pervasive among ECM (enterprise content management) and document scanning solutions. The virtue of implementing ECM solutions has historically been cost reduction, which could mean decreased headcount or the reallocation of employee resources to other business units. It may even mean tangible savings such as reduced mailing and shipping charges, the elimination of expensive fax transmissions, or the recovery of physical storage space by removing cabinets and file drawers.

Because of computing advancements, businesses and organizations are no longer asking whether ECM systems are truly viable. Instead, they are asking more pointed questions about how large the return on investment is and how quickly they will realize it. In fact, according to Gartner, Inc., the worldwide ECM software market is expected to grow more than 12% per year through 2010, from $2.6 billion in 2006 to more than $4.2 billion in 2010. These days, it’s more about which hardware, software, and services best fit the needs rather than whether or not to put a solution in place.

With most of the pain points of DIP (document image processing), DIM (document image management), and/or ECM solutions behind us, we now have an opportunity to do more remarkable automation tasks with software. But the success or failure of the entire system is closely tied to the ‘on-ramp’ of electronic document automation, and to your document scanner in particular. In the next few paragraphs, I’ll examine several important software automation solutions from some of the premier forms processing and capture software companies in the industry.

High Resolution Maximizes Recognition Results (Contributed by ABBYY)
When scanning for OCR (optical character recognition) or data capture, start with an excellent quality original. This may be the single most important consideration to achieve optimal results for recognition and capture, as well as for the purposes of long-term preservation. In fact, using a high-quality image takes on increasing importance as more users depend on electronic documents to take the place of paper-based originals because of the searchability and cost savings. On the downside, once scanned, the paper document is often no longer available — so it is important to retain maximum quality from the outset.

Today, 300 dpi (dots per inch) color remains the gold standard for scanning. However, high-quality grayscale is an option when color is not achievable (since color scanning often results in 32-bit files). Whenever possible, maintain color images. Color provides additional depth, which enhances the ability of recognition software to gather additional information about the scanned document in order to maximize accuracy. In short, consider quality first when scanning for recognition and archiving.

Classification Of Forms (Contributed by ReadSoft)
Organizations are turning to one portal for all incoming documents — no matter if they arrive on paper or in electronic form. Technology is available to automatically sort incoming documents and classify them according to case. This enables the simple inputting of all incoming mail into a scanner (without any separator sheets) and lets the computer sort the documents. If documents arrive in electronic form, they are also easily incorporated into the flow. By digitizing paper documents through high resolution scanning, users can easily search and retrieve all incoming mail. What will this do for an organization? Efficiency increases when each and every document is distributed correctly. Fast access to status reports and audit trails gives users better control over information flow. In addition, a smooth integration with back end systems such as customer management applications, databases, and archives boosts the performance of IT systems. The overall result of high resolution scanning is automated classification and sorting — less need for document preparation, one portal for all incoming documents, (paper and electronic), electronic distribution to authorized staff, and control of information flows.
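As a deliberately simplified sketch of the idea above, automatic classification can be thought of as rules applied to the recognized text of each incoming page; real products use layout analysis and far more sophisticated techniques. All queue names and keywords below are invented for illustration:

```python
# Hypothetical sketch: route each incoming document to a case queue by
# matching keywords in its OCR'd text. Rules are checked in order.
RULES = [
    ("invoice",   ["invoice", "amount due", "remit"]),
    ("complaint", ["complaint", "dissatisfied", "refund"]),
    ("order",     ["purchase order", "quantity", "ship to"]),
]

def classify(ocr_text):
    """Return the first matching queue, or 'manual_review' if none match."""
    text = ocr_text.lower()
    for queue, keywords in RULES:
        if any(k in text for k in keywords):
            return queue
    return "manual_review"

print(classify("Please remit the amount due by June 1."))  # invoice
```

Documents that match no rule fall through to a manual-review queue, which mirrors how these systems keep a human in the loop for the small fraction of mail that cannot be sorted automatically.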

300 dpi — Friend Not Foe For Automated Document And Data Capture (contributed by AnyDoc Software, Inc.)
The idea that scanning documents at 300 dpi will create backlogs and bottlenecks within automated document and data capture solutions is an outdated myth. In fact, many solutions default to 300 dpi to maximize character recognition, with little or no adverse impact on processing speed, transmission speed, or storage capability, and with a great positive impact on recognition accuracy. And when processing healthcare forms such as explanations of benefits (EOBs), Health Care Financing Administration (HCFA) claim forms, and Uniform Bills (UB04s), which are notorious for their small fonts and extremely high character density per page, proper resolution is critical. At a 300 dpi setting, recognition engines are optimized and file size is still very manageable. Because the average size of a 300 dpi 8.5” x 11” bi-tonal TIFF image is 40 KB, approximately 3,000,000 document images can be stored on a standard 120 GB hard drive.
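The storage claim above is easy to verify with back-of-the-envelope arithmetic, assuming decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes):

```python
# Back-of-the-envelope check: how many 40 KB images fit on a 120 GB drive?
avg_image_kb = 40      # typical 300 dpi bi-tonal letter-size TIFF
drive_gb = 120

images = (drive_gb * 10**9) // (avg_image_kb * 10**3)
print(f"{images:,} images")   # 3,000,000 images
```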

In decades past, files competed for space that was limited and expensive, but no more. Now, a 40 KB file travels on today’s fast networks at what can be conversationally considered the speed of light. A lower scanning resolution can negatively impact data recognition, and that loss is not offset by the saving of space, which is no longer the limited commodity it once was.

And, some of the better document processing packages will process at 300 dpi but output at a lesser resolution (e.g., 200 dpi), giving you the best of both worlds. Scanning at a higher resolution can dramatically improve data recognition, decrease the need for human intervention, and increase the efficiency of all downstream applications without negatively impacting electronic transmission or storage space.
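To make the process-high, output-lower trade-off concrete, here is a quick sketch (plain arithmetic, no imaging library) of the pixel dimensions involved for a letter-size page:

```python
# Recognition runs on the 300 dpi scan; the archived output is downsampled.
def pixels(width_in, height_in, dpi):
    """Pixel dimensions of a page scanned at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)

scan_dims = pixels(8.5, 11, 300)     # 300 dpi image fed to the OCR engine
output_dims = pixels(8.5, 11, 200)   # 200 dpi copy written to storage

print(scan_dims, output_dims)   # (2550, 3300) (1700, 2200)
```

The OCR engine sees more than twice as many pixels as end up in the archive, which is exactly why recognition accuracy improves while stored file sizes stay modest.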

More Dots Per Inch (dpi) For Increased Automation
So, maybe now you’re thinking, “Of course I want everything automated, so I’ll scan everything at 300 dots per inch, in color, or both.” Well, not so fast. First, we must weigh the risks against the rewards of this type of decision, as will be addressed in an upcoming article entitled “Trends Towards Higher Resolution Scanning.”

To quote Gartner, “The quality, performance, and ease of use of software products will improve.” This will help drive adoption; however, settling for anything less than maximum software automation, and thereby accepting an inefficient document capture solution, should be unacceptable these days considering the pros and cons of higher resolution scanning.

In a day and age where no two ECM solutions are built alike, and organizations have choices for software automation components, it’s important to implement the best-of-breed solutions that garner optimal automation results. Whether it is OCR, ICR (intelligent character recognition), forms processing, separation, classification, unstructured forms processing, or bar code recognition, each step in the automation workflow depends directly on a prior event, and it all starts with document scanning. As more desktop scanners are deployed throughout organizations, there is certain to be an ever-increasing demand for ease of use and automation. Give your ECM solution the best chance for automation success and don’t underestimate the trend towards higher resolution scanning.

For more information on topics covered in this article or more information in general please visit:

Fujitsu –


AnyDoc Software –

ReadSoft –

Kevin Neal, product manager – production scanners, with Fujitsu Computer Products of America has been involved in the document scanning/enterprise content management industry for over 18 years. He has held various customer service, sales and management positions for many hardware and software products during his career. In addition, he has years of experience installing, configuring, and troubleshooting networking components as a consultant and network administrator. Currently he handles product management responsibilities for Fujitsu’s complete line of production scanners.


Economies of scale: Cloud processing


Document Capture technology has been available for many years and is a proven method to decrease operational costs and improve business efficiency. However, this technology has traditionally been expensive to purchase, implement, and deploy.

In organizations, small or large, Information Technology systems are composed of similar components: hardware, software, and services. The emergence of Cloud Computing offers a new way to provide workers with technology such as Advanced Data Capture that was traditionally available only to Enterprise organizations due to high cost and technical complexity. Now organizations of all sizes, in many different industries, can benefit from the economies of scale of Cloud Processing as a service.

Businesses purchase computer hardware as a resource for workers to get their jobs accomplished. As it relates to capturing data from paper documents specifically, these businesses can benefit more than ever from advances in technology. For example, most smartphones today are equipped with cameras capable of acting as portable image acquisition devices. Or, to capture higher volumes of documents, a business might choose to purchase dedicated scanners or use the office copy machine’s scanning functionality. The point is that there is still a certain amount of equipment that a business needs to function.

The fast-growing popularity of Cloud-based storage also makes Advanced Data Capture as a cloud service extremely logical and quite complementary. Billions of users currently use some form of cloud storage, whether they are business users of hosted applications, members of social networks such as Facebook, or users of hybrid applications such as LinkedIn. Additionally, and especially with the undeniable trend of using mobile devices for business data consumption, it makes perfect sense to allow these devices to also contribute information easily via advanced data capture. Consuming information on a mobile device is easy, but adding a business contact, for example, is difficult and frustrating with small display sizes and awkward virtual software-only keyboards.

One of the most logical services to move to Cloud Computing is Data Capture. Why? Data Capture is a service, and with cloud computing an organization can ‘rent’ it as a shared resource. Since data capture doesn’t store images or information, it’s ideal for sharing, which lowers the cost of using the service.

Cloud Capture is appealing for many reasons.

First, it allows small and medium-sized businesses the opportunity to finally realize the benefits of Advanced Data Capture by sharing resources. This reduces total cost of ownership because these companies ‘rent’ the data capture service. Second, it allows organizations to start using the technology quickly because they do not have to install, configure, or maintain these services. This is all taken care of by the hosting company, which allows organizations to focus on their core business instead of being burdened by supporting technology.


Additionally, a Cloud Capture platform is also appealing to Enterprise customers. Why? Any large organization typically has many different departments, such as Administrative, Marketing, Sales, Purchasing, Accounting, and others. Also, the Information Technology (IT) department typically uses many software applications and services to support the business units. With the emergence of Cloud Computing, and with more and more corporations moving applications to ‘the cloud’, one service that makes the most sense is Data Capture. Since Data Capture truly is ‘a service’ and does not store data permanently, capture technology infrastructure is ideal for Cloud Computing. Scalability to add capacity or seamlessly incorporate new services is an added benefit.


Demystifying Forms Processing and Data Capture


Forms Processing is a proven technology that allows organizations of all sizes to improve efficiency and decrease operational costs. There are many case studies available online to support these facts. When implemented properly, the cost of a Forms Processing solution can easily be justified with a tangible 12-18 month return on investment. With such overwhelming evidence of decreased operational costs and drastically improved efficiency, a logical question is: why isn’t every business in the world using this wonderful technology? Traditionally, only large organizations with dedicated technical staff and enormous IT budgets could consider implementing a sophisticated Data Capture solution, but times are changing. No longer does it have to take years to realize the benefits of Forms Processing once available only to Fortune 1000 companies. In this blog post I hope to dispel the myth that this useful technology is available only to Enterprise organizations.

While the concept of automatically extracting information from a hard copy document is not new, what is new is the method of implementation. Specifically, the “cloud” offers an intriguing opportunity for Data Capture. Why? First, Data Capture is a very CPU-intensive process, and the cloud offers unmatched processing power within gigantic data centers. Second, sharing resources and ‘renting’ a cloud service such as ‘Cloud Capture’ lowers the barrier to entry. The upfront cost of implementing Data Capture no longer has to be an issue; the cost can now be an operating expense rather than a capital expenditure.

I have written previously about the “No Folder Zone”, and in this blog post I will elaborate on the alternative to using folders as a cop-out for a truly effective Information Capture solution. In a traditional on-premise software environment, the Forms Processing system must be installed, tuned, and tested before it is ready for deployment. This is the point where the Document Capture system crosses the chasm and the organization can truly turn its up-front investment into benefit.

The basics of Forms Processing are quite simple and straightforward. The idea is to create a template overlay of the form from which you wish to extract information. You basically draw zones over the image where you can capture typed text (Optical Character Recognition, or OCR), handwritten text (Intelligent Character Recognition, or ICR), or even check boxes (Optical Mark Recognition, or OMR). After the template is created, the next time the system encounters this type of form, these fields will be captured automatically, eliminating manual data entry.
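A template of this kind can be pictured as a simple data structure: named zones over the page, each tagged with the recognition engine to apply. The field names and pixel coordinates below are invented for illustration:

```python
# Hypothetical forms-processing template: zones drawn over the page image.
TEMPLATE = {
    "form_type": "claim_form",
    "zones": [
        {"field": "patient_name", "engine": "OCR",      # typed text
         "box": (120, 80, 520, 110)},                   # (left, top, right, bottom) in px
        {"field": "signature_date", "engine": "ICR",    # handwriting
         "box": (120, 640, 320, 680)},
        {"field": "consent_given", "engine": "OMR",     # check box
         "box": (540, 640, 560, 660)},
    ],
}

def fields_for_engine(template, engine):
    """List the field names a given recognition engine must process."""
    return [z["field"] for z in template["zones"] if z["engine"] == engine]

print(fields_for_engine(TEMPLATE, "OCR"))  # ['patient_name']
```

Once a page is matched to a template, the capture engine simply crops each box and hands it to the appropriate recognizer, which is what makes the approach so repeatable.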

One of the most important objectives of any data capture system should be the quality of the information being captured, not just the pure speed of the system. The accuracy of captured information depends on many factors, including original document quality, image enhancement, and scan resolution, but a critical step is to validate, or verify, any questionable data BEFORE it enters your information system. There are many effective methods for capturing highly accurate data, including logic such as requiring that a Social Security Number field contain only numbers, so that the number “5” is not incorrectly recognized as the letter “S”. In a perfect world you would hope for no verification at all, but this is simply not always realistic. A good rule of thumb is that a 2% verification rate is acceptable, which means 98% of the work is done for you quickly and automatically. This translates into major efficiency gains.
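The digits-only rule mentioned above can be sketched in a few lines: substitute commonly confused characters in a numeric field, then flag anything still questionable for human verification. The confusion table and field rule are illustrative, not from any particular product:

```python
import re

# Characters OCR commonly confuses with digits in a numeric-only field.
CONFUSIONS = str.maketrans({"S": "5", "O": "0", "I": "1", "B": "8"})

def validate_ssn(raw):
    """Return (value, needs_verification) for a Social Security Number field."""
    cleaned = raw.translate(CONFUSIONS)
    ok = re.fullmatch(r"\d{3}-?\d{2}-?\d{4}", cleaned) is not None
    return cleaned, not ok

print(validate_ssn("123-45-678S"))  # ('123-45-6785', False)
```

Fields that still fail the rule after cleanup are routed to an operator, which is exactly the small verification queue the 2% rule of thumb refers to.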

A key misconception about Data Capture, or Forms Processing, is that integration into back-end systems must be complicated or costly. While this can be true, the fact of the matter is that all electronic information systems rely on some flavor of a database, and a database is basically composed of tables with fields. In the context of Forms Processing, think about a table of Document Types. The Document Types table holds the various types of documents you wish to capture, and the fields are the index values you wish to extract from an image. So the real magic is “matching” the extracted index values to the fields in the database. I think the term “Field Mapping” most accurately describes this integration of Data Capture technology with electronic information systems. Fortunately, new trends in open connectivity such as Web Services and Content Management Interoperability Services (CMIS) are making the connection between capture and storage more affordable and less time-consuming than ever.
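At its lowest common denominator, field mapping is just a lookup from captured field names to database column names. The table and column names below are hypothetical:

```python
# Sketch of "field mapping": captured field -> back-end database column.
FIELD_MAP = {
    "InvoiceNumber": "invoice_no",
    "InvoiceDate":   "invoice_date",
    "TotalAmount":   "total_amount",
}

def map_fields(extracted):
    """Translate a capture result into a row keyed by database columns."""
    return {FIELD_MAP[k]: v for k, v in extracted.items() if k in FIELD_MAP}

row = map_fields({"InvoiceNumber": "INV-1041", "TotalAmount": "249.00"})
print(row)  # {'invoice_no': 'INV-1041', 'total_amount': '249.00'}
```

Everything beyond this lookup, such as transport via Web Services or CMIS, is plumbing; the mapping itself is the integration.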

As I mentioned earlier in this blog post, all applications have some flavor of a database to store information. It’s just a fact of how things operate, and if you really think about it, all we have to do is match Data Capture fields with database fields to make a fully integrated Data Capture solution. Oftentimes we get wrapped around the axle on the technical details, but when we simplify integration to its lowest common denominator, we can truly dispel the myth that Forms Processing is too complicated or expensive for everyone to utilize.

Now that I’ve covered the basics of Forms Processing and illustrated that interoperability can be achieved rather easily in certain cases, I hope we can move out of the stone age of manual data entry and realize a truly efficient organization with Automatic Data Capture.

AIIM has just published a whole suite of educational videos on a collection of interesting topics, including one on Information Capture.

Technology = Positivity


Despite the current economic situation and unprecedented turmoil in the financial markets, I remain extremely positive about the future. In particular, I continue to pay close attention to IT (Information Technology) spending and try to predict whether it might have a positive or negative impact on the industry I work in: document scanning and Enterprise Content Management (ECM) solutions. This might sound a bit self-centered to those who might not know me well and have stumbled across my blog, but I can assure you this is not the case. I deeply care about others and worry about the near-term future of those who have been negatively impacted by the unmanageable financial greed of a small group of individuals, greed that has caused such misery for us ‘normal’ folks.

My philosophy, however, is as follows:

You have to take care of yourself first, put things in perspective, get a plan, work harder, work smarter, and know that you can only help others one person at a time. In other words, as hard as you might try, you can’t change the world overnight. It’s like building a skyscraper: you have to create a solid foundation with solid blocks at the core of the building, or else everything you are working to establish could eventually come crumbling down.

So why do I remain positive about the future in spite of such negativity? Am I trying to fool myself? No. Is it the new Administration with their positive attitude? No. Is it what some book suggested individuals do in tough economic times? No; although I have read about this strategy and believe having a positive attitude is the right thing to do for yourself and, more importantly, for others, this is not the case either. After all, I’m a realist as well, and I think being positive for the sake of being positive is not only phony, but also transparent and insincere. Be real, accept the facts, and deal with it. The reason for my positive attitude is simple: Technology.

One disclaimer before I try to explain myself. My blog is new, and you might or might not know my personality. Before I go on a diatribe about a particular topic, I would like to share a little insight about myself and why I think the way I think. It’s not about me, per se; it’s about giving you the perspective of where I’m coming from. So, basically, I just wanted to spend some extra time sharing a bit of my personality in these early posts for the benefit of anyone who might go back in time to read my gibberish.

Ok, Kevin, *focus*, back on topic now. :) One of the other personality traits you will get to know is my mind’s tendency to wander incessantly. *Thinking to myself* “What have I accomplished that is of use to anyone reading this blog thus far?” *Answering myself* “Establish topic = check. Share a bit about yourself = check. Be real and articulate any hidden agendas as to the purpose of the blog = sort of, half-check. Quantify established topic = oh yeah, that’s where I was.”

Why Technology = Positive outlook

It’s easy to forget in this day and age, February 2009, how far we’ve progressed technology-wise in a fairly short period of time. I’m sure all of us base the majority of our opinions on personal experience, and I was just hit with two blatant examples. First, I took a moment to reflect on this blog I’m creating. You would not believe how incredibly simple it was to set up, maintain, and contribute to. This would probably have been far more difficult to do only five or so years ago. Now I am sharing with you, in the virtual world, more easily, quickly, and efficiently than ever. Example two: tonight is the 2009 NBA All-Star game in Phoenix. I must say the introductions were extremely entertaining and fun, but once again I digress. Anyhow, near the end of the first quarter, the TV broadcast suggested that viewers hop on the internet and ‘connect via Facebook’ to something they had set up online. I’m not sure what it was they created on Facebook, but it’s just an example of how ubiquitous all this technology is becoming. It’s everywhere, all the time.

Why is this technology different?

Simple. Information comes to you instead of you going to it. Take e-mail as an example: I send a message, you read the message, you respond to the message. Voila, we communicated full-circle. Unlike e-mail, with new services available to the masses such as Facebook, MySpace, Twitter, Digg, and others, the information is automatically fed to you 24×7. I don’t know about you, but I’m feeling overwhelmed with information. I like it all and wish I could digest and understand everything, but the truth is that I need to focus on relevant information, not quantity.

In retrospect, do you know where I’m going for more of my BUSINESS information these days? Yep, you guessed it: the Internet and its vast variety of blogs, tweets, status updates, and diggs. Naturally, I take a lot of this information with a grain of salt, because anyone with an internet connection now has the ability to publish a ‘news’ report, a capability once reserved for the likes of Wolf Blitzer, Shepard Smith, and Ted Koppel. The playing field has changed.

Danger Will Robinson

As much as I love technology, there are clearly pitfalls, as you all know, so there is no need to pontificate here. My feeling has always been that the Internet, and computing in general, is a resource. If people abuse the resource, that’s one thing. If the resource isn’t viable, that’s another. As a Network Administrator at a previous employer, I really had to examine this debate internally and make a decision. Do I run wide-open with unlimited access, like the Wild West? Do I lock down everything, like Alcatraz, and make access non-existent? Or do I create a blend of the two, balancing access against security? Regardless of what my ultimate decision was, it’s irrelevant here. Why? What I did doesn’t necessarily apply to your particular situation. What is right for one person isn’t necessarily right for the next.

I will tell you one thing: I trust people in general, and I trust people to use a computer responsibly. As a Net Admin, I felt you get one ‘free pass’ on my network; then, if you cause me and/or the other users undue hardship, I’ll lock you down tighter than Fort Knox! *That’s my tough Clint Eastwood impression, by the way.* Overall, I think we had relatively few problems and a mutual understanding of what was right and wrong behavior on our computer and phone network. My Net Admin counterpart was most helpful, and I appreciated her help during my time in this role.

Opportunities, fun and profitability

The future is exciting, and I honestly believe this era will truly be remembered as the technological revolution that increased efficiency in a tough economic environment. Recently I’ve been fortunate enough to travel a little and get some sincere feedback from our customer base on the pulse regarding technology. Whether it’s my company’s technology or someone else’s, I think the general sentiment is what matters most. My unscientific finding is that businesses and organizations will still invest in technology as long as there is a proven Return on Investment (ROI). Hallelujah! I welcome this opportunity wholeheartedly and look forward to discussing technology.

As I stated before in this post, information in general is overwhelming, and the available information technology (IT) solutions these days are much the same. It’s daunting. It’s hard enough to be an expert on one area, such as security or access, and still be expected to be knowledgeable about things such as Search and Networking. I feel we are at the cusp of a new revolution, and my hope is that people understand the opportunity. In other words, when times are tough, this is the opportunity to position yourself for a better future. Study more, pay even more attention to your surroundings, and care about those around you. Know that even good people and great employees are suffering in these down times, but technology can help everyone in the long run.

We must remember, though, that technology itself is stupid; it needs people to develop, tailor, and build it. We all have that opportunity now. To use another analogy, you cannot fit a square peg into a round hole. Technology must fit the business, and not the other way around. Every business and organization operates differently, even within the same line of business. Identifying the process and finding ways to improve it is far better than forcing technology into an application where it might not fit.

Now that I must wrap up this long post, I realize that I’ve missed all the technology points I hoped to address. This is a good thing. I wanted you to know more about me in this inaugural posting, more than I wanted to hit every point. Mission accomplished? Yes, we can? Change is here?

In future installments of my blog I hope to discuss topics I’m interested in such as Cloud Computing, Virtualization, Network Attach, OCR, ECM, DIM, scanning, toolkits, Open Source, SharePoint, capture, unstructured document processing, business process management, ICR, Web 2.0 (has this ever been defined by-the-way?), social networking and discussion boards/forums.

Thanks for reading; I appreciate your time. I highly encourage your comments on my blog.