The logic of document capture

Indexing, Metadata, Keyword, SharePoint, Capture, Scanner, Documents, ECM, Content Management

What is wrong with the collection of words above?  Well, the terms are closely related, but they have no logical structure, so they are of little value to anyone reading them.  For these words to carry meaning in context they need to be logically organized into a sentence.  The logic of document capture and Enterprise Content Management is much the same.  In this blog post, instead of going into the nuts and bolts of document capture, I thought it more important to discuss two components that are critical to the success, or failure, of your content management strategy.  These two critical components are taxonomy and metadata.  This is philosophy and not technology.

To break document capture down to its simplest form, think of it as the process of extracting information from a document and making that information available in the future.  The future could be immediate, where a scanned invoice, for example, immediately kicks off a payment process.  Or it could be two weeks from now, when a customer service agent needs to retrieve a signed airbill as proof of delivery.  The point is that document retrieval is based on some unique keyword, or a set of keywords, related to a particular document.  In the case of the invoice it could be the invoice number, and in the case of the airbill it could be the shipping tracking number.
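To make the idea of keyword-based retrieval concrete, here is a minimal Python sketch.  The document IDs, field names and values are all made up for illustration; a real capture system would store this index in its content management repository rather than in memory.

```python
from collections import defaultdict

# Hypothetical captured documents: each carries the keywords extracted at scan time.
captured = [
    {"doc_id": "scan-0001", "doc_type": "invoice",
     "keywords": {"invoice_number": "INV-10423"}},
    {"doc_id": "scan-0002", "doc_type": "airbill",
     "keywords": {"tracking_number": "794600001234"}},
]


def build_index(documents):
    """Map each (field, value) pair to the IDs of the documents that carry it."""
    index = defaultdict(list)
    for doc in documents:
        for field, value in doc["keywords"].items():
            index[(field, value)].append(doc["doc_id"])
    return index


def find(index, field, value):
    """Return the IDs of documents indexed under the given keyword."""
    return index.get((field, value), [])


index = build_index(captured)
print(find(index, "tracking_number", "794600001234"))  # ['scan-0002']
```

The customer service agent in the airbill example only needs the tracking number to get straight to the signed document.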

Without a well thought-out strategy, your organization may accomplish nothing more than converting an organized paper mess into an electronic mess.

Establish a well thought-out taxonomy

Taxonomy, in biology, is defined as classifying organisms into groups based on similarities; for documents, it means grouping them in the same way.  Why is taxonomy relevant for document capture?  For several reasons, including security, quicker access to information and retention policies.  If you work backwards when deciding how, and with what technology, to implement your document capture solution, a solid consensus on the end result is of paramount importance.  The end result is typically a high-quality scanned image conducive to data capture (OCR, ICR, OMR, bar code, etc.) and the metadata itself.  If your taxonomy follows an organized methodology, it should make your document capture strategy fairly obvious.

Let’s take security as one benefit of a well thought-out taxonomy.  By segregating documents based on a logical taxonomy, organizations are afforded an additional level of comfort knowing that a set of security policies can be applied to each group.  For example, access to Human Resources documents can be restricted, while everyone is allowed access to a general set of scanned documents such as the café menu, which is clearly not a sensitive document.

Another benefit of a well thought-out taxonomy is quicker access to information for users.  Many content management software applications and search engines use a ‘crawl’ method to check for newly added content and add it to an index (database), which is then searchable.  Common sense dictates that crawling a narrower scope keeps the index up to date more quickly, and search times can also be considerably shorter because only the relevant portion of the index is searched rather than the entire database.

Lastly, having your data well organized is a major benefit for retention policies.  Imagine that an organization has all of its tax documents properly stored electronically, via a well thought-out taxonomy, in its content management system.  The organization can then easily, and within corporate governance standards and policies, remove these images from its repository based on a retention schedule.  So, as illustrated, investing the time to develop a strong taxonomy is important for many reasons, including security, searchability and retention.
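As a rough illustration of how a retention schedule can hang off the taxonomy, the Python sketch below checks whether a stored image is past its retention period.  The document classes and the retention periods are hypothetical placeholders, not legal or regulatory guidance.

```python
from datetime import date

# Hypothetical retention periods, in years, keyed by taxonomy class.
RETENTION_YEARS = {
    "Accounting/Tax": 7,
    "Human Resources/Resumes": 2,
}


def is_expired(doc_class, stored_on, today):
    """Return True when a document of this class has outlived its retention period."""
    years = RETENTION_YEARS.get(doc_class)
    if years is None:
        return False  # no rule defined for this class: keep the document
    try:
        expiry = stored_on.replace(year=stored_on.year + years)
    except ValueError:
        # stored on Feb 29 and the expiry year is not a leap year
        expiry = stored_on.replace(year=stored_on.year + years, day=28)
    return today >= expiry


print(is_expired("Accounting/Tax", date(2010, 4, 15), date(2017, 4, 15)))  # True
```

Because the rule is attached to the class rather than to individual files, a purge job only has to know where a document sits in the taxonomy.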

It is extremely important not to overlook this concept when planning a document capture strategy.  A simple taxonomy might be organized as shown below:

  • Accounting
    • Accounts Receivable
      • Check
      • Statement
    • Accounts Payable
      • Invoice
      • Receipt
  • Human Resources
    • Applications
    • Resumes
    • W2 Forms
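The outline above can be sketched as a small nested data structure in Python.  The `READ_GROUPS` policy and the group names (`finance`, `hr`) are assumptions added purely to show how a security rule could be attached to a branch of the taxonomy.

```python
# The sample taxonomy from the outline, as a nested structure.
TAXONOMY = {
    "Accounting": {
        "Accounts Receivable": ["Check", "Statement"],
        "Accounts Payable": ["Invoice", "Receipt"],
    },
    "Human Resources": ["Applications", "Resumes", "W2 Forms"],
}


def all_paths(node, prefix=()):
    """Yield every leaf of the taxonomy as a tuple path, e.g. ('Accounting', ...)."""
    if isinstance(node, dict):
        for name, child in node.items():
            yield from all_paths(child, prefix + (name,))
    else:
        for leaf in node:
            yield prefix + (leaf,)


VALID_PATHS = set(all_paths(TAXONOMY))

# Hypothetical access policy: top-level branch -> groups allowed to read it.
READ_GROUPS = {"Accounting": {"finance"}, "Human Resources": {"hr"}}


def can_read(path, user_groups):
    """Check a user's groups against the policy of the document's top-level branch."""
    if path not in VALID_PATHS:
        raise ValueError(f"not a known document class: {path}")
    return bool(READ_GROUPS.get(path[0], set()) & set(user_groups))


print(can_read(("Human Resources", "Resumes"), {"hr"}))  # True
```

Rejecting unknown paths at scan time is one simple way to force a review of the taxonomy whenever a new document type shows up.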

Developing a well thought-out strategy might seem cumbersome in the initial stages of establishing your document capture process, but it can save organizations significant time, money and aggravation in the long run.  As a document capture best practice, it is important to establish a solid taxonomy for scanned documents and to re-evaluate that taxonomy whenever new document types are introduced within your organization.

Consider what information is important, and what is not

Creating Searchable PDFs is one form of document capture; however, it is not always the ideal document capture strategy.  While creating Searchable PDF images of your scanned documents is sometimes the right approach for an organization, this technique often creates inefficiencies.  You might be asking yourself how creating a fully Searchable PDF, with all the words of the document indexed, could be construed as inefficient.  Let me elaborate.  When creating a Searchable PDF, the scanning software does its best to recognize every single character and every single word on a page.  This might sound appealing, but let’s consider the possible results in real-world applications.  Imagine that an organization in the insurance business scans as few as 100 single-page documents and creates Searchable PDF documents.  Later, a user wants to retrieve one of those documents and uses the word “claim” as the search criterion.  As you can imagine, the user would most likely be presented with a long list of links to possible documents, of which only one is the document they are looking for; the rest are “irrelevant search” results.  This is because the entire page was indexed via the Searchable PDF method.  Alternatively, if your data capture strategy extracts only the “relevant search” terms that apply to a particular document, you make the organization much more efficient because users can find the data they requested much more quickly, with the first search.
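A tiny Python sketch shows the difference between the two approaches.  The three documents, their text and the `claim_number` field are invented for the example; the point is only how many hits each kind of search returns.

```python
# Hypothetical scanned insurance documents with full OCR text and extracted metadata.
documents = [
    {"id": 1, "text": "Claim form. Claim number 1001. Please submit your claim by mail.",
     "metadata": {"claim_number": "1001"}},
    {"id": 2, "text": "Claim form. Claim number 1002. Claim adjuster notes attached.",
     "metadata": {"claim_number": "1002"}},
    {"id": 3, "text": "Policy renewal notice. No claim has been filed this year.",
     "metadata": {"policy_number": "P-77"}},
]


def full_text_search(docs, word):
    """Searchable-PDF style: match the word anywhere in the recognized text."""
    return [d["id"] for d in docs if word.lower() in d["text"].lower()]


def metadata_search(docs, field, value):
    """Relevant-metadata style: match one extracted field exactly."""
    return [d["id"] for d in docs if d["metadata"].get(field) == value]


print(full_text_search(documents, "claim"))                # [1, 2, 3]
print(metadata_search(documents, "claim_number", "1002"))  # [2]
```

With only three documents the full-text search already returns everything; at 100 documents or more the user is left to sift through the list by hand.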

Another significant benefit of an integrated document capture/content management strategy is that the metadata fields created, and the rules applied, in the content management system can often be brought forward and applied in the document capture system itself.  For example, if an organization’s policy dictates that the social security number metadata field on a healthcare insurance form is required and must be exactly nine numeric characters long, then these rules can be enforced directly in the document capture system.  This allows for great business continuity and consistency in your data capture process.
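The social security number rule described above (required, exactly nine numeric characters) might look something like this in Python.  The function name and the error messages are illustrative only; each capture product has its own way of expressing field rules.

```python
import re

# Exactly nine ASCII digits, nothing else.
SSN_PATTERN = re.compile(r"[0-9]{9}")


def validate_ssn_field(value):
    """Return a list of rule violations for the SSN metadata field (empty list = valid)."""
    if value is None or value == "":
        return ["social security number is required"]
    if not SSN_PATTERN.fullmatch(value):
        return ["social security number must be exactly nine numeric characters"]
    return []


print(validate_ssn_field("123456789"))  # []
print(validate_ssn_field("12345-789"))
```

Enforcing the rule at the point of capture means a bad value is caught while the paper is still in the operator’s hand, not weeks later in the repository.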

An analogy I like to use: go to your favorite internet search engine and enter a vague term such as “taxonomy for document capture”, and you will get a long list of ‘hits’ that probably are not of interest, because you might be looking for a specific piece of information or a scanned image.  On the contrary, if you enter a more specific term such as “aim document taxonomy”, the focus of the search is narrowed down to a more relevant list of the information you are searching for.  This is an example of relevant search versus irrelevant search, and it’s all related to applying metadata to web pages, electronic documents and, yes, especially scanned images.

Summary: Organized taxonomy + relevant metadata = Efficient process

In summary, my point is to carefully plan out your document capture process.  Pay close attention to developing an effective taxonomy for your documents.  Determine what information is important on a particular document and what is not.  Document capture technology has evolved to nearly magical proportions, but the truth is that organizations can still greatly improve their efficiency and content management effectiveness through careful planning; after all, there still is logic to document capture.

I am always open to constructive criticism.  I’m not always right, believe it or not, and I’m always willing to have a healthy debate about any topic.  I look forward to your feedback and comments.

Sincerely,

Kevin


私の日本語を体験 – My Japanese Experience

This week I made my fifth trip to Japan for business.  It had been two years since my last visit, and each time I come I gain a greater admiration for the Japanese people.

While I do enjoy seeing the sights and touring various areas of the country, what I enjoy most is the interaction with everyday Japanese citizens in hotels, on trains or in restaurants.  They go out of their way to make everyone feel comfortable and welcome. I think they are especially considerate to western travelers and out-of-country visitors in general.  I stick out like a sore thumb with my blond/reddish hair color and goatee, so it’s fairly obvious that I am a foreigner.

This particular trip was to the Tokyo area, so the travel wasn’t nearly as brutal as most of my other visits, but it still took a 10 hour flight, a 1 ½ hour bus ride and a 40 minute train ride to arrive at the hotel.  I hardly sleep on these trips, even though I try to adjust to the time change to avoid fatigue. I just sincerely enjoy the time here, so I get by pretty much on pure adrenaline, get through all the meetings, and then I’m worthless for the next week after I get home.

Tokyo Train Rail System

This was my first trip alone.  Typically I have traveled with co-workers who were familiar with the train and bus schedules.  This time I had to figure it out myself.  Surprisingly enough, I didn’t get lost once!  Everything here is small compared to the United States.  Cars are smaller. Hotel rooms are MUCH smaller.  Although there is a large population throughout the Tokyo area, I am always so impressed by the Japanese efficiency.  Trains absolutely arrive and leave on time, all the time.  If the train is to be there at 10:11 then it’s always there like clockwork, so don’t be late!  Things just simply happen quicker and there is no room for wasted energy.

It’s always refreshing to visit and be part of this society, even if it’s only for a few days.  Of course there are some negatives: most people don’t own vehicles, so you must rely on public transportation, and the fact that everything is small could throw someone with claustrophobia into an episode.  But aside from a few inconveniences and adjustments that would take some getting used to, much of the Japanese experience is what people should aspire to.  Be kind and considerate.  Work hard and be respectful.  And, above all else, don’t miss the last train home at night or you are stuck!

Sayonara!


Hey you, yeah you – get into my Cloud

A quick visual on the emerging Cloud Computing market to create some discussion.  Any thoughts, commentary or discussion on this topic are welcome.

Fun in the Sun & The “BurritoZilla”!

This weekend was fun. The first signs of Spring are here and the weather is wonderful. My best meteorological guess is that Saturday’s high was in the low 80s and Sunday was even better, in the low 90s.

Saturday we went out to get some household essentials but also some plants and herbs. Although we picked up a few herb plants, some of the stuff we want is a bit more particular, especially if you’ll be eating it (à la herbs). We thought it was best to call it a day on Saturday and then head down to a local plant and garden specialty place on Sunday. I don’t think the stuff at the specialty place was any better, and it was way more expensive. I think if you want quality then you have to grow it yourself.

After the essentials shopping and the plant/herb store we decided to get some lunch at a place we saw recently on the Travel Channel’s “Man v. Food”. The place is called Iguanas and their gigantic burrito is called the “BurritoZilla”. It’s 18 inches and 5 pounds of burrito gluttony! The atmosphere of the place was fun because it is tightly integrated into the college lifestyle. It’s on the San Jose State University campus. Brandee noticed that after we ordered our food and took our seats, nearly every patron in the place was staring (and very obviously so) at us while we tried to dine on this mammoth burrito. As much as we tried to inflict damage on this 18” BurritoZilla, we only managed to finish about 2 inches and roughly half a pound, leaving 16” and about 4.5 pounds of meat, beans and cheese to go. We brought the thing home, cut it into four full-size servings and froze them in bags as leftovers.

On Sunday we spent the day getting some additional planting supplies at Home Depot, went to Hooters for lunch, then came home to clean and open the pool for the season. Brandee loves to go to Hooters more than I do. Last year, before the place was open and while it was still under construction, I jokingly suggested that we go there as we drove by. She was quite open to the idea and ready to go. I wasn’t really shocked, but I was surprised at her level of enthusiasm. Now nearly every time we are in the Campbell area on the weekend, yep, it’s to Hooters we go! We really enjoy “people watching” in the place, especially how the girls treat the male clientele versus the female customers. It’s comic relief.

Our pool isn’t anything fancy, but it’s a great relief when the temperature gets hot and it’s nice to take a cool dip. The pool is an above-ground type that is 18 feet around and four feet deep. It’s certainly enough to float around in and be lazy. Earlier this week the temperature was getting into the low 90s, so we got to enjoy the water.

Overall it was a great, enjoyable, relaxing weekend and a fantastic kick-off to Spring.


Why a network scanner?

I often get asked this question so I decided to consolidate some of the compelling reasons organizations should consider dedicated network scanners:

Dedicated use device for scanning documents
• No need to wait for the copy machine to become free for use
• Versatile functionality without the compromise of added complexity
• Advanced scanning functions performed transparently to the user
• Ability to preview images before sending to destinations
• Simple operation easy to understand

[Image: General Office - Ease of Use and Functionality]

A dedicated scanning device that seamlessly integrates within an organization’s existing network infrastructure can be of tremendous value in enhancing work processes. Network scanners benefit organizations by decreasing complexity without compromising access to important functionality. From the users of these devices to network administrators and business managers, and basically the entire organization, businesses in a wide variety of markets are benefiting from network scanning.

Uptime/Reliability
• Access to scanning functionality is not hampered by other possible failures of a multifunction device
• Limited number of moving parts decreases the likelihood of hardware malfunction
• Straight-through paper path design helps decrease the possibility of document jams
• Network scanners inherit attributes designed for mission-critical document scanners

Organizations can only realize the true productivity enhancement of their IT investment when their systems are running at peak performance. Disruption in the work process wastes time, costs money and causes frustration among employees and customers alike. Dedicated network scanners have been designed with the sole purpose of document scanning and therefore contain the hardware and software attributes organizations expect, which makes them desirable in mission-critical business applications. Organizations of all sizes have sought the quality and reliability of single-function document scanners for years. One cost that is sometimes hard to measure, for example, is the productivity lost to a mechanical malfunction of a multifunction copier. Downtime for maintenance is simply not an option.

Ease of Use
• Eliminate complexity and provide simple operation with a large touch screen
• Simple touch screen driven scanning operation eliminates specialized training
• In the unlikely event of a document jam, easy jam recovery without damaging documents
• Multiple language support
• Programmable job function buttons can perform repetitive tasks with the touch of one button

[Image: Customization and Control - a customized, familiar user experience]

Large touch screen displays and integrated keyboards are two physical attributes which make digitizing documents with a network scanner simple. Similar to your own computer desktop at home, which you may have customized with a particular look and feel, business users get the most value out of technology when they are familiar with the presentation of interfaces and have the versatility to customize screens. Network scanners adhere to this principle as well. For example, user state migration among devices presents the user with the same, consistent experience based on their logon information, no matter which device they decide to use.

Quality
• Image quality built on experience focused on document capture technology
• Paper path designs are carefully engineered to excel at document handling, including some with the capability of scanning plastic cards
• Document scanning hardware and software integration with specialized content management application providers

The quality of IT products typically is not appreciated until there are operational disruptions caused by a failure, such as paper jammed in the device. Experience in developing feeding technology that efficiently handles documents of different shapes, sizes and weights has helped set dedicated document scanner vendors apart from other technology. Network scanners have inherited many of the qualities of traditional document scanners used in mission-critical applications and are bringing the opportunities of network scanning to organizations of all sizes. This focus on mission-critical scanning is evident in network scanners with specifically designed features, such as a straight paper path to reduce potential document jams and the ability to scan plastic cards through the document feeder. Additionally, the ability to preview images after scanning and before committing them to a destination is an example of a quality found in some network scanners.

Secure
• Restrict access to only authorized users with secure authentication
• User data such as username/password or image data does not reside on the scanner
• Data is encrypted on the device to provide an additional level of security
• No external USB port to hijack sensitive information
• Highly secure login authentication and transmission protocols (SSL)
• Lock-down job profiles to adhere to organization established policies

[Image: Login - Security and Authentication for compliance]

Whether it is for regulation, compliance or other reasons, data security plays a major role for network scanners. Because the device is ‘always on’ and connected to the corporate network, the risk of a data compromise has to be carefully considered. Access to the devices themselves, the manner in which information is electronically communicated, and the level of functionality provided to particular users or groups all need to be thought through thoroughly. Network scanners provide these security features to help organizations use devices in a manner which adheres to their specific established policies. The threat of data compromise comes in many forms: not only externally but also internally, and sometimes inadvertently rather than maliciously.

Total cost of ownership
• Decrease deployment costs with remote administration tools
• Reduce ongoing maintenance costs with ability to push updates to devices from a centralized location
• Utilize existing network resources and systems to conserve budget
• Inexpensive and user replaceable consumables

Stretch your budget further with a dedicated network scanner, starting with simple initial deployment of devices. Simply connect the scanner to the network; IT departments or network administrators can then remotely configure and manage devices. No longer do organizations have to incur the expense or time of sending technicians on-site to set up devices. Additionally, ongoing maintenance costs are drastically reduced by not having to replace expensive toner or fuser parts. Easily accessible, user-replaceable consumables provide a convenient way to keep the network scanner running at its best while decreasing the need for IT involvement.

[Image: Simple Deployment and Effective Device Management - Central Administration Server software]

Integrated for Business Process Improvement
• Direct connectivity to back-end systems
• Index values and metadata sent directly into Content Management repositories
• Database lookups for validation
• Image-enable your Line of Business application with Software Developer’s Kit (SDK) development
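To give a feel for what a “database lookup for validation” means in practice, here is a minimal Python sketch using an in-memory SQLite database.  The `vendors` table, the vendor IDs and the function names are hypothetical; an actual network scanner integration would query the organization’s own back-end system through whatever interface the vendor provides.

```python
import sqlite3


def make_vendor_db():
    """Create a throwaway in-memory table standing in for a back-end system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vendors (vendor_id TEXT PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO vendors VALUES (?, ?)",
        [("V-100", "Acme Supply"), ("V-200", "Globex")],
    )
    return conn


def lookup_vendor(conn, vendor_id):
    """Return the vendor name for an index value keyed in at the scanner, or None."""
    row = conn.execute(
        "SELECT name FROM vendors WHERE vendor_id = ?", (vendor_id,)
    ).fetchone()
    return row[0] if row else None


conn = make_vendor_db()
print(lookup_vendor(conn, "V-100"))  # Acme Supply
print(lookup_vendor(conn, "V-999"))  # None
```

If the lookup comes back empty, the operator can correct the index value before the image is ever sent to the repository.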

Some network scanner vendors offer optional Software Developer Kits (SDKs) with which developers can create unique integration screens to be displayed on the touch panel. These integrations offer tight interoperability with business systems such as Enterprise Content Management (ECM) repositories, Line of Business (LOB) applications, Electronic Medical Records (EMR), Enterprise Resource Planning (ERP) and other third-party solutions. In addition, user interface screens can be created with a custom look and feel to fit corporate branding. An integrated approach to network scanning enables organizations of all sizes to image-enable their current software applications and offers the assurance of delivering images directly into back-end servers without the traditional high costs, aggravation and loss of productivity involved with other approaches.

[Image: Integrated third-party software applications to improve business efficiency]
