The logic of document capture

Indexing, Metadata, Keyword, SharePoint, Capture, Scanner, Documents, ECM, Content Management

What is wrong with the collection of words above?  They are closely related terms, but with no logical structure they are of little value to anyone reading them.  To be readable in context, the words need to be logically organized into a sentence.  The logic of document capture and Enterprise Content Management is much the same.  In this blog post, instead of going into the nuts and bolts of document capture, I think it is more important to discuss two components that are critical to the success, or failure, of your content management strategy: taxonomy and metadata.  This is philosophy and not technology.

In its simplest form, document capture is the process of extracting information from a document and making that information available in the future.  The future could be immediate, where a scanned invoice, for example, immediately kicks off a payment process.  Or it could be two weeks from now, when a customer service agent needs to retrieve a signed airbill as proof of delivery.  The point is that document retrieval is based on a unique keyword, or a set of keywords, related to a particular document.  In the case of the invoice it could be the invoice number, and in the case of the airbill it could be the shipping tracking number.

Without a well thought-out strategy, your organization may accomplish nothing more than taking a paper mess and converting it into an electronic mess.

Establish a well thought-out taxonomy

Taxonomy is defined as classifying organisms into groups based on similarities.  Why is taxonomy relevant for document capture?  For several reasons, including security, quicker access to information and retention policies.  If you work backwards when deciding how, and with what technology, to implement your document capture solution, a solid consensus on the end result is of paramount importance.  The end result is typically a high-quality scanned image conducive to data capture (OCR, ICR, OMR, bar code, etc.) and the metadata itself.  If your taxonomy is methodically organized, it should help make your document capture strategy fairly obvious.

Let’s take security as one benefit of a well thought-out taxonomy.  By segregating documents based on a logical taxonomy, organizations are afforded an additional level of comfort knowing that one set of security policies can restrict access to, for example, Human Resources documents, while another allows everyone access to general scanned documents such as the café menu, which is clearly not a sensitive document.

Another benefit of a well thought-out taxonomy is quicker access to information for users.  Many content management software applications and search engines use a ‘crawl’ method to check newly added content and add it to an index (database) which is then searchable.  Common sense dictates that ‘crawling’ a narrower scope keeps the index up-to-date more quickly, and access times can also be considerably shorter because a search covers only the relevant indexed data rather than the entire database.

Lastly, well-organized data is a major benefit for retention policies.  Imagine that an organization has all of its tax documents properly stored electronically, via a well thought-out taxonomy, in its content management system.  The organization can then easily, and within corporate governance standards and policies, remove these images from its repository based on a retention schedule.  So, as illustrated, investing the time to develop a strong taxonomy is important for many reasons including security, searchability and retention.
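
As a rough illustration of the retention point, here is a minimal Python sketch.  The category names and retention periods are assumptions made up for the example, not legal or policy guidance, and the function name is hypothetical.

```python
from datetime import date

# Hypothetical retention periods in years, keyed by taxonomy category.
# The numbers are illustrative assumptions only.
RETENTION_YEARS = {"Tax Documents": 7, "Invoice": 5}


def is_due_for_removal(category, stored_on, today):
    """Return True once a document has been kept for its full retention period."""
    years = RETENTION_YEARS[category]
    try:
        expires = stored_on.replace(year=stored_on.year + years)
    except ValueError:
        # stored_on was Feb 29 and the target year is not a leap year.
        expires = stored_on.replace(year=stored_on.year + years, day=28)
    return today >= expires
```

Because the schedule is keyed by category, a document that has been classified once never needs to be inspected again to decide when it can be removed.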

It is extremely important not to overlook this concept when planning out a document capture strategy.  A simple taxonomy might be organized like below:

  • Accounting
    • Accounts Receivable
      • Check
      • Statement
    • Accounts Payable
      • Invoice
      • Receipt
  • Human Resources
    • Applications
    • Resumes
    • W2 Forms
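
The outline above can be sketched as a small tree in Python.  This is only an illustration: the nested-dictionary layout, the function names, and the rule that a policy set on a branch applies to everything beneath it are all assumptions, not how any particular content management system works.

```python
# Hypothetical taxonomy mirroring the outline above; leaves are empty dicts.
TAXONOMY = {
    "Accounting": {
        "Accounts Receivable": {"Check": {}, "Statement": {}},
        "Accounts Payable": {"Invoice": {}, "Receipt": {}},
    },
    "Human Resources": {"Applications": {}, "Resumes": {}, "W2 Forms": {}},
}

# Assumed rule: a restriction placed on a branch covers everything below it.
RESTRICTED_BRANCHES = {("Human Resources",)}


def find_path(doc_type, tree=TAXONOMY, prefix=()):
    """Return the taxonomy path to doc_type as a tuple, or None if it is absent."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == doc_type:
            return path
        found = find_path(doc_type, children, path)
        if found is not None:
            return found
    return None


def is_restricted(doc_type):
    """True if the document type sits under any restricted branch."""
    path = find_path(doc_type)
    if path is None:
        raise KeyError(f"{doc_type!r} is not in the taxonomy")
    return any(path[:len(branch)] == branch for branch in RESTRICTED_BRANCHES)
```

With this layout, `find_path("Invoice")` gives `("Accounting", "Accounts Payable", "Invoice")`, and a single entry in `RESTRICTED_BRANCHES` is enough to restrict every Human Resources document type.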


Working out a well thought-out strategy might seem cumbersome in the initial stages of establishing your document capture process, but it can save organizations significant time, money and aggravation in the long run.  As a document capture best practice, it is important to establish a solid taxonomy for scanned documents and to re-evaluate that taxonomy whenever new document types are introduced within your organization.

 

Consider what information is important, and what is not

Creating Searchable PDFs is one form of document capture; however, it is not always an ideal document capture strategy.  While creating Searchable PDF images of your scanned documents is sometimes the right approach for an organization, this technique often creates inefficiencies.  You might be wondering how creating a fully Searchable PDF, with all the words of the document indexed, could be construed as inefficient.  Let me elaborate.  When creating a Searchable PDF the scanning software does its best to recognize every single character and every single word on a page.  This might sound appealing but let’s consider the possible results in real-world applications.  Imagine that an organization in the insurance business scans as few as 100 single-page documents and creates Searchable PDF documents.  A user then wants to retrieve one of them by keyword and searches for the word “claim”.  As you can imagine, the user would most likely be presented with a long list of links to possible documents, of which only one is the document they are looking for; the rest is “irrelevant search”.  This is because the entire page was indexed via the Searchable PDF method.  Alternatively, if your data capture strategy had extracted only the “relevant search” terms that apply to a particular document, the organization would be much more efficient because users could find the requested data much more quickly, with the first search.
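
To make the difference concrete, here is a small Python sketch comparing the two approaches.  The three sample documents, the field names and both function names are invented for illustration.

```python
# Hypothetical scanned documents: full OCR text plus a few extracted metadata fields.
DOCUMENTS = [
    {"id": 1, "text": "claim form for auto claim 4471 filed by J. Smith",
     "metadata": {"claim_number": "4471", "doc_type": "claim form"}},
    {"id": 2, "text": "letter acknowledging your claim and next steps",
     "metadata": {"claim_number": "5120", "doc_type": "letter"}},
    {"id": 3, "text": "policy renewal notice; no claim activity this year",
     "metadata": {"policy_number": "P-889", "doc_type": "renewal notice"}},
]


def full_text_search(term):
    """Searchable-PDF style: match the term anywhere in the recognized page text."""
    term = term.lower()
    return [d["id"] for d in DOCUMENTS if term in d["text"].lower()]


def metadata_search(field, value):
    """Relevant-search style: match only a specific extracted field."""
    return [d["id"] for d in DOCUMENTS if d["metadata"].get(field) == value]
```

In this toy data set, `full_text_search("claim")` matches all three documents, while `metadata_search("claim_number", "4471")` matches only the first.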

One of the other significant benefits of an integrated document capture/content management strategy is that metadata fields created, and rules applied, in the content management system can often be brought forward and applied in the document capture system itself.  For example, if an organization’s policy dictates that the social security number is a required metadata field on a healthcare insurance form and must be exactly nine numeric characters, then these rules can be enforced directly in the document capture system.  This allows for great business continuity and consistency in your data capture process.
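
The social security number rule could be expressed along these lines.  This is a minimal Python sketch; the function name and the error messages are assumptions, and a real capture product would expose such rules through its own configuration.

```python
import re

# Exactly nine characters, each one an ASCII digit.
SSN_PATTERN = re.compile(r"[0-9]{9}")


def validate_ssn_field(value):
    """Return a list of rule violations; an empty list means the field is valid."""
    errors = []
    if value is None or value == "":
        errors.append("social security number is required")
    elif not SSN_PATTERN.fullmatch(value):
        errors.append("social security number must be exactly nine digits")
    return errors
```

Running the same check at capture time and in the content management system is what keeps the two consistent.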

An analogy I like to use: go to your favorite internet search engine and enter a vague term such as “taxonomy for document capture”, and you will get a long list of ‘hits’ that probably are not of interest because you are looking for a specific piece of information, or a scanned image.  On the contrary, if you enter a more specific term such as “aim document taxonomy”, the search is narrowed down to a more relevant list of the information you are looking for.  This is an example of relevant search versus irrelevant search and it’s all related to applying metadata to web pages, electronic documents and, yes, especially scanned images.

Summary: Organized taxonomy + relevant metadata = Efficient process

In summary, my point is to carefully plan out your document capture process.  Pay close attention to developing an effective taxonomy for your documents.  Determine what information is important on a particular document and what is not.  Document capture technology has evolved to nearly magical proportions, but the truth is that organizations can still greatly improve their efficiency and content management effectiveness through careful planning; after all, there still is logic to document capture.

Do you have thoughts on the topic of document capture, taxonomy or classification?  Please share your comments.

Capture begins with process


As a prelude to an upcoming series of blog posts on the topic of “Building an effective capture solution”, I wanted to focus on the question of ‘where do I start if I want to build an effective capture solution?’.

More education, less self promotion

With information capture being such an obvious way to decrease operational costs, increase efficiency, reduce risk and assist with compliance, it raises the question: why wouldn’t everyone be using capture?  I think the answer lies in the fact that as an industry we have done a disservice to our community.  Every vendor’s product is the best *sarcasm*.  Everyone can offer the complete solution *eyeroll*.  Vendors compete for business on a list of features instead of a genuine desire to help their customers become more productive *disgust*.  Of course this is a generalization and not every vendor, or person, is so self-centered, but my point is that resources such as the AIIM community, which is rich in educational information and maintains a genuine vendor-neutral stance, are too few and far between.  We need to break down the components of a capture solution to their lowest common denominator and share with others how to achieve an effective capture solution so that everyone can benefit from a technology that has a proven track record of success.  A capture solution breaks down into three basic parts:  User Interface, Processing and Storage.  It’s really that simple.  Of course this is an oversimplification but those are the three basic components.

Eating my own dog food

Having spent nearly my entire professional career in the document capture/ECM industry you would think that someone like me might suggest that a ‘solution’ starts with consideration of capture hardware or capture software.  Not true.  An effective capture solution, to the contrary, does not start with capturing information from an image.  Rather it starts with a well-defined process.  Capture is an extension of a process that makes things more efficient.

To give some specific examples, I would like to take four different business processes and break down the ‘Activity’, as it might happen in a manual process, and the ‘Benefit’, which is the result we are trying to achieve.  You will notice, while it’s pretty obvious, that the ‘Activity’ in each case can be slow, costly and inefficient, yet many organizations continue to operate in this fashion because it’s the traditional way of doing business.  However, if you truly consider the ‘Benefit’ and know that for each ‘Process’ example below there are well-established document capture solutions that can drastically improve it, then hopefully this will drive more adoption of such a fantastic technology:

Process | Activity | Benefit
Contact Management | Typing the information from a Business Card into Contact Relationship database | You want to be able to organize and retrieve contact details
Expense Management | Entering the information from a receipt into an Accounts Payable system | You want to get reimbursed for your expense
Invoice Management | Manual Data Entry of vendor, terms and total information into ERP application | The organization would like to realize pre-pay discounts
Inventory Management | Keying the line item details from a Packing List into inventory system | The business can be more efficient by making product available for sale quicker


Building an effective capture solution:

Part 1 of 3 (User Experience/Device/Interface)
Part 2 of 3 (Capture/Processing/Transformation)
Part 3 of 3 (Storage/Business Policy/Workflow)

 

Your killer SaaS app

Is your SaaS value proposition convincing enough without automatic data entry? 
Imagine you’ve just created the next ‘killer’ Software as a Service (SaaS) app and you are absolutely convinced your new software service is going to revolutionize a particular industry or solve a significant pain point for organizations all over the world.  You create some compelling sales and marketing materials with a heavy emphasis on Return on Investment.  After all, you have conviction that your service is going to help businesses decrease operational costs, improve worker productivity and provide much better access to information which all translates to achieving tangible payback on your customer’s technology investment.
So you’ve done your research, developed the software application, created awesome marketing materials, assembled a sales team and built a terrific support structure, but for some reason your totally revolutionary SaaS application just isn’t selling as well as you had hoped.  Could you be overlooking a feature or function that is so fundamental to providing tangible Return on Investment that customers simply cannot say “No” to immediately deploying your innovative solution?
Time is money
I might be overstating the obvious, but employers pay employees to work, not to do data entry.  Whether your core expertise is in accounting, customer service or mechanical work, your employer pays you to spend the majority of your time focusing on your respective skills.  However, organizations often overlook the total amount of time consumed by tedious activities such as manually entering data from a bank statement into an accounting system, or how many total hours field service technicians spend collecting and entering work order data into an ERP system.  These are real, tangible costs that the organization is paying.  They translate directly into unrealized business productivity and affect the financial bottom line significantly.  Time is money, and time spent manually entering data into systems is, quite frankly, a waste.
Use cases for Information Capture
Let’s take a look at a few use case scenarios and focus on Mobile Information Capture, specifically, since there is a lot of interest in this area and there is an abundance of data to support that this is one of the greatest opportunities to achieve quick return on investment.
First, consider the Field Service industry.  According to a November 2011 study by Dave Wood of Harvey Spencer Associates (HSA) entitled “A Study of the Mobile Capture Marketing in the United States”, DF Blumberg Associates sizes the Field Service market at $225 billion in 2011, growing to $500 billion by 2018, with nearly half of the 3 million workers using mobile productivity solutions by then.  Since a good majority of these mobile devices will most likely be equipped with a camera, this translates directly into a great opportunity to let these workers nearly effortlessly snap pictures of items such as work order signatures, checks for payment, assessment photos or even invoices, and then have the data automatically extracted from these images to populate database fields in a Field Service SaaS application.  To name just a few, the Field Service benefits of Mobile Capture could include enhanced customer service, the ability to realize payments quicker and, of course, improved overall worker efficiency.
In a second use case scenario, also taking data from the same Mobile Capture Market survey, consider the Transportation industry.  For the survey, they focused on Long Haul Trucking.  They found that this particular market featured 1.9 million trucks and 1.7 million deliveries daily.  The research showed that each delivery generated a packet of documents that must be captured for invoicing, with an average of 5 pages per packet.  This translated into a total capture volume for this market of 8.5 million documents PER business day.  The types of items that need to be captured vary slightly depending on the particular trucking organization, yet documents such as Bills of Lading, Trip Sheets, Scale Tickets and Vehicle Expense Receipts were common amongst most organizations.  After some calculation of the projected number of drivers that will have access to dedicated scanners or multifunction devices, the survey predicted that approximately 400,000 drivers will have only smart phones as their primary capture device.  This presents a terrific opportunity to capture all these documents DURING the trip instead of waiting until the trip is complete, which could be days, or even weeks, later.
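
The daily volume follows directly from the survey figures quoted above.  A quick Python check, where the function name is just for illustration and each page in a packet is counted as one captured document:

```python
DELIVERIES_PER_DAY = 1_700_000  # deliveries per business day, from the survey figures above
PAGES_PER_PACKET = 5            # average pages in each delivery packet


def daily_capture_volume(deliveries, pages_per_packet):
    """Total items to capture per day, counting every page as one item."""
    return deliveries * pages_per_packet


print(f"{daily_capture_volume(DELIVERIES_PER_DAY, PAGES_PER_PACKET):,}")  # prints 8,500,000
```
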
The last use case scenario shared by the HSA survey was general Capture to Cloud.  This was predicted to be, by far, the largest growth opportunity for Mobile Capture, and anyone would be hard pressed to argue with this prediction.  With the prediction of 2 billion smart phones by 2018 and cloud storage vendors competing like crazy for market share, it stands to reason that these factors are going to contribute to huge growth for Capture to Cloud applications using mobile devices.
Bringing easy to use, yet highly-effective Ubiquitous Information Capture into the mix
Now you have your killer SaaS app ready for prime time.  Your story is polished and you are earning business because your SaaS application is addressing customer pain points such as decreasing operational costs, improving worker productivity and providing better access to information.  You can prove, without a doubt, a tangible Return on Investment through reduced labor costs associated with manual data entry, and you recognize the unbelievable potential in the Mobile Capture market, so the question becomes, ‘what do you do to make your SaaS application even more appealing to potential customers?’
‘Add Data Capture to your SaaS’ is the answer.  It’s really that simple.  The technology has evolved over the past couple of years so that it offers extremely advanced features and functions that are completely transparent to the users themselves.  This makes for a pleasant user experience, which helps drive adoption of the solution among users.  Additionally, the behind-the-scenes technology is performing tasks traditionally done by humans, so the processing is highly effective from an automation standpoint.  The user simply snaps a picture and this technology can automatically recognize the type of document and intelligently extract all the information from the image.
With this new Data Capture capability, not only will your SaaS application provide a much more elegant user experience, but you can show your customers cost savings in the quantifiable amount of time that is recouped by not having users do manual data entry.  The benefits of your SaaS can be incrementally increased with this new Data Capture capability.  Overall you can offer a truly appealing ROI story before you even begin to discuss all the wonderful capabilities of your particular application.  The additional features are just icing on the cake to solidify the sale.

Total Hours x Dollars per Hour = Tangible Cost Savings

This helps achieve a few things in your favor as the preferred software vendor of choice:
* Encourages your customers to make a quicker decision on purchase and implementation of your solution because every day they choose not to make a decision they are squandering money and resources
* Helps differentiate your application from competitors with valuable business functionality that makes the user experience much more enjoyable and helps drive higher adoption rates
* The likelihood of selling more subscriptions to your customers is higher because they can justify adding more licenses due to the fact that they have proven ROI
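
The cost-savings formula above is simple enough to sketch in a few lines of Python.  The head count, hours and hourly rate below are made-up numbers for illustration only.

```python
def tangible_cost_savings(total_hours, dollars_per_hour):
    """Total Hours x Dollars per Hour = Tangible Cost Savings."""
    return total_hours * dollars_per_hour


# Hypothetical customer: 10 employees each spend 4 hours a week on manual
# data entry, over 50 working weeks, at a loaded rate of $30 per hour.
hours_per_year = 10 * 4 * 50
savings = tangible_cost_savings(hours_per_year, 30.0)
print(f"${savings:,.2f}")  # prints $60,000.00
```
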
So, are you ready to take your killer SaaS app to the next level with Ubiquitous Information Capture?

 

Coyote Lake Camping – August 16-18, 2013

There you have it folks, another successful camping trip in the books!  Brandee (my wife), Jack (our dog) and I packed up the car once again to make our second camping trip in three weeks.  The weather conditions were ideal; not too hot, not too cold and no rain in the forecast.  We were ready and eager to head out on an adventure.  The anticipation in our house had been growing as Brandee and I were making our plans, and you could feel that Jack’s eagerness to head into the wilderness was also building.

After several weeks of planning with a bunch of my technology-minded friends, we all packed up our gear and headed out to Coyote Lake Park for two nights and three days of fun, sun and socializing.  Coyote Lake was the same place Brandee and I were at a few weeks ago and we were nearly in the same location this time, so we were well-prepared with knowledge of the campgrounds.  I reserved a total of three campsites.  Each campsite is big enough for 2-3 tents and Coyote Lake is also a dog-friendly campground, so the final tally of people and pets was as follows:

  • People:  17!
  • Pets:  7!

Yep, you got that right, 17 people and 7 dogs (pets)!  It was SO much fun with all the activity going on.  Never a dull moment.


 

On this particular adventure we were joined by Marc (with his family, friend and three dogs), Semyon (with his family), Eugene (with his wife), Chris (with his wife and three dogs), and Ivan (with his family).

We all arrived at Coyote throughout Friday afternoon, so between 3pm and 8pm-ish most of the time was spent helping everyone get set up and comfortable with the setting.  By evening everyone was well prepared and ready to have a great time for the weekend.

 

So all of us technology nerds arrived at the campground with either a non-existent or, at best, weak internet signal, which showed our sincere dedication to disconnecting from the cyber-world and connecting with each other (face-to-face) as people.  Overall, this is absolutely what I enjoyed most about this particular excursion: the social interaction between all the groups was tremendous.  One remarkable thing happened by total coincidence.  Four of us had worn t-shirts advertising different U.S. cities such as New York, Boston and San Jose.  I assure you this was not planned but it was quite a hilarious experience!

 

One special moment that was absolutely noteworthy was that one of the camping participants (who shall remain nameless) brought a bottle of nice Scotch and a few cigars for ‘the boys’ to enjoy.  So we all gathered our lounge chairs on Chris’ camp site and talked about solving the world’s most serious problems (insert sarcasm here) as well as other topics way beyond my meager brain’s compute power.  The picture of that moment does not do true justice to all the fun of the event, however.  Not pictured are all of the other campers, friends and family that came to socialize together.  It was an absolute highlight for me to gather everyone in an informal setting and just have fun together.

A personal highlight of this trip was a long, roughly 5+ mile walk I went on with Semyon and his wife.  When I say long, I mean extremely long :-) but it was great fun.  Neither Jack nor I are used to so much physical activity.  Give me a computer keyboard and internet access and I’m wickedly active but long walks, I’m just not accustomed to.  We walked nearly the entire length of the lake and back!

I would like to say that Coyote Lake has some of the most wonderful and kind park rangers.  Every single one of them has been so kind and helpful.  Coyote Lake is highly recommended as a convenient place in the San Francisco Bay Area for overnight camping, day picnics, fishing or all sorts of water activities.

In summary, I would like to thank everyone (and of course their willing families) for such a great memory.  This was the sort of adventure that lasts a lifetime.  So much fun and I can’t wait to do it again very soon!


 

Sharen Neal, Lifetime Achievement Award, NAMIPC

NAMI – Placer County


 

Today, August 15th 2013, my mom, Sharen Neal, received a well-deserved lifetime achievement award from the National Alliance on Mental Illness association of Placer County (NAMIPC – http://namipc.org).  It was so wonderful that Loretta alerted us in advance of this event so that we could plan to be there in person.  This was a regularly scheduled meeting for NAMIPC but Loretta (President of NAMI – Placer County) added a twist by holding a special ceremony for my mom and Pauline before the meeting.

It was very special.  They presented both Sharen and Pauline with an awesome custom-engraved plaque recognizing their years of service, and then we all had some delicious celebratory cake and drink.  Of course neither woman went looking for recognition like this; rather, personal situations drove them to commit their lives to helping others.  I admire both Pauline and my mom for their commitment to the cause of improving the welfare of the Mentally Ill, because mental illness is such a vicious condition.  It’s hard to diagnose and difficult to treat in the proper way (especially with such limited resources).  It simply is a lifetime affliction for most, and more research and education on the topic is needed so we can better address the needs of those in need.

 


For those who might not know, my mom has been a relentless advocate on behalf of the Mentally Ill of Placer County (California, in the Sierra Foothills where they live).  She has been a strong advocate for rights and due process.  Her life is dedicated to making others’ lives better through her advocacy in the association.

While I’m not exactly sure how many years she has been involved in NAMIPC as a volunteer serving in several capacities, the fact of the matter is that she will most likely always be involved in some capacity.  To help others is simply in my mom’s DNA and I absolutely love her for this!


It was quite the feat to get everyone organized, but my wife, Brandee, managed to pull it off perfectly, although there were a few close calls given the tight timing.  This is how it went down.  First, my older brother, Mike, and his wife made the trip from Santa Rosa.  Then my younger brother, John, made the trip from Chico and, finally, Brandee and I made the trip from San Jose.  So, as you can see, this was a weekday (Thursday afternoon) collaboration with many participants, from various places, of a kind that typically does not happen.  However, we were all determined to be there for this great event and we wouldn’t let anything stand in the way; so we did it in honor of Sharen!


Pictured from left to right:  John, Sherry, Mike, Floyd, Sharen, Kevin
(behind the camera = Brandee)
 

Mom, on behalf of all of us, and especially all those you have helped in the past and will help in the future, we appreciate your passion and commitment to helping the Mentally Ill!  Thanks so much from everyone!!!