{"id":13637,"date":"2025-02-12T14:55:00","date_gmt":"2025-02-12T22:55:00","guid":{"rendered":"https:\/\/www.kevinneal.com\/blog\/?p=13637"},"modified":"2025-07-07T16:14:35","modified_gmt":"2025-07-08T00:14:35","slug":"ai-dev-world-2025-a-wake-up-call-for-data-from-documents-importance-for-ai","status":"publish","type":"post","link":"https:\/\/www.kevinneal.com\/blog\/ai-dev-world-2025-a-wake-up-call-for-data-from-documents-importance-for-ai\/","title":{"rendered":"AI Dev World 2025: A Wake-Up Call on the Importance of Data from Documents for AI"},"content":{"rendered":"\n<p id=\"ember16578\">Today, I had the opportunity to attend <a href=\"https:\/\/aidevworld.com\/\">AI Dev World<\/a> at the Santa Clara Convention Center. I\u2019ll admit, I walked in with low expectations\u2014after all, we\u2019ve been inundated with hype about generative AI over the past few years. But within just two visits to expo booths and conversations with vendors, my skepticism transformed into sheer excitement. The relevance of \u2018data from documents\u2019 to AI and large language models (LLMs) is not just theoretical\u2014it\u2019s happening, and it\u2019s significant!<\/p>\n\n\n\n<p id=\"ember16579\"><strong>A Vision for TWAIN Direct and PDF\/R in AI<\/strong><\/p>\n\n\n\n<p id=\"ember16580\">Over the past few weeks, and especially in preparation for the upcoming <a href=\"https:\/\/www.aiim.org\/global-summit-2025\">AI+IM Global Summit<\/a> in Atlanta, I\u2019ve been socializing a concept that integrates <a href=\"https:\/\/twaindirect.org\/\">TWAIN Direct<\/a> and <a href=\"https:\/\/pdfraster.org\/\">PDF\/R<\/a> technologies into an AI\/LLM reference solution. To date, the response has been lukewarm at best\u2014many see it as too ambitious, outside our scope, or better left to systems integrators. 
But after what I saw at AI Dev World, I believe that anyone who isn\u2019t seriously considering taking part in this reference solution, in some form, is making a huge mistake.<\/p>\n\n\n\n<p id=\"ember16581\"><strong>Real-World Validation: Edge AI Innovations and Moorche<\/strong><\/p>\n\n\n\n<p id=\"ember16582\">Let me share one real example that underscores the importance of document data in AI: a company called <a href=\"https:\/\/www.edgeaiinnovations.com\/\">Edge AI Innovations<\/a>. They\u2019ve developed a semantic search engine called Moorche, designed specifically to let AI work from your own documents as the primary data source. Right in their sandbox UI, there\u2019s a button that says, \u201cUpload Document.\u201d That\u2019s how integral document data is to their solution!<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/media.licdn.com\/dms\/image\/v2\/D5612AQE8CK8sYCPZXA\/article-inline_image-shrink_1500_2232\/B56ZT9_qdKGUAY-\/0\/1739428112600?e=1750291200&amp;v=beta&amp;t=y43MGzHQ5z8nqS2Xe1FvjSjhwO672MqmZiuC_ROI9Lo\" alt=\"Article content\"\/><figcaption class=\"wp-element-caption\">Edge AI Innovations<\/figcaption><\/figure>\n\n\n\n<p id=\"ember16584\">Here\u2019s how they describe their technology:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><em>\u201cIntroducing Moorche Serverless RAG\u2014the simplest way to build secure, high-performance AI chatbots and assistants. Designed by Edge AI Innovations, our platform removes the usual hurdles of setting up and scaling AI systems. With just a few steps\u2014log in, upload your documents, pick a model, and start chatting\u2014Moorche makes it effortless to bring your own data to life.\u201d<\/em><\/p>\n<\/blockquote>\n\n\n\n<p id=\"ember16586\">After a great discussion at their booth, I couldn\u2019t wait to test it myself. 
So, I uploaded a white paper on proposed SAML 2.0 support for TWAIN Direct that the <a href=\"https:\/\/twain.org\/\">TWAIN Working Group<\/a> will soon publish. The result? I asked a natural-language question, and Moorche not only understood it but also generated a well-reasoned answer: <em>\u201cYes, TWAIN Direct should support SAML 2.0,\u201d<\/em> providing a rationale based on the single document I uploaded!<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/media.licdn.com\/dms\/image\/v2\/D5612AQEXH0eIdHXYVg\/article-inline_image-shrink_1500_2232\/B56ZT9_1UVHQAc-\/0\/1739428157144?e=1750291200&amp;v=beta&amp;t=i7ecDVu1shNPqzcTLRpQ2Muqude5Q4dABby0EPD8dRU\" alt=\"Article content\"\/><figcaption class=\"wp-element-caption\">Moorche answers, \u201cYes, TWAIN Direct should support SAML 2.0,\u201d with rationale<\/figcaption><\/figure>\n\n\n\n<p id=\"ember16588\"><strong>The Takeaway: Document Data Is Critical to AI\u2019s Future<\/strong><\/p>\n\n\n\n<p id=\"ember16589\">This simple yet powerful example reinforces four key points I\u2019ve been making for months:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Private Small Language Models (SLMs)<\/strong> built from your own data are critical\u2014relying solely on public LLMs is not the best path forward.<\/li>\n\n\n\n<li><strong>Edge-based AI models<\/strong> for IoT devices are not just possible; they\u2019re highly desirable.<\/li>\n\n\n\n<li><a href=\"https:\/\/www.linkedin.com\/pulse\/stargate-deepseek-energy-efficiencies-ai-twain-working-kevin-neal--om07c\"><strong>Optimizing networks and reducing energy consumption<\/strong><\/a> are priorities for vendors looking to gain a foothold in the AI market.<\/li>\n\n\n\n<li><strong>Data from documents, with the volume, variety, and real-time distributed capture that document scanners and copy machines provide, is a crucial way to onboard data into AI systems.<\/strong><\/li>\n<\/ol>\n\n\n\n<p 
id=\"ember16591\"><strong>Call to Action: Our Industry Must Act Now<\/strong><\/p>\n\n\n\n<p id=\"ember16592\">To my colleagues in the document scanning, capture, and IDP industry\u2014our expertise is more valuable than ever. AI developers are hungry for structured, high-quality data, and we are the ones who can provide it.<\/p>\n\n\n\n<p id=\"ember16593\">If this resonates with you and you don\u2019t want to be left behind, let\u2019s talk! I am actively seeking collaborators for our proposed <strong>AI\/LLM reference solution project.<\/strong> Contact me, and let\u2019s explore how you can play a role in shaping the future of AI through document data.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Today, I had the opportunity to attend AI Dev World at the Santa Clara Convention Center. I\u2019ll admit, I walked in with low expectations\u2014after all, we\u2019ve been inundated with hype about generative AI over the past few years. But within just two visits to expo booths and conversations with vendors, my skepticism transformed into sheer 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":13638,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-13637","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-miscellaneous"],"uagb_featured_image_src":{"full":["https:\/\/www.kevinneal.com\/blog\/wp-content\/uploads\/2025\/04\/AI-Dev-World-2025-A-Wake-Up-Call-for-Data-from-Documents-importance-for-AI.png",1025,577,false],"thumbnail":["https:\/\/www.kevinneal.com\/blog\/wp-content\/uploads\/2025\/04\/AI-Dev-World-2025-A-Wake-Up-Call-for-Data-from-Documents-importance-for-AI-150x150.png",150,150,true],"medium":["https:\/\/www.kevinneal.com\/blog\/wp-content\/uploads\/2025\/04\/AI-Dev-World-2025-A-Wake-Up-Call-for-Data-from-Documents-importance-for-AI-300x169.png",300,169,true],"medium_large":["https:\/\/www.kevinneal.com\/blog\/wp-content\/uploads\/2025\/04\/AI-Dev-World-2025-A-Wake-Up-Call-for-Data-from-Documents-importance-for-AI-768x432.png",768,432,true],"large":["https:\/\/www.kevinneal.com\/blog\/wp-content\/uploads\/2025\/04\/AI-Dev-World-2025-A-Wake-Up-Call-for-Data-from-Documents-importance-for-AI.png",980,552,false],"1536x1536":["https:\/\/www.kevinneal.com\/blog\/wp-content\/uploads\/2025\/04\/AI-Dev-World-2025-A-Wake-Up-Call-for-Data-from-Documents-importance-for-AI.png",1025,577,false],"2048x2048":["https:\/\/www.kevinneal.com\/blog\/wp-content\/uploads\/2025\/04\/AI-Dev-World-2025-A-Wake-Up-Call-for-Data-from-Documents-importance-for-AI.png",1025,577,false],"post-thumbnail":["https:\/\/www.kevinneal.com\/blog\/wp-content\/uploads\/2025\/04\/AI-Dev-World-2025-A-Wake-Up-Call-for-Data-from-Documents-importance-for-AI-940x198.png",210,44,true],"coral-dark-medium-large-2x":["https:\/\/www.kevinneal.com\/blog\/wp-content\/uploads\/2025\/04\/AI-Dev-World-2025-A-Wake-Up-Call-for-Data-from-Documents-importa
nce-for-AI.png",1025,577,false],"coral-dark-large-2x":["https:\/\/www.kevinneal.com\/blog\/wp-content\/uploads\/2025\/04\/AI-Dev-World-2025-A-Wake-Up-Call-for-Data-from-Documents-importance-for-AI.png",1025,577,false]},"uagb_author_info":{"display_name":"Kevin","author_link":"https:\/\/www.kevinneal.com\/blog\/author\/kneal\/"},"uagb_comment_info":0,"uagb_excerpt":"Today, I had the opportunity to attend AI Dev World at the Santa Clara Convention Center. I\u2019ll admit, I walked in with low expectations\u2014after all, we\u2019ve been inundated with hype about generative AI over the past few years. But within just two visits to expo booths and conversations with vendors, my skepticism transformed into sheer&hellip;","_links":{"self":[{"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/posts\/13637","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/comments?post=13637"}],"version-history":[{"count":1,"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/posts\/13637\/revisions"}],"predecessor-version":[{"id":13639,"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/posts\/13637\/revisions\/13639"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/media\/13638"}],"wp:attachment":[{"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/media?parent=13637"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/categories?post=13637"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kevinneal.com\/blog\/wp-json\/wp\/v2\/tags?post=13637"}],"curies":[{"name":"wp","href":"https:\/\/api.w.
org\/{rel}","templated":true}]}}