By Dennis Mortensen In Best Practices | May 2024

Why investing in quality Support Articles is your most important AI Strategy

Applying off-the-shelf algorithmic AI, without investing in proprietary data sets, offers little long-term competitive advantage. In customer support, a winning position stems from proprietary data, often comprising hand-crafted support articles, which can be fed into proprietary ML models to create a truly competitive advantage.

The emphasis on providing your customers with self-service support options has been growing for many years. Your self-service support solutions are often your customers’ first point of contact with your customer support organization and, many times, with your company. The foundation of your self-service support offering remains your help center, which a Forrester study reports is the preferred self-service channel for customers. It can also be one of your most important AI strategies!

Productized AI for CX

Off-the-shelf AI refers to pre-built, generalized AI solutions that are designed for widespread adoption across various industries and use cases and that allow for little customization. Some common examples of where these solutions are found in CX include:

  • Sentiment Analysis Tools where AI algorithms are used to analyze customer feedback, emails, or social media comments for sentiment
  • Automated Call Routing Systems that use voice recognition and natural language processing to direct customer calls to the appropriate department or agent

The standardized functionality of off-the-shelf AI allows businesses to implement these solutions quickly and efficiently, without the need for extensive development or specialized expertise. However, these off-the-shelf solutions often lack the specificity and customization most businesses need and, without investment in proprietary data sets, fail to provide a significant long-term competitive advantage. That is not to say you should not deploy them, just that they alone will not bring about a winning position.

The world’s most valuable resource is no longer oil, but data †

Proprietary AI, on the other hand, can truly become one of your greatest competitive advantages. Unlike off-the-shelf solutions, proprietary AI is developed internally or customized to meet the specific needs and objectives of an organization, and is owned and used exclusively by that organization. It harnesses proprietary datasets to train machine learning models that are tailored to the organization’s requirements and designed to provide a competitive advantage by addressing specific business challenges. And your hand-crafted support articles are at the core of these proprietary datasets.

Some examples of proprietary AI in CX include:

  • Support Chatbots: These AI-powered agents utilize support articles as their foundational knowledge base, automating customer service inquiries and minimizing the need for human involvement in routine queries
  • Knowledge Base Search: AI-driven search algorithms leverage support articles to provide relevant results and employ a robust understanding of natural language to deliver the most pertinent content, not just links
  • Event Recommendations: AI systems that analyze product events to suggest relevant support articles based on user behavior
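
To make the knowledge base search idea a little more concrete, here is a minimal sketch of how support articles can act as the retrieval corpus. It is deliberately simplistic: it ranks articles by plain word overlap (bag-of-words cosine similarity) rather than by the natural language understanding described above, and the article titles, bodies and function names are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base: article title -> body text.
ARTICLES = {
    "Reset your password": "Open Settings, choose Security, then click Reset password.",
    "Export a report": "Go to Reports, select a report, and click Export to download a CSV.",
    "Invite a teammate": "Open Settings, choose Team, and click Invite to add a teammate by email.",
}


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_article(query: str, articles: dict) -> str:
    """Return the title of the article whose title and body best match the query."""
    q = tokenize(query)
    return max(
        articles,
        key=lambda title: cosine(q, tokenize(title + " " + articles[title])),
    )


print(best_article("how do I reset my password", ARTICLES))
```

Even in a toy like this, the quality of the answer is bounded by the quality of the articles: if the password article were missing or out of date, no amount of clever ranking would fix it.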

The value of quality Support Article data

Your support articles contain a wealth of data that can be extracted to enhance and improve your machine learning models. This includes, among other things:

  1. Text Content: The unstructured content in the body of your support articles.
  2. Structured Data: The organization, formatting and structure of the content in articles.
  3. Metadata: This includes information like the author, publication date, date last updated, version number, tags, keywords, categories or topics.
  4. Product Screenshots: The fresh, accurate and well-annotated product screenshots of your application.
  5. Graphics: The visuals that support the text such as diagrams, charts, infographics, etc.
  6. Image Metadata: This includes information like the format and dimensions, as well as the concepts and objects being depicted.
  7. Hyperlinks and Cross-References: The links that live within your support articles.
  8. Usage Data: The analytics related to how support articles are accessed and used by your customers.
  9. User Feedback: The comments, ratings, or information in other feedback forms that your customers provide.
  10. Queries: Data on the search terms and query parameters that customers use to find and access your support articles.
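
As a rough illustration, the data types listed above could be gathered into a single record per article before being handed to a model or search index. The sketch below is one possible shape, not a standard schema: every class and field name is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ArticleImage:
    """A product screenshot or graphic, plus its image metadata (hypothetical shape)."""
    url: str
    image_format: str
    width: int
    height: int
    depicted_concepts: List[str] = field(default_factory=list)


@dataclass
class SupportArticleRecord:
    """One support article with the kinds of data listed above (hypothetical shape)."""
    title: str
    body_text: str                       # text content
    headings: List[str]                  # structured data
    author: str                          # metadata
    published: date
    last_updated: date
    tags: List[str] = field(default_factory=list)
    images: List[ArticleImage] = field(default_factory=list)   # screenshots and graphics
    links: List[str] = field(default_factory=list)             # hyperlinks and cross-references
    view_count: int = 0                                         # usage data
    helpful_votes: int = 0                                      # user feedback
    unhelpful_votes: int = 0
    search_queries: List[str] = field(default_factory=list)    # queries that led here

    def helpfulness(self) -> Optional[float]:
        """Share of feedback votes that were positive, or None if there are no votes."""
        total = self.helpful_votes + self.unhelpful_votes
        return self.helpful_votes / total if total else None


record = SupportArticleRecord(
    title="Reset your password",
    body_text="Open Settings, choose Security, then click Reset password.",
    headings=["Before you start", "Steps"],
    author="Support Team",
    published=date(2023, 1, 10),
    last_updated=date(2024, 3, 15),
    tags=["account", "security"],
    helpful_votes=3,
    unhelpful_votes=1,
    search_queries=["forgot password", "reset login"],
)
print(record.helpfulness())
```

Keeping all of these fields together makes it easier to see which articles are rich in signal and which are missing metadata, feedback or up-to-date imagery.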

High-quality support article data plays a pivotal role in enhancing the effectiveness of your machine learning models and overall AI strategy. These articles are meticulously crafted to ensure accuracy, which minimizes errors and inconsistencies within your datasets (think of them as your data annotations). As a result, and stating the obvious, machine learning models trained on such data can produce more reliable and precise outcomes. Quality support articles also help to reduce noise within the data, leading to more accurate predictions and a reduction in the biases that may affect the performance of certain AI algorithms. Furthermore, the variety of use cases covered within these articles can help machine learning models generalize more effectively to new and unseen scenarios, while also reducing the risk of overfitting. The largely structured nature of these articles also facilitates potential transfer learning, allowing models to leverage learned features across related domains. And they provide a solid foundation for applying data augmentation techniques and suitable data encoding methods to create robust datasets and improve the generalization and overall performance of your systems.

But perhaps most importantly, high-quality support article data fosters consistency in customer interactions. Machine learning models trained on high-quality, standardized support articles provide consistent answers to customer questions, which strengthens brand reliability by reducing the risk of misinformation or varied responses that can erode customer trust.

Beware of Rotten Data

The effectiveness of proprietary AI built upon your support articles relies heavily on the quality and relevance of the content. Inaccurate or outdated data within support articles can significantly impair the performance of your AI. Outdated content, whether stale text or obsolete product screenshots, not only risks providing customers with inaccurate and inconsistent information but also introduces errors and hallucinations †† in machine learning models. These errors can be subtle and challenging to identify, making it much harder to understand how the AI arrived at its conclusions. Or in other words: Garbage in, garbage out!

This can negatively impact many of your most important customer support metrics, and it is therefore crucial to ensure that your support articles, from the text to the product screenshots, are up to date, relevant and accurate.
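
One simple guard against rotten data is to check how recently each article was updated before it is fed into a dataset. The sketch below assumes each article is a dictionary with a `last_updated` date; the 180-day threshold, the function name and the sample articles are illustrative assumptions, and a last-updated date is only a rough proxy for whether the content still matches the product.

```python
from datetime import date, timedelta


def split_fresh_and_stale(articles, today, max_age_days=180):
    """Split articles into (fresh, stale) lists based on their last_updated date."""
    cutoff = today - timedelta(days=max_age_days)
    fresh = [a for a in articles if a["last_updated"] >= cutoff]
    stale = [a for a in articles if a["last_updated"] < cutoff]
    return fresh, stale


# Hypothetical sample articles.
sample_articles = [
    {"title": "Reset your password", "last_updated": date(2024, 3, 15)},
    {"title": "Export a report", "last_updated": date(2022, 8, 1)},
]

fresh, stale = split_fresh_and_stale(sample_articles, today=date(2024, 5, 1))
for article in stale:
    # Stale articles are candidates for review before any model is trained on them.
    print("Needs review:", article["title"])
```

In practice, flagged articles would go to a human for review rather than being silently dropped, since an old article can still be accurate and a recently edited one can still show an obsolete screenshot.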


Proactively investing in proprietary, comprehensive and unique datasets is the key to success. And, in CX, one of the most valuable sources of data is your hand-crafted support articles: the written content, the product screenshots, and the wealth of metadata that can be extracted from them. So keep creating and maintaining stellar support articles enriched with quality metadata.

This is a topic we are super passionate about, and we’ve given a talk on it at numerous events. If you’d like a set of slides to help build your internal business case and better advocate for knowledge management and quality help documentation as the foundational component of your AI strategy, do email us and we’ll be happy to send a copy.

Or if you’d like to chat more about how stellar support articles can set you up for success, or just want to geek out on product screenshots, feel free to grab some time on my calendar!

†† Hallucinations in AI models, particularly in natural language processing (NLP) models, refer to instances where the model generates information that is factually incorrect, misleading, or entirely fabricated while sounding confident and conclusive.