Best Practices for AI-Ready Knowledge

Structure and maintain your knowledge collections so AI agents can retrieve accurate, relevant information quickly.

The quality of an AI agent's responses depends directly on the quality of the knowledge it can access. These best practices help administrators build collections that are precise, maintainable, and optimized for AI retrieval.

Focus Each Collection on a Single Domain

Avoid creating one large collection that covers all topics. Instead, organize collections around a specific domain, team, or request type — for example, IT Onboarding, Password & Access, or Hardware Requests.

Focused collections allow agents to retrieve answers with higher precision. When a collection contains too many unrelated topics, AI retrieval returns mixed results that reduce response quality.

In practice: Create a separate collection per AI agent or per team in AI Studio, and only attach the collection most relevant to that agent's purpose.

Keep Content Accurate and Current

AI agents only know what they are given. Outdated or incorrect content produces outdated or incorrect responses.

  • Publish knowledge base articles before adding them to a collection; draft articles are not indexed.
  • Review your collections quarterly and archive or remove content that is no longer accurate.
  • For website sources, configure a sync schedule so the collection automatically stays updated with changes to the source site. See Adding Knowledge Sources — Configuring a Sync Schedule.
  • Use the Test Collection feature after any significant content update to verify that retrieval still returns relevant results. See Testing a Collection.

Write Content the Way Users Ask Questions

AI retrieval is semantic: it matches meaning, not just keywords. Content that mirrors how users phrase their questions retrieves more reliably.

  • Use plain language and avoid internal jargon in article titles and body text.
  • Write section headings as questions where possible (e.g., "How do I reset my password?" rather than "Password Reset Procedure").
  • Include common variations and synonyms for key terms — for example, an article about VPN access should mention both "remote access" and "VPN" since users may search for either.
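To make the idea of semantic matching concrete, the sketch below scores candidate headings against a query using cosine similarity over toy embedding vectors. The vectors and titles here are invented for illustration; a real system produces embeddings with a trained model, but the scoring arithmetic is the same:

```python
from math import sqrt

# Toy "embeddings": hand-picked vectors standing in for a real embedding model.
# Dimensions loosely represent (password, reset, hardware) topicality.
TOY_EMBEDDINGS = {
    "How do I reset my password?": [0.9, 0.8, 0.1],
    "Password Reset Procedure":    [0.8, 0.7, 0.1],
    "Hardware Requests":           [0.1, 0.0, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same meaning direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return dot / norm

query = TOY_EMBEDDINGS["How do I reset my password?"]
for title, vec in TOY_EMBEDDINGS.items():
    print(f"{cosine_similarity(query, vec):.2f}  {title}")
```

Note that both password-related titles score high against the query while the hardware title scores low; this is why phrasing and synonyms matter more than exact keyword overlap.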

Tune Chunk Settings for Your Content Type

AI agents read knowledge in chunks: segments of text split from the source content. The default chunk size (1028 characters with a 128-character overlap) works well for most content, but certain content types benefit from adjustment:

  • Dense technical documentation: increase Max Chunk Length to preserve context.
  • FAQs or short procedural steps: decrease Max Chunk Length for sharper retrieval.
  • Websites with long pages: increase Chunk Overlap to avoid cutting off important context at chunk boundaries.

Adjust chunk settings when adding files or websites via the Advanced Chunk Options panel. After adjusting, use Test Collection with a Scope Threshold of 0.5–0.7 to verify improvement.
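The chunking behavior described above can be sketched in a few lines. This is an illustrative character-based splitter with overlap, not the platform's actual implementation:

```python
def chunk_text(text, max_len=1028, overlap=128):
    """Split text into overlapping chunks of at most max_len characters."""
    if overlap >= max_len:
        raise ValueError("overlap must be smaller than max_len")
    chunks = []
    step = max_len - overlap  # each chunk starts this far after the previous one
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_len])
        if start + max_len >= len(text):
            break  # last chunk reached the end of the text
    return chunks

doc = "x" * 3000
print(len(chunk_text(doc)))               # chunk count at the defaults
print(len(chunk_text(doc, max_len=500)))  # a smaller max_len yields more chunks
```

Shrinking `max_len` produces more, sharper chunks (good for FAQs), while growing `overlap` repeats more text across boundaries so context is not cut off (good for long pages).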

Avoid Duplicate Content Across Collections

If the same information appears in multiple collections, AI agents may retrieve conflicting or redundant answers. Maintain a single authoritative source for each topic.

  • If multiple agents need the same knowledge, attach the same collection to all of them rather than duplicating its content.
  • When archiving a collection, verify it is not the only source of information still in use by an active agent.
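A lightweight way to spot overlap, sketched under the assumption that you can export article text per collection, is to fingerprint normalized content and flag anything that appears in more than one collection (the collection names and articles below are made up):

```python
import hashlib

def fingerprint(text):
    """Normalize case and whitespace, then hash, so near-identical copies collide."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode()).hexdigest()

def find_duplicates(collections):
    """collections: {collection_name: [article_text, ...]} -> duplicated fingerprints."""
    seen = {}
    for name, articles in collections.items():
        for text in articles:
            seen.setdefault(fingerprint(text), set()).add(name)
    return {fp: sorted(names) for fp, names in seen.items() if len(names) > 1}

dupes = find_duplicates({
    "IT Onboarding":     ["Connect to VPN using...", "Set up your laptop..."],
    "Password & Access": ["connect to   VPN using..."],  # same content, different case
})
print(len(dupes))
```

Exact hashing only catches copy-pasted duplicates; paraphrased overlap still requires a manual review pass.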

Validate Before Attaching to an Agent

Never attach a collection to an AI agent without first testing it. Use the Test Collection panel to:

  1. Run at least 5 representative queries that your users are likely to ask.
  2. Confirm relevant chunks are returned with similarity scores above 0.5.
  3. Adjust the Scope Threshold in Vector Store settings if results are consistently too broad or too narrow.
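The steps above can be sketched as a small validation harness. The sample results here are hypothetical stand-ins for whatever the Test Collection panel or an API returns; only the pass/fail logic against the 0.5 similarity threshold is the point:

```python
# Hypothetical retrieval results: (chunk_text, similarity_score) pairs per query.
SAMPLE_RESULTS = {
    "How do I reset my password?": [
        ("Reset via the self-service portal...", 0.82),
        ("Contact the service desk if locked out...", 0.64),
    ],
    "How do I request a new laptop?": [
        ("Password complexity rules...", 0.31),
    ],
}

def failing_queries(results_by_query, threshold=0.5):
    """Return queries whose best chunk falls below the similarity threshold."""
    failing = []
    for query, results in results_by_query.items():
        top_score = max((score for _, score in results), default=0.0)
        if top_score < threshold:
            failing.append(query)
    return failing

for query in failing_queries(SAMPLE_RESULTS):
    print("Needs attention:", query)
```

A query that only surfaces low-scoring chunks usually means the topic is missing from the collection or is phrased very differently from how users ask about it.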

See Testing a Knowledge Collection for step-by-step guidance.