Best Practices for AI-Ready Knowledge
Structure and maintain your knowledge collections so AI agents can retrieve accurate, relevant information quickly.
The quality of an AI agent's responses depends directly on the quality of the knowledge it can access. These best practices help administrators build collections that are precise, maintainable, and optimized for AI retrieval.
Focus Each Collection on a Single Domain
Avoid creating one large collection that covers all topics. Instead, organize collections around a specific domain, team, or request type — for example, IT Onboarding, Password & Access, or Hardware Requests.
Focused collections allow agents to retrieve answers with higher precision. When a collection contains too many unrelated topics, AI retrieval returns mixed results that reduce response quality.
In practice: Create a separate collection per AI agent or per team in AI Studio, and only attach the collection most relevant to that agent's purpose.
Keep Content Accurate and Current
AI agents only know what they are given. Outdated or incorrect content produces outdated or incorrect responses.
- Publish knowledge base articles before adding them to a collection; draft articles are not indexed.
- Review your collections quarterly and archive or remove content that is no longer accurate.
- For website sources, configure a sync schedule so the collection automatically stays updated with changes to the source site. See Adding Knowledge Sources — Configuring a Sync Schedule.
- Use the Test Collection feature after any significant content update to verify that retrieval still returns relevant results. See Testing a Collection.
Write Content the Way Users Ask Questions
AI retrieval is semantic: it matches meaning, not just keywords. Content that mirrors how users phrase their questions retrieves more reliably.
- Use plain language and avoid internal jargon in article titles and body text.
- Write section headings as questions where possible (e.g., "How do I reset my password?" rather than "Password Reset Procedure").
- Include common variations and synonyms for key terms — for example, an article about VPN access should mention both "remote access" and "VPN" since users may search for either.
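The synonym advice above can be motivated with a toy lexical check. Real retrieval uses semantic embeddings rather than substring matching, and the function below is purely illustrative, but it shows why an article that only says "VPN" can be a weaker match for a user who types "remote access":

```python
def keyword_match(query: str, article: str) -> bool:
    """Naive lexical check: does any query term appear in the article text?"""
    terms = query.lower().split()
    body = article.lower()
    return any(term in body for term in terms)

vpn_only = "Connect to the VPN to reach internal systems."
with_synonyms = "Set up remote access through the company VPN."

keyword_match("remote access setup", vpn_only)       # → False
keyword_match("remote access setup", with_synonyms)  # → True
```

Semantic retrieval narrows this gap but does not eliminate it, so articles that include the vocabulary users actually type still score higher for those queries.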
Tune Chunk Settings for Your Content Type
AI agents read knowledge in chunks, which are segments of text split from the source content. The default chunk size (1028 characters with 128 characters of overlap) works well for most content, but certain content types benefit from adjustment:
| Content Type | Recommendation |
|---|---|
| Dense technical documentation | Increase Max Chunk Length to preserve context |
| FAQs or short procedural steps | Decrease Max Chunk Length for sharper retrieval |
| Websites with long pages | Increase Chunk Overlap to avoid cutting off important context at boundaries |
Adjust chunk settings when adding files or websites via the Advanced Chunk Options panel. After adjusting, use Test Collection with a Scope Threshold of 0.5–0.7 to verify improvement.
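The platform's exact splitting logic is internal, but the interaction of Max Chunk Length and Chunk Overlap can be sketched as a sliding window. The function below is an illustrative approximation, not the product's implementation; the defaults mirror the values mentioned above:

```python
def chunk_text(text: str, max_len: int = 1028, overlap: int = 128) -> list[str]:
    """Split text into fixed-size chunks whose boundaries overlap.

    Each chunk starts (max_len - overlap) characters after the previous
    one, so the last `overlap` characters of a chunk reappear at the
    start of the next chunk, preserving context across boundaries.
    """
    if overlap >= max_len:
        raise ValueError("overlap must be smaller than max_len")
    step = max_len - overlap
    return [text[start:start + max_len] for start in range(0, len(text), step)]
```

Raising Max Chunk Length keeps more of a dense page in a single chunk; raising Chunk Overlap repeats more text at each boundary so a sentence split across two chunks still appears whole in at least one of them.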
Avoid Duplicate Content Across Collections
If the same information appears in multiple collections, AI agents may retrieve conflicting or redundant answers. Maintain a single authoritative source for each topic.
- If multiple agents need the same knowledge, attach the same collection to all of them rather than duplicating its content.
- When archiving a collection, verify it is not the only source of information still in use by an active agent.
Validate Before Attaching to an Agent
Never attach a collection to an AI agent without first testing it. Use the Test Collection panel to:
- Run at least 5 representative queries that your users are likely to ask.
- Confirm relevant chunks are returned with similarity scores above 0.5.
- Adjust the Scope Threshold in Vector Store settings if results are consistently too broad or too narrow.
See Testing a Knowledge Collection for step-by-step guidance.
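Conceptually, the Scope Threshold acts as a cutoff on similarity scores: chunks scoring below it are excluded from the agent's context. The sketch below uses hypothetical result data to show the effect; the field names and function are illustrative, not the platform's API:

```python
def apply_scope_threshold(results: list[dict], threshold: float = 0.5) -> list[dict]:
    """Keep only retrieved chunks whose similarity score meets the threshold."""
    return [r for r in results if r["score"] >= threshold]

# Hypothetical retrieval results for the query "How do I reset my password?"
results = [
    {"chunk": "To reset your password, open the self-service portal...", "score": 0.82},
    {"chunk": "Password expiry policy for contractors...", "score": 0.57},
    {"chunk": "Office parking and building access map...", "score": 0.31},
]

apply_scope_threshold(results, threshold=0.5)   # keeps the two password chunks
apply_scope_threshold(results, threshold=0.7)   # keeps only the top chunk
```

Raising the threshold trades recall for precision: fewer, more relevant chunks reach the agent. If test queries return off-topic chunks, raise it; if relevant chunks are being dropped, lower it.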