Adding Knowledge Sources

Add articles, files, and websites to a knowledge collection so AI agents can retrieve accurate, organization-specific information during conversations.

Once a collection is created, populate it with one or more source types. Each source type is suited to different content — choose based on where your knowledge already lives.

Source Types at a Glance

Open a knowledge collection. The header provides actions to add sources and test the collection:

Action	Source Type	Best For
Add Article	Knowledge Base articles	Published content already maintained in ServiceOps KB
Add Files	PDF, DOCX, TXT documents	Internal documentation, manuals, or policy files
Add Website	Crawled web pages	External docs, public knowledge bases, or intranet sites
Test Collection	—	Validating retrieval after adding sources. See Testing a Collection

Adding Articles

Articles are pulled from your existing ServiceOps Knowledge Base and stay in sync automatically.

info

Only published articles are indexed — drafts are excluded.
Archived or expired articles are automatically removed from the collection.
Updates to KB articles sync automatically, keeping the collection current.

Click Add Article. A side drawer opens showing your KB folders and articles.
Select the articles or folders you want to include.
Click Add Article. The selected articles appear in the collection.

Adding Files

Upload documents directly into the collection for content that exists outside your Knowledge Base.

Supported formats: PDF, DOCX, TXT — up to 5 files per upload.
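The format and per-upload limits above can be checked before you open the dialog. The following is a minimal sketch, not part of ServiceOps: the function name `plan_uploads` and the constants are hypothetical, and it only groups local file names into batches of five while setting aside unsupported extensions.

```python
from pathlib import Path

# Limits taken from the text above; the names themselves are hypothetical.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt"}
MAX_FILES_PER_UPLOAD = 5


def plan_uploads(filenames):
    """Return (batches, rejected) for a list of file names.

    batches  - lists of at most MAX_FILES_PER_UPLOAD supported files
    rejected - files whose extension is not in SUPPORTED_EXTENSIONS
    """
    supported, rejected = [], []
    for name in filenames:
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            rejected.append(name)
    batches = [
        supported[i:i + MAX_FILES_PER_UPLOAD]
        for i in range(0, len(supported), MAX_FILES_PER_UPLOAD)
    ]
    return batches, rejected


if __name__ == "__main__":
    files = ["a.pdf", "b.docx", "c.txt", "d.PDF", "e.txt", "f.pdf", "g.xlsx"]
    batches, rejected = plan_uploads(files)
    print(batches)
    print(rejected)
```

Each inner list corresponds to one pass through the Add Files dialog.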

Click Add Files. The upload dialog appears.
Drag and drop files onto the dialog or click Browse Files to select them.
Optionally, expand Advance Chunk Options to control how file content is split for AI processing:
- Max Chunk Length — Maximum characters per chunk (default: 1028). Increase for dense technical content; decrease for short FAQs.
- Chunk Overlap — Characters shared between consecutive chunks (default: 128). Increase if important context is being cut off at chunk boundaries.
Click Add Files. The uploaded files appear in the collection.
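To build intuition for how Max Chunk Length and Chunk Overlap interact, here is a minimal sketch of fixed-size character chunking with overlap. It is an illustration only: the function `chunk_text` is hypothetical, and the actual splitting logic used by ServiceOps is not documented here and may differ (for example, it may respect sentence or paragraph boundaries).

```python
def chunk_text(text: str, max_chunk_length: int = 1028, chunk_overlap: int = 128) -> list[str]:
    """Split text into character windows that share chunk_overlap characters.

    Hypothetical helper; defaults mirror the values quoted in the text above.
    """
    if max_chunk_length <= 0:
        raise ValueError("max_chunk_length must be positive")
    if not 0 <= chunk_overlap < max_chunk_length:
        raise ValueError("chunk_overlap must be >= 0 and smaller than max_chunk_length")

    step = max_chunk_length - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chunk_length])
        if start + max_chunk_length >= len(text):
            break  # the current window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(2000))
    for index, chunk in enumerate(chunk_text(sample)):
        print(index, len(chunk))
```

In this sketch, a larger overlap means each window advances by fewer characters, so the same text produces more chunks and each boundary is covered twice.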

Adding Websites

Crawl pages from a URL and add the extracted content to the collection. Useful for documentation sites, public knowledge bases, or intranet pages.

Click Add Website. The Website Crawling dialog appears.
Enter the Website URL (mandatory). For example: https://docs.motadata.com/serviceops-docs/

Expand Advance Crawling Options to control how much of the site is crawled:

Option	Description	Default
Depth	How many link levels deep the crawler follows from the starting URL	`2`
Max URLs	Maximum pages to crawl	`10`
Use Sitemap	Uses the site's sitemap for faster, more complete discovery	Off
Deep Crawl	Follows more internal links for broader coverage	Off

Expand Advance Chunk Options to configure how crawled content is processed:
- Max Chunk Length — Maximum characters per chunk (default: 1028).
- Chunk Overlap — Characters shared between consecutive chunks (default: 128).
tip
Use the same chunk settings guidance as for files: increase Max Chunk Length for long-form pages, increase Chunk Overlap if context is being lost between chunks.
Click Continue. The system begins crawling the site.
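The effect of Depth and Max URLs can be pictured as a breadth-first walk over the site's links. The sketch below runs on an in-memory link map rather than a real website. The function `plan_crawl` is hypothetical, and it assumes the starting URL counts as level 0; the ServiceOps crawler may count levels or order pages differently.

```python
from collections import deque


def plan_crawl(start_url, links, depth=2, max_urls=10):
    """Return the pages a breadth-first crawl would collect.

    links    - dict mapping each URL to the URLs it links to (stand-in for real pages)
    depth    - how many link levels to follow from start_url (start_url is level 0)
    max_urls - upper bound on the number of pages collected
    """
    if max_urls < 1:
        return []
    collected = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue and len(collected) < max_urls:
        url, level = queue.popleft()
        if level >= depth:
            continue  # do not follow links beyond the configured depth
        for target in links.get(url, []):
            if target in seen:
                continue
            seen.add(target)
            collected.append(target)
            queue.append((target, level + 1))
            if len(collected) >= max_urls:
                break
    return collected


if __name__ == "__main__":
    site = {"/": ["/a", "/b"], "/a": ["/a/1"], "/a/1": ["/a/1/x"], "/b": []}
    print(plan_crawl("/", site, depth=2, max_urls=10))
```

With the sample map, `/a/1/x` sits three links away from the start and is skipped at Depth 2, while lowering Max URLs cuts the list short regardless of depth.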

Reviewing and Selecting Crawled Pages

Once crawling completes, the Crawled status appears on the source entry.

Click the eye icon (View Crawled Items) to open the Select Pages to Index dialog.

Select individual pages by checking their boxes, or use Select All.
Click Upload Pages Count to add the selected pages to the collection. The number of pages uploaded is displayed. To find a specific URL in the list, use the Search bar.

Managing a Website Source

After pages are uploaded, additional options are available from the three-dot (More Options) menu on the source entry:

Configure Scheduler

Set up automatic synchronization so the collection stays current with changes on the source website.

Click More Options > Scheduler. The Domain Sync Schedule dialog appears.

Configure the Frequency:

Once — Set a one-time sync by specifying a date and time in the Start At field.

Monthly — Set a recurring monthly sync. Select one of the two scheduling modes:

On Specific Date — Runs on a fixed date each month. Configure the following fields:
Field	Description
Start At	The time of day the sync runs
Time	Time in HH:MM format
Date	The day of the month to run the sync (e.g., 15 for the 15th)
Month	The month(s) in which the schedule applies

On Specific Day — Runs on a relative day each month (e.g., the second Tuesday). Configure the following fields:

Field	Description
Start At	The time of day the sync runs
Time	Time in HH:MM format
Occurrence	Which occurrence of the weekday in the month — First, Second, Third, Fourth, or Last
Weekday	The day of the week to run (e.g., Monday, Tuesday)
Month	The month(s) in which the schedule applies

The Schedule Preview shows a summary of your configuration. Click Save to apply.
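To check which calendar date an On Specific Day schedule resolves to, the Occurrence and Weekday fields can be worked out with Python's standard `calendar` module. This is a sketch for verification only: `nth_weekday` and the `OCCURRENCES` mapping are hypothetical names and are not part of the product.

```python
import calendar
from datetime import date

# Maps the Occurrence labels from the table above to list indexes.
OCCURRENCES = {"First": 0, "Second": 1, "Third": 2, "Fourth": 3, "Last": -1}


def nth_weekday(year, month, weekday, occurrence):
    """Return the date of the given occurrence of a weekday in a month.

    weekday    - 0 for Monday through 6 for Sunday (e.g., calendar.TUESDAY)
    occurrence - one of the keys in OCCURRENCES
    """
    # monthcalendar returns Monday-first weeks; days outside the month are 0.
    weeks = calendar.monthcalendar(year, month)
    days = [week[weekday] for week in weeks if week[weekday] != 0]
    return date(year, month, days[OCCURRENCES[occurrence]])


if __name__ == "__main__":
    print(nth_weekday(2025, 1, calendar.TUESDAY, "Second"))
    print(nth_weekday(2024, 2, calendar.MONDAY, "Last"))
```

Every weekday falls at least four times in any month, so First through Fourth always resolve; Last picks the fourth or fifth occurrence depending on the month.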

Sync Now

Click More Options > Sync Now to immediately pull the latest content from the website, outside of any scheduled sync.

Preview Chunks

View how the crawled content has been broken into chunks for AI processing. You can adjust chunk values directly from this view and save updated settings.

Delete

Click More Options > Delete to permanently remove the website source and all its indexed content from the collection. A confirmation dialog appears — click Yes to confirm.

Next Step

After adding sources, test the collection to verify that AI retrieval returns relevant results before attaching it to an agent. See Testing a Collection.
