System Best Practices and Checklist

Following these best practices improves system stability, strengthens security, supports business continuity, and increases operational efficiency.

These practices apply across all ServiceOps modules (Service Desk, Asset Management, and Patch Management) and provide a framework for keeping the ServiceOps environment healthy and robust. This document combines an explanatory guide with an actionable checklist that Support Teams can follow on a daily, weekly, monthly, quarterly, and annual basis.

Backups & Recovery

Perform a full database backup daily, and add incremental backups as the workload requires.
Back up configuration files, attachments, and logs regularly.
Store backups in secure, offsite/cloud storage with encryption.
Test restore procedures periodically to confirm usability.
Define and follow a retention policy (e.g., daily → 30 days, monthly → 12 months).
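The retention example above can be expressed as a small decision function. This is a minimal sketch, not a ServiceOps feature: it assumes the monthly copy is the backup taken on the 1st of each month, and it approximates 12 months as 365 days. The names `should_keep`, `DAILY_RETENTION_DAYS`, and `MONTHLY_RETENTION_DAYS` are hypothetical.

```python
from datetime import date

# Assumed policy, taken from the example above: daily backups are kept
# for 30 days, monthly backups for 12 months (approximated as 365 days).
DAILY_RETENTION_DAYS = 30
MONTHLY_RETENTION_DAYS = 365


def should_keep(backup_date: date, today: date) -> bool:
    """Return True if a backup taken on backup_date is still within retention."""
    age = (today - backup_date).days
    if age < 0:
        return True  # future-dated backup: keep it and investigate manually
    if age <= DAILY_RETENTION_DAYS:
        return True
    # Older backups survive only if they are the monthly copy
    # (assumption: the monthly copy is the one taken on the 1st).
    return backup_date.day == 1 and age <= MONTHLY_RETENTION_DAYS
```

A cleanup job could call `should_keep` for each backup file and move or delete the ones that return False, depending on the policy in force.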

Backup Methods

Note: Implement a retention policy and ensure older backup data is removed or moved to another location after its retention period. Ensure adequate storage size to prevent future interruptions due to space constraints.

Backups & Recovery Checklist

Daily Checklist

Verify that the database backup completed successfully. The backup can be taken using either of the following methods:
Method 1 – Database Dump File Backup
- Take a PostgreSQL database dump at a fixed time each day and store it in a designated backup folder.
- You can automate this with a shell script that takes the dump and copies the backup folder daily.
Method 2 – Database Folder Backup
- Copy the PostgreSQL data folder. Folder path: /var/lib/postgresql/17/main
- A file-level copy of this folder is only reliably usable if PostgreSQL is stopped while the copy is taken.
Maintain documentation to ensure consistency and future reference.
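Method 1 can also be automated from Python instead of a shell script. The sketch below builds a `pg_dump` command that writes a timestamped custom-format dump. The database name and backup folder are placeholders to be supplied by the caller, the function names are hypothetical, and it assumes `pg_dump` is on the PATH with credentials available to the account running it (for example through a `.pgpass` file).

```python
import subprocess
from datetime import datetime
from pathlib import Path


def build_dump_command(db_name: str, backup_dir: Path, now: datetime) -> list[str]:
    """Build a pg_dump command that writes a timestamped custom-format dump."""
    target = backup_dir / f"{db_name}_{now:%Y%m%d_%H%M%S}.dump"
    return [
        "pg_dump",
        "--format=custom",  # compressed archive, restorable with pg_restore
        f"--file={target}",
        db_name,
    ]


def run_backup(db_name: str, backup_dir: Path) -> None:
    """Create the backup folder if needed and run the dump."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    cmd = build_dump_command(db_name, backup_dir, datetime.now())
    # check=True raises CalledProcessError if pg_dump exits with an error,
    # so a failed backup is not silently treated as a success.
    subprocess.run(cmd, check=True)
```

Scheduling `run_backup` once a day (for example from cron) covers the "fixed time each day" requirement; copying the backup folder offsite remains a separate step.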

Weekly Checklist

Perform a sample restore test of the FileDB from the latest backup. FileDB folder paths:
- /opt/flotomate/main-server/filedb/
- /opt/flotomate/cm-analytics/filedb/
Maintain documentation to ensure consistency and future reference.
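One way to confirm that a sample restore is usable is to compare checksums between the backed-up folder and the restored copy. The following is a rough sketch with hypothetical function names; it reads each file fully into memory, which is acceptable for a sample but not for very large files.

```python
import hashlib
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path relative to root to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def restore_differences(source: Path, restored: Path) -> list[str]:
    """Return relative paths that are missing from or differ in the restored copy."""
    src = tree_digests(source)
    dst = tree_digests(restored)
    return sorted(name for name in src if dst.get(name) != src[name])
```

An empty result means every file in the source folder was restored byte for byte. Compare against the backup copy rather than the live FileDB folder, since live files may have changed after the backup was taken.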

Monthly Checklist

Perform a full restore drill (DB + application configs).
Review log retention and archive old logs as per policy.
Maintain documentation to ensure consistency and future reference.

Quarterly Checklist

Review backup retention policies and adjust if needed.
Maintain documentation to ensure consistency and future reference.

Annual Checklist

Perform full disaster recovery simulation (restore + failover).
Maintain documentation to ensure consistency and future reference.

Monitoring & Services

Check application services (API, DB, notification, scheduler) daily.
Monitor CPU, memory, disk usage, and network throughput.
Configure alerts for:
- Service downtime
- Disk usage > 80%
- High database connections or slow queries
Monitor SLA breaches and system error logs proactively.
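The disk usage rule above can be checked with the Python standard library. This is an illustrative sketch only; in practice the alert would normally be configured in whatever monitoring tool is already in use. The function names are hypothetical, and the 80% threshold comes from the rule above.

```python
import shutil

DISK_ALERT_THRESHOLD = 80.0  # percent, from the alert rule above


def disk_usage_percent(path: str = "/") -> float:
    """Return the percentage of the filesystem holding path that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def disk_alert(percent_used: float, threshold: float = DISK_ALERT_THRESHOLD) -> bool:
    """Return True when usage is strictly above the threshold."""
    return percent_used > threshold
```

For example, `disk_alert(disk_usage_percent("/var"))` reports whether the filesystem holding `/var` has crossed the threshold.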

Monitoring & Services Checklist

Daily Checklist

Check application services (API, DB, scheduler, notification) are running.
Monitor CPU, memory, and disk usage on all servers.
Review system error logs for warnings or failures.
Check SLA breach alerts for incidents/requests.
Maintain documentation to ensure consistency and future reference.
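A basic reachability check for the services in this checklist might look like the sketch below. It only confirms that a TCP port accepts connections, not that the service behind it is healthy. The function names are hypothetical, and the host and port for each service must come from your own deployment.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def failed_services(services: dict[str, tuple[str, int]]) -> list[str]:
    """Return the names of services whose port does not accept connections."""
    return [
        name
        for name, (host, port) in services.items()
        if not port_is_open(host, port)
    ]
```

As a usage example, `failed_services({"db": ("localhost", 5432)})` checks PostgreSQL on its default port; entries for the API, scheduler, and notification services would be added with the ports used in your environment.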

Weekly Checklist

Check application and database performance metrics for anomalies.
Maintain documentation to ensure consistency and future reference.

Monthly Checklist

Check capacity trends (DB growth, ticket volumes, patch volumes).
Maintain documentation to ensure consistency and future reference.

Annual Checklist

Validate long-term capacity plans (hardware, cloud, DB).
Maintain documentation to ensure consistency and future reference.

Security & Access Control

Enforce role-based access control (RBAC) with least privilege principle.
Enable Multi-Factor Authentication (MFA) for admin/support logins.
Regularly review user accounts (quarterly) and disable inactive accounts.
Keep SSL/TLS certificates valid and renewed before expiry.
Apply security patches for OS, database, and ServiceOps dependencies.

Security & Access Control Checklist

Daily Checklist

Confirm SSL/TLS certificates are valid and not close to expiry.
Maintain documentation to ensure consistency and future reference.
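The certificate check can be scripted with Python's `ssl` module. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the 30-day warning window is an illustrative choice rather than a value from this guide.

```python
import socket
import ssl


def fetch_not_after(host: str, port: int = 443) -> str:
    """Return the notAfter field of the certificate presented by host:port."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between now (epoch seconds) and a certificate's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def expires_soon(not_after: str, now: float, warn_days: int = 30) -> bool:
    """Return True if the certificate expires within warn_days (assumed window)."""
    return days_until_expiry(not_after, now) <= warn_days
```

Passing `time.time()` as `now` gives the remaining lifetime as of the moment of the check. With the default context, an already expired or untrusted certificate makes the handshake in `fetch_not_after` fail with an SSL verification error, which is itself a signal worth alerting on.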

Weekly Checklist

Review inactive user accounts and disable/remove as needed.
Review patches and security updates pending on OS/DB.
Maintain documentation to ensure consistency and future reference.
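The inactive-account review can be supported by a simple filter over an exported user list. The sketch below assumes you can obtain each user's last login time from your own user export; the data structure, the function name, and the 90-day idle limit are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def inactive_accounts(
    last_logins: dict[str, Optional[datetime]],
    now: datetime,
    max_idle_days: int = 90,  # assumed limit, adjust to your own policy
) -> list[str]:
    """Return users who never logged in or whose last login is older than the limit."""
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        user
        for user, seen in last_logins.items()
        if seen is None or seen < cutoff
    )
```

The result is a review list only; whether an account is disabled or removed remains a decision for the reviewer.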

Monthly Checklist

Audit RBAC permissions for least-privilege compliance.
Verify compliance requirements (GDPR, HIPAA, ISO, etc.).
Maintain documentation to ensure consistency and future reference.

Quarterly Checklist

Conduct vulnerability scans and review reports.
Audit service configurations against baseline standards.
Review and renew expiring software licenses or certificates.
Maintain documentation to ensure consistency and future reference.

Annual Checklist

Conduct a security and compliance audit (internal or external).
Maintain documentation to ensure consistency and future reference.

Change & Upgrade Management

Always test upgrades and patches in a staging environment.
Maintain rollback plans for every upgrade or configuration change.
Use a Change Calendar to track upcoming updates and maintenance.
Communicate downtime windows to stakeholders in advance.

Change & Upgrade Management Checklist

Weekly Checklist

Validate Change Calendar is updated with planned updates.
Maintain documentation to ensure consistency and future reference.

Logging & Auditing

Centralize logs (application, system, security).
Use structured logging with timestamps, severity, and context.
Enable log rotation to prevent storage overflow.
Retain audit trails for critical actions (user creation, role changes, deletions).
Review logs regularly to detect anomalies or repeated errors.
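For scripts and tools written around ServiceOps, structured logging with rotation can be sketched using Python's standard `logging` module. This illustrates the technique and is not how ServiceOps itself writes logs. The logger name `serviceops.audit`, the JSON field names, and the size limits are assumptions.

```python
import json
import logging
from logging.handlers import RotatingFileHandler


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON object with timestamp, severity, and context."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S%z"),
            "severity": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # "context" is filled from extra={"context": {...}} when provided.
            "context": getattr(record, "context", {}),
        }
        return json.dumps(entry)


def build_logger(path: str, max_bytes: int = 10_000_000, backups: int = 5) -> logging.Logger:
    """Return a logger that writes JSON lines and rotates the file by size."""
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("serviceops.audit")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

A call such as `logger.info("role changed", extra={"context": {"actor": "admin"}})` then produces one JSON line per event, and the handler keeps at most `backups` rotated files so the log cannot grow without bound.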

Logging & Auditing Checklist

Weekly Checklist

Check disk utilization and clean old/rotated logs if required.
Maintain documentation to ensure consistency and future reference.

Monthly Checklist

Review log retention and archive old logs as per policy.
Maintain documentation to ensure consistency and future reference.

Performance & Capacity Planning

Track response times, transaction rates, and system load.
Monitor database growth, asset counts, and ticket volumes.
Review patch/job execution times for bottlenecks.
Plan for scaling hardware or cloud resources before capacity is reached.
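To plan scaling before capacity is reached, a simple linear projection over recent usage samples can estimate how much time remains. The sketch below fits a least-squares line to (day, used) samples; it assumes growth is roughly linear, which may not hold for every workload, and the function names are hypothetical.

```python
from typing import Optional


def growth_per_day(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of usage over time for (day, used) samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    denominator = sum((x - mean_x) ** 2 for x, _ in samples)
    if denominator == 0:
        raise ValueError("need samples from at least two different days")
    return numerator / denominator


def days_until_full(samples: list[tuple[float, float]], capacity: float) -> Optional[float]:
    """Estimate days from the last sample until capacity is reached.

    Returns None when usage is flat or shrinking.
    """
    slope = growth_per_day(samples)
    if slope <= 0:
        return None
    last_used = samples[-1][1]
    return max(0.0, (capacity - last_used) / slope)
```

The same function works for database size, asset counts, or ticket volumes, as long as the samples and the capacity use the same unit.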

Incident & Problem Handling

Respond quickly to high-priority incidents.
Perform root cause analysis (RCA) for major or recurring issues.
Maintain a Known Error Database (KEDB) for faster resolution.
Share post-incident reports with action items.

Incident & Problem Handling Checklist

Monthly Checklist

Review incident reports and perform RCA for recurring issues.
Update documentation/runbooks with recent changes.

Annual Checklist

Refresh support team training (incident handling, DR, security).
Maintain documentation to ensure consistency and future reference.

Compliance & Audits

Follow data protection and privacy guidelines (e.g., GDPR, ISO, HIPAA if applicable).
Retain logs and access records for audits.
Regularly review license compliance for software and integrations.
Perform periodic internal audits for configuration, security, and access control.

Compliance & Audits Checklist

Monthly Checklist

Verify compliance requirements (GDPR, HIPAA, ISO, etc.).
Maintain documentation to ensure consistency and future reference.

Quarterly Checklist

Conduct vulnerability scans and review reports.
Audit service configurations against baseline standards.
Maintain documentation to ensure consistency and future reference.

Annual Checklist

Conduct a security and compliance audit (internal or external).
Review and update system architecture diagrams.
Maintain documentation to ensure consistency and future reference.
