This document provides sizing guidance, cost estimates, and scaling thresholds for operating KaireonAI across three deployment tiers.
1. Tier Overview
| Tier | Nodes | RPS Capacity | Estimated Monthly Cost | Typical Use Case |
|---|---|---|---|---|
| Startup | 2x t3.large | < 100 RPS | ~$150-220/mo | Proof of concept, small teams |
| Growth | 3-4x t3.xlarge | 100-1,000 RPS | ~$480-600/mo | Production workloads, mid-market |
| Enterprise | 6+ m6i.xlarge | 1,000+ RPS | ~$2,000-2,500/mo | High-volume, multi-tenant, regulated |
2. Startup Tier
Target: Teams evaluating KaireonAI or running low-volume production workloads with fewer than 100 requests per second.
2.1 Compute
| Component | Spec | Notes |
|---|---|---|
| EKS Nodes | 2x t3.large (2 vCPU, 8 GiB) | Managed node group |
| Next.js App | 2 replicas, 512Mi RAM, 250m CPU | Covers API + UI |
| Pipeline Workers | 1 replica, 512Mi RAM, 250m CPU | Batch processing |
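These pod specs map directly onto Kubernetes resource requests and limits. A minimal sketch for the application Deployment, using the Startup requests above and the limits from Section 5.1 (the name and image are placeholders, not KaireonAI's published manifest):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kaireonai-app              # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: kaireonai-app
  template:
    metadata:
      labels:
        app: kaireonai-app
    spec:
      containers:
        - name: app
          image: kaireonai/app:latest   # placeholder image
          resources:
            requests:
              cpu: 250m                 # Startup-tier request from the table above
              memory: 512Mi
            limits:
              cpu: 500m                 # limits per Section 5.1
              memory: 1Gi
```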
2.2 Data Stores
| Component | Spec | Notes |
|---|---|---|
| PostgreSQL | In-cluster (Helm), 20 GiB EBS | Single instance, no replication |
| Redis | In-cluster (Helm), 1 GiB | Session cache, rate limiting |
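Both stores can be installed with standard Helm charts. A values sketch for PostgreSQL, assuming the Bitnami chart layout (key names differ between charts and chart versions); the Redis chart follows the same pattern:

```yaml
# values.yaml for in-cluster PostgreSQL (Bitnami chart layout assumed)
architecture: standalone      # single instance, no replication
primary:
  persistence:
    size: 20Gi                # 20 GiB EBS-backed volume per the table above
auth:
  existingSecret: postgres-credentials   # placeholder secret name
```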
2.3 Cost Breakdown
| Item | Monthly Cost |
|---|---|
| 2x t3.large on-demand | ~$120 |
| EBS (30 GiB gp3) | ~$8 |
| EKS control plane | $0 (free tier) or ~$73 |
| Data transfer | ~$5 |
| Total | ~$150-220 |
2.4 Limitations
- No database failover. A PostgreSQL pod restart causes brief downtime.
- Not suitable for workloads requiring high availability or disaster recovery.
- Pipeline throughput is limited to a single worker.
3. Growth Tier
Target: Production deployments serving 100 to 1,000 requests per second with availability requirements.
3.1 Compute
| Component | Spec | Notes |
|---|---|---|
| EKS Nodes | 3-4x t3.xlarge (4 vCPU, 16 GiB) | Managed node group, multi-AZ |
| Next.js App | 3-4 replicas, 1Gi RAM, 500m CPU | HPA enabled, target 70% CPU |
| Pipeline Workers | 2 replicas, 1Gi RAM, 500m CPU | Parallel pipeline execution |
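The HPA noted in the table above is a standard autoscaling/v2 resource. A sketch, assuming the Deployment is named kaireonai-app (a placeholder):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kaireonai-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kaireonai-app        # placeholder Deployment name
  minReplicas: 3
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # target from the table above
```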
3.2 Data Stores
| Component | Spec | Notes |
|---|---|---|
| PostgreSQL | RDS db.t3.medium (2 vCPU, 4 GiB) | Automated backups, single-AZ |
| Redis | ElastiCache cache.t3.small (1.5 GiB) | Single node, snapshot backups |
3.3 Cost Breakdown
| Item | Monthly Cost |
|---|---|
| 3x t3.xlarge on-demand | ~$290 |
| EKS control plane | ~$73 |
| RDS db.t3.medium | ~$55 |
| ElastiCache cache.t3.small | ~$25 |
| EBS + storage | ~$15 |
| Data transfer | ~$20 |
| Total | ~$480-600 |
3.4 Key Improvements Over Startup
- Managed database with automated backups and point-in-time recovery.
- Horizontal Pod Autoscaler for the application tier.
- Multi-AZ node placement for compute resilience.
- Dedicated Redis for consistent cache performance.
4. Enterprise Tier
Target: High-volume production deployments exceeding 1,000 requests per second with strict availability, compliance, and multi-region requirements.
4.1 Compute
| Component | Spec | Notes |
|---|---|---|
| EKS Nodes | 6+ m6i.xlarge (4 vCPU, 16 GiB) | Multi-AZ, cluster autoscaler |
| Next.js App | 6+ replicas, 2Gi RAM, 1 CPU | HPA + PDB (minAvailable: 3) |
| Pipeline Workers | 3-4 replicas, 2Gi RAM, 1 CPU | Autoscaled on queue depth |
| Decision Cache | Dedicated Redis read replicas | Sub-millisecond cached decisions |
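The PodDisruptionBudget referenced above keeps at least three application pods running through voluntary disruptions. A sketch (the selector label is a placeholder):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kaireonai-app
spec:
  minAvailable: 3              # matches the table above
  selector:
    matchLabels:
      app: kaireonai-app       # placeholder label
```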
4.2 Data Stores
| Component | Spec | Notes |
|---|---|---|
| PostgreSQL | RDS db.r6g.large (2 vCPU, 16 GiB) Multi-AZ | Read replicas, IAM auth, encrypted |
| Redis | ElastiCache cluster (3 shards, 2 replicas) | Cluster mode, auto-failover |
4.3 Cost Breakdown
| Item | Monthly Cost |
|---|---|
| 6x m6i.xlarge on-demand | ~$690 |
| EKS control plane | ~$73 |
| RDS db.r6g.large Multi-AZ | ~$400 |
| RDS read replica | ~$200 |
| ElastiCache cluster (3 shards) | ~$450 |
| EBS + storage | ~$50 |
| Data transfer + NAT gateway | ~$100 |
| WAF + Shield Standard | ~$50 |
| Total | ~$2,000-2,500 |
4.4 Key Improvements Over Growth
- Multi-AZ RDS with synchronous replication and automatic failover.
- Read replicas to offload analytics and reporting queries.
- ElastiCache cluster mode for horizontal cache scaling.
- Pod Disruption Budgets keep the application above its minimum replica count during voluntary disruptions such as node drains and cluster upgrades.
- Cluster Autoscaler adjusts node count based on pending pod demand.
5. Component Sizing Guide
5.1 Next.js Application Pods
| Metric | Startup | Growth | Enterprise |
|---|---|---|---|
| Replicas | 2 | 3-4 (HPA) | 6+ (HPA) |
| CPU request/limit | 250m / 500m | 500m / 1 | 1 / 2 |
| Memory request/limit | 512Mi / 1Gi | 1Gi / 2Gi | 2Gi / 4Gi |
| HPA target | N/A | 70% CPU | 70% CPU |
5.2 Pipeline Workers
| Metric | Startup | Growth | Enterprise |
|---|---|---|---|
| Replicas | 1 | 2 | 3-4 (KEDA) |
| CPU request/limit | 250m / 500m | 500m / 1 | 1 / 2 |
| Memory request/limit | 512Mi / 1Gi | 1Gi / 2Gi | 2Gi / 4Gi |
| Scaling trigger | N/A | Manual | Queue depth |
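Queue-depth autoscaling at the Enterprise tier can be expressed as a KEDA ScaledObject. A sketch assuming the job queue is a Redis list; the Deployment name, Redis address, and list name are placeholders:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: pipeline-workers
spec:
  scaleTargetRef:
    name: pipeline-workers       # placeholder Deployment name
  minReplicaCount: 3
  maxReplicaCount: 4
  triggers:
    - type: redis
      metadata:
        address: redis:6379      # placeholder Redis address
        listName: pipeline-jobs  # placeholder queue list
        listLength: "10"         # target backlog per replica before scaling out
```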
5.3 PostgreSQL
| Metric | Startup | Growth | Enterprise |
|---|---|---|---|
| Instance | In-cluster pod | RDS db.t3.medium | RDS db.r6g.large Multi-AZ |
| Storage | 20 GiB gp3 | 50 GiB gp3 | 200 GiB io2 (3000 IOPS) |
| Max connections | 100 | 200 | 500 |
| Backups | Manual | Automated (7 days) | Automated (30 days) + snapshots |
| Read replicas | 0 | 0 | 1-2 |
5.4 Redis
| Metric | Startup | Growth | Enterprise |
|---|---|---|---|
| Instance | In-cluster pod | cache.t3.small | Cluster mode (3 shards) |
| Memory | 1 GiB | 1.5 GiB | 3x 6.5 GiB (19.5 GiB total) |
| Persistence | None | Snapshot | AOF + snapshot |
| Failover | None | None | Automatic (Multi-AZ) |
6. Monitoring Thresholds and Scaling Triggers
Use the following thresholds to determine when to scale up or transition to the next tier.
6.1 Compute Scaling Triggers
| Metric | Warning Threshold | Action |
|---|---|---|
| Node CPU utilization (avg) | > 70% sustained | Add nodes or increase instance size |
| Node memory utilization (avg) | > 75% sustained | Add nodes or increase instance size |
| Pod CPU throttling | > 10% of periods | Increase CPU limits or add replicas |
| Pending pods (unschedulable) | > 0 for 5 min | Enable cluster autoscaler or add nodes |
| HPA at max replicas | Sustained 15 min | Increase maxReplicas or node capacity |
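These triggers are straightforward to encode as alerts. A sketch of the first one as a Prometheus Operator rule, assuming node-exporter metrics are being scraped (rule and alert names are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: compute-scaling-alerts
spec:
  groups:
    - name: compute-scaling
      rules:
        - alert: NodeCPUHigh
          # average busy fraction across all nodes, derived from idle-time counters
          expr: (1 - avg(rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 0.70
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Average node CPU above 70%: add nodes or increase instance size"
```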
6.2 Database Scaling Triggers
| Metric | Warning Threshold | Action |
|---|---|---|
| RDS CPU utilization | > 70% sustained | Upgrade instance class |
| RDS freeable memory | < 500 MiB | Upgrade instance class |
| RDS connection count | > 80% of max | Add read replicas or use PgBouncer |
| RDS read latency | > 10 ms avg | Add read replica for read-heavy queries |
| RDS free storage | < 20% | Enable autoscaling or increase volume |
| RDS IOPS utilization | > 80% of baseline | Upgrade to io2 or increase provisioned |
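For the connection-count trigger, PgBouncer lets many application connections share a small server-side pool. A pgbouncer.ini sketch delivered as a ConfigMap; the RDS endpoint, database name, and pool sizes are illustrative:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: pgbouncer-config
data:
  pgbouncer.ini: |
    [databases]
    kaireonai = host=example.rds.amazonaws.com port=5432 dbname=kaireonai
    [pgbouncer]
    listen_addr = 0.0.0.0
    listen_port = 6432
    pool_mode = transaction      ; typical mode for web workloads
    max_client_conn = 1000       ; application-side connections
    default_pool_size = 20       ; server-side connections per database
```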
6.3 Cache Scaling Triggers
| Metric | Warning Threshold | Action |
|---|---|---|
| Redis CPU utilization | > 65% | Upgrade instance or add shards |
| Redis memory utilization | > 80% | Increase instance size or add shards |
| Redis evictions | > 0 sustained | Increase memory or review TTL policies |
| Redis cache hit rate | < 90% | Review cache strategy, increase memory |
7. When to Upgrade Tiers
Startup to Growth
Upgrade when any of the following conditions persist for more than one week:
- Sustained RPS exceeds 80.
- Database connection count regularly exceeds 80.
- Application pod CPU consistently above 70%.
- Downtime from in-cluster database restarts is unacceptable.
- Business requires automated backups or point-in-time recovery.
Growth to Enterprise
Upgrade when any of the following conditions persist:
- Sustained RPS exceeds 800.
- Decision latency P99 approaches the 200ms SLO limit.
- Compliance requirements mandate Multi-AZ database or encryption at rest.
- Read replica is needed to offload analytics workloads.
- Cache evictions occur despite proper TTL tuning.
- Business requires 99.9% or higher availability with automatic failover.
8. Cost Optimization Tips
- Reserved Instances: Purchase 1-year reserved instances for predictable node types to save 30-40%.
- Spot Instances: Use spot instances for pipeline worker nodes (stateless, tolerant of interruption).
- Right-sizing: Review CloudWatch/Prometheus metrics monthly. Downsize over-provisioned instances.
- Storage tiering: Use gp3 for general workloads, io2 only when IOPS-bound.
- Data transfer: Keep services in the same AZ where possible. Use VPC endpoints for AWS services.
- Scheduled scaling: Scale down non-production environments outside business hours (see the sketch below).
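A minimal way to implement scheduled scaling is a CronJob that patches replica counts. This sketch assumes a ServiceAccount with permission to scale deployments; all names are placeholders, and a matching job would scale back up in the morning:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-staging
spec:
  schedule: "0 19 * * 1-5"       # 19:00 on weekdays; pair with a scale-up job
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: deployment-scaler   # placeholder, needs scale permissions
          restartPolicy: OnFailure
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command: ["kubectl", "scale", "deployment/kaireonai-app", "--replicas=0"]
```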