# Deploying to Amazon EKS

Complete guide to deploying self-hosted Spacelift Flows on Amazon EKS using OpenTofu and Helm.

This guide provides a way to quickly get Spacelift Flows up and running on an Elastic Kubernetes Service (EKS) cluster. The infrastructure is deployed using OpenTofu, and the application services are deployed using Helm charts.

Currently, agents have to run outside the Kubernetes cluster, e.g., on an EC2 Auto Scaling Group. We will soon provide a fully Kubernetes-native solution.

## Overview
This deployment creates a complete Spacelift Flows instance with the following components:

 - **EKS Auto Mode cluster** for container orchestration
 - **RDS Aurora PostgreSQL** for the database
 - **S3 bucket** for object storage
 - **KMS encryption** for data at rest
 - **ACM certificates** for SSL/TLS
 - **Agent pool** deployed via Terraform module

The following services will be deployed as Kubernetes pods using the Helm charts:

 - The server.
 - The worker.
 - The gateway.

The server hosts the Spacelift Flows HTTP API and serves the embedded frontend assets. The server is exposed to the outside world through an Application Load Balancer for HTTP traffic, including the OAuth and MCP endpoints required for external integrations.

The worker is the component that handles recurring tasks and asynchronous jobs.

The gateway is a service that hosts the WebSocket server for agents and routes JavaScript evaluations to the right agents/runtimes.

The agent pool is deployed as an EC2 ASG and consists of an agent service. Agent services handle requests from the gateway and distribute execution commands.

## Requirements
Before starting, ensure you have:

### AWS Prerequisites
 - AWS CLI configured with appropriate permissions
 - Access to an AWS account with the following service limits:
   - EKS clusters: At least 1 available
   - RDS Aurora clusters: At least 1 available
   - VPC: At least 1 available (or use existing)
   - NAT Gateways: At least 1 available per AZ
   - Elastic IPs: At least 1 available per NAT Gateway

### Tools Required
 - [OpenTofu](https://opentofu.org/docs/intro/install/) >= 1.6.0 (or [Terraform](https://www.terraform.io/downloads) >= 1.5.0)
 - [kubectl](https://kubernetes.io/docs/tasks/tools/) for Kubernetes management
 - [Helm](https://helm.sh/docs/intro/install/) >= 3.0 for application deployment
 - [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) >= 2.0

### Domain Requirements
 - A registered domain name with DNS management access
 - Ability to create DNS records for certificate validation

### Optional Requirements
 - SMTP server for email notifications (recommended). You can enable Amazon SES by setting the Terraform variable `enable_ses = true`.
 - [Anthropic API key](https://console.anthropic.com/) for AI features (recommended)
 - OpenTelemetry collector endpoint for observability

## Deploy Infrastructure
The infrastructure deployment uses a modular approach with OpenTofu to provision all necessary AWS resources.

### 1. Prepare the Environment
First, ensure your AWS CLI is configured with the appropriate credentials and region:


```
# Verify AWS CLI configuration
aws sts get-caller-identity

# Set your preferred region and export it as an environment variable
export TF_VAR_aws_region=us-west-2
```

### 2. Create Working Directory
Create a new directory for your infrastructure deployment:


```
mkdir spacelift-flows-infra
cd spacelift-flows-infra
```

### 3. Create Infrastructure Configuration
Create a `main.tf` file that references the Spacelift Flows infrastructure module:


```
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.4"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

module "spacelift_flows" {
  source = "github.com/spacelift-io/terraform-aws-eks-spacelift-flows-selfhosted?ref=v0.2.0"

  # Required variables
  app_domain        = var.app_domain
  organization_name = var.organization_name
  admin_email       = var.admin_email
  aws_region        = var.aws_region
  license_token     = var.license_token
  anthropic_api_key = var.anthropic_api_key # optional
  expose_gateway    = true

  # Email configuration (choose one of the following options):

  # Option 1: Dev mode - emails logged to server logs (testing only)
  # Useful for initial setup, but not suitable for production
  # email_dev_enabled = true

  # Option 2: Custom SMTP server
  # smtp_host         = "smtp.example.com"
  # smtp_port         = 587
  # smtp_username     = ""
  # smtp_password     = ""
  # smtp_from_address = "noreply@yourcompany.com"

  # Option 3: Amazon SES (recommended for AWS deployments)
  # Requires domain verification and production access request (see Verify Deployment section)
  # enable_ses = true

  # Optional variables
  k8s_namespace  = var.k8s_namespace
  admin_password = var.admin_password
}

# Uncomment to deploy the agent pool once the Spacelift Flows backend services are deployed.
# module "spacelift_flows_agent_pool" {
#   source           = "github.com/spacelift-io/terraform-aws-spacelift-flows-agentpool-ec2?ref=v0.4.0"
#   agent_pool_id    = module.spacelift_flows.agent_pool_id
#   agent_pool_token = module.spacelift_flows.agent_pool_token
#   backend_endpoint = "https://${var.app_domain}"
#   gateway_endpoint = "https://gateway.${var.app_domain}"
#   agent_image_tag  = var.spacelift_flows_image_tag
#
#   reuse_vpc_id         = module.spacelift_flows.vpc_id
#   reuse_vpc_subnet_ids = module.spacelift_flows.vpc_private_subnet_ids
#   aws_region           = var.aws_region
#   min_size             = 1
#   desired_capacity     = 5
#   max_size             = 10
# }
```

See more examples with different configurations in [the GitHub repository](https://github.com/spacelift-io/terraform-aws-eks-spacelift-flows-selfhosted).

If you want to reuse the RDS instance from your Spacelift Self-Hosted installation, create a Postgres database for Flows:



```
CREATE DATABASE flows;
```

We also suggest creating a separate database user for Flows.
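For example, a dedicated user could be created and granted access like this (the user name and password are illustrative placeholders):

```sql
-- Create a dedicated user for Flows and grant it access to the new database
CREATE USER flows_user WITH PASSWORD 'change-me';
GRANT ALL PRIVILEGES ON DATABASE flows TO flows_user;
```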

Then build a connection URL and pass it to the module:


```
database_connection_url = format(
  "postgres://%s:%s@%s:5432/flows",
  module.spacelift.rds_username,
  urlencode(module.spacelift.rds_password),
  module.spacelift.rds_cluster_endpoint,
)
```

### 4. Create Variables File
Create a `variables.tf` file with variable definitions:


```
# Required variables
variable "app_domain" {
  description = "The domain name for the Spacelift Flows instance"
  type        = string
}

variable "organization_name" {
  description = "Name of the organization"
  type        = string
}

variable "admin_email" {
  description = "Email address for the admin user"
  type        = string
}

variable "aws_region" {
  description = "AWS region for deployment"
  type        = string
}

variable "license_token" {
  description = "The JWT token for using Spacelift Flows. Only required for generating the kubernetes_secrets output."
  type        = string
  sensitive   = true
}

variable "k8s_namespace" {
  type = string
}

variable "spacelift_flows_image_tag" {
  type = string
}

variable "anthropic_api_key" {
  description = "Anthropic API key for AI features"
  type        = string
  default     = ""
}

variable "admin_password" {
  description = "Admin password for initial login without SMTP (min 32 characters)"
  type        = string
  sensitive   = true
  default     = ""
}
```

### 5. Create Outputs File
Create an `outputs.tf` file to expose important values:


```
output "config_secret_manifest" {
  description = "Outputs manifests that are needed to configure a secret for the Flows app."
  value       = module.spacelift_flows.config_secret_manifest
  sensitive   = true
}

output "ingress_manifest" {
  description = "Outputs manifests that are needed to configure AWS ingress."
  value       = module.spacelift_flows.ingress_manifest
}

# Uncomment if deploying in a private VPC
# output "internal_ingress_manifest" {
#   description = "Outputs internal manifests that are needed to configure internal AWS ingress."
#   value       = module.spacelift_flows.internal_ingress_manifest
# }

output "shell" {
  value = module.spacelift_flows.shell
}
```

### 6. Set Environment Variables
Configure your deployment using environment variables. Start with the minimum required variables:

Choose a domain you control and can create DNS records for. Subdomains work well (e.g., `flows.yourcompany.com`).


```
# Required configuration
export TF_VAR_app_domain="flows.yourcompany.com"
export TF_VAR_organization_name="Your Organization"
export TF_VAR_admin_email="admin@yourcompany.com"
export TF_VAR_license_token=""
export TF_VAR_k8s_namespace="spacelift-flows"
export TF_VAR_spacelift_flows_image_tag="0.6.0"

# Optional configuration
export TF_VAR_anthropic_api_key="" # optional
export TF_VAR_admin_password=""    # optional, at least 32 chars, enables login without SMTP
```
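If you set the optional `TF_VAR_admin_password`, it must be at least 32 characters long. A quick shell sanity check (this snippet is just a convenience, not part of the official setup):

```shell
# Fail early if the optional admin password is set but shorter than 32 characters
if [ -n "$TF_VAR_admin_password" ] && [ "${#TF_VAR_admin_password}" -lt 32 ]; then
  echo "TF_VAR_admin_password must be at least 32 characters" >&2
else
  echo "admin password check passed"
fi
```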

Flows includes a powerful AI assistant that can help you quickly build and debug flows. The assistant requires an Anthropic API key, which you can obtain by signing up at [https://console.anthropic.com/](https://console.anthropic.com/).

While Flows will work without it, we strongly recommend not skipping this step.

### 7. Initialize OpenTofu
Initialize the working directory and download required providers:



```
tofu init
```

### 8. Review the Deployment Plan
Generate and review the execution plan:



```
tofu plan
```

### 9. Deploy the Infrastructure
Apply the configuration to create the infrastructure:



```
tofu apply
```

When prompted, type `yes` to confirm the deployment.

### 10. Verify Infrastructure Deployment
After the apply completes, export the variables needed by the subsequent steps. For convenience, the module exposes a `shell` output that you can evaluate directly in your shell.


```
# Source in your shell all the required env vars to continue the installation process
$(tofu output -raw shell)
```

### 11. Configure kubectl Access
Configure kubectl to access your new EKS cluster:


```
# Update kubeconfig
aws eks update-kubeconfig --region $TF_VAR_aws_region --name $EKS_CLUSTER_NAME

# Create a namespace
kubectl create namespace $TF_VAR_k8s_namespace
```

### 12. Validate the certificates
You can skip this step if you already provided an issued certificate via the `cert_arn` variable.

After the infrastructure is deployed, you should validate that the ACM certificates are properly issued and configured:


```
# Look up the certificate ARN in ACM
CERT_ARN=$(aws acm list-certificates --region $TF_VAR_aws_region --query "CertificateSummaryList[?DomainName=='$TF_VAR_app_domain'].CertificateArn" --output text)

# Check certificate status
aws acm describe-certificate --certificate-arn $CERT_ARN --region $TF_VAR_aws_region --query "Certificate.Status" --output text
```

The certificate status should show `ISSUED`. If it shows `PENDING_VALIDATION`, you need to validate the certificate by creating the required DNS records:


```
# Get DNS validation records
aws acm describe-certificate --certificate-arn $CERT_ARN --region $TF_VAR_aws_region --query "Certificate.DomainValidationOptions[*].ResourceRecord" --output table
```

Create the CNAME records shown in the output in your DNS provider. The certificate will automatically be issued once DNS validation is complete (usually within a few minutes).
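Each validation record is a CNAME from an ACM-generated name under your domain to a target in ACM's validation zone; it has roughly this shape (the underscore-prefixed labels below are illustrative placeholders):

```
_3c9f0example123.flows.yourcompany.com.  CNAME  _7d1a4example456.acm-validations.aws.
```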

You can monitor the validation status:


```
# Check validation status
aws acm describe-certificate --certificate-arn $CERT_ARN --region $TF_VAR_aws_region --query "Certificate.DomainValidationOptions[*].ValidationStatus" --output text
```

Once all domains show `SUCCESS`, the certificate is ready to use.

### Infrastructure Deployment Complete
At this point, you have successfully deployed:

 - ✅ **VPC** with public and private subnets
 - ✅ **EKS Auto Mode cluster** ready for workloads
 - ✅ **RDS Aurora PostgreSQL** database cluster
 - ✅ **S3 bucket** for object storage with encryption
 - ✅ **KMS key** for encryption at rest
 - ✅ **IAM roles and policies** for service access

The next step is to deploy the application services using the generated Kubernetes manifests.

## Deploy Application Services
### 1. Apply Configuration Secret
Create the main configuration secret:



```
tofu output -raw config_secret_manifest | kubectl apply -f -
```

### 2. Apply Ingress Configuration
Create an AWS ingress class:



```
tofu output -raw ingress_manifest | kubectl apply -f -
```

**For private VPC deployments**: Use the internal ingress class instead:



```
tofu output -raw internal_ingress_manifest | kubectl apply -f -
```

### 3. Deploy Core Services
Create a Helm values file for the Spacelift Flows installation:


```
cat > flows-values.yaml <<EOF
appDomain: $TF_VAR_app_domain

global:
  image:
    tag: $TF_VAR_spacelift_flows_image_tag

ingress:
  className: "spacelift-flows"
  exposeGateway: true
  annotations: {
    alb.ingress.kubernetes.io/healthcheck-port: "8080",
    alb.ingress.kubernetes.io/healthcheck-path: "/health",
    alb.ingress.kubernetes.io/scheme: internet-facing # Change to "internal" for private VPC deployments
  }
EOF
```
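Because the `EOF` delimiter is unquoted, the shell expands the `TF_VAR_*` variables while writing the file. A minimal, self-contained illustration of this heredoc behavior (the file name and value are just examples):

```shell
# Unquoted heredoc delimiter: variables are expanded into the file
APP_DOMAIN=flows.example.com
cat > /tmp/demo-values.yaml <<EOF
appDomain: $APP_DOMAIN
EOF

cat /tmp/demo-values.yaml # prints "appDomain: flows.example.com"
```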

Install the main Spacelift Flows services from the Helm repository:


```
helm repo add spacelift https://downloads.spacelift.io/helm
helm repo update

helm upgrade spacelift-flows spacelift/spacelift-flows --install -f flows-values.yaml --namespace $TF_VAR_k8s_namespace
```

Monitor deployment progress:



```
kubectl get pods -n $TF_VAR_k8s_namespace --watch
```

Wait for the load balancer to be provisioned:



```
kubectl get ingress -n $TF_VAR_k8s_namespace --watch
```

### 4. Update DNS Records
Once the ingress has an external IP/hostname, create DNS records:


```
# Get the load balancer hostname
LB_HOSTNAME=$(kubectl get ingress spacelift-flows -n $TF_VAR_k8s_namespace -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

echo "Create the following DNS records:"
echo "$TF_VAR_app_domain CNAME $LB_HOSTNAME"
echo "*.endpoints.$TF_VAR_app_domain CNAME $LB_HOSTNAME"
echo "oauth.$TF_VAR_app_domain CNAME $LB_HOSTNAME"
echo "mcp.$TF_VAR_app_domain CNAME $LB_HOSTNAME"
echo "gateway.$TF_VAR_app_domain CNAME $LB_HOSTNAME"
```

## Deploy Agent Pool
### 1. Uncomment the spacelift_flows_agent_pool module
Now that all the backend services are running, uncomment the `spacelift_flows_agent_pool` module in your `main.tf`.

### 2. Run Tofu Apply

```
tofu apply
```

## Verify Deployment
### 1. Configure SES (if using SES)
If you enabled Amazon SES for email delivery (`enable_ses = true`), you must verify your domain in SES and either request production access or verify individual email addresses as identities before emails can be sent.

### 2. Access the Web Interface
Navigate to your domain in a web browser:



```
https://$SPACELIFT_FLOWS_DOMAIN
```

You should see the Spacelift Flows login page.

### 3. Login
On first access, you’ll be prompted to enter your email address to log in. If you have correctly configured your SMTP credentials, you should receive an email with a login link. Alternatively, if you enabled email dev mode, the login link appears in the server logs.

#### Admin Password Login
If you don’t have SMTP configured yet, you can set an `admin_password` variable in the Terraform module to enable password-based login for the admin user. The password must be at least 32 characters long.



```
module "spacelift_flows" {  # ...  admin_password = var.admin_password}
```

Once configured, navigate to `https://$SPACELIFT_FLOWS_DOMAIN/auth/admin-login` to sign in using the admin email and password. This is useful for initial setup, allowing you to configure alternative login methods (such as OIDC) without needing SMTP.
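Since the password must be at least 32 characters, you can generate a suitable value with, for example (assumes `openssl` is installed):

```shell
# 32 random bytes, base64-encoded, yields a 44-character string
export TF_VAR_admin_password="$(openssl rand -base64 32)"
echo "${#TF_VAR_admin_password}" # prints 44
```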

### 4. Test Agent Pool
 1. Log into the web interface
 2. Navigate to the sample flows in your project and verify that they work

### 5. Health Checks
Verify all services are healthy:



```
kubectl get pods -n $TF_VAR_k8s_namespace
```

All pods should show `STATUS: Running` and `READY: 1/1`.

## Configuration Options
### Scaling the Deployment
#### EKS Node Scaling
EKS Auto Mode handles node scaling automatically based on pod resource requests.

### Resource Allocation
Update resource requests and limits in values files:


```
# For application services
applicationServices:
  resources:
    requests:
      memory: "512Mi"
      cpu: "500m"
    limits:
      memory: "1Gi"
      cpu: "1000m"
```

## Troubleshooting
### Common Issues
#### Pods Stuck in Pending
Check node capacity and resource requests:


```
kubectl describe nodes
kubectl get pods -n $TF_VAR_k8s_namespace -o wide
```

EKS Auto Mode should automatically provision nodes, but may take 5-10 minutes.

#### Agent Connection Problems
Check agent logs in CloudWatch Logs.

### Logs and Debugging
#### Application Logs

```
# Server logs
kubectl logs -n $TF_VAR_k8s_namespace -l app.kubernetes.io/component=server -f

# Worker logs
kubectl logs -n $TF_VAR_k8s_namespace -l app.kubernetes.io/component=worker -f

# Gateway logs
kubectl logs -n $TF_VAR_k8s_namespace -l app.kubernetes.io/component=gateway -f
```

#### Infrastructure Logs

```
# Check EKS cluster status
aws eks describe-cluster --name $EKS_CLUSTER_NAME --region $TF_VAR_aws_region
```

## Maintenance
### Updates
#### Application Updates
Update image tags in Helm values and upgrade:



```
helm upgrade spacelift-flows spacelift/spacelift-flows --namespace $TF_VAR_k8s_namespace --values flows-values.yaml
```

#### Infrastructure Updates
Update OpenTofu configuration and apply:


```
tofu plan
tofu apply
```

#### Re-Apply Configuration Secret
Update the main configuration secret:



```
tofu output -raw config_secret_manifest | kubectl apply -f -
```

#### Re-Apply Ingress Configuration
Update the AWS ingress class:



```
tofu output -raw ingress_manifest | kubectl apply -f -
```

Or for private VPC deployments:



```
tofu output -raw internal_ingress_manifest | kubectl apply -f -
```

#### Force Restart Application Pods
Restart application pods without changing the deployment configuration. This is useful when you need to pick up configuration changes from Secrets, or force a fresh start of the application:


```
# Restart server pods
kubectl rollout restart -n spacelift-flows deployment spacelift-flows-server

# Restart worker pods
kubectl rollout restart -n spacelift-flows deployment spacelift-flows-worker

# Restart gateway pods
kubectl rollout restart -n spacelift-flows deployment spacelift-flows-gateway

# Monitor the server rollout status
kubectl rollout status -n spacelift-flows deployment spacelift-flows-server
```

## Cleanup
This will permanently destroy all data. Ensure you have backups if needed.

### Remove Application Services

```
helm uninstall spacelift-flows --namespace $TF_VAR_k8s_namespace
kubectl delete namespace $TF_VAR_k8s_namespace
```

### Destroy Infrastructure
Ensure that you have disabled RDS deletion protection and S3 bucket retention on destroy:


```
s3_retain_on_destroy          = false
rds_delete_protection_enabled = false
```



```
tofu destroy
```

### Manual Cleanup
Some resources may require manual cleanup:

 - ACM certificates (if DNS validation records weren’t removed)
 - Route53 DNS records created manually

## Advanced Settings
For advanced configuration options including private VPC deployment, HTTP proxy configuration, and custom CA certificates, see the [Advanced Settings](../advanced-settings) guide.

## Next Steps
Configure traces with [OpenTelemetry operator](https://github.com/open-telemetry/opentelemetry-operator).