How to set up EKS on AWS with Terraform

After setting up several Kubernetes clusters I would like to share how we do it. I hope this helps people get started with Kubernetes. I'm also keen to read your feedback and improvement proposals.

Why Terraform

From time to time I explore Terraform, and I have done so since its appearance (see my exploration on AWS automation). Basically I think this tool (like many others from HashiCorp) simply had the right idea at the right time. Let me explain: the nature of IT infrastructure is more or less static. No surprise, then, that a declarative approach fits well here. The Terraform creators seem to have had a clear vision of this while evolving the Terraform language design and elaborating the tooling around it. Terraform has become the favorite tool for cloud resource provisioning in many teams.

Meanwhile the concept of "state" has evolved and found its place in the new HashiCorp Terraform Cloud (with a free tier for small or mid-size projects). It's very convenient and improves teamwork. By the way, I'm not affiliated with HashiCorp.

Bootstrapping

OK, how do I start with it? Well, I usually start with an AWS sub-account for the project. It makes sense anyway, especially if there is no connection to other projects or parts of your systems (see AWS Organizations). A new organization has at least one user with enough rights. However, for Terraform I create an additional user, 'terraform user', that has sufficient rights to create my environments. Basically I give it AdministratorAccess.
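If you later want to manage this Terraform user as code as well, a minimal sketch could look like the following (the user name and the managed-policy attachment are my assumptions; the very first bootstrap user is typically created by hand in the AWS console):

# Hypothetical IAM user for Terraform runs
resource "aws_iam_user" "terraform" {
  name = "terraform-user"
}

# Attach the AWS-managed AdministratorAccess policy to it
resource "aws_iam_user_policy_attachment" "terraform_admin" {
  user       = aws_iam_user.terraform.name
  policy_arn = "arn:aws:iam::aws:policy/AdministratorAccess"
}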

From this point on we can bootstrap Terraform.

Terraform config

# See docu https://learn.hashicorp.com/tutorials/terraform/cloud-workspace-configure
provider "aws" {  
  region = "us-west-1" 
}

#https://www.terraform.io/docs/backends/types/remote.html
terraform {  
  backend "remote" {
    hostname = "app.terraform.io"
    organization = "YourOrga"

    workspaces {
      name = "your-aws-infa-workspace"
    }
  }
}

This is basically everything for bootstrapping.

This setup assumes you have the aws-cli installed and configured on your development machine, meaning your local Terraform executions are able to connect to the AWS API.
Secondly, it assumes that the Terraform state will be hosted in Terraform Cloud: backend "remote".

The provided configuration changes your local Terraform workspace to be the remote one.

To be more specific, Terraform Cloud can be operated in two ways:

  1. Hosting only your Terraform state
  2. Hosting the state, but also being the single point of change and of the history of that change (full remote operations)

Here I'm talking about the second scenario, where terraform apply is possible from the cloud UI only. Please also keep in mind that, when using Terraform Cloud, terraform plan and other commands will use variable values from the associated Terraform Cloud workspace. So it's Terraform Cloud where you should configure access to your AWS account with the secret key of your terraform IAM user or role.
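The simplest way is to set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as environment variables in the workspace. If you prefer to wire the credentials in explicitly, a minimal sketch could look like this instead of the plain provider block above (the variable names are my own; their values would be set as sensitive Terraform variables in the workspace):

# Hypothetical variables, configured in the Terraform Cloud workspace
variable "aws_access_key" {}
variable "aws_secret_key" {}

provider "aws" {
  region     = "us-west-1"
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
}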

To sum this up: you need an account at https://app.terraform.io.
This way you can enable team members not only to participate in committing Terraform code, but also to provision the infrastructure, without sharing and maintaining admin credentials to the AWS cloud.

The last step is to set up a workspace and connect your Git repository to it. Now you can provision the infrastructure.
At the end, your runs of terraform plan and terraform apply will be visible in the Terraform Cloud UI.

Provisioning Network

# VPC for kubernetes and all other cluster related resources

resource "aws_vpc" "main" {  
  cidr_block           = "10.100.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name      = "main"
    managedby = "terraform"
  }
}

The subnets. The number of subnets should correspond to the number of availability zones in the region.

## ==== Kubernetes subnets =====

#us-west-2a
resource "aws_subnet" "eks_a" {  
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.100.1.0/24"
  availability_zone = "us-west-2a"
  map_public_ip_on_launch = true

  tags = {
    managedby                                          = "terraform"
    Name                                               = "EKS, AZ a"
    "kubernetes.io/cluster/${local.cluster_name}"      = "shared"
    "kubernetes.io/role/elb"                           = "1"
    "kubernetes.io/role/internal-elb"                  = "1"
  }
}

#us-west-2b
resource "aws_subnet" "eks_b" {  
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.100.2.0/24"
  availability_zone = "us-west-2b"
  map_public_ip_on_launch = true

  tags = {
    managedby                                          = "terraform"
    Name                                               = "EKS, AZ b"
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                           = "1"
    "kubernetes.io/role/internal-elb"                  = "1"
  }
}

#us-west-2c
resource "aws_subnet" "eks_c" {  
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.100.3.0/24"
  availability_zone = "us-west-2c"
  map_public_ip_on_launch = true

  tags = {
    managedby                                          = "terraform"
    Name                                               = "EKS, AZ c"
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                           = "1"
    "kubernetes.io/role/internal-elb"                  = "1"
  }

}

This is pretty straightforward; for details consult the Terraform documentation on Resource: aws_subnet. For the Kubernetes cluster, the provided tags are of interest.
The tags are used by AWS EKS to understand where to put automatically requested load balancers.
EKS requires special subnet tagging: the kubernetes.io/cluster/<cluster name> tag and the kubernetes.io/role/elb (or kubernetes.io/role/internal-elb) role tags shown above. The rest is up to you, and there are not many pitfalls here except map_public_ip_on_launch = true. This is needed because in this scenario I use public subnets. EKS master nodes are managed by AWS and are deployed outside of my VPC, while the workers inside my VPC need to reach their masters, so they need to have public IP addresses. Fine-grained access to the worker nodes is defined by security groups later. If this is not suitable for you, there is a more defensive option with private workers in private subnets (not covered here).
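For reference, local.cluster_name used in these tags (and in the EKS module below) is a Terraform local value that this article assumes you define yourself, for example like this (the name is a placeholder):

locals {
  # Cluster name reused for the subnet tags and the EKS module
  cluster_name = "my-eks-cluster"
}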

And since we are establishing the communication via public addresses, in the AWS universe we need an internet gateway.

resource "aws_internet_gateway" "igw1" {  
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "main-igw1"
    managedby         = "terraform"
  }
}

resource "aws_route" "route_to_igw1" {  
  route_table_id            = "rtb-someId" #haven't found better way than hardcoding so far.
  destination_cidr_block    = "0.0.0.0/0"
  gateway_id                =  aws_internet_gateway.igw1.id

}
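If you want to avoid the hardcoded route table ID, one option is to reference the main route table that AWS creates together with the VPC; a sketch of a drop-in replacement for the block above, under that assumption:

resource "aws_route" "route_to_igw1" {
  # Main route table of the VPC instead of a hardcoded "rtb-..." ID
  route_table_id         = aws_vpc.main.main_route_table_id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.igw1.id
}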


With that, the basic networking is in place and it's time for Kubernetes.

The EKS Cluster

We have had good experiences with the official Terraform EKS module.
Later versions of it utilize the dedicated Terraform Kubernetes provider for the provisioning of cluster users and roles, so my examples show it.

## 1 Cluster Module starts here.

module "eks_cluster" {  
  source                 = "terraform-aws-modules/eks/aws"
  version                = "12.2.0"
  cluster_name           = local.cluster_name
  cluster_version        = "1.17"
  subnets                = [aws_subnet.eks_a.id, aws_subnet.eks_b.id, aws_subnet.eks_c.id]
  vpc_id                 = aws_vpc.main.id
  cluster_create_timeout = "30m" # need to increase module defaults
  write_kubeconfig       = false # Disabled permanent writing of config files
  providers = {
    # Reference to the kubernetes provider, see below
    kubernetes = kubernetes.eks_cluster
  }

  manage_aws_auth = true //TODO enable it https://github.com/terraform-aws-modules/terraform-aws-eks/issues/699

  node_groups_defaults = {
    ami_type  = "AL2_x86_64" #alternative is e.g. AL2_x86_64_GPU
    disk_size = 50
  }


  node_groups = {
    # EKS managed Nodes group with name prefix "ram"
    ram = {
      desired_capacity = 3
      max_capacity     = 10
      min_capacity     = 1

      public_ip = true

      instance_type = "r5.large"
      #Labels for nodes and tags
      k8s_labels = {
        node_type = "default"
      }
      # Resource tags are not labels
      additional_tags = {
        managedby   = "terraform"
      }
    }
  }

  #Users
  # Can be checked with: kubectl describe configmap -n kube-system aws-auth
  map_users = [
    {
      userarn  = aws_iam_user.my_addtionaluser.arn
      username = aws_iam_user.my_addtionaluser.name
      groups   = ["system:masters"]
    }
  ]

  tags = {
    managedby   = "terraform"
  }
}

## Configuration of the kubernetes provider starts here
data "aws_eks_cluster" "eks-cluster" {  
  name = module.eks_cluster.cluster_id
}

data "aws_eks_cluster_auth" "eks-cluster" {  
  name = module.eks_cluster.cluster_id
}

provider "kubernetes" {  
  host                   = data.aws_eks_cluster.eks-cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks-cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.eks-cluster.token
  load_config_file       = false
  alias   = "eks_cluster"
  version = "~> 1.10"
}

I think the configuration is more or less self-explanatory.
It starts with the EKS module, which takes a list of arguments like the cluster version, the list of users, and the list of node groups with details about the machines inside each group.
A reference to the Kubernetes provider is also present.
The configuration of the Kubernetes provider has to be placed here, but it is basically a reference to the cluster. See the EKS module documentation for details.
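One more note: aws_iam_user.my_addtionaluser referenced in map_users is not defined in the snippets above; a minimal sketch of what it could look like (the user name is a placeholder):

resource "aws_iam_user" "my_addtionaluser" {
  name = "my-additional-user" # placeholder name

  tags = {
    managedby = "terraform"
  }
}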

Alternatives

Of course there are alternatives and you can exchange any of the tools here. Meanwhile EKS has evolved (and got a bit simpler), and the native Terraform resource aws_eks_cluster has evolved as well. I'm interested in your experiences with it... Share if you like.
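For comparison, a minimal sketch of that native resource (the control-plane IAM role is an assumption and not defined in this article):

resource "aws_eks_cluster" "native_example" {
  name     = local.cluster_name
  role_arn = aws_iam_role.eks_cluster_role.arn # hypothetical role resource

  vpc_config {
    subnet_ids = [aws_subnet.eks_a.id, aws_subnet.eks_b.id, aws_subnet.eks_c.id]
  }
}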

Recapitulation

Thank you for reading. Now you have some idea of how to install a Kubernetes cluster on AWS (AWS EKS). You know how to prepare Terraform for it and what the major AWS resources to create are. You even have working code that will provision a cluster, as long as you define ${local.cluster_name} and the additional user correctly...
Happy to read your experiences and improvement proposals!