A Rough Kubernetes Migration (2) - Amazon EKS and Karpenter

The Kubernetes migration we started last year is moving forward little by little.
This time, following Azure Kubernetes Service, I've written up the issues we ran into while setting up our Amazon EKS environment.

Second up to bat: AWS

AWS is our company's primary cloud environment. Many of our services were already running on AWS, and because most developers are familiar with it, we expected things to go more smoothly than they had on Azure.

That said, the Azure environment covered in the previous post had higher priority, so work on AWS only began in April, starting with the Infra Team EKS cluster.

Infra Team EKS

The Infra Team EKS cluster was set up with the following goals in mind.

  1. We wanted a clear separation of roles.
  2. The tools our team uses are, by nature, almost entirely internal and only exposed externally in limited ways, so we thought it best to keep them isolated.
  3. We also wanted to shrink the infra team's management scope, which had become fragmented.

We wanted to apply what we learned here to make later EKS clusters easier to build, so we researched and tested as much as we could along the way.

One thing changed when we moved to AWS: on the Azure side we wrote the Terraform code for the cluster ourselves and used modules only sparingly, whereas the AWS modules are well organized and very easy to use. In particular, we adapted the Karpenter example [1] from the terraform-aws-eks module, and with only minor changes we had an initial environment up and running.

Karpenter did need some extra work, though, which I'll cover later.

Configuring the VPC

Networking was the part we paid the most attention to when building the EKS cluster.
First we needed a VPC, which we set up with the VPC module. Used well, the VPC module condenses a lot of networking configuration, and it has a great many options, so I recommend reading through it closely, starting with the README.

We defined both the public and private subnets that EKS would use and chose a single NAT gateway configuration for cost efficiency. As in the Karpenter example, the subnets also need tags for the AWS Load Balancer Controller and Karpenter.

public_subnet_tags = {
  "kubernetes.io/role/elb" = 1
}

private_subnet_tags = {
  "kubernetes.io/role/internal-elb" = 1
  "karpenter.sh/discovery" = var.name
}

These tags rarely need to change, but they matter. Without them, load balancers and Karpenter nodes cannot automatically discover and attach to the right subnets when they are deployed.
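As a rough sketch of how the Karpenter side consumes the discovery tag: an EC2NodeClass can select subnets (and security groups) by tag. The resource name `default`, the cluster name `my-cluster`, the node role name, and the AMI alias below are placeholders assumed for illustration, and the `karpenter.k8s.aws/v1` API version assumes a recent Karpenter release. None of these values come from our actual setup.

```yaml
# Hypothetical EC2NodeClass showing tag-based discovery (all names are placeholders).
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  # IAM role assumed by the nodes Karpenter launches (placeholder name).
  role: KarpenterNodeRole-my-cluster
  amiSelectorTerms:
    - alias: al2023@latest
  # Matches the private subnets tagged "karpenter.sh/discovery" = <cluster name>.
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  # Assumes the node security group carries the same tag.
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
```

If no subnet carries the tag, the selector matches nothing and Karpenter has nowhere to launch nodes, which is the failure described above.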

Designing the traffic flow

We planned for two basic scenarios.

  1. External traffic: an ALB (Application Load Balancer) created by the AWS Load Balancer Controller
    • External traffic > ALB > Ingress > Service > Pod
  2. Internal traffic: NGINX Ingress Controller + NLB (Network Load Balancer)
    • Internal traffic > NLB > NGINX Ingress > Service > Pod
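The two flows above could look roughly like the manifests below. This is only a sketch under several assumptions: the names `example-external` and `example-service` are made up, the internal NLB is assumed to be provisioned by the AWS Load Balancer Controller through Service annotations, and the selector labels assume the upstream ingress-nginx chart. With Helm, the Service annotations would normally be set through the chart's values rather than written by hand.

```yaml
# External flow: an Ingress handled by the AWS Load Balancer Controller (creates an ALB).
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-external            # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service   # hypothetical backend Service
                port:
                  number: 80
---
# Internal flow: the NGINX Ingress Controller's own Service, exposed through an internal NLB.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/component: controller
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
```

Internal applications would then use an Ingress with `ingressClassName: nginx`, so their traffic goes through the NLB and the NGINX controller rather than through an ALB.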

While setting up the Azure environment we learned that the external load balancer should be an L7 load balancer [2], so we went with an ALB. For internal traffic we kept things simple with an NLB plus the NGINX Ingress Controller.

There was also work to connect the company VPN to internal traffic, but that is outside the scope of this post, so I'll skip it.

Improving Karpenter

With the basic setup in place, we deployed a test app to check networking and basic Karpenter behavior.
Everything looked fine at that point, but once we started deploying real services we found gaps and went through some trial and error. At the center of it was Karpenter, which we had configured exactly as in the example code.

What is Karpenter?

I've mentioned it a few times already, but let me describe it properly here.
Karpenter is an autoscaler that dynamically provisions nodes for a Kubernetes cluster. Compared with the existing Cluster Autoscaler, Karpenter manages nodes in a more efficient and flexible way.

  1. Karpenter continuously watches Pod state to work out what is currently scheduled and which Pods cannot be scheduled. Based on that, it picks the best-fitting VM within the configured constraints and provisions it. This keeps wasted resources to a minimum and, by making use of Spot instances and the like, also brings a cost advantage.
  2. Karpenter also provisions instances by calling the EC2 API directly, so it is very fast. In practice, under normal conditions, it took Karpenter less than a minute to decide which instance was needed and provision it. Considering that the Cluster Autoscaler we used on Azure took around 3 to 4 minutes, that is very fast.
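One way to watch this behavior is to deploy Pods that cannot fit on the existing nodes. The sketch below is modeled on the `inflate` test deployment from the Karpenter getting-started guide. The replica count and CPU request are arbitrary values picked for illustration, not what we actually used.

```yaml
# Test deployment whose CPU requests are meant to exceed spare capacity,
# leaving Pods Pending so that Karpenter has something to provision for.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5                       # illustrative value
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          # The pause container does nothing; only its resource request matters.
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: "1"              # illustrative value
```

Scaling the deployment up and down while watching the node list shows how quickly nodes appear and how they are consolidated once the Pods are gone.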

Our company already spends a lot on cloud and is very cost sensitive, and teammates were also strongly of the opinion that if we use EKS, Karpenter is a must.

How Karpenter works

Karpenter architecture

Image: Karpenter Overview from Karpenter official homepage, Apache 2.0

The actual architecture is more involved, but Karpenter can be thought of like this:

  • There is a Karpenter controller that orchestrates nodes, and
  • the Karpenter controller defines and manages self-managed node pools based on the EC2NodeClass and NodePool resources you define.

I didn't understand this at first. In particular, there were two things I had misunderstood:

  • The Karpenter controller itself runs on a managed node pool, so that pool is managed by the system, and its auto-scaling follows the managed node pool's settings.
  • The nodes Karpenter creates refer to separate NodePool and EC2NodeClass resource definitions and have nothing to do with the managed node pool.
    • Note that Karpenter node pools being self-managed is by design, and it is normal that the system does not manage them.
    • However, the resources in the example are defined for the general case, so they needed tuning and changes.
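To make the relationship concrete, here is a minimal NodePool sketch. It assumes the `karpenter.sh/v1` API of a recent Karpenter release and an existing EC2NodeClass named `default`. The name, the limits, and the timing values are placeholders rather than our real settings.

```yaml
# Hypothetical NodePool: scheduling constraints live here,
# while AWS-specific settings live in the referenced EC2NodeClass.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      # Link to the EC2NodeClass (AMI, subnets, security groups, volumes).
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      # Nodes are replaced after this lifetime (placeholder value).
      expireAfter: 720h
  # Upper bound on the total CPU this pool may provision (placeholder value).
  limits:
    cpu: "100"
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```

Nothing in this resource refers to the managed node group. Changing how Karpenter nodes behave means editing the NodePool or the EC2NodeClass, not the managed node pool.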

While deploying apps we found that volume configuration and auto-scaling were not working properly on the Karpenter nodes. At first I assumed that changing the managed node pool's settings would fix it, which caused some confusion. Only after digging in further did we see what actually needed fixing, and we changed several settings to resolve the problems.

Using gp3 volumes

gp3 is the newer generation of general-purpose EBS volume. Compared with gp2, it provides a baseline of 3,000 IOPS regardless of volume size and costs less per GB. So gp3 is usually the better choice, but unless configured otherwise, EKS uses gp2 volumes by default.
Besides those benefits, services are often deployed with a StorageClass that assumes gp3, so we needed to configure gp3 explicitly.

As described above, to make Karpenter nodes use gp3 volumes you have to modify the EC2NodeClass object those nodes reference.
For example, set blockDeviceMappings in the EC2NodeClass spec as follows.

blockDeviceMappings:
  - deviceName: /dev/xvda
    ebs:
      volumeSize: 100Gi
      volumeType: gp3
      iops: 3000
      throughput: 125
      encrypted: true
      deleteOnTermination: true
  - deviceName: /dev/xvdb
    ebs:
      volumeSize: 20Gi
      volumeType: gp3
      iops: 3000
      throughput: 125
      encrypted: true
      deleteOnTermination: true

Here /dev/xvda and /dev/xvdb are block device names:

  • /dev/xvda is the root volume, and
  • /dev/xvdb is the data volume.

These need to be set appropriately. Normally only the root volume, /dev/xvda, is used, but on AMIs that split their volumes, such as the Bottlerocket AMI, the data volume /dev/xvdb is used for things like pulling container images. [3]

For reference, this is how to configure gp3 volumes on a managed node pool.

block_device_mappings = {
  xvda = {
    device_name = "/dev/xvda"
    ebs = {
      volume_size           = 10
      volume_type           = "gp3"
      iops                  = 3000
      throughput            = 125
      encrypted             = true
      delete_on_termination = true
    }
  }
  xvdb = {
    device_name = "/dev/xvdb"
    ebs = {
      volume_size           = 20
      volume_type           = "gp3"
      iops                  = 3000
      throughput            = 125
      encrypted             = true
      delete_on_termination = true
    }
  }
}

(Added 06.09) After configuring the volumes, you also need to define a StorageClass like the following so that gp3 becomes the default StorageClass.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
  encrypted: "true"

The provisioner here must be ebs.csi.aws.com (the EBS CSI driver) for PVCs to be provisioned correctly as EBS volumes.
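A quick way to check the default StorageClass is a claim that does not name a class at all. The sketch below is hypothetical (the names `example-data` and `example-app`, the 10Gi size, and the busybox image are all assumptions). Because the class uses WaitForFirstConsumer, the PVC stays Pending until a Pod that mounts it is scheduled.

```yaml
# Claim without storageClassName: the cluster's default class (gp3 above) applies.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data                # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# Pod that consumes the claim, which triggers the actual EBS volume creation.
apiVersion: v1
kind: Pod
metadata:
  name: example-app                 # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:stable
      command: ["sh", "-c", "echo hello > /data/hello && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: example-data
```

If the PVC never leaves Pending even after the Pod is scheduled, the EBS CSI driver and its IAM permissions are the first things worth checking.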

Configuring IAM roles

Karpenter needs several IAM roles configured to work properly.

  1. The EBS CSI driver needs an IRSA role.
    • Without this role, problems occur when Karpenter nodes are created, or when leftover volumes are cleaned up after a node goes away.
    • In our case, PVCs backed by EBS volumes were not reclaimed even after the Karpenter node was gone.
    • The role is easy to create with a module. [4]
module "ebs_csi_irsa_role" {
  source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"

  role_name             = "${var.name}-ebs-csi-role"
  attach_ebs_csi_policy = true

  oidc_providers = {
    ex = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:ebs-csi-controller-sa"]
    }
  }

  tags = var.tags
}
  2. The Karpenter controller needs an IRSA role.
    • This role lets Karpenter create and launch EC2 instances, launch templates, and so on.
    • In the Karpenter module, set enable_irsa to true and add the related settings.
  3. The EBS CSI driver policy needs to be attached to the Karpenter node IAM role.
    • This lets Karpenter nodes create and use EBS volumes.
module "karpenter" {
  create = var.enable_karpenter
  source = "terraform-aws-modules/eks/aws//modules/karpenter"

  enable_irsa = true
  irsa_oidc_provider_arn = module.eks.oidc_provider_arn
  irsa_namespace_service_accounts = [
    "kube-system:karpenter"
  ]

  node_iam_role_additional_policies = {
    AmazonSSMManagedInstanceCore = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
    AmazonEBSCSIDriverPolicy     = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
  }

  tags = var.tags
}
  4. Finally, the role ARN needs to be set on the Karpenter service account.
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/karpenter-controller-role
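
The EBS CSI role from step 1 is bound in the same way, through an annotation on the driver's service account. The sketch below assumes the driver runs in kube-system under the `ebs-csi-controller-sa` service account, as in the Terraform snippet above. The account ID and role name are placeholders.

```yaml
# Hypothetical ServiceAccount for the EBS CSI controller with its IRSA role attached.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ebs-csi-controller-sa
  namespace: kube-system
  annotations:
    # Placeholder ARN; in practice this is the ARN of the role created in step 1.
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/my-cluster-ebs-csi-role
```

If the driver is installed as an EKS managed add-on, the role ARN is usually passed through the add-on's service account role setting instead of annotating the service account by hand.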

Using Spot instances

When we looked into why Karpenter nodes were failing to auto-scale, it turned out that Karpenter could not create Spot instances. Karpenter prefers Spot instances, and it cannot create them without the necessary permission.

To fix this we added two settings:

  1. We granted permission for Spot instances through a service-linked role.
resource "aws_iam_service_linked_role" "spot" {
  aws_service_name = "spot.amazonaws.com"
}
  2. We added a requirement on karpenter.sh/capacity-type to the NodePool resource. The example below allows every capacity type without restriction.
requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot", "on-demand", "reserved"]

Wrapping up

Once all of these settings were in place, real service deployments went through without problems.
We could easily have adopted Karpenter blindly and ended up unable to use it properly, so I'm glad we found and fixed the problems early.

Of course, our current Karpenter setup is not perfect. Apart from the few settings we added, it has not changed much, so for production we will need settings that restrict instance types, architectures, and so on, and ultimately more optimized configuration such as Weighted NodePools. [5]
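
As a sketch of where that could go, the two NodePools below restrict architecture and instance category and use `weight` so that the Spot pool is considered first. Everything here is hypothetical: the pool names, weights, limits, and instance categories are made up, and an EC2NodeClass named `default` is assumed to exist. We have not applied this configuration.

```yaml
# Preferred pool: amd64 Spot capacity from a few instance categories.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-preferred              # hypothetical name
spec:
  weight: 50                        # higher weight is considered first
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
  limits:
    cpu: "64"                       # placeholder cap for this pool
---
# Fallback pool: On-Demand capacity, used when the preferred pool cannot satisfy a Pod.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: on-demand-fallback          # hypothetical name
spec:
  weight: 10
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
```

The weight expresses a preference between pools. How strictly it is honored in every scheduling situation is something to confirm against the Karpenter documentation before relying on it.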

While setting up EKS, Karpenter included, there was more reference material available than on Azure, yet I was reminded once again that a production environment is a different matter.
That is because this was the team's first time applying Kubernetes in a cloud environment, and, though I haven't listed everything here, there was a lot to configure to match the conditions of our existing services. Still, one EKS cluster is now fully set up, and building the production EKS cluster on top of that took comparatively little time.

New services are now deployed to EKS by default, and we are in the process of moving existing services over as well.
Within the team, the end goal is to build a cloud native environment by evaluating and adopting tools we couldn't use before, such as Argo CD and Harbor.

Progress may slow down a bit, but we intend to take our time so that we end up with a stable, resilient environment.

Footnotes

  1. https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/karpenter

  2. https://blog.haulrest.me/blog/250225-01/#azure-load-balancer%EC%99%80-azure-application-gateway

  3. https://awslabs.github.io/data-on-eks/docs/bestpractices/scalability/preload-container-images

  4. https://github.com/terraform-aws-modules/terraform-aws-iam/tree/master/examples/iam-role-for-service-accounts-eks

  5. https://karpenter.sh/docs/concepts/scheduling/#weighted-nodepools