DEVELOPER BLOG


Demystifying AWS Load Balancing and Auto Scaling in ECS Fargate

1. Introduction

This article demystifies AWS load balancing and auto scaling in the context of ECS Fargate. These two features are central to building scalable, highly available containerized applications: a load balancer spreads incoming traffic across your Fargate tasks, while auto scaling adjusts the number of tasks to match demand.

2. Purpose

The goal of this article is to explain how AWS load balancing and auto scaling work with ECS Fargate, walk through setting them up in the AWS Management Console, and show how the same setup can be automated with PrismScaler.

3. What is an AWS load balancer?

Amazon Web Services (AWS) provides several types of load balancers to distribute incoming traffic across multiple targets, such as Amazon EC2 instances, containers, and IP addresses, to ensure high availability, fault tolerance, and efficient resource utilization. Load balancers play a crucial role in improving the performance and reliability of applications hosted in the cloud. Here are the main types of AWS load balancers:
  1. Application Load Balancer (ALB):
  • ALB operates at the application layer (Layer 7) and is best suited for routing HTTP/HTTPS traffic.
  • It supports advanced, content-based routing using rules and listeners.
  • ALB can distribute traffic to multiple targets, such as EC2 instances, ECS tasks, and Lambda functions.
  • It can be integrated with AWS WAF (Web Application Firewall) for security enhancements.
  2. Network Load Balancer (NLB):
  • NLB operates at the transport layer (Layer 4) and is designed to handle very high volumes of traffic efficiently.
  • It is suited for TCP/UDP-based traffic and provides ultra-low latency.
  • NLB can distribute traffic to targets within the same or different Availability Zones.
  3. Gateway Load Balancer (GWLB):
  • GWLB is used for routing traffic to third-party virtual appliances, such as firewalls and intrusion detection systems.
  • It operates at the network layer (Layer 3) and exchanges traffic with appliances using the GENEVE protocol.
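ALB's content-based routing is easiest to picture as a listener rule. Below is a minimal sketch in the parameter shape accepted by boto3's elbv2 create_rule call; the ARNs, priority, and path are illustrative placeholders, not values from this article:

```python
# Hedged sketch: an ALB content-based routing rule in the parameter shape
# of boto3's elbv2.create_rule(). All ARNs here are placeholders.
listener_rule_params = {
    "ListenerArn": "arn:aws:elasticloadbalancing:region:123456789012:listener/app/my-alb/abc/def",  # placeholder
    "Priority": 10,  # lower numbers are evaluated first
    "Conditions": [
        # Match any request whose path starts with /api/
        {"Field": "path-pattern", "PathPatternConfig": {"Values": ["/api/*"]}}
    ],
    "Actions": [
        # Forward matching requests to a dedicated target group
        {"Type": "forward",
         "TargetGroupArn": "arn:aws:elasticloadbalancing:region:123456789012:targetgroup/api-tg/123"}  # placeholder
    ],
}
```

A real listener typically carries several such rules, evaluated in priority order, with a default action for requests that match none of them.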
Benefits of using AWS load balancers include:
  • High Availability: Load balancers help distribute traffic to healthy targets, ensuring that your applications remain available even if some targets experience failures.
  • Auto Scaling: Load balancers work seamlessly with auto scaling groups, enabling your application to scale up or down based on demand.
  • Security: Load balancers can offload SSL/TLS encryption and decryption, enhancing the security of your applications.
  • Health Checks: Load balancers perform regular health checks on targets to ensure they are responsive and healthy.
  • Efficient Traffic Distribution: Load balancers intelligently distribute traffic to achieve optimal resource utilization and performance.
  • Centralized Management: Load balancers are managed services, reducing the operational overhead for configuring and maintaining complex load balancing setups.

4. What is auto scaling in ECS (Fargate launch type)?

Auto Scaling in Amazon Elastic Container Service (ECS) with AWS Fargate is a feature that automatically adjusts the number of running Fargate tasks or services based on defined scaling policies and conditions. It ensures that your application has the right amount of compute resources to handle varying levels of traffic or workload demands without manual intervention.

With ECS Fargate Auto Scaling, you set up rules and conditions that determine when to scale your tasks or services up or down. This allows you to maintain application performance and availability while optimizing costs, because the number of running tasks adjusts dynamically in response to changes in load.

Here's how Auto Scaling works in ECS Fargate:
Scaling Policies:
  • You define scaling policies that determine when and how to scale. These policies are based on conditions such as CPU utilization, memory utilization, or custom CloudWatch metrics.
Target Values:
  • For each scaling policy, you set target values for the metric you're monitoring (e.g., CPU utilization). When the metric breaches these thresholds, the Auto Scaling mechanism is triggered.
Scaling Actions:
  • When the defined metric breaches the thresholds, Auto Scaling takes action based on your scaling policies. If the metric exceeds the target, Auto Scaling can scale out by adding more Fargate tasks or services. If the metric falls below the target, it can scale in by reducing the number of tasks or services.
Integration with CloudWatch Alarms:
  • Auto Scaling policies can be associated with CloudWatch alarms that monitor the specified metrics. When an alarm state changes, such as crossing a threshold, the Auto Scaling policy is triggered.
Adjusting Capacity:
  • Based on the scaling actions, ECS Fargate will automatically adjust the number of tasks or services to match the desired capacity set in your scaling policy.
Health Checks:
  • Before performing any scaling actions, ECS Fargate ensures that the newly launched tasks are healthy and ready to accept traffic.
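The steps above map onto the Application Auto Scaling API. As a hedged sketch, these are the parameter shapes you would pass to boto3's application-autoscaling client (register_scalable_target and put_scaling_policy); the cluster and service names, capacities, and target value are illustrative assumptions:

```python
# Hedged sketch: ECS service auto scaling via the Application Auto Scaling
# API. Cluster/service names and numeric values are placeholders.
scalable_target_params = {
    "ServiceNamespace": "ecs",
    "ResourceId": "service/my-cluster/my-service",  # placeholder cluster/service
    "ScalableDimension": "ecs:service:DesiredCount",
    "MinCapacity": 2,    # never scale in below 2 tasks
    "MaxCapacity": 10,   # never scale out beyond 10 tasks
}

# Target tracking policy: keep average CPU near 60%. Application Auto
# Scaling adds tasks when the metric stays above target and removes
# them when it stays below, within the min/max bounds above.
scaling_policy_params = {
    "PolicyName": "cpu-target-tracking",
    "ServiceNamespace": "ecs",
    "ResourceId": scalable_target_params["ResourceId"],
    "ScalableDimension": "ecs:service:DesiredCount",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "ScaleOutCooldown": 60,   # seconds to wait between scale-out actions
        "ScaleInCooldown": 120,   # scale in more conservatively than out
    },
}
```

With a target tracking policy, the CloudWatch alarms described above are created and managed for you; step scaling policies are the alternative when you want explicit control over each threshold and adjustment.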
ECS Fargate Auto Scaling offers several benefits:
  • Optimal Resource Utilization: Auto Scaling ensures that you have the right amount of resources to handle the load, preventing overprovisioning or underutilization.
  • Application Performance: By scaling out during traffic spikes, you maintain application performance and responsiveness for users.
  • Cost Optimization: Auto Scaling helps you reduce costs by automatically scaling in during periods of low demand, minimizing the use of unnecessary resources.
  • Operational Efficiency: Auto Scaling eliminates the need for manual intervention in adjusting resources, allowing your applications to adapt to workload changes without human involvement.

5. Creating AWS load balancing for Fargate with the AWS console

This section provides a step-by-step guide to setting up AWS load balancing for Fargate using the AWS Management Console. It covers creating a load balancer, defining a target group, and integrating them with a Fargate service.

Step 1. Create a target group
Each target group is used to route requests to one or more registered targets. When a rule condition is met, traffic is forwarded to the corresponding target group. Go to the Load Balancers feature in the EC2 service → go to Target groups → click Create target group.
For Choose a target type, choose Instances to register targets by instance ID, IP addresses to register targets by IP address, or Lambda function to register a Lambda function as a target. If your service's task definition uses the awsvpc network mode (which is required for the Fargate launch type), you must choose IP addresses as the target type. This is because tasks that use the awsvpc network mode are associated with an elastic network interface, not an Amazon EC2 instance. In this context, we choose the IP addresses type.
You can skip target selection for now; we will register targets later. Then create the target group.

Step 2. Create a load balancer
Navigate to Load Balancers → click Create load balancer. In this blog we use an Application Load Balancer, so choose it. Then configure the basic settings:
  • For Scheme, choose Internet-facing or Internal.
An internet-facing load balancer routes requests from clients to targets over the internet. An internal load balancer routes requests to targets using private IP addresses.
  • For IP address type, choose the IP addressing for the container subnets.
In Network mapping, choose the VPC and the subnets for your load balancer.
  1. For VPC, select the same VPC in which your service's tasks will run.
  2. For Mappings, select the Availability Zones to use for your load balancer. If there is one subnet for an Availability Zone, it is selected automatically; if there is more than one, select one of them. You can select only one subnet per Availability Zone. Your load balancer subnet configuration must include all Availability Zones that your tasks run in.
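The console choices from Steps 1 and 2 can also be sketched as API parameters, in the shape accepted by boto3's elbv2 client (create_target_group and create_load_balancer). All names, VPC/subnet IDs, and security group IDs below are placeholders:

```python
# Hedged sketch of Steps 1-2 as boto3 elbv2 parameters.
# Every identifier here is a placeholder, not a real resource.
target_group_params = {
    "Name": "fargate-web-tg",
    "Protocol": "HTTP",
    "Port": 80,
    "VpcId": "vpc-0123456789abcdef0",  # placeholder: the VPC your tasks run in
    "TargetType": "ip",                # required for awsvpc-mode (Fargate) tasks
    "HealthCheckPath": "/",            # the load balancer's periodic health probe
}

load_balancer_params = {
    "Name": "fargate-web-alb",
    "Type": "application",
    "Scheme": "internet-facing",       # or "internal" for private-IP routing
    "IpAddressType": "ipv4",
    "Subnets": ["subnet-aaaa1111", "subnet-bbbb2222"],  # one per AZ, placeholders
    "SecurityGroups": ["sg-0123abcd"], # must allow ports 80/443 from the Internet
}
```

Note that TargetType is "ip", matching the console requirement explained in Step 1, and that exactly one subnet is listed per Availability Zone.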
Choose a security group and the target group. Note: make sure your security group allows access from the Internet on the ports you need, such as 80 and 443. Then create the load balancer and wait for its state to change from "Provisioning" to "Active".

Step 3. Create an ECS Fargate service with the load balancer
Assuming you already have an ECS Fargate cluster and a task definition, you just need to create a new service in your cluster. Go to your ECS cluster → open the Services tab → choose Create. In Deployment configuration, choose your task definition, then enter the service name and the desired number of tasks. Enter a number greater than one: with an ALB in front, we want to load balance across multiple tasks.
In Networking, you can keep the defaults or configure a different VPC. In Load balancing, choose the ALB you created in the previous step, and choose the existing target group as well. Now the service is ready: click Create and wait for the service to come up.
Estimated effort: with DevOps knowledge, 8-10 hours; without DevOps knowledge, 4-5 days.
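The service creation in Step 3 can likewise be sketched as the parameters you would pass to boto3's ecs create_service call; the cluster name, task definition, container name, ARNs, and IDs are all placeholders:

```python
# Hedged sketch of Step 3 as boto3 ecs.create_service() parameters.
# All names and ARNs are placeholders.
service_params = {
    "cluster": "my-fargate-cluster",      # placeholder cluster name
    "serviceName": "web-service",
    "taskDefinition": "web-task:1",       # placeholder family:revision
    "desiredCount": 2,                    # more than one task, so the ALB balances
    "launchType": "FARGATE",
    "networkConfiguration": {
        "awsvpcConfiguration": {
            "subnets": ["subnet-aaaa1111", "subnet-bbbb2222"],  # placeholders
            "securityGroups": ["sg-0123abcd"],
            "assignPublicIp": "DISABLED",  # tasks stay private behind the ALB
        }
    },
    "loadBalancers": [
        {
            # Placeholder ARN of the target group created in Step 1
            "targetGroupArn": "arn:aws:elasticloadbalancing:region:123456789012:targetgroup/fargate-web-tg/abc123",
            "containerName": "web",        # a container named in the task definition
            "containerPort": 80,
        }
    ],
}
```

The loadBalancers entry is what ties the service to the target group: as Fargate launches tasks, their elastic network interface IPs are registered with the target group automatically.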

6. Creating AWS load balancing for Fargate with PrismScaler

PrismScaler provides an intuitive model and a concise form: you just fill in the necessary information and press Create, and PrismScaler automatically builds a simple containerized web application/system on AWS. On AWS, PrismScaler creates these resources for you:
  • One load balancer, public
  • N Fargate containers (variable), private
  • The role, security group, and cluster associated with the above
PrismScaler provides many architectural patterns suitable for a range of use cases. Estimated effort: 5-15 minutes (without DevOps knowledge).

7. Auto scaling vs. no auto scaling cases

Auto Scaling in ECS Fargate: As described in section 4, auto scaling dynamically adjusts the number of running tasks or services based on scaling policies tied to metrics such as CPU utilization, memory usage, or custom CloudWatch metrics. When a monitored metric crosses its target, ECS Fargate scales out (adds tasks) or scales in (removes tasks) to match the desired capacity, keeping your application responsive under varying load while optimizing resource usage and maintaining high availability.
No Auto Scaling Case in ECS Fargate: In a non-auto scaling scenario, you manually determine the number of tasks or services running in your ECS Fargate environment. There are no dynamic adjustments based on metrics or conditions. You explicitly configure and maintain the desired task count. Here's how a non-auto scaling case works:
  1. Manual Configuration: You manually set the number of tasks or services you want to run in your ECS Fargate cluster. This value remains static unless you manually change it.
  2. No Dynamic Adjustments: There is no automatic adjustment of tasks or services based on workload changes or metrics. If the demand on your application changes, you need to manually adjust the task count.
  3. Resource Provisioning: You're responsible for ensuring that you have enough resources provisioned to handle expected workloads. This could lead to either underutilization during periods of low demand or overutilization during traffic spikes.
  4. Simple Configuration: Non-auto scaling configurations are often simpler to set up, but they require regular manual monitoring and adjustment to ensure optimal resource usage and performance.
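The contrast is also visible in the API shape. With no auto scaling, every capacity change is an explicit call a human makes, e.g. boto3's ecs update_service with a new desiredCount; the names below are placeholders:

```python
# Hedged sketch: manual (no auto scaling) capacity change. A human picks
# the new count and calls ecs.update_service(**manual_scale_params).
# Cluster/service names are placeholders.
manual_scale_params = {
    "cluster": "my-fargate-cluster",
    "service": "web-service",
    "desiredCount": 6,  # chosen by an operator ahead of an expected traffic spike
}

# With auto scaling (section 4), this call is replaced by a policy that
# Application Auto Scaling evaluates continuously against CloudWatch metrics,
# so no human needs to pick the number.
```

This makes the trade-off concrete: the manual path is simpler to set up but requires someone to watch the metrics and re-issue this call whenever demand changes.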
 

8. Reference

The AWS documentation used in this article:
https://docs.aws.amazon.com/AmazonECS/latest/userguide/create-application-load-balancer.html#alb-configure-routing