Scenario / Questions

In my ECS cluster, I have a service running on 2 EC2 instances. I updated the task definition to use a new Docker image, but the tasks from the old task definition are still running even though a new revision exists.

I have used the following commands to update the task definition and service.

    aws ecs register-task-definition --family service90-task --cli-input-json file://service90-task.json

    aws ecs update-service --cluster service90-cluster --service service90-service --desired-count 0

    TASK_REVISION=`aws ecs describe-task-definition --task-definition service90-task | egrep "revision" | tr "/" " " | awk '{print $2}' | sed 's/"$//'`

    aws ecs update-service --cluster service90-cluster --service service90-service --task-definition service90-task:${TASK_REVISION} --desired-count 2

I tried several times but can’t figure out where I went wrong. I want the ECS service to run the new task definition instead of the old one.

Find below all possible solutions or suggestions for the above question.

Suggestion: 1

As I found out later on, the tasks were not updated because the desired count is set to 2 and only 2 EC2 instances are available. The ECS service scheduler tries to keep the desired count of tasks running during the deployment, so no instance is free to start a task from the new definition.

Solution: have one more EC2 instance than the desired number of tasks (3 EC2 instances in this case).

This way a task from the new task definition can start on the extra instance. Once it is stable there, ECS drains connections on the other two instances, which still run the old task definition, while the load balancer redirects traffic to the updated instance. ECS then replaces the old tasks with new ones and keeps the desired count at 2.
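
To make the capacity arithmetic concrete: during a rolling update, ECS keeps at least minimumHealthyPercent of the desired count running (rounded up) and allows at most maximumPercent (rounded down). The sketch below is only an illustration. It assumes the commonly documented defaults of 100 and 200, and deployment_bounds is a hypothetical helper name, not part of the AWS CLI.

```shell
#!/bin/sh
# Hypothetical helper: prints "<min running tasks> <max running tasks>"
# for a rolling update, given desired count, minimumHealthyPercent
# and maximumPercent.
deployment_bounds() {
    desired=$1
    min_pct=$2
    max_pct=$3
    # The lower bound is rounded up to the nearest integer
    min_tasks=$(( (desired * min_pct + 99) / 100 ))
    # The upper bound is rounded down to the nearest integer
    max_tasks=$(( desired * max_pct / 100 ))
    echo "$min_tasks $max_tasks"
}

# Desired count 2 with the assumed defaults (100 / 200)
deployment_bounds 2 100 200   # prints: 2 4
```

With a lower bound of 2, neither old task may be stopped first, so a new task can only start if the cluster has spare capacity. If each instance can host just one task (for example because of a fixed host port), that means a third instance.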

Suggestion: 2

An alternative solution is to set the Minimum healthy percent deployment option of the service to 0, which results in the existing tasks being stopped prior to the new version being deployed.

This allows single-EC2-instance clusters to be used, with the associated cost savings.

It is not suitable for production, as you will have downtime during deployments.
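
The option can also be set from the CLI with the --deployment-configuration flag of update-service. A minimal sketch, assuming the cluster and service names from the question; the run wrapper is a hypothetical helper that only prints the command unless DRY_RUN=0 is set, so the snippet can be tried without touching a real cluster.

```shell
#!/bin/sh
# Names taken from the question; adjust to your own setup
CLUSTER="service90-cluster"
SERVICE="service90-service"

# Hypothetical helper: print the command by default, execute it when DRY_RUN=0
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Allow ECS to stop all old tasks before starting new ones
run aws ecs update-service \
    --cluster "$CLUSTER" \
    --service "$SERVICE" \
    --deployment-configuration "maximumPercent=100,minimumHealthyPercent=0"
```

Setting maximumPercent to 100 here keeps the service from running more than the desired count at any time, which is what a cluster with no spare instance needs.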

Suggestion: 3

To update the task definition used by the tasks running in the service, you need to stop the old tasks and start new ones.

This is how I solved the problem of updating the task definition in running tasks.

I have written the following code:

    # Register a new task definition
    aws ecs register-task-definition --family testing-cluster --cli-input-json file://scripts/taskdefinition/testingtaskdef.json --region "$AWS_REGION"

    # Update the service in the cluster
    aws ecs update-service --cluster "$CLUSTER_NAME" --service "$SERVICE" --task-definition testing-cluster --desired-count 1 --region "$AWS_REGION"



    DESCRIBED_SERVICE=$(aws ecs describe-services --region "$AWS_REGION" --cluster "$CLUSTER_NAME" --services "$SERVICE")
    CURRENT_DESIRED_COUNT=$(echo "$DESCRIBED_SERVICE" | jq --raw-output ".services[0].desiredCount")
    #    - echo $CURRENT_DESIRED_COUNT

    CURRENT_TASK_REVISION=$(echo "$DESCRIBED_SERVICE" | jq -r ".services[0].taskDefinition")
    echo "Current Task definition in Service: $CURRENT_TASK_REVISION"

    CURRENT_RUNNING_TASK=$(echo "$DESCRIBED_SERVICE" | jq -r ".services[0].runningCount")
    echo "$CURRENT_RUNNING_TASK"

    CURRENT_STALE_TASK=$(echo "$DESCRIBED_SERVICE" | jq -r ".services[0].deployments | .[] | select(.taskDefinition != \"$CURRENT_TASK_REVISION\") | .taskDefinition")
    echo "Task defn apart from current service Taskdefn: $CURRENT_STALE_TASK"
    #   - echo $CURRENT_STALE_TASK

    # List every task in the cluster; describe-tasks accepts full ARNs, so no trimming is needed
    tasks=$(aws ecs --region "$AWS_REGION" list-tasks --cluster "$CLUSTER_NAME" | jq -r '.taskArns | join(" ")')
    echo "Tasks are as follows"
    echo "$tasks"
    # $tasks is left unquoted on purpose so that each ARN becomes a separate argument
    TASKS=$(aws ecs --region "$AWS_REGION" describe-tasks --cluster "$CLUSTER_NAME" --tasks $tasks)
    #    - echo $TASKS
    # Tasks still running a task definition other than the one the service now points to
    OLDER_TASK=$(echo "$TASKS" | jq -r ".tasks[] | select(.taskDefinitionArn != \"$CURRENT_TASK_REVISION\") | .taskArn")
    echo "Older Task running: $OLDER_TASK"
    for old_task in $OLDER_TASK; do
        aws ecs --region "$AWS_REGION" stop-task --cluster "$CLUSTER_NAME" --task "$old_task"
    done

    # Run a new task with the updated task definition
    aws ecs --region "$AWS_REGION" run-task --cluster "$CLUSTER_NAME" --task-definition "$CURRENT_TASK_REVISION"
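
If you need the bare task ID rather than the full ARN (for logging, say), shell parameter expansion can strip everything up to the last slash. Unlike a fixed character offset, this works for both the older ARN format and the newer one that includes the cluster name. A minimal sketch; task_id_from_arn is a hypothetical helper name and the ARNs are made-up examples.

```shell
#!/bin/sh
# Hypothetical helper: return the part of a task ARN after the last "/"
task_id_from_arn() {
    arn=$1
    echo "${arn##*/}"
}

# Older format: arn:...:task/<task-id>
task_id_from_arn "arn:aws:ecs:us-east-1:123456789012:task/1dc5c17a-422b-4dc4-b493-371970c6c4d6"
# prints: 1dc5c17a-422b-4dc4-b493-371970c6c4d6

# Newer format: arn:...:task/<cluster-name>/<task-id>
task_id_from_arn "arn:aws:ecs:us-east-1:123456789012:task/my-cluster/1dc5c17a422b4dc4b493371970c6c4d6"
# prints: 1dc5c17a422b4dc4b493371970c6c4d6
```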

Suggestion: 4

I’ve been scratching my head around this for a long time and never found a viable solution anywhere until last week.

AWS has just released a new API version that has a --force option for service removal.
The Target Group issue you’re facing occurs simply because the Target Group registered with your task has already been deleted and you cannot bind a new Target Group to it. Since this task and service are now corrupted, the only way to deal with it is to delete the service; you cannot update it anymore.

You can use this command to delete your service now; it was impossible last week!

    aws ecs delete-service --service my-http-service --force

Hope this helps