In DevOps Last updated: August 14, 2023

Canary deployment is a software release technique that gradually rolls out new features or updates to a small subset of users before releasing them to the entire user base.

This approach involves creating a new version of the software and deploying it to a small group of users while keeping the old version running for the rest of the users. The development team monitors the new version closely to ensure that it is stable and performing as expected.

If everything goes well, the new version rolls out to more users until it eventually reaches the entire user base. In this way, the project team minimizes the risk of introducing bugs or other issues that could impact all users at once.


The purpose of Canary deployment is to reduce the risk of introducing new features to a large user base. By gradually rolling out changes to users, developers can monitor the performance and stability of the new version and make any necessary adjustments before deploying to the entire user base. The transition to the new version is therefore much smoother.

Key Principles and Benefits

image-16
Source: martinfowler.com

The key principles of Canary deployment include the following:

  1. Deploy the new version to a small subset of users first and then gradually roll it out to more users over time.
  2. Closely monitor the new version to ensure it is stable and performing as expected.
  3. If any issues arise, roll back the deployment to the previous version quickly and easily.
  4. Automate the deployment process as much as possible to reduce the risk of human error.

The benefits of Canary deployment in DevOps include:

  1. By gradually rolling out changes, you minimize the risk of introducing bugs or other issues that could impact all users at once.
  2. Developers can receive feedback on the new version more quickly, allowing them to make any necessary adjustments before deploying to the entire user base.
  3. By monitoring the performance and stability of the new version, developers can ensure that it meets the necessary quality standards before deploying to the entire user base.
  4. Canary deployment helps to increase the confidence of developers and stakeholders in the deployment process, as it reduces the risk of introducing issues that could impact the user experience.

Canary Deployment Based on Concept and Terminology

Canary-Traffic-Shift
Source: cncf.io

Let’s go through the typical lifecycle of the process.

It all starts with the Canary group, that is, the “early adopters” of the new version of the system. In parallel, there is the Baseline group, which contains all the remaining users who are not in the Canary group.

As the Canary users continue to use the new version, the Canary deployment extends to more and more users. This is Traffic Shifting. The Canary group grows while the Baseline group shrinks, so the system performs a Gradual Rollout.
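One common way to implement this kind of traffic shifting is to hash each user into a stable bucket and compare the bucket with the current canary percentage. The sketch below is only an illustration under that assumption: the function names, the `salt` value, and the step percentages are hypothetical and do not come from any specific tool.

```python
import hashlib


def bucket(user_id: str, salt: str = "release-42") -> int:
    """Map a user to a stable bucket in the range [0, 100)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) % 100


def in_canary(user_id: str, canary_percent: int, salt: str = "release-42") -> bool:
    """A user is in the Canary group when their bucket is below the threshold."""
    return bucket(user_id, salt) < canary_percent


# Simulate a gradual rollout: the Canary group grows, the Baseline group shrinks.
users = [f"user-{i}" for i in range(1000)]
for percent in (5, 25, 50, 100):
    canary = [u for u in users if in_canary(u, percent)]
    print(f"{percent:>3}% target -> {len(canary)} of {len(users)} users on canary")
```

Because the bucket depends only on the user ID and the salt, a user who is in the Canary group at 5% stays in it at 25% and beyond, so nobody flips back and forth between versions as the rollout grows.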

Along the way, monitoring logs all activity and usage outcomes and generates the Metrics that developers need as feedback. Developers then react and fix what is necessary, or they perform a Rollback to the Baseline if they can’t fix the issues right away.

All the monitoring and deployment activities should be automated. This lets the developers focus exclusively on fixing issues.

The Canary group may find that some features of the new version work well while others do not. In that case, developers use Feature Flags to disable the problematic features without holding back the rest of the release.

Developers keep an eye on both groups simultaneously, the Canary and the Baseline. Together, the two groups generate A/B Testing results, that is, the behavior of the old system and the new system under the same conditions. In addition, automated tests run constantly against the new version to confirm that the Canary group passes its Health Checks.
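A health check gate can be as simple as requiring several consecutive successful probes before the canary is considered stable. The following is a minimal sketch of that idea; the function name and the default of three successes are assumptions chosen for illustration, and the probe results would normally come from real HTTP or service checks.

```python
from typing import Iterable


def is_healthy(probe_results: Iterable[bool], required_successes: int = 3) -> bool:
    """Return True once `required_successes` consecutive probes have succeeded."""
    streak = 0
    for ok in probe_results:
        # A failed probe resets the streak, so flapping services do not pass.
        streak = streak + 1 if ok else 0
        if streak >= required_successes:
            return True
    return False


print(is_healthy([True, True, True]))         # three successes in a row
print(is_healthy([True, False, True, True]))  # streak was reset by a failure
```

Requiring a streak instead of a single success makes the gate less sensitive to a lucky probe against an unstable canary.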

How it Differs from the Traditional Deployment Strategies

With the high-level life cycle in mind, the differences between Canary deployment and traditional deployment processes become clear.

  • You deploy gradually and with more control, rather than deploying to everybody at once and waiting for issues to affect the whole of production.
  • You limit the risk of bugs in the new version to the Canary group only, instead of exposing all users to the issues simultaneously.
  • You monitor the new version before most users have it, rather than monitoring it only after the full release and investing a substantial amount of time and resources into the hyper-care phase.
  • You can decide on a rollback well before the new version is fully deployed to production. With a traditional release, you would have to schedule another release window to undo the change after the production release has completed.
  • Canary deployment naturally forces you to invest in automated tools and processes where possible. Traditional deployment strategies, by contrast, tend to push automation initiatives to the bottom of the backlog.

CI/CD Pipelines in Canary Deployment

AWS-Canary
Source: aws.amazon.com

In a typical CI/CD pipeline, changes are automatically built, tested, and deployed to a staging environment for further testing before being deployed to production. This makes a CI/CD pipeline a natural fit for Canary deployment.

Once the changes have been deployed to the staging environment and have passed all necessary tests, the CI/CD pipeline will automatically deploy the canary version to a small subset of users in the production environment.

If something goes wrong, just run another pipeline for a rollback. Alternatively, flag the problematic features so that they are excluded from subsequent pipeline runs. All of this is automated, so it needs little manual attention once it is set up.

Since the canary version relies heavily on automated health check tests, those tests fit naturally into the CI/CD pipeline. They are a must-have part of every good CI/CD pipeline anyway.

Workflow and the Phases of Canary Deployment

Putting all this information together, the following is the usual workflow of a typical Canary deployment that you can use on your project.

#1. Planning and Preparation

In this phase, the development team plans and prepares for the canary deployment. This includes identifying the changes or updates to be made, creating a new version of the software, and defining the metrics and health checks that will be used to monitor the performance of the new version. The team also identifies the subset of users who will receive the new version first and defines the rollout plan.
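The outcome of this phase can be captured as a small, validated data structure. The sketch below shows one possible shape; the class name, field names, and default thresholds are all hypothetical and would need to be adapted to your own metrics and tooling.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RolloutPlan:
    """Hypothetical rollout plan: what to ship, to how many users, and the limits."""

    version: str
    steps: Tuple[int, ...] = (5, 25, 50, 100)  # percent of users on the canary per step
    max_error_rate: float = 0.01               # assumed threshold, as a fraction of requests
    max_p95_latency_ms: float = 500.0          # assumed threshold

    def __post_init__(self) -> None:
        if not self.steps or self.steps[-1] != 100:
            raise ValueError("the final step must reach 100% of users")
        if not all(0 < step <= 100 for step in self.steps):
            raise ValueError("each step must be a percentage in (0, 100]")
        if any(later <= earlier for earlier, later in zip(self.steps, self.steps[1:])):
            raise ValueError("steps must strictly increase")


plan = RolloutPlan(version="2.0.0")
print(plan)
```

Validating the plan up front means a malformed rollout, for example one that never reaches 100%, fails before anything is deployed.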

#2. Implementing Traffic Routing and Monitoring

The new version of the software is deployed to the subset of users identified in the planning phase. Traffic routing is implemented to direct a portion of the user traffic to the new version while keeping the old version running for the rest of the users. The performance and stability of the new version are closely monitored using metrics and health checks to ensure that it is performing as expected.

#3. Analyzing and Evaluating Deployment Performance

The performance of the new version is analyzed and evaluated based on the metrics and health checks defined in the planning phase. If the new version is performing well, the rollout is gradually increased to more users over time. If any issues arise with the new version, the deployment can be rolled back quickly to the previous version.
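The evaluation step can be sketched as a comparison of the canary error rate against the baseline error rate. This is a simplified illustration, not a production-grade analysis: the `Metrics` class, the `tolerance` of 0.5 percentage points, and the `min_requests` cutoff are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def evaluate(canary: Metrics, baseline: Metrics,
             tolerance: float = 0.005, min_requests: int = 500) -> str:
    """Return "wait", "rollback", or "continue" for the current rollout step."""
    if canary.requests < min_requests:
        return "wait"      # not enough canary traffic to judge yet
    if canary.error_rate > baseline.error_rate + tolerance:
        return "rollback"  # canary is clearly worse than the baseline
    return "continue"      # safe to widen the rollout


baseline = Metrics(requests=10_000, errors=50)
print(evaluate(Metrics(requests=100, errors=0), baseline))
print(evaluate(Metrics(requests=1_000, errors=30), baseline))
print(evaluate(Metrics(requests=1_000, errors=6), baseline))
```

Comparing against the live baseline rather than a fixed number helps separate problems caused by the new version from problems that affect both versions, such as an outage in a shared dependency.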

#4. Promoting or Rolling Back the Deployment

The development team decides whether to promote the new version to the entire user base or roll back to the previous version. If the new version performs well and meets the necessary quality standards, promote it to the entire user base. If any issues arise with the new version, roll back the deployment to the previous version quickly and easily.
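The promote-or-roll-back decision can be expressed as a loop over the rollout steps. In the sketch below, `set_canary_percent` and `step_is_healthy` are hypothetical callbacks that stand in for your traffic router and your monitoring checks; they are not part of any real library.

```python
from typing import Callable, List, Sequence


def run_rollout(steps: Sequence[int],
                set_canary_percent: Callable[[int], None],
                step_is_healthy: Callable[[int], bool]) -> str:
    """Walk through the rollout steps; roll back on the first unhealthy step."""
    for percent in steps:
        set_canary_percent(percent)
        if not step_is_healthy(percent):
            set_canary_percent(0)  # send all traffic back to the previous version
            return "rolled_back"
    return "promoted"


# Example run: the fake health check starts failing once half the users are on canary.
history: List[int] = []
result = run_rollout((5, 25, 50, 100), history.append, lambda percent: percent < 50)
print(result, history)
```

Keeping the router and the health check behind callbacks makes the loop easy to test without touching real infrastructure.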

Canary-Deploy
Source: aws.amazon.com

Best Practices and Strategies

When implementing Canary deployment on your platform, start by defining clear goals and what success looks like at the end. Performance metrics, user feedback criteria, and business impact can help here.

Select a small subset of users to test the new (Canary) version of the software. A larger group at the beginning is not really an advantage. You want to be as flexible as possible, especially early on.

As mentioned a few times already, monitor the performance and stability of the new version using metrics and health checks. React whenever you see anything suspicious. When it comes to a gradual rollout, it’s better to overreact than to underreact.

Gradually increase the rollout of the new version to more users over time. This ensures a smoother transition to the new version.

Use automation tools and processes where possible to streamline the deployment and monitoring process. Include them in the CI/CD pipelines and trigger the deployment processes automatically on a schedule. This reduces the risk of human error and ensures that the deployment process is consistent and repeatable.

Implement feature flags to enable or disable specific features in the software. You will gain control over future deployments without having to amend or update them manually each time, and developers can focus on what matters: fixing the bugs.
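At its simplest, a feature flag is a named boolean that the code checks at runtime. The following is a minimal in-memory sketch; the `FeatureFlags` class and the `new_checkout` flag are hypothetical, and a real setup would typically load flags from a configuration service or a dedicated flag platform.

```python
from typing import Dict


class FeatureFlags:
    """In-memory flag store used only for illustration."""

    def __init__(self, defaults: Dict[str, bool]) -> None:
        self._flags = dict(defaults)

    def is_enabled(self, name: str) -> bool:
        # Unknown flags count as disabled, so unfinished features stay dark.
        return self._flags.get(name, False)

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled


def checkout_page(flags: FeatureFlags) -> str:
    if flags.is_enabled("new_checkout"):
        return "new checkout flow"
    return "old checkout flow"


flags = FeatureFlags({"new_checkout": True})
print(checkout_page(flags))
flags.set("new_checkout", False)  # disable the problematic feature without redeploying
print(checkout_page(flags))
```

The point of the pattern is that switching a feature off is a configuration change rather than a new release.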

Use A/B testing to compare the performance of two different versions of the software. Assign random users to one version or the other. Identify which version performs better and react to that with future development decisions.
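To decide whether one version really performs better, rather than just appearing to by chance, you can apply a statistical test to the results. The article does not prescribe a method; the sketch below uses a two-proportion z-test as one possible choice, and the function name and sample numbers are made up for illustration.

```python
import math
from typing import Tuple


def two_proportion_z(success_a: int, total_a: int,
                     success_b: int, total_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in success rates B minus A."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 0.0, 1.0  # no variation at all, so no detectable difference
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 100 of 1000 conversions on version A, 150 of 1000 on version B.
z, p = two_proportion_z(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the versions is unlikely to be random noise, but the threshold you act on is a decision for your team.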

Ensure that you can roll back the deployment quickly and at any time if issues arise with the new version. This reduces the impact of any issues and allows for a quick recovery.

Challenges and Case Studies

There are still some challenges associated with Canary deployment, despite its clear advantages.

One challenge with Canary deployment is network latency, which can impact the performance of the new version of the software. To address this challenge, developers can use tools such as load balancers and content delivery networks to improve network performance. Latency matters not only for external users of the system but also for internal processes such as deployments and CI/CD pipeline executions. Those must complete as fast as possible. Otherwise, you will have a line of idle developers waiting for the pipelines to finish.

Another challenge is ensuring data consistency between the old and new versions of the software. To address this challenge, developers can use techniques such as database replication and synchronization to ensure that data is consistent across all versions. Having production users on both the old and new versions at the same time raises the expectation that both versions stay in sync at all times and that users do not lose any production data just because they are in the Canary or Baseline group. This can be a really challenging expectation to meet, so back yourself up with solid background processes.

Netflix is a well-known example of a company that uses Canary Deployment to roll out changes to its streaming service. The company uses a combination of automated testing, feature flags, and A/B testing to slowly roll out changes.

Google is another example of a company that uses Canary deployment to roll out changes to its cloud services. Similarly, the company relies on automated testing, traffic splitting, and monitoring to gradually roll out changes to a small subset of users before deploying to all users. This approach has helped Google to improve the quality and stability of its services.

Final Words

As with all processes, approaches, or strategies, Canary deployment is not a solution for every problem in the world. There are cases where it’s near impossible to implement due to environmental restrictions, people’s knowledge, or a general lack of conceptual understanding.

It is much more suitable for modern projects, where an agile mindset is a rock-solid foundation, the automation of every process is an undoubted priority, and a maximum level of reliability is a strong expectation from the stakeholders.

In that case, Canary deployment is in some way the next level of agile development practices. It can take teams into territory the project has never reached before.


  • Michal Vojtech
    Author
    Delivery-oriented architect with implementation experience in data/data warehouse solutions with telco, billing, automotive, bank, health, and utility industries. Certified for AWS Database Specialty and AWS Solution Architect… read more