Saving costs and improving reliability by utilizing AWS services.
A SaaS in the Human Resources tech space had steadily grown its client base by being able to iterate fast and deliver new features quickly. However, their single VPS began driving up costs due to frequent vertical scaling for memory and disk needs. At the same time, their manual deployment process introduced 15–30 minutes of downtime per release. This led to a culture of “deployment fear”: developers postponed deployments as much as possible, which snowballed into bigger batch sizes that further amplified the problem.
In this document, I’ll describe how I helped rearchitect their application and deployment process, with the main goals of improving reliability and cost efficiency.
Disclaimer: This post looks at the architecture in a high-level manner, and is intentionally generic to not disclose any client-specific information. Certain details such as networking specifics and code-level architecture are omitted for brevity.
The SaaS application is a monolithic Django application backed by a PostgreSQL database. The app is hosted in a single VPS box, which also contains auxiliary services. The following diagram illustrates the components of the original architecture:

Scheduled jobs are handled by cron on the same box. To deploy, a developer manually runs a git pull on the VPS, then a script to restart the application. This constitutes a deployment.

As their client base grew, the company needed a more cost-effective way to host their SaaS application. They also needed to move away from their current deployment process, as downtime had increased due to frequent human error when executing the manual steps. The goal was to be able to iterate on new features without the system architecture getting in the way.
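The manual release boils down to a couple of shell commands run by hand on the box. A minimal sketch of those steps, with hypothetical paths (/srv/app and restart.sh are illustrative, not the client's actual names):

```python
from typing import List

def manual_deploy_commands(app_dir: str = "/srv/app",
                           restart_script: str = "./restart.sh") -> List[List[str]]:
    """The shell commands a developer ran by hand for each release.

    Both the directory and the script name are hypothetical placeholders.
    """
    return [
        ["git", "-C", app_dir, "pull"],                        # fetch the latest code
        ["bash", "-c", f"cd {app_dir} && {restart_script}"],   # restart the app; downtime begins here
    ]
```

Because every release repeated these steps by hand, any typo or skipped step extended the downtime window.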
Let’s look at the migration in two sections: the system architecture and the CI/CD pipeline.
We decided to move their setup onto Amazon Web Services (AWS) to better utilize specialized services, reducing the responsibilities of a single VPS.

The cron schedule was moved to Amazon EventBridge.

After a few months of operation, the new AWS setup had comparable costs to the old VPS setup (~$230 vs. ~$200), as we had gathered enough data to right-size the EC2 instance. This compensated for the additional cost of the RDS database. The decision to keep Redis on EC2 was primarily driven by cost: cache durability was not a business requirement, making a managed service like Amazon ElastiCache unnecessary at this stage.
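Moving a crontab entry to EventBridge is mostly a syntax translation: EventBridge cron expressions have six fields (a trailing year), and day-of-month and day-of-week cannot both be "*". A small sketch of that mapping:

```python
def to_eventbridge(crontab: str) -> str:
    """Convert a standard 5-field crontab expression to an Amazon
    EventBridge schedule expression (6 fields, with a trailing year).

    EventBridge requires that day-of-month and day-of-week are not
    both '*'; one of them must be '?'. Note: numeric day-of-week
    values differ between Unix cron and EventBridge and are NOT
    translated by this sketch.
    """
    minute, hour, dom, month, dow = crontab.split()
    if dow == "*":
        dow = "?"      # keep day-of-month, wildcard day-of-week
    elif dom == "*":
        dom = "?"      # a specific weekday is given instead
    return f"cron({minute} {hour} {dom} {month} {dow} *)"

# e.g. a nightly 03:00 job: to_eventbridge("0 3 * * *") -> "cron(0 3 * * ? *)"
```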
While horizontal scaling was not a business requirement at the time, we deliberately kept the application layer on a single EC2 instance. This preserved the team’s familiar vertical scaling model while still allowing us to decouple stateful components like the database and file storage. As the services are now running on Docker, this should provide a foundation for further containerization efforts (such as moving to Amazon ECS) if needed.
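To make the Docker foundation concrete, the application layer on the EC2 instance might look roughly like the compose sketch below. Service names, the image tag, and the RDS endpoint are illustrative assumptions, not the client's actual configuration:

```yaml
# Illustrative only - not the client's actual configuration.
services:
  web:
    build: .
    command: gunicorn config.wsgi:application --bind 0.0.0.0:8000
    environment:
      DATABASE_URL: postgres://app@db.example.internal:5432/app  # RDS endpoint
      REDIS_URL: redis://redis:6379/0
    ports:
      - "8000:8000"
    depends_on:
      - redis
  redis:
    image: redis:7-alpine  # kept on EC2 for cost; durability not required
```

Because state lives in RDS and S3-style storage rather than in the containers, this layout could later be lifted onto ECS with relatively little change.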
We leveraged an existing Lambda pattern used in another project to minimize implementation time and avoid introducing new scheduling complexity inside the application layer.
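The Lambda side of that pattern can be as small as a handler invoked on the EventBridge schedule that forwards a task name to the application. The endpoint URL and task names below are hypothetical, and authentication is omitted:

```python
import json
import urllib.request

# Hypothetical internal endpoint; the real URL and auth are omitted.
TASK_ENDPOINT = "https://app.example.com/internal/run-task"

def build_request(event: dict) -> urllib.request.Request:
    """Translate the EventBridge rule's input into an HTTP call to the app."""
    task = event.get("task", "cleanup")
    return urllib.request.Request(
        TASK_ENDPOINT,
        data=json.dumps({"task": task}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def handler(event, context=None):
    """Lambda entry point, invoked on the EventBridge schedule."""
    with urllib.request.urlopen(build_request(event), timeout=30) as resp:
        return {"status": resp.status}
```

Keeping the scheduling trigger outside the application layer means the app itself stays free of cron-like state.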
While this does not entirely remove the EC2 instance as a “single point of failure”, it reduced the operational blast radius and enabled future horizontal scaling by decoupling the persistence layer.
Having only a production environment, with developers running manual commands to deploy, is a very risky practice for an organization. Developers could only rely on their local environments to verify their work, and a single misstep in the deployment procedure could take down the SaaS application. A reliable CI/CD pipeline is crucial to ensuring quality work flows from development to production.

Three long-lived branches map to environments:
- dev can be used for general dev testing.
- staging is used for official QA work.
- master equates to the production environment.

Terraform allows infrastructure to be defined as code, which lets code-management techniques (version control, code reviews) be applied to it. As BitBucket was already in use, it was the natural choice for CI/CD capabilities. As Ansible was already used in an adjacent project, we decided to use it for instance configuration. This led to a clear separation between infrastructure provisioning and configuration.
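A branch-to-environment mapping like this can be expressed in a bitbucket-pipelines.yml along the following lines. The step contents and inventory paths are illustrative placeholders, not the client's actual pipeline:

```yaml
# Illustrative bitbucket-pipelines.yml - step details are placeholders.
pipelines:
  branches:
    dev:
      - step:
          name: Deploy to dev
          script:
            - ansible-playbook deploy.yml -i inventories/dev
    staging:
      - step:
          name: Deploy to staging
          script:
            - ansible-playbook deploy.yml -i inventories/staging
    master:
      - step:
          name: Deploy to production
          script:
            - ansible-playbook deploy.yml -i inventories/production
```

Merging into a branch is the only action needed to deploy to its environment; the pipeline runs the same playbook against a different inventory.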
Merge request rules were added to control the quality of new features, ensuring a baseline quality that can be built on top of. As the organization structure was flat at the time, we decided to keep a simple “requires 3 people to approve” rule.
Distinct, specialized environments are now available for various purposes without affecting prod. Manual steps (aside from merge request approval) were also completely removed. This resulted in less downtime due to deployment issues (from 15–30 minutes per deploy to none reported in the 3-month pilot window), which increased confidence among developers and stakeholders.
In this post, we explored how rearchitecting a traditional VPS-based application onto AWS can provide cost savings and enhanced reliability by using specialized services to handle non-core workloads. We also explored how a CI/CD pipeline can improve code quality and accelerate development by providing a safe mechanism for deployment.
The SaaS now has reduced deployment risks, elevated organization confidence, and future readiness to handle scaling needs.
Leading this initiative gave me valuable insight into migrating a traditional VPS application. Because cost efficiency and reliability were core requirements, careful tradeoff analysis was critical to ensuring long-term operational simplicity.
One such tradeoff was deliberately postponing ECS adoption, even though containerization made it a natural next step. While ECS could take on a lot of heavy lifting as a managed service, horizontal scaling was not a top priority at the time. It is also a different operating model from what the team was confident with, whereas a single EC2 instance still behaves like a “familiar VPS”.
Observing the team’s workflow, the special-branch deployment model was the obvious choice at the time. If time had permitted, I would have considered a version-based workflow where build artifacts are generated on demand per merge request. These versioned artifacts could then be deployed independently to each environment, reducing the need to maintain long-lived special branches.