Why we chose to run PostgreSQL on EC2 instead of Amazon RDS for our Web Shop’s Backend

BatCat
5 min read · Mar 25, 2024

My partner and I at CyberHut are thrilled to be launching our very first application called NeatPick! We’ve chosen PostgreSQL as our DBMS thanks to our positive experiences with it. Now, we’re tackling the infrastructure decision. Many companies leverage Amazon RDS, and it certainly caught our eye for its user-friendliness, automated backups, and auto-scaling capabilities. As with any powerful tool, though, there is a downside if it is not managed carefully.

My partner leans towards managing our own EC2 instance, while I would rather launch AWS RDS and spend my time shaping our product. We both agree on one point though: we need to be mindful about the costs.

I will not cover setup and scalability, because Amazon RDS clearly wins on both: you can set everything up quickly and scale horizontally and vertically with a few clicks.

NeatPick

I can compare our application — NeatPick — to a web shop: people will search on a mobile app, browse through pages of products and view details about an item they are interested in. Eventually they will stumble upon something they like enough to start the process to purchase the item. Sometimes they will apply updates to their own profiles or check out other people’s profiles to learn more about the seller or the buyer.

NeatPick — A place to trade your game

Replication

Our application will be read heavy, meaning that retrieving data from the database will be significantly more frequent than modifying or adding data to the database.

We were thinking about boosting performance by distributing the workload across multiple instances. We would start small, with one master and one replica that mirrors the master’s data. Amazon RDS lets us create read replicas of the master database instance to offload read-only workloads and scale read capacity. This is asynchronous replication, so the replicas receive updates with a delay and a read from a replica can briefly return stale data. Because AWS abstracts the underlying infrastructure and configuration, our control over certain aspects of replication is limited.

Achieving the same with a self-managed PostgreSQL on an EC2 instance would require launching another instance for our replica, installing the same major version of PostgreSQL on it, configuring both the master and the replica to enable replication, initializing the replica from a fresh base backup of the master, and more. Tuning Postgres for this purpose requires expertise.
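
The self-managed steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not our actual setup: the IP addresses, the `replicator` role name, the data directory, and all helper function names are hypothetical. The script only builds the configuration lines and the `pg_basebackup` command so they can be reviewed; it does not execute anything.

```python
"""Sketch: pieces needed for PostgreSQL streaming replication (all values are assumptions)."""


def primary_settings(max_replicas: int = 1) -> dict[str, str]:
    # postgresql.conf settings on the master that allow replicas to stream WAL.
    return {
        "wal_level": "replica",
        "max_wal_senders": str(max_replicas + 1),  # one spare sender for base backups
    }


def hba_line(replica_ip: str, role: str = "replicator") -> str:
    # pg_hba.conf entry that lets the (hypothetical) replication role connect.
    return f"host replication {role} {replica_ip}/32 scram-sha-256"


def basebackup_command(primary_host: str, data_dir: str, role: str = "replicator") -> list[str]:
    # Run on the replica: copies the master's data directory, streams WAL
    # during the copy (-X stream) and writes the standby settings (-R).
    return ["pg_basebackup", "-h", primary_host, "-U", role,
            "-D", data_dir, "-X", "stream", "-R", "-P"]


def render_conf(settings: dict[str, str]) -> str:
    # Format settings as "key = value" lines for postgresql.conf.
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


if __name__ == "__main__":
    print(render_conf(primary_settings()))
    print(hba_line("10.0.0.12"))
    print(" ".join(basebackup_command("10.0.0.11", "/var/lib/postgresql/16/main")))
```

Generating the pieces instead of typing them by hand keeps the master and replica setup reproducible, which matters once a second or third replica is added.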

We both consider ourselves experts in PostgreSQL, and we appreciate the tools provided by the community to support replication. We also believe in maintaining independence from cloud ecosystems like AWS. Our priority is to retain flexibility and freedom of choice, allowing us to optimize costs without being tied to a single cloud provider.

Backups

The daily automated backups of Amazon RDS are quite beneficial. Additionally, we can create user-initiated backups, known as DB snapshots, at any time. If we need to recover from an unwanted incident, we can use Point-in-Time Recovery: Amazon RDS restores the appropriate backup into a new DB instance and replays the transaction logs up to our chosen timestamp, which must fall within the backup retention period.

If we choose PostgreSQL on an EC2 instance, a few commands and cron jobs are enough to automate base backups, enable WAL archiving for the transaction logs, schedule snapshot backups, and set up a basic point-in-time recovery mechanism. We do not get the fancy UI, but this setup will still do the job!
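
To make the self-managed variant more concrete, here is a minimal sketch of the building blocks, assuming PostgreSQL 12 or newer. The directory paths, the seven-day retention window, and the helper names are hypothetical, and a cron entry (for example one that runs this script nightly) is left out. The functions only produce settings and commands; nothing is executed.

```python
"""Sketch: self-managed backups and PITR settings (paths and retention are assumptions)."""
from datetime import date, datetime, timedelta, timezone


def archive_settings(wal_dir: str) -> dict[str, str]:
    # WAL is always written; archiving copies each finished segment to wal_dir.
    return {
        "archive_mode": "on",
        "archive_command": f"'test ! -f {wal_dir}/%f && cp %p {wal_dir}/%f'",
    }


def base_backup_command(backup_root: str, day: date) -> list[str]:
    # Compressed tar-format base backup into a per-day directory.
    return ["pg_basebackup", "-D", f"{backup_root}/{day.isoformat()}",
            "-Ft", "-z", "-X", "stream"]


def expired_backups(days: list[date], today: date, keep_days: int = 7) -> list[date]:
    # Backups older than the retention window. Recovery is only possible
    # back to the oldest base backup that is kept.
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in days if d < cutoff)


def recovery_settings(wal_dir: str, target: datetime) -> dict[str, str]:
    # Applied after restoring a base backup and creating an empty
    # recovery.signal file in the data directory.
    return {
        "restore_command": f"'cp {wal_dir}/%f %p'",
        "recovery_target_time": f"'{target.isoformat(sep=' ')}'",
    }


if __name__ == "__main__":
    print(archive_settings("/backup/wal"))
    print(" ".join(base_backup_command("/backup/base", date.today())))
    print(recovery_settings("/backup/wal", datetime(2024, 3, 25, 12, 0, tzinfo=timezone.utc)))
```

Note that recovery always starts from a base backup and replays archived WAL forward to the target time, so the archive directory has to outlive the oldest backup we keep.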

Reliability

AWS has a strong reliability track record, with few outages and bugs. When something goes wrong in the managed layer, fixing it is AWS’s job rather than ours. However, if an error occurs in our self-managed Postgres on an EC2 instance, handling it is entirely on us: it could mean data loss, prolonged downtime, and time spent troubleshooting instead of building the product.

Choosing RDS in this case would improve reliability, maximize our operational efficiency, and leave us more time for development.

Monitoring, Alarms

Amazon RDS allows us to track database health and performance through metrics and alarms in Amazon CloudWatch. Setting up alarms lets us detect potential issues early and resolve them quickly. Getting started with CloudWatch is free, but whether it stays free depends on traffic and volume. AWS claims that many applications should be able to operate within the free tier limits, which sounds quite attractive.

Deploying Prometheus alongside Postgres on an EC2 instance is a good alternative. Prometheus is open-source and handles both monitoring and alerting, and we can visualize and analyze the data using Grafana. In addition, extensions like pg_stat_monitor and built-in views like pg_stat_activity can be queried for more detail about the current DB activity.
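
As an illustration of what querying pg_stat_activity can give us, here is a minimal sketch that flags long-running queries. The 30-second threshold and the `long_running` helper are assumptions for the example, and the rows are expected to come from running the query with whatever PostgreSQL driver is in use (not shown). In a Prometheus setup, a similar signal would more likely come from an exporter such as postgres_exporter and an alert rule rather than from a hand-written script.

```python
"""Sketch: flag long-running queries from pg_stat_activity rows (threshold is an assumption)."""
from datetime import timedelta

# Query to run against PostgreSQL; all columns come from the built-in view.
ACTIVITY_SQL = """
SELECT pid, state, now() - query_start AS runtime, query
FROM pg_stat_activity
WHERE state <> 'idle' AND pid <> pg_backend_pid()
"""


def long_running(rows, threshold=timedelta(seconds=30)):
    # rows: iterable of (pid, state, runtime, query) tuples as returned by ACTIVITY_SQL.
    # runtime may be None when query_start is NULL, so those rows are skipped.
    return [
        (pid, runtime, query)
        for pid, state, runtime, query in rows
        if runtime is not None and runtime > threshold
    ]


if __name__ == "__main__":
    # Hand-made sample rows standing in for a real query result.
    sample = [
        (101, "active", timedelta(seconds=45), "SELECT * FROM products"),
        (102, "active", timedelta(seconds=2), "SELECT 1"),
    ]
    for pid, runtime, query in long_running(sample):
        print(f"pid={pid} runtime={runtime} query={query}")
```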

I like to compare CloudWatch to a drug dealer’s strategy: first they hook you with a free taste, and before you know it, you’re emptying your pockets for the next fix.

Final Thoughts

As always, the choice depends on the project’s unique requirements: speed, budget, and features (scope). When speed is critical, managed services like AWS RDS can be a lifesaver. Its wide range of features allows for quick setup and deployment, making it a compelling option. However, for projects with tighter budgets, exploring alternative solutions or open-source options like self-managed PostgreSQL might be a better fit.

Take NeatPick for example. We have the benefit of time and expertise in PostgreSQL, allowing us to manage the infrastructure ourselves. This gives us more control and potentially reduces costs compared to a fully managed service. But for other projects where speed is paramount and budget is flexible, AWS RDS could be the smarter choice.

If you’re seeking software engineers who prioritize cost-efficiency, adhere to timelines, and stand by their estimates, feel free to reach out. We’re readily available to offer our services either as a team or individually.

Ragnar will enjoy an entire chicken each time a contract is signed:

Ragnar

Check out more about Ragnar’s contribution on our website: CyberHut

Terminology

DBMS — Database Management System
AWS RDS — Amazon Relational Database Service
EC2 — Elastic Compute Cloud is a service offered by AWS that allows you to rent instances in the cloud, offering flexibility and scalability
PITR — Point-in-Time Recovery is a feature of Amazon RDS that allows DB restoration to any point in time within the backup retention period
Prometheus — a popular open-source software used for monitoring systems and alerting
Grafana — an open-source platform specifically designed for visualizing and analyzing data. It works well with monitoring tools like Prometheus

Written by BatCat

My mission is to share some of the solutions I find during my journey as a data engineer. I mostly write about PostgreSQL and Python.
