How to choose between Database Options in Amazon AWS

Image From Amazon

Every DB admin knows the painful process of patching and updating DB servers, not to mention maintaining backups and high availability for the database.

This subject matters because your data is probably the most important part of your IT assets. If you lose your servers, instances, or network devices, there is a good chance that you can replace them, but if you lose your data, the loss can be irreparable.

Another limitation of running an in-house database is the pressure to stick to one platform. The maintenance overhead and licensing cost of dealing with multiple platforms are large, so you end up imposing an extra constraint on your application.

You also need to consider scalability and availability for each platform before you start your deployment, because a database server is not something that you can easily shut down for an upgrade or reconfiguration.

A better way is to use the AWS managed services, which give you a database server that is ready to run. Using a managed database service gives you several benefits:

  • Easy to set up through a wizard
  • Simple to operate, with fewer admin tasks
  • Cost efficient
  • Easy to scale to a larger instance
  • Built-in disaster recovery across multiple availability zones
  • A choice between SQL (Amazon RDS) and NoSQL (DynamoDB) database options
  • A choice of multiple RDS platforms (Amazon Aurora, Oracle, SQL Server, PostgreSQL, MySQL, MariaDB)

Amazon provides fully managed relational (Amazon RDS) and NoSQL (Amazon DynamoDB) database services, along with an in-memory caching service (ElastiCache) and a data warehouse service (Amazon Redshift).

Which database to choose depends heavily on your application and data types. Do you need to maintain data models? How many concurrent users do you want to support? What is the size and type of your objects? How big is your data and what is your growth rate?

 

Amazon Relational Database Service (RDS)

A relational database is a database that organises data into tables (or “relations”) with columns and rows, and a unique key identifying each row. Relational databases provide many benefits:

  • You can run complex queries and have flexible indexing
  • Data duplication is reduced by design, because normalised tables store each fact once
  • Security can be more granular, as the data is split into tables
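To make the table, key and query ideas above concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for any relational engine. The `customers` and `orders` tables, their columns and the sample rows are made up for illustration and are not specific to Amazon RDS.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,   -- unique key identifying each row
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL
    );
    -- Secondary index to speed up lookups by customer.
    CREATE INDEX idx_orders_customer ON orders(customer_id);
""")

conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob")])
conn.executemany("INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                 [(1, 20.0), (1, 5.0), (2, 12.5)])


def totals_per_customer(connection):
    """Join the two tables and sum the order amounts per customer."""
    rows = connection.execute("""
        SELECT c.name, SUM(o.amount)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name
    """)
    return rows.fetchall()


totals = totals_per_customer(conn)
```

The customer name is stored once in `customers` and referenced from `orders` by key, which is how a relational design avoids repeating the same data in every row.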

With the Amazon RDS offerings you can start and run your choice of database platform in minutes and worry less about maintenance and operations overhead. Here are the different RDS options from Amazon:

  • Amazon Aurora: A MySQL-compatible relational database engine with the simplicity and cost-effectiveness of open source databases and, according to AWS, up to five times better performance than MySQL.
  • Amazon RDS for MySQL: A managed MySQL database with the full features and capabilities of MySQL.
  • Amazon RDS for MariaDB: Scalable and resizable managed MariaDB deployments in the cloud.
  • Amazon RDS for PostgreSQL: A full featured PostgreSQL database with all the capabilities of the open source installation.
  • Amazon RDS for Oracle: Deploy multiple editions of Oracle Database in minutes with cost-efficiency and resizable hardware capacity.
  • Amazon RDS for SQL Server: Deploy multiple editions of SQL Server (2008 R2, 2012 and 2014), including Web, Standard and Enterprise (2008 R2 and 2012 only for Enterprise).

If you need more capacity, you can scale your Amazon RDS database vertically or horizontally:

  • Vertical scaling: Move to a larger instance and/or faster storage.
  • Horizontal scaling: Create read-only replicas of your production database and spread read traffic across them. If you also want to distribute write load between multiple instances, you may need a data partitioning approach, and your application needs to be aware of that configuration.
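As an illustration of what "application aware" can mean for read replicas, the sketch below routes statements to a primary or to replicas in round-robin order. The `ReadWriteRouter` class and the endpoint names are hypothetical, and the read/write detection is deliberately naive; a real application would more likely rely on its driver or a proxy.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Very rough classification: only plain SELECTs go to a replica.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


# Hypothetical endpoint names, not real AWS hosts.
router = ReadWriteRouter(
    primary="primary.example.internal",
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
```

Calling `router.endpoint_for("SELECT ...")` repeatedly cycles through the replicas, while any other statement goes to the primary. Because replicas can lag behind the primary, reads that must see the latest write are usually better sent to the primary.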

All Amazon instances, including databases, run on a highly available and durable infrastructure. If you want higher availability at the data centre level, you can run your database as an Amazon RDS Multi-AZ deployment, which maintains a synchronously replicated standby copy of your production database in another availability zone (in the same region). Amazon RDS automatically fails over to this standby instance should your primary database experience an outage.
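If you create instances from code rather than the wizard, Multi-AZ is a single flag on the request. The sketch below only builds the keyword arguments for boto3's RDS `create_db_instance` call. The helper name `multi_az_instance_params`, the identifier, the instance class, the storage size and the credentials are placeholder assumptions, so check the values against your own account before using anything like this.

```python
def multi_az_instance_params(identifier, username, password):
    """Build keyword arguments for boto3's RDS create_db_instance call."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",
        "DBInstanceClass": "db.t3.micro",   # placeholder instance size
        "AllocatedStorage": 20,             # GiB, placeholder
        "MasterUsername": username,
        "MasterUserPassword": password,
        "MultiAZ": True,                    # request a standby in another AZ
    }


# Placeholder values; never hard-code real credentials.
params = multi_az_instance_params("demo-db", "admin", "change-me")

# With boto3 installed and AWS credentials configured, the request would be:
#   import boto3
#   boto3.client("rds").create_db_instance(**params)
```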

 

Amazon NoSQL database (DynamoDB)

NoSQL is a type of database that doesn’t use the tables you see in relational databases. NoSQL databases are mainly used in big data and real-time web applications. In NoSQL databases, you can use a variety of data models, such as key-value pairs and graphs, among others.

Compared with RDS, here are some benefits of using NoSQL databases:

  • Can handle large volumes of structured, semi-structured and unstructured data
  • Can be designed and deployed easily, because the data does not need a fixed structure
  • Low latency and high performance when accessing larger data types (document and key-value stores)
  • Highly scalable through horizontal scaling

Amazon DynamoDB is the NoSQL database offering from Amazon. It provides the benefits of a NoSQL database in a fully managed service that is flexible, fast and scalable.

Scalability of a NoSQL database is achieved through data partitioning, which scales the read/write capacity by adding more instances horizontally. With Amazon DynamoDB, scalability is built into the service, which grows based on your database load.
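The idea behind data partitioning can be sketched in a few lines: hash each item's key and use the hash to pick a partition, so that items spread across partitions and the same key always lands in the same place. This is an illustration only. The functions `partition_for` and `distribute` are hypothetical, SHA-256 is an arbitrary choice of hash, and DynamoDB's internal partitioning is managed by the service and not shown here.

```python
import hashlib


def partition_for(partition_key, partition_count):
    """Map a partition key to a partition index in [0, partition_count)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def distribute(keys, partition_count):
    """Group keys by the partition they hash to."""
    partitions = {i: [] for i in range(partition_count)}
    for key in keys:
        partitions[partition_for(key, partition_count)].append(key)
    return partitions


# 1,000 made-up keys spread over 4 partitions.
layout = distribute([f"user-{n}" for n in range(1000)], partition_count=4)
```

A well-spread key (such as a user ID) gives each partition a similar share of the load, whereas a key with only a few distinct values would concentrate traffic on a few partitions.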

The high availability of Amazon DynamoDB is achieved through synchronisation of data replicas across three facilities in an AWS region.

 

Data Warehouse

A data warehouse is a type of database used for data analysis and reporting on large amounts of data. It is a core component of business intelligence: it collects and integrates data from different parts of the business (for example IT, Sales and Marketing) and provides the data required by your reporting service.

A data warehouse system collects and processes data from multiple sources, so in most situations data growth and the need to scale the server are unavoidable. Because a data warehouse is relational by nature, scaling it in-house can be a complicated and costly task.

Amazon Redshift is a fully managed data warehouse system which removes all the pains of maintaining an in-house data warehouse system.

Here are some features/benefits of using Amazon Redshift:

  • It is fast: It is optimised for data warehousing through a parallel processing architecture, and it reduces the I/O required for queries with data compression, zone maps and similar techniques.
  • It is scalable: Adding nodes or increasing capacity is a matter of a few clicks. Your database will be in read-only mode during the upgrade, after which you have a data warehouse with more capacity.
  • It is cheap: There is no upfront cost and you pay only for the resources/instances that you use.
  • It is fully managed: It is easy to start, operate and maintain. Built-in fault tolerance and automated backups give you peace of mind.
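The zone maps mentioned in the list above reduce I/O by recording the minimum and maximum value stored in each block of a column, so a query can skip blocks that cannot contain a match. The sketch below is a simplified, in-memory illustration of that idea, not Redshift's implementation. The function names and the block size of 100 values are assumptions chosen for readability.

```python
def build_zone_map(values, block_size):
    """Split a column into blocks and record each block's (min, max)."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    zone_map = [(min(b), max(b)) for b in blocks]
    return blocks, zone_map


def scan_range(blocks, zone_map, low, high):
    """Return matching values and the number of blocks actually read."""
    matches, blocks_read = [], 0
    for block, (block_min, block_max) in zip(blocks, zone_map):
        if block_max < low or block_min > high:
            continue  # block cannot contain a match, so skip it unread
        blocks_read += 1
        matches.extend(v for v in block if low <= v <= high)
    return matches, blocks_read


# A sorted column of 1,000 values stored in blocks of 100.
blocks, zone_map = build_zone_map(list(range(1000)), block_size=100)
matches, blocks_read = scan_range(blocks, zone_map, low=250, high=260)
```

With sorted data, the range query above reads one block out of ten. On unsorted data the block ranges overlap and fewer blocks can be skipped, which is why the sort order of the data affects how much zone maps help.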
