How to choose between Database Options in Amazon AWS

Image From Amazon

Every DB admin knows the painful process of patching and updating DB servers, not to mention maintaining backups and the high availability of your database.

The importance of this subject comes from the fact that your data is probably the most important part of your IT assets. If you lose your servers, instances, or network devices, there is a good chance you can replace them; but if you lose your data, the loss can be irreparable.

Another limitation of an in-house database is the tendency to stick to one platform because of the large maintenance overhead and licensing cost of dealing with multiple platforms; in doing so, you intentionally impose another limitation on your application.

Plus, you probably need to consider scalability and availability for each platform before you start your deployment, because your database server is not something you can easily shut down for an upgrade or reconfiguration.

The better way, though, is to utilise AWS managed services, which are ready for you to start your database server. Using a managed database service gives you several benefits:

  • Super easy to set up through a wizard
  • Simple to operate, with reduced admin tasks
  • Cost efficient
  • Easy to scale to a larger instance
  • Built-in disaster recovery across multiple availability zones
  • Choice between SQL and NoSQL (DynamoDB) database options
  • Flexibility to use multiple RDS platforms (Amazon Aurora, Oracle, SQL Server, PostgreSQL, MySQL, MariaDB)

Amazon provides fully managed relational (Amazon RDS) and NoSQL (Amazon DynamoDB) database services, along with in-memory caching (ElastiCache) and a data warehouse service (Amazon Redshift).

Which database to choose depends highly on your application and data types: Do you need to maintain data models? How many concurrent users do you want to support? What are the size and type of your objects? How big is your data and what is your growth rate?

 

Amazon Relational Database Service (RDS)

A relational database is a database that organises data into tables (or “relations”) with columns and rows, and a unique key identifying each row. Relational databases provide many benefits:

  • You can run complex queries and have flexible indexing
  • Data duplication is eliminated by design
  • More granular security as the data is split into tables

With the Amazon RDS offerings, you can start and run your choice of database platform in minutes and be worry-free about maintenance and operations overhead. Here are the different RDS options from Amazon:

  • Amazon Aurora: MySQL-compatible relational database engine with the simplicity and cost-effectiveness of open source databases and up to five times better performance than MySQL.
  • Amazon RDS for MySQL: A managed MySQL database with the full features and capabilities of MySQL.
  • Amazon RDS for MariaDB: Scalable and resizable managed MariaDB deployments in the cloud.
  • Amazon RDS for PostgreSQL: A full featured PostgreSQL database with all the capabilities of the open source installation.
  • Amazon RDS for Oracle: Deploy multiple editions of Oracle Database in minutes with cost-efficiency and resizable hardware capacity.
  • Amazon RDS for SQL Server: Deploy multiple editions of SQL Server (2008 R2, 2012 and 2014), including Web, Standard and Enterprise (Enterprise on 2008 R2 and 2012 only).

If you need more capacity, you can scale your Amazon RDS vertically or horizontally:

  • Vertical Scaling: upgrade to a larger instance and/or faster storage
  • Horizontal Scaling: you can create read-only replicas of your production database and scale the read workload horizontally. If you want to distribute the write load between multiple instances, you may need a data partitioning approach, in which case your application needs to be aware of this configuration.
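The read/write split above can be sketched in a few lines. This is only an illustration of the routing idea; the endpoint names are invented placeholders, not real AWS hosts:

```python
import itertools

PRIMARY = "db-primary.example.internal"   # all writes go to the primary
READ_REPLICAS = [                          # illustrative replica endpoints
    "db-replica-1.example.internal",
    "db-replica-2.example.internal",
]
_replica_cycle = itertools.cycle(READ_REPLICAS)

def endpoint_for(query_is_write: bool) -> str:
    """Send writes to the primary and spread read-only queries
    round-robin across the read replicas."""
    return PRIMARY if query_is_write else next(_replica_cycle)
```

In a real application this routing usually lives in the data-access layer or a driver, but the principle is the same: the application must know which statements are reads and which are writes.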

All Amazon instances, including databases, run on highly available and durable infrastructure. If you want higher availability at the data-centre level, you can run your database as an Amazon RDS Multi-AZ deployment, which maintains a synchronous standby of your production database in another availability zone (in the same region). Amazon automatically fails over to this standby instance should your primary database experience an outage.

 

Amazon NoSQL database (DynamoDB)

NoSQL is a type of database that doesn’t use the relational tables you see in relational databases. NoSQL databases are mainly used in big data and real-time web applications. In NoSQL databases, you can use a variety of data models, such as key-value pairs, graphs and more.

Compared with RDS, here are some benefits of using NoSQL databases:

  • Can handle large volumes of structured, semi-structured and unstructured data
  • Can be easily designed and deployed, as the data does not need a fixed structure
  • Low latency and high performance when accessing larger data types (document and key-value stores)
  • Highly scalable through horizontal scaling

Amazon DynamoDB is a fully managed NoSQL database offering from Amazon that provides all the benefits of a NoSQL database in a service that is flexible, fast and scalable.

Scalability of a NoSQL database is achieved through data partitioning, which scales the read/write capacity by adding more instances horizontally. Amazon DynamoDB’s scalability is built into the service and grows based on your database load.

The high availability of Amazon DynamoDB is achieved through synchronisation of data replicas across three facilities in an AWS region.

 

Data Warehouse

A data warehouse is a type of database used for data analysis and reporting on larger amounts of data. It is a core component of business intelligence: it collects and integrates data from different sources in the business (for example IT, sales and marketing) and provides the required data to your reporting service.

A data warehouse system collects and processes data from multiple sources, so in most situations data growth and the need to scale the server are unavoidable. As a data warehouse is relational by nature, scaling it can be a complicated and costly task.

Amazon Redshift is a fully managed data warehouse system which removes all the pains of maintaining an in-house data warehouse system.

Here are some features/benefits of using Amazon Redshift:

  • It is Fast: optimised for data warehousing through a parallel processing architecture, and it reduces the I/O required for queries with data compression, zone maps and more.
  • It is Scalable: if you need to add more nodes or increase capacity, it’s a matter of a few clicks. Your database will be in read-only mode during the upgrade, after which you have a new data warehouse with more capacity.
  • It is Cheap: there is no upfront cost and you just pay for the resources/instances that you use.
  • It is Fully Managed: it is easy to start, operate and maintain, and built-in fault tolerance and automated backups give you complete peace of mind.

How to Design for a Loosely Coupled System

You may have heard the term Loose Coupling in SOA (Service Oriented Architecture) in software design, or in IaaS (Infrastructure as a Service) or PaaS (Platform as a Service) in cloud automation topics. For me, it was a bit hard to grasp at the beginning, but this simple explanation helped me better understand the actual meaning of Loose Coupling:

If your Apple iPod’s battery dies, you need to replace the whole device, as you can’t simply change the battery! That is an example of a tightly coupled system: the health of your iPod depends tightly on a battery that is not replaceable. As for loosely coupled examples in the same vein, I’m sure you can think of plenty.

When designing in the cloud (or any complex system), Loose Coupling plays an important role in the scalability and reliability of your applications, which results in easier automation. Loose Coupling is an approach that avoids one big, complex (IT) system or application; instead, you design smaller and simpler elements that cooperate with each other to provide the same service.

In a loosely coupled system:

  • Each component/element can scale independently if needed.
  • Each can be modified separately.
  • Failure in one element does not affect the rest of the system, and
  • Recovery from a failure is by far easier compared with a tightly coupled complex system.

All of this makes a loosely coupled system more manageable, reliable and scalable and, as a result, its operations tasks can be easily automated.

A very simple example of decoupling is removing the database service from an application server and deploying it on a dedicated server; the same concept applies when you design your loosely coupled application, or decouple your existing complex system.

If you want to create a loosely coupled application, you first need to Separate the Components, then define Standard Communication Interfaces for your elements. You may need to design Automatic Service Discovery at each layer, or design based on an Asynchronous Integration method. Another aspect of a loosely coupled system is how Gracefully a Failure is handled, so that it has minimal or no effect on the rest of the system. These concepts are explained in this Amazon best practice document, and here is a summary:

 

Loosening as a Mindset

Whenever you build a component (physical, virtual, or a piece of code, …), stop for a moment and ask these questions:

  • What happens if this element fails and what would be the effect on the service?
  • Can I scale/change/recover this object without touching another part of the system?

Based on your answers to the above questions, you may need to rethink your design. Here are some strategies that can help you create loosely coupled applications:

 

Build on top of a Standard Communication Interface

Instead of creating heavily customised connections between your components, create standard interfaces and APIs (like RESTful APIs) and make all communication go through these interfaces. This way, you reduce or remove dependencies between your components, as they all communicate through your standard APIs; because they are less dependent on each other, a failure in any part doesn’t prevent the others from operating normally.

One of the tools that can help you develop/build standard interfaces is Amazon API Gateway. With this fully managed service, you can create APIs that act as a “front door” for your applications to access their data, decoupling them from the rest of the system.
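The idea of coding against a standard interface rather than another component’s internals can be sketched in a few lines. This is an illustrative toy, not API Gateway itself; the store and service names are invented:

```python
from abc import ABC, abstractmethod

class OrderStore(ABC):
    """The standard interface the rest of the system codes against;
    callers never see which backing service implements it."""
    @abstractmethod
    def get_order(self, order_id: str) -> dict: ...

class InMemoryOrderStore(OrderStore):
    """Stand-in backend; in AWS this could be anything behind an
    API Gateway endpoint, swapped without touching the callers."""
    def __init__(self):
        self._orders = {}
    def put_order(self, order_id: str, payload: dict) -> None:
        self._orders[order_id] = payload
    def get_order(self, order_id: str) -> dict:
        return self._orders.get(order_id, {})

def shipping_service(store: OrderStore, order_id: str) -> str:
    # Depends only on the interface, so the backend can change freely.
    order = store.get_order(order_id)
    return order.get("address", "unknown")
```

Because `shipping_service` only knows the `OrderStore` contract, replacing the in-memory backend with a remote API requires no change to the consumer.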

 

Create a Built-in Service Discovery

When your components are decoupled and separated, they need a way to find and communicate with each other. In this situation you have two options:

  • Hard-code the communication information (e.g., the static IP address of the database server in the application server)
  • Develop automatic service discovery built into your system

The latter is preferable, as each sub-application can consume services without prior knowledge of the other components; as a result, you can add or remove components without any service outage or required changes.

One way to achieve service discovery is through Amazon ELB (Elastic Load Balancing). The unique DNS name of each load balancer gives your application/web servers reliable service discovery whenever they want to consume another service (e.g., a database). You can even add another abstraction layer by creating a DNS CNAME, decoupling the load balancer itself from the DNS name your consumers use.
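The value of that extra naming layer is that a target can be re-pointed without touching any consumer. Here is a toy registry standing in for the DNS/CNAME layer; the aliases and endpoint names are illustrative:

```python
# Toy service registry standing in for DNS resolution of a CNAME alias.
registry = {"orders-db": "my-elb-123.eu-west-1.elb.amazonaws.com"}

def resolve(alias: str) -> str:
    """Consumers look the alias up at call time, so the real endpoint
    can change without any change on the consumer side."""
    return registry[alias]

def repoint(alias: str, new_target: str) -> None:
    # The equivalent of updating the CNAME record: one change, zero
    # consumer edits.
    registry[alias] = new_target
```

Swapping a database server or load balancer then becomes a single `repoint` call (a single DNS record update in the real system).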

 

Asynchronous Integration

If you have applications that don’t need an immediate response in their communication, you can decouple them from interacting directly with each other by utilising an intermediate storage layer (like Amazon SQS). In this communication pattern, one component normally generates events and another one consumes them. So the source system sends the message to the external queuing system – instead of directly to the target application – and the target (consumer) reads the message from the queue:

Image From Amazon

As you can see, if any of the controllers fails, the other controllers can continue operating as normal by putting/getting messages to/from the queues. You can also scale each controller up or down without affecting the other layers.
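The pattern above can be sketched with Python’s standard-library queue standing in for SQS; the producer never calls the consumer directly, and either side can run (or fail) independently:

```python
import queue

def producer(q: queue.Queue, jobs) -> None:
    """Source system: drop work onto the queue and move on --
    no direct call to, or knowledge of, the consumer."""
    for job in jobs:
        q.put(job)

def consumer(q: queue.Queue, results: list) -> None:
    """Target system: drain whatever is waiting, at its own pace."""
    while True:
        try:
            job = q.get_nowait()
        except queue.Empty:
            break
        results.append(job.upper())   # stand-in for real processing
        q.task_done()
```

With SQS the queue additionally persists messages, so a consumer that is down simply picks up the backlog when it returns.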

 

Graceful Failure (Detect/Recovery)

A loosely coupled system can tolerate faults by gracefully recovering from failure. Here are some strategies for building graceful failure handling into your system:

  • Amazon Route 53’s DNS failover feature can monitor for a failed server and stop directing traffic to the failed component.
  • You can utilise front-end caching systems that redirect users to cached content when the main web site fails.
  • A failed task can be stored in a queue to be processed later when the system is healthy.
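The last point – parking failed work for later – can be combined with a retry limit so one bad job can’t loop forever. A minimal sketch, again with a standard-library queue standing in for SQS and its dead-letter queue:

```python
import queue

def process_with_retry(work: queue.Queue, handler,
                       dead_letter: queue.Queue, max_attempts: int = 3) -> None:
    """Drain the work queue; a failed job goes back on the queue until
    it reaches max_attempts, then it is parked on a dead-letter queue
    so the rest of the system keeps running."""
    while not work.empty():
        job, attempts = work.get()
        try:
            handler(job)
        except Exception:
            if attempts + 1 >= max_attempts:
                dead_letter.put(job)          # park for later inspection
            else:
                work.put((job, attempts + 1))  # retry when healthier
```

SQS implements the same idea natively with its redrive policy and `maxReceiveCount`, but the logic is worth understanding in isolation.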

 

Automating the IT Operations Tasks in Amazon AWS! Reality or Illusion?

Image from Amazon

Automation is one of the Principles of Architecting in the Cloud and, for most IT operations engineers, it is the end goal. But can we really get to the point where all operations tasks become automated?

The answer is yes!

When you properly design your applications and infrastructure and make them stateless and reusable resources by the techniques we’ve discussed here and here, you’ve already started your automation journey:

  • By making your applications and components stateless, you can easily scale them with minimal or no manual process involved, which you can count as automation.
  • When you make your servers stateless and reusable, you’ve already utilised some automation to bootstrap your computing resources.
  • Having your infrastructure available as code is also a step towards automating your infrastructure resources.

Now that you’ve done your homework designing automation-ready applications, servers and infrastructure, you can sit back and let AWS do most of the manual work for you.

Autoscaling, AWS Elastic Beanstalk, CloudWatch, OpsWorks and Lambda are some of Amazon’s amazing tools that can help you increase the efficiency and reliability of your IT service as well as the productivity of your IT operations.

These services can help you automate responses to events that you would react to manually in traditional IT environments.

Auto-Scaling

Scalability has always been an important factor in designing at any layer of your IT infrastructure. If your applications and infrastructure have been designed for scalability you can move to the next step and automate your scalability tasks.

Using the Auto Scaling feature in AWS, you can dynamically add more server instances when there is demand – triggered by CPU, memory or load – so you maintain the availability of your service. You can also remove resources automatically during quiet times and save some money.

I believe this is one of the coolest AWS features: it gives you peace of mind about your application’s availability and reliability and also removes the cost of underutilised servers.

AWS Elastic Beanstalk

Let’s imagine you have an application and you want to deploy it and provision the required resources. Yes, you need to provision the servers, load balancers, database and application servers (PHP, Java, Python, …), configure all of the components and connections, patch your servers and start uploading your application code to your environment. This will take hours if not days, assuming you have all the resources available.

With AWS Elastic Beanstalk, all you need to do is upload your application code and Elastic Beanstalk does the hard job for you. It handles all the resource deployment, capacity provisioning, load balancing, auto-scaling and health monitoring, so your application will be up and running in minutes.

You may ask which application platforms are supported. The answer is that almost all of the well-known platforms are supported by AWS Elastic Beanstalk:

Java, .NET, PHP, Node.js, Python, Ruby, Go and Docker

And it’s free!! There is no extra charge for Elastic Beanstalk; you just pay for the AWS resources needed to store and run your application.

Amazon CloudWatch

CloudWatch is a monitoring service for AWS cloud services. You can collect and track metrics and logs from AWS resources and set alarms based on the status of your service. Most importantly, you can automatically respond to changes in your AWS resources.

For example, you can automatically recover your EC2 instance in case of failure. EC2 instances are the equivalent of your (virtual) servers in the AWS world. You can create a CloudWatch alarm to monitor the health of your EC2 instance and automatically recover it in case of hardware or software failure. The new instance is identical to the original one, with the same ID and private/public IP addresses. Note that this feature is not supported by all instance types.

You can also call/run functions (through Amazon SNS and Lambda) when a specific CloudWatch Alarm triggers as a result of a metric change.

AWS OpsWorks

One of the features of AWS OpsWorks is that it can automatically update your configuration based on instance lifecycle events. For example, when a new database instance is created in your server farm – a lifecycle event – OpsWorks can call a Chef recipe responsible for updating your application servers so that they can use this new database server. This is a very common example of continuous configuration of instances.

Why Your Servers are Not Important Anymore

Image from Amazon AWS

Unlike the old days, when you had to build a server from scratch and spend hours installing software, patching the server and setting up static configuration (IP, name, …), in cloud computing you can dynamically create lots of servers with all the required components and configurations pre-deployed; As a result:

Your servers become just temporary computing resources that do the processing for you.

Any update, patch or fix required for the server? Not a problem in the cloud! The old server is simply removed and replaced with a new, updated, patched and healthier version.

So treating your servers as Disposable Compute Resources is one of the Principles of Architecting in the Cloud; But how do you convert the time-consuming task of a server build into a smooth, reusable process?

 

Bootstrapping

Even with dynamic provisioning of resources in the cloud, your servers still come with a default configuration and set of installed applications. Bootstrapping is:

Pushing scripts to the server that customise your OS from top to bottom, including installing and setting up software.

There are multiple ways to bootstrap servers:

  • You can push PowerShell/bash scripts to the server, or
  • You may use configuration management tools like Puppet or Chef recipes;
  • Cloud-init and user-data scripts are other ways to auto-configure servers during the boot process.
  • AWS CloudFormation and AWS OpsWorks are two main tools in AWS that can help you bootstrap your servers.
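To make the user-data idea concrete, here is a sketch that assembles a bootstrap script in code, the way a provisioning tool might before passing it to EC2 as user data. The package list, repository URL and service name are invented placeholders:

```python
def render_user_data(packages: list, app_repo: str) -> str:
    """Build a user-data bootstrap script as a string; everything here
    (packages, repo, service name) is illustrative, not a recipe for a
    real deployment."""
    lines = [
        "#!/bin/bash",
        f"yum install -y {' '.join(packages)}",   # install required software
        f"git clone {app_repo} /opt/app",          # fetch the application
        "systemctl enable --now app.service",      # start it on boot
    ]
    return "\n".join(lines)
```

Generating the script in code (rather than hand-editing it per server) is what makes the bootstrap step repeatable across any number of instances.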

 

Golden Images

If you need a faster approach with fewer dependencies on external components, you can apply all your customisation to a server and prepare it as your golden image.

Now, with your golden image, you can deploy as many servers as you want from the image, all including the pre-built software and configuration.

 

Hybrid (Bootstrapping & Golden Image)

You can use a combination of the two methods to get the best out of your auto-provisioning. The question is when and where to use each method over the other. It all depends on your deployment, but as a general rule:

  • Things that are less likely to change between your instances (e.g., software installations) are the best items to put in your golden image; installing software, even automatically, can be a time-consuming job.
  • On the other hand, things that are more likely to change between deployments are better handled by bootstrapping (e.g., minor software updates and application-specific configuration such as database settings).

A good example of this is AWS Elastic Beanstalk, which provides pre-configured servers with all the required software but also lets you use bootstrapping to customise your environment variables.

 

Bootstrapping, golden images, or the combination of the two are part of a Server Instantiation Approach by which you make your server provisioning an automated, repeatable process.

Let’s move one step forward. What if you want to extend your automation beyond your servers and make your entire infrastructure act as programmable resources that can themselves be reproduced as a process?

 

Infrastructure as Code

When you transform your whole infrastructure into code, a new window of possibilities opens up:

Anything that can be converted to software can be programmed, and anything that can be programmed can be reused, reproduced and automated.

An example of a tool that helps you move towards an infrastructure-as-code enabled environment is AWS CloudFormation. With AWS CloudFormation you can create, manage and evolve your AWS resources as code (networks, load balancers, security policies, …). Multiple AWS resources can be programmed together and attached to your application, enabling you to create a reusable end-to-end environment covering your server resources as well as your infrastructure resources.
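To show what “infrastructure as code” looks like at its smallest, here is a sketch that builds a minimal CloudFormation-style template in code. An S3 bucket is used because it needs no required properties; the logical ID is whatever name you choose:

```python
import json

def make_template(bucket_logical_id: str) -> str:
    """Build a minimal CloudFormation template as a JSON string.
    Once infrastructure is data like this, it can be versioned,
    reviewed, parameterised and re-created on demand."""
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            bucket_logical_id: {"Type": "AWS::S3::Bucket"},
        },
    }
    return json.dumps(template, indent=2)
```

In practice you would hand this string to CloudFormation (via the console, CLI or an SDK) to create a stack, and extend the `Resources` section with the rest of your environment.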

 

How to Design Scalable Applications in Amazon AWS

Image from Amazon AWS

Almost every application and system grows over time (users, data, traffic, …); You may have different applications with different growth rates, but how do you make sure your infrastructure can keep up as your application and data get bigger and bigger?

You must build your applications on top of a scalable infrastructure that can grow with your system and handle extra loads over time.

This is why building a highly scalable system is one of the Important Factors when Designing in the Cloud.

When thinking about scalability in IT world you have two general options:

  • Vertical
  • Horizontal

Vertical Scaling (Scale Up)

Vertical scaling is basically adding more CPU, RAM, I/O or networking capacity to a physical or virtual server, storage or networking device. But there is a problem with this kind of scaling:

You have a limit on the amount of resources you can add to a server (memory and CPU, for example); plus, you put your application at the risk of one big server going down.

Horizontal Scaling (Scale Out)

When you scale horizontally, you increase the number of resources – for example by adding more servers or network devices. The benefit of this kind of scaling is:

Technically you can scale with no limit and start creating an Elastic system that can scale up/down based on your application requirements.

But you need to consider the fact that not all applications support distributing their load across multiple resources; in other words, your application needs to be Stateless.

Stateless Application

A stateless application doesn’t need to know about a user’s session or any previous interaction with the application, so any node can be gracefully added or removed with no end-user disruption. For example, a static website that doesn’t provide any user login feature doesn’t need to keep user session information.

Stateless Component

It would be great to have everything stateless, but in reality you always have some sort of session or state to maintain in your applications. Here are two situations:

  • You may have your user’s session/login information in your web application that you need to maintain.
  • In multi-step data processes, you need to keep track of previous tasks/activities in your process flow.

In these situations you can remove the state data from your nodes, store it somewhere else, and let the components be stateless:

  • You can save users’ session data in a managed database (Amazon DynamoDB), detaching session data from your servers and making them stateless.
  • If you need to keep your users’ files (pictures, data, batch processing results, …) you can put them on highly available managed shared storage (Amazon S3, Amazon EFS).
  • You can use a managed workflow service (Amazon Simple Workflow Service (SWF)) to store execution history in a central shared location when you want to keep track of a multi-step workflow process and make it stateless.
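The first point – pulling session state out of the web node – can be sketched as follows. A plain dict stands in for an external table such as DynamoDB; the request handler is an invented example:

```python
class SessionStore:
    """External session store; a dict stands in for a managed table
    (e.g. DynamoDB) so the web nodes themselves stay stateless."""
    def __init__(self):
        self._table = {}
    def save(self, session_id: str, data: dict) -> None:
        self._table[session_id] = data
    def load(self, session_id: str) -> dict:
        return self._table.get(session_id, {})

def handle_request(store: SessionStore, session_id: str) -> int:
    """Any node can serve the request, because the state lives in the
    store rather than in the node's memory."""
    data = store.load(session_id)
    data["hits"] = data.get("hits", 0) + 1
    store.save(session_id, data)
    return data["hits"]
```

Because every node reads and writes the same store, a request can land on any server – which is exactly what makes horizontal scaling safe.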

Stateful Applications/Components

What if you can’t (or don’t want to) make your component stateless?

  • Like some legacy applications that don’t support it by nature.
  • Or you may not want your users to move between your nodes (e.g., some multi-player gaming applications which require very low latency while users are playing).

There is a way to scale these stateful applications/components, known as Session Affinity strategies. When using session affinity, there are limitations on how well you can distribute the load as you add or remove nodes from the cluster, because users are attached to a specific node; since users cannot move between servers, you may not end up with fully balanced servers.

After you create your beautifully designed scalable system, you need to distribute the load across its nodes. There are two high-level Load Distribution Strategies you can use: the Push Method and the Pull Method.

Distributed Processing

Imagine a case in which you need to process a massive amount of data and you can hardly find a single server capable of handling the processing load. Here another type of Horizontal Scaling comes into play:

Distributed Processing is splitting a big task (and its big data) into many smaller jobs and processing them in parallel across multiple/many servers.

Here are two general solutions to handle distributed processing:

  • You can use a Distributed Data Processing Engine (e.g., Apache Hadoop, Amazon Elastic MapReduce (EMR)) to manage and process a massive amount of distributed data.
  • In case you have a large stream of real-time data you can use Amazon Kinesis to divide your data into multiple portions and process them by multiple computing resources in your server farm.

3 Ways to Achieve Session Persistence when Load Balancing


When you want to distribute load between multiple nodes, most of the time you need to maintain the user’s session or keep users on the same server to avoid an unreliable experience.

This is called Session Persistence, and there are multiple ways to achieve it depending on your application’s situation.

You can maintain the user session by sharing the session information in a database or file service; you can use Sticky Sessions in your load balancer; or you may use Client-Side Load Balancing, by which your clients choose the correct server to connect to.

 

1- User’s Session Sharing

In this method, you share session information between servers so session data is always available to any server a user may connect to. But how do you share users’ session data between multiple servers?

  • You can save session information in a shared file system (NFS, CIFS, …) or in a database; I recommend using a fast and reliable managed database (Amazon DynamoDB) for this purpose.
  • You may even keep your users’ files (pictures, data, batch processing results, …) on highly available managed shared storage (Amazon S3, Amazon EFS) and create a stateless server farm.

When sharing user sessions between servers, you create a stateless environment: users can connect to any server in your server farm, and your application maintains their session for the duration of their connection.

But what if you have a legacy application, or a gaming application where you don’t want (or can’t let) users move between servers? Here you need to use Session Affinity methods, known as Sticky Sessions.

 

2- Sticky Sessions

When using Sticky Sessions, you set up your network load balancer to direct a user’s traffic to the same/correct server based on an affinity rule you define. There are different types of affinity rules: partition-based (IP or username) or cookie-based.

IP Based Affinity

One of the easiest ways to stick a user’s session to a server is to use the client’s IP address as the affinity rule. With IP-based affinity, you always direct a user with the same source IP address to the same server.

This method is the easiest but not the most accurate, as a service provider or proxy may change your user’s IP address, which stops your affinity rule from working properly.
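IP-based affinity is typically just a deterministic mapping from source address to backend. A minimal sketch, with an invented backend pool:

```python
import hashlib

SERVERS = ["app-1", "app-2", "app-3"]   # illustrative backend pool

def server_for_ip(client_ip: str) -> str:
    """IP-based stickiness: hashing the source address means the same
    client always maps to the same backend, with no lookup table,
    as long as the pool itself doesn't change."""
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]
```

Note that this sketch also demonstrates the weakness described above: if a NAT gateway or proxy changes the client’s source IP, the hash changes and the user lands on a different server, losing their session.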

User Based Affinity 

If you are authenticating users, and your web application supports this function, you may be able to partition your users based on their usernames.

Cookie Based Affinity

This is an application-level affinity rule by which you use a session cookie (set by your load balancer or through your application’s clustering feature) to identify users and direct them to the correct server.

 

3- Client Side Load Balancing

When you do client-side load balancing, your clients (not your load balancer) discover the correct server to connect to. You can use DNS or a discovery API that passes this information to the software running on the client. The latter is mostly used in gaming applications.

 

The 2 Load Distribution Design Models for your Applications

Image from AWS

In general, most load distribution design strategies for your applications fall into these two general models:

 

1- Push Method:

In the push model, you basically push the load to a server or node through an external service like a network load balancer (Amazon ELB) or a DNS load balancer (Amazon Route 53).

The user sends requests to the network load balancer or DNS, and the load balancer distributes the traffic to the next available node based on a distribution algorithm (round robin, load, performance, delay, …).
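Two of the distribution algorithms just mentioned are easy to show in miniature; the node names are illustrative:

```python
import itertools

def round_robin(nodes: list):
    """Cycle through the healthy nodes in turn -- the simplest
    distribution algorithm a load balancer can apply."""
    return itertools.cycle(nodes)

def least_loaded(connections: dict) -> str:
    """A load-aware alternative: pick the backend with the fewest
    active connections at the moment the request arrives."""
    return min(connections, key=connections.get)
```

Round robin ignores how busy each node is, which is why load balancers usually pair it with health checks or offer load-aware policies like the second one.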

When using DNS load balancing, be mindful of client-side DNS caching, which is usually out of your control. It may cause users to be directed to a server/node that is unavailable at the time of the request, causing a service timeout; in such a situation they won’t receive the correct DNS record (based on the available nodes) until their local cache expires.

This is why DNS load balancing by itself is not very accurate unless combined with a live server health check (Amazon Route 53).

 

2- Pull Method:

In the pull model, there is no need for any external load balancing service. This type of distribution is more suitable for back-end data processing services working through particular tasks.

When you need to process an Asynchronous Event-Driven Workload, you put the work items as messages in a queue (Amazon SQS), and multiple processing nodes can pull these messages successively and process them based on their load. Your workload can also be a stream of data, instead of messages, coming from a data streaming service (Amazon Kinesis).

10 Design Principles when Architecting in the Cloud

Based on the AWS best practice guidance on architecting for the cloud, you need to consider these 10 design principles when architecting in the cloud:

  1. Scalability
  2. Disposable Servers/Resources
  3. Think Automation
  4. Loose Coupling
  5. Think about Services, Not Servers
  6. Database
  7. No Single Point of Failure
  8. Cost Optimization
  9. Caching
  10. Security

In this article I explain what each subject means in cloud computing design and how you can achieve each principle:

 

1-Scalability

Almost every application and system grows over time (users, data, traffic, …); You may have different applications with different growth rates, but how do you make sure your infrastructure can keep up as your application and data get bigger and bigger?

You must build your applications on top of a scalable infrastructure that can grow with your system and handle extra loads over time.

When thinking about scalability in the IT world, you have two general options:

  • Vertical
  • Horizontal

read more

 

2- Servers as Disposable Computing Resources

Unlike the old days, when you had to build a server from scratch and spend hours installing the software, patching it, and setting up static configuration (IP, name, …), in cloud computing you can dynamically create lots of servers with all the required components and configuration pre-deployed. As a result:

Your servers become just temporary computing resources that do the processing for you.

Need an update, patch, or fix for a server? Not a problem in the cloud: the old server is removed and replaced with a new, updated, patched, and healthier version.

But how do you convert the time-consuming task of a server build into a smooth, reusable process?

read more

 

3-Automation

When you properly design your applications and infrastructure to be stateless and reusable, using the techniques discussed earlier, you have already started your automation journey. Making your applications and components stateless lets you scale them with minimal or no manual work involved, and making your servers reusable means you have already applied some automation to bootstrap your computing resources. Infrastructure as Code is another important foundation for automating your infrastructure resources.

Now that you have done your homework and designed automation-ready applications, servers, and infrastructure, you can sit back and let AWS do most of the manual work for you.

Auto Scaling, AWS Elastic Beanstalk, CloudWatch, OpsWorks, and Lambda are some of the Amazon tools that can help you increase the efficiency and reliability of your IT services, as well as the productivity of your IT operations. These services automate responses to events that you would have to handle manually in a traditional IT environment.
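As a flavour of what "responding to events automatically" looks like, here is a minimal sketch of an event-driven function in the style of an AWS Lambda handler reacting to a monitoring alarm. The event shape and alarm names are illustrative assumptions, not the exact CloudWatch event format:

```python
# Sketch of event-driven automation in the style of a Lambda handler.
# The event fields ("alarmName", "newState") are hypothetical, chosen
# to illustrate the pattern rather than mirror a real CloudWatch payload.

def handler(event: dict, context=None) -> dict:
    """Decide on a remediation action for an alarm event."""
    alarm = event.get("alarmName", "unknown")
    state = event.get("newState", "OK")
    if state == "ALARM":
        # A real function might call Auto Scaling or replace an instance here.
        action = f"remediate:{alarm}"
    else:
        action = "no-op"
    return {"alarm": alarm, "action": action}

print(handler({"alarmName": "HighCPU", "newState": "ALARM"}))
# {'alarm': 'HighCPU', 'action': 'remediate:HighCPU'}
```

The point is that the reaction lives in code, triggered by the platform, instead of in a runbook executed by an operator at 3 a.m.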

read more

 

4-Loose Coupling

Loose coupling plays an important role in the scalability and reliability of your applications, which in turn makes automation easier. Loose coupling is an approach that avoids one big, complex (IT) system or application; instead, you design smaller, simpler elements that cooperate with each other to provide the same service.

In a loosely coupled system, each component can scale independently if needed, and each can be modified separately. Failure in one element does not affect the rest of the system, and recovery from a failure is far easier than in a tightly coupled, complex system. All of this makes a loosely coupled system more manageable, reliable, and scalable, and as a result its operations can be easily automated.

A very simple example of decoupling is removing the database service from an application server and running it on a dedicated server; the same concept applies when you design a loosely coupled application or decouple an existing complex system.

To create a loosely coupled application, first make sure you separate the components, then define standard communication interfaces between them. You may need automatic service discovery at each layer, or a design based on asynchronous integration. Another aspect of a loosely coupled system is how gracefully a failure is handled, so that it has minimal or no effect on the rest of the system.
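Graceful failure handling is the easiest of these ideas to show in a few lines. In this sketch, a (hypothetical) recommendations component is down, but the page that depends on it degrades to a safe default instead of failing outright:

```python
# Sketch of graceful failure between loosely coupled components.
# The component and function names are hypothetical.

def recommendations_service(user_id: str) -> list:
    # Simulate an unavailable downstream component.
    raise TimeoutError("recommendation component unavailable")

def render_home_page(user_id: str) -> dict:
    try:
        recs = recommendations_service(user_id)
    except TimeoutError:
        recs = []  # degrade gracefully; the rest of the page still works
    return {"user": user_id, "recommendations": recs, "status": "ok"}

print(render_home_page("u42"))
# {'user': 'u42', 'recommendations': [], 'status': 'ok'}
```

Because the caller depends only on the interface (a list of recommendations), it can substitute an empty result and keep serving, which is exactly the failure isolation loose coupling is meant to buy you.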

 

5- Think about Services, Not Servers

This is the next step towards building more efficient IT services. When you think about starting an application (in the cloud), the usual mindset is to fire up some web, application, and database servers and start deploying your applications.

But there may be a better way to do this!

What if you don’t even need a server to provide your service? You can design around a serverless infrastructure by using AWS managed services.

Amazon offers a broad range of managed services that you can consume without firing up a server; this means more savings on underutilised resources and less operations and maintenance work for your applications. Here are some of the services you can consume without starting a server:

  • Compute: AWS Lambda
  • Storage: Amazon S3, Glacier, CloudFront
  • Database: Amazon DynamoDB, Redshift
  • Application: Amazon Simple Queue Service (SQS), Simple Workflow (SWF)
  • Analytics: Amazon Kinesis, Elastic Map Reduce (EMR), Machine Learning
  • Mobile Apps: Amazon Cognito
  • IoT: Amazon IoT (Internet of Things)

 

6- A Word on Databases

One approach to deploying databases in the cloud is to create your own: start your servers/instances and install your own database platform of choice.

The better way, though, is to use AWS managed databases, which are ready for you to start your service. Amazon provides fully managed relational (Amazon RDS) and NoSQL (Amazon DynamoDB) database services, along with in-memory caching (ElastiCache) and a data-warehouse service (Amazon Redshift).

Which database to choose depends heavily on your application and data types. Do you need to maintain data models? How many concurrent users do you want to support? What is the size and type of your objects? How big is your data, and what is your growth rate?

 

7-No Single Point of Failure

Single point of failure is a well-known term when designing a highly available system. A system with no single point of failure can tolerate the failure of one or more components (e.g., network, memory module, hard disk, server, …).

When you intend to remove single points of failure, you need to consider all the services and components that your system relies on, end to end. For example, a highly available web server may rely on the following components and elements:

Datacenters, DNS servers, internet providers, routers, switches, routing protocols, DB/web/application servers, ethernet cards, motherboards, memory modules, CPUs, hard disks, power modules, UPSs, power sources, generators, SAN switches, storage arrays, backup devices, …

To remove a single point of failure from your system you need to:

  • Make your components redundant.
  • Make sure you can quickly detect a failure in your system.
  • Utilise reliable storages with built-in data integrity.
  • Consider distributing your services across multiple data centres or even multiple regions.
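The first two points, redundancy plus quick failure detection, combine into a simple failover pattern. This sketch uses hypothetical endpoint names and a hard-coded health table in place of real health checks:

```python
# Sketch of failure detection plus redundancy: try each endpoint in
# order and fail over when one is down. Endpoint names and health
# states are hypothetical stand-ins for real health probes.

HEALTH = {"db-primary": False, "db-replica-1": True, "db-replica-2": True}

def is_healthy(endpoint: str) -> bool:
    return HEALTH[endpoint]  # a real check would probe the endpoint

def connect(endpoints: list) -> str:
    for ep in endpoints:
        if is_healthy(ep):
            return ep
    raise RuntimeError("no healthy endpoint: total outage")

print(connect(["db-primary", "db-replica-1", "db-replica-2"]))  # db-replica-1
```

With only one endpoint in the list, the primary is a single point of failure; with replicas behind it, its failure becomes invisible to the caller.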

 

8-Cost Optimization

Building a highly available, reliable service with built-in automation is not the end of your architecting journey. One of the most important aspects of any design is to make sure it is optimised for cost.

There is no doubt that moving to the cloud reduces the capital expenses in your IT budget, but you don’t want to spend all your money on your newly introduced operational expenses either.

AWS provides tonnes of options and offerings that can dramatically reduce your costs if you select and deploy them correctly based on your service requirements.

There are different resource types for different purposes with different price/performance:

  • Computing resource options: nano/micro/small/medium/large EC2 instance sizes, AWS Lambda, …
  • Multiple database options (Amazon RDS, Redshift, DynamoDB, …)
  • Different storage options (S3, Glacier, EBS, …)

There are also different purchasing options to choose from:

  • On-Demand Instances
  • Reserved Instances
  • Spot Instance
  • Dedicated Hosts

Another way to significantly reduce your resource costs is to use Auto Scaling in your deployment wherever possible; it shuts down underutilised resources and saves you lots of money.
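The heart of that cost saving is a simple feedback rule on utilisation. Here is a minimal sketch of the kind of scale-out/scale-in decision Auto Scaling makes; the 75%/25% thresholds are illustrative assumptions, not AWS defaults:

```python
# Sketch of an Auto Scaling-style decision from average utilisation.
# Thresholds and limits here are illustrative, not AWS defaults.

def scaling_decision(avg_cpu: float, current: int,
                     minimum: int = 1, maximum: int = 10) -> int:
    """Return the desired instance count for a given average CPU %."""
    if avg_cpu > 75 and current < maximum:
        return current + 1   # scale out under load
    if avg_cpu < 25 and current > minimum:
        return current - 1   # scale in to stop paying for idle capacity
    return current           # within the comfortable band

print(scaling_decision(avg_cpu=80, current=4))  # 5
print(scaling_decision(avg_cpu=10, current=4))  # 3
print(scaling_decision(avg_cpu=50, current=4))  # 4
```

The scale-in branch is where the money is saved: instances you are not using simply stop existing, instead of sitting idle on the bill.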

 

9-Caching

Adding more resources (CPU, RAM, network, …) to your servers, or adding more servers to your cluster, may increase processing power, but there is a limit to the performance you can achieve by adding resources alone. Caching is a smarter approach to increasing performance and taking your users’ experience to the next level. But what exactly does caching mean in the cloud?

Caching is storing previously accessed or processed data for faster future access.

You can add caching at different layers of your application design to increase the performance and efficiency of that layer:

  • Edge caching is a technique by which you save the static (or dynamic) content of your website (images, videos, …) in multiple locations around the world, close to your users, in a CDN (Content Delivery Network) and retrieve it with high performance when needed (Amazon CloudFront).
  • Application data caching is a caching technique at the application layer. In short, it saves processed information or DB query results in memory so that your application can later reuse them instead of re-querying the DB or recalculating. Amazon ElastiCache is a web service offering two open-source in-memory caching engines (Memcached and Redis) to provide this feature for you.
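Application data caching can be shown in miniature with an in-process memoisation sketch: the first call does the expensive "query" and later calls are served from the cache. The same idea, shared across servers, is what ElastiCache provides with Memcached or Redis; the profile lookup here is a made-up stand-in for a database query:

```python
from functools import lru_cache

# Counter standing in for "how many times we actually hit the database".
CALLS = {"count": 0}

@lru_cache(maxsize=128)
def get_user_profile(user_id: str) -> dict:
    CALLS["count"] += 1  # stands in for an expensive database query
    return {"id": user_id, "name": f"user-{user_id}"}

get_user_profile("42")   # miss: hits the "database"
get_user_profile("42")   # hit: served from the in-memory cache
print(CALLS["count"])    # 1, because the query ran only once
```

A shared cache like ElastiCache works the same way but across your whole fleet, and adds eviction and expiry policies so stale data doesn't linger.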

 

10-Design for Security

When moving to the AWS cloud, you can take all of your in-house security tools and systems with you. In parallel with what you already have, you can benefit from the features AWS provides to bring more security to your environment:

  • Utilise Amazon VPC (Virtual Private Cloud) to secure and isolate your servers deep inside the network.
  • Use security groups, security policies, and access lists to add another layer of filtering to your servers/services.
  • Deploy an application-level firewall, AWS WAF (Web Application Firewall), to protect your web applications from internet threats.
  • Use AWS IAM (Identity and Access Management) to control granular access to your AWS services.

Amazon uses the Shared Responsibility Model, which means you are responsible for security in the cloud while Amazon takes responsibility for the underlying infrastructure. One way to shift some security responsibilities to Amazon is to use AWS managed services wherever possible; that way, Amazon looks after the maintenance, patching, and security measures of its managed services while you focus on your applications.

Similar to Infrastructure as Code, you can express your security as code, making it reproducible and reusable. Using AWS CloudFormation, you can create templates for your firewall rules, access lists, security policies, subnets, and so on, and reuse them across multiple deployments. Security as code leads to more consistent security, along with all the automation and efficiency it provides.
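As a taste of security as code, this sketch generates a minimal CloudFormation fragment: a security group that allows inbound HTTPS only. The resource name and description are made up; the point is that the rule lives in version-controlled code that can be reviewed and reused:

```python
import json

# Minimal "security as code" sketch: build a CloudFormation fragment
# defining a security group that only allows HTTPS in. The logical name
# "WebSecurityGroup" is a hypothetical example.

def https_only_security_group(description: str) -> dict:
    return {
        "Type": "AWS::EC2::SecurityGroup",
        "Properties": {
            "GroupDescription": description,
            "SecurityGroupIngress": [
                {"IpProtocol": "tcp", "FromPort": 443,
                 "ToPort": 443, "CidrIp": "0.0.0.0/0"}
            ],
        },
    }

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {"WebSecurityGroup": https_only_security_group("HTTPS only")},
}
print(json.dumps(template, indent=2))
```

Deploying the same template in every environment guarantees the firewall rule is identical everywhere, which is exactly the reproducibility the paragraph above is after.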