Based on the AWS best practices for architecting for the cloud, you need to consider these 10 design principles when architecting in the cloud:
- Scalability
- Disposable Servers/Resources
- Think Automation
- Loose Coupling
- Think about Services, Not Servers
- Database
- No Single Point of Failure
- Cost Optimization
- Caching
- Security
In this article I explain what each principle means in cloud computing design and how you can achieve it:
1-Scalability
Almost every application and system grows over time (users, data, traffic, …). Different applications grow at different rates, but how do you make sure your infrastructure can keep up as your application and data get bigger and bigger?
You must build your applications on top of a scalable infrastructure that can grow with your system and handle extra loads over time.
When thinking about scalability in the IT world you have two general options:
- Vertical scaling (scaling up): adding more resources (CPU, RAM, disk, …) to a single server.
- Horizontal scaling (scaling out): adding more servers and spreading the load across them.
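To make the idea concrete, here is a minimal Python sketch of the calculation behind horizontal scaling: given a growing load, work out how many instances you need. The request rates and per-instance capacity are made-up illustrative figures, not AWS numbers.

```python
import math

def instances_needed(requests_per_sec: float, capacity_per_instance: float,
                     min_instances: int = 1) -> int:
    """Horizontal scaling: handle extra load by adding instances
    instead of making one instance bigger (vertical scaling)."""
    return max(min_instances, math.ceil(requests_per_sec / capacity_per_instance))

# Assume each instance handles ~200 req/s; traffic grows over time.
for load in (150, 800, 2400):
    print(load, "req/s ->", instances_needed(load, 200), "instances")
```

This is exactly the kind of decision an Auto Scaling policy automates for you as load changes.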
2- Servers as Disposable Computing Resource
Unlike the old days, when you had to build a server from scratch and spend hours installing software, patching it, and setting up static configuration (IP, name, …), in cloud computing you can dynamically create lots of servers with all the required components and configuration pre-deployed. As a result:
Your servers become just temporary computing resources that do the processing for you.
Any update, patch, or fix required for a server is not a problem in the cloud: the old server is simply removed and replaced with a new, updated, patched, healthier version.
But how do you convert the time-consuming task of a server build into a smooth, reusable process?
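One common answer is to bake the build steps into a reusable bootstrap script (for example, EC2 user data) instead of configuring each server by hand. A minimal sketch, where the package list and the S3 bucket name are hypothetical, of rendering such a script from a template:

```python
def render_user_data(packages, app_version):
    """Render a shell script (e.g. EC2 user data) that configures a fresh
    server identically every time it boots -- no manual steps involved."""
    lines = ["#!/bin/bash", "set -e",
             "yum update -y"]  # patch on boot instead of patching in place
    lines += [f"yum install -y {p}" for p in packages]
    # Hypothetical bucket and artifact naming, for illustration only:
    lines.append(f"aws s3 cp s3://my-bucket/app-{app_version}.tar.gz /opt/")
    return "\n".join(lines)

print(render_user_data(["nginx", "git"], "1.2.0"))
```

Because the script is generated, the same build runs identically on every new server, which is what makes the server itself disposable.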
3-Automation
When you design your applications and infrastructure to be stateless and reusable using the techniques discussed above, you have already started your automation journey. Stateless applications and components can be scaled with little or no manual work, and stateless, reusable servers mean you are already using some automation to bootstrap your computing resources. Infrastructure as code is another important foundation for automating your infrastructure resources.
Now that you have done your homework and designed automation-ready applications, servers, and infrastructure, you can sit back and let AWS do most of the manual work for you.
Auto Scaling, AWS Elastic Beanstalk, CloudWatch, OpsWorks, and Lambda are some of the Amazon tools that can increase the efficiency and reliability of your IT services as well as the productivity of your IT operations. These services can automate responses to events that you would have to react to manually in a traditional IT environment.
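As a sketch of what "automating a response to an event" means, here is a toy Python function that reacts to a simplified, CloudWatch-style alarm payload by deciding a new instance count. The event shape here is invented for illustration; real CloudWatch events carry more fields.

```python
def on_alarm(event: dict, desired: int, max_instances: int = 10) -> int:
    """React to a simplified, hypothetical CloudWatch-style alarm event by
    returning a new desired instance count -- the kind of decision
    Auto Scaling automates so nobody has to react manually at 3 a.m."""
    if event.get("NewStateValue") != "ALARM":
        return desired                          # nothing to do
    if event.get("MetricName") == "CPUUtilization":
        return min(desired + 1, max_instances)  # scale out by one instance
    return desired

print(on_alarm({"NewStateValue": "ALARM", "MetricName": "CPUUtilization"}, desired=2))
```

In a real deployment, this decision lives in an Auto Scaling policy or a Lambda function subscribed to the alarm, not in code you run yourself.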
4-Loose Coupling
Loose coupling plays an important role in the scalability and reliability of your applications, which in turn makes automation easier. Loose coupling is an approach that avoids one big, complex (IT) system or application; instead, you design smaller, simpler elements that cooperate with each other to provide the same service.
In a loosely coupled system, each component can scale independently if needed and can be modified separately. A failure in one element does not affect the rest of the system, and recovery from a failure is far easier than in a tightly coupled, complex system. All of this makes a loosely coupled system more manageable, reliable, and scalable, and as a result its operations can easily be automated.
A very simple example of decoupling is removing the database from an application server and running it on a dedicated server; the same concept applies when you design a new loosely coupled application or decouple an existing complex system.
To create a loosely coupled application, first separate the components, then define standard communication interfaces between them. You may need to design automatic service discovery at each layer, or design around asynchronous integration. Another aspect of a loosely coupled system is how gracefully a failure is handled, so that it has minimal or no effect on the rest of the system.
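Asynchronous integration can be sketched with Python's standard library: the producer never calls the worker directly, only the queue, which plays the role a service like Amazon SQS plays in the cloud. This is an illustrative local sketch, not an SQS client.

```python
import queue
import threading

# Loose coupling via asynchronous integration: producer and worker only
# share the queue (the standard interface), so either side can fail,
# scale, or be replaced independently -- think Amazon SQS.
tasks = queue.Queue()
results = []

def worker():
    while True:
        item = tasks.get()
        if item is None:                 # sentinel: graceful shutdown
            break
        results.append(item.upper())     # stand-in for real processing
        tasks.task_done()

t = threading.Thread(target=worker)
t.start()
for job in ("resize-image", "send-email"):
    tasks.put(job)                       # producer returns immediately (async)
tasks.join()                             # wait for the backlog to drain
tasks.put(None)
t.join()
print(results)
```

If the worker crashes, the jobs simply wait in the queue: that is the graceful failure handling the paragraph above describes.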
…
5-Think about Services, Not Servers
This is the next step towards building more efficient IT services. When you think about starting an application (in the cloud), the usual mindset is to fire up some web, application, and database servers and start deploying your applications.
But there may be a better way to do this!
What if you don’t even need a server to provide your service? You can design around a serverless infrastructure by using AWS managed services.
Amazon offers a broad range of managed services that you can consume without firing up a server. This means more savings on underutilised resources and less operations and maintenance work for your applications. Here are some of the services you can consume without starting a server:
- Compute: AWS Lambda
- Storage: Amazon S3, Glacier, CloudFront
- Database: Amazon DynamoDB, Redshift
- Application: Amazon Simple Queue Service (SQS), Simple WorkFlow(SWF)
- Analytics: Amazon Kinesis, Elastic MapReduce (EMR), Machine Learning
- Mobile Apps: Amazon Cognito
- IoT: Amazon IoT (Internet of Things)
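The compute entry in the list above is a good illustration of the mindset shift: with AWS Lambda you deploy a function, not a server. A minimal Python handler in the Lambda `handler(event, context)` style; the `name` field in the event is an invented example, not a Lambda convention.

```python
import json

def handler(event, context):
    """AWS Lambda-style entry point: no server to provision or patch --
    you supply only this function and AWS runs it on demand."""
    name = event.get("name", "world")   # 'name' is an assumed example field
    return {"statusCode": 200,
            "body": json.dumps({"greeting": f"Hello, {name}!"})}

# Invoking it locally the way Lambda would (context is unused here):
print(handler({"name": "cloud"}, None))
```

You pay only while the function runs, which is where the savings on underutilised resources come from.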
…
6-A Word on Databases
One approach to deploying databases in the cloud is to create your own: start your servers/instances and install your own database and platform.
The better way, though, is to use the AWS managed databases, which are ready for you to start your service. Amazon provides fully managed relational (Amazon RDS) and NoSQL (Amazon DynamoDB) database services, along with an in-memory caching service (ElastiCache) and a data warehouse service (Amazon Redshift).
Which database to choose depends heavily on your application and data types. Do you need to maintain relational data models? How many concurrent users do you want to support? What are the size and type of your objects? How big is your data, and what is your growth rate?
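The questions above can be caricatured as a toy decision helper. This is an illustrative rule of thumb mapping each question to the AWS services named in this section, not AWS guidance; real choices involve many more factors.

```python
def suggest_database(relational_model: bool, needs_analytics: bool,
                     needs_cache: bool) -> str:
    """Toy decision helper: map the questions in this section to the AWS
    managed services it names. Real selection involves far more factors
    (consistency, latency, object size, growth rate, ...)."""
    if needs_cache:
        return "ElastiCache"          # in-memory caching layer
    if needs_analytics:
        return "Amazon Redshift"      # data warehouse workloads
    return "Amazon RDS" if relational_model else "Amazon DynamoDB"

print(suggest_database(relational_model=False, needs_analytics=False, needs_cache=False))
```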
…
7-No Single Point of Failure
Single point of failure is a very well-known term when designing a highly available system. A system with no single point of failure can tolerate the failure of one or more components (e.g., network, memory module, hard disk, server, …).
When you intend to remove single points of failure, you need to consider all the services and components that your system relies on, end to end. For example, a highly available web service may rely on the following components and elements:
Datacenters, DNS servers, internet providers, routers, switches, routing protocols, DB/web/application servers, ethernet cards, motherboards, memory modules, CPUs, hard disks, power modules, UPSs, power sources, generators, SAN switches, storage arrays, backup devices, …
To remove a single point of failure from your system you need to:
- Make your components redundant.
- Make sure you can quickly detect a failure in your system.
- Utilise reliable storage with built-in data integrity.
- Consider distributing your services across multiple data centres, or even multiple regions.
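The first two points go together: redundancy only removes a single point of failure if you can also detect a failure and route around it. A minimal Python sketch with hypothetical replica names and a faked health check:

```python
def first_healthy(endpoints, is_healthy):
    """Failover across redundant replicas: redundancy alone is not enough --
    you must detect the failure and route traffic to a healthy copy."""
    for ep in endpoints:
        if is_healthy(ep):
            return ep
    raise RuntimeError("all replicas down")

# Hypothetical replicas in two availability zones; the health check is faked.
replicas = ["web-1a.internal", "web-1b.internal"]
down = {"web-1a.internal"}                 # simulate a failed instance
print(first_healthy(replicas, lambda ep: ep not in down))
```

In AWS this detect-and-reroute loop is what Elastic Load Balancing health checks and Route 53 failover routing do for you.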
…
8-Cost Optimization
Building a highly available, reliable service with built-in automation is not the end of your architecting journey. One of the most important aspects of any design is making sure it is optimised for cost.
There is no doubt that moving to the cloud reduces the capital expenses in your IT budget, but you don’t want to spend all your money on your newly introduced operational expenses either.
AWS provides tonnes of options and offerings that can dramatically reduce your costs if you select and deploy them correctly based on your service requirements.
There are different resource types for different purposes with different price/performance:
- Computing resource options: nano/micro/small/medium/large EC2 instance sizes, AWS Lambda, …
- Multiple database options (Amazon RDS, Redshift, DynamoDB, …)
- Different storage options (S3, Glacier, EBS, …)
There are also different purchasing options to choose from:
- On-Demand Instances
- Reserved Instances
- Spot Instances
- Dedicated Hosts
Another way to significantly reduce your resource costs is to use Auto Scaling in your deployment wherever possible; it shuts down underutilised resources and saves you a lot of money.
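The arithmetic behind these options is worth seeing once. All hourly rates below are made-up illustrative figures, not real AWS pricing:

```python
HOURS_PER_MONTH = 730  # average hours in a month

def monthly_cost(hourly_rate: float, hours_running: float) -> float:
    """Simple monthly cost: rate x hours actually running."""
    return round(hourly_rate * hours_running, 2)

# Made-up rates for one instance, to compare the purchasing options:
always_on = monthly_cost(0.10, HOURS_PER_MONTH)  # on-demand, running 24/7
reserved  = monthly_cost(0.06, HOURS_PER_MONTH)  # discounted reserved rate
scaled    = monthly_cost(0.10, 10 * 30)          # auto-scaled: ~10 h/day

print(f"on-demand 24/7: ${always_on}, reserved: ${reserved}, auto-scaled: ${scaled}")
```

The point of the sketch: a steady 24/7 workload favours reserved pricing, while a spiky workload saves more by simply not running outside peak hours.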
…
9-Caching
Adding more resources (CPU, RAM, network, …) to your servers, or more servers to your cluster, may increase processing power, but there is a limit to the performance you can achieve by adding resources alone. Caching is a smarter approach to increasing performance and taking your users’ experience to the next level. But what exactly does caching mean in the cloud?
Caching is storing previously accessed or processed data for faster future access.
You can add caching at different layers of your application design to increase the performance and efficiency of that layer:
- Edge caching is a technique by which you store the static (or dynamic) content of your website (images, videos, …) in multiple global locations close to your users in a CDN (Content Delivery Network) and serve it with high performance when needed (Amazon CloudFront).
- Application data caching is a caching technique at the application layer. In short, it stores processed information or DB query results in memory so that your application can re-use them later instead of re-querying the DB or recalculating the results. Amazon ElastiCache is a web service built on two open-source in-memory caching engines (Memcached and Redis) that provides this feature for you.
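Application data caching can be sketched in a few lines of Python. This is a toy local stand-in for what Memcached or Redis does at scale, with a faked "database query":

```python
import time

class TTLCache:
    """Tiny in-memory application-data cache (the idea behind Memcached/Redis):
    serve repeated reads from memory instead of re-running the query,
    expiring entries after a time-to-live."""
    def __init__(self, ttl_seconds: float):
        self.ttl, self.store = ttl_seconds, {}

    def get_or_compute(self, key, compute):
        hit = self.store.get(key)
        if hit and time.monotonic() - hit[1] < self.ttl:
            return hit[0]                  # cache hit: no DB round trip
        value = compute()                  # cache miss: do the slow work once
        self.store[key] = (value, time.monotonic())
        return value

calls = []
def slow_query():                          # stand-in for a real DB query
    calls.append(1)
    return "rows"

cache = TTLCache(ttl_seconds=60)
cache.get_or_compute("top-products", slow_query)
cache.get_or_compute("top-products", slow_query)   # served from memory
print("queries executed:", len(calls))
```

The TTL is the key design choice: it trades freshness for load, exactly the trade-off you tune in ElastiCache.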
…
10-Design for Security
When moving to the AWS cloud you can take all of your in-house security tools and systems with you. In parallel with what you already have, you can benefit from the features that AWS provides to bring more security to your environment:
- Use Amazon VPC (Virtual Private Cloud) to secure and isolate your servers deep inside the network.
- Use security groups, security policies, and access lists to add another layer of filtering to your servers and services.
- Deploy an application-level firewall, AWS WAF (Web Application Firewall), to protect your web applications from internet threats.
- Use AWS IAM (Identity and Access Management) to control granular access to your AWS services.
Amazon uses the Shared Security Responsibility Model, which means you are responsible for your parts in the cloud while Amazon takes responsibility for the underlying infrastructure. One way to move some of the security responsibilities to Amazon is to use AWS managed services wherever possible; that way Amazon looks after all the maintenance, patching, and security measures of its own managed services while you focus on your applications.
Similar to infrastructure as code, you can express your security as code, which makes your security reproducible and reusable. Using AWS CloudFormation you can create templates for your firewall rules, access lists, security policies, subnets, …, and reuse them in multiple deployments. Security as code leads to more accurate security, along with all the automation and efficiency it provides.
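As a small taste of security as code, here is a Python sketch that builds a CloudFormation fragment describing a security-group firewall rule. The resource name `WebSG` and the parameter values are illustrative; the `AWS::EC2::SecurityGroup` resource type and its `SecurityGroupIngress` property are real CloudFormation constructs.

```python
import json

def security_group_template(port: int, cidr: str) -> dict:
    """Security as code: a reusable CloudFormation fragment describing a
    firewall rule, so the same policy can be reviewed, versioned, and
    redeployed anywhere. ('WebSG' and the values are illustrative.)"""
    return {
        "Resources": {
            "WebSG": {
                "Type": "AWS::EC2::SecurityGroup",
                "Properties": {
                    "GroupDescription": f"Allow TCP {port}",
                    "SecurityGroupIngress": [{
                        "IpProtocol": "tcp",
                        "FromPort": port,
                        "ToPort": port,
                        "CidrIp": cidr,
                    }],
                },
            }
        }
    }

print(json.dumps(security_group_template(443, "0.0.0.0/0"), indent=2))
```

Because the rule is data rather than a console click, it can be code-reviewed and redeployed identically across environments, which is where the accuracy gain comes from.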