
Almost every application and system grows over time (users, data, traffic, …). Different applications grow at different rates, but how do you make sure your infrastructure can keep up as your application and its data get bigger and bigger?
You must build your applications on top of a scalable infrastructure that can grow with your system and handle the extra load over time.
This is why building a highly scalable system is one of the important factors when designing in the cloud.
When thinking about scalability in the IT world, you have two general options:
- Vertical
- Horizontal
Vertical Scaling (Scale Up)
Vertical scaling is basically adding more CPU, RAM, I/O, or networking capacity to a physical or virtual server, storage, or networking device. But there are problems with this kind of scaling:
There is a limit on the amount of resources you can add to a single server (memory and CPU, for example). Plus, you put your application at the risk of that one big server going down.
Horizontal Scaling (Scale Out)
When you scale horizontally, you increase the number of resources, for example by adding more servers or network devices. The benefit of this kind of scaling is:
Technically you can scale with almost no limit and start building an Elastic system that can scale out/in based on your application's requirements.
But you need to consider the fact that not all applications support distributing their load across multiple resources; in other words, your application needs to be Stateless.
Stateless Application
A Stateless Application doesn't need to know about a user's session or any of the user's previous interactions with the application, so any node can be gracefully added or removed with no end-user disruption. For example, a static website that doesn't provide a user login feature doesn't need to keep user session information.
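To make the idea concrete, here is a minimal sketch (the handler function and routes are hypothetical, not from any real framework): a stateless handler builds every response purely from the request itself, never from server-side memory, so any node in the fleet can serve any request.

```python
# Sketch of a stateless request handler: the response depends only on
# the request inputs, with no session lookup or server-side memory.
def handle_request(path: str, query: dict) -> dict:
    """Build a response using only the inputs; no per-user state."""
    if path == "/greet":
        # The greeting is derived solely from the request parameters.
        name = query.get("name", "guest")
        return {"status": 200, "body": f"Hello, {name}!"}
    return {"status": 404, "body": "Not Found"}

# Two identical requests yield identical responses on any server,
# which is exactly what lets you add or remove nodes freely.
print(handle_request("/greet", {"name": "Ada"})["body"])  # Hello, Ada!
```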
Stateless Component
It would be great to have everything stateless, but in reality you almost always have some sort of session/state you need to maintain in your applications. Here are two common situations:
- You may have user session/login information in your web application that you need to maintain.
- In multi-step data processes, you need to keep track of the previous tasks/activities in your process flow.
In these situations you can move the state data off your nodes, store it somewhere else, and let the components be Stateless:
- You can save the user's session data in a managed database (Amazon DynamoDB), detaching session data from your servers and making them stateless.
- If you need to keep your users' files (pictures, data, batch-processing results, …), you can put them on highly available managed shared storage (Amazon S3, Amazon EFS).
- You can use a managed workflow service (Amazon Simple Workflow Service (SWF)) to store the execution history in a central shared location when you want to keep track of a multi-step workflow process and make it stateless.
Stateful Applications/Components
What if you can't (or don't want to) make your component stateless?
- Some legacy applications don't support this by nature.
- Or you may not want your users to move between nodes (e.g., some multi-player gaming applications that require very low latency while users are playing).
There is a way to scale these Stateful Applications/Components, known as Session Affinity strategies. When using session affinity, there are some limitations on how well you can distribute the load as you add or remove nodes, because users are attached to a specific node. Since users can't move between servers, your servers may not end up fully load balanced.
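One simple session-affinity scheme can be sketched as follows (a hash-based router is an illustration, assuming hash-modulo placement; real load balancers typically use cookies or connection tracking instead): each session id hashes to one node, so requests "stick" to it. The drawback the text mentions is visible in the code, since changing the node list remaps users and the spread across nodes is whatever the hash happens to produce.

```python
import hashlib

def pick_node(session_id: str, nodes: list[str]) -> str:
    """Pin a session to one node by hashing its id.

    Deterministic: the same session always routes to the same node,
    which is the 'affinity' - but also why load may stay uneven and
    why resizing the node list reshuffles users.
    """
    digest = hashlib.sha256(session_id.encode()).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

nodes = ["node-a", "node-b", "node-c"]
# The same user lands on the same node, request after request.
assert pick_node("user-42", nodes) == pick_node("user-42", nodes)
```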
After you create your beautifully designed scalable system, you need to distribute the load across its nodes. There are two high-level load distribution strategies you can use: the Push method and the Pull method.
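The two strategies can be sketched side by side (both functions are illustrative toys, not any real load balancer's API): in the push style a distributor assigns work to nodes, for example round-robin; in the pull style idle workers take work from a shared queue at their own pace.

```python
from queue import Queue

def push_distribute(tasks: list, nodes: list[str]) -> dict:
    """Push method: a distributor assigns task i to node i mod N."""
    assignments = {n: [] for n in nodes}
    for i, task in enumerate(tasks):
        assignments[nodes[i % len(nodes)]].append(task)
    return assignments

def pull_distribute(tasks: list, workers: list[str]) -> dict:
    """Pull method: workers drain a shared queue themselves."""
    q = Queue()
    for t in tasks:
        q.put(t)
    done = {w: [] for w in workers}
    while not q.empty():
        for w in workers:  # each worker pulls when it is free
            if q.empty():
                break
            done[w].append(q.get())
    return done
```

In practice the pull method pairs naturally with a managed queue (e.g., Amazon SQS), since a slow or failed worker simply stops pulling and the remaining workers absorb the load.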
Distributed Processing
Imagine a case where you need to process a massive amount of data and you can hardly find a single server capable of handling the processing load. Here is where another type of horizontal scaling comes into play:
Distributed processing splits a big task (and its big data) into many smaller jobs and processes them in parallel across multiple (or many) servers.
Here are two general solutions for handling distributed processing:
- You can use a distributed data processing engine (e.g., Apache Hadoop, Amazon Elastic MapReduce (EMR)) to manage and process a massive amount of distributed data.
- If you have a large stream of real-time data, you can use Amazon Kinesis to divide the data into multiple portions and process them with multiple computing resources in your server farm.
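The split/process/merge idea behind these engines can be shown in miniature (a toy sketch, with threads standing in for separate servers and a made-up word-count job): the dataset is divided into chunks, each chunk is processed independently, and the partial results are merged, which is the essence of what Hadoop or EMR do at cluster scale.

```python
from concurrent.futures import ThreadPoolExecutor

def split(data: list, n_chunks: int) -> list[list]:
    """Divide the dataset into roughly equal chunks ('map' inputs)."""
    size = max(1, len(data) // n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]

def process_chunk(chunk: list[str]) -> int:
    """The per-worker job: count the words in one chunk."""
    return sum(len(line.split()) for line in chunk)

def distributed_word_count(lines: list[str], workers: int = 4) -> int:
    """Fan the chunks out to workers, then merge ('reduce') results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(process_chunk, split(lines, workers))
    return sum(partials)

print(distributed_word_count(["a b c", "d e", "f"]))  # 6
```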