When development of a new application starts, the exact number of concurrent users is often not known. This is true for B2C apps, and sometimes even for B2B apps in growing organizations.
What is the problem?
An application's performance obviously depends on how efficient its code is. Nowadays, most teams run load tests to ensure that the application supports at least a certain number of concurrent users, so inefficiencies in the application code can be caught in the earlier stages of the development cycle.
The main difference between load testing and production is the nature of the load. In load testing the load is almost constant; in production it may vary depending on many factors.
For example, the number of users on an eCommerce website may be very high in the morning before office hours and in the evening after office hours, and extremely high during the festival season. On the other hand, the load on applications built for auditors may vary with the time of the financial year.
The performance of an application deteriorates if the load is very high and the underlying hardware resources cannot support it. The application may not respond within the expected time, or it might not be accessible at all. This results in a poor user experience and, if it happens frequently, may lead to loss of business as well.
What is the solution?
The obvious solution is to increase the hardware resources for the applications.
For on-premises applications, organizations try to buy the highest possible configuration, because the business cannot afford losses caused by poor application performance.
This approach has two drawbacks:
- Firstly, it is not the most efficient solution if the load is variable, because the full capacity is not in use most of the time, yet the organization has already invested in the hardware and the money is already spent.
- Secondly, if the application load grows beyond the capacity of the purchased hardware a few months after launch, you have to buy new hardware again, which means more investment.
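To make the first drawback concrete, here is a minimal Python sketch that estimates how much of a fixed, peak-sized capacity is actually used over a day. The hourly load figures and the function name `average_utilization` are hypothetical and chosen only for illustration; they are not measurements from any real system.

```python
def average_utilization(hourly_load, capacity):
    """Return the average fraction of a fixed capacity that is in use."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # Load above capacity cannot be served, so it is capped at capacity.
    used = sum(min(load, capacity) for load in hourly_load)
    return used / (capacity * len(hourly_load))


# Hypothetical 24-hour profile (concurrent users per hour) with a
# morning peak and an evening peak.
hourly_load = [100] * 7 + [900] * 2 + [300] * 8 + [1000] * 3 + [200] * 4

# On-premises approach: hardware is sized for the highest expected load.
capacity = max(hourly_load)

utilization = average_utilization(hourly_load, capacity)
print(f"Average utilization: {utilization:.0%}")
```

With a profile like this, most of the purchased capacity sits idle outside the peak hours, which is exactly the inefficiency described above.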
Using Azure for your applications is almost the same as renting the infrastructure. You need only a very small upfront investment to get your application working.
Then, based on usage reports, you can decide whether the application servers need more resources. There are two types of scaling applicable to Azure services:
- Scale up, or vertical scaling, means increasing the resources of a given server. The resources of a server are grouped under pricing tiers. Scale down means reducing the configuration of the server. For example, scale up applies if you want more RAM or more processing power for a virtual machine.
- Scale out, or horizontal scaling, means increasing the number of servers (or instances of a server), where every server has the same configuration. Scale in means reducing the number of servers. For example, if your application is hosted on 1 server, scaling out means hosting it on, say, 3 identical servers.
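The difference between the two kinds of scaling can be sketched with a small Python model. This is only an illustration: the `Deployment` class, its fields, and the numbers are hypothetical and do not correspond to any Azure API or real server size.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Deployment:
    """Hypothetical description of the servers behind an application."""
    cores_per_server: int
    ram_gb_per_server: int
    instances: int

    @property
    def total_cores(self):
        return self.cores_per_server * self.instances


def scale_up(deployment, cores, ram_gb):
    """Vertical scaling: each server gets bigger, the count stays the same."""
    return replace(deployment, cores_per_server=cores, ram_gb_per_server=ram_gb)


def scale_out(deployment, instances):
    """Horizontal scaling: more identical servers, each one unchanged."""
    return replace(deployment, instances=instances)


base = Deployment(cores_per_server=1, ram_gb_per_server=2, instances=1)
bigger = scale_up(base, cores=2, ram_gb=4)   # still 1 server, but larger
wider = scale_out(base, instances=3)         # 3 servers of the original size

print(bigger)
print(wider)
```

Scaling down and scaling in are the same two operations called with smaller values.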
How to scale up?
Vertical scaling, or scale up, is applicable to almost all Azure services. It is best explained by the diagram below.
For example, applying scale up to a virtual machine means changing the size of the virtual machine.
The snapshot below shows how an App Service can be scaled up. It shows two settings:
- Pricing tier (S1 in the snapshot below), which determines the amount of memory and processing power allocated to each server.
- Number of instances (1 in the snapshot below), which tells how many servers are available.
The Azure Portal also shows what the server configuration will be for each pricing tier and recommends when to use each tier.
As the name suggests, each pricing tier has a cost associated with it, so if you scale up, the monthly cost of the resource increases.
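As a rough illustration of how tier and instance count combine into a monthly bill, here is a minimal Python sketch. The tier names and prices are placeholders invented for this example, not real Azure prices, and the sketch assumes the cost is simply the per-instance price multiplied by the number of instances.

```python
# Placeholder monthly price per instance for each tier.
# These are NOT real Azure prices; check the Azure Portal for actual figures.
HYPOTHETICAL_MONTHLY_PRICE = {
    "small": 50.0,
    "medium": 100.0,
    "large": 200.0,
}


def monthly_cost(tier, instances):
    """Estimate monthly cost as price per instance times instance count."""
    if instances < 1:
        raise ValueError("at least one instance is required")
    return HYPOTHETICAL_MONTHLY_PRICE[tier] * instances


before = monthly_cost("small", 1)
after_scale_up = monthly_cost("medium", 1)   # same count, higher tier
after_scale_out = monthly_cost("small", 3)   # same tier, more instances

print(before, after_scale_up, after_scale_out)
```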
Why is scaling out needed?
As stated earlier, vertical scaling is done by changing the pricing tier in Azure. But there is a limit to how far you can scale up: eventually you hit the maximum memory or processing power that a single machine can support. That is why horizontal scaling is really helpful.
Autoscaling mostly means horizontal scaling: adding or removing instances based on certain preset conditions.
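The idea of "preset conditions" can be sketched as a simple threshold rule in Python. This is a toy model, not how Azure implements autoscaling: the function name `desired_instances`, the CPU thresholds, and the instance limits are all assumptions made for illustration.

```python
def desired_instances(current, avg_cpu_percent, *,
                      scale_out_above=70.0, scale_in_below=30.0,
                      minimum=1, maximum=5):
    """Apply one threshold rule and return the new instance count."""
    if avg_cpu_percent > scale_out_above:
        # Load is high: add one instance, but never exceed the maximum.
        return min(current + 1, maximum)
    if avg_cpu_percent < scale_in_below:
        # Load is low: remove one instance, but never go below the minimum.
        return max(current - 1, minimum)
    # Load is within the comfortable band: keep the current count.
    return current


# Replay a hypothetical series of average CPU readings (in percent).
instances = 1
history = []
for cpu in [40, 80, 85, 90, 50, 20, 10]:
    instances = desired_instances(instances, cpu)
    history.append(instances)

print(history)
```

The minimum and maximum bounds matter in practice: the maximum caps the cost, and the minimum keeps the application reachable when the load is low.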
Azure Monitor collects metrics and logs from almost all types of resources. This data can be used to configure autoscaling for the resources that serve the application. I will try to explain autoscaling in the next article.
I hope you liked this article. Let me know your thoughts.