Cloud computing is the delivery of computing services, including Servers, Databases, Storage, Networking & others, over the internet. Public, Private & Hybrid clouds are different ways of deploying cloud computing.
- In a public cloud, the cloud resources are owned by a 3rd party cloud service provider
- A private cloud consists of computing resources used exclusively by one business or organization
- A hybrid cloud provides the best of both worlds by combining on-premises infrastructure and private cloud with public cloud
Microsoft, Google, Amazon, Oracle, IBM, and others provide cloud platforms on which users can host and run practical business solutions. Per Gartner, Inc., the worldwide public cloud services market is forecast to grow 17% in 2020 to total $266.4 billion, up from $227.8 billion in 2019, and to reach $354.6 billion in 2022.
Various types of Instances, workloads & options are available as part of the cloud ecosystem, e.g. IaaS, PaaS, SaaS, Multi-cloud, and Serverless.
When very large, large, and medium enterprises decide to move their IT environment from on-premise to cloud, they typically move some or most of it into the cloud and keep the rest under their control on-premise. Various factors impact this decision, to name a few:
- ROI vs Cost of Cloud Instance, Operation cost
- Architecture dependency of the application, i.e. whether it is monolithic or multi-tier or polyglot or hybrid cloud
- Requirement and need for elasticity and scalability
- Availability of the right solution from the cloud provider
- Security of some key data
Once these hurdles are crossed and the IT environment is cloud-enabled, the next challenge is monitoring the Cloud-enabled IT environment. Here are some of the business and IT challenges:
1. How to ensure the various workloads & Instances are working as expected?
While the cloud provider may give high availability & uptime depending on the tier we choose, it is important that our IT team monitors the environment, especially in the case of IaaS and, to some extent, PaaS as well.
2. How to ensure the Instances are optimally used in terms of compute and storage?
Cloud providers give most of the metrics around the Instances, though they may not provide all the metrics we need to make decisions in every scenario.
The disadvantages of this model are cost, latency, and the fact that it is not straightforward. For example, the Log Analytics that comes with Azure involves a cost for every MB/GB of data that is stored, and if there is latency or delay in getting the right metrics at the right time, you may not get the right result.
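As a rough illustration of the utilization question above, the sketch below labels instances as under- or over-utilized from CPU samples. It is a minimal example under stated assumptions, not how any particular cloud provider or ZIF does it: the thresholds, the function names, and the shape of the input data are all hypothetical.

```python
from statistics import mean

# Hypothetical thresholds (percent CPU); tune them for your own workloads.
UNDERUSED_CPU_PCT = 20.0
OVERUSED_CPU_PCT = 80.0


def classify_utilization(cpu_samples):
    """Label one instance from its average CPU utilization (percent)."""
    if not cpu_samples:
        return "no-data"
    avg = mean(cpu_samples)
    if avg < UNDERUSED_CPU_PCT:
        return "under-utilized"
    if avg > OVERUSED_CPU_PCT:
        return "over-utilized"
    return "ok"


def review_instances(metrics_by_instance):
    """Map each instance name to its utilization label."""
    return {
        name: classify_utilization(samples)
        for name, samples in metrics_by_instance.items()
    }
```

Calling `review_instances({"vm-a": [10.0], "vm-b": [85.0]})` would label `vm-a` as under-utilized and `vm-b` as over-utilized. A real check would usually look at memory, storage, and a longer time window as well.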
3. How to ensure that an Application, or the components of a single solution, spread across on-premise and Cloud environments is working as expected?
Some cloud providers give tools for integrating the metrics from the on-premise environment into the cloud environment to get a shared view.
The disadvantage of this model is that it is not possible to bring all sorts of data together to get straightforward insights. That is, observability is always a question. The ownership of achieving observability lies with the IT team that handles the data.
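To make the "shared view" idea concrete, here is a minimal sketch of normalizing metrics from an on-premise source and a cloud source into one common record type. Both input formats (`host`/`cpu_pct` and `provider`/`instance_id`/`cpu_fraction`) are invented for illustration, as is the `MetricSample` type; real agents and provider exports will look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricSample:
    source: str    # "on-premise" or a cloud provider name
    resource: str  # host name or instance id
    metric: str
    value: float


def from_on_premise(record):
    # Hypothetical on-premise agent format: {"host": ..., "cpu_pct": 0..100}
    return MetricSample(
        "on-premise", record["host"], "cpu_percent", float(record["cpu_pct"])
    )


def from_cloud(record):
    # Hypothetical cloud export format:
    # {"provider": ..., "instance_id": ..., "cpu_fraction": 0..1}
    return MetricSample(
        record["provider"],
        record["instance_id"],
        "cpu_percent",
        float(record["cpu_fraction"]) * 100.0,
    )


def shared_view(on_prem_records, cloud_records):
    """Merge both sources into one list, busiest resource first."""
    samples = [from_on_premise(r) for r in on_prem_records]
    samples += [from_cloud(r) for r in cloud_records]
    return sorted(samples, key=lambda s: s.value, reverse=True)
```

The design point is that units and field names are reconciled once, at the edge, so everything downstream (dashboards, alerts) works on a single record shape regardless of where the data came from.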
4. How to ensure the Multi-Cloud + On-Premise environment is effectively monitored & utilized to ensure the best End-user experience?
Multi-Cloud environment: With the rapid growth of Microservices Architecture & Container-based, cloud-enabled models, it is quite natural for an Enterprise to choose the best from different cloud providers like Azure, AWS, Google & others.
There is little support from cloud providers in this space. In fact, some cloud providers do not even support this scenario.
5. How to get a single-pane view for troubleshooting & root cause analysis?
When problems occur in the Application, Database, Middle Tier, Network & 3rd party layers that are spread across a multi-cluster, multi-cloud, elastic environment, it is very important to get a Unified view of the entire environment.
ZIF (Zero Incident Framework™) provides a single platform for Cloud Monitoring.
ZIF has Discovery, Monitoring, Prediction & Remediation capabilities that fit seamlessly into a cloud-enabled solution. ZIF provides a unified dashboard with insights across all layers of IT infrastructure distributed across On-premise hosts, Cloud Instances & Containers.
Core features & benefits of ZIF for Cloud Monitoring are:
1. Discovery & Topology
- Discovers and provides dynamic mapping of resources across all layers.
- Provides real-time mapping of applications and their dependent layers irrespective of whether the components live on-premise, on cloud, or containerized in cloud.
- Dynamically builds the topology of all layers, which helps in making effective decisions.
2. Observability across Multi-Cloud, Hybrid-Cloud & On-Premise tiers
- It is not just about collecting metrics; it is very important to analyze the monitored data and provide meaningful insights.
- When the IT infrastructure is spread across multiple cloud platforms like Azure, AWS, Google Cloud, and others, it is important to get a unified view of your entire environment along with the on-premise servers.
- The health of each layer is represented in topology format, which helps to understand the impact and take the necessary actions.
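To illustrate how a topology helps in understanding impact, the sketch below walks a small dependency graph to find every component affected when one layer becomes unhealthy. This is a generic breadth-first traversal, not ZIF's implementation; the `DEPENDS_ON` topology and the component names are hypothetical.

```python
from collections import deque

# Hypothetical topology: each component maps to the components it depends on.
DEPENDS_ON = {
    "web-app": ["api"],
    "api": ["database", "cache"],
    "database": [],
    "cache": [],
}


def impacted_by(failed, depends_on):
    """Return every component that directly or transitively depends on `failed`."""
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents = {node: [] for node in depends_on}
    for node, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(node)

    impacted, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    return impacted
```

With this example topology, `impacted_by("database", DEPENDS_ON)` returns the `api` and `web-app` components, while a failure of `web-app` itself impacts nothing upstream of it.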
3. Prediction driven decision for resource optimization
- The prediction engine analyses the metrics of cloud resources and predicts the resource usage. This helps the resource owner to take proactive action rather than being reactive.
- Provides meaningful insights and alerts in terms of the surge in the load, the growth in the number of VMs and containers, and the usage of resources across other workloads.
- Supports authorizing Elasticity & Scalability decisions based on real-time metrics.
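The idea of predicting resource usage can be sketched with a simple straight-line trend. The source does not describe the algorithm behind ZIF's prediction engine, so the example below swaps in ordinary least-squares extrapolation purely as an illustration; the function names and the threshold are assumptions.

```python
def linear_forecast(samples, steps_ahead):
    """Extrapolate equally spaced samples with an ordinary least-squares line."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    # The last sample sits at x = n - 1, so look steps_ahead beyond it.
    return intercept + slope * (n - 1 + steps_ahead)


def will_breach(samples, steps_ahead, threshold):
    """True if the fitted trend reaches `threshold` within `steps_ahead` steps."""
    return linear_forecast(samples, steps_ahead) >= threshold
```

For disk usage samples of 50, 55, 60, and 65 percent, the fitted line rises by 5 points per step, so the forecast two steps ahead is 75 percent. A straight line ignores seasonality and bursts, which is why production prediction engines typically use richer models.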
4. Container & Microservice support
- Understand the resource utilization of your containers that are hosted in Cloud & On-Premise.
- Know the bottlenecks around the Microservices and tune your environment for the spikes in load.
- Provides full support for monitoring applications distributed across your local host & containers in cloud in a multi-cluster setup.
5. Root cause analysis made simple
- Enables quick root cause analysis by analysing the various causes captured by ZIF Monitor instead of going through the environment layer by layer. This frees up time for solving and arresting the problem instead of spending effort on identifying the root cause.
- Provides insights across your workload, including the impact due to 3rd party layers.
- Irrespective of whether the workload and instance are on-premise or on Azure, AWS, or another provider, the ZIF automation module can automate activities ranging from basic to complex.
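One common heuristic for narrowing down a root cause, instead of inspecting layer by layer, is to look for unhealthy components whose own dependencies are all healthy. The sketch below shows that heuristic only; it is not a description of how ZIF Monitor performs root cause analysis, and the `TOPOLOGY` map and component names are hypothetical.

```python
# Hypothetical topology: each component maps to the components it depends on.
TOPOLOGY = {
    "web-app": ["api"],
    "api": ["database", "third-party-auth"],
    "database": [],
    "third-party-auth": [],
}


def probable_root_causes(unhealthy, depends_on):
    """Return unhealthy components that have no unhealthy dependency.

    If a component is unhealthy but everything it depends on is healthy,
    the fault most likely originates in that component itself.
    """
    unhealthy = set(unhealthy)
    return sorted(
        node
        for node in unhealthy
        if not any(dep in unhealthy for dep in depends_on.get(node, []))
    )
```

If `web-app`, `api`, and `database` are all reporting problems, `probable_root_causes` points at `database`, because the other two sit above it in the dependency chain and are probably suffering from its failure.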
6. Ensure End User Experience
- Helps to improve the experience of end users who are served by workloads in the cloud.
- ZIF tracing traces each & every request of each & every user, which lets ZIF unearth performance bottlenecks across all layers. This in turn helps to address the problem and improve the User Experience.
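The request tracing described above can be pictured as a list of timed spans per request, one per layer the request passed through. The following sketch finds the slowest layer for each request and counts how often each layer is the bottleneck. The trace format (tuples of layer name and duration in milliseconds) and the function names are assumptions made for this example, not ZIF's data model.

```python
from collections import defaultdict


def slowest_layer(spans):
    """Return the layer with the largest total time in one request.

    `spans` is a list of (layer, duration_ms) tuples for a single request.
    """
    totals = defaultdict(float)
    for layer, duration_ms in spans:
        totals[layer] += duration_ms
    return max(totals, key=totals.get)


def bottleneck_counts(traces):
    """Count, per layer, how many requests had that layer as their slowest."""
    counts = defaultdict(int)
    for spans in traces.values():
        if spans:  # skip requests with no recorded spans
            counts[slowest_layer(spans)] += 1
    return dict(counts)
```

Aggregating per-request bottlenecks this way shows whether slowness is concentrated in one layer, such as the database, or scattered across the stack.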
Cloud and Container Platform Support
ZIF seamlessly integrates with the following Cloud & Container environments:
- Microsoft Azure
- Google Cloud
- Grafana Cloud
About the Author –
Suresh Kumar Ramasamy
Suresh heads the Monitor component of ZIF at GAVS. He has 20 years of experience in Native Applications, Web, Cloud, and Hybrid platforms, from Engineering to Product Management. He has designed & hosted monitoring solutions. He has been instrumental in bringing components together to structure the Environment Performance Management suite of ZIF Monitor.
Suresh enjoys playing badminton with his children. He is passionate about gardening, especially medicinal plants.