Machine Learning: Building Clustering Algorithms

Gireesh Sreedhar KP


Clustering is a widely used Machine Learning (ML) technique. It is an unsupervised ML approach that learns patterns from input data without labeled training examples, and it can also process high-dimensional data. This makes clustering the method of choice for a wide variety of ML problems.

Since clustering is so widely used, it is critical for Data Scientists and ML Engineers to understand how to build clustering algorithms in practice, even though many of us already have a high-level understanding of clustering. Let us walk through the approach to building a clustering algorithm from scratch.

What is Clustering and how does it work?

Clustering is finding groups of objects (data) such that objects in the same group will be similar (related) to one another and different from (unrelated to) objects in other groups.

Clustering works on the concept of similarity/dissimilarity between data points. The higher the similarity between data points, the more likely they are to belong to the same cluster; the higher the dissimilarity, the more likely they are to be placed in different clusters.

Similarity is a numerical measure of how alike two data objects are; it is higher when objects are more alike. Dissimilarity is a numerical measure of how different two data objects are; it is lower when objects are more alike.

We create a ‘Dissimilarity Matrix’ (also called a Distance Matrix) as the input to a clustering algorithm; it gives the algorithm the notion of dissimilarity between objects. We build a dissimilarity matrix for each data attribute considered for clustering and then combine these per-attribute matrices into an overall dissimilarity matrix. The dissimilarity matrix is an NxN square matrix, where N is the number of data points considered for clustering, and each element gives the dissimilarity between two objects.

Building Clustering Algorithm

Building a clustering algorithm involves the following:

  • Selection of the most suitable clustering techniques and algorithms to solve the problem. This step needs close collaboration among SMEs, business users, data scientists, and ML engineers. Based on inputs and a study of the data, a list of candidate algorithms (one or more) is selected for modeling and development, and the tuning parameters are decided (to give the algorithm more flexibility for tuning and for learning from SMEs).
  • Selection of the data attributes for the dissimilarity matrix and of the methodology for forming it (discussed later).
  • Building the algorithms and carrying out a design of experiments to select the best-suited algorithm and algorithm parameters for implementation.
  • Implementation of the algorithm and fine-tuning of parameters as required.

Building a Dissimilarity matrix:

There are different approaches to building a dissimilarity matrix. Here we consider a dissimilarity matrix containing the distances between data objects (called a Distance Matrix); an alternative approach is to feed in coordinate points and let the algorithm compute the distances. Let us consider a group of N data objects to be clustered based on three data attributes of each data object. The steps for building a Distance matrix are:

Build a distance matrix for each individual data attribute. Here we build three individual distance matrices (one for each attribute), each containing the distances between data objects calculated for that attribute. The data is always scaled to the range [0, 1] using one of the standard normalization methods, such as min-max scaling. The distance matrix for an attribute has the properties listed below.

Properties of Distance Matrix:

  1. The Distance Matrix is an NxN square matrix (N is the number of objects in the clustering space).
  2. The matrix is symmetric with a zero diagonal (the distance of an object from itself is zero).
  3. For categorical data, the distance between two points = 0 if both have the same value; = 1 otherwise.
  4. For numeric/ordered data, the distance between two points = the absolute difference between the scaled attribute values of the two points.
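
The properties above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names and the sample attributes (ages and colors) are hypothetical, and min-max scaling is assumed as the normalization method.

```python
def min_max_scale(values):
    """Scale numeric values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def numeric_distance_matrix(values):
    """N x N matrix of absolute differences between scaled values."""
    scaled = min_max_scale(values)
    return [[abs(a - b) for b in scaled] for a in scaled]


def categorical_distance_matrix(labels):
    """N x N matrix: 0 if two labels are the same, 1 otherwise."""
    return [[0.0 if a == b else 1.0 for b in labels] for a in labels]


# Hypothetical sample data: one numeric and one categorical attribute
ages = [20, 30, 40]
colors = ["red", "blue", "red"]

age_dist = numeric_distance_matrix(ages)
color_dist = categorical_distance_matrix(colors)
```

Both matrices are symmetric with a zero diagonal, and every entry lies in [0, 1], so attributes of different types end up on a comparable scale.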

Build the complete distance matrix. Here we combine the distance matrices of the individual attributes into a complete distance matrix, which forms the input for the clustering algorithm.

Complete distance matrix = (element-wise sum of the individual attribute-level matrices)/3

Generalized complete distance matrix = (element-wise sum of the individual attribute-level matrices)/M, where M is the number of attribute-level matrices formed.
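
The generalized formula amounts to an element-wise mean. A minimal sketch in plain Python follows; the function name and the three small sample matrices are hypothetical, and equal weighting of all attributes is assumed, as in the formula above.

```python
def combine_distance_matrices(matrices):
    """Element-wise mean of M equally sized N x N distance matrices."""
    if not matrices:
        raise ValueError("at least one attribute-level matrix is required")
    m = len(matrices)
    n = len(matrices[0])
    for mat in matrices:
        if len(mat) != n or any(len(row) != n for row in mat):
            raise ValueError("all matrices must be N x N with the same N")
    return [
        [sum(mat[i][j] for mat in matrices) / m for j in range(n)]
        for i in range(n)
    ]


# Hypothetical attribute-level matrices for N = 2 objects and M = 3 attributes
d1 = [[0.0, 0.5], [0.5, 0.0]]
d2 = [[0.0, 1.0], [1.0, 0.0]]
d3 = [[0.0, 0.0], [0.0, 0.0]]

complete = combine_distance_matrices([d1, d2, d3])
```

Because each attribute-level matrix is symmetric with a zero diagonal, the combined matrix keeps both properties.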

Considerations for the selection of clustering algorithms:

Before the selection of a clustering algorithm, the following considerations need to be evaluated to identify the right clustering algorithms for the given problem.

  • Partition criteria: Single-level vs hierarchical partitioning
  • Separation of clusters: Exclusive (one data point belongs to only one class) vs non-exclusive (one data point can belong to more than one class)
  • Similarity measures: Distance-based vs Connectivity-based
  • Clustering space: Full space (used when low dimension data is processed) vs Subspace (used when high dimension data is processed, where only subspace can be processed and interesting clustering can be formed)
  • Attributes processing: Ability to deal with different types of attributes: Numerical, Categorical, Text, Media, a combination of data types in inputs
  • Discovery of clusters: Ability to form a predefined number of clusters or an arbitrary number of clusters
  • Ability to deal with noise in data
  • Scalability to deal with huge volumes of data, high dimensionality, incremental, or streaming data.
  • Ability to deal with constraints on user preference and domain requirements.
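
To make one combination of these choices concrete (hierarchical, exclusive, distance-based, with a predefined number of clusters), here is a small from-scratch sketch of single-linkage agglomerative clustering that takes a precomputed distance matrix as input. The function name and the sample matrix are hypothetical, and the brute-force search is meant for illustration only; it would not scale to large data.

```python
def single_linkage_clusters(dist, k):
    """Merge the two closest clusters repeatedly until k clusters remain.

    dist is an N x N symmetric distance matrix; the distance between two
    clusters is the smallest distance between any pair of their members.
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of objects")
    clusters = [[i] for i in range(n)]  # start with one cluster per object
    while len(clusters) > k:
        best = None  # (distance, index of cluster a, index of cluster b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical complete distance matrix for four objects
dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]

groups = single_linkage_clusters(dist, k=2)
```

Here objects 0 and 1 end up in one cluster and objects 2 and 3 in the other, because the distances within each pair are much smaller than the distances across pairs.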

Application of Clustering

There are broadly two applications of clustering.

As an ML tool to get insight into data. Examples include building recommendation systems or customer segmentation by clustering like-minded users or similar products, social network analysis, and biological data analysis such as gene/protein sequence analysis.

As a pre-processing or intermediate step for other classes of algorithms. For example, some pattern-mining algorithms use clustering to group the patterns mined and select the most representative ones instead of keeping every pattern mined.

Conclusion

Building an ML algorithm is teamwork: SMEs, users, data scientists, and ML engineers each play their part in its success. This article gives the steps to build a clustering algorithm and can be used as reference material when you attempt to build your own.

About the Author:

Gireesh is a part of the projects run in collaboration with IIT Madras for developing AI solutions and algorithms. His interests include Data Science, Machine Learning, Financial markets, and Geo-politics. He believes that he is competing against himself to become better than who he was yesterday. He aspires to become a well-recognized subject matter expert in the field of Artificial Intelligence.

Assess Your Organization’s Maturity in Adopting AIOps

Artificial Intelligence for IT operations (AIOps) is adopted by organizations to deliver tangible Business Outcomes. These business outcomes have a direct impact on companies’ revenue and customer satisfaction.

A survey from AIOps Exchange 2019 reports that 84% of the Business Owners who took part confirmed that they are actively evaluating AIOps for adoption in their organizations.

So, is AIOps just automation? Absolutely NOT!!

Artificial Intelligence for IT operations implies the implementation of true Autonomous Artificial Intelligence in ITOps, which needs to be adopted as an organization-wide strategy. Organizations will have to assess their existing landscape and processes, and decide where to start. That is the only way to achieve the true implementation of AIOps.

Every organization trying to evaluate AIOps as a strategy should read through this article to understand their current maturity, and then move forward to reach the pinnacle of Artificial Intelligence in IT Operations.

The primary Success Factor in adopting AIOps is derived from the Business Outcomes the organization is trying to achieve by implementing AIOps; that is the only way to calculate ROI.

There are 4 levels of Maturity in AIOps adoption. Based on our experience in developing an AIOps platform and implementing the platform across multiple industries, we have arrived at these 4 levels. Assessing an organization against each of these levels helps in achieving the goal of TRUE Artificial Intelligence in IT Operations.

Level 1: Knee-jerk

Events and logs are generated in silos and collected from various applications and devices in the infrastructure. These are used to generate alerts that are sent to command centres for escalation as per the SOPs (standard operating procedures) defined. The engineering teams work in silos, unaware of the business impact that these alerts could potentially create. Here, operations are very reactive, which could cost the organization millions of dollars.

Level 2: Unified

All events, logs, and alerts have been integrated into one central location, and the ITSM process has been unified. This helps in breaking silos, and engineering teams are better prepared to tackle business impacts. SOPs have been adjusted since the process is unified, but this is still reactive incident management.

Level 3: Intelligent

Machine Learning algorithms (either supervised or unsupervised) have been implemented on the unified data to derive insights. There are baseline metrics that are calibrated and will be used as a reference for future events. With more data, the metrics get richer. The IT operations team can correlate incidents/events with business impacts by leveraging AI & ML. If the Mean Time To Resolve (MTTR) for incidents has been reduced by automated identification of the root cause, then the organization has attained level 3 maturity in AIOps.
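
As a rough illustration of how the MTTR mentioned above could be tracked, the sketch below averages the time between the opening and the resolution of each incident. The function name and the sample incidents are hypothetical; real ITSM tools may define the start and end of an incident differently.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(incidents):
    """Average resolution time over (opened, resolved) datetime pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (resolved - opened for opened, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incidents: one resolved in 2 hours, one in 4 hours
incidents = [
    (datetime(2020, 1, 1, 9, 0), datetime(2020, 1, 1, 11, 0)),
    (datetime(2020, 1, 2, 14, 0), datetime(2020, 1, 2, 18, 0)),
]

mttr = mean_time_to_resolve(incidents)
```

Comparing this value before and after automated root cause identification is introduced gives a simple check for the level 3 criterion.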

Level 4: Predictive & Autonomous

The pinnacle of AIOps is level 4. If incidents and performance degradation of applications can be predicted by leveraging Artificial Intelligence, it implies improved application availability. Autonomous remediation bots can be triggered spontaneously based on the predictive insights, to fix incidents that are prone to happen in the enterprise. Level 4 is a paradigm shift in IT operations – moving operations entirely from being reactive, to becoming proactive.

Conclusion:

As IT operations teams move up each level, the essential goal to keep in mind is the long-term strategy that needs to be attained by adopting AIOps. Artificial Intelligence has matured over the past few decades, and it is up to AIOps platforms to embrace it effectively. While choosing an AIOps platform, measure the maturity of the platform’s artificial intelligence coefficient.

About the Author:

Anoop Aravindakshan (Principal Consultant Manager) at GAVS Technologies.

An evangelist of Zero Incident Framework™, Anoop has been a part of the product engineering team for a long time and has recently forayed into product marketing. He has over 14 years of experience in Information Technology across various verticals, which include Banking, Healthcare, Aerospace, Manufacturing, CRM, Gaming, and Mobile.

Prediction for Business Service Assurance

Artificial Intelligence for IT operations or AIOps has exploded over the past few years. As more and more enterprises set about their digital transformation journeys, AIOps becomes imperative to keep their businesses running smoothly. 

AIOps uses several technologies like Machine Learning and Big Data to automate the identification and resolution of common Information Technology (IT) problems. The systems, services, and applications in a large enterprise produce volumes of log and performance data. AIOps uses this data to monitor the assets and gain visibility into the behaviour and dependencies among these assets.

According to a Gartner publication, the adoption of AIOps by large enterprises would rise to 30% by 2023.

ZIF – The ideal AIOps platform of choice

Zero Incident Framework™ (ZIF) is an AIOps-based TechOps platform that enables proactive detection and remediation of incidents, helping organizations drive towards a Zero Incident Enterprise™.

ZIF comprises 5 modules.

At the heart of ZIF lie its Analyze and Predict (A&P) modules, which are powered by Artificial Intelligence and Machine Learning techniques. From the business perspective, the primary goal of A&P would be 100% availability of applications and business processes.

Let us understand more about the Predict module of ZIF.

Predictive Analytics is one of the main USPs of the ZIF platform. ZIF encompasses Supervised, Unsupervised and Reinforcement Learning algorithms for the realization of various business use cases.

How does the Predict Module of ZIF work?

Through its data ingestion capabilities, the ZIF platform can receive and process all types of data (both structured and unstructured) from various tools in the enterprise. The types of data can be related to alerts, events, logs, performance of devices, relations of devices, workload topologies, network topologies etc. By analyzing all these data, the platform predicts the anomalies that can occur in the environment. These anomalies get presented as ‘Opportunity Cards’ so that suitable action can be taken ahead of time to eliminate any undesired incidents from occurring. Since this is ‘Proactive’ and not ‘Reactive’, it brings about a paradigm shift to any organization’s endeavour to achieve 100% availability of their enterprise systems and platforms. Predictions are done at multiple levels – application level, business process level, device level etc.

Sub-functions of Prediction Module

How does the Predict module manifest to enterprise users of the platform?

The Predict module categorizes the opportunity cards into three swim lanes.

  1. Warning swim lane – Opportunity Cards that have an “Expected Time of Impact” (ETI) beyond 60 minutes.
  2. Critical swim lane – Opportunity Cards that have an ETI within 60 minutes.
  3. Processed / Lost – Opportunity Cards that have been processed or lost without taking any action.
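
The swim-lane rule above can be expressed as a small function. This is a hypothetical sketch and not ZIF's implementation: the names are invented for illustration, a card is assumed to carry its ETI in minutes plus a flag saying whether it is still open, and an ETI of exactly 60 minutes is assumed to count as "within 60 minutes".

```python
from enum import Enum


class SwimLane(Enum):
    WARNING = "Warning"
    CRITICAL = "Critical"
    PROCESSED_OR_LOST = "Processed / Lost"


# Threshold taken from the list above; the inclusive boundary is an assumption
CRITICAL_WINDOW_MINUTES = 60


def assign_swim_lane(eti_minutes, is_open=True):
    """Place an opportunity card in a swim lane from its ETI and status."""
    if not is_open:
        # Card was already processed, or its ETI passed without any action
        return SwimLane.PROCESSED_OR_LOST
    if eti_minutes <= CRITICAL_WINDOW_MINUTES:
        return SwimLane.CRITICAL
    return SwimLane.WARNING


lane_far = assign_swim_lane(eti_minutes=180)
lane_near = assign_swim_lane(eti_minutes=45)
lane_done = assign_swim_lane(eti_minutes=45, is_open=False)
```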

A few of the enterprises that have realized the power of ZIF’s Prediction Module

  • A manufacturing giant in the US
  • A large non-profit mental health and social service provider in New York
  • A large mortgage loan service provider in the US
  • Two of the largest private sector banks in India

For more detailed information on GAVS’ Predict, or to request a demo, please visit https://zif.ai/products/predict/

References: https://www.gartner.com/smarterwithgartner/how-to-get-started-with-aiops/

About the Author:

Vasudevan Gopalan

Vasu heads Engineering function for A&P. He is a Digital Transformation leader with ~20 years of IT industry experience spanning across Product Engineering, Portfolio Delivery, Large Program Management etc. Vasu has designed and delivered Open Systems, Core Banking, Web / Mobile Applications etc.

Outside of his professional role, Vasu enjoys playing badminton and focusses on fitness routines.

Discover, Monitor, Analyze & Predict COVID-19

“Uber, the world’s largest taxi company, owns no vehicles. Facebook, the world’s most popular media owner, creates no content. Alibaba, the most valuable retailer, has no inventory. Netflix, the world’s largest movie house, owns no cinemas. And Airbnb, the world’s largest accommodation provider, owns no real estate. Something interesting is happening.”

– Tom Goodwin, an executive at the French media group Havas.

This new breed of companies is the fastest growing in history because they own the customer interface layer. It is the platform where all the value and profit is. “Platform business” is a more wholesome term for this model, for which data is the fuel; Big Data and AI/ML technologies are the harbingers of new waves of productivity growth and innovation.

With Big Data and AI/ML making a big difference in the area of public health, let’s see how they are helping us tackle the global emergency of the coronavirus disease, formally known as COVID-19.

“With rapidly spreading disease, a two-week lag is an eternity.”

DISCOVERING/ DETECTING

Chinese technology giant Alibaba has developed an AI system for detecting COVID-19 in CT scans of patients’ chests with 96% accuracy against viral pneumonia cases. It only takes 20 seconds for the AI to decide, whereas humans generally take about 15 minutes to diagnose the illness as there can be upwards of 300 images to evaluate. The system was trained on images and data from 5,000 confirmed coronavirus cases and has been tested in hospitals throughout China. Per a report, at least 100 healthcare facilities are currently employing Alibaba’s AI to detect COVID-19.

Ping An Insurance (Group) Company of China, Ltd (Ping An) aims to address the issue of lack of radiologists by introducing the COVID-19 smart image-reading system. This image-reading system can read the huge volumes of CT scans in epidemic areas.

Ping An Smart Healthcare uses clinical data to train the AI model of the COVID-19 smart image-reading system. The AI analysis engine conducts a comparative analysis of multiple CT scan images of the same patient and measures the changes in lesions. It helps in tracking the development of the disease, evaluating the treatment, and assessing the prognosis of patients. Ultimately, it assists doctors in diagnosing, triaging and evaluating COVID-19 patients swiftly and effectively.

Ping An Smart Healthcare’s COVID-19 smart image-reading system also supports remote AI image-reading by medical professionals outside the epidemic areas. Since its launch, the smart image-reading system has provided services to more than 1,500 medical institutions. More than 5,000 patients have received smart image-reading services for free.

The more solutions the better. At least when it comes to helping overwhelmed doctors provide better diagnoses and, thus, better outcomes.

MONITORING

  • AI based Temperature monitoring & scanning

In Beijing, China, subway passengers are being screened for symptoms of coronavirus, but not by health authorities. Instead, artificial intelligence is in charge.

Two Chinese AI giants, Megvii and Baidu, have introduced temperature-scanning. They have implemented scanners to detect body temperature and send alerts to company workers if a person’s body temperature is high enough to constitute a fever.

Megvii’s AI system detects body temperatures for up to 15 people per second and at distances of up to 16 feet. It monitors as many as 16 checkpoints in a single station. The system integrates body detection, face detection, and dual sensing via infrared cameras and visible light. The system can accurately detect and flag high body temperature even when people are wearing masks, hats, or covering their faces with other items. Megvii’s system also sends alerts to an on-site staff member.

Baidu, one of the largest search-engine companies in China, screens subway passengers at the Qinghe station with infrared scanners. It also uses a facial-recognition system, taking photographs of passengers’ faces. If the Baidu system detects a body temperature of at least 99 degrees Fahrenheit, it sends an alert to a staff member for another screening. The technology can scan the temperatures of more than 200 people per minute.

  • AI based Social Media Monitoring

An international team is using machine learning to scour social media posts, news reports, data from official public health channels, and information supplied by doctors for warning signs of the virus across geographies. The program is looking for social media posts that mention specific symptoms, like respiratory problems and fever, from a geographic area where doctors have reported potential cases. Natural language processing is used to parse the text posted on social media, for example, to distinguish between someone discussing the news and someone complaining about how they feel.

The approach has proven capable of spotting a coronavirus needle in a haystack of big data. This technique could help experts learn how the virus behaves. It may be possible to determine the age, gender, and location of those most at risk quicker than using official medical sources.

PREDICTING

Data from hospitals, airports, and other public locations are being used to predict disease spread and risk. Hospitals can also use the data to plan for the impact of an outbreak on their operations.

Kalman Filter

The Kalman filter was pioneered by Rudolf Emil Kalman in 1960 and was originally designed and developed to solve the navigation problem in the Apollo Project. Since then, it has been applied to numerous cases such as guidance, navigation, and control of vehicles, object tracking in computer vision, trajectory optimization, time series analysis in signal processing, econometrics and more.

The Kalman filter is a recursive algorithm that takes a series of measurements observed over time, containing statistical noise, and produces estimates of unknown variables.

For the one-day prediction, a Kalman filter can be used, while for the long-term forecast a linear model is used whose main features are the Kalman predictors, the infection rate relative to population, time-dependent features, and weather history and forecasts.

The one-day Kalman prediction is very accurate and powerful, while a longer-period prediction is more challenging but provides a future trend. Long-term prediction does not guarantee full accuracy but provides a fair estimate following the recent trend. The model should be re-run daily to get better results.
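
To show the recursive predict/update idea, here is a minimal one-dimensional Kalman filter sketch. It is not the model from the repository linked below: it assumes a simple random-walk state (tomorrow's value equals today's plus noise), and the function name, the variance parameters and the sample counts are all hypothetical.

```python
def kalman_1d(measurements, process_var=1.0, measurement_var=4.0,
              initial_estimate=0.0, initial_var=1000.0):
    """Scalar Kalman filter with a random-walk state model.

    Returns the filtered estimate after each measurement and the final
    estimate variance. Under the random-walk assumption, the one-step-ahead
    prediction is simply the latest filtered estimate.
    """
    estimate, var = initial_estimate, initial_var
    estimates = []
    for z in measurements:
        # Predict: the state is carried over, its uncertainty grows
        var += process_var
        # Update: blend prediction and measurement using the Kalman gain
        gain = var / (var + measurement_var)
        estimate += gain * (z - estimate)
        var *= (1.0 - gain)
        estimates.append(estimate)
    return estimates, var


# Hypothetical noisy daily counts
daily_counts = [100, 104, 109, 113, 120]
filtered, final_var = kalman_1d(daily_counts)
next_day_prediction = filtered[-1]
```

A large `initial_var` tells the filter to trust the first measurement far more than the arbitrary starting estimate; the ratio of `process_var` to `measurement_var` then controls how quickly later estimates follow new data.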

GitHub Link: https://github.com/Rank23/COVID19

ANALYZING

The Center for Systems Science and Engineering at Johns Hopkins University has developed an interactive, web-based dashboard that tracks the status of COVID-19 around the world. The resource provides a visualization of the location and number of confirmed COVID-19 cases, deaths and recoveries for all affected countries.

The primary data source for the tool is DXY, a Chinese platform that aggregates local media and government reports to provide COVID-19 cumulative case totals in near real-time at the province level in China and country level otherwise. Additional data comes from Twitter feeds, online news services and direct communication sent through the dashboard. Johns Hopkins then confirms the case numbers with regional and local health departments. This kind of Data analytics platform plays a pivotal role in addressing the coronavirus outbreak.

All data from the dashboard is also freely available in the following GitHub repository.

GitHub Link: https://bit.ly/2Wmmbp8

Mobile version: https://bit.ly/2WjyK4d

Web version: https://bit.ly/2xLyT6v

Conclusion

One of AI’s core strengths when working on identifying and limiting the effects of virus outbreaks is its incredibly insistent nature. AI systems never tire, can sift through enormous amounts of data, and identify possible correlations and causations that humans can’t.

However, there are limits to AI’s ability to both identify virus outbreaks and predict how they will spread. Perhaps the best-known example comes from the neighboring field of big data analytics. At its launch, Google Flu Trends was heralded as a great leap forward in identifying and estimating the spread of the flu, until it overestimated the 2013 flu season by a whopping 140 percent and was quietly put to rest. Poor data quality was identified as one of the main reasons Google Flu Trends failed. Unreliable or faulty data can wreak havoc on the prediction power of AI.

About the Author:

Bargunan Somasundaram

Bargunan is a Big Data Engineer and a programming enthusiast. His passion is to share his knowledge by writing about his experiences. He believes “Gaining knowledge is the first step to wisdom and sharing it is the first step to humanity.”

AI in Healthcare

The Healthcare Industry is going through a quiet revolution. Factors like disease trends, doctor demographics, regulatory policies, environment, technology etc. are forcing the industry to turn to emerging technologies like AI, to help adapt to the pace of change. Here, we take a look at some key use cases of AI in Healthcare.

Medical Imaging

The application of Machine Learning (ML) in Medical Imaging is showing highly encouraging results. ML is a subset of AI, where algorithms and models are used to help machines imitate the cognitive functions of the human brain and to also self-learn from their experiences.

AI can be gainfully used in the different stages of medical imaging: acquisition, image reconstruction, processing, interpretation, storage, data mining & beyond. The performance of ML computational models improves tremendously as they get exposed to more & more data, and this foundation of colossal amounts of data enables them to gradually outperform humans at interpretation. They begin to detect anomalies not perceptible to the human eye & not discernible to the human brain!

What goes hand-in-hand with data is noise. Noise creates artifacts in images and reduces their quality, leading to inaccurate diagnosis. AI systems work through the clutter and aid noise reduction, leading to better precision in diagnosis, prognosis, staging, segmentation and treatment.

At the forefront of this use case is radiogenomics: correlating cancer imaging features and gene expression. Needless to say, this will play a pivotal role in cancer research.

Drug Discovery

Drug Discovery is an arduous process that takes several years from the start of research to obtaining approval to market. Research involves laboring through copious amounts of medical literature to identify the dynamics between genes, molecular targets, pathways, and candidate compounds. Sifting through all of this complex data to arrive at conclusions is an enormous challenge. When this voluminous data is fed to ML computational models, relationships can be established more reliably. AI powered by domain knowledge is slashing the time & cost involved in new drug development.

Cybersecurity in Healthcare

Data security is of paramount importance to Healthcare providers who need to ensure confidentiality, integrity, and availability of patient data. With cyberattacks increasing in number and complexity, these formidable threats are giving security teams sleepless nights! The main strength of AI is its ability to curate massive quantities of data (here, threat intelligence), nullify the noise, provide instant insights & self-learn in the process. The Predictive & Prescriptive capabilities of these computational models drastically reduce response time.

Virtual Health assistants

Virtual Health assistants like Chatbots give patients 24/7 access to critical information, in addition to offering services like scheduling health check-ups or setting up appointments. AI-based platforms for wearable health devices and health apps come armed with loads of features to monitor health signs, daily activities, diet, sleep patterns etc. and provide alerts for immediate action or suggest personalized plans to enable healthy lifestyles.

AI for Healthcare IT Infrastructure

Healthcare IT Infrastructure, running critical applications that enable patient care, is the heart of a Healthcare provider. With dynamically changing IT landscapes that are distributed, hybrid & on-demand, IT Operations teams are finding it hard to keep up. Artificial Intelligence for IT Ops (AIOps) is poised to fundamentally transform the Healthcare Industry. It is powering Healthcare Providers across the globe, who are adopting it to Automate, Predict, Remediate & Prevent Incidents in their IT Infrastructure. GAVS’ Zero Incident Framework™ (ZIF) – an AIOps Platform, is a pure-play AI platform based on unsupervised Machine Learning and comes with the full suite of tools an IT Infrastructure team would need.

Analyze

Have you heard of AIOps?

Artificial intelligence for IT operations (AIOps) is an umbrella term for the application of Big Data Analytics, Machine Learning (ML) and other Artificial Intelligence (AI) technologies to automate the identification and resolution of common Information Technology (IT) problems. The systems, services and applications in a large enterprise produce immense volumes of log and performance data. AIOps uses this data to monitor the assets and gain visibility into the working behaviour and dependencies between these assets.

According to a Gartner study, the adoption of AIOps by large enterprises would rise to 30% by 2023.

ZIF – The ideal AIOps platform of choice

Zero Incident Framework™ (ZIF) is an AIOps-based TechOps platform that enables proactive detection and remediation of incidents, helping organizations drive towards a Zero Incident Enterprise™.

ZIF comprises 5 modules.

At the heart of ZIF lie its Analyze and Predict (A&P) modules, which are powered by Artificial Intelligence and Machine Learning techniques. From the business perspective, the primary goal of A&P would be 100% availability of applications and business processes.

Come, let us understand more about the Analyze function of ZIF.

With Analyze having a Big Data platform under its hood, volumes of raw monitoring data, both structured and unstructured, can be ingested and grouped to build linkages and identify failure patterns.

Data Ingestion and Correlation of Diverse Data

The module processes a wide range of data from varied data sources to break silos while providing insights, exposing anomalies and highlighting risks across the IT landscape. It increases productivity and efficiency through actionable insights.

  • 100+ connectors for leading tools, environments and devices
  • Correlation and aggregation methods uncover patterns and relationships in the data

Noise Nullification

Eliminates duplicate incidents, false positives and any alerts that are insignificant. This also helps reduce the Mean-Time-To-Resolution and event-to-incident ratio.

  • Deep learning algorithms isolate events that have the potential to become incidents along with their potential criticality
  • Correlation and Aggregation methods group alerts and incidents that are related and need a common remediation
  • Reinforcement learning techniques are applied to find and eliminate false positives and duplicates
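
The simplest part of noise nullification, dropping repeated alerts, can be sketched without any machine learning. The example below is only a hypothetical illustration and does not use the deep learning or reinforcement learning techniques listed above: it assumes each alert is a dict with `source`, `message` and `timestamp` (in seconds) keys, and treats an alert as a duplicate if the same source and message were seen within a fixed time window.

```python
def suppress_duplicates(alerts, window_seconds=300):
    """Keep an alert only if no identical alert was seen within the window.

    Two alerts are 'identical' when they share the same source and message.
    The window slides: every repeat, kept or not, refreshes the last-seen
    time, so a continuous stream of repeats stays suppressed.
    """
    last_seen = {}
    kept = []
    for alert in sorted(alerts, key=lambda a: a["timestamp"]):
        key = (alert["source"], alert["message"])
        previous = last_seen.get(key)
        if previous is None or alert["timestamp"] - previous > window_seconds:
            kept.append(alert)
        last_seen[key] = alert["timestamp"]
    return kept


# Hypothetical alert stream
alerts = [
    {"source": "db01", "message": "disk usage high", "timestamp": 0},
    {"source": "web01", "message": "cpu usage high", "timestamp": 30},
    {"source": "db01", "message": "disk usage high", "timestamp": 60},
    {"source": "db01", "message": "disk usage high", "timestamp": 1000},
]

unique_alerts = suppress_duplicates(alerts)
```

Here the alert at 60 seconds is dropped as a repeat of the one at 0 seconds, while the alert at 1000 seconds is kept because the window has expired.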

Event Correlation

Data from various sources is ingested into ZIF in real time, by either a push or a pull mechanism. As the data is ingested, labelling algorithms are run to label the data based on identifiers. The labelled data is passed through the correlation engine, where unsupervised algorithms are run to mine the patterns. Sub-sequence mining algorithms help in identifying unique patterns from the data.

Unique patterns identified are clustered using clustering algorithms to form cases. Every case that is generated is marked by a unique case id. As part of the clustering process, seasonality aspects are checked from historical transactions to derive higher accuracy of correlation.

Correlation is done based on pattern recognition, helping to eliminate the need for relational CMDB from the enterprise. The accuracy of the correlation increases as patterns reoccur. Algorithms also can unlearn patterns based on the feedback that can be provided by actions taken on correlation. As these are unsupervised algorithms, the patterns are learnt with zero human intervention.

Accelerated Root Cause Analysis (RCA)

The Analyze module helps in identifying the root causes of incidents even when they occur in different silos. The combination of correlation algorithms with unsupervised deep learning techniques aids in accurately nailing down the root causes of incidents/problems. Learnings from historical incidents are also applied to find root causes in real time. The platform retraces the user journeys step-by-step to identify the exact point where an error occurs.

Customer Success Story – How ZIF’s A&P transformed IT Operations of a Manufacturing Giant

  • Seamless end-to-end monitoring – OS, DB, Applications, Networks
  • Helped achieve more than 50% noise reduction in 6 months
  • Reduced P1 incidents by ~30% through dynamic and deep monitoring
  • Achieved declining trend of MTTR and an increasing trend of Availability
  • Resulted in optimizing command centre/operations head count by ~50%
  • Resulted in ~80% reduction in operations TCO

For more detailed information on GAVS’ Analyze, or to request a demo please visit zif.ai/products/analyze

References: www.gartner.com/smarterwithgartner/how-to-get-started-with-aiops

ABOUT THE AUTHOR

Vasudevan Gopalan


Vasu heads Engineering function for A&P. He is a Digital Transformation leader with ~20 years of IT industry experience spanning across Product Engineering, Portfolio Delivery, Large Program Management etc. Vasu has designed and delivered Open Systems, Core Banking, Web / Mobile Applications etc.

Outside of his professional role, Vasu enjoys playing badminton and focusses on fitness routines.

A Deep Dive into Deep Learning!

The Nobel Prize winner & French author André Gide said, “Man cannot discover new oceans unless he has the courage to lose sight of the shore”. This rings true with enterprises that made bold investments in cutting-edge AI and are now starting to reap rich benefits. Artificial Intelligence is shredding all perceived boundaries of a machine’s cognitive abilities. Deep Learning, at the very core of Artificial Intelligence, is pushing the envelope still further into uncharted territory. According to Gartner, “Deep Learning is here to stay and expands ML by allowing intermediate representations of the data”.

What is Deep Learning?

Deep Learning is a subset of Machine Learning that is based on Artificial Neural Networks (ANN). It is an attempt to mimic the phenomenal learning mechanisms of the human brain and train AI models to perform cognitive tasks like speech recognition, image classification, face recognition, natural language processing (NLP) and the like.

The tens of billions of neurons and their connections to each other form the brain’s neural network. Although Artificial Neural Networks have been around for quite a few decades now, they are now gaining momentum due to the declining price of storage and the exponential growth of processing power. This winning combination of low-cost storage and high computational prowess is bringing back Deep Learning from the woods.

Improved machine learning algorithms and the availability of staggering amounts of diverse unstructured data, such as streaming and textual data, are boosting the performance of Deep Learning systems. The performance of the ANN depends heavily on how much data it is trained with, and it continuously adapts and evolves its learning with time as it is exposed to more & more datasets.

Simply put, the ANN consists of an Input layer, hidden computational layers, and the Output layer. If there is more than one hidden layer between the Input & Output layers, then it is called a Deep Network.

The Neural Network

The Neuron is central to the human Neural Network. Neurons have Dendrites, which are the receivers of information, and the Axon, which is the transmitter. The Axon is connected to the Dendrites of other neurons, through which signal transmission takes place. The junctions across which these signals are passed are called Synapses.

While the neuron by itself cannot accomplish much, it creates magic when it forms connections with the other neurons to form an interconnected neural network. In artificial neural networks, the neuron is represented by a node or a unit. There are several interconnected layers of such units, categorized as input, output and hidden, as seen in the figure. 

The input layer receives the input values and passes them on to the first hidden layer in the ANN, similar to how our senses receive inputs from the environment around us & send signals to the brain. Let’s look at what happens in one node when it receives these input values from the different nodes of the input layer. The values are standardized/normalized (so that they are all within a certain range) and then weighted. Weights are crucial to a neural network since a value’s weight is indicative of its impact on the outcome. An activation function is then applied to the weighted sum of values, to help determine if this transformed value needs to be passed on within the network. Some commonly used activation functions are the Threshold, Sigmoid and Rectifier functions.
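
As a rough illustration of what a single node does, here is a minimal sketch in plain Python. The function names, the min-max normalization and the `bias` term are assumptions added for the example (the text above does not mention a bias); real frameworks implement the same idea with vectorized operations.

```python
import math


def min_max_normalize(values):
    """Rescale values into the range [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def threshold(x):
    """Threshold (step) activation: fires 1 when the input is non-negative."""
    return 1.0 if x >= 0 else 0.0


def sigmoid(x):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def rectifier(x):
    """Rectifier (ReLU) activation: passes positive values, zeroes the rest."""
    return max(0.0, x)


def neuron_output(inputs, weights, bias, activation):
    """Apply the activation function to the weighted sum of the inputs."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


raw_inputs = [2.0, 4.0, 6.0]
inputs = min_max_normalize(raw_inputs)   # [0.0, 0.5, 1.0]
weights = [0.2, -0.4, 0.6]               # illustrative values, not learned
for activation in (threshold, sigmoid, rectifier):
    print(activation.__name__, neuron_output(inputs, weights, 0.0, activation))
```

Swapping the activation function changes how the same weighted sum is passed on: the threshold function gives an all-or-nothing signal, while the sigmoid and rectifier give graded outputs.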

This gives a very high-level idea of the generic structure and functioning of an ANN. The actual implementation would use one of several different architectures of neural networks that define how the layers are connected together, and what functions and algorithms are used to transform the input data. To give a couple of examples, a Convolutional network uses nonlinear activation functions and is highly efficient at processing nonlinear data like speech, image and video while a Recurrent network has information flowing around recursively, is much more complicated and difficult to train, but that much more powerful. Recurrent networks are closer in representation to the human neural network and are best suited for applications like sequence generation and predicting stock prices.

Deep Learning at work

Deep Learning has been adopted by almost all industry verticals at least at some level. To give some interesting examples, the automobile industry employs it in self-driving vehicles and driver-assistance services, the entertainment industry applies it to auto-addition of audio to silent movies, and social media uses deep learning for curation of content feeds in users’ timelines. Alexa, Cortana, Google Assistant and Siri have now invaded our homes to provide virtual assistance!

Deep Learning has several applications in the field of Computer Vision, which is an umbrella term for what the computer “sees”, that is, interpreting digital visual content like images, photos or videos. This includes helping the computer learn & perform tasks like Image Classification, Object Detection, Image Reconstruction, to name a few. Image classification or image recognition when localized, can be used in Healthcare for instance, to locate cancerous regions in an x-ray and highlight them.

Deep Learning applied to Face Recognition has changed the face of research in this area. Several computational layers are used for feature extraction, with the complexity and abstraction of the learnt feature increasing with each layer, making it pretty robust for applications like public surveillance or public security in buildings. But there are still many challenges like the identification of facial features across styles, ages, poses, effects of surgery that need to be tackled before FR can be reliably used in areas like watch-list surveillance, forensic tasks which demand high levels of accuracy and low alarm rates. 

Similarly, there are several applications of deep learning for Natural Language Processing. Text Classification can be used for Spam filtering, Speech recognition can be used to transcribe a speech, or create captions for a movie, and Machine translation can be used for translation of speech and text from one language to another.

Closing Thoughts

As is evident, the possibilities are endless and the road ahead for Deep Learning is exciting! But, despite the tremendous progress in Deep Learning, we are still very far from human-level AI. AI models can only perform local generalizations and adapt to new situations that are similar to past data, whereas human cognition is capable of quickly acclimatizing to radically novel circumstances. Nevertheless, this arduous R&D journey has nurtured a new-found respect for nature’s engineering miracle: the infinitely complex human brain!

Is Your Investment in TRUE AI?

Yes, AIOps, the messiah of ITOps, is here to stay! The Executive decision now is on the who and how, rather than when. With a plethora of products in the market offering varying shades of AIOps capabilities, choosing the right vendor is critical, to say the least.

Exclusively AI-based Ops?

Simply put, AIOps platforms leverage Big Data & AI technologies to enhance IT operations. Gartner defines Acquire, Aggregate, Analyze & Act as the four stages of AIOps. These four fall under the purview of Monitoring tools, AIOps Platforms & Action Platforms. However, there is no Industry-recognized mandatory feature list to be supported, for a Platform to be classified as AIOps. Due to this ambiguity in what an AIOps Platform needs to Deliver, huge investments made in rosy AIOps promises can lead to sub-optimal ROI, disillusionment or even derailed projects. Some Points to Ponder…

  • Quality in, Quality out. The value delivered from an AIOps investment is heavily dependent on what data goes into the system. How sure can we be that IT Asset or Device monitoring data provided by the Customer is not outdated, inaccurate or patchy? How sure can we be that we have full visibility of the entire IT landscape? With Shadow IT becoming a tacitly approved aspect of modern Enterprises, are we seeing all devices, applications and users? Doesn’t this imply that only an AIOps Platform providing Application Discovery & Topology Mapping, Monitoring features would be able to deliver accurate insights?
  • There is a very thin line between Also AI and Purely AI. Behind the scenes, most AIOps Platforms are reliant on CMDB or similar tools, which makes Insights like Event Correlation, Noise Reduction etc., rule-based. Where is the AI here?
  • In Gartner’s Market Guide, apart from support features for the different data types, Automated Pattern Discovery is the only other Capability taken into account for the Capabilities of AIOps Vendors matrix. With Gartner being one of the most trusted Technology Research and Advisory companies, it is natural for decision makers to zero-in on one of these listed vendors. What is not immediately evident is that there is so much more to AIOps than just this, and with so much at stake, companies need to do their homework and take informed decisions before finalizing their vendor.
  • Most AIOps vendors ingest, provide access to & store heterogeneous data for analysis, and provide actionable Insights and RCA; at which point the IT team takes over. This is a huge leap forward, since it helps IT work through the data clutter and significantly reduces MTTR. But, due to the absence of comprehensive Predictive, Prescriptive & Remediation features, these are not end-to-end AIOps Platforms.
  • At the bleeding edge of the Capability Spectrum is Auto-Remediation based on Predictive & Prescriptive insights. A Comprehensive end-to-end AIOps Platform would need to provide a Virtual Engineer for Auto-Remediation. But, this is a grey area not fully catered to by AIOps vendors.  

The big question now is, if an AIOps Platform requires human intervention or multiple external tools to take care of different missing aspects, can it rightfully claim to be true end-to-end AIOps?

So, what do we do?

Time for you to sit back and relax! Introducing ZIF: One Solution for all your ITOps ills!

We have you completely covered with the full suite of tools that an IT infrastructure team would need. We deliver the entire AIOps Capability spectrum and beyond.

ZIF (Zero Incident Framework™) is an AIOps based TechOps platform that enables proactive Detection and Remediation of incidents helping organizations drive towards a Zero Incident Enterprise™.

The Key Differentiator is that ZIF is a Pure-play AI Platform powered by Unsupervised Pattern-based Machine Learning Algorithms. This is what sets us a Class Apart.

  • Rightly aligns with the Gartner AIOps strategy. ZIF is based on and goes beyond the AIOps framework
  • Huge Investments in developing various patented AI Machine Learning algorithms, Auto-Discovery modules, Agent & Agentless Application Monitoring tools, Network sniffers, Process Automation, Remediation & Orchestration capabilities to form Zero Incident Framework™
  • Powered entirely by Unsupervised Pattern-based Machine Learning Algorithms, ZIF needs no further human intervention and is completely Self-Reliant
  • Unsupervised ML empowers ZIF to learn autonomously, glean Predictive & Prescriptive Intelligence and even uncover Latent Insights
  • The 5 Modules can work together cohesively or as independent stand-alone components
  • Can be Integrated with existing Monitoring and ITSM tools, as required
  • Applies LEAN IT Principle and is on an ambitious journey towards FRICTIONLESS IT.

Realizing a Zero Incident Enterprise™

The future of AIOps

AIOps or Artificial Intelligence based IT operations is the buzzword that’s capturing the CXO’s interest in organizations worldwide. Why? Because data explosion is here, and the traditional tools and processes are unable to completely handle its creation, storage, analysis and management. Likewise, humans are unable to thoroughly analyze this data to obtain any meaningful insights. IT teams also face the challenging task of providing speed, security and reliability in an increasingly mobile and connected world.

Add to this the complex, manual and siloed processes that the legacy IT solutions offer to the organizations. As a result, the productivity for IT remains low due to their inability to find the exact root cause of incidents. Plus, the business leaders don’t have a 360-degree view of all their IT and business services across the organization.

AIOps is the Future for IT Operations

AIOps platforms are the foundation on which organizations will build their future endeavors. Advanced machine learning and analytics are the building blocks to enhance their IT operations through a proactive approach towards service desk, monitoring and automation. Using effective data collection methods that utilize real-time analytic technologies, AIOps provides insights that inform business decisions.

Successful AIOps implementations depend on key performance indicators (KPIs) whose impact can be seen on performance variation, service degradation, revenue, customer satisfaction and brand image.

All of these impact the organization’s services, including but not limited to supply chain, online or digital. One way in which AIOps can deliver predictive and proactive IT is by increasing the MTBF (Mean time between failures) and decreasing the MTTD (Mean time to detection), MTTR (Mean time to resolution) and MTTI (Mean time to investigate).
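
The metrics named above can be computed from incident timestamps. The sketch below assumes each incident is recorded as a hypothetical `(occurred, detected, resolved)` tuple in minutes, measures MTTR from the moment the failure occurred, and treats MTBF as the average uptime between one resolution and the next failure. Definitions vary between organizations, so treat these as working assumptions. MTTI is left out because it needs an extra "investigation started" timestamp.

```python
from statistics import mean

# Hypothetical incident log: (occurred, detected, resolved), in minutes.
incidents = [
    (0, 10, 40),
    (100, 105, 120),
    (300, 315, 360),
]


def mttd(records):
    """Mean time to detection: average delay between failure and detection."""
    return mean(detected - occurred for occurred, detected, _ in records)


def mttr(records):
    """Mean time to resolution: average time from failure to resolution."""
    return mean(resolved - occurred for occurred, _, resolved in records)


def mtbf(records):
    """Mean time between failures: average uptime between consecutive incidents.

    Needs at least two incidents, ordered here by time of occurrence.
    """
    ordered = sorted(records)
    gaps = [
        nxt[0] - prev[2]  # next failure minus previous resolution
        for prev, nxt in zip(ordered, ordered[1:])
    ]
    return mean(gaps)


print("MTTD:", mttd(incidents))
print("MTTR:", mttr(incidents))
print("MTBF:", mtbf(incidents))
```

Tracking these values over time, rather than as one-off numbers, is what shows whether an AIOps rollout is actually moving them in the right direction.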

The future of AIOps is already on the way in the use cases mentioned below. This is just the surface, with scope for many more use cases to be added in the future.

Capacity planning

Enterprise workloads are moving to the cloud with providers such as AWS, Google and Azure setting up various configurations for running them. The complexity involved increases as new configurations are added by the architects involving parameters like disk types, memory, network and storage resources.

AIOps can reduce the guesswork in aligning the correct usage of the network, storage and memory resources with the right configurations of servers and VMs through recommendations.

Optimal resource utilization

Enterprises are leveraging cloud elasticity to scale their applications in or out automatically. With AIOps, IT administrators can rely on predictive scaling to take cloud auto-scaling to the next level. Based on historical data, the workload monitors itself and automatically determines the resources it requires.
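
A very simple way to picture predictive scaling is to forecast the next period’s load from recent history and size the fleet before the load arrives. The sketch below is an illustration only: the moving-average forecast, the 20% headroom and the function names are assumptions, and no cloud provider API is involved.

```python
import math


def forecast_next(history, window=3):
    """Forecast the next load value as the mean of the last `window` samples."""
    if not history:
        raise ValueError("history must contain at least one sample")
    recent = history[-window:]
    return sum(recent) / len(recent)


def instances_needed(forecast, capacity_per_instance, headroom=0.2, minimum=1):
    """Instances required to serve the forecast load plus a safety margin."""
    required = forecast * (1 + headroom) / capacity_per_instance
    return max(minimum, math.ceil(required))


# Requests per second observed over the last five intervals (made-up data).
history = [100, 200, 300, 400, 500]
expected = forecast_next(history)
print("forecast:", expected)
print("instances:", instances_needed(expected, capacity_per_instance=100))
```

A moving average lags behind a rising trend, so a real predictive scaler would more likely use a seasonal or trend-aware model. The structure stays the same: forecast first, then convert the forecast into capacity.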

Data store management

AIOps can also be utilized to monitor the network and the storage resources that will impact the applications in the operations. When performance degradation issues are seen, the admin will get notified. By using AI for both network and storage management, mundane tasks such as reconfiguring and recalibration can be automated. Through predictive analytics, storage capacity is automatically adjusted by adding new volumes proactively.

Anomaly detection

Anomaly detection is one of the most important applications of AIOps. It can prevent potential outages and disruptions that organizations may face. As anomalies can occur in any part of the technology stack, pinpointing them in real time using advanced analytics and machine learning is crucial. AIOps can accurately detect the actual source of an anomaly, which helps IT teams perform efficient root cause analysis almost in real time.
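
One of the simplest statistical approaches to anomaly detection is the z-score: learn the mean and standard deviation of a metric from a normal baseline, then flag observations that fall too many standard deviations away. The sketch below assumes a single numeric metric, a hand-picked threshold of 3 and made-up CPU readings; it is a starting point, not a substitute for the machine learning models referred to above.

```python
from statistics import mean, pstdev


def find_anomalies(baseline, observations, threshold=3.0):
    """Return (index, value) pairs whose z-score against the baseline exceeds threshold."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is unusual.
        return [(i, x) for i, x in enumerate(observations) if x != mu]
    return [
        (i, x)
        for i, x in enumerate(observations)
        if abs(x - mu) / sigma > threshold
    ]


# CPU utilisation (%) during a normal period, then new readings to check.
baseline = [50, 52, 48, 51, 49]
observations = [50, 53, 90, 47]
print(find_anomalies(baseline, observations))
```

Here only the reading of 90 is flagged. The other values sit within roughly two standard deviations of the baseline mean.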

Threat detection & analysis

Along with anomaly detection, AIOps will play a critical role in enhancing the security of IT infrastructure. Security systems can use ML algorithms and AI’s self-learning capabilities to help the IT teams detect data breaches and violations. By correlating various internal sources like log files, network and event logs with external information on malicious IPs and domains, AI can be used to detect anomalies and risk events through analysis. Advanced machine learning algorithms can be used to identify unexpected and potentially unauthorized and malicious activity within the infrastructure.
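
The correlation of internal logs with external threat information can be sketched as a simple lookup. Everything specific here is an assumption for illustration: the `src=` log format, the `flag_malicious` function and the blocklist contents (the addresses come from ranges reserved for documentation). A real deployment would pull the blocklist from a threat intelligence feed.

```python
import re
from collections import Counter

# Matches an IPv4-looking source address in a hypothetical "src=<ip>" log field.
IP_PATTERN = re.compile(r"src=(\d{1,3}(?:\.\d{1,3}){3})")


def flag_malicious(log_lines, blocklist):
    """Count log entries whose source IP appears in the external blocklist."""
    hits = Counter()
    for line in log_lines:
        match = IP_PATTERN.search(line)
        if match and match.group(1) in blocklist:
            hits[match.group(1)] += 1
    return dict(hits)


logs = [
    "2024-01-01T10:00:00 src=203.0.113.7 action=login_failed",
    "2024-01-01T10:00:05 src=192.0.2.10 action=login_ok",
    "2024-01-01T10:00:09 src=203.0.113.7 action=login_failed",
    "2024-01-01T10:00:12 service restarted",
]
blocklist = {"203.0.113.7", "198.51.100.23"}
print(flag_malicious(logs, blocklist))
```

Repeated hits from a listed address, as with the two failed logins here, are the kind of risk event that would be passed on for deeper analysis.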

Although still early in deployment, companies are taking advantage of AI and machine learning to improve tech support and manage infrastructure. AIOps, the convergence of AI and IT ops, will change the face of infrastructure management.

What will chatbots do for your enterprise?

Gen X, Y or any other fancy term describing the current demographics is tuned to using voice, text and natural language to complete their work. That’s why a new generation of enterprise chatbots is needed at work.

Read over the textbook definition of a chatbot and you’ll understand it’s a computer program designed to hold conversations with humans over the internet. They can understand written and spoken text and interpret its meaning as well. The bot can then look up relevant information and deliver it to the user.

While chatbots reduce time and effort, it’s not easy to create a chatbot that customers will trust. Businesses will have to consider the overall:

  • Security
  • Team complexity
  • Brand image
  • Scalability/availability
  • Identity and access management
  • Other parameters to fully integrate chatbots in their organizational structure

If correctly implemented, enterprise chatbots can perform pre-defined roles and tasks to improve business processes and activities.

Shortlisting the right chatbot

Automating repetitive and mundane work will increase the productivity, creativity, and efficiency of the organization. Evolution of chatbots will create more business opportunities for enterprises and new companies. Both SMBs and enterprises can improve their customer satisfaction with customized chatbots that help in offloading employee workload or support the various teams in the organization.

Enterprises first need to identify the type of chatbots needed for their organization to kick start their digital transformation. Depending on their requirements, there are two types of chatbots.

  • Standalone applications
  • Built within the messengers

Usually chatbots associated with messengers have an edge over standalone apps. They can be downloaded and used instantly. They are even easy to build and upgrade, faster compared to apps and websites and also cost effective. You also don’t have to worry about memory space.

AI based or machine learning chatbots learn over time from past questions and answers, and evolve their response accordingly.
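
To show what "learning from past questions and answers" can look like in its simplest form, here is a toy retrieval bot. It is only a sketch: the `FaqBot` class, the token-overlap scoring and the 0.5 cut-off are assumptions for illustration, whereas production chatbots rely on trained NLP models.

```python
class FaqBot:
    """Toy bot that answers by recalling the most similar past question."""

    def __init__(self, min_overlap=0.5):
        self.memory = []  # list of (question tokens, answer) pairs
        self.min_overlap = min_overlap

    @staticmethod
    def _tokens(text):
        """Lower-case the text, drop question marks and split into a word set."""
        return set(text.lower().replace("?", "").split())

    def learn(self, question, answer):
        """Store a question/answer pair so it can be reused later."""
        self.memory.append((self._tokens(question), answer))

    def reply(self, question):
        """Return the best stored answer, or None when nothing is close enough."""
        tokens = self._tokens(question)
        best_score, best_answer = 0.0, None
        for known, answer in self.memory:
            union = tokens | known
            score = len(tokens & known) / len(union) if union else 0.0
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.min_overlap else None


bot = FaqBot()
bot.learn("How do I reset my password?", "Use the self-service portal.")
print(bot.reply("how do i reset my password"))
print(bot.reply("What is the cafeteria menu?"))
```

Every call to `learn` widens what the bot can answer. A `None` reply is the natural point to hand the conversation over to a human.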

What’s in it for enterprises?

There are some universal benefits that businesses in any industry or vertical can gain.

Streamlining your IT processes

A variety of business processes across your departments can be streamlined using chatbots. Your employees’ mundane, repetitive but essential tasks can be taken up by the chatbots, giving more time for revenue-generating activities. For instance, they can be tasked with following up with clients or answering FAQs from customers.

Act as personal assistants

Chatbots are a great help for time-constrained employees who need to manage, schedule or cancel their meetings, set alarms and handle other tasks. Context-sensitive digital assistants help in organizing their daily routine by understanding the context, behaviors and patterns and suggesting recommendations.

24/7 customer support

Customer expectations are high, with customers demanding instant resolution of their concerns and problems. Enterprise chatbot solutions offer cost-effective 24/7 customer service. Advancements in AI, machine learning and natural language processing (NLP) allow them to understand context, slang and human conversation to a large extent. On a cautionary note, chatbots should easily hand over the conversation to humans to avoid any unnecessary customer conflicts.

Generate business insights

The data deluge faced by enterprises is costing them through lost insights and business opportunities. The vast data generated across the organization by employees, customers and business processes cannot be completely analyzed, and this leaves data gaps. Leveraging chatbots for processing and analyzing the stored data can result in identifying potential problem areas and taking preemptive actions to mitigate the risks.

Reduce Opex & Capex costs

Enterprise chatbots are one-time investments, where you pay only for the chatbot, train it, and it’s forever yours. No monthly payrolls or sick leaves. You have a 24/7 virtual employee managing your routine and repetitive tasks.

Increase efficiency and productivity

The end result of all the above points is increased productivity. By training your employees about the services and products, a chatbot solution helps your employees to tackle the generic queries from customers, thus ending the time-consuming customer-facing tasks and helping in the sales funnel.

In conclusion, chatbots are changing the working dynamics of enterprises. The best way to ensure a satisfied customer experience is to build bots that act without being supervised and offer the best solutions to their problems. With new advancements like AI, NLP and Machine Learning, it’s safe to say that chatbots are the future of enterprises.
