AIOps Myth Busters

The explosion of technology & data is impacting every aspect of business. While modern technologies have enabled transformational digitalization of enterprises, they have also infused tremendous complexities in infrastructure & applications. We have reached a point where effective management of IT assets mandates supplementing human capabilities with Artificial Intelligence & Machine Learning (AI/ML).      

AIOps is the application of Artificial Intelligence (AI) to IT operations (Ops). AIOps leverages AI/ML technologies to optimize, automate, and supercharge all aspects of IT Operations. Gartner predicts that the use of AIOps and digital experience monitoring tools for monitoring applications and infrastructure would increase by 30% in 2023. In this blog, we hope to debunk some common misconceptions about AIOps.

MYTH 1: AIOps mainly involves alert correlation and event management

AIOps can deliver enormous value to enterprises that harness the wide range of use cases it comes with. While alert correlation & management are key, AIOps can add a lot of value in areas like monitoring, user experience enhancement, and automation.  

AIOps monitoring cuts across infrastructure layers & silos in real-time, focusing on metrics that impact business outcomes and user experience. It sifts through monitoring data clutter to intelligently eliminate noise, uncover patterns, and detect anomalies. Monitoring the right UX metrics eliminates blind spots and provides actionable insights to improve user experience. AIOps can go beyond traditional monitoring to complete observability, by observing patterns in the IT environment, and externalizing the internal state of systems/services/applications. AIOps can also automate remediation of issues through automated workflows & standard operating procedures.

MYTH 2: AIOps increases human effort

Forbes says data scientists spend around 80% of their time preparing and managing data for analysis. This leaves them with little time for productive work! With data pouring in from monitoring tools, quite often ITOps teams find themselves facing alert fatigue and even missing critical alerts.

AIOps can effectively process the deluge of monitoring data by AI-led multi-layered correlation across silos to nullify noise and eliminate duplicates & false positives. The heavy lifting and exhausting work of ingesting, analyzing, weeding out noise, correlating meaningful alerts, finding the probable root causes, and fixing them, can all be accomplished by AIOps. In short, AIOps augments human capabilities and frees up their bandwidth for more strategic work.

MYTH 3: It is hard to ‘sell’ AIOps to businesses

While most enterprises acknowledge the immense potential for AI in ITOps, there are some concerns that are holding back widespread adoption. The trust factor with AI systems, the lack of understanding of the inner workings of AI/ML algorithms, prohibitive costs, and complexities of implementation are some contributing factors. While AIOps can cater to the full spectrum of ITOps needs, enterprises can start small & focus on one aspect at a time like say alert correlation or application performance monitoring, and then move forward one step at a time to leverage the power of AI for more use cases. Finding the right balance between adoption and disruption can lead to a successful transition.  

MYTH 4: AIOps doesn’t work in complex environments!

With Machine Learning and Big Data technologies at its core, AIOps is built to thrive in complex environments. The USP of AIOps is its ability to effortlessly sift through & garner insights from huge volumes of data, and perform complex, repetitive tasks without fatigue. AIOps systems constantly learn & adapt from analysis of data & patterns in complex environments. Through this self-learning, they can discover the components of the IT ecosystem, and the complex network of underlying physical & logical relationships between them – laying the foundation for effective ITOps.   

MYTH 5: AIOps is only useful for implementing changes across IT teams

An AIOps implementation has an impact across all business processes, and not just on IT infrastructure or software delivery. Isolated processes can be transformed into synchronized organizational procedures. The ability to work with colossal amounts of data; perform highly repetitive tasks to perfection; collate past & current data to provide rich inferences; learn from patterns to predict future events; prescribe remedies based on learnings; automate & self-heal; are all intrinsic features that can be leveraged across the organization. When businesses acknowledge these capabilities of AIOps and intelligently identify the right target areas within their organizations, it will give a tremendous boost to quality of business offerings, while drastically reducing costs.

MYTH 6: AIOps platforms offer only warnings and no insights

With its ability to analyze and contextualize large volumes of data, AIOps can help in extracting relevant insights and making data-driven decisions. With continuous analysis of data, events & patterns in the IT environment – both current & historic – AIOps acquires in-depth knowledge about the functioning of the various components of the IT ecosystem. Leveraging this information, it detects anomalies, predicts potential issues, forecasts spikes and lulls in resource utilization, and even prescribes appropriate remedies. All of this insight gives the IT team lead time to fix issues before they strike and enables resource optimization. Also, these insights gain increasing precision with time, as AI models mature with training on more & more data.

MYTH 7: AIOps is suitable only for Operations

AIOps is a new generation of shared services that has a considerable impact on all aspects of application development and support. With AIOps integrated into the dev pipeline, development teams can code, test, release, and monitor software more efficiently. With continuous monitoring of the development process, problems can be identified early, issues fixed, and changes rolled back as appropriate. AIOps can promote better collaboration between development & ops teams, and proactive identification & resolution of defects through AI-led predictive & prescriptive insights. This way AIOps enables a shift left in the development process, smarter resource management, and significantly improves software quality & time to market.  

Kappa (κ) Architecture – Streaming at Scale

We are in the era of Stream processing-as-a-service and for any data-driven organization, Stream-based computing has becoming the norm. In the last three parts https://bit.ly/2WgnILP, https://bit.ly/3a6ij2k,  https://bit.ly/3gICm88, I had explored Lambda Architecture and its variants. In this article let’s discover Streaming in the big data. ‘Real-time analytics’, ‘Real-time data’ and ‘Streaming data’ has become mandatory in any big data platform. The aspiration to extend data analysis (predictive, descriptive, or otherwise) to streaming event data has been common across every enterprise and there is a growing interest to find real-time big data architectures. Kappa (K) Architecture is one that deals with streaming. Let’s see why Real-Time Analytics matter more than ever and mandates data streaming and how streaming architecture like Kappa works. Is Kappa an alternative to lambda?

“You and I are streaming data engines.” – Jeff Hawkins

workflow automation software architecture

Questioning Lambda

Lambda architecture fits very well in many real-time use cases, mainly in re-computing algorithms. At the same time, Lambda Architecture has the inherent development and operational complexities like all the algorithms must be implemented twice, once in the cold path, the batch layer, and another execution in the hot path or the real-time layer. Apart from this dual execution path, the Lambda Architecture has the inevitable issue of debugging. Because operating two distributed multi-node services is more complex than operating one.

Given the obvious discrepancies of Lambda Architecture, Jay Kreps, CEO of Confluent, co-creator of Apache Kafka started the discussion on the need for new architecture paradigm which uses less code resource and could perform well in certain enterprise scenarios. This gave rise to Kappa (K) Architecture. The real need Kappa Architecture isn’t about efficiency at all, but rather about allowing people to develop, test, debug, and operate their systems on top of a single processing framework. In fact, Kappa is not taken as competitor to LA on the contrary it is seen as an alternative.

cognitive process automation tools for business

What is Streaming & Streaming Architecture?

Modern business requirements necessitate a paradigm shift from traditional approach of batch processing to real-time data streams. Data-centric organizations mandate the Stream first approach. Real-time data streaming or Stream first approach means at the very moment. So real-time analytics, either On-demand real-time analytics or Continuous real-time analytics, is the capability to process data right at the moment it arrives in the system. There is no possibility of batch processing of data. Not to mention, it enhances the ability to make better decision making and performing meaningful action on a timely basis. At the right place and at the right time, real-time analytics combines and analyzes data. Thus, it generates value from disparate data.

Typically, most of the streaming architectures will have the following 3 components:

  • an aggregator that gathers event streams and batch files from a variety of data sources,
  • a broker that makes data available for consumption,
  • an analytics engine that analyzes the data, correlates values and blends streams together.

Kappa (K) Architecture for Big Data era

Kappa (K) Architecture is one of the new software architecture patterns for the new Data era. It’s mainly used for processing streaming data. Kappa architecture gets the name Kappa from the Greek letter (K) and is attributed to Jay Kreps for introducing this architecture.

The main idea behind the Kappa Architecture is that both the real-time and batch processing can be carried out, especially for analytics, with a single technology stack. The data from IoT, streaming, and static/batch sources or near real-time sources like change data capture is ingested into messaging/ pub-sub platforms like Apache Kafka.

An append-only immutable log store is used in the Kappa Architecture as the canonical store. Following are the pub/sub or message buses or log databases that can be used for ingestion:

  • Amazon Quantum Ledger Database (QLDB)
  • Apache Kafka
  • Apache Pulsar
  • Amazon Kinesis
  • Amazon DynamoDB Streams
  • Azure Cosmos DB Change Feed
  • Azure EventHub
  • DistributedLog
  • EventStore
  • Chronicle Queue
  • Pravega

Distributed Stream processing engines like Apache Spark, Apache Flink, etc. will read the data from the streaming platform and transform it into an analyzable format, and then store it into an analytics database in the serving layer. Following are some of the distributed streaming computation systems

  • Amazon Kinesis
  • Apache Flink
  • Apache Samza
  • Apache Spark
  • Apache Storm
  • Apache Beam
  • Azure Stream Analytics
  • Hazelcast Jet
  • Kafka Streams
  • Onyx
  • Siddhi

In short, any query in the Kappa Architecture is defined by the following functional equation.

Query = λ (Complete data) = λ (live streaming data) * λ (Stored data)

The equation means that all the queries can be catered by applying Kappa function to the live streams of data at the speed layer. It also signifies that the stream processing occurs on the speed layer in Kappa architecture.

Pros and Cons of Kappa architecture

Pros

  • Any architecture that is used to develop data systems that doesn’t need batch layer like online learning, real-time monitoring & alerting system, can use Kappa Architecture.
  • If computations and analysis done in the batch and streaming layer are identical, then using Kappa is likely the best solution.
  • Re-computations or re-iterations is required only when the code changes.
  • It can be deployed with fixed memory.
  • It can be used for horizontally scalable systems.
  • Fewer resources are required as the machine learning is being done on the real-time basis.

Cons

Absence of batch layer might result in errors during data processing or while updating the database that requires having an exception manager to reprocess the data or reconciliation.

On finding the right architecture for any data driven organizations, a lot of considerations were taken in. Like most successful analytics project, which involves streaming first approach, the key is to start small in scope with well-defined deliverables, then iterate. The reason for considering distributed systems architecture (Generic Lambda or unified Lambda or Kappa) is due to minimized time to value.

Sources

About the Author

Bargunan Somasundaram

Bargunan Somasundaram

Bargunan is a Big Data Engineer and a programming enthusiast. His passion is to share his knowledge by writing his experiences about them. He believes “Gaining knowledge is the first step to wisdom and sharing it is the first step to humanity.”

Customize Business Outcomes with ZIFTM

Zero Incident Framework™ (ZIF) is the only AIOps platform that is powered with true machine learning algorithms with the capability to self-learn and adapt to today’s modern IT infrastructure.

ZIF’s goal has always been to deliver the right business outcomes for the stakeholders. Return on investment can be measured based on the outcomes the platform has delivered. Users get to choose what business outcomes are expected from the platform and the respective features are deployed in the enterprise to deliver the chosen outcome.

Single Pane of Action – Unified View across IT Enterprise

The biggest challenge IT Operations teams have been trying to tackle over the years is to get a bird’s eye view on what is happening across their IT landscape. The more complex the enterprise becomes the harder it becomes for the IT Operations team to understand what is happening across their enterprise. ZIF solves this issue with ease.

digital transformation company in usa

The capability to ingest data from any source monitoring or ITSM tool has helped IT organizations to have a real-time view of what is happening across their landscape. Enormous time can be saved by the IT engineers with ZIF’s unified view, who would otherwise be traversing between multiple monitoring tools.

ZIF can integrate with 100+ tools to ingest (static/dynamic) data in real-time via ZIF Universal Connector. This is a low code component of ZIF and dataflows within the connector can also be templatized for reuse. 

AIOps based Analytics Platform

Intelligence – Reduction in MTTR – Correlation of Alerts/Events

Approximately 80% of the time is lost by IT engineers in identifying the problem statement for an incident. This has been costing billions of dollars for enterprises. ZIF, with the help of Artificial Intelligence, can reduce the mean time to identify the probable root cause of the incident within seconds. The high-performance correlation engine that runs under the hood of the platform process millions of patterns that the platform has learned from the historical data and correlates the sequences that are happening in real-time and creates cases. These cases are then assigned to IT engineers with the probable root cause for them to fix the issue. This increases the productivity of the IT engineers resulting in better revenue for organizations.

best aiops solutions in usa

best aiops products tools and products

Intelligence – Predictive Analytics

AIOps platforms are incomplete without the Predictive Analytics capability. ZIF has adopted unsupervised machine learning algorithms to perform predictive analytics on the utilization data that is ingested into the platform. These algorithms can learn trends and understand the symptoms of an incident by analyzing tons of data that the platform had consumed over a period. Based on the analysis, the platform generates opportunity cards that help IT engineers take proactive measures on the forecasted incident. These opportunity cards are generated a minimum of 60 minutes in advance which gives the engineers a lead time to fix an issue before it strikes the landscape.

Visibility – Auto-Discovery of IT Assets & Applications

ZIF agentless discovery is a seamless discovery component, that helps in identifying all the IP assets that are available in an enterprise. Just not discovering the assets, but the component also plots a physical topology & logical map for better consumption of the IT engineers. This gives a very detailed view of every asset in the IT landscape. The logical topology gives in-depth insights into the workload metrics that can be utilized for deep analytics.

predictive analytics using ai applications

ai data analytics monitoring tools

Visibility – Cloud Monitoring

ai devops platform management services

In today’s digital transformation journey, cloud is inevitable. To have a better control over the cloud orchestrated application, enterprises must depend on the monitoring tools provided by cloud providers. The lack of insights often leads to the unavailability of applications for end-users. More than monitoring, insights that help enterprises take better-informed decisions are the need of the hour.

ZIF’s cloud monitoring components can monitor any cloud instance. Data that are generated from the provider provided monitoring tools are ingested into ZIF to further analyze the data. ZIF can connect to Azure, AWS & Google Cloud to derive data-driven insights.

Optimization – Remediation – Autonomous IT Operations

ZIF does not stop by just providing insights. The platform deploys the right automation bot to remediate the incident.

ZIF has 250+ automation bots that can be deployed to fast-track the resolution process by a minimum of 90%. Faster resolutions result in increased uptime of applications and better revenue for the enterprise.

Sample ZIF bots:

  • Service Restart / VM Restart
  • Disk Space Clean-up
  • IIS Monitoring App Pool
  • Dynamic Resource Allocation
  • Process Monitoring & Remediation
  • DL & Security Group Management
  • Windows Event Log Monitoring
  • Automated phishing control based on threat score
  • Service request automation like password reset, DL mapping, etc.
best aiops solutions in usa

For more information on ZIF, please visit www.zif.ai

About the Author –

Anoop Aravindakshan

An evangelist of Zero Incident FrameworkTM, Anoop has been a part of the product engineering team for long and has recently forayed into product marketing. He has over 14 years of experience in Information Technology across various verticals, which include Banking, Healthcare, Aerospace, Manufacturing, CRM, Gaming, and Mobile.

Lambda (λ), Kappa (κ) and Zeta (ζ) – The Tale of 3 AIOps Musketeers (PART-3)

“Data that sit unused are no different from data that were never collected in the first place.” – Doug Fisher

In the part 1 (https://bit.ly/3hDChCH), we delved into Lambda Architecture and in part 2 (https://bit.ly/3hDCg1B) about Generic Lambda. Given the limitations of the Generic lambda architecture and its inherent complexity, the data is replicated in two layers and keeping them in-sync is quite challenging in an already complex distributed system.There is a growing interest to find the simpler alternative to the Generic Lambda, that would bring just about the same benefits and handle the full problem set. The solution is Unified Lambda (λ) Architecture.

Unified Lambda (λ) Architecture

The unified approach addresses the velocity and volume problems of Big Data as it uses a hybrid computation model. This model combines both batch data and instantaneous data transparently.

There are basically three approaches:

  1. Pure Streaming Framework
  2. Pure Batch Framework
  3. Lambdoop Framework

1. Pure streaming framework

In this approach, a pure streaming model is adopted and a flexible framework like Apache Samza can be employed to provide unified data processing model for both stream and batch processing using the same data flow structure.

Pure streaming framework

To avoid the large turn-around times involved in Hadoop’s batch processing, LinkedIn came up with a distributed stream processing framework Apache Samza. It is built on top of distributed messaging bus; Apache Kafka, so that it can be a lightweight framework for streaming platform. i.e. for continuous data processing. Samza has built-in integration with Apache Kafka, which is comparable to HDFS and MapReduce. In the Hadoop world, HDFS is the storage layer and MapReduce, the processing layer. In the similar way, Apache Kafka ingests and stores the data in topics, which is then streamed and processed by Samza. Samza normally computes results continuously as and when the data arrives, thus delivering sub-second response times.

Albeit it’s a distributed stream processing framework, its architecture is pluggable i.e. can be integrated with umpteen sources like HDFS, Azure EventHubs, Kinensis etc. apart from Kafka. It follows the principle of WRITE ONCE, RUN ANYWHERE; meaning, the same code can run in both stream and batch mode. Apache Samza’s streams are re-playable, ordered partitions.

Unified API for Batch & Streaming in pure Streaming

Apache Samza offers a unified data processing model for both real-time as well as batch processing.  Based on the input data size, bounded or unbounded the data processing model can be identified, whether batch or stream.Typically bounded (e.g. static files on HDFS) are Batch data sources and streams are unbounded (e.g. a topic in Kafka). Under the bonnet, Apache Samza’s stream-processing engine handles both types with high efficiency.

Unified API for Batch & Streaming in pure Streaming

Another advantage of this unified API for Batch and Streaming in Apache Samza, is that makes it convenient for the developers to focus on the processing logic, without treating bounded and unbounded sources differently. Samza differentiates the bounded and unboundeddata by a special token end-of-stream. Also, only config change is needed, and no code changes are required, in case of switching gears between batch and streaming, e.g. Kafka to HDFS.Let us take an example of Count PageViewEvent for each mobile Device OS in a 5-minute window and send the counts to PageViewEventPerDeviceOS

Pure Batch framework

This is the reverse approach of pure streaming where a flexible Batch framework is employed, which would offer both the batch processing and real-time data processing ability. The streaming is achieved by using mini batches which is small enough to be close to real-time, with Apache Spark/Spark Streaming or Storm’s Trident. Under the hood, Spark streaming is a sequence of micro-batch processes with the sub-second latency. Trident is a high-level abstraction for doing streaming computation on top of Storm. The core data model of Trident is the “Stream”, processed as a series of batches.

Apache Spark achieves the dual goal of Batch as well as real-time processing by the following modes.

  • Micro-batch processing model
  • Continuous Processing model

Micro-batch processing model

Micro-batch processing is analogous to the traditional batch processing in that data are usually processed as a group. The primary difference is that the batches are smaller and processed more often. In spark streaming, the micro-batches are created based on the time rather than on the accumulated data size. The smaller the time to trigger a micro-batch to process, lesser the latency.

Continuous Processing model

Apache Spark 2.3, introduced Low-latency Continuous Processing Mode in Structured Streaming whichenables low (~1 ms) end-to-end latency with at-least-once fault-tolerance guarantees. Comparing this with the default micro-batch processing engine which can achieve exactly-once guarantees but achieve latencies of ~100 ms at best. Without modifying the application logic i.e. DataFrame/Dataset operations mini-batching or continuous streaming can be chosen at runtime. Spark Streaming also has the abilityto work well with several data sources like HDFS, Flume or Kafka.

Example of Micro-batching and Continuous Batching

3. Lambdoop Approach

In many places, capability of both batch and real time processing is needed.It is cumbersome to develop a software architecture of such capabilities by tailoring suitable technologies, software layers, data sources, data storage solutions, smart algorithms and so on to achieve the good scalable solution. This is where the frameworks like Spring “XD”, Summingbird or Lambdoop comes in, since they already have a combined API for batch and real-time processing.

Lambdoop

Lambdoop is a software framework based on the Lambda architecture which provides an abstraction layer to the developers. This feature makes the developers life easy to develop any Big Data applications by combining real time and batch processing approaches. Developers don’t have to deal with different technologies, configurations, data formats etc. They can use the Lambdoop framework as the only needed API. Also, Lambdoop includes other interesting tools such as input/output drivers, visualization tools, cluster management tools and widely accepted AI algorithms.

The Speed layer in Lambdoop runs on Storm and the Batch layer on Hadoop, Lambdoop (Lambda-Hadoop, with HBase, Storm and Redis) also combines batch/real-time by offering a single API for both processing models.

Summingbird

Summingbird aka‘Streaming MapReduce’ is a hybrid computational system where both the batch/streaming computations can be run at the same time and the results can be merged automatically. In Summingbird, the developer can write the code/job logic once and change the backend as and when needed. Following are the modes in which Summingbird Job/code can be executed.

  • batch mode (using Scalding on Hadoop)
  • real-time mode (using Storm)
  • hybrid batch/real-time mode (offers attractive fault-tolerance properties)

If the model assumes streaming, one-at-a-time semantics, then the code can be run in real-time e.g. Strom or in offline/batch mode e.g. Hadoop, spark etc. It can operate in a hybrid processing mode, when there is a need to transparently integrate batch and online results to efficiently generate up-to-date views over long time spans.

Conclusion

The volume of any Big Data platform is handled by building a batch processing application which requires, MapReduce, spark development, Use of other Hadoop related tools like Sqoop, Zookeeper, HCatalog etc. and storage systems like HBase, MongoDB, HDFS, Cassandra. At the same time the velocity of any Big Data platform is handled by building a real-time streaming application which requires, stream computing development using Storm, Samza, Kafka-connect, Apache Flink andS4, and use of temporal datastores like in-memory data stores, Apache Kafka messaging system etc.

The Unified Lambda handles the both Volume and Velocity if any Big Data platform by the intermixed approach of featuring a hybrid computation model, where both batch and real-time data processing are combined transparently. Also, the limitations of Generic Lambda like Dual execution mode, Replicating and maintaining the data sync between different layers are avoided and in the Unified Lambda, there would be only one system to learn and maintain.

About the Author:

Bargunan Somasundaram

Bargunan Somasundaram

Bargunan is a Big Data Engineer and a programming enthusiast. His passion is to share his knowledge by writing his experiences about them. He believes “Gaining knowledge is the first step to wisdom and sharing it is the first step to humanity.”

Machine Learning: Building Clustering Algorithms

Gireesh Sreedhar KP


Clustering is a widely-used Machine Learning (ML) technique. Clustering is an Unsupervised ML algorithm that is built to learn patterns from input data without any training, besides being able of processing data with high dimensions. This makes clustering the method of choice to solve a wide range and variety of ML problems.

Since clustering is widely used, for Data Scientists and ML Engineer’s it is critical to understand how to practically build clustering algorithms even though many of us have a high-level understanding of clustering. Let us understand the approach to build a clustering algorithm from scratch.

What is Clustering and how does it work?

Clustering is finding groups of objects (data) such that objects in the same group will be similar (related) to one another and different from (unrelated to) objects in other groups.

Clustering works on the concept of Similarity/Dissimilarity between data points. The higher similarity between data points, the more likely these data points will belong to the same cluster and higher the dissimilarity between data points, the more likely these data points will be kept out of the same cluster.

Similarity is the numerical measure of how alike two data objects are. Similarity will be higher when objects are more alike. Dissimilarity is the numerical measure of how different two data objects. Dissimilarity is lower when objects are more alike.

We create a ‘Dissimilarity Matrix’ (also called Distance Matrix) as an input to a clustering algorithm, where the dissimilarity matrix gives algorithm the notion of dissimilarity between objects. We build a dissimilarity matrix for each attribute of data considered for clustering and then combine the dissimilarity matrix for each data attribute to form an overall dissimilarity matrix. The dissimilarity matrix is an NxN square matrix where N is the number of data points considered for clustering and each element of the NxN square matrix gives dissimilarity between two objects.

Building Clustering Algorithm

Building a clustering algorithm involve the following:

  • Selection of most suited clustering techniques and algorithms to solve the problem. This step needs close collaboration among SMEs, business users, data scientists, and ML engineers. Based on inputs and data study, a possible list of algorithms (one or more) is selected for modeling and development along with tuning parameters are decided (to give algorithm more flexibility for tuning and learning from SME).
  • The selection of data attributes for the formulation of the dissimilarity matrix and methodology for the formation of the dissimilarity matrix (discussed later).
  • Building algorithms and doing the Design of experiments to select the best-suited algorithm and algorithm parameters for implementation.
  • Implementation of algorithm and fine-tuning of parameters as required.

Building a Dissimilarity matrix:

There are different approaches to build a dissimilarity matrix, here we consider building a dissimilarity matrix containing the distance (called Distance Matrix) between data objects (another alternative approach is to feed in coordinate points and let the algorithm compute distance). Let us consider a group of N data objects to be clustered based on three data attributes of each data object. The steps for building a Distance matrix are:

Build a Distance matrix for individual data attributes. Here we build three individual distance matrices (one for each attribute) containing distance between data objects calculated for each attribute. The data is always scaled between [0,1] using one of the standard normalization methods such as Min-Max Scalar. Here is how the distance matrix for an attribute looks like.

Properties of Distance Matrix:

  1. Distance Matrix is NxN square matrix (N – number of objects in clustering space)
  2. Matrix is symmetric with diagonal as zero (zero diagonal as distance of an object from itself is zero)
  3. For categorical data, distance between two points = 0, if both are same; =1 otherwise
  4. For numeric/ordered data, distance between two points = difference between scaled attribute values of two points.

Build Complete Distance matrix. Here we build a complete distance matrix combining distance matrix of individual attributes forming the input for clustering algorithm.

Complete distance matrix = (element-wise sum of individual attribute level matrix)/3;

Generalized Complete distance matrix = (element-wise sum of individual attribute level matrix)/M, where M is the number of attribute level matrix formed.

Considerations for the selection of clustering algorithms:

Before the selection of a clustering algorithm, the following considerations need to be evaluated to identify the right clustering algorithms for the given problem.

  • Partition criteria: Single Level vs hierarchical portioning
  • Separation of clusters: Exclusive (one data point belongs to only one class) vs non-exclusive (one data point can belong to more than one class)
  • Similarity measures: Distance-based vs Connectivity-based
  • Clustering space: Full space (used when low dimension data is processed) vs Subspace (used when high dimension data is processed, where only subspace can be processed and interesting clustering can be formed)
  • Attributes processing: Ability to deal with different types of attributes: Numerical, Categorical, Text, Media, a combination of data types in inputs
  • Discovery of clusters: Ability to form a predefined number of clusters or an arbitrary number of clusters
  • Ability to deal with noise in data
  • Scalability to deal with huge volumes of data, high dimensionality, incremental, or streaming data.
  • Ability to deal with constraints on user preference and domain requirements.

Application of Clustering

There are broadly two applications of clustering.

As an ML tool to get insight into data. Like building Recommendation Systems or Customer segmentation by clustering like-minded users or similar products, Social network analysis, Biological data analysis like Gene/Protein sequence analysis, etc.

As a pre-processing or intermediate step for other classes of algorithms. Like some Pattern-mining algorithms use clustering to group patterns mined and select most representative patterns instead of selecting entire patterns mined.

Conclusion

Building ML algorithm is teamwork with a team consisting of SMEs, users, data scientists, and ML engineers, each playing their part for success. The article gives steps to build a clustering algorithm, this can be used as reference material while attempting to build your algorithm.

About the Author:

Gireesh is a part of the projects run in collaboration with IIT Madras for developing AI solutions and algorithms. His interest includes Data Science, Machine Learning, Financial markets, and Geo-politics. He believes that he is competing against himself to become better than who he was yesterday. He aspires to become a well-recognized subject matter expert in the field of Artificial Intelligence.

Assess Your Organization’s Maturity in Adopting AIOps

Artificial Intelligence for IT operations (AIOps) is adopted by organizations to deliver tangible Business Outcomes. These business outcomes have a direct impact on companies’ revenue and customer satisfaction.

A survey from AIOps Exchange 2019, reports that 84% of Business Owners who attended the survey, confirmed that they are actively evaluating AIOps to be adopted in their organizations.

So, is AIOps just automation? Absolutely NOT!!

Artificial Intelligence for IT operations implies the implementation of true Autonomous Artificial Intelligence in ITOps, which needs to be adopted as an organization-wide strategy. Organizations will have to assess their existing landscape, processes, and decide where to start. That is the only way to achieve the true implementation of AIOps.

Every organization trying to evaluate AIOps as a strategy should read through this article to understand their current maturity, and then move forward to reach the pinnacle of Artificial Intelligence in IT Operations.

The primary Success Factor in adopting AIOps is derived from the Business Outcomes the organization is trying to achieve by implementing AIOps –that is the only way to calculate ROI.

There are 4 levels of Maturity in AIOps adoption. Based on our experience in developing an AIOps platform and implementing the platform across multiple industries, we have arrived at these 4 levels. Assessing an organization against each of these levels helps in achieving the goal of TRUE Artificial Intelligence in IT Operations.

Level 1: Knee-jerk

Events, logs are generated in silos and collected from various applications and devices in the infrastructure. These are used to generate alerts that are commissioned to command centres to escalate as per the SOPs (standard operating procedures) defined. The engineering teams work in silos, not aware of the business impact that these alerts could potentially create. Here, operations are very reactive which could cost the organization millions of dollars.

Level 2: Unified

Have integrated all events, logs, and alerts into one central locale. ITSM process has been unified. This helps in breaking silos and engineering teams are better prepared to tackle business impacts. SOPs have been adjusted since the process is unified, but this is still reactive incident management.

Level 3: Intelligent

Machine Learning algorithms (either supervised or unsupervised) have been implemented on the unified data to derive insights. There are baseline metrics that are calibrated and will be used as a reference for future events. With more data, the metrics get richer. IT operations team can correlate incidents/events with business impacts by leveraging AI & ML. If Mean Time To Resolve (MTTR) an incident has been reduced by automated identification of the root cause, then the organization has attained level 3 maturity in AIOps.

Level 4: Predictive & Autonomous

The pinnacle of AIOps is level 4. If incidents and performance degradation of applications can be predicted by leveraging Artificial Intelligence, it implies improved application availability. Autonomousremediation bots can be triggered spontaneously based on the predictive insights, to fix incidents that are prone to happen in the enterprise. Level 4 is a paradigm shift in IT operations – moving operations entirely from being reactive, to becoming proactive.

Conclusion:

As IT operations teams move up each level, the essential goal to keep in mind is the long-term strategy that needs to be attained by adopting AIOps. Artificial Intelligence has matured over the past few decades, and it is up to AIOps platforms to embrace it effectively. While choosing an AIOps platform, measure the maturity of the platform’s artificial intelligent coefficient.

About the Author:

Anoop Aravindakshan (Principal Consultant Manager) at GAVS Technologies.


An evangelist of Zero Incident FrameworkTM, Anoop has been a part of the product engineering team for long and has recently forayed into product marketing. He has over 14 years of experience in Information Technology across various verticals, which include Banking, Healthcare, Aerospace, Manufacturing, CRM, Gaming, and Mobile.

Prediction for Business Service Assurance

Artificial Intelligence for IT operations or AIOps has exploded over the past few years. As more and more enterprises set about their digital transformation journeys, AIOps becomes imperative to keep their businesses running smoothly. 

AIOps uses several technologies like Machine Learning and Big Data to automate the identification and resolution of common Information Technology (IT) problems. The systems, services, and applications in a large enterprise produce volumes of log and performance data. AIOps uses this data to monitor the assets and gain visibility into the behaviour and dependencies among these assets.

According to a Gartner publication, the adoption of AIOps by large enterprises would rise to 30% by 2023.

ZIF – The ideal AIOps platform of choice

Zero Incident FrameworkTM (ZIF) is an AIOps based TechOps platform that enables proactive detection and remediation of incidents helping organizations drive towards a Zero Incident Enterprise™.

ZIF comprises of 5 modules, as outlined below.

At the heart of ZIF, lies its Analyze and Predict (A&P) modules which are powered by Artificial Intelligence and Machine Learning techniques. From the business perspective, the primary goal of A&P would be 100% availability of applications and business processes.

Let us understand more about thePredict module of ZIF.

Predictive Analytics is one of the main USP of the ZIF platform. ZIF encompassesSupervised, Unsupervised and Reinforcement Learning algorithms for realization of various business use cases (as shown below).

How does the Predict Module of ZIF work?

Through its data ingestion capabilities, the ZIF platform can receive and process all types of data (both structured and unstructured) from various tools in the enterprise. The types of data can be related to alerts, events, logs, performance of devices, relations of devices, workload topologies, network topologies etc. By analyzing all these data, the platform predicts the anomalies that can occur in the environment. These anomalies get presented as ‘Opportunity Cards’ so that suitable action can be taken ahead of time to eliminate any undesired incidents from occurring. Since this is ‘Proactive’ and not ‘Reactive’, it brings about a paradigm shift to any organization’s endeavour to achieve 100% availability of their enterprise systems and platforms. Predictions are done at multiple levels – application level, business process level, device level etc.

Sub-functions of Prediction Module

How does the Predict module manifest to enterprise users of the platform?

Predict module categorizes the opportunity cards into three swim lanes.

  1. Warning swim lane – Opportunity Cards that have an “Expected Time of Impact” (ETI) beyond 60 minutes.
  2. Critical swim lane – Opportunity Cards that have an ETI within 60 minutes.
  3. Processed / Lost– Opportunity Cards that have been processed or lost without taking any action.

Few of the enterprises that realized the power of ZIF’s Prediction Module

  • A manufacturing giant in the US
  • A large non-profit mental health and social service provider in New York
  • A large mortgage loan service provider in the US
  • Two of the largest private sector banks in India

For more detailed information on GAVS’ Analyze, or to request a demo please visithttps://zif.ai/products/predict/

References:https://www.gartner.com/smarterwithgartner/how-to-get-started-with-aiops/

About the Author:

Vasudevan Gopalan

Vasu heads Engineering function for A&P. He is a Digital Transformation leader with ~20 years of IT industry experience spanning across Product Engineering, Portfolio Delivery, Large Program Management etc. Vasu has designed and delivered Open Systems, Core Banking, Web / Mobile Applications etc.

Outside of his professional role, Vasu enjoys playing badminton and focusses on fitness routines.

Discover, Monitor, Analyze & Predict COVID-19

Uber, the world’s largest taxi company, owns no vehicles. Facebook, the world’s most popular media owner, creates no content. Alibaba, the most valuable retailer, has no inventory. Netflix, the world’s largest movie house, own no cinemas. And Airbnb, the world’s largest accommodation provider, owns no real estate. Something interesting is happening.”

– Tom Goodwin, an executive at the French media group Havas.

This new breed of companies is the fastest growing in history because they own the customer interface layer. It is the platform where all the value and profit is. “Platform business” is a more wholesome termfor this model for which data is the fuel; Big Data & AI/ML technologies are the harbinger of new waves of productivity growth and innovation.

With Big data and AI/ML is making a big difference in the area of public health, let’s see how it is helping us tackle the global emergency of coronavirus formally known as COVID-19.

“With rapidly spreading disease, a two-week lag is an eternity.”

DISCOVERING/ DETECTING

Chinese technology giant Alibaba has developed an AI system for detecting the COVID-19 in CT scans of patients’ chests with 96% accuracy against viral pneumonia cases. It only takes 20 seconds for the AI to decide, whereas humans generally take about 15 minutes to diagnose the illness as there can be upwards of 300 images to evaluate.The system was trained on images and data from 5,000 confirmed coronavirus cases and has been tested in hospitals throughout China. Per a report, at least 100 healthcare facilities are currently employing Alibaba’s AI to detect COVID-19.

Ping An Insurance (Group) Company of China, Ltd (Ping An) aims to address the issue of lack of radiologists by introducing the COVID-19 smart image-reading system. This image-reading system can read the huge volumes of CT scans in epidemic areas.

Ping An Smart Healthcare uses clinical data to train the AI model of the COVID-19 smart image-reading system. The AI analysis engine conducts a comparative analysis of multiple CT scan images of the same patient and measures the changes in lesions. It helps in tracking the development of the disease, evaluation of the treatment and in prognosis of patients.Ultimately it assists doctors to diagnose, triage and evaluate COVID-19 patients swiftly and effectively.

Ping An Smart Healthcare’s COVID-19 smart image-reading system also supports AI image-reading remotely by medical professionals outside the epidemic areas.Since its launch, the smart image-reading system has provided services to more than 1,500 medical institutions. More than 5,000 patients have received smart image-reading services for free.

The more solutions the better. At least when it comes to helping overwhelmed doctors provide better diagnoses and, thus, better outcomes.

MONITORING

  • AI based Temperature monitoring & scanning

In Beijing, China, subway passengers are being screened for symptoms of coronavirus, but not by health authorities. Instead, artificial intelligence is in-charge.

Two Chinese AI giants, Megvii and Baidu, have introduced temperature-scanning. They have implemented scanners to detect body temperature and send alerts to company workers if a person’s body temperature is high enough to constitute a fever.

Megvii’s AI system detects body temperatures for up to 15 people per second andup to 16 feet. It monitors as many as 16 checkpoints in a single station. The system integrates body detection, face detection, and dual sensing via infrared cameras and visible light. The system can accurately detect and flag high body temperature even when people are wearing masks, hats, or covering their faces with other items. Megvii’s system also sends alerts to an on-site staff member.

Baidu, one of the largest search-engine companies in China, screens subway passengers at the Qinghe station with infrared scanners. It also uses a facial-recognition system, taking photographs of passengers’ faces. If the Baidu system detects a body temperature of at least 99-degrees Fahrenheit, it sends an alert to the staff member for another screening. The technology can scan the temperatures of more than 200 people per minute.

  • AI based Social Media Monitoring

An international team is using machine learning to scour through social media posts, news reports, data from official public health channels, and information supplied by doctors for warning signs of the virus across geographies.The program is looking for social media posts that mention specific symptoms, like respiratory problems and fever, from a geographic area where doctors have reported potential cases. Natural language processing is used to parse the text posted on social media, for example, to distinguish between someone discussing the news and someone complaining about how they feel.

The approach has proven capable of spotting a coronavirus needle in a haystack of big data. This technique could help experts learn how the virus behaves. It may be possible to determine the age, gender, and location of those most at risk quicker than using official medical sources.

PREDICTING

Data from hospitals, airports, and other public locations are being used to predict disease spread and risk. Hospitals can also use the data to plan for the impact of an outbreak on their operations.

Kalman Filter

Kalman filter was pioneered by Rudolf Emil Kalman in 1960, originally designed and developed to solve the navigation problem in the Apollo Project. Since then, it has been applied to numerous cases such as guidance, navigation, and control of vehicles, computer vision’s object tracking, trajectory optimization, time series analysis in signal processing, econometrics and more.

Kalman filter is a recursive algorithm which uses time-series measurement over time, containing statistical noise and produce estimations of unknown variables.

For the one-day prediction Kalman filter can be used, while for the long-term forecast a linear model is used where its main features are Kalman predictors, infected rate relative to population, time-depended features, and weather history and forecasting.

The one-day Kalman prediction is very accurate and powerful while a longer period prediction is more challenging but provides a future trend.Long term prediction does not guarantee full accuracy but provides a fair estimation following the recent trend. The model should re-run daily to gain better results.

GitHub Link: https://github.com/Rank23/COVID19

ANALYZING

The Center for Systems Science and Engineering at Johns Hopkins University has developed an interactive, web-based dashboard that tracks the status of COVID-19 around the world. The resource provides a visualization of the location and number of confirmed COVID-19 cases, deaths and recoveries for all affected countries.

The primary data source for the tool is DXY, a Chinese platform that aggregates local media and government reports to provide COVID-19 cumulative case totals in near real-time at the province level in China and country level otherwise. Additional data comes from Twitter feeds, online news services and direct communication sent through the dashboard. Johns Hopkins then confirms the case numbers with regional and local health departments. This kind of Data analytics platform plays a pivotal role in addressing the coronavirus outbreak.

All data from the dashboard is also freely available in the following GitHub repository.

GitHub Link:https://bit.ly/2Wmmbp8

Mobile version: https://bit.ly/2WjyK4d

Web version: https://bit.ly/2xLyT6v

Conclusion

One of AI’s core strengths when working on identifying and limiting the effects of virus outbreaks is its incredibly insistent nature. AIsystems never tire, can sift through enormous amounts of data, and identify possible correlations and causations that humans can’t.

However, there are limits to AI’s ability to both identify virus outbreaks and predict how they will spread. Perhaps the best-known example comes from the neighboring field of big data analytics. At its launch, Google Flu Trends was heralded as a great leap forward in relation to identifying and estimating the spread of the flu—until it underestimated the 2013 flu season by a whopping 140 percent and was quietly put to rest.Poor data quality was identified as one of the main reasons Google Flu Trends failed. Unreliable or faulty data can wreak havoc on the prediction power of AI.

References:

About the Author:

Bargunan Somasundaram

Bargunan Somasundaram

Bargunan is a Big Data Engineer and a programming enthusiast. His passion is to share his knowledge by writing his experiences about them. He believes “Gaining knowledge is the first step to wisdom and sharing it is the first step to humanity.”

AI in Healthcare

The Healthcare Industry is going through a quiet revolution. Factors like disease trends, doctor demographics, regulatory policies, environment, technology etc. are forcing the industry to turn to emerging technologies like AI, to help adapt to the pace of change. Here, we take a look at some key use cases of AI in Healthcare.

Medical Imaging

The application of Machine Learning (ML) in Medical Imaging is showing highly encouraging results. ML is a subset of AI, where algorithms and models are used to help machines imitate the cognitive functions of the human brain and to also self-learn from their experiences.

AI can be gainfully used in the different stages of medical imaging- in acquisition, image reconstruction, processing, interpretation, storage, data mining & beyond. The performance of ML computational models improves tremendously as they get exposed to more & more data and this foundation on colossal amounts of data enables them to gradually better humans at interpretation. They begin to detect anomalies not perceptible to the human eye & not discernible to the human brain!

What goes hand-in-hand with data, is noise. Noise creates artifacts in images and reduces its quality, leading to inaccurate diagnosis. AI systems work through the clutter and aid noise- reduction leading to better precision in diagnosis, prognosis, staging, segmentation and treatment.

At the forefront of this use case is Radio genomics- correlating cancer imaging features and gene expression. Needless to say, this will play a pivotal role in cancer research.

Drug Discovery

Drug Discovery is an arduous process that takes several years from the start of research to obtaining approval to market. Research involves laboring through copious amounts of medical literature to identify the dynamics between genes, molecular targets, pathways, candidate compounds. Sifting through all of this complex data to arrive at conclusions is an enormous challenge. When this voluminous data is fed to the ML computational models, relationships are reliably established. AI powered by domain knowledge is slashing down time & cost involved in new drug development.

Cybersecurity in Healthcare

Data security is of paramount importance to Healthcare providers who need to ensure confidentiality, integrity, and availability of patient data. With cyberattacks increasing in number and complexity, these formidable threats are giving security teams sleepless nights! The main strength of AI is its ability to curate massive quantities of data- here threat intelligence, nullify the noise, provide instant insights & self-learn in the process. Predictive & Prescriptive capabilities of these computational models drastically reduces response time.

Virtual Health assistants

Virtual Health assistants like Chatbots, give patients 24/7 access to critical information, in addition to offering services like scheduling health check-ups or setting up appointments. AI- based platforms for wearable health devices and health apps come armed with loads of features to monitor health signs, daily activities, diet, sleep patterns etc. and provide alerts for immediate action or suggest personalized plans to enable healthy lifestyles.

AI for Healthcare IT Infrastructure

Healthcare IT Infrastructure running critical applications that enable patient care, is the heart of a Healthcare provider. With dynamically changing IT landscapes that are distributed, hybrid & on-demand, IT Operations teams are finding it hard to keep up. Artificial Intelligence for IT Ops (AIOps) is poised to fundamentally transform the Healthcare Industry. It is powering Healthcare Providers across the globe, who are adopting it to Automate, Predict, Remediate & Prevent Incidents in their IT Infrastructure. GAVS’ Zero Incident FrameworkTM (ZIF) – an AIOps Platform, is a pure-play AI platform based on unsupervised Machine Learning and comes with the full suite of tools an IT Infrastructure team would need. Please watch this video to learn more.

READ ALSO OUR NEW UPDATES

DevOps

Today’s mantra for software delivery is Agility. It is a huge differentiator that gives organizations a competitive edge, and emboldens even fledgling start-ups to challenge giants in the IT industry. Traditional methods of software development have not been able to cope with today’s velocity of delivery and innovation demands, and are screaming for a lighter yet holistic approach. In traditional development models like Waterfall, phases of the development life cycle are followed sequentially: Requirements, Analysis & Design, Development, Integration & Testing, Deployment, and Maintenance, with documentation and sign-off at the end of each phase. This approach is heavy and documentation-intensive and is not quite responsive to requirement or scope changes since it dictates strict adherence to the linear process model. And worse still, the end product of a long-drawn development cycle may not be quite what the customer expected!

Agile development broadly refers to methodologies like Scrum & Kanban that are based on iterative development/testing in short bursts called sprints, continuous feedback, retrospection, course-correction and constant collaboration amongst the teams involved & the customer. This facilitates incremental evolution of the software and adaptability.

What does DevOps bring to the table?

The idea behind DevOps is to foster a collaborative work culture within the organization where the Development, QA & ITOps teams work together as one cross-functional unit towards common goals. When the walls of team silos are broken down; teams integrate well with each other; and there is free flow of communication, it percolates down to quicker, quality deliverables with very low failure rates. It is such a basic but powerful idea that makes us wonder why we didn’t do it all along!

DevOps practices help create a standardized and stable operating environment and eliminate the warring dynamics between Dev, QA & Ops teams; each working on their own agenda; refusing to take ownership and blaming each other when issues occur, because they are now one multi-disciplinary unit working together from day 01, with full control and autonomy over the entire software delivery process.

DevOps is a winning combination of healthy work culture, processes and tools/cloud services. A good DevOps implementation ensures incremental evolution of software through processes like Continuous Integration, Continuous Delivery/Continuous Deployment(CI/CD), automated in a process pipeline with an integrated toolchain, so that human-induced wait-time/lag does not hamper agility. Continuous Testing and Continuous Performance Monitoring are also an integral part of this pipeline. Although there will be differences in the DevOps implementation styles and the tools/services used, here’s a quick look at a typical CI/CD process.

Frequent commits and builds of small blocks of code is a DevOps best practice. This prevents the chaos that usually ensues when code from the different feature branches get integrated into the main code branch, firing merge conflicts. Continuous Integration is a workflow strategy that involves compiling the committed code that is on a source repository like GitHub, Azure Repos or Bitbucket, validating it with automated static code analysis, unit and integration tests. The quality of the test suite will determine the quality of the newly integrated code. Typically this involves an Integration server/CI service like Jenkins, Azure DevOps or GoCD that gets triggered on commit(or a pull request(PR) as the implementation may be), that builds the code and runs automated tests.

Continuous Delivery(CD) is an extension of Continuous Integration(CI) where further tests such as UI/UX, Load, UAT, QA are done in varied environments, and finally deployed to a staging environment. This process makes it deploy-ready to production, based on approval.

Continuous Deployment is the same as Continuous Delivery with the exception that the deployable package is automatically promoted into production without the need for human approval. This is routinely done in highly mature DevOps implementations but is obviously too risky for those just starting out on their DevOps journeys. Such organizations could stay with Continuous Delivery until their DevOps environment stabilizes. For those progressing into Continuous Deployment, DevOps offers granular control for things like users to deploy to and the time of deployment.

Infrastructure as Code

Infrastructure as Code(IaC) is another important DevOps practice used with Continuous Delivery. As the name suggests, this is the management of infrastructure using code where the desired configuration settings of the environment are specified as code. Every time this code is run, the same environment is generated. This solves issues arising out of inconsistencies in environment configurations and the need for manually maintaining the settings of deployment environments. The pipeline executes this code to configure multiple test targets, enabling application testing in simulated production environments. IaC typically follows application code versioning and is validated just like regular code. It also enables dynamic provisioning of environments on demand.

DevOps Benefits

High delivery velocity and predictable quality are automatic outcomes of a good DevOps implementation since iterative development based on feedback & course-correction, continuous testing & monitoring are core to the methodology. No time is lost in big releases, managing the big release mayhem. Continuous customer feedback looped into the process helps avoid go-live surprises(read shocks) for the customer.

DevOps principles foster responsible autonomy and a spirit of collaboration where the entire cross-functional, cross-trained DevOps team works together and takes end-to-end responsibility from start to finish.

Pipeline Automation injects speed, delivery predictability and enhances productivity by freeing human resources from manual tasks and giving them a sense of satisfaction and purpose as they are able to routinely see their work in customer’s hands.

Taking the plunge

Everyone wants to hop on the DevOps bandwagon but not all of them have clarity on where and how to start. DevOps is first and foremost a change in work culture and its implementation is primarily an exercise in changing mindsets and behavioural aspects at the workplace.

Importantly, the organization needs to arrive at clear objectives and expected business outcomes. As with most things, it’s always a good idea to start small, stabilize and use that as a pivot to move forward to the next baby step. Piloting it on a low risk application with few users will help everyone ease into the new style of working and give the DevOps movement some momentum. It would also help to initially run the current and DevOps environments in parallel to reduce risk.

Importantly, the initiative needs to be backed by the right processes, automation tools and employee enablement by cross training software developers, QA and IT personnel, enabling them to take charge of the entire process.

Amazon and Netflix are fantastic examples of DevOps done right. Dave Hahn, SRE manager at Netflix says as a company, they don’t think about DevOps and that DevOps is just the wonderful result of a

healthy culture and healthy thinking! Netflix, according to Dave, has millions of customers across the globe, has hundreds of thousands of customer interactions every second, streams tens of billions of hours of entertainment every quarter and manages it all with just 10s of Ops engineers who are also software engineers! That’s the power of DevOps!

For information on our DevOps offerings, please click here.