Is AR the Future of our Increasingly Digital World?

Imagine a device which, when used to look at something, throws up information on whatever you’re pointing at. Menus for restaurants, dates of establishment for institutes, and so on. These are the sort of possibilities afforded by Augmented Reality (AR).

What is AR?

AR is a science fiction idea that successfully made the transition to reality. The fundamental idea behind augmented reality is to add something extra to your experience of reality. So, if you are watching a movie or playing a game, AR adds to that experience in some way to make it immersive and interactive. AR essentially superimposes computer-generated information (audio, visual, haptic, etc.) on real-world objects.

AR can be defined as a system that fulfills three basic features: a combination of real and virtual worlds, real-time interaction, and accurate 3D registration of virtual and real objects.

How does AR work?

A camera-equipped device is essential for an AR experience. Upon pointing the device at an object, computer vision technology is used to recognize it. The device then downloads information about the object from the cloud, in much the same way that a web browser loads a page via a URL. In this case, the information is presented in a 3-D experience.

AR can provide a view of the real-time data flowing from products and allow users to control them by touchscreen, voice, or gesture. An operator using an AR headset to interact with an industrial robot might see superimposed data about the robot’s performance and gain access to its controls.

The size and orientation of objects viewed through the AR display adjust in real time. New graphical or text information comes into view while other information passes out of view as the user moves about. In industrial settings, users in different roles, such as a machine operator and a maintenance technician, can look at the same object but be presented with different AR experiences that are tailored to their needs.
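To see why an overlay has to be resized as the user moves, here is a minimal Python sketch of the pinhole-camera relationship between distance and on-screen size. The function name, the focal length, and the object size are assumptions chosen for illustration; real AR frameworks handle this internally through full 3D pose tracking.

```python
def projected_size_px(real_size_m: float, distance_m: float, focal_length_px: float) -> float:
    """Apparent on-screen size of an object under a simple pinhole-camera model."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_length_px * real_size_m / distance_m


# Hypothetical values: a 0.5 m wide label anchored to a machine,
# viewed through a camera with an assumed focal length of 1000 pixels.
for distance in (1.0, 2.0, 4.0):
    width = projected_size_px(0.5, distance, focal_length_px=1000.0)
    print(f"{distance:.0f} m away -> {width:.0f} px wide")
```

Doubling the distance halves the overlay's width, which is the kind of adjustment the display has to make continuously.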

AR – a novel way of shopping

A 2018 Gartner report stated, “By 2020, 100 million consumers will shop in AR online and in-store.” The current global pandemic has put a damper on consumer sentiments worldwide and we may not see those numbers. But AR can help make the in-store shopping experience more secure by reducing the need to touch a lot of objects and surfaces.

Brands like American Apparel, Uniqlo, and Lacoste already have showrooms and fitting rooms that provide try-before-you-buy options in AR spaces. Smart mirror technologies that scan RFID tags also offer the ability to bring recommendations to the brick-and-mortar shopping experience.

IKEA customers have access to an app that permits them to point their phones at spaces and see what different products would look like in their own homes.

In the current global climate, fashion and lifestyle brands stand to gain from technologies that handle facial recognition, adapt to local lighting conditions, and provide personalized recommendations.

According to a BRP report, 48% of consumers said they would be more inclined to buy from a retailer that provided AR experiences. Retailers may be able to attract more customers with an immersive and secure shopping experience in a post-Corona world.

35% of sales on Amazon are derived from its recommendation engine, which is powered by Machine Learning. Leveraging this in the real world also has immense commercial potential.

AR for Navigation Solutions

Map services from Google and Apple have already found mass acceptance; indoor navigation is next. Apps based on ARKit and ARCore can enable navigation inside spaces like airports, malls, and hospitals. Gatwick Airport has already deployed its own smartphone solution that provides routes to terminals and gates based on a user’s flight number.

In 2019, a beta version of the AR walking directions feature was launched in Google Maps for all AR-compatible iOS and Android mobile devices. You could view information about your surroundings by pointing your phone’s camera towards them.

AR in Automotive Industry

AR can be used in many ways in the automotive industry, from dashboard-mounted heads-up displays to interactive experiences in showrooms and more.

AR is also employed by some carmakers to aid in car maintenance (Volkswagen’s Marta app) and in car manufacturing and selling processes (Volvo’s project with Microsoft HoloLens).

The heads-up display is one of the most popular uses of AR in this industry. Not only can drivers get directions and alerts on hazards, but also information on landmarks and nearby locations.

Hyundai has been a leader in AR research that goes beyond the cockpit-style view of the motorist’s experience. The company has reimagined maintenance manuals with AR and offers apps that let owners point their phones at their cars to get information. Mercedes has a similar app, but its version adds a chatbot to provide virtual assistance.

AR in Healthcare

Applications of AR are opening up new opportunities in the healthcare industry. It’s expected that the global market will reach a value of $1.5B. By equipping healthcare workers with real-time data and patient information, AR can aid in more accurate diagnoses and more precise surgeries.

AR can also bring huge value to practicing medicine and education by allowing students and trainee physicians to better visualize health issues and scenarios that they one day will be treating. The benefit that AR can bring to the healthcare industry can be ground-breaking and we are just witnessing the beginning of what is to come from AR in the field of medicine.

AR-powered Solutions for Enterprises

Smart glasses are quickly gaining popularity, and military, medical, and enterprise solutions are beginning to prove the value of combining AR with headsets and smart glasses.

Microsoft HoloLens 2 was likely the most anticipated product in this space in 2019. The company hopes to roll out its technology to great fanfare by demonstrating improvements in raw processing power, battery life, and wearability. The U.S. Army has awarded a $480 million contract to Microsoft, and the company is also working with the industrial IoT firm PTC to streamline the development of both augmented and mixed reality products.

Walmart and Tyson are testing programs that will transition traditional training methods into mixed reality (MR) settings. This will bring about new ways to learn about compliance and safety issues by looking around mixed-reality environments and identifying problems in a way that’s practical and engaging. Integration with other recent workplace training trends, especially gamification, may compound the returns that AR and MR solutions generate. Per ABI Research, AR-based training in enterprise will be a $6 billion industry by 2022.

Improvements in prototyping, testing, troubleshooting, and quality control are expected to emerge from this trend, too, as workers will be able to make on-the-fly comparisons of real-world items against available documentation and specifications. Jobs that call for workers’ hands to be free will also benefit significantly from AR headsets and glasses.

Augmented reality is the next ‘BIG THING’. It will revolutionize almost every aspect of life, from medicine to education to construction to entertainment. AR applications have already started to appear on the world’s laptops, tablets, and smartphones.

About the Author

Kalpana Vijayakumar

Kalpana is a database developer. She strongly believes that “It’s not that we use technology, we live technology.”
Outside of her professional role, Kalpana is passionate about travelling and watching movies.

Algorithmic Alert Correlation

Today’s always-on businesses and 24×7 uptime demands have pushed IT monitoring into overdrive. While constant monitoring is a good thing, the downside is that the flood of alerts generated can quickly get overwhelming. Constantly having to deal with thousands of alerts each day causes alert fatigue and impacts the overall efficiency of the monitoring process.

Hence, chalking out an optimal strategy for alert generation & management becomes critical. Pattern-based thresholding is an important first step, since it tunes thresholds continuously, to adapt to what ‘normal’ is, for the real-time environment. Threshold accuracy eliminates false positives and prevents alerts from getting fired incorrectly. Selective alert suppression during routine IT Ops maintenance activities like backups, patches, or upgrades, is another. While there are many other strategies to keep alert numbers under control, a key process in alert management is the grouping of alerts, known as alert correlation. It groups similar alerts under one actionable incident, thereby reducing the number of alerts to be handled individually.
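To make the idea of pattern-based thresholding concrete, here is a minimal Python sketch in which the threshold is recomputed from a rolling window of recent readings instead of being fixed. The window size, the multiplier `k`, and the sample readings are assumptions for illustration only; this is not how any particular product tunes its thresholds.

```python
from collections import deque
from statistics import mean, pstdev


def adaptive_alerts(values, window=5, k=3.0):
    """Yield (index, value, threshold) for points above rolling mean + k * stdev."""
    history = deque(maxlen=window)
    for i, value in enumerate(values):
        if len(history) == window:
            # The threshold follows whatever "normal" has been recently.
            threshold = mean(history) + k * pstdev(history)
            if value > threshold:
                yield i, value, threshold
        history.append(value)


# Hypothetical CPU readings (%): a stable baseline with one spike.
cpu = [40, 42, 41, 39, 40, 41, 95, 40, 41, 42]
flagged = [i for i, _, _ in adaptive_alerts(cpu)]
print(flagged)
```

Small fluctuations around the baseline stay below the moving threshold, so only the spike fires an alert.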

But how is alert ‘similarity’ determined? One way to do this is through similarity definitions, in the context of that IT landscape. A definition, for instance, would group together alerts generated from applications on the same host, or connectivity issues from the same data center. This implies that similarity definitions depend on the physical and logical relationships in the environment, in other words, the topology map. Topology mappers detect dependencies between applications, processes, networks, infrastructure, etc., and construct an enterprise blueprint that is used for alert correlation.
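A similarity definition of this kind can be sketched in a few lines of Python. The alert fields (`id`, `host`, `app`, `message`) and the sample values are hypothetical and only illustrate grouping by one topology attribute.

```python
from collections import defaultdict

# Hypothetical alerts; the field names are illustrative, not a real schema.
alerts = [
    {"id": "N1", "host": "host-1", "app": "billing", "message": "slow response"},
    {"id": "N2", "host": "host-1", "app": "reports", "message": "high CPU"},
    {"id": "N3", "host": "host-2", "app": "search", "message": "timeout"},
]


def correlate_by(alerts, key):
    """Group alert ids that share the same value for a topology attribute."""
    incidents = defaultdict(list)
    for alert in alerts:
        incidents[alert[key]].append(alert["id"])
    return dict(incidents)


print(correlate_by(alerts, "host"))
```

Three alerts collapse into two incidents, one per host. The same function could group by data center or application if the alerts carried those attributes.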

But what about related alerts generated by entities that are neither physically nor logically linked? To give a hypothetical example, let’s say application A accesses a server S which is responding slowly, and so A triggers alert A1. This slow communication of A with S eats up host bandwidth, and hence affects another application B on the same host. Due to this, if a third application C from another host calls B, alert A2 is fired by C due to the delayed response from B. Now, although we see the link between alerts A1 & A2, they are neither physically nor logically related, so how can they be correlated? In reality, such situations could imply thousands of individual alerts that cannot be combined.

Algorithmic Alert Correlation

This is one of the many challenges in IT operations that we have been trying to solve at GAVS. The correlation engine of our AIOps Platform ZIF uses algorithmic alert correlation to find a solution for this problem. We are working on two unsupervised machine learning algorithms that are fundamentally different in their approach – one based on pattern recognition and the other based on spatial clustering. Both algorithms can function with or without a topology map, and work around what is supplied and available. The pattern learning algorithm derives associations based on learnings from historic patterns of alert relationships. The spatial clustering algorithm works on the principle of similarity based on multiple features of alerts, including problem similarity derived by applying Natural Language Processing (NLP), and relationships, among several others. Tuning parameters enable customization of algorithmic behavior to meet specific demands, without requiring modifications to the core algorithms. Time is another important dimension factored into these algorithms, since the clustering of alerts generated over an extended period of time will not give meaningful results.
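The role of text similarity and time can be illustrated with a deliberately simple Python sketch. It is not the ZIF algorithm: token overlap (Jaccard similarity) stands in for NLP-based problem similarity, the greedy grouping stands in for real clustering, and the parameter names and sample alerts are assumptions.

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two alert messages (0.0 to 1.0)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def cluster_alerts(alerts, max_gap_s=120, min_similarity=0.4):
    """Greedily group (timestamp_s, message) alerts that are close in time and wording."""
    clusters = []
    for ts, msg in sorted(alerts):
        for cluster in clusters:
            last_ts, last_msg = cluster[-1]
            if ts - last_ts <= max_gap_s and jaccard(msg, last_msg) >= min_similarity:
                cluster.append((ts, msg))
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append([(ts, msg)])
    return clusters


# Hypothetical alerts with no topology information: (seconds since start, text).
alerts = [
    (0, "slow response from server S"),
    (30, "slow response from application B"),
    (45, "disk usage above 90 percent"),
    (4000, "slow response from server S"),
]
for cluster in cluster_alerts(alerts):
    print(cluster)
```

The two "slow response" alerts that arrive within 30 seconds are grouped without any topology map, while the identical message more than an hour later starts a new cluster. `max_gap_s` and `min_similarity` play the part of the tuning parameters mentioned above.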

Traditional alert correlation has not been able to scale up to handle the volume and complexity of alerts generated by the modern-day hybrid and dynamic IT infrastructure. We have reached a point where our ITOps needs have surpassed the limits of human capabilities, and so, supplementing our intelligence with Artificial Intelligence and Machine Learning has now become indispensable.

About the Authors –

Padmapriya Sridhar

Priya is part of the Marketing team at GAVS. She is passionate about Technology, Indian Classical Arts, Travel, and Yoga. She aspires to become a Yoga Instructor someday!

Gireesh Sreedhar KP

Gireesh is a part of the projects run in collaboration with IIT Madras for developing AI solutions and algorithms. His interests include Data Science, Machine Learning, Financial markets, and Geo-politics. He believes that he is competing against himself to become better than who he was yesterday. He aspires to become a well-recognized subject matter expert in the field of Artificial Intelligence.

Cloud Adoption, Challenges, and Solution Through Monitoring, AI & Automation

Cloud Adoption

Cloud computing is the delivery of computing services including Servers, Database, Storage, Networking & others over the internet. Public, Private & Hybrid clouds are different ways of deploying cloud computing.  

  • In a public cloud, the cloud resources are owned by a third-party cloud service provider
  • A private cloud consists of computing resources used exclusively by one business or organization
  • A hybrid cloud provides the best of both worlds, combining on-premises infrastructure and private cloud with public cloud

Microsoft, Google, Amazon, Oracle, IBM, and others provide cloud platforms on which users can host and run practical business solutions. The worldwide public cloud services market is forecast to grow 17% in 2020 to total $266.4 billion, and to reach $354.6 billion in 2022, up from $227.8 billion in 2019, per Gartner, Inc.
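As a quick sanity check on these figures, the growth rates they imply can be worked out in a few lines of Python. Only the three dollar amounts come from the text; the compound annual growth rate for 2020 to 2022 is derived here and is not a number quoted from Gartner.

```python
# Figures quoted above (USD billions).
market = {2019: 227.8, 2020: 266.4, 2022: 354.6}

growth_2020 = market[2020] / market[2019] - 1
# Implied compound annual growth rate from 2020 to 2022 (two years).
cagr_2020_2022 = (market[2022] / market[2020]) ** (1 / 2) - 1

print(f"2019 -> 2020 growth: {growth_2020:.1%}")
print(f"Implied 2020 -> 2022 annual growth: {cagr_2020_2022:.1%}")
```

The first figure rounds to the quoted 17%, and the forecast for 2022 implies growth of roughly 15% a year after that.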

There are various types of Instances, workloads & options available as part of cloud ecosystem, i.e. IaaS, PaaS, SaaS, Multi-cloud, Serverless.

Challenges

When very large, large, and medium enterprises decide to move their IT environments from on-premise to the cloud, they typically move some or most of their on-premise workloads to the cloud and keep the rest under their own control on-premise. Various factors impact the decision, to name a few:

  1. ROI vs Cost of Cloud Instance, Operation cost
  2. Architecture dependency of the application, i.e. whether it is monolithic or multi-tier or polyglot or hybrid cloud
  3. Requirement and need for elasticity and scalability
  4. Availability of right solution from the cloud provider
  5. Security of some key data

Once these hurdles are crossed and the IT environment is cloud-enabled, the challenge is to ensure that the cloud-enabled IT environment is monitored. Here are some of the business and IT challenges:

1. How to ensure the various workloads & Instances are working as expected?

While the cloud provider may give high availability & uptime depending on the tier we choose, it is important that our IT team monitors the environment, particularly in the case of IaaS and, to some extent, PaaS.

2. How to ensure the Instances are optimally used in terms of compute and storage?

Cloud providers give most of the metrics around the Instances, though they may not provide all the metrics we need to make decisions in every scenario.

The disadvantages of this model are cost, latency, and complexity. For example, Log Analytics in Azure incurs a cost for every MB/GB of data stored, and there is latency in getting the right metrics at the right time; if the metrics are delayed, you may not get the right result.

3. How to ensure the Application or the components of a single solution that are spread across on-premise and Cloud environments are working as expected?

Some cloud providers give tools for integrating the metrics from on-premise to cloud environment to have a shared view.

The disadvantage of this model is that it is not possible to bring all sorts of data together to get insights directly. That is, observability is always a question. The ownership of achieving observability lies with the IT team that handles the data.

4. How to ensure the Multi-Cloud + On-Premise environment is effectively monitored & utilized to ensure the best End-user experience?

Multi-Cloud environment – With the rapid growth of microservices architectures and container-based, cloud-enabled models, it is quite natural that an enterprise may choose the best offerings from different cloud providers such as Azure, AWS, Google & others.

Cloud providers offer little support in this space. In fact, some cloud providers do not even support this scenario.

5. How to get a single panel of view for troubleshooting & root cause analysis?

When a problem occurs in the Application, Database, Middle Tier, Network & 3rd party layers that are spread across a multi-cluster, multi-cloud, elastic environment, it is especially important to get a unified view of the entire environment.

ZIF (Zero Incident Framework™) provides a single platform for Cloud Monitoring.

ZIF has Discovery, Monitoring, Prediction & Remediation capabilities that fit seamlessly into a cloud-enabled solution. ZIF provides a unified dashboard with insights across all layers of IT infrastructure distributed across On-premise hosts, Cloud Instances & Containers.

Core features & benefits of ZIF for Cloud Monitoring are:

1. Discovery & Topology

  • Discovers and provides dynamic mapping of resources across all layers.
  • Provides real-time mapping of applications and their dependent layers irrespective of whether the components live on-premise, on cloud, or containerized in cloud.
  • Dynamically built topology of all layers which helps in taking effective decisions.

2. Observability across Multi-Cloud, Hybrid-Cloud & On-Premise tiers

  • It is not just about collecting metrics; it is very important to analyze the monitored data and provide meaningful insights.
  • When the IT infrastructure is spread across multiple cloud platform like Azure, AWS, Google Cloud, and others, it is important to get a unified view of your entire environment along with the on-premise servers.
  • The health of each layer is represented in topology format, which helps to understand the impact and take necessary actions.

3. Prediction driven decision for resource optimization

  • Prediction engine analyses the metrics of cloud resources and predicts the resource usage. This helps the resource owner to take proactive action rather than being reactive.
  • Provides meaningful insights and alerts in terms of the surge in the load, the growth in number of VMs, containers, and the usage of resource across other workloads.
  • Authorize the Elasticity & Scalability through real-time metrics.
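To give a feel for what prediction-driven optimization means in practice, here is a minimal Python sketch that extrapolates a usage trend and raises a flag before a limit is reached. It uses an ordinary least-squares line, which is an assumption made for illustration; the text does not describe the model ZIF's prediction engine actually uses, and the usage figures and the 70% threshold are hypothetical.

```python
def linear_forecast(history, steps_ahead):
    """Fit y = slope * t + intercept by least squares and extrapolate."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return slope * (n - 1 + steps_ahead) + intercept


# Hypothetical daily disk usage (%) for one VM.
usage = [50, 52, 54, 56, 58, 60]
predicted = linear_forecast(usage, steps_ahead=5)
if predicted >= 70:  # assumed capacity-planning threshold
    print(f"Predicted usage {predicted:.0f}% in 5 days: consider scaling out")
```

The resource owner gets the warning five days before the threshold is crossed, which is the difference between proactive and reactive action.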

4. Container & Microservice support

  • Understand the resource utilization of your containers that are hosted in Cloud & On-Premise.
  • Know the bottlenecks around the Microservices and tune your environment for the spikes in load.
  • Provides full support for monitoring applications distributed across your local host & containers in cloud in a multi-cluster setup.

5. Root cause analysis made simple

  • Quick root cause analysis by analysing various causes captured by ZIF Monitor instead of going through layer by layer. This saves time to focus on problem-solving and arresting instead of spending effort on identifying the root cause.
  • Provides insights across your workload including the impact due to 3rd party layers as well.

6. Automation

  • Irrespective of whether the workload and instance are on-premise, on Azure, on AWS, or with another provider, the ZIF automation module can automate basic to complex activities

7. Ensure End User Experience

  • Helps to improve the experience of end users who are served by workloads from the cloud.
  • ZIF tracing traces every request of every user, which lets ZIF unearth performance bottlenecks across all layers. This in turn helps to address the problem and thereby improve the User Experience

Cloud and Container Platform Support

ZIF seamlessly integrates with the following Cloud & Container environments:

  • Microsoft Azure
  • AWS
  • Google Cloud
  • Grafana Cloud
  • Docker
  • Kubernetes

About the Author

Suresh Kumar Ramasamy


Suresh heads the Monitor component of ZIF at GAVS. He has 20 years of experience in Native Applications, Web, Cloud, and Hybrid platforms from Engineering to Product Management. He has designed & hosted the monitoring solutions. He has been instrumental in conglomerating components to structure the Environment Performance Management suite of ZIF Monitor.

Suresh enjoys playing badminton with his children. He is passionate about gardening, especially medicinal plants.

Generative Adversarial Networks (GAN)

In my previous article (zif.ai/inverse-reinforcement-learning/), I introduced Inverse Reinforcement Learning and explained how it differs from Reinforcement Learning. In this article, let’s explore Generative Adversarial Networks or GAN; both GAN and reinforcement learning help us understand how deep learning is trying to imitate human thinking.

With access to greater hardware power, Neural Networks have made great progress. We use them to recognize images and voice at levels comparable to humans, sometimes with even better accuracy. Even so, we are very far from automating human tasks with machines: a tremendous amount of information is out there, to a large extent easily accessible in the digital world of bits, and the tricky part is to develop models and algorithms that can analyze and understand this humongous amount of data.

GAN, in a way, comes close to achieving the above goal through what we call automation. We will see the use cases of GAN later in this article.

This technique is very new to the Machine Learning (ML) world. GAN is a deep learning, unsupervised machine learning technique proposed by Ian Goodfellow and a few other researchers, including Yoshua Bengio, in 2014. One of the most prominent researchers in the deep learning area, Yann LeCun, described it as “the most interesting idea in the last 10 years in Machine Learning”.

What is Generative Adversarial Network (GAN)?

A GAN is a machine learning model in which two neural networks compete to become more accurate in their predictions. GANs typically run unsupervised and use a competitive zero-sum game framework to learn.

The logic of GANs lies in the rivalry between the two Neural Nets. It mimics the idea of rivalry between a picture forger and an art detective who repeatedly try to outwit one another. Both networks are trained on the same data set.

A generative adversarial network (GAN) has two parts:

  • The generator (the artist) learns to generate plausible data. The generated instances become negative training examples for the discriminator.
  • The discriminator (the critic) learns to distinguish the generator’s fake data from real data. The discriminator penalizes the generator for producing implausible results.

GAN can be compared with Reinforcement Learning, where the generator is receiving a reward signal from the discriminator letting it know whether the generated data is accurate or not.

During training, the generator tries to become better at generating real-looking images, while the discriminator trains to better classify those images as fake. The process reaches equilibrium at the point when the discriminator can no longer distinguish real images from fakes.
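For readers who prefer notation, this contest is usually written as the minimax objective from the original GAN paper (Goodfellow et al., 2014). Here D(x) is the discriminator's estimated probability that x is real, G(z) is the generator's output for random noise z, p_data is the distribution of the real data, and p_z is the noise distribution:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

At the equilibrium of this game, the generator's distribution matches the data distribution and the discriminator outputs 1/2 for every input, which is the formal version of "can no longer distinguish real from fake".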

Here are the steps a GAN takes:

  • The generator takes random numbers as input and returns an image.
  • The output image of the generator is fed as input to the discriminator along with a stream of images taken from the actual dataset.
  • Both real and fake images are given to the discriminator, which returns probabilities: a number between 0 and 1, with 1 meaning a prediction of authenticity and 0 meaning fake.
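The steps above can be traced in a toy Python sketch that swaps images for single numbers so that it stays short and needs no libraries. Everything specific here is an assumption made for illustration: the "real" data are numbers drawn around 4.0, the generator is a straight line `a * z + b`, the discriminator is a single logistic unit, and the learning rate and step count are arbitrary. It shows the alternating update pattern, not a production GAN, and it makes no claim about how well such a tiny model converges.

```python
import math
import random


def sigmoid(s: float) -> float:
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=500, lr=0.05, seed=0):
    """Alternate discriminator and generator updates on 1-D data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator: G(z) = a * z + b
    w, c = 0.0, 0.0   # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)          # step 1: random numbers in...
        fake = a * z + b                 # ...a generated sample out
        real = rng.gauss(4.0, 1.0)       # step 2: a sample from the "dataset"
        d_real = sigmoid(w * real + c)   # step 3: probabilities between 0 and 1
        d_fake = sigmoid(w * fake + c)

        # Discriminator update: minimise -log D(real) - log(1 - D(fake)).
        grad_w = (d_real - 1.0) * real + d_fake * fake
        grad_c = (d_real - 1.0) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator update: minimise -log D(fake) against the updated discriminator.
        d_fake = sigmoid(w * fake + c)
        a -= lr * (d_fake - 1.0) * w * z
        b -= lr * (d_fake - 1.0) * w
    return (a, b), (w, c)


(gen_a, gen_b), (disc_w, disc_c) = train_toy_gan()
print("generator parameters:", gen_a, gen_b)
print("D(4.0) =", sigmoid(disc_w * 4.0 + disc_c))
```

The generator never sees the real data directly; it only learns through the discriminator's verdict on its fakes.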

So, you have a double feedback loop in the architecture of GAN:

  • The discriminator is in a feedback loop with the ground truth of the images from the actual training dataset
  • The generator is, in turn, in a feedback loop along with the discriminator.

Most GANs today are at least loosely based on the DCGAN architecture (Radford et al., 2015). DCGAN stands for “deep, convolution GAN.” Though GANs were both deep and convolutional prior to DCGANs, the name DCGAN is useful to refer to this specific style of architecture.

Applications of GAN

Now that we know what GAN is and how it works, it is time to dive into the interesting applications of GANs that are commonly used in the industry right now.

[Figure: faces generated by GANs]

Can you guess what’s common among all the faces in this image?

None of these people are real! These faces were generated by GANs. Exciting and at the same time scary, right? We will focus on the ethical applications of GANs in this article.

GANs for Image Editing

Using GANs, appearances can be drastically changed by reconstructing the images.

GANs for Security

GANs have been able to address the concern of ‘adversarial attacks’.

These adversarial attacks use a variety of techniques to fool deep learning architectures. GANs make existing deep learning models more robust to these techniques by creating more such fake examples and training the model to identify them.

Generating Data with GANs

Data availability is a necessity in certain domains, especially where training data is needed to build learning algorithms. The healthcare industry comes to mind here. GANs shine again, as they can be used to generate synthetic data for supervised learning.

GANs for 3D Object Generation

GANs are quite popular in the gaming industry. Game designers work countless hours recreating 3D avatars and backgrounds to give them a realistic feel, and it certainly takes a lot of effort to create 3D models from imagination. With the power of GANs, the entire process can be automated!

GANs are one of the few successful techniques in unsupervised machine learning, and they are evolving quickly and improving our ability to perform generative tasks. Most of the successful applications of GANs so far have been in the domain of computer vision. Generative models have a lot of potential, but are not without drawbacks.

About the Author –

Naresh B

Naresh is a part of Location Zero at GAVS as an AI/ML solutions developer. His focus is on solving problems leveraging AI/ML.
He strongly believes in making success a habit rather than considering it a destination.
In his free time, he likes to spend time with his pet dogs and likes sketching and gardening.

Lambda (λ), Kappa (κ) and Zeta (ζ) – The Tale of 3 AIOps Musketeers (PART-3)

“Data that sit unused are no different from data that were never collected in the first place.” – Doug Fisher

In part 1 (https://bit.ly/3hDChCH), we delved into Lambda Architecture, and in part 2 (https://bit.ly/3hDCg1B), into Generic Lambda. Generic Lambda architecture has limitations and inherent complexity: the data is replicated in two layers, and keeping them in sync is quite challenging in an already complex distributed system. There is growing interest in finding a simpler alternative to Generic Lambda that would bring just about the same benefits and handle the full problem set. The solution is Unified Lambda (λ) Architecture.

Unified Lambda (λ) Architecture

The unified approach addresses the velocity and volume problems of Big Data as it uses a hybrid computation model. This model combines both batch data and instantaneous data transparently.

There are basically three approaches:

  1. Pure Streaming Framework
  2. Pure Batch Framework
  3. Lambdoop Framework

1. Pure streaming framework

In this approach, a pure streaming model is adopted, and a flexible framework like Apache Samza can be employed to provide a unified data processing model for both stream and batch processing using the same data flow structure.

To avoid the large turn-around times involved in Hadoop’s batch processing, LinkedIn came up with a distributed stream processing framework, Apache Samza. It is built on top of a distributed messaging bus, Apache Kafka, so that it can be a lightweight framework for a streaming platform, i.e. for continuous data processing. Samza has built-in integration with Apache Kafka, comparable to the relationship between MapReduce and HDFS. In the Hadoop world, HDFS is the storage layer and MapReduce the processing layer. In a similar way, Apache Kafka ingests and stores the data in topics, which is then streamed and processed by Samza. Samza normally computes results continuously as and when the data arrives, thus delivering sub-second response times.

Although it is a distributed stream processing framework, its architecture is pluggable, i.e. it can be integrated with many sources such as HDFS, Azure EventHubs, and Kinesis, apart from Kafka. It follows the principle of WRITE ONCE, RUN ANYWHERE, meaning the same code can run in both stream and batch mode. Apache Samza’s streams are made up of re-playable, ordered partitions.

Unified API for Batch & Streaming in pure Streaming

Apache Samza offers a unified data processing model for both real-time and batch processing. Whether the input data is bounded or unbounded determines the processing model, batch or stream. Typically, bounded sources (e.g. static files on HDFS) are batch data sources, and streams are unbounded (e.g. a topic in Kafka). Under the bonnet, Apache Samza’s stream-processing engine handles both types with high efficiency.
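The idea of one processing model for bounded and unbounded input can be shown with a small Python sketch. This is plain Python, not the Samza API: a list stands in for static files on HDFS, an endless generator stands in for a Kafka topic, and the event shape (`{"os": ...}`) is an assumption.

```python
import itertools


def count_by_os(events):
    """Yield a running count per device OS; works for any iterable of events."""
    counts = {}
    for event in events:
        os_name = event["os"]
        counts[os_name] = counts.get(os_name, 0) + 1
        yield os_name, counts[os_name]


# Bounded source: a finite list, standing in for static files on HDFS.
batch = [{"os": "android"}, {"os": "ios"}, {"os": "android"}]
print(list(count_by_os(batch)))


# Unbounded source: an endless generator, standing in for a Kafka topic.
def endless_events():
    for os_name in itertools.cycle(["ios", "android"]):
        yield {"os": os_name}


# Take only the first four results so the demonstration terminates.
print(list(itertools.islice(count_by_os(endless_events()), 4)))
```

The processing function is identical in both cases; the only difference is whether the source ever ends.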

Another advantage of this unified API for batch and streaming in Apache Samza is that it lets developers focus on the processing logic without treating bounded and unbounded sources differently. Samza differentiates bounded from unbounded data by a special end-of-stream token. Also, switching between batch and streaming, e.g. from Kafka to HDFS, needs only a config change and no code changes. As an example, consider counting PageViewEvent records for each mobile device OS in a 5-minute window and sending the counts to PageViewEventPerDeviceOS.
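The windowed count in that example can be sketched in plain Python to show the logic involved. This is not Samza code and does not use its windowing API; the function name, the `(timestamp, device OS)` event shape, and the sample records are assumptions, and the result is returned as a dictionary instead of being sent to an output stream.

```python
from collections import Counter, defaultdict

WINDOW_S = 5 * 60  # 5-minute tumbling window


def page_views_per_device_os(events, window_s=WINDOW_S):
    """Count events per device OS within each tumbling window.

    `events` is an iterable of (timestamp_s, device_os) pairs.
    Returns {window_start_s: {device_os: count}}.
    """
    windows = defaultdict(Counter)
    for timestamp_s, device_os in events:
        window_start = (timestamp_s // window_s) * window_s
        windows[window_start][device_os] += 1
    return {start: dict(counts) for start, counts in sorted(windows.items())}


# Hypothetical PageViewEvent records: (seconds since midnight, device OS).
page_view_events = [
    (10, "android"), (120, "ios"), (299, "android"), (300, "ios"), (610, "android"),
]
print(page_views_per_device_os(page_view_events))
```

Each event lands in exactly one window, so an event at second 299 and one at second 300 are counted separately.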

2. Pure Batch framework

This is the reverse of the pure streaming approach: a flexible batch framework is employed that offers both batch processing and real-time data processing. Streaming is achieved by using mini-batches that are small enough to be close to real time, with Apache Spark/Spark Streaming or Storm’s Trident. Under the hood, Spark Streaming is a sequence of micro-batch processes with sub-second latency. Trident is a high-level abstraction for doing streaming computation on top of Storm. The core data model of Trident is the “Stream”, processed as a series of batches.

Apache Spark achieves the dual goal of batch as well as real-time processing through the following modes:

  • Micro-batch processing model
  • Continuous Processing model

Micro-batch processing model

Micro-batch processing is analogous to traditional batch processing in that data are usually processed as a group. The primary difference is that the batches are smaller and processed more often. In Spark Streaming, the micro-batches are created based on time rather than on the accumulated data size. The shorter the interval that triggers a micro-batch, the lower the latency.
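Time-based batching is easy to picture with a short Python sketch. It is a generic illustration, not Spark code; the function name, the `(timestamp, payload)` event shape, and the sample events are assumptions.

```python
from itertools import groupby


def micro_batches(events, trigger_interval_s):
    """Group time-ordered (timestamp_s, payload) events into micro-batches.

    A batch boundary falls at every multiple of trigger_interval_s, so batches
    are cut by elapsed time, not by how much data has accumulated.
    """
    for _, group in groupby(events, key=lambda e: int(e[0] // trigger_interval_s)):
        yield [payload for _, payload in group]


events = [(0.1, "a"), (0.4, "b"), (1.2, "c"), (3.7, "d")]
one_second = list(micro_batches(events, trigger_interval_s=1.0))
two_seconds = list(micro_batches(events, trigger_interval_s=2.0))
print(one_second)
print(two_seconds)
```

With a 1-second trigger, event "c" is processed in its own batch soon after it arrives; with a 2-second trigger it waits in the same batch as "a" and "b", which is the latency trade-off described above.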

Continuous Processing model

Apache Spark 2.3 introduced a low-latency Continuous Processing mode in Structured Streaming, which enables low (~1 ms) end-to-end latency with at-least-once fault-tolerance guarantees. By comparison, the default micro-batch processing engine can achieve exactly-once guarantees but latencies of ~100 ms at best. Micro-batching or continuous streaming can be chosen at runtime without modifying the application logic, i.e. the DataFrame/Dataset operations. Spark Streaming also works well with several data sources such as HDFS, Flume, or Kafka.

Example of Micro-batching and Continuous Batching

3. Lambdoop Approach

In many places, the capability of both batch and real-time processing is needed. It is cumbersome to develop a software architecture with such capabilities by tailoring suitable technologies, software layers, data sources, data storage solutions, smart algorithms and so on to achieve a good scalable solution. This is where frameworks like Spring “XD”, Summingbird or Lambdoop come in, since they already have a combined API for batch and real-time processing.

Lambdoop

Lambdoop is a software framework based on the Lambda architecture which provides an abstraction layer to developers. This makes developers’ lives easier when building Big Data applications that combine real-time and batch processing approaches. Developers don’t have to deal with different technologies, configurations, data formats etc. They can use the Lambdoop framework as the only needed API. Lambdoop also includes other useful tools such as input/output drivers, visualization tools, cluster management tools and widely accepted AI algorithms.

The Speed layer in Lambdoop runs on Storm and the Batch layer on Hadoop. Lambdoop (Lambda-Hadoop, with HBase, Storm and Redis) also combines batch and real-time processing by offering a single API for both processing models.

Summingbird

Summingbird, aka ‘Streaming MapReduce’, is a hybrid computational system where batch and streaming computations can be run at the same time and the results can be merged automatically. In Summingbird, the developer can write the job logic once and change the backend as and when needed. The following are the modes in which a Summingbird job can be executed.

  • batch mode (using Scalding on Hadoop)
  • real-time mode (using Storm)
  • hybrid batch/real-time mode (offers attractive fault-tolerance properties)

If the model assumes streaming, one-at-a-time semantics, then the code can be run in real-time (e.g. on Storm) or in offline/batch mode (e.g. on Hadoop, Spark etc.). It can also operate in a hybrid processing mode when there is a need to transparently integrate batch and online results to efficiently generate up-to-date views over long time spans.
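The "write once, run on either backend, merge the results" idea can be sketched in plain Python. This is not Summingbird's actual (Scala) API; the functions below are hypothetical stand-ins for a batch backend, a one-at-a-time streaming backend and the hybrid merge. Merging works here because word counts combine by addition, which is associative.

```python
from collections import Counter


def word_count(lines):
    """Job logic written once: count words in any iterable of text lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def run_batch(historical_lines):
    """Batch backend stand-in: process the whole historical data set at once."""
    return word_count(historical_lines)


def run_streaming(live_lines):
    """Streaming backend stand-in: consume one line at a time, keep running state."""
    state = Counter()
    for line in live_lines:
        state += word_count([line])
    return state


def merge(batch_view, realtime_view):
    """Hybrid mode: combine the batch view and the real-time view by addition."""
    return batch_view + realtime_view


historical = ["storm hadoop", "hadoop spark"]   # hypothetical older data
live = ["storm storm", "spark"]                 # hypothetical recent data
up_to_date_view = merge(run_batch(historical), run_streaming(live))
```

The merged view equals what a single pass over all the data would produce, which is what makes the hybrid mode transparent to the job author.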

Conclusion

The volume of any Big Data platform is handled by building a batch processing application, which requires MapReduce or Spark development, the use of other Hadoop-related tools like Sqoop, Zookeeper, HCatalog etc., and storage systems like HBase, MongoDB, HDFS and Cassandra. At the same time, the velocity of any Big Data platform is handled by building a real-time streaming application, which requires stream computing development using Storm, Samza, Kafka Connect, Apache Flink and S4, and the use of temporal datastores like in-memory data stores, the Apache Kafka messaging system etc.

The Unified Lambda handles both the Volume and the Velocity of any Big Data platform through a hybrid computation model, where batch and real-time data processing are combined transparently. Also, the limitations of the Generic Lambda, such as the dual execution mode and replicating and keeping data in sync between different layers, are avoided, and in the Unified Lambda there is only one system to learn and maintain.

About the Author:

Bargunan Somasundaram


Bargunan is a Big Data Engineer and a programming enthusiast. His passion is to share his knowledge by writing about his experiences. He believes “Gaining knowledge is the first step to wisdom and sharing it is the first step to humanity.”

Growing Importance of Business Service Reliability

Business services are a set of business activities delivered to an outside party, such as a customer or a partner. Successful delivery of business services often depends on one or more IT services. For example, an IT business service that supports “order to cash” could be a “supply chain service”. The supply chain service could be delivered by an application such as SAP, with the customer of that service being an employee in finance/accounting who uses the application to perform customer-facing services such as accounts receivable, or the collection of cash from an outside party. A business service is not simply the application that the end-user sees; it is the entire chain that supports the delivery of the service, including physical and virtualized servers, databases, middleware, storage, and networks. A failure in any of these can affect the service, so it is crucial that IT organizations have an integrated, accurate, and up-to-date view of these components and of how they work together to provide the service.

The technologies for Social Networking, Mobile Applications, Analytics, Cloud (SMAC), and Artificial Intelligence (AI) are redefining the business and the services that businesses provide. Their widespread usage is changing the business landscape, increasing reliability and availability to levels that were unimaginable even a few years ago.

Availability versus Reliability

At first glance, it might seem that if a service has a high availability then it should also have high reliability. However, this is not necessarily the case. Availability and Reliability have different meanings, serve different purposes, and require different strategies to maintain desired standards of service levels. Reliability is the measure of how long a business service performs its intended function, whereas availability is the measure of the percentage of time a business service is operable. For example, a business service may be available 90% of the time, but reliable only 75% of the time from a performance standpoint. Service reliability can be seen as:

  • Probability of success
  • Durability
  • Dependability
  • Quality over time
  • Availability to perform a function

Merely having a service available isn’t sufficient. When a business service is available, it should actually serve the intended purpose under varying and unexpected conditions. One way to measure this performance is to evaluate the reliability of the service that is available to consume. The performance of a business service is now rated not by its availability, but by how consistently reliable it is. Take the example of mobile services: 4 bars of signal strength on your smartphone do not guarantee the quality of the call you receive or are about to make. Organizations need to measure how well the service fulfills the necessary business performance needs.
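The difference between the two measures can be illustrated with a small Python sketch. Availability is computed as the share of time the service was operable. Reliability is approximated here, as an assumption for the sketch, by the share of requests that succeeded within a latency target; this is one of several possible ways to quantify it. The function names, the sample numbers and the 500 ms target are hypothetical.

```python
def availability(uptime_minutes, total_minutes):
    """Percentage of time the service was operable."""
    return 100.0 * uptime_minutes / total_minutes


def reliability(requests, max_latency_ms):
    """Percentage of requests that succeeded within the latency target.

    Each request is a (succeeded, latency_ms) pair.
    """
    good = sum(1 for succeeded, latency_ms in requests
               if succeeded and latency_ms <= max_latency_ms)
    return 100.0 * good / len(requests)


# Hypothetical observations while the service was up
requests = [
    (True, 120), (True, 80), (True, 950), (False, 0),
    (True, 200), (True, 150), (True, 90), (True, 110),
]

service_availability = availability(uptime_minutes=900, total_minutes=1000)
service_reliability = reliability(requests, max_latency_ms=500)
```

With these sample numbers the service is available 90% of the time but reliable for only 75% of requests, mirroring the example figures given earlier.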

Recognizing the importance of reliability, Google initiated Site Reliability Engineering (SRE) practices with a mission to protect, provide for, and progress the software and systems behind all of Google’s public services — Google Search, Ads, Gmail, Android, YouTube, and App Engine, to name just a few — with an ever-watchful eye on their availability, latency, performance, and capacity.

Zero Incident Framework™ (ZIF)

GAVS Technologies developed an AIOps based TechOps platform, Zero Incident Framework™ (ZIF), that enables proactive detection and remediation of incidents. The ZIF Platform is available in two versions for our customers to evaluate and experience the power of AI-driven Business Service Reliability:

ZIF Business Xpress: ZIF Business Xpress has been engineered for enterprises to evaluate AIOps before adoption. 10 to 40 devices can be connected to ZIF Business Xpress to experiment with the value proposition.

ZIF Business: Targeted for enterprise-wide adoption.

For more details, please visit https://zif.ai

About the Author:

Sri Chaganty


Sri is a Serial Entrepreneur with over 30 years’ experience delivering creative, client-centric, value-driven solutions for bootstrapped and venture-backed startups.

Inverse Reinforcement Learning

Naresh B

What is Inverse Reinforcement Learning (IRL)?

Inverse reinforcement learning is a recently developed Machine Learning framework that can solve the inverse problem of Reinforcement Learning (RL). Basically, IRL is about learning from humans. Inverse reinforcement learning is the field of learning an agent’s objectives, values, or rewards by observing its behavior.

Before getting into further details of IRL, let us recap RL.
Reinforcement learning is an area of Machine Learning (ML) concerned with taking suitable actions to maximize rewards. The goal of reinforcement learning algorithms is to find the best possible action to take in a specific situation.

Challenges in RL

One of the hardest challenges in many reinforcement learning tasks is that it is often difficult to find a good reward function which is both learnable (i.e. rewards happen early and often enough) and correct (i.e. leads to the desired outcomes). Inverse reinforcement learning aims to deal with this problem by learning a reward function based on observations of expert behavior.

What distinguishes Inverse Reinforcement Learning from Reinforcement Learning?

In RL, our agent is provided with a reward function which, whenever it executes an action in some state, provides feedback about the agent’s performance. This reward function is used to obtain an optimal policy, one where the expected future reward (discounted by how far away it will occur) is maximal.
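The notion of expected future reward "discounted by how far away it will occur" can be made concrete with a short Python sketch. It computes the discounted return of a single reward sequence; the function name, the discount factor `gamma` and the sample reward sequences are assumptions for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the reward at step t is weighted by gamma ** t."""
    total = 0.0
    # Work backwards so each step adds its reward plus the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical reward sequences from two behaviours
early_reward = [1, 0, 0]   # reward arrives immediately
late_reward = [0, 0, 1]    # same reward, two steps later

early_value = discounted_return(early_reward, gamma=0.9)
late_value = discounted_return(late_reward, gamma=0.9)
```

With a discount factor below 1, the same reward is worth less the later it arrives, so an optimal policy prefers the behaviour with the higher discounted return.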

In IRL, the setting is (as the name suggests) inverse. We are now given some agent’s policy or a history of behavior and we try to find a reward function that explains the given behavior. Under the assumption that our agent acted optimally, i.e. always picks the best possible action for its reward function, we try to estimate a reward function that could have led to this behavior.

The biggest motivation for IRL

Maybe the biggest motivation for IRL is that it is often immensely difficult to manually specify a reward function for a task. So far, RL has been successfully applied in domains where the reward function is very clear. But in the real world, it is often not clear at all what the reward should be and there are rarely intrinsic reward signals such as a game score.

For example, suppose we want to design an artificial intelligence for a self-driving car. A simple approach would be to create a reward function that captures the desired behavior of a driver, like stopping at red lights, staying off the sidewalk, avoiding pedestrians, and so on. In real life, this would require an exhaustive list of every behavior we’d want to consider, as well as a list of weights describing how important each behavior is.

Instead, in the IRL framework, the task is to take a set of human-generated driving data and extract an approximation of that human’s reward function for the task. Of course, this approximation necessarily deals with a simplified model of driving. Still, much of the information necessary for solving a problem is captured within the approximation of the true reward function, since it quantifies how good or bad certain actions are. Once we have the right reward function, the problem is reduced to finding the right policy and can be solved with standard reinforcement learning methods.

For our self-driving car example, we’d be using human driving data to automatically learn the right feature weights for the reward. Since the task is described completely by the reward function, we do not even need to know the specifics of the human policy, so long as we have the right reward function to optimize. In the general case, algorithms that solve the IRL problem can be seen as a method for leveraging expert knowledge to convert a task description into a compact reward function.
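As a rough illustration of learning feature weights from demonstrations, the sketch below assumes a reward that is linear in hand-picked features and estimates a weight direction from the difference between the expert's discounted feature expectations and those of a baseline driver. This is only the first step of a feature-expectation-matching style of approach, not a complete IRL algorithm, and the features, trajectories and function names are all hypothetical.

```python
def feature_expectations(trajectories, gamma=0.9):
    """Average discounted sum of per-step feature vectors over trajectories."""
    n_features = len(trajectories[0][0])
    totals = [0.0] * n_features
    for trajectory in trajectories:
        for step, features in enumerate(trajectory):
            for i, value in enumerate(features):
                totals[i] += (gamma ** step) * value
    return [total / len(trajectories) for total in totals]


def estimate_reward_weights(expert_trajectories, baseline_trajectories, gamma=0.9):
    """Weight direction pointing from baseline behaviour towards expert behaviour."""
    mu_expert = feature_expectations(expert_trajectories, gamma)
    mu_baseline = feature_expectations(baseline_trajectories, gamma)
    diff = [e - b for e, b in zip(mu_expert, mu_baseline)]
    norm = sum(d * d for d in diff) ** 0.5
    return diff if norm == 0 else [d / norm for d in diff]


# Per-step features (assumed): [stopped_at_red_light, drove_on_sidewalk]
expert = [[[1, 0], [1, 0], [0, 0]], [[1, 0], [0, 0], [1, 0]]]
baseline = [[[0, 1], [0, 0], [0, 1]], [[0, 0], [0, 1], [1, 0]]]

weights = estimate_reward_weights(expert, baseline)
```

On this toy data the weight for stopping at red lights comes out positive and the weight for driving on the sidewalk negative, which is the kind of signal a full IRL method would refine iteratively.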

Conclusion

The foundational methods of inverse reinforcement learning can achieve their results by leveraging information obtained from a policy executed by a human expert. However, in the long run, the goal is for machine learning systems to learn from a wide range of human data and perform tasks that are beyond the abilities of human experts.


About the Author

Naresh is a part of Location Zero at GAVS as an AI/ML solutions developer. His focus is on solving problems leveraging AI/ML. He strongly believes in making success a habit rather than considering it a destination. In his free time, he likes to spend time with his pet dogs and enjoys sketching and gardening.

Machine Learning: Building Clustering Algorithms

Gireesh Sreedhar KP


Clustering is a widely used Machine Learning (ML) technique. Clustering is an Unsupervised ML technique built to learn patterns from input data without any labeled training data, and it is capable of processing data with high dimensions. This makes clustering the method of choice for a wide range of ML problems.

Since clustering is widely used, it is critical for Data Scientists and ML Engineers to understand how to build clustering algorithms in practice, even though many of us have a high-level understanding of clustering. Let us understand the approach to building a clustering algorithm from scratch.

What is Clustering and how does it work?

Clustering is finding groups of objects (data) such that objects in the same group will be similar (related) to one another and different from (unrelated to) objects in other groups.

Clustering works on the concept of Similarity/Dissimilarity between data points. The higher the similarity between data points, the more likely they are to belong to the same cluster; the higher the dissimilarity between data points, the more likely they are to be kept out of the same cluster.

Similarity is the numerical measure of how alike two data objects are. Similarity is higher when objects are more alike. Dissimilarity is the numerical measure of how different two data objects are. Dissimilarity is lower when objects are more alike.

We create a ‘Dissimilarity Matrix’ (also called Distance Matrix) as an input to a clustering algorithm, where the dissimilarity matrix gives the algorithm the notion of dissimilarity between objects. We build a dissimilarity matrix for each attribute of the data considered for clustering and then combine the per-attribute matrices to form an overall dissimilarity matrix. The dissimilarity matrix is an NxN square matrix, where N is the number of data points considered for clustering and each element gives the dissimilarity between two objects.

Building Clustering Algorithm

Building a clustering algorithm involves the following:

  • Selection of the most suited clustering techniques and algorithms to solve the problem. This step needs close collaboration among SMEs, business users, data scientists, and ML engineers. Based on inputs and a study of the data, a list of one or more candidate algorithms is selected for modeling and development, and the tuning parameters are decided (to give the algorithm more flexibility for tuning and learning from SMEs).
  • The selection of data attributes for the formulation of the dissimilarity matrix and methodology for the formation of the dissimilarity matrix (discussed later).
  • Building algorithms and doing the Design of experiments to select the best-suited algorithm and algorithm parameters for implementation.
  • Implementation of algorithm and fine-tuning of parameters as required.

Building a Dissimilarity matrix:

There are different approaches to building a dissimilarity matrix. Here we consider building a dissimilarity matrix containing the distance between data objects (called a Distance Matrix); an alternative approach is to feed in coordinate points and let the algorithm compute the distance. Let us consider a group of N data objects to be clustered based on three data attributes of each data object. The steps for building a Distance matrix are:

Build a Distance matrix for individual data attributes. Here we build three individual distance matrices (one for each attribute) containing the distance between data objects calculated for that attribute. The data is always scaled to [0,1] using one of the standard normalization methods such as Min-Max scaling. Here is what the distance matrix for an attribute looks like.

Properties of Distance Matrix:

  1. Distance Matrix is NxN square matrix (N – number of objects in clustering space)
  2. Matrix is symmetric with diagonal as zero (zero diagonal as distance of an object from itself is zero)
  3. For categorical data, distance between two points = 0, if both are same; =1 otherwise
  4. For numeric/ordered data, distance between two points = absolute difference between scaled attribute values of two points.
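The properties above can be sketched in plain Python for one numeric and one categorical attribute. The function names and the sample attributes (`ages`, `cities`) are hypothetical, and the numeric distance uses the absolute difference of Min-Max scaled values so that the matrix stays symmetric and non-negative.

```python
def min_max_scale(values):
    """Scale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def numeric_distance_matrix(values):
    """NxN matrix of absolute differences between scaled values."""
    scaled = min_max_scale(values)
    return [[abs(a - b) for b in scaled] for a in scaled]


def categorical_distance_matrix(values):
    """NxN matrix with 0 where two values are the same and 1 otherwise."""
    return [[0.0 if a == b else 1.0 for b in values] for a in values]


# Hypothetical attributes of three data objects
ages = [20, 30, 40]
cities = ["Chennai", "Mumbai", "Chennai"]

age_distances = numeric_distance_matrix(ages)
city_distances = categorical_distance_matrix(cities)
```

Both results are 3x3, symmetric and have a zero diagonal, as the properties require.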

Build Complete Distance matrix. Here we build a complete distance matrix combining distance matrix of individual attributes forming the input for clustering algorithm.

Complete distance matrix = (element-wise sum of the individual attribute-level matrices)/3

Generalized Complete distance matrix = (element-wise sum of the individual attribute-level matrices)/M, where M is the number of attribute-level matrices formed.
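A minimal Python sketch of this combination step follows. It takes M attribute-level distance matrices of the same size and returns their element-wise average; the function name and the three small sample matrices are made up for illustration.

```python
def combine_distance_matrices(matrices):
    """Element-wise sum of M NxN matrices divided by M."""
    m = len(matrices)
    n = len(matrices[0])
    return [
        [sum(matrix[i][j] for matrix in matrices) / m for j in range(n)]
        for i in range(n)
    ]


# Hypothetical attribute-level distance matrices for two data objects
attribute_1 = [[0.0, 0.25], [0.25, 0.0]]
attribute_2 = [[0.0, 1.0], [1.0, 0.0]]
attribute_3 = [[0.0, 0.25], [0.25, 0.0]]

complete_distance_matrix = combine_distance_matrices(
    [attribute_1, attribute_2, attribute_3]
)
```

Because every input is scaled to [0,1], the averaged matrix also stays in [0,1], and it keeps the symmetry and zero diagonal of its inputs.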

Considerations for the selection of clustering algorithms:

Before the selection of a clustering algorithm, the following considerations need to be evaluated to identify the right clustering algorithms for the given problem.

  • Partition criteria: Single Level vs hierarchical partitioning
  • Separation of clusters: Exclusive (one data point belongs to only one class) vs non-exclusive (one data point can belong to more than one class)
  • Similarity measures: Distance-based vs Connectivity-based
  • Clustering space: Full space (used when low dimension data is processed) vs Subspace (used when high dimension data is processed, where only subspace can be processed and interesting clustering can be formed)
  • Attributes processing: Ability to deal with different types of attributes: Numerical, Categorical, Text, Media, a combination of data types in inputs
  • Discovery of clusters: Ability to form a predefined number of clusters or an arbitrary number of clusters
  • Ability to deal with noise in data
  • Scalability to deal with huge volumes of data, high dimensionality, incremental, or streaming data.
  • Ability to deal with constraints on user preference and domain requirements.

Application of Clustering

There are broadly two applications of clustering.

As an ML tool to get insight into data, for example building Recommendation Systems or Customer segmentation by clustering like-minded users or similar products, Social network analysis, and Biological data analysis such as Gene/Protein sequence analysis.

As a pre-processing or intermediate step for other classes of algorithms. For example, some Pattern-mining algorithms use clustering to group the mined patterns and select the most representative ones instead of keeping every pattern mined.

Conclusion

Building an ML algorithm is teamwork, with a team consisting of SMEs, users, data scientists, and ML engineers, each playing their part for success. This article gives the steps to build a clustering algorithm and can be used as reference material when you attempt to build your own.

About the Author:

Gireesh is a part of the projects run in collaboration with IIT Madras for developing AI solutions and algorithms. His interests include Data Science, Machine Learning, Financial markets, and Geo-politics. He believes that he is competing against himself to become better than who he was yesterday. He aspires to become a well-recognized subject matter expert in the field of Artificial Intelligence.

Assess Your Organization’s Maturity in Adopting AIOps

Artificial Intelligence for IT operations (AIOps) is adopted by organizations to deliver tangible Business Outcomes. These business outcomes have a direct impact on companies’ revenue and customer satisfaction.

A survey from AIOps Exchange 2019 reports that 84% of the Business Owners who took part confirmed that they are actively evaluating AIOps for adoption in their organizations.

So, is AIOps just automation? Absolutely NOT!!

Artificial Intelligence for IT operations implies the implementation of true Autonomous Artificial Intelligence in ITOps, which needs to be adopted as an organization-wide strategy. Organizations will have to assess their existing landscape and processes, and decide where to start. That is the only way to achieve the true implementation of AIOps.

Every organization trying to evaluate AIOps as a strategy should read through this article to understand their current maturity, and then move forward to reach the pinnacle of Artificial Intelligence in IT Operations.

The primary Success Factor in adopting AIOps is derived from the Business Outcomes the organization is trying to achieve by implementing AIOps; that is the only way to calculate ROI.

There are 4 levels of Maturity in AIOps adoption. Based on our experience in developing an AIOps platform and implementing the platform across multiple industries, we have arrived at these 4 levels. Assessing an organization against each of these levels helps in achieving the goal of TRUE Artificial Intelligence in IT Operations.

Level 1: Knee-jerk

Events and logs are generated in silos and collected from various applications and devices in the infrastructure. These are used to generate alerts that are sent to command centres, which escalate them as per the defined SOPs (standard operating procedures). The engineering teams work in silos, unaware of the business impact that these alerts could potentially create. Here, operations are very reactive, which could cost the organization millions of dollars.

Level 2: Unified

At this level, organizations have integrated all events, logs, and alerts into one central locale, and the ITSM process has been unified. This helps in breaking silos, and engineering teams are better prepared to tackle business impacts. SOPs have been adjusted since the process is unified, but this is still reactive incident management.

Level 3: Intelligent

Machine Learning algorithms (either supervised or unsupervised) have been implemented on the unified data to derive insights. There are baseline metrics that are calibrated and will be used as a reference for future events. With more data, the metrics get richer. The IT operations team can correlate incidents/events with business impacts by leveraging AI & ML. If the Mean Time To Resolve (MTTR) for incidents has been reduced by automated identification of the root cause, then the organization has attained level 3 maturity in AIOps.
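One simple way to picture a calibrated baseline used as a reference for future events is a mean-and-standard-deviation check, sketched below in Python. This is an illustrative assumption, not the method of any particular AIOps platform; the function names, the sample CPU readings and the threshold of 3 standard deviations are hypothetical.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Calibrate a baseline (average, spread) from historical metric values."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that deviates from the baseline by more than `threshold` spreads."""
    average, spread = baseline
    if spread == 0:
        return value != average
    return abs(value - average) / spread > threshold


# Hypothetical CPU utilisation history (percent) for one device
cpu_history = [50, 52, 49, 51, 50, 48, 50]
cpu_baseline = build_baseline(cpu_history)

normal_reading = is_anomalous(51, cpu_baseline)    # within the usual range
spike_reading = is_anomalous(90, cpu_baseline)     # far outside the usual range
```

As more history is collected, the baseline can be recomputed, which is one way to read the remark that the metrics get richer with more data.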

Level 4: Predictive & Autonomous

The pinnacle of AIOps is level 4. If incidents and performance degradation of applications can be predicted by leveraging Artificial Intelligence, it implies improved application availability. Autonomous remediation bots can be triggered automatically based on the predictive insights, to fix incidents that are likely to happen in the enterprise. Level 4 is a paradigm shift in IT operations, moving operations entirely from being reactive to becoming proactive.

Conclusion:

As IT operations teams move up each level, the essential goal to keep in mind is the long-term strategy that needs to be attained by adopting AIOps. Artificial Intelligence has matured over the past few decades, and it is up to AIOps platforms to embrace it effectively. While choosing an AIOps platform, measure the maturity of the platform’s artificial intelligence coefficient.

About the Author:

Anoop Aravindakshan (Principal Consultant Manager) at GAVS Technologies.


An evangelist of Zero Incident Framework™, Anoop has been a part of the product engineering team for long and has recently forayed into product marketing. He has over 14 years of experience in Information Technology across various verticals, which include Banking, Healthcare, Aerospace, Manufacturing, CRM, Gaming, and Mobile.

Prediction for Business Service Assurance

Artificial Intelligence for IT operations or AIOps has exploded over the past few years. As more and more enterprises set about their digital transformation journeys, AIOps becomes imperative to keep their businesses running smoothly. 

AIOps uses several technologies like Machine Learning and Big Data to automate the identification and resolution of common Information Technology (IT) problems. The systems, services, and applications in a large enterprise produce volumes of log and performance data. AIOps uses this data to monitor the assets and gain visibility into the behaviour and dependencies among these assets.

According to a Gartner publication, the adoption of AIOps by large enterprises would rise to 30% by 2023.

ZIF – The ideal AIOps platform of choice

Zero Incident Framework™ (ZIF) is an AIOps based TechOps platform that enables proactive detection and remediation of incidents, helping organizations drive towards a Zero Incident Enterprise™.

ZIF comprises 5 modules, as outlined below.

At the heart of ZIF lie its Analyze and Predict (A&P) modules, which are powered by Artificial Intelligence and Machine Learning techniques. From the business perspective, the primary goal of A&P is 100% availability of applications and business processes.

Let us understand more about the Predict module of ZIF.

Predictive Analytics is one of the main USPs of the ZIF platform. ZIF encompasses Supervised, Unsupervised and Reinforcement Learning algorithms for the realization of various business use cases (as shown below).

How does the Predict Module of ZIF work?

Through its data ingestion capabilities, the ZIF platform can receive and process all types of data (both structured and unstructured) from various tools in the enterprise. The data can relate to alerts, events, logs, performance of devices, relations of devices, workload topologies, network topologies etc. By analyzing all this data, the platform predicts the anomalies that can occur in the environment. These anomalies are presented as ‘Opportunity Cards’ so that suitable action can be taken ahead of time to prevent undesired incidents from occurring. Since this is ‘Proactive’ and not ‘Reactive’, it brings about a paradigm shift in any organization’s endeavour to achieve 100% availability of its enterprise systems and platforms. Predictions are done at multiple levels: application level, business process level, device level etc.

Sub-functions of Prediction Module

How does the Predict module manifest to enterprise users of the platform?

Predict module categorizes the opportunity cards into three swim lanes.

  1. Warning swim lane – Opportunity Cards that have an “Expected Time of Impact” (ETI) beyond 60 minutes.
  2. Critical swim lane – Opportunity Cards that have an ETI within 60 minutes.
  3. Processed / Lost – Opportunity Cards that have been processed or lost without taking any action.
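The swim-lane rule described above can be written as a small Python function. This is only a sketch of the stated rule, not ZIF's implementation or API; the function name, the `processed_or_lost` flag and the choice to treat an ETI of exactly 60 minutes as "within 60 minutes" are assumptions.

```python
def swim_lane(eti_minutes, processed_or_lost=False):
    """Return the swim lane for an Opportunity Card.

    eti_minutes: Expected Time of Impact, in minutes from now.
    processed_or_lost: True once the card was processed or lost without action.
    """
    if processed_or_lost:
        return "Processed / Lost"
    if eti_minutes <= 60:  # assumption: exactly 60 minutes counts as "within"
        return "Critical"
    return "Warning"


# Hypothetical Opportunity Cards: (ETI in minutes, processed or lost?)
cards = [(180, False), (45, False), (60, False), (10, True)]
lanes = [swim_lane(eti, done) for eti, done in cards]
```

For the sample cards this yields Warning, Critical, Critical and Processed / Lost, in that order.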

A few of the enterprises that have realized the power of ZIF’s Prediction Module

  • A manufacturing giant in the US
  • A large non-profit mental health and social service provider in New York
  • A large mortgage loan service provider in the US
  • Two of the largest private sector banks in India

For more detailed information on GAVS’ Analyze, or to request a demo, please visit https://zif.ai/products/predict/

References: https://www.gartner.com/smarterwithgartner/how-to-get-started-with-aiops/

About the Author:

Vasudevan Gopalan

Vasu heads the Engineering function for A&P. He is a Digital Transformation leader with ~20 years of IT industry experience spanning Product Engineering, Portfolio Delivery, Large Program Management etc. Vasu has designed and delivered Open Systems, Core Banking, Web / Mobile Applications etc.

Outside of his professional role, Vasu enjoys playing badminton and focusses on fitness routines.