Understanding Reinforcement Learning in five minutes

Reinforcement learning (RL) is an area of Machine Learning (ML) concerned with taking suitable actions to maximize reward in a given situation. The goal of reinforcement learning algorithms is to find the best possible action to take in a specific situation. Much like the human brain, the agent is rewarded for good choices, penalized for bad ones, and learns from each choice. RL tries to mimic the way humans learn new things, not from a teacher but through interaction with the environment. In the end, the RL agent learns to achieve a goal in an uncertain, potentially complex environment.

Understanding Reinforcement Learning

How does one learn cycling? How does a baby learn to walk? How do we become better at doing something with more practice? Let us explore learning to cycle to illustrate the idea behind RL.

Did somebody tell you how to cycle, or give you steps to follow? Or did you learn it by spending hours watching videos of people cycling? All of these will surely give you an idea about cycling, but will they be enough to actually get you cycling? The answer is no. You learn to cycle only by cycling (action), through trial and error (practice), going through positive experiences (positive rewards) and negative experiences (negative rewards or punishments), before getting your balance and control right (maximum reward or best outcome). This analogy of how our brain learns cycling applies to reinforcement learning: through trials, errors, and rewards, it finds the best course of action.

Components of Reinforcement Learning

The major components of RL are detailed below; a minimal code sketch tying them together follows the list.

  • Agent: The agent is the part of RL that takes actions, receives rewards for those actions, and observes a new environment state as a result of each action. In the cycling analogy, the agent is the rider's brain, which decides what action to take and gets rewarded (falling is negative, riding is positive).
  • Environment: The environment represents the outside world (only the relevant part of the world the agent needs to know about to take actions) that interacts with the agent. In the cycling analogy, the environment is the cycling track and the objects as seen by the rider.
  • State: The state is the condition or position the agent is currently in. In the cycling analogy, it would be the speed of the cycle, the tilt of the handlebar, the tilt of the cycle, etc.
  • Action: What the agent does while interacting with the environment is referred to as an action. In the cycling analogy, it would be to pedal harder (if the decision is to increase speed), apply the brakes (if the decision is to reduce speed), tilt the handlebar, tilt the body, etc.
  • Rewards: The reward is an indicator to the agent of how good or bad the action taken was. In the cycling analogy, it could be +1 for not falling, -10 for hitting obstacles and -100 for falling; the rewards for these outcomes (+1, -10, -100) are defined while building the RL agent. Since the agent wants to maximize rewards, it learns to avoid obstacles and, above all, to avoid falling.
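
To tie these components together, here is a minimal, self-contained sketch of tabular Q-learning on a toy "cycling" environment. Everything in it – the states, actions, reward values and hyperparameters – is an illustrative assumption for the sketch, not part of any particular RL library or product.

```python
import random

# Toy "cycling" environment: states describe how steady the rider is,
# actions are what the rider can do. States, actions and rewards below
# are illustrative assumptions.
STATES = ["wobbling", "steady"]
ACTIONS = ["pedal", "brake", "tilt_handle"]

def step(state, action):
    """Environment: returns (next_state, reward) for an action."""
    if state == "wobbling" and action == "tilt_handle":
        return "steady", +1          # corrected balance
    if state == "steady" and action == "pedal":
        return "steady", +1          # riding well
    if action == "brake":
        return "wobbling", -1        # lost momentum
    return "wobbling", -10           # hit an obstacle / about to fall

# Agent: a Q-table mapping (state, action) -> expected long-term reward
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration

for episode in range(500):
    state = random.choice(STATES)
    for _ in range(20):                  # one short ride
        # Explore occasionally, otherwise exploit the best known action
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward = step(state, action)
        # Q-learning update: learn from the reward plus the best future value
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# After training, the agent prefers actions that keep it riding steadily
for s in STATES:
    print(s, "->", max(ACTIONS, key=lambda a: Q[(s, a)]))
```

Through nothing but trial, error and reward, the learned policy converges to "tilt the handlebar when wobbling, keep pedalling when steady" – the same loop of agent, environment, state, action and reward described above.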

Characteristics of Reinforcement Learning

Unlike other Machine Learning techniques, which scan datasets for a mathematical function that can reproduce historical outcomes, reinforcement learning is focused on discovering the optimal actions that will lead to the desired outcome.

There are no supervisors to guide the model on how well it is doing. The RL agent gets a scalar reward and tries to figure out how good the action was.

Feedback is delayed. The agent gets an instant reward for an action; however, the long-term effect of that action is known only later. Just like a move in chess may seem good at the time it is made, but may turn out to be a bad move as the game progresses.

Time matters (sequential). People familiar with supervised and unsupervised learning will know that the sequence in which data is used for training does not affect the outcome. For RL, however, since the action and reward at the current state influence future states and actions, the time and sequence of the data matter.

The agent's actions affect the subsequent data it receives.

Why Reinforcement Learning

Some of the problems reinforcement learning solves are very hard to tackle with handcrafted rules or with other ML techniques. Besides, RL reduces the need for pre-collected training data, since the agent learns by interacting with the environment. This is a great advantage for problems where data availability or data collection is an issue.

Reinforcement Learning applications

RL is currently a favourite of ML researchers. It is advancing at an incredible pace to solve business and industrial problems and is garnering a lot of attention due to its potential. Going forward, RL will be core to organizations’ AI strategies.

Reinforcement Learning at GAVS

Reinforcement Learning is core to GAVS’ AI strategy and is being actively pursued to power the IP-led AIOps platform – Zero Incident Framework™ (ZIF). Our first success with RL was developing an RL agent for automated log rotation on servers.

References:

Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction, 2nd edition. https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf

About the Author:

Gireesh Sreedhar KP

Gireesh is part of the projects run in collaboration with IIT Madras for developing AI solutions and algorithms. His interests include Data Science, Machine Learning, Financial markets, and Geopolitics. He believes that he is competing against himself to become better than he was yesterday. He aspires to become a well-recognized subject matter expert in the field of Artificial Intelligence.

AI in Healthcare

The Healthcare Industry is going through a quiet revolution. Factors like disease trends, doctor demographics, regulatory policies, environment, technology etc. are forcing the industry to turn to emerging technologies like AI, to help adapt to the pace of change. Here, we take a look at some key use cases of AI in Healthcare.

Medical Imaging

The application of Machine Learning (ML) in Medical Imaging is showing highly encouraging results. ML is a subset of AI, where algorithms and models are used to help machines imitate the cognitive functions of the human brain and to also self-learn from their experiences.

AI can be gainfully used in the different stages of medical imaging – acquisition, image reconstruction, processing, interpretation, storage, data mining & beyond. The performance of ML computational models improves tremendously as they are exposed to more & more data, and this foundation of colossal amounts of data enables them to gradually surpass humans at interpretation. They begin to detect anomalies not perceptible to the human eye & not discernible to the human brain!

What goes hand-in-hand with data is noise. Noise creates artifacts in images and reduces their quality, leading to inaccurate diagnosis. AI systems work through the clutter and aid noise reduction, leading to better precision in diagnosis, prognosis, staging, segmentation and treatment.

At the forefront of this use case is Radiogenomics – correlating cancer imaging features with gene expression. Needless to say, this will play a pivotal role in cancer research.

Drug Discovery

Drug Discovery is an arduous process that takes several years from the start of research to obtaining approval to market. Research involves laboring through copious amounts of medical literature to identify the dynamics between genes, molecular targets, pathways and candidate compounds. Sifting through all of this complex data to arrive at conclusions is an enormous challenge. When this voluminous data is fed to ML computational models, relationships are reliably established. AI powered by domain knowledge is slashing the time & cost involved in new drug development.

Cybersecurity in Healthcare

Data security is of paramount importance to Healthcare providers, who need to ensure confidentiality, integrity, and availability of patient data. With cyberattacks increasing in number and complexity, these formidable threats are giving security teams sleepless nights! The main strength of AI is its ability to curate massive quantities of data – in this case, threat intelligence – nullify the noise, provide instant insights & self-learn in the process. The predictive & prescriptive capabilities of these computational models drastically reduce response time.

Virtual Health assistants

Virtual Health assistants like chatbots give patients 24/7 access to critical information, in addition to offering services like scheduling health check-ups or setting up appointments. AI-based platforms for wearable health devices and health apps come armed with loads of features to monitor health signs, daily activities, diet, sleep patterns etc., and provide alerts for immediate action or suggest personalized plans to enable healthy lifestyles.

AI for Healthcare IT Infrastructure

Healthcare IT Infrastructure running critical applications that enable patient care is the heart of a Healthcare provider. With dynamically changing IT landscapes that are distributed, hybrid & on-demand, IT Operations teams are finding it hard to keep up. Artificial Intelligence for IT Ops (AIOps) is poised to fundamentally transform the Healthcare Industry. It is powering Healthcare Providers across the globe, who are adopting it to Automate, Predict, Remediate & Prevent Incidents in their IT Infrastructure. GAVS’ Zero Incident Framework™ (ZIF) – an AIOps Platform – is a pure-play AI platform based on unsupervised Machine Learning and comes with the full suite of tools an IT Infrastructure team would need. Please watch this video to learn more.


Disaster Recovery for Modern Digital IT

A Disaster Recovery strategy includes the policies, tools and processes for recovery of data and restoration of systems in the event of a disruption. The cause of disruption could be natural, like earthquakes or floods, or man-made, like power outages, hardware failures, terror attacks or cybercrimes. The aim of Disaster Recovery (DR) is to enable rapid recovery from the disaster to minimize data loss, extent of damage, and disruption to business. DR is often confused with Business Continuity Planning (BCP). While BCP ensures restoration of the entire business, DR is a subset of that, with focus on IT infrastructure, applications and data.

IT disasters come at the cost of lost revenue, tarnished brand image, lowered customer confidence and even legal issues relating to data privacy and compliance. The impact can be so debilitating that some companies never fully recover from it. With the average cost of IT downtime running to thousands of dollars per minute, it goes without saying that an enterprise-grade disaster recovery strategy is a must-have.

Why do companies neglect this need?

In spite of the obvious consequences of a disaster, many organizations shy away from investing in a DR strategy due to the associated expenditure. Without a clear ROI in sight, these organizations decide to risk their vulnerability to catastrophic disruptions. They instead make do with just data backup plans, or secure only some of the most critical elements of their IT landscape.

Why is Disaster Recovery different today?

The ripple effects of modern digital infrastructure have forced an evolution in DR strategies. Traditional Disaster Recovery methods are being overhauled to cater to the new hybrid IT infrastructure environment. Some influencing factors:

  • The modern IT Landscape

    • Infrastructure – Today’s IT environment is distributed between on-premise, colocation facilities, public/private cloud, as-a-service offerings and edge locations. Traditional data centers are losing their prominence and are having to share their monopoly with these modern technologies. This trend has significant advantages such as reduced CapEx in establishing data centers, reduced latency because data is closer to the user, and high dynamic scalability.

    • Data – Adding to the complexity of modern digital infrastructure is the exponential growth in data from varied sources and of disparate types: big data, mobile data, streaming content, data from the cloud, social media, edge locations and IoT, to name a few.

    • Applications – The need for agility has triggered the shift away from monolith applications towards microservices that typically use containers to provide their execution environment. Containers are ephemeral, and so they scale, shrink, disappear or move between nodes based on demand.
  • While innovation in IT helps digital transformation in unimaginable ways, it also makes it that much harder for IT teams to formulate a disaster recovery strategy for today’s IT landscape that is distributed, mobile, elastic and transient.
  • Cybercrimes are becoming increasingly prevalent and are a big threat to organizations. Modern technologies fuel increasing sophistication in malware and ransomware. As their complexity increases, they are becoming harder to detect while they lie low and do their harm quietly inside the environment. By the time they are detected, the damage is done and it’s too late. DR strategies are also constantly challenged by the lucrative underworld of ransomware.

Solution Strategies for Disaster Recovery

  • On-Premise DR: This is the traditional option that translates to heavy upfront investments towards the facility, securing the facility, infrastructure including network connectivity/firewalls/load balancers, resources to scale as needed, manpower, test drills, ongoing management and maintenance, software licensing costs, periodic upgrades for ongoing compatibility with the production environment and much more.

A comprehensive DR strategy involves putting together the pieces of a complex puzzle. Due to the staggering costs and time involved in provisioning and managing infra for the duplicate storage and compute, companies are asking themselves if it is really worth the investment, and are starting to explore more OpEx-based solutions. And they are discovering that the cloud may be the answer to this challenge of evolving infra, offering cost-effective, top-notch resiliency.

  • Cloud-based DR: The easy availability of public cloud infrastructure & services, with affordable monthly subscription plans and pay-per-use rates, has caused an organic switch to the cloud for storage, infra and as-a-Service (aaS) needs. To complement this, replication techniques have also evolved to enable cloud replication. With backup on the cloud, the recovery environment needs to be paid for only when used, in the event of a disaster!

Since maintaining the DR site is the vendor’s responsibility, it reduces the complexity in managing the DR site and the associated operating expenses as well. Most DR requirements are intrinsically built into cloud solutions: redundancy, advanced networks, bandwidth, scalability, security & compliance. These can be availed on demand, as necessitated by the environment and recovery objectives. These features have made it feasible for even small businesses to acquire DR capabilities.

Disaster Recovery-as-a-Service (DRaaS), which is fast gaining popularity, is a DR offering on the cloud, where the vendor manages the replication, failover and failback mechanisms as needed for recovery, based on an SLA-driven service contract.

On the flip side, as cloud adoption becomes more and more prevalent, there are also signs of a reverse drain back to on-premise! Over time, customers are noticing that they are bombarded by hefty cloud usage bills, way more than what they had bargained for. There is a steep learning curve in assimilating the nuances of new cloud technologies and the innumerable options they offer. It is critical for organizations to clearly evaluate their needs, narrow down on reliable vendors with mature offerings, understand their feature set and billing nitty-gritties, and finalize the best fit for their recovery goals. So, it is Cloud, but with Caution!

  • Integrating DR with the Application: Frank Jablonski, VP of Global Marketing, SIOS Technology Corp, predicts that applications will soon have Disaster Recovery architected into their core, as a value-add. Cloud-native implementations will leverage the resiliency features of the cloud to deliver this value.

The Proactive Approach

Needless to say, investing in a proactive approach to disaster prevention will help mitigate the chances of a disaster in the first place. One sure-fire way to optimize IT infrastructure performance, prevent certain types of disasters and enhance business service continuity is to use AI-augmented ITOps platforms to manage the IT environment. GAVS’ AIOps platform, Zero Incident Framework™ (ZIF), has modules powered by Advanced Machine Learning to Discover, Monitor, Analyze, Predict, and Remediate, helping organizations drive towards a Zero Incident Enterprise™. For more information, please visit the ZIF website.


DevOps

Today’s mantra for software delivery is Agility. It is a huge differentiator that gives organizations a competitive edge, and emboldens even fledgling start-ups to challenge giants in the IT industry. Traditional methods of software development have not been able to cope with today’s velocity of delivery and innovation demands, and are screaming for a lighter yet holistic approach. In traditional development models like Waterfall, phases of the development life cycle are followed sequentially: Requirements, Analysis & Design, Development, Integration & Testing, Deployment, and Maintenance, with documentation and sign-off at the end of each phase. This approach is heavy and documentation-intensive and is not quite responsive to requirement or scope changes since it dictates strict adherence to the linear process model. And worse still, the end product of a long-drawn development cycle may not be quite what the customer expected!

Agile development broadly refers to methodologies like Scrum & Kanban that are based on iterative development/testing in short bursts called sprints, continuous feedback, retrospection, course-correction and constant collaboration amongst the teams involved & the customer. This facilitates incremental evolution of the software and adaptability.

What does DevOps bring to the table?

The idea behind DevOps is to foster a collaborative work culture within the organization, where the Development, QA & ITOps teams work together as one cross-functional unit towards common goals. When the walls of team silos are broken down, teams integrate well with each other, and there is a free flow of communication, it percolates down to quicker, quality deliverables with very low failure rates. It is such a basic but powerful idea that it makes us wonder why we didn’t do it all along!

DevOps practices help create a standardized and stable operating environment and eliminate the warring dynamics between Dev, QA & Ops teams, where each works to its own agenda, refuses to take ownership and blames the others when issues occur. As one multi-disciplinary unit working together from day one, the team has full control and autonomy over the entire software delivery process.

DevOps is a winning combination of healthy work culture, processes and tools/cloud services. A good DevOps implementation ensures incremental evolution of software through processes like Continuous Integration and Continuous Delivery/Continuous Deployment (CI/CD), automated in a process pipeline with an integrated toolchain, so that human-induced wait-time/lag does not hamper agility. Continuous Testing and Continuous Performance Monitoring are also an integral part of this pipeline. Although there will be differences in DevOps implementation styles and the tools/services used, here’s a quick look at a typical CI/CD process.

Frequent commits and builds of small blocks of code are a DevOps best practice. This prevents the chaos that usually ensues when code from different feature branches gets integrated into the main code branch, firing merge conflicts. Continuous Integration is a workflow strategy that involves compiling the committed code on a source repository like GitHub, Azure Repos or Bitbucket, and validating it with automated static code analysis, unit and integration tests. The quality of the test suite determines the quality of the newly integrated code. Typically this involves an integration server/CI service like Jenkins, Azure DevOps or GoCD that gets triggered on commit (or on a pull request (PR), as the implementation may be), builds the code and runs the automated tests.
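
Purely as a conceptual illustration of what a CI service automates on every commit – build the code, run the checks, and fail fast – here is a minimal pipeline-runner sketch. The commands (flake8, pytest, a build step) are placeholders for whatever linter, test runner and packaging step a project actually uses; a real CI server like Jenkins or Azure DevOps would be configured declaratively rather than scripted this way.

```python
import subprocess
import sys

# Placeholder stages -- substitute your project's own lint, test and build commands.
PIPELINE = [
    ("static analysis",    ["flake8", "."]),
    ("unit tests",         ["pytest", "tests/unit", "-q"]),
    ("integration tests",  ["pytest", "tests/integration", "-q"]),
    ("build",              ["python", "setup.py", "sdist"]),
]

def run_pipeline():
    """Run each stage in order and fail fast, as a CI server would on every commit."""
    for name, cmd in PIPELINE:
        print(f"--- {name}: {' '.join(cmd)}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"Stage '{name}' failed; the build is red.")
            sys.exit(result.returncode)
    print("All stages passed; the build is green.")

if __name__ == "__main__":
    run_pipeline()
```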

Continuous Delivery (CD) is an extension of Continuous Integration (CI) where further tests such as UI/UX, load, UAT and QA are run in varied environments, and the build is finally deployed to a staging environment. This process makes the build deploy-ready for production, subject to approval.

Continuous Deployment is the same as Continuous Delivery, with the exception that the deployable package is automatically promoted into production without the need for human approval. This is routinely done in highly mature DevOps implementations but is obviously too risky for those just starting out on their DevOps journeys. Such organizations could stay with Continuous Delivery until their DevOps environment stabilizes. For those progressing to Continuous Deployment, DevOps tooling offers granular control over aspects such as which users a release is deployed to and the time of deployment.

Infrastructure as Code

Infrastructure as Code (IaC) is another important DevOps practice used with Continuous Delivery. As the name suggests, this is the management of infrastructure using code, where the desired configuration settings of the environment are specified as code. Every time this code is run, the same environment is generated. This solves issues arising out of inconsistencies in environment configurations and the need to manually maintain the settings of deployment environments. The pipeline executes this code to configure multiple test targets, enabling application testing in simulated production environments. IaC typically follows application code versioning and is validated just like regular code. It also enables dynamic provisioning of environments on demand.
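
Real IaC is usually written declaratively in tools such as Terraform, Ansible or cloud-native templates; the sketch below only illustrates the core property – a versioned, declarative description of the environment that can be applied repeatedly (idempotently) to converge to the same result. The paths and settings are made up for illustration.

```python
import json
import os

# A made-up, declarative description of an environment. In practice this
# specification would live in version control alongside the application code.
DESIRED_STATE = {
    "directories": ["/tmp/demo-app", "/tmp/demo-app/logs"],
    "config_files": {
        "/tmp/demo-app/app.json": {"port": 8080, "log_level": "INFO"},
    },
}

def apply(state):
    """Idempotently converge the machine to the desired state.

    Running this twice produces the same environment as running it once --
    the property that makes configuration-as-code reproducible.
    """
    for path in state["directories"]:
        os.makedirs(path, exist_ok=True)          # no-op if already present
    for path, desired in state["config_files"].items():
        current = None
        if os.path.exists(path):
            with open(path) as f:
                current = json.load(f)
        if current != desired:                    # only write on drift
            with open(path, "w") as f:
                json.dump(desired, f, indent=2)
            print(f"converged {path}")

if __name__ == "__main__":
    apply(DESIRED_STATE)
```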

DevOps Benefits

High delivery velocity and predictable quality are automatic outcomes of a good DevOps implementation, since iterative development based on feedback & course-correction, continuous testing & monitoring are core to the methodology. No time is lost managing the mayhem of big releases. Continuous customer feedback looped into the process helps avoid go-live surprises (read: shocks) for the customer.

DevOps principles foster responsible autonomy and a spirit of collaboration where the entire cross-functional, cross-trained DevOps team works together and takes end-to-end responsibility from start to finish.

Pipeline automation injects speed and delivery predictability, and enhances productivity by freeing human resources from manual tasks, giving them a sense of satisfaction and purpose as they routinely see their work in customers’ hands.

Taking the plunge

Everyone wants to hop on the DevOps bandwagon but not all of them have clarity on where and how to start. DevOps is first and foremost a change in work culture and its implementation is primarily an exercise in changing mindsets and behavioural aspects at the workplace.

Importantly, the organization needs to arrive at clear objectives and expected business outcomes. As with most things, it’s always a good idea to start small, stabilize and use that as a pivot to move forward to the next baby step. Piloting it on a low risk application with few users will help everyone ease into the new style of working and give the DevOps movement some momentum. It would also help to initially run the current and DevOps environments in parallel to reduce risk.

Importantly, the initiative needs to be backed by the right processes, automation tools and employee enablement by cross training software developers, QA and IT personnel, enabling them to take charge of the entire process.

Amazon and Netflix are fantastic examples of DevOps done right. Dave Hahn, SRE manager at Netflix, says that as a company they don’t think about DevOps – DevOps is just the wonderful result of a healthy culture and healthy thinking! Netflix, according to Dave, has millions of customers across the globe, handles hundreds of thousands of customer interactions every second, streams tens of billions of hours of entertainment every quarter and manages it all with just tens of Ops engineers, who are also software engineers! That’s the power of DevOps!

For information on our DevOps offerings, please click here.

Data Migration Powered by RPA

What is RPA?

Robotic Process Automation (RPA) is the use of specialized software to automate repetitive tasks. Offloading mundane, tedious grunt work to software robots frees up employee time to focus on more cerebral tasks with better value-add. So, organizations are looking at RPA as a digital workforce to augment their human resources. Since robots excel at rules-based, structured, high-volume tasks, they help improve business process efficiency and reduce time and operating costs, thanks to the reliability, consistency & speed they bring to the table.

Generally, RPA is low-cost, has faster deployment cycles as compared to other solutions for streamlining business processes, and can be implemented easily. RPA can be thought of as the first step to more transformative automations. With RPA steadily gaining traction, Forrester predicts the RPA Market will reach $2.9 Billion by 2021.

Over the years, RPA has evolved from low-level automation tasks like screen scraping to more cognitive ones where the bots can recognize and process text/audio/video, self-learn and adapt to changes in their environment. Such Automation supercharged by AI is called Intelligent Process Automation.

Use Cases of RPA

Let’s look at a few areas where RPA has resulted in a significant uptick in productivity.

Service Desk – One of the biggest time-guzzlers for customer service teams is sifting through the scores of emails/phone calls/voice notes received every day. RPA can be effectively used to scour them, interpret content, classify/tag/reroute or escalate as appropriate, raise tickets in the logging system and even drive certain routine tasks like password resets to closure!

Claims Processing – This can be used across industries and results in tremendous time and cost savings. It includes interpreting information in the forms, verification of information, authentication of e-signatures & supporting documents, and first-level approval/rejection based on the outcome of the verification process.

Data Transfers – RPA is an excellent fit for tasks involving data transfer, either to transfer data on paper to systems for digitization, or to transfer data between systems during data migration processes.

Fraud Detection – Can be a big value-add for banks, credit card/financial services companies as a first line of defense, when used to monitor account or credit card activity and flag suspicious transactions.

Marketing Activities – Can be a very resourceful member of the marketing team, helping in all activities right from lead gen to nurturing leads through the funnel with relevant, personalized, targeted content delivery.

Reporting/Analytics – RPA can be used to generate reports and analytics on predefined parameters and KPIs, which can give insights into the health of the automated process and the effectiveness of the automation itself.

The above use cases are a sample list to highlight the breadth of their capabilities. Here are some industry-specific tasks where RPA can play a significant role.

Banks/Financial Services/Accounting Firms – Account management through its lifecycle, Card activation/de-activation, foreign exchange payments, general accounting, operational accounting, KYC digitization

Manufacturing, SCM – Vendor handling, Requisition to Purchase Order, Payment processing, Inventory management

HR – Employee lifecycle management from On-boarding to Offboarding, Resume screening/matching

Data Migration Triggers & Challenges

A common trigger for data migration is when companies want to sunset their legacy systems or integrate them with their new-age applications. For some, there is a legal mandate to retain legacy data, as with patient records or financial information, in which case these organizations might want to move the data to a lower-cost or current platform and then decommission the old system.

This is easier said than done. The legacy systems might have their data in flat files or non-relational DBs or may not have APIs or other standards-based interfaces, making it very hard to access the data. Also, they might be based on old technology platforms that are no longer supported by the vendor. For the same reasons, finding resources with the skillset and expertise to navigate through these systems becomes a challenge.

Two other common triggers for data migrations are mergers/acquisitions which necessitate the merging of systems and data and secondly, digital transformation initiatives. When companies look to modernize their IT landscape, it becomes necessary to standardize applications and remove redundant ones across application silos. Consolidation will be required when there are multiple applications for the same use cases in the merged IT landscape.

Most times such data migrations can quickly spiral into unwieldy projects, due to the sheer number, size, and variety of the systems and data involved, demanding meticulous design and planning. The first step would be to convert all data to a common format before transition to the target system which would need detailed data mappings and data cleansing before and after conversion, making it extremely complex, resource-intensive and expensive.

RPA for Data Migration

Structured processes that can be precisely defined by rules is where RPA excels. So, if the data migration process has clear definitions for the source and target data formats, mappings, workflows, criteria for rollback/commit/exceptions, unit/integration test cases and reporting parameters, half the battle is won. At this point, the software bots can take over!

Another hurdle for humans performing such highly repetitive tasks is mental exhaustion, which can lead to slowing down, errors and inconsistency. Since RPA is unfazed by volume, complexity or monotony, it automatically translates to better process efficiency and cost benefits. Employee productivity also increases because they are not subjected to mind-numbing work and can focus on other interesting tasks at hand. And since the software bots can be configured to create log files/reports/dashboards in any format, level of detail & propagation type/frequency, traceability, compliance, and complete visibility into the process are additional happy outcomes!
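
Purely as an illustration of the kind of rules such a migration bot executes, here is a minimal sketch that reads records from a source CSV, applies a field mapping and a validation rule, writes valid rows to the target format and routes exceptions to a log. The file names, field mappings and validation rule are assumptions for the sketch, not a reference to any particular RPA product.

```python
import csv
import logging

logging.basicConfig(filename="migration.log", level=logging.INFO)

# Assumed source->target field mapping for this sketch.
FIELD_MAP = {"cust_name": "customer_name", "dob": "date_of_birth", "ph": "phone"}

def valid(row):
    """Minimal validation rule: every mapped source field must be non-empty."""
    return all(row.get(src, "").strip() for src in FIELD_MAP)

def migrate(source_path, target_path):
    migrated, exceptions = 0, 0
    with open(source_path, newline="") as src, open(target_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=list(FIELD_MAP.values()))
        writer.writeheader()
        for row in reader:
            if valid(row):
                # Apply the mapping and write the converted record to the target.
                writer.writerow({dst_col: row[src_col].strip()
                                 for src_col, dst_col in FIELD_MAP.items()})
                migrated += 1
            else:
                exceptions += 1
                logging.info("exception row routed for review: %s", row)
    logging.info("migrated=%d exceptions=%d", migrated, exceptions)

if __name__ == "__main__":
    migrate("legacy_customers.csv", "target_customers.csv")
```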

To RPA or not to RPA?

Well, while RPA holds a lot of promise, there are some things to keep in mind:

  • Important to choose the right processes/use-cases to automate, else it could lead to poor ROI
  • Quality of the automation depends heavily on diligent design and planning
  • Integration challenges with other automation tools in the landscape
  • Heightened data security and governance concerns since it will have full access to the data
  • Periodic reviews required to ensure expected RPA behavior
  • Dynamic scalability might be an issue when there are unforeseen spikes in data or usage patterns
  • Lack of flexibility to adapt to changes in underlying systems/platforms could make it unusable

But like all other transformational initiatives, the success of RPA depends on doing the homework right, taking informed decisions, choosing the right vendor(s) and product(s) that align with your Business imperatives, and above all, a whole-hearted buy-in from the business, IT & Security teams and the teams that will be impacted by the RPA.

Analyze

Have you heard of AIOps?

Artificial intelligence for IT operations (AIOps) is an umbrella term for the application of Big Data Analytics, Machine Learning (ML) and other Artificial Intelligence (AI) technologies to automate the identification and resolution of common Information Technology (IT) problems. The systems, services and applications in a large enterprise produce immense volumes of log and performance data. AIOps uses this data to monitor the assets and gain visibility into the working behaviour and dependencies between these assets.

According to a Gartner study, the adoption of AIOps by large enterprises would rise to 30% by 2023.

ZIF – The ideal AIOps platform of choice

Zero Incident Framework™ (ZIF) is an AIOps-based TechOps platform that enables proactive detection and remediation of incidents, helping organizations drive towards a Zero Incident Enterprise™.

ZIF comprises five modules: Discover, Monitor, Analyze, Predict, and Remediate.

At the heart of ZIF lie its Analyze and Predict (A&P) modules, which are powered by Artificial Intelligence and Machine Learning techniques. From a business perspective, the primary goal of A&P is 100% availability of applications and business processes.

Come, let us understand more about the Analyze function of ZIF.

With Analyze having a Big Data platform under its hood, volumes of raw monitoring data, both structured and unstructured, can be ingested and grouped to build linkages and identify failure patterns.

Data Ingestion and Correlation of Diverse Data

The module processes a wide range of data from varied data sources to break siloes while providing insights, exposing anomalies and highlighting risks across the IT landscape. It increases productivity and efficiency through actionable insights.

  • 100+ connectors for leading tools, environments and devices
  • Correlation and aggregation methods uncover patterns and relationships in the data

Noise Nullification

This function eliminates duplicate incidents, false positives and alerts that are insignificant. It also helps reduce the Mean-Time-To-Resolution and the event-to-incident ratio. A toy deduplication sketch follows the list below.

  • Deep learning algorithms isolate events that have the potential to become incidents along with their potential criticality
  • Correlation and aggregation methods group alerts and incidents that are related and need a common remediation
  • Reinforcement learning techniques are applied to find and eliminate false positives and duplicates
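
ZIF's own deep learning and reinforcement learning models are proprietary; purely to illustrate the idea of suppression and deduplication, here is a toy sketch that drops alerts below a severity threshold and collapses alerts sharing the same fingerprint within a time window. The field names, severity scale and window size are assumptions.

```python
from datetime import datetime, timedelta

DEDUP_WINDOW = timedelta(minutes=5)
MIN_SEVERITY = 3   # assumed scale: 1 (info) .. 5 (critical)

def nullify_noise(alerts):
    """Drop insignificant alerts and collapse duplicates within a time window.

    Each alert is a dict like:
    {"host": "srv01", "metric": "cpu", "severity": 4, "time": datetime(...)}
    """
    kept, last_seen = [], {}
    for alert in sorted(alerts, key=lambda a: a["time"]):
        if alert["severity"] < MIN_SEVERITY:
            continue                                   # insignificant alert
        fingerprint = (alert["host"], alert["metric"], alert["severity"])
        previous = last_seen.get(fingerprint)
        if previous and alert["time"] - previous <= DEDUP_WINDOW:
            continue                                   # duplicate of a recent alert
        last_seen[fingerprint] = alert["time"]
        kept.append(alert)
    return kept

alerts = [
    {"host": "srv01", "metric": "cpu", "severity": 4, "time": datetime(2020, 1, 1, 10, 0)},
    {"host": "srv01", "metric": "cpu", "severity": 4, "time": datetime(2020, 1, 1, 10, 2)},  # duplicate
    {"host": "srv02", "metric": "disk", "severity": 1, "time": datetime(2020, 1, 1, 10, 3)},  # noise
]
print(len(nullify_noise(alerts)))   # -> 1
```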

Event Correlation

Data from various sources is ingested into ZIF in real time, via either a push or a pull mechanism. As the data is ingested, labelling algorithms are run to label the data based on identifiers. The labelled data is passed through the correlation engine, where unsupervised algorithms are run to mine patterns. Sub-sequence mining algorithms help in identifying unique patterns from the data.

Unique patterns identified are clustered using clustering algorithms to form cases. Every case that is generated is marked by a unique case id. As part of the clustering process, seasonality aspects are checked from historical transactions to derive higher accuracy of correlation.

Correlation is done based on pattern recognition, helping to eliminate the need for a relational CMDB in the enterprise. The accuracy of the correlation increases as patterns recur. The algorithms can also unlearn patterns, based on feedback from actions taken on the correlations. As these are unsupervised algorithms, the patterns are learnt with zero human intervention.
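
The sketch below is only a simplified stand-in for ZIF's unsupervised and sub-sequence mining algorithms: it groups labelled events that occur close together in time into cases, gives each case a unique id, and tracks how often each pattern signature recurs. The event fields and the window size are assumptions.

```python
import hashlib
from datetime import datetime, timedelta
from itertools import count

WINDOW = timedelta(minutes=2)     # assumed correlation window
pattern_counts = {}               # pattern signature -> number of recurrences
next_case = count(1)              # generator for unique case ids

def close_window(events_in_window):
    """Turn one window of co-occurring events into a case with a unique id."""
    labels = sorted({e["label"] for e in events_in_window})
    signature = hashlib.md5("|".join(labels).encode()).hexdigest()[:8]
    pattern_counts[signature] = pattern_counts.get(signature, 0) + 1
    return {"case_id": f"CASE-{next(next_case):04d}", "pattern": labels,
            "signature": signature, "recurrences": pattern_counts[signature],
            "events": events_in_window}

def correlate(events):
    """Group labelled events that occur close together in time into cases."""
    cases, current, window_start = [], [], None
    for event in sorted(events, key=lambda e: e["time"]):
        if window_start is None or event["time"] - window_start > WINDOW:
            if current:
                cases.append(close_window(current))
            current, window_start = [], event["time"]
        current.append(event)
    if current:
        cases.append(close_window(current))
    return cases

events = [
    {"label": "db_latency_high", "time": datetime(2020, 1, 1, 9, 0)},
    {"label": "app_timeout",     "time": datetime(2020, 1, 1, 9, 1)},
    {"label": "db_latency_high", "time": datetime(2020, 1, 1, 9, 30)},
    {"label": "app_timeout",     "time": datetime(2020, 1, 1, 9, 31)},
]
for case in correlate(events):
    # The same event pattern recurring later forms a new case with the same signature.
    print(case["case_id"], case["signature"], "recurrence", case["recurrences"], case["pattern"])
```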

Accelerated Root Cause Analysis (RCA)

The Analyze module helps in identifying the root causes of incidents even when they occur across different silos. The combination of correlation algorithms with unsupervised deep learning techniques aids in accurately nailing down the root causes of incidents/problems. Learnings from historical incidents are also applied to find root causes in real time. The platform retraces user journeys step by step to identify the exact point where an error occurs.

Customer Success Story – How ZIF’s A&P transformed IT Operations of a Manufacturing Giant

  • Seamless end-to-end monitoring – OS, DB, Applications, Networks
  • Helped achieve more than 50% noise reduction in 6 months
  • Reduced P1 incidents by ~30% through dynamic and deep monitoring
  • Achieved declining trend of MTTR and an increasing trend of Availability
  • Resulted in optimizing command centre/operations headcount by ~50%
  • Resulted in ~80% reduction in operations TCO

For more detailed information on GAVS’ Analyze, or to request a demo, please visit zif.ai/products/analyze

References: www.gartner.com/smarterwithgartner/how-to-get-started-with-aiops

ABOUT THE AUTHOR

Vasudevan Gopalan


Vasu heads the Engineering function for A&P. He is a Digital Transformation leader with ~20 years of IT industry experience spanning Product Engineering, Portfolio Delivery, Large Program Management, etc. Vasu has designed and delivered Open Systems, Core Banking, and Web/Mobile Applications.

Outside of his professional role, Vasu enjoys playing badminton and focuses on fitness routines.


CCPA for Healthcare

The California Consumer Privacy Act (CCPA) is a state statute intended to enhance consumer protection and data privacy rights of the residents of California, United States. It is widely considered one of the most sweeping consumer privacy laws, giving Californians the strongest data privacy rights in the U.S.

The focus of this article is CCPA as it applies to Healthcare. Let’s take a quick look at what CCPA is and then move on to its relevance for Healthcare entities. CCPA is applicable to any for-profit organization – regardless of whether it physically operates out of California – that interacts with, does business with and/or collects, processes or monetizes personal information of California residents AND meets at least one of these criteria: has annual gross revenue in excess of $25 million USD; collects or transacts with the personal information of 50,000 or more California consumers, households, or devices; or earns 50% or more of its annual revenue by monetizing such data. CCPA also empowers California consumers with the rights to complete ownership, control, and security of their personal information and imposes new, stringent responsibilities on businesses to enable these rights for their consumers.

Impact on Healthcare Companies

Companies directly or indirectly involved in the healthcare sector and dealing with medical information are regulated by the Confidentiality of Medical Information Act (CMIA) and the Health Insurance Portability and Accountability Act (HIPAA). CCPA does not supersede these laws & does not apply to ‘Medical Information (MI)’ as defined by CMIA, or to ‘Protected Health Information (PHI)’ as defined by HIPAA. CCPA also excludes de-identified data and information collected by federally-funded clinical trials, since such research studies are regulated by the ‘Common Rule’.

The focus of the CCPA is ‘Personal Information (PI)’ which means information that “identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.” PI refers to data including but not limited to personal identifiers such as name, address, phone numbers, email ids, social security number; personal details relating to education, employment, family, finances; biometric information, geolocation, consumer activity like purchase history, product preferences; internet activity.

So, if CCPA only regulates personal information, are healthcare companies that are already in compliance with CMIA and HIPAA safe? Is there anything else they need to do?

Well, there is a lot that needs to be done! This only implies that such companies should continue to comply with those rules when handling Medical Information as defined by the CMIA, or Protected Health Information as defined by HIPAA. They will still need to adhere to CCPA regulations for personal data that is outside of MI and PHI. This will include employee personal information routinely obtained and processed by the company’s HR; data collected from websites, health apps, health devices and events; clinical studies that are not funded by the federal government; and information of a CCPA-covered entity that is handled by a non-profit affiliate, to give a few examples.

There are several possibilities – some not so apparent – even in healthcare entities, for personal data collection and handling that would fall under the purview of CCPA. They need to take stock of the different avenues through which they might be obtaining or handling such data and prioritize CCPA compliance. Otherwise, with the stringent CCPA regulations, they could quickly find themselves embroiled in class action lawsuits (which, by the way, do not require proof of damage to the plaintiff) in case of data breaches, or facing statutory penalties of up to $7,500 for each violation.

The good news is that since CCPA carves out a significant chunk of data that healthcare companies/those involved in healthcare-related functions collect and process, entities that are already complying with HIPAA and CMIA are well into the CCPA compliance journey. A peek into the kind of data CMIA & HIPAA regulate will help gauge what other data needs to be taken care of.

CMIA protects the confidentiality of Medical Information (MI) which is “individually identifiable information, in electronic or physical form, in possession of or derived from a provider of health care, health care service plan, pharmaceutical company, or contractor regarding a patient’s medical history, mental or physical condition, or treatment.”

HIPAA regulates how healthcare providers, health plans, and healthcare clearinghouses, referred to as ‘covered entities’, can use and disclose Protected Health Information (PHI), and requires these entities to enable protection of data privacy. PHI refers to individually identifiable medical information such as medical records, medical bills, lab tests, scans and the like. This also covers PHI in electronic form (ePHI). The privacy and security rule of HIPAA is also applicable to ‘business associates’ who provide services to the ‘covered entities’ that involve the use or disclosure of PHI.

Two other types of data that are CCPA exempt are Research Data & De-Identified Data. As mentioned above, the ‘Common Rule’ applies only to federally-funded research studies, and the CCPA does not provide much clarity on exemption status for data from clinical trials that are not federally-funded.

And, although the CCPA does not apply to de-identified data, the definitions of de-identified data of HIPAA and CCPA slightly differ which makes it quite likely that de-identified data by HIPAA standards may not qualify under CCPA standards and therefore would not be exempt from CCPA regulations.

Compliance Approach

Taking measures to ensure compliance with regulations is cumbersome and labour-intensive, especially with the constantly evolving regulatory environment. Using this opportunity for a proactive, well-thought-out approach to comprehensive enterprise-wide data security and governance will be strategically wise, since it will minimize the need to overhaul policies and processes with each new regulation.

The most crucial step is a thorough assessment of the following:

  • Policies, procedures, workflows, entities relating to/involved in data collection, sharing and processing, in order to arrive at clear enterprise-wide data mapping; to determine what data, data activities, data policies would fall under the scope of CCPA; and to identify gaps and decide on prioritized action items for compliance.
  • Business processes, contracts, and terms of agreement with affiliates, partners and third-party entities the company does business with, to understand CCPA applicability. In some cases, HIPAA and CMIA may be applicable to only the healthcare-related business units, subjecting other business units to CCPA compliance.

  • Current data handling methods, not just their privacy & security. CCPA dictates that companies need to have mechanisms in place to cater to the CCPA consumer rights to request all information relating to the personal data collected about them, to opt out of the sale of their data, and to have their data deleted by the organization (which will extend to third parties doing business with this organization as well).

Consumer Consent Management

With CCPA giving full ownership and control of personal data back to its owners, consent management mechanisms become the pivot of a successful compliance strategy. An effective mechanism will ensure proper administration and enforcement of consumer authorizations.

Considering the limitations of current market solutions for data privacy and security, GAVS has come up with its Blockchain-based Rhodium Framework (patent pending) for Customer Master Data Management and Compliance with Data Privacy Laws like CCPA.

You can get more details on CCPA in general and GAVS’ solution for true CCPA Compliance in our White Paper, Blockchain Solution for CCPA Compliance.


Proactive Monitoring

Is your IT environment proactively monitored?

It is important to have the right monitoring solution for an enterprise’s IT environment. More than that, it is imperative to leverage the right solution and deploy it for the appropriate requirements. In this context, the IT environment includes, but is not limited to, Applications, Servers, Services, End-User Devices, Network devices, APIs, Databases, etc. Towards that, let us understand the need for and importance of Proactive Monitoring, which has a direct role in the journey towards a Zero Incident Enterprise™. Let us unravel the difference between reactive and proactive monitoring.

Reactive Monitoring – When a problem occurs in an IT environment, it gets notified through monitoring and the concerned team acts on it to resolve the issue. The problem could be as simple as slowness/poor performance, or as extreme as the unavailability of services, like a website going down or a server crashing, leading to loss of business and revenue.

Proactive Monitoring – There are two levels of proactive monitoring:

  • Symptom-based proactive monitoring is all about identifying the signals and symptoms of an issue in advance and taking appropriate and immediate action to nip the root-cause in the bud.
  • Synthetic-based proactive monitoring is achieved through Synthetic Transactions. Performance bottlenecks or failures are identified much in advance, even before the actual user or the dependent layer encounters the situation.

Symptom-based proactive monitoring is a USP of the ZIF Monitor module. Take, for example, CPU-related monitoring. It is common to monitor CPU utilization and act based on that. But Monitor doesn’t just focus on CPU utilization; there are a number of underlying factors which cause CPU utilization to go high. To name a few:

  • Processor queue length 
  • Processor context switches
  • Processes that are contributing to high CPU utilization

It is important to arrest these brewing factors at the right time. In the case of processor queue length, for instance, a continuous or sustained queue of greater than 2 threads is generally an indication of congestion at the processor level. Of course, in a multiple-processor environment, we need to divide the queue length by the number of processors servicing the workload. As a remedy, the following can be done:

1) The number of threads can be limited at the application level;

2) Unwanted processes can be killed to help clear the queued items;

3) Upgrading the processor will help keep the queue length under control, which eventually controls CPU utilization.

The above is a sample demonstration of finding the symptoms and signals and arresting them proactively. ZIF’s Monitor not only monitors these symptoms, but also suggests the remedy through recommendations from SMEs.
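
As a minimal illustration of such symptom checks (not the ZIF Monitor implementation), the sketch below uses the psutil library to compare load average against core count as a rough proxy for processor queueing, measure the context-switch rate, and list the top CPU-consuming processes. The thresholds are illustrative assumptions.

```python
import time
import psutil   # cross-platform system metrics library

CORES = psutil.cpu_count(logical=True)

def check_cpu_symptoms():
    """One pass of symptom checks: queueing pressure, context switches, top consumers."""
    # Load average per core as a rough proxy for processor queue length.
    load_1min = psutil.getloadavg()[0]
    if load_1min / CORES > 2:                          # illustrative threshold
        print(f"WARN sustained run-queue pressure: load {load_1min:.1f} on {CORES} cores")

    # Context-switch rate measured over a one-second interval.
    before = psutil.cpu_stats().ctx_switches
    time.sleep(1)
    rate = psutil.cpu_stats().ctx_switches - before
    if rate > 50_000:                                  # illustrative threshold
        print(f"WARN high context-switch rate: {rate}/s")

    # Processes contributing most to CPU utilization.
    procs = list(psutil.process_iter(["name"]))
    for proc in procs:
        try:
            proc.cpu_percent(None)                     # prime per-process counters
        except psutil.Error:
            pass
    time.sleep(1)
    usage = []
    for proc in procs:
        try:
            usage.append((proc.cpu_percent(None), proc.info["name"]))
        except psutil.Error:
            pass                                       # process exited meanwhile
    for pct, name in sorted(usage, key=lambda u: u[0], reverse=True)[:3]:
        print(f"top CPU consumer: {name} {pct:.1f}%")

if __name__ == "__main__":
    check_cpu_symptoms()
```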

Synthetic monitoring (SM) is done by simulating transactions through the tool, without depending on the end-user to perform the transactions. The advantages of synthetic monitoring are:

  • it uses automated transaction simulation technology
  • it helps to monitor the environment round-the-clock 
  • it helps to validate from across different geographic locations 
  • it provides options to choose the number of flows/transactions to be verified
  • it is proactive – identifies performance bottlenecks or failures much in advance even before the actual user or the dependent layer encounters the situation

How does Synthetic Monitoring (SM) work?

It works through 3 simple steps:

1) Record key transactions – Any number of transactions can be recorded; if required, all the functional flows can be recorded. An example of a transaction in an e-commerce website could be as simple as logging in and viewing the product catalogue, or as elaborate as login, view product catalogue, move item to cart, check out, make payment and logout. For simulation purposes, dummy credit cards are used during payment gateway transactions.

2) Schedule the transactions – Decide whether they should run every 5 minutes, or every x hours or minutes.

3) Choose the location from which these transactions need to be triggered – SM is available as an on-premise or cloud option. Cloud SM provides options to choose from SM engines available across the globe.

This is applicable mainly to web-based applications, but can also be used for the underlying APIs.

The SM solution has engines which run the recorded transactions against the target application. Once scheduled, the SM engine, hosted either on-premise or remotely, will run the recorded transactions at a predefined interval. The SM dashboard provides insights, as detailed under the benefits section below.
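
A bare-bones sketch of the same idea follows: replay a recorded flow as a sequence of HTTP calls on a schedule, record per-step latency and capture failures. The URLs, flow and interval below are assumptions; a real SM engine adds browser-level simulation, geographic distribution and dashboards on top of this.

```python
import time
import requests

# Assumed recorded transaction: an ordered list of (step name, URL) pairs.
TRANSACTION = [
    ("login",        "https://example.com/login"),
    ("view catalog", "https://example.com/catalog"),
    ("logout",       "https://example.com/logout"),
]
INTERVAL_SECONDS = 300   # run the flow every 5 minutes

def run_transaction():
    """Execute one synthetic run of the recorded flow and report per-step latency."""
    session = requests.Session()
    for step, url in TRANSACTION:
        start = time.perf_counter()
        try:
            response = session.get(url, timeout=10)
            latency_ms = (time.perf_counter() - start) * 1000
            status = "OK" if response.ok else f"HTTP {response.status_code}"
            print(f"{step}: {status} in {latency_ms:.0f} ms")
        except requests.RequestException as exc:
            # A failure here is caught before any real user hits it.
            print(f"{step}: FAILED ({exc})")
            break

if __name__ == "__main__":
    while True:
        run_transaction()
        time.sleep(INTERVAL_SECONDS)
```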

Benefits of SM

As SM performs the synthetic transactions, it provides various insights, such as:

  • The latency in the transactions, i.e. the speed at which the transaction is happening. This also gives a trend analysis of how the application is performing over a period.
  • If there are any failures during the transaction, SM provides the details of the failure including the stack trace of the exception. This makes fixing the failure simpler, by avoiding the time spent in debugging.
  • In case of failure, SM provides insights into the parameter details that triggered the failure.
  • Unlike real user monitoring, there is the flexibility to test all flows or at least all critical flows without waiting for the user to trigger or experience it.
  • This not only unearths problems at the application tier but also provides deeper insights when combined with the Application, Server, Database and Network Monitoring that are part of the ZIF Monitor suite.
  • Applications working fine under one geography may fail in a different geography due to various factors like network, connectivity, etc. SM will exactly pinpoint the availability and performance across geographies.

For more detailed information on GAVS’ Monitor, or to request a demo, please visit https://zif.ai/products/monitor/

About the Author

Suresh Kumar Ramasamy


Suresh heads the Monitor component of ZIF at GAVS. He has 20 years of experience in Native Applications, Web, Cloud and Hybrid platforms, from Engineering to Product Management. He has designed and hosted monitoring solutions, and has been instrumental in bringing together components to structure the Environment Performance Management suite of ZIF Monitor.

Suresh enjoys playing badminton with his children. He is passionate about gardening, especially medicinal plants.


Monitoring Microservices and Containers

Monitoring applications and infrastructure is a critical part of IT Operations. Among other things, monitoring provides alerts on failures, alerts on deteriorations that could potentially lead to failures, and performance data that can be analysed to gain insights. AI-led IT Ops Platforms like ZIF use such data from their monitoring component to deliver pattern recognition-based predictions and proactive remediation, leading to improved availability, system performance and hence better user experience.

The shift away from monolith applications towards microservices has posed a formidable challenge for monitoring tools. Let’s first take a quick look at what microservices are, to understand better the complications in monitoring them.

Monoliths vs Microservices

A single application (monolith) is split into a number of modular services called microservices, each of which typically caters to one capability of the application. These microservices are loosely coupled, can communicate with each other and can be deployed independently.

Quite likely the trigger for this architecture was the need for agility. Since microservices are stand-alone modules, they can follow their own build/deploy cycles enabling rapid scaling and deployments. They usually have a small codebase which aids easy maintainability and quick recovery from issues. The modularity of these microservices gives complete autonomy over the design, implementation and technology stack used to build them.

Microservices run inside containers that provide their execution environment. Although microservices could also be run in virtual machines (VMs), containers are preferred since they are comparatively lightweight, as they share the host’s operating system, unlike VMs. Docker and CoreOS Rkt are a couple of commonly used container solutions, while Kubernetes, Docker Swarm, and Apache Mesos are popular container orchestration platforms. Picture, for example, microservices for hiring, performance appraisal, rewards & recognition, payroll, analytics and the like linked together to deliver the HR function.

Challenges in Monitoring Microservices and Containers

Since all good things come at a cost, you are probably wondering what it is here… well, the flip side to this evolutionary architecture is increased complexity! These are some contributing factors:

Exponential increase in the number of objects: With each application replaced by multiple microservices, 360-degree visibility and observability into all the services, their interdependencies, their containers/VMs, communication channels, workflows and the like can become very elusive. When one service goes down, the environment gets flooded with notifications not just from the service that is down, but from all services dependent on it as well. Sifting through this cascade of alerts, eliminating noise and zeroing in on the crux of the problem becomes a nightmare.

Shared Responsibility: Since processes are fragmented and the responsibility for their execution, like for instance a customer ordering a product online, is shared amongst the services, basic assumptions of traditional monitoring methods are challenged. The lack of a simple linear path, the need to collate data from different services for each process, inability to map a client request to a single transaction because of the number of services involved make performance tracking that much more difficult.

Design Differences: Due to the design/implementation autonomy that microservices enjoy, they could come with huge design differences and be implemented using different technology stacks. They might use open source or third-party software that makes it difficult to instrument their code, which in turn affects their monitoring.

Elasticity and Transience: Elastic landscapes, where infrastructure scales or collapses based on demand and instances appear and disappear dynamically, have changed the game for monitoring tools. They need to be updated to handle elastic environments, be container-aware and stay in step with the provisioning layer. A couple of interesting aspects to handle: recognizing the difference between an instance that is down and an instance that is no longer available; and the fact that data from instances that are no longer alive continues to have value for analysis of operational efficiency or past performance.

Mobility: This is another dimension of dynamic infra where objects don’t necessarily stay in the same place, they might be moved between data centers or clouds for better load balancing, maintenance needs or outages. The monitoring layer needs to arm itself with new strategies to handle moving targets.

Resource Abstraction: Microservices deployed in containers do not have a direct relationship with their host or the underlying operating system. This abstraction is what helps seamless migration between hosts but comes at the expense of complicating monitoring.

Communication over the network: The many moving parts of distributed applications rely completely on network communication. Consequently, the increase in network traffic puts a heavy strain on network resources necessitating intensive network monitoring and a focused effort to maintain network health.

What needs to be measured

This is a high-level laundry list of what needs to be done/measured while monitoring microservices and their containers.

Auto-discovery of containers and microservices:

As we’ve seen, monitoring microservices in a containerized world is a whole new ball game. In the highly distributed, dynamic infra environment where ephemeral containers scale, shrink and move between nodes on demand, traditional monitoring methods using agents to get information will not work. The monitoring system needs to automatically discover and track the creation/destruction of containers and explore services running in them.
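A minimal sketch of what such auto-discovery could look like, assuming the Python Docker SDK (docker-py) and a local Docker daemon; the event field names come from the Docker Engine event stream and can vary across versions:

```python
# Sketch: track container creation/destruction via the Docker event stream.
import docker

client = docker.from_env()

# Seed the inventory with containers that already exist on this host.
inventory = {c.id: c.name for c in client.containers.list()}

# Stream lifecycle events so the monitor discovers new containers automatically.
for event in client.events(decode=True, filters={"type": "container"}):
    action = event.get("Action")
    cid = event.get("id")
    name = event.get("Actor", {}).get("Attributes", {}).get("name", "unknown")
    if action == "start":
        inventory[cid] = name            # new container discovered
    elif action in ("die", "destroy"):
        inventory.pop(cid, None)         # container gone; keep its history elsewhere if needed
```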

Microservices:

  • Availability and performance of individual services
  • Host and infrastructure metrics
  • Microservice metrics
  • APIs and API transactions
    • Ensure API transactions are available and stable
    • Isolate problematic transactions and endpoints
  • Dependency mapping and correlation
  • Features relating to traditional APM

Containers:

  • Detailed information relating to each container (see the collection sketch after this list)
  • Health of clusters, master and slave nodes
  • Number of clusters
  • Nodes per cluster
  • Containers per cluster
  • Performance of the core Docker engine
  • Performance of container instances
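Per-container metrics like these can be pulled straight from the container engine. A minimal sketch, again assuming the Python Docker SDK; the exact structure of the stats payload depends on the Docker version and platform:

```python
# Sketch: one-shot resource snapshot for every running container on this host.
import docker

client = docker.from_env()

for container in client.containers.list():
    stats = container.stats(stream=False)            # single stats snapshot, not a stream
    mem = stats.get("memory_stats", {})
    cpu = stats.get("cpu_stats", {}).get("cpu_usage", {})
    print(container.name,
          "mem_usage:", mem.get("usage"),
          "mem_limit:", mem.get("limit"),
          "cpu_total:", cpu.get("total_usage"))
```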

Things to consider while adapting to the new IT landscape

Granularity and Aggregation: With the increase in the number of objects in the system, it is important to first understand the performance target of what’s being measured. For instance, if a service targets 99% uptime (yearly), polling it every minute would be overkill. Based on this, data granularity needs to be set prudently for each aspect measured, and data can be aggregated where appropriate. This is to prevent data inundation that could overwhelm the monitoring module and drive up the costs associated with data collection, storage and management.
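To make that trade-off concrete, here is a quick back-of-the-envelope sketch in Python for the 99% yearly uptime example; the polling intervals are illustrative:

```python
# Sketch: how much downtime a 99% yearly target tolerates, and how many samples
# different polling intervals would generate per service per year.
SECONDS_PER_YEAR = 365 * 24 * 3600

uptime_target = 0.99
downtime_budget_hours = SECONDS_PER_YEAR * (1 - uptime_target) / 3600
print(f"Allowed downtime per year: {downtime_budget_hours:.1f} hours")   # ~87.6 hours

for interval_s in (60, 300, 3600):                  # 1-minute, 5-minute, hourly polling
    samples = SECONDS_PER_YEAR // interval_s
    print(f"Polling every {interval_s}s -> {samples:,} samples per service per year")
```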

Monitor Containers: The USP of containers is the abstraction they provide to microservices, encapsulating them and shielding them from the details of the host operating system. While this makes microservices portable, it also makes them harder to reach for monitoring. Two recommended solutions are to instrument the microservice code to generate stats and/or traces for all actions (which can also be used for distributed tracing), and to capture all container activity through host operating system instrumentation.
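A minimal, hypothetical sketch of the first approach, instrumenting the service code itself; in practice a tracing library would do this, and the operation name below is made up for illustration:

```python
# Sketch: emit one structured stats record (duration, outcome, trace id) per call.
import functools
import json
import time
import uuid

def traced(operation):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            trace_id = uuid.uuid4().hex      # real distributed tracing would propagate this id
            start = time.time()
            status = "ok"
            try:
                return func(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                # One JSON line per action; a log shipper or host agent can collect these.
                print(json.dumps({
                    "operation": operation,
                    "trace_id": trace_id,
                    "duration_ms": round((time.time() - start) * 1000, 2),
                    "status": status,
                }))
        return wrapper
    return decorator

@traced("payroll.calculate")                 # hypothetical operation in the HR example
def calculate_payroll(employee_id):
    ...                                      # business logic goes here
```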

Track Services through the Container Orchestration Platform: While we could obtain container-level data from the host kernel, it wouldn’t give us holistic information about the service, since several containers may together constitute one service. Container-native monitoring solutions can use metadata from the container orchestration platform, drilling into the appropriate layers of the platform to obtain service-level metrics.
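For example, assuming Kubernetes as the orchestration platform and its official Python client, a monitor could use platform metadata (labels, namespaces) to roll container data up to the service level; the namespace and label selector below are hypothetical:

```python
# Sketch: list all pods backing a service, regardless of which node they run on.
from kubernetes import client, config

config.load_kube_config()                    # use load_incluster_config() when running in a pod
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod("hr", label_selector="app=payroll")

running = [p for p in pods.items if p.status.phase == "Running"]
print(f"payroll service: {len(running)}/{len(pods.items)} pods running")
for p in running:
    print(p.metadata.name, p.spec.node_name, p.status.pod_ip)
```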

Adapt to dynamic IT landscapes: As mentioned earlier, today’s IT landscape is dynamically provisioned, elastic and characterized by mobile and transient objects. Monitoring systems themselves need to be elastic and deployable across multiple locations to cater to distributed systems and leverage native monitoring solutions for private clouds.

API Monitoring: Monitoring APIs can provide a wealth of information in the black-box world of containers. Tracking API calls from the different entities (microservices, the container solution, the container orchestration platform, the provisioning system, the host kernel) can help extract meaningful information and make sense of this fickle environment.
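A minimal sketch of a black-box API probe that records availability and latency, using the requests library; the endpoint URL is hypothetical:

```python
# Sketch: probe an API endpoint and record status, latency and availability.
import time
import requests

def probe(url, timeout=5):
    start = time.time()
    try:
        response = requests.get(url, timeout=timeout)
        return {"url": url,
                "status": response.status_code,
                "latency_ms": round((time.time() - start) * 1000, 1),
                "available": response.ok}
    except requests.RequestException as exc:
        return {"url": url, "status": None, "latency_ms": None,
                "available": False, "error": type(exc).__name__}

print(probe("https://hr.example.com/api/payroll/health"))    # hypothetical endpoint
```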

Watch this space for more on Monitoring and other IT Ops topics. You can find our blog on Monitoring for Success here, which gives an overview of the Monitor component of GAVS’ AIOps Platform, Zero Incident Framework™ (ZIF). You can Request a Demo or Watch how ZIF works here.

About the Author:

Sivaprakash Krishnan


Bio – Siva is a long-timer at GAVS and has been with the company for close to 15 years. He started his career as a developer and is now an architect with a strong technology background in Java, Big Data, DevOps, Cloud Computing, Containers and Microservices. He has designed and created a stable Monitoring Platform for ZIF, and has designed and driven cloud assessment and migration, enterprise BRMS and IoT-based solutions for many of our customers. He is currently focused on building ZIF 4.0, a new-gen business-oriented TechOps platform.

Padmapriya Sridhar


Bio – Priya is part of the Marketing team at GAVS. She is passionate about Technology, Indian Classical Arts, Travel and Yoga. She aspires to become a Yoga Instructor some day!

The Chatty Bots!

Chatbots can be loosely defined as software that simulates human conversation. They are widely used as textbots or voicebots on social media, on websites to provide the initial engagement with visitors, as part of customer service/IT operations teams to provide tier-1 support round the clock, and, as we’ll see later in the blog, in integration with enterprise tools/systems for various other organizational needs. Their prevalence can be attributed to how easy it has become to get a basic chatbot up and running quickly, using the intuitive drag-and-drop interfaces of chatbot build tools. There are also many cloud-based free or low-cost AI platforms for building bots using the provided APIs. Most of these platforms also come with industry-specific content, add-on tools for analytics and more.

Rule-based chatbots can hold basic conversations with scripted ‘if/then’ responses for commonly raised issues/FAQs, and redirect appropriately for queries beyond their scope. They use keyword matches to pull relevant information from their datastore. Culturally, as we begin to accept and trust bots to solve problems and extend support, with companies beginning to see value in these digital resources, and with heavy investments in AI technologies, chatbots are gaining traction and becoming more sophisticated. AI-led chatbots are far more complex than their rule-based counterparts and provide dynamically tailored, contextual responses based on the conversation and interaction history. Natural Language Processing capabilities give these chatbots the human-like skill to comprehend nuances of language and gauge the intent behind what is explicitly stated.
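To illustrate the rule-based approach, here is a minimal, hypothetical keyword-matching bot; the rules and responses are made up for illustration:

```python
# Sketch: scripted if/then responses driven by keyword matching, with a fallback
# redirect for anything beyond the bot's scope.
import re

RULES = [
    ({"password", "reset"}, "You can reset your password at the self-service portal."),
    ({"vpn", "connect"},    "Please follow the VPN setup guide, or reply 'agent' for help."),
    ({"leave", "balance"},  "Your leave balance is available under HR > My Leave."),
]

FALLBACK = "Sorry, I can't help with that yet. Connecting you to a human agent..."

def reply(message: str) -> str:
    words = set(re.findall(r"[a-z]+", message.lower()))   # crude tokenization
    for keywords, response in RULES:
        if keywords <= words:                             # all keywords present in the message
            return response
    return FALLBACK

print(reply("How do I reset my password?"))   # matches the first rule
print(reply("What's the cafeteria menu?"))    # out of scope, falls back to a human agent
```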

The Artificial Neural Network (ANN) for Natural Language Processing (NLP)

An ANN is an attempt at a tech equivalent of the human brain! You can find our blog on ANNs and Deep Learning here.

Traditional AI models are incapable of handling highly cognitive tasks like image recognition, image classification, natural language processing, speech recognition, text-speech conversion, tone analysis and the like. There has been a lot of success with Deep Learning approaches for such cerebral use cases. For NLP, handling the inherent complexities of language such as sentiment, ambiguity or insinuation, necessitates deeper networks and a lot of training with enormous amounts of data. Each computational layer of the network progressively extracts finer and more abstract details from the inputs, essentially adding value to the learnings from the previous layers. With each training iteration, the network adapts, auto-corrects and finetunes its weights using optimization algorithms, until it reaches a maturity level where it is almost always correct in spite of input vagaries. The USP of a deep network is that, armed with this knowledge gained from training, it is able to extract correlations & meaning from even unlabeled and unstructured data.

Different types of neural networks are particularly suited for different use cases. Recurrent Neural Networks (RNNs) are good for sequential data like text documents, audio and natural language. RNNs have a feedback mechanism where each neuron’s output is fed back as weighted input, along with the other inputs. This gives them ‘memory’, meaning they remember their earlier inputs, but with time those inputs get diluted by the presence of new data. A variant of the RNN helps solve this problem. Long Short-Term Memory (LSTM) models have neurons (nodes) with gated cells that can regulate whether to ‘remember’ or ‘forget’ their previous inputs, giving more control over what needs to be remembered for a long time versus what can be forgotten. For example, it helps to ‘remember’ while parsing through a text document because the words and sentences are most likely related, but ‘forgetting’ is better when moving from one text document to the next, since they are most likely unrelated.
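To make this concrete, here is a minimal sketch of an LSTM-based text classifier (for example, intent detection in a chatbot), assuming TensorFlow/Keras; the vocabulary size, sequence length and number of intents are illustrative:

```python
# Sketch: embed token ids, run them through an LSTM, and score a fixed set of intents.
import tensorflow as tf

VOCAB_SIZE, SEQ_LEN, NUM_INTENTS = 10_000, 50, 12

model = tf.keras.Sequential([
    tf.keras.Input(shape=(SEQ_LEN,)),                         # padded sequences of token ids
    tf.keras.layers.Embedding(VOCAB_SIZE, 64),                # token id -> dense word vector
    tf.keras.layers.LSTM(128),                                # gated memory over the word sequence
    tf.keras.layers.Dense(NUM_INTENTS, activation="softmax"), # one probability per intent
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(padded_token_ids, intent_labels, epochs=10)       # with a labelled dataset
```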

The Chatbot Evolution

In the 2019 Gartner CIO Survey, CIOs identified chatbots as the main AI-based application used in their enterprises. “There has been a more than 160% increase in client interest around implementing chatbots and associated technologies in 2018 from previous years”, says Van Baker, VP Analyst at Gartner.

Personal and business communication keeps morphing into the quickest, easiest and most convenient mode of the time. From handwritten letters to emails to phone calls to SMSs to mere status updates on social media, that is how we now choose to interact. Mr. Baker goes on to say that with the increase of millennials in the workplace, and their demand for instant, digital connections, they will have a large impact on how quickly organizations adopt the technology.

Due to these evolutionary trends, more organizations than we think have taken a leap of faith and added these bots to their workforce. It is actually quite interesting to see how chatbots are being put to innovative use, either stand-alone or integrated with other enterprise systems.

Chatbots in the Enterprise

Customer service & IT service management (ITSM) are the use cases through which chatbots gained entry into the enterprise. Proactive, personalized user engagement, consistency and ease of interaction, round-the-clock availability and timely address of issues have lent themselves to operational efficiency, cost effectiveness and enhanced user experience. Chatbots integrated into ITSM help streamline service, automate workflow management, reduce MTTR and provide always-on services. They also make it easier to scale during peak usage times, since they reduce the need for customers to speak with human staff and the need to augment human resources to handle the extra load. ChatOps is the use of chatbots within a group collaboration tool, where they sit between the tool and the user’s applications and automate tasks like providing relevant data/reports, scheduling meetings and sending emails. They ease collaboration between siloed teams and processes, as in a DevOps environment, where they double up as the monitoring and diagnostic tool for the IT landscape.

In E-commerce, chatbots can boost sales by taking the customer through a linear shopping experience from item search through purchase. The bot can make purchase suggestions based on customer preferences gleaned from product search patterns and order history.

In Healthcare, chatbots seamlessly connect healthcare providers, consumers and information, easing access to each of them. These bot assistants come in different forms catering to specific needs: a personal health coach, a companion bot providing the much-needed conversational support for patients with Alzheimer’s, a confidant and therapist for those suffering from depression, a symptom-checker that provides an initial diagnosis based on symptoms and enables remote text or video consultation with a doctor as required, and so on.

Analytics provide insights, but often not fast enough for the CXO. Decision-making becomes quicker when executives can query a chatbot to get answers, rather than drilling through a dashboard. Imagine getting immediate responses to requests like: “Which region in the US has had the most sales during Thanksgiving? Send out a congratulatory note to the leadership in that region. Which region has had the poorest sales? Schedule a meeting with the team there. Email me other related reports of this region.” As can be seen here, chatbots work in tandem with other enterprise tools like analytics tools, calendar and email to make such fascinating forays possible.

Chatbots can handle the mundane tasks of Employee Onboarding, such as verification of mandatory documents, getting required forms filled, directing them to online new-hire training and ensuring completion.

When integrated with IoT devices, they can help in Inventory Management by sending out notifications when it’s time to restock a product, tracking shipment of new orders and alerting on arrival.

Chatbots can offer Financial Advice by recommending investment options based on transactional history, current investments or amounts idling in savings accounts, alerting the customer to market impact on their current portfolio, and so much more.

As is evident now, the possibilities of such domain-specific chatbots are endless, and what we have seen is just a sampling of their use cases!

Choosing the Right Solution

The chatbot vendor market is crowded, making it hard for buyers to fathom where to even begin. The first step is an in-depth evaluation of the company’s unique needs, constraints, main use cases and enterprise readiness. The next big step is to decide between off-the-shelf and in-house solutions. An in-house build will be an exact fit for the company’s needs, but it might be difficult to get long-term management buy-in to invest in the related AI technologies, compute power, storage, ongoing maintenance and a capable data science team. Off-the-shelf solutions need a lot of scrutiny to gauge whether the providers are specialists who can deliver enterprise-grade chatbots. Some important considerations:

The solution should:

  • Be Platform & Device Agnostic, so it can be built once and deployed anywhere
  • Have good Integration Capabilities with the tools, applications and systems in the enterprise
  • Be Robust, with solid security and compliance features
  • Be Versatile enough to handle varied use cases
  • Be Adaptable to support future scaling
  • Be Extensible, to enable additional capabilities as the solution matures, and to leverage innovation for advanced features such as multi-language support, face recognition, and integration with VR, Blockchains and IoT devices
  • Have a Personality! Bots with a personality add a human touch that can be quite a differentiator. Incorporating soft features such as a natural conversational style, tone, emotion and a dash of humor can give an edge over the competition.

About the Author:

Priya is part of the Marketing team at GAVS. She is passionate about Technology, Indian Classical Arts, Travel and Yoga. She aspires to become a Yoga Instructor some day!