iSolutions has always paid special attention to monitoring its applications and its customers’ architectures.
For this reason, the company decided to invest in the development of a proprietary application, iSAlert, and the introduction of the NOC team, which we will discuss below, in order to monitor and provide an extremely proactive service to its customers.
iSAlert provides NOC teams with sophisticated and highly customizable alerts based on data sampled from iSBets platform monitoring services, or directly by querying real-time data.
Alert generation is based on three elements: a query command, a control value, and an operator. Together, these elements are called a “Rule” within the system.
A dedicated service executes the Rules on iSolutions’ client environments every minute, checking whether the value returned by the query command satisfies the control value under the configured operator.
If the condition is not met, the system generates an alert, which is then displayed on a dedicated website.
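The Rule logic described above can be illustrated with a minimal sketch. It is not iSAlert's actual implementation: the `Rule` and `Alert` classes, the `evaluate` function, and the operator set are all hypothetical names chosen for illustration, and the query command is stood in for by a plain Python callable. The sketch assumes that an alert fires when the comparison between the returned value and the control value fails.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Optional

# Comparison operators a rule may use (illustrative set, not iSAlert's actual list).
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    ">": operator.gt,
}


@dataclass(frozen=True)
class Rule:
    name: str
    query: Callable[[], float]  # stands in for the query command
    op: str                     # key into OPERATORS
    control_value: float


@dataclass(frozen=True)
class Alert:
    rule_name: str
    observed: float
    message: str


def evaluate(rule: Rule) -> Optional[Alert]:
    """Run the rule's query once; return an Alert if the condition fails."""
    observed = rule.query()
    if OPERATORS[rule.op](observed, rule.control_value):
        return None  # condition met, nothing to report
    return Alert(
        rule_name=rule.name,
        observed=observed,
        message=(
            f"{rule.name}: expected value {rule.op} "
            f"{rule.control_value}, got {observed}"
        ),
    )


# Hypothetical rule: CPU usage must stay at or below 90 percent.
cpu_rule = Rule(name="cpu_usage", query=lambda: 95.0, op="<=", control_value=90.0)
alert = evaluate(cpu_rule)
print(alert.message if alert else "no alert")
```

In a real deployment, a scheduler would call something like `evaluate` for every configured Rule once per minute and persist the resulting alerts; that scheduling and storage layer is left out here.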
iSAlert Web Site
The iSAlert site then allows for the management of the rules and, of course, also the alerts generated, which look like the following:
Each alert, generated according to the configurations of the various rules, presents various information:
The iSAlert Web site has various functions and pages dedicated to specific purposes.
Starting with the index page, where we find the list of alerts on various clients, with related action buttons to operate on them:
Opening the detail of an alert gives access to a more comprehensive management page, which supports various operations: changing the severity, assigning the alert, adding notes that extend the alert's information after preliminary checks, commenting, and viewing a list of similar past alerts.
The “TV view” page allows the application to be opened on a monitor, such as a dedicated TV in the team office. It presents simplified alerts without action buttons, showing only critical and error severity alerts with all relevant information.
The “Dashboard” page lets the NOC team leader review the work of the technicians by analyzing the alerts handled and the operations performed.
Finally, the “History” page allows users to consult and export all alerts, including past ones, in order to identify patterns and abnormal behaviors on the various monitored domains, or even errors in the Rules configurations themselves.
After we have introduced the logic service and the website, let us then look at the overall architecture:
iSAlert can connect to an unlimited number of environments; the connection strings for each environment are configured in a dedicated database.
iSAlert notifications system
An alerting system must be able to send alerts that are reliable and, above all, immediate. This is why it was necessary to develop a notification service and integrate it with the most widely used communication channels.
iSAlert currently supports asynchronous but very fast notification of alerts via three channels: Telegram, Slack and Email.
Notifications are managed by a centralized service that forwards them to dedicated bots (for Telegram) or apps (for Slack) developed by iSolutions. The service retrieves the notifications from a queue located in each client's environment, where a dedicated generation service inserts them.
As with the alert generation rules, the notification generation rules are highly customizable: it is possible to specify precisely which types of alerts to notify, under which conditions, and through which communication channel.
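As a rough illustration of how such notification rules could route alerts into a queue, consider the following sketch. Everything in it is an assumption made for the example: the `NotificationRule` class, its fields, the `enqueue` function, the client names, and the "warning" severity are hypothetical, and an in-memory `deque` stands in for the queue hosted in each client's environment.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Alert:
    client: str
    severity: str
    message: str


@dataclass(frozen=True)
class NotificationRule:
    severities: FrozenSet[str]    # which alert severities to notify
    channel: str                  # "telegram", "slack" or "email"
    client: Optional[str] = None  # None means the rule applies to any client

    def matches(self, alert: Alert) -> bool:
        return (
            alert.severity in self.severities
            and self.client in (None, alert.client)
        )


def enqueue(
    alert: Alert,
    rules: Iterable[NotificationRule],
    queue: Deque[Tuple[str, Alert]],
) -> int:
    """Append one (channel, alert) entry per matching rule; return the count."""
    count = 0
    for rule in rules:
        if rule.matches(alert):
            queue.append((rule.channel, alert))
            count += 1
    return count


# Hypothetical configuration covering the three channels.
rules = [
    NotificationRule(frozenset({"critical"}), "telegram"),
    NotificationRule(frozenset({"critical", "error"}), "slack", client="client-a"),
    NotificationRule(frozenset({"warning"}), "email"),
]

queue: Deque[Tuple[str, Alert]] = deque()
enqueue(Alert("client-a", "critical", "disk almost full"), rules, queue)
enqueue(Alert("client-b", "error", "import failed"), rules, queue)
print([channel for channel, _ in queue])
```

Separating rule matching from delivery mirrors the split described above: the generation side only decides what goes into the queue, while a separate sender would drain it and talk to each channel.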
Below is the complete architecture:
This tool developed by iSolutions is closely tied to the recent introduction of the NOC team into the corporate organization.
The NOC (“Network Operations Center”) team was built to provide iSolutions’ customers with constant monitoring of both the applications developed and the network infrastructure on which they operate.
The growth of iSolutions’ customer base highlighted the need for a tool that could give the NOC team immediate visibility into business-critical events or anomalies in the system, while being flexible and customizable enough to fit the operational context of the applications developed by iSolutions.
For example, iSAlert made it possible to create alerts on performance indicators such as CPU and RAM usage, disk space consumption, and resource usage by SQL, as well as highly customized rules focused on the iSBets product: rules for verifying the correctness of data imported from external providers, and rules for checking the amounts calculated by payment systems and data computations.
NOC Team & Runbook
A requirement that has proven key to the successful introduction and operation of the NOC team is the use of “Runbooks”, which can be described as “alert resolution guides”.
Runbooks are written by the development teams and the IT-ops team during rule creation, and give NOC team members the steps needed to handle alerts and resolve them on their own.
Each active alert displays its runbook, so that NOC team members have immediate, easily searchable instructions for resolving it.
The introduction of the NOC team has brought several benefits, including:
- A clear definition of the role and duties, which unlike classic Technical Support has direct responsibility for managing alerts and monitoring customer environments
- Increased proactivity and speed in handling business critical events or anomalies in the system
- Fewer support requests from customers, thanks to the NOC team’s ability to intercept and resolve issues in advance
The experimental introduction of the NOC team within the iSolutions organization has brought the clear benefits described above. The team will now have to evolve: it will grow by adding new members, a need driven by the growing number of the company’s customers, and it will broaden its skills and its areas of monitoring and intervention.
Another major challenge will be to fully cover all areas of the system, including the AWS cloud services recently added to iSolutions’ service package.