Four Things To Automate On Your IBM i System
How do you know when something’s gone wrong on your IBM i system? IBM i system monitoring traditionally consists of operators or administrators manually watching system operator messages, observing lights on the system panel, watching the Work with Active Jobs (WRKACTJOB) screen, checking hardware status, and performing other time-consuming manual tasks. Several third-party vendors offer IBM i message and resource monitoring packages, such as SEA Software’s absMessage, that automate the system monitoring process. Automated IBM i monitoring eliminates the need for administrators to manually review system health and performance. The software watches the system for you, attempts to solve problems automatically (when appropriate), alerts on-call responders to fix an issue, and can record system activity in a Security Information and Event Management (SIEM) solution.
System monitoring packages improve productivity and ensure all issues are handled in a timely manner. With automated monitoring, events no longer get lost in the noise and issues can be addressed immediately.
What are your priorities?
By responding to events in a timely manner, IBM i automated monitoring reduces system risk, proactively addresses system issues, and increases system efficiency. At the very least, IBM i administrators should use these packages to monitor four critical system areas:
- System availability
- Job performance
- Security
- System messages
Here’s an overview of what you should monitor for in each area.
Area #1 – System Availability Monitoring
Your top priority is always to ensure your system is up and running and that you have network connectivity. Nothing else matters if the system is down or unreachable.
System availability monitoring ensures that your system hardware is healthy, that you aren’t running out of disk space, and that your backups are executing properly.
Key system availability items you’ll want to monitor with your IBM i message and resource monitoring software include:
- Hardware errors – Monitoring for hardware failure messages, including disk drive failures
- Disk drive capacity issues – Alerting responders when disk utilization approaches or exceeds its system utilization threshold, usually in the 85-90% utilization range
- Connectivity issues – Pinging other critical systems to ensure your IBM i partitions have network connectivity
- Power issues – Monitoring the state of your power supplies to ensure there isn’t a power supply failure and that the system isn’t running on auxiliary power
- Backup issues – Monitoring a tape drive or media backup solution to ensure that there are no backup failures or uninitialized media and that the backup has completed successfully
Implementing an IBM i monitoring solution can help ensure that your IBM i is healthy and operating as expected, taking a lot of work off your administrator’s shoulders. Not only does system availability monitoring improve efficiencies, it also reduces the risk of an issue going unnoticed and escalating into crisis mode over time.
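As a rough illustration of the disk capacity check described above, here is a minimal Python sketch. It is not how absMessage or any particular monitoring package works; the `DiskReading` type, the function names, and the default thresholds (85% for a warning, 90% for critical, taken from the range mentioned above) are assumptions made for the example. How the utilization figure is collected from the system is left out.

```python
from dataclasses import dataclass


@dataclass
class DiskReading:
    unit: str            # hypothetical disk unit or storage pool name
    used_percent: float  # utilization as a percentage, 0-100


def classify_disk(reading: DiskReading,
                  warn_at: float = 85.0,
                  critical_at: float = 90.0) -> str:
    """Return 'ok', 'warning', or 'critical' for one utilization reading."""
    if reading.used_percent >= critical_at:
        return "critical"
    if reading.used_percent >= warn_at:
        return "warning"
    return "ok"


def disks_needing_alert(readings):
    """Return (unit, level) pairs for every reading at or above the warning threshold."""
    alerts = []
    for reading in readings:
        level = classify_disk(reading)
        if level != "ok":
            alerts.append((reading.unit, level))
    return alerts
```

Using two thresholds rather than one lets a monitor notify responders while there is still time to free up space, and escalate only when the situation becomes urgent.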
Area #2 – Jobs Monitoring
Job monitoring is the next most critical IBM i monitoring item. When jobs get stuck in a message wait state (MSGW), are held, or are looping, they can hold up business processing, corrupt your database, and cost the organization money. Job monitoring is critical to keeping everything running smoothly.
Some of the most important job statuses that you should watch for in your IBM i monitoring software include:
- Jobs in message waiting status
- Jobs that didn’t finish by their expected completion time
- Jobs that didn’t start on time
- Jobs running longer than their usual run time, which may be looping
- Jobs that are in lock wait status
- Server jobs that aren’t running
IBM i monitoring solutions can help ensure that your batch processes are on track. If something does go wrong with a job, these solutions can alert on-call responders or initiate contingency actions that will correct for specific problems found.
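The job checks listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `JobSnapshot` type, its fields, and the `runtime_factor` rule (flag a job once it has run twice its usual time) are hypothetical, and the status codes are the ones shown on the WRKACTJOB screen (MSGW, LCKW, HLD). A real monitoring package would gather this information from the system itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class JobSnapshot:
    name: str
    status: str                  # e.g. "RUN", "MSGW", "LCKW", "HLD"
    started: Optional[datetime]  # None if the job has not started yet
    expected_start: datetime
    usual_runtime: timedelta


def job_problems(job: JobSnapshot, now: datetime,
                 runtime_factor: float = 2.0) -> List[str]:
    """Return a list of human-readable problems found for one job."""
    problems = []
    if job.status == "MSGW":
        problems.append("waiting on a message")
    if job.status == "LCKW":
        problems.append("waiting on a lock")
    if job.status == "HLD":
        problems.append("held")
    if job.started is None:
        # The job has not started; it is late once the expected start has passed.
        if now > job.expected_start:
            problems.append("did not start on time")
    elif now - job.started > job.usual_runtime * runtime_factor:
        problems.append("running longer than usual, possibly looping")
    return problems
```

An empty list means the job looks healthy; anything else can be handed to whatever alerting or contingency step the site prefers.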
Area #3 – Security Monitoring
IBM i is one of the most securable platforms available, but you still need to monitor system security. It’s not enough to set up your security values, lock down your users, and walk away; you need to monitor the security of your system 24x7x365 to prevent breaches. If an unauthorized user gains access to your system, you want to ensure that you know when it is happening, so you can take immediate action and mitigate your risks.
It’s also important to have an audit trail of what is happening on your system when it comes to users with elevated privileges. Users with elevated privileges could potentially access your files via ODBC and you might never know it. Monitoring what users are doing when they have elevated privileges is another way to reduce risk. In addition to sending out alerts when a potential security breach is detected, many system monitoring packages can also send SYSLOG information to Security Information and Event Management (SIEM) solutions for centralized reporting and troubleshooting.
Preventing users from accessing sensitive data outside of your core application is key to reducing risk. Monitoring user activity will help protect your business from being on tomorrow’s newscast.
Some security issues you should track with an IBM i system monitoring package include:
- When a SECOFR user changes their password
- When SECOFR users sign on remotely and use ODBC, FTP, OLE DB, or SQL to transfer data
- When SECOFR users sign on after hours
- When a user profile or device is disabled
- Multiple failed log-in attempts for users
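To make the last item concrete, here is a minimal Python sketch of one way to detect multiple failed log-in attempts within a short period. The `FailedSignOnTracker` class, the limit of three failures, and the ten-minute window are all assumptions chosen for illustration, not settings from any specific product. The sketch also assumes that failure events arrive in time order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class FailedSignOnTracker:
    """Counts recent failed sign-ons per user inside a sliding time window."""

    def __init__(self, max_failures: int = 3,
                 window: timedelta = timedelta(minutes=10)):
        self.max_failures = max_failures
        self.window = window
        self._failures = defaultdict(deque)  # user -> timestamps of recent failures

    def record_failure(self, user: str, when: datetime) -> bool:
        """Record one failed sign-on; return True if the user has reached the limit."""
        recent = self._failures[user]
        recent.append(when)
        # Drop failures that are older than the window.
        while recent and when - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.max_failures
```

A return value of `True` is the point at which a monitor would raise an alert or forward the event to a SIEM.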
Area #4 – Automated Message Handling
It’s critically important to automate message handling, not only to alert on-call responders when there’s a problem but also to initiate action to resolve common inquiry messages.
The most common automated message response is to answer an “unable to access an object or record” inquiry message with an ‘R’ (retry) response before alerting an on-call responder. Many object and record lock errors are temporary and can be easily resolved with a simple retry, without alerting the on-call staff.
System monitoring packages can also be configured to perform remedial actions when an error is detected, run complex scripts that reply to specific messages, execute programs and commands, or issue additional actions. Scripted responses can, again, automatically resolve issues without involving on-call staff.
Automatic responses increase efficiencies and reduce the impact to users when an error occurs. There’s no reason for your administrator or operator to have to respond manually to routine or common messages.
If an automated response doesn’t fix an issue, the system monitoring package can escalate an alert to an on-call responder to fix the issue. The key is to allow the solution to automatically handle the mundane tasks without intervention and escalate things that require more attention to your on-call responders.
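The retry-then-escalate pattern described in this section can be sketched as follows. This is a simplified Python illustration, not the logic of any particular monitoring package: the `RetryPolicy` class, the `"ESCALATE"` result, and the default of three retries are assumptions. Sending the actual reply to the system and notifying the responder are outside the scope of the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RetryPolicy:
    max_retries: int = 3
    # Hypothetical message key (for example, job name plus message ID) -> retries sent
    attempts: Dict[str, int] = field(default_factory=dict)

    def next_action(self, message_key: str) -> str:
        """Return 'R' to reply with a retry, or 'ESCALATE' once retries are used up."""
        used = self.attempts.get(message_key, 0)
        if used < self.max_retries:
            self.attempts[message_key] = used + 1
            return "R"
        return "ESCALATE"
```

Tracking retries per message key keeps one stubborn lock from using up the retry budget of unrelated jobs.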
Other ways to automate system monitoring
IBM i message and resource monitoring packages such as SEA Software’s absMessage can increase system efficiency, reduce risk, and help you become more proactive. But by itself, IBM i message and resource monitoring is only one part of a total system monitoring solution. IBM i job scheduling software such as absScheduler can keep tabs on your automated job streams and adjust complicated job streams in response to changing variables or a job’s completion status. And IBM i firewall products such as iSecurity firewall can protect exit points and servers and precisely control what users can access. Be sure to check out all three types of system monitoring software for a complete solution to your monitoring needs. Please feel free to contact us at SEA Software for more information on how to solve your own IBM i monitoring needs.