As your organization grows, so does the number of servers, devices, and services that you depend on. The term “system” covers all of the computing resources of your organization. Each element in the system infrastructure relies on underlying services or provides services to components that are closer to the user.
In networking, it is typical to think of a system as a layered stack. User software sits at the top of the stack and system applications and services on the next layer down. Beneath the services and applications, you will encounter operating systems and firmware. The performance of software elements needs to be monitored as an application stack.
Here is our list of the best system monitoring software & tools:
- SolarWinds Server Health Monitor EDITOR’S CHOICE – Free tool that runs on Windows Server. This will monitor the availability, health, and performance of up to 5 servers.
- Datadog Infrastructure Monitoring (FREE TRIAL) – This system monitoring tool covers all IT hardware in a typical business, including network devices and servers.
- Paessler PRTG Network Monitor (FREE TRIAL) – Covers networks, bandwidth usage, servers, and applications. This tool is free for up to 100 sensors.
- SolarWinds Server & Application Monitor (FREE TRIAL) – A comprehensive monitoring system that includes a drill-down view of all of the resources and services that support each application.
- ManageEngine OpManager (FREE TRIAL) – Comprehensive network device performance monitor that can be combined with other resource management tools from ManageEngine.
- SolarWinds RMM (FREE TRIAL) – A remote monitoring and management tool that enables central IT departments to manage IT resources on several remote sites.
- Site24x7 Server Monitoring – Cloud-based monitoring service that tracks the performance of networks, servers, applications, and websites.
- Atera – An IT infrastructure monitoring and management system that is delivered from the cloud and intended for use by MSPs.
- Nagios XI and Nagios Core – Nagios XI is a paid monitor and Nagios Core is free. Both can be extended by thousands of add-ons that are available from the user community forum.
- Progress WhatsUp Gold – An extendible network monitoring tool with SNMP procedures underlying device health checks.
Users will notice performance problems with the software that they use, but those problems rarely arise within that software. All layers of the application stack need to be examined to find the root cause of performance issues. You need to head off problems with real-time monitoring before they occur. Monitoring tools help you spot errors and service failures before they start to impact users.
The system stack continues below the software. Hardware issues can be prevented through monitoring. You will need to monitor servers, network devices, interface performance, and network link capacity. You need to monitor many different types of interacting system elements in order to keep your IT services running smoothly. Here we’ll look at ten sophisticated system monitoring packages for Windows and Linux.
- 1 Why do system performance monitoring?
- 2 System monitoring software essentials
- 3 Some basic system monitoring software tools
- 4 Minimum system monitoring software capabilities
- 5 The best system monitoring software & tools
- 5.1 1. SolarWinds Server Health Monitor (FREE TOOL)
- 5.2 2. Datadog Infrastructure Monitoring (FREE TRIAL)
- 5.3 3. Paessler PRTG Network Monitor (FREE TRIAL)
- 5.4 4. SolarWinds Server & Application Monitor (FREE TRIAL)
- 5.5 5. ManageEngine OpManager (FREE TRIAL)
- 5.6 6. SolarWinds RMM (FREE TRIAL)
- 5.7 7. Site24x7 Server Monitoring
- 5.8 8. Atera
- 5.9 9. Nagios XI and Nagios Core
- 5.10 10. Progress WhatsUp Gold
- 6 Choosing a System Monitoring Tool
Why do system performance monitoring?
Knowing whether a computer has issues is fairly straightforward when the computer is right in front of you. (Knowing what’s causing the problem? That’s harder.)
But a computer sitting by itself is not as useful as it could be. Even the smallest small-office/home-office network has multiple nodes: laptops, desktops, tablets, WiFi access points, internet gateway, smartphones, file servers and/or media servers, printers, and so on. That means you are in charge of “infrastructure” rather than just “equipment.” Any component might start behaving badly and could cause issues for the others.
You most likely rely on off-premises servers and services, too. Even a personal website raises the nagging question, “Is my site still up?” And when your ISP has problems, your local network’s usefulness suffers. You need an activity monitor. Organizations rely more and more on servers and services hosted in the cloud: SaaS applications (email, office apps, business packages, etc); file storage; cloud hosting for your own databases and apps; and so on. This requires a sophisticated monitoring solution that can handle hybrid environments.
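The “Is my site still up?” question can be approximated with a plain TCP reachability check. The following is a minimal sketch in Python using only the standard library; the function name `is_port_open` is an illustrative choice, and a successful connection only shows that something is listening on the port, not that the application behind it is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `is_port_open("example.com", 443)` (with a host you operate in place of the placeholder) from a scheduled job is about the simplest possible availability probe; the packages reviewed below layer alerting, history, and reporting on top of checks like this.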
Bandwidth monitoring tools and NetFlow and sFlow based traffic analyzers help you stay aware of the activity, capacity, and health of your network. They allow you to watch traffic as it flows through routers and switches, or arrives at and leaves hosts.
But what of the hosts on your network, their hardware, and the services and applications running there? Monitoring the activity, capacity, and health of hosts and applications is the focus of system monitoring.
System monitoring software essentials
In order to keep your system fit for purpose, your monitoring activities need to cover the following priorities:
- Acceptable delivery speeds
- Constant availability
- Preventative maintenance
- Software version monitoring and patching
- Intrusion detection
- Data integrity
- Security monitoring
- Attack mitigation
- Virus prevention and detection
Lack of funding may cause you to compromise on monitoring completeness. However, the expense of monitoring can be justified because it:
- reduces user/customer support costs
- prevents loss of income caused by system outages or attack vulnerability
- prevents data leakage leading to litigation
- prevents hardware damage and loss of business-critical data
Expense on system monitoring reduces costs in other areas of the IT budget.
Some basic system monitoring software tools
Anyone who’s curious about their workstation or laptop’s performance has likely encountered Windows Task Manager or Linux’s ps and top. (The more experienced know of Sysinternals on Windows and htop, atop, pgrep, and pstree on Linux.)
Task Manager is a good example of the basic activity monitoring information you can learn about a host, starting with what processes are running and which currently consume the most resources.
Climb up a level and it will show you current and recent utilization for key resources like CPU, memory, disk, and network connections. Other tabs will show you more details on running processes, operating system services, and other key data.
Unix and Linux have analogous tools, like top.
Task Manager and top provide a continuously updating display of utilization. These simple real-time monitoring utilities are good for basic ad hoc monitoring of a single machine, to see what’s running and what’s consuming the system’s resources.
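To make concrete what such utilities read, here is a rough Python sketch that takes a one-off utilization snapshot of the local machine with the standard library only. The function name `host_snapshot` and the returned keys are illustrative assumptions; real tools such as top report far more detail.

```python
import os
import shutil


def host_snapshot(path: str = os.sep) -> dict:
    """Collect a one-off utilization snapshot for the local machine."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    snapshot = {
        "cpu_count": os.cpu_count(),
        "disk_used_percent": round(100 * usage.used / usage.total, 1),
    }
    # os.getloadavg() is only available on Unix-like systems, so guard it.
    if hasattr(os, "getloadavg"):
        try:
            snapshot["load_avg_1m"] = os.getloadavg()[0]
        except OSError:
            pass  # load average could not be obtained on this system
    return snapshot
```

Running this in a loop with a short sleep gives a crude version of the continuously updating display, but only for the one machine it runs on.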
Minimum system monitoring software capabilities
A more sophisticated system monitoring package provides a much broader range of capabilities, such as:
- Monitoring multiple servers. Handling servers from various vendors running various operating systems. Monitoring servers at multiple sites and in cloud environments.
- Monitoring a range of server metrics: availability, CPU usage, memory usage, disk space, response time, and upload/download rates. Monitoring CPU temperature and power supply voltages.
- Monitoring applications. Using deep knowledge of common applications and services to monitor key server processes, including web servers, database servers, and application stacks.
- Automatically alerting you of problems, such as servers or network devices that are overloaded or down, or worrisome trends. Customized alerts that can use multiple methods to contact you – email, SMS text messages, pager, etc.
- Triggering actions in response to alerts, to handle certain classes of problems automatically.
- Collecting historical data about server and device health and behavior.
- Displaying data. Crunching the data and analyzing trends to display illuminating visualizations of the data.
- Reports. Besides displays, generating useful predefined reports that help with tasks like forecasting capacity, optimizing resource usage, and predicting needs for maintenance and upgrades.
- Customizable reporting. A facility to help you create custom reports.
- Easy configurability, using methods like auto-discovery and knowledge of server and application types.
- Unintrusiveness: imposing a low overhead on your production machines and services. Making smart use of agents to offload monitoring where appropriate.
- Scalability: Able to grow with your business, from a small or medium business (SMB) to a large enterprise.
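The alerting and triggered-action items in the list above can be sketched in a few lines of Python. This is an illustration of the general idea under stated assumptions, not how any of the reviewed products implement it; the `Rule` class, the `evaluate` function, and the metric names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    metric: str        # key in the metrics dict, e.g. "cpu_percent"
    threshold: float   # alert when the measured value exceeds this
    action: Optional[Callable[[str], None]] = None  # optional automated response


def evaluate(host: str, metrics: dict, rules: list) -> list:
    """Return an alert message for every rule whose threshold is exceeded."""
    alerts = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None or value <= rule.threshold:
            continue
        alerts.append(f"{host}: {rule.metric}={value} exceeds {rule.threshold}")
        if rule.action is not None:
            rule.action(host)  # e.g. restart a service or open a ticket
    return alerts
```

In a real package the returned messages would be routed to email or SMS, and the readings would be stored so that trends can be reported later.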
The best system monitoring software & tools
Apart from having the minimum system monitoring software capabilities that we listed above, when selecting the tools in this post, we also took some of the following criteria into consideration:
- Ease of installation and use, including the availability of documentation, community forums, and support
- Commitment to continual updates, improvements, and ongoing maintenance.
- Real-world problem-solving applicability and a robust feature set.
1. SolarWinds Server Health Monitor (FREE TOOL)
SolarWinds produces a suite of products for comprehensive network monitoring and management. For system monitoring, two are most relevant: a free tool, the Server Health Monitor, and a for-cost tool, the Server and Application Monitor.
The free Server Health Monitor (SHM) will monitor the availability, health, and performance of up to 5 servers – if you have the right type of servers.
- Supported servers are: Dell PowerEdge™, HP ProLiant™, and IBM eServer™ xSeries.
- Supported blade enclosures are: Dell PowerEdge M1000e, and HP BladeSystem c3000 and c7000.
- And supported hypervisors are: VMware vSphere® ESX Hypervisor and ESXi™ Hypervisor.
SHM uses SNMP, WMI, and CIM to poll the standard components in each server, including power supply, fan speed, temperature, CPU, and battery.
Once you’ve installed SHM, configuration is straightforward. For each server, you specify the hostname or IP address, and provide credentials for SNMP, WMI, and/or VMware. You can also adjust the polling interval.
The dashboard tab displays the overall health of the monitored servers. You can click on a server to get its particulars. Each sensor on that server is listed, and you can click a sensor to get greater detail.
SHM gives you near-real-time visibility into the health of a small collection of servers. As an entry-level tool, it doesn’t include a mechanism to send you alerts when you aren’t in front of the screen, or to generate reports on things like historical trends.
The SolarWinds Server Health Monitor is our 1st Choice! Monitor the health, status, and availability of server hardware with this powerful system monitoring software utility. And it’s 100% free!
Download 100% FREE Tool: solarwinds.com/free-tools/server-health-monitor/
2. Datadog Infrastructure Monitoring (FREE TRIAL)
The Datadog Infrastructure monitoring system uses an alerting mechanism to spot status problems with IT equipment before they develop into major disasters.
The alert monitors don’t just look out for hardware failure; they also check that service level objectives (SLOs) are being met. This means that the system not only ensures that everything is working but also watches for signs that performance might degrade to unacceptable levels.
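As an illustration of the SLO idea, the sketch below classifies measured availability against a target and flags the "at risk" zone before the objective is actually breached. It is a generic Python example, not Datadog’s implementation; the function name, the default target of 99.9%, and the warning margin are assumptions chosen for the example.

```python
def slo_status(total_requests: int, failed_requests: int,
               target: float = 0.999, warn_margin: float = 0.0005) -> str:
    """Classify measured availability against a service level objective."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    availability = (total_requests - failed_requests) / total_requests
    if availability < target:
        return "breached"   # the objective has already been missed
    if availability < target + warn_margin:
        return "at_risk"    # still compliant, but close to the limit
    return "ok"
```

For example, 8 failures in 10,000 requests gives 99.92% availability, which meets a 99.9% target but falls inside the warning margin, so the function returns `"at_risk"`.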
The Datadog system is composed of a series of modules, so the hardware monitoring of the Infrastructure monitor is not the only option available. The Application Performance Monitor unit of Datadog also adds insights that are vital for system monitoring.
Combining modules creates enhanced monitoring options. For example, Application Performance Monitoring includes root-cause analysis features that link through to the Infrastructure module. This can reveal that an apparent performance problem with an application is really a hardware capacity issue.
Adding on the Network Performance Monitoring module also allows the identification of traffic flows, which might be the real cause of slow delivery of applications.
All Datadog services operate from the cloud and can integrate the management of local, remote, and cloud-based resources. Each monitored system needs to have an agent program installed on it so that the remote monitor is able to gather system statistics. The user of the monitor is then able to access the console through any browser from anywhere.
The Infrastructure module is offered in three plan levels: Free, Pro, and Enterprise. As the name suggests, the basic package costs nothing to use. The top plan, Enterprise, includes machine learning processes to set alerts to spot performance falling below acceptable levels.
The Network Performance Monitor and Application Performance Monitor are each available in one plan level. There is also an App Analytical add-on available for the APM. All modules are charged for by subscription with a monthly rate and a cheaper annual fee. Whichever payment period is chosen, each term has to be paid for in advance. Datadog offers a 15-day free trial of each of its modules.
3. Paessler PRTG Network Monitor (FREE TRIAL)
The Paessler PRTG Network Monitor is a “batteries included” solution that monitors your servers and devices, network traffic, and more. PRTG can use NetFlow and sFlow, and we covered it in some detail in our exploration of free NetFlow traffic analyzers.
The PRTG Network Monitor runs on Windows. It monitors mail servers, web servers, database servers, file servers, and virtual servers. PRTG can monitor multiple sites and cloud services. It uses SNMP, WMI, NetFlow, sFlow, ping, ssh, REST APIs, and packet sniffing.
Setting up the tool is a bit complex but a setup wizard and how-to video lead you through the steps. The tool will find many devices and servers via auto-discovery.
In the user interface, a primary view is the device tree showing the devices (including servers) in your network, and the sensors monitoring each.
On the server hardware side, its sensors can monitor CPU load, memory, disk, server room environment, etc. On the applications side, it comes with more than 200 sensor types for common network services, including HTTP, SMTP/POP3 (email), FTP, etc.
You can specify thresholds for alerts, and PRTG can send notifications of detected issues via several methods, including email and SMS. It provides a range of predefined reports and facilities for designing custom reports. Reports can also be scheduled.
The free version is limited to 100 sensors after a 30-day trial. Because a sensor is an individual data stream, each server and device will typically require several sensors.
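A quick calculation shows how far a sensor allowance stretches. The Python sketch below is simple arithmetic; the per-device sensor counts are assumed figures for illustration, not numbers published by Paessler.

```python
def devices_within_limit(sensor_limit: int, sensors_per_device: int) -> int:
    """Return how many devices fit in a sensor allowance."""
    if sensors_per_device <= 0:
        raise ValueError("sensors_per_device must be positive")
    return sensor_limit // sensors_per_device


# Assuming roughly 5 sensors per device (CPU, memory, disk, ping, one service):
print(devices_within_limit(100, 5))  # 20
```

At an assumed 8 sensors per device, the same 100-sensor allowance covers only 12 devices, which is why the free tier suits a small network.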
The free version of PRTG Network Monitor provides a well-stocked toolbox for monitoring a small network.
4. SolarWinds Server & Application Monitor (FREE TRIAL)
The SolarWinds Server & Application Monitor (SAM) is part of the for-cost Orion suite of network monitoring and management tools; we looked at components of the Orion suite in our article on the best sFlow traffic analyzers. Where the Server Health Monitor can meet the needs of a small shop, SAM can cover small businesses to large enterprises. SolarWinds offers a 30-day free trial of SAM.
As the name suggests, SAM monitors the health and performance of server hardware and virtual servers from multiple vendors, as well as doing deep monitoring of many hundreds of applications. It can monitor multiple sites and cloud environments like Azure and AWS.
The SolarWinds Orion suite will auto-discover hosts and devices on your network. Then you can start to monitor them.
Once a server is identified and monitoring has been running, look under Node Details to see SAM’s display of the node’s performance and health data.
The server status data is displayed both graphically and in tables.
A second discovery scan is required so SAM can detect the applications running on the nodes previously discovered.
You can configure the application discovery scan to specify which applications SAM should look for. Then you provide the credentials SAM needs to access the information on the various nodes.
Once SAM has detected applications and begun its regular scan, the Application Summary will show top-level status for applications running on your servers.
The summary includes application alerts and events, top 10 nodes by CPU load, by physical memory, virtual memory, I/O operations, etc.
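A "top 10 by metric" view is a ranking over the latest sample from each node. The Python sketch below shows the idea with the standard library; it is a generic illustration rather than SAM’s implementation, and the function name and sample data are hypothetical.

```python
import heapq


def top_nodes(samples: dict, n: int = 10) -> list:
    """Return the n (node, value) pairs with the highest values, largest first."""
    return heapq.nlargest(n, samples.items(), key=lambda item: item[1])


# Hypothetical CPU load percentages keyed by node name.
cpu_load = {"db01": 91.0, "web01": 42.5, "web02": 67.0, "mail01": 12.0}
print(top_nodes(cpu_load, n=2))  # [('db01', 91.0), ('web02', 67.0)]
```

The same function works for memory or I/O figures, since it only depends on having one numeric value per node.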
SAM, working with the SolarWinds suite of network monitoring and management tools, provides a full range of features for customizable dashboards, analysis, alerting, reporting, etc.
SAM and the SolarWinds suite are enterprise-grade packages, so they are not cheap and call for considerable resources on the server hosting them. Most components tack on an additional charge. But if your network is large or growing, the SolarWinds suite with SAM is worth exploring.
5. ManageEngine OpManager (FREE TRIAL)
ManageEngine produces a full network management suite and offers free versions of some of their tools. In our roundup of free bandwidth monitoring software, we previously looked at the free edition of ManageEngine OpManager.
Setting up the OpManager is a multi-step process but not overly complex. Once you provide the subnet and SNMP parameters, OpManager will scan your subnets and discover your devices.
OpManager monitors availability and performance metrics of physical and virtual servers. The Application Performance Monitoring plug-in adds the ability to monitor applications. OpManager uses SNMP, WMI, and CLI via SSH or telnet.
The “Server Top 10” tab displays top utilization and availability for the discovered physical servers.
The “Virtualization Summary” tab displays metrics for your virtual servers.
OpManager contains sophisticated facilities for alerting and reports. You can set alerts based on thresholds; and it has a variety of useful canned reports, ranging from troubleshooting support to capacity planning and billing, as well as facilities for creating custom reports.
The predefined reports include “Network Health Status,” which gives a rollup for all the detected hosts.
The Application Performance Monitoring plug-in is only available free for a 30-day trial. It adds monitoring for application stacks and servers, web servers and services, databases, containers, and public and hybrid cloud environments. There is also a free version that limits OpManager to 10 devices.
6. SolarWinds RMM (FREE TRIAL)
SolarWinds RMM is a cloud-based remote monitoring and management software system. It is able to monitor networks, endpoints, mobile devices, servers, and applications. This service is not only a monitoring system. It can also automate many routine maintenance tasks and implement automated responses to status alerts or malicious activity.
As this system is resident in the cloud, it doesn’t matter where the monitored resources are physically located. They don’t have to be on the same site as the IT support team. The service is also able to include cloud resources into the monitoring system.
The endpoint and server monitoring functions of the system keep constant checks on the capacity and utilization of CPU, memory, and disk space on each machine. Other server monitoring functions include the management of virtualizations and the availability of patches and updates for services and software.
Network management services focus on the capabilities of the Simple Network Management Protocol (SNMP). The SolarWinds RMM system acts as an SNMP Manager, gathering status reports from device agents. Agent responses are interpreted into system data, which is shown live on the screens of the dashboard. SNMP also provides a system discovery function that will automatically document all network equipment.
The SNMP system includes an alerting mechanism. This means that technicians can work on other tasks and assume everything is OK with the network unless otherwise notified. The system alerts appear on the dashboard and can also be forwarded to key staff members via email or SMS message.
The dashboard offers summary screens for just about every category of equipment or service in the system. Each entry in a summary screen provides a link through to a details screen. Screens contain tables of data and also illustrative graphs and charts.
The system dashboard can be accessed from anywhere through a standard web browser or a special mobile app.
SolarWinds RMM is charged for by subscription with no setup fees. This makes the service appealing for startups because there are no upfront costs involved in starting to monitor remote systems. SolarWinds offers a 30-day free trial for those who are interested in trying out the service.
7. Site24x7 Server Monitoring
Site24x7 covers all of the aspects of system monitoring, giving you visibility on network, server, and application performance. This combination of competences is great for those who run virtualizations. The tool is able to monitor Microsoft Hyper-V and VMware and it’s also able to track the activities of Docker containers.
As a cloud-based system, Site24x7 is location-neutral. Your infrastructure management team does not need to be physically located in the same building as the facilities being managed. The dashboard is accessed through a browser and there is also an app to gain access through mobile devices.
Server monitoring gives you constant updates on statuses that include CPU, memory, disk, network interface, software operating system, port, and file system checks. In total, the tool covers more than 30 different server performance factors.
The application monitoring screens in the tool give a constant live view of all activity on your network. It seeks out the causes of detected errors, down to analyzing lines of code.
Monitoring is not limited to one particular site. The Site24x7 package is able to reach across the network to check on multi-site systems and remote systems. If you don’t run your own servers, but use cloud resources, you can still deploy the Site24x7 monitor to get visibility on your network performance. The tool can monitor AWS and Microsoft Azure servers.
On-premises, Site24x7 can oversee devices running Windows, Windows Server, Linux, FreeBSD Unix, and OS X. The tool is able to give detailed activity data for each switch on your network and is also able to monitor wireless networks.
Site24x7 is offered in packages to suit website-driven businesses and MSPs as well as regular bricks-and-mortar companies. The service is charged for on a subscription basis and there is a restricted Free Edition that can monitor up to five servers.
You can get a 30-day free trial of the system. If you choose not to buy at the end of the trial period, you get switched over to the Free edition.
8. Atera
Atera is a remote monitoring and management system (RMM) designed for use by managed service providers (MSPs). The tool goes beyond system monitoring because it includes a large number of system administration utilities, such as a patch manager.
The Atera service enables the MSP technician to monitor the IT infrastructure of a client through the installation of agent software on that remote site. Once the service begins for a new client, the Atera system searches through the client network and logs all attached devices. This creates an equipment inventory and starts the monitoring process for each device.
Atera also searches all endpoints and servers to record what software is installed on the site. This search feeds through to license management functions in the system. Both the equipment search and the software log present opportunities for the MSP to adjust contracts in order to reflect exactly what infrastructure the service will monitor – many clients don’t know exactly what equipment and software they have on-site before the MSP contract begins.
The monitoring system includes the supervision of networks, servers, and applications. The server monitor checks on all of the standard performance issues of a server. These include CPU, memory, and disk capacity and utilization.
The performance of applications is closely linked to the statuses of the servers that host them. This is particularly true in the case of virtualizations. Atera is able to monitor a wide range of applications including web and email servers, databases, virtualizations, and communication services.
The Atera system includes monitoring modules for in-house use by the MSP. These include contracts management, client management, and team management.
Beyond monitoring processes, Atera includes Help Desk management software that includes remote access and chat facilities for use by the technicians manning the Help Desk.
Atera is a cloud-based service, so the MSP doesn’t need to install any software in order to use it. All access to the dashboards of the service is made through a standard web browser. The service is charged for on a subscription basis with fees levied per technician per month. There is also a yearly tariff available, which works out cheaper. Atera is available for testing on a 30-day free trial.
9. Nagios XI and Nagios Core
Nagios is an enduring standard in network monitoring. Nagios Core is the open-source free version, and Nagios XI is the commercial for-cost variant with additional features and automated assistance for configuration. Nagios has a reputation for being powerful, reliable, scalable, and extremely customizable – and being complex to configure.
The free version has a learning curve but also an active community. It monitors servers, services and applications, just like the commercial version. It includes reporting by email and SMS, a basic user interface (including the network map), and basic reports.
Nagios Core lacks auto-discovery, and you must learn to set up and maintain complex configuration. On the plus side, it does give you a lot of flexibility to customize and extend the tool. Community-developed addons can perform discovery and help you get started with configuration.
You can use the free 60-day trial to evaluate the for-cost version and, if you elect to go with the free one, save the auto-generated config files from /usr/local/nagios/etc before uninstalling your eval copy. You can then use those files as your starting point for your new install’s configuration.
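To give a feel for what those configuration files contain, the Python sketch below renders a single host entry in the Nagios object-definition format. It is only an illustration: the function name is hypothetical, the `generic-host` template name is an assumption that must match a template defined in your own installation, and real configurations usually carry more directives.

```python
def render_host(host_name: str, address: str, alias: str = "",
                template: str = "generic-host") -> str:
    """Render one Nagios host object definition as text."""
    lines = [
        "define host {",
        f"    use        {template}",            # template to inherit from (assumed name)
        f"    host_name  {host_name}",
        f"    alias      {alias or host_name}",  # fall back to the host name
        f"    address    {address}",
        "}",
    ]
    return "\n".join(lines) + "\n"


print(render_host("web01", "192.0.2.10"))
```

Generating entries like this from an inventory list is one way to reduce the manual effort that Nagios Core otherwise demands, although the auto-generated files from a trial copy remain the more complete starting point.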
The commercial version Nagios XI has a richer range of features, including automated support for discovering your devices and hosts, automatically configuring the tool, and commercially-supported addons. It has a much more sophisticated user interface and more advanced reporting that covers trends, capacity planning assistance, etc.
Nagios XI is built to run on Red Hat Linux and CentOS. For Windows, use a VM appliance with Hyper-V or VMware. It includes an auto-discovery tool and a configuration wizard for adding a new device, host, or application.
Once Nagios XI is installed and monitoring, the Operations Screen gives you a high-level view of the current state of the network, and the Operations Center lets you drill down to the items mentioned.
The Host Status page shows a summary of metrics for the monitored hosts. You can drill down to an individual host to see details including performance graphs, capacity planning info, alarms, etc.
The Service Status page summarizes the state of the monitored services.
Nagios is a well-regarded solution for network monitoring. As with other tools that offer a fully-free vs commercial version tradeoff, you must decide whether you have (or will develop) the expertise and time to use the free tool, or whether it would be more cost-effective to pay for the automation and support of the commercial version.
10. Progress WhatsUp Gold
WhatsUp Gold is a long-established network monitoring tool from Progress (formerly Ipswitch). It’s a feature-rich yet straightforward server and application monitoring tool.
WhatsUp monitors servers, virtual servers, cloud services, and applications. It also monitors network traffic. Cloud monitoring includes hybrid cloud environments for Azure and AWS.
The free version is a five-point license for monitoring up to five resources (e.g., five servers).
WhatsUp must be installed on Windows. Setup is simple and uses auto-discovery. The user interface provides multiple views including an interactive network map and the ability to drill down to investigate issues. Dashboards are customizable.
It provides many canned reports and report customization too. There are multiple options for notification, including email and SMS. Triggered actions can also be specified for responding automatically to alerts.
WhatsUp’s list view shows the discovered hosts and devices, summarizing their characteristics and status.
The map view is an interactive map for visualizing your network’s components and their statuses. You can drill down to inspect the availability and performance of individual nodes.
The top 10 view shows critical statuses in your network.
The for-cost Application Performance Monitoring add-on adds the ability to monitor common applications and services.
The free trial edition of WhatsUp Gold is a straightforward and fully featured tool for monitoring and managing a small shop. Graduating to the for-cost version lets you move up to covering large networks.
Choosing a System Monitoring Tool
Besides the tooling to monitor your systems, you need a protocol in place for solving problems and responding to incidents. Best practices for system monitoring call for forethought and attention to design.
Free tools are tempting, particularly if you are on a tight budget. The free versions of paid software are usually limited in their capacity so that they can only support small networks. Some freeware has worked its way into the toolkits of seasoned network administrators mainly out of familiarity. However, these underfunded tools are usually under-supported and glitch-laden.
Planning is a key stage when buying new monitoring software. You need to look for suites of monitoring tools that cover the whole system stack. Remember that spending on monitoring saves you money in other areas of the IT department and prevents loss of income to the business due to system failure.