App::Netdisco::Manual::Troubleshooting - Tips and Tricks for Troubleshooting
The two basic components in Netdisco's world are Nodes and Devices. Devices are your network hardware, such as routers, switches, and firewalls. Nodes are the end-stations connected to Devices, such as workstations, servers, printers, and telephones.
Devices respond to SNMP, and therefore can report useful information about themselves such as interfaces, operating system, IP addresses, as well as knowledge of other systems via MAC address and ARP tables. Devices are actively contacted by Netdisco during a discover (and other polling jobs such as macsuck, arpnip).
Netdisco discovers Devices using "neighbor protocols" such as CDP and LLDP. We assume your Devices are running these protocols and learning about their connections to each other. If they aren't, you'll need to configure manual topology within the web interface (or simply have standalone Devices).
Nodes, on the other hand, are passive as far as Netdisco is concerned. The only job that contacts a Node is nbtstat, which makes NetBIOS queries. Nodes are learned about via the MAC and ARP tables on upstream Devices.
Because Netdisco only learns about Devices through a neighbor protocol, it's possible to run an SNMP agent on a Node. Only if the Node is also advertising itself via a neighbor protocol will Netdisco treat it as a Device. This can account for undesired behaviour, such as treating a server (Node) as a Device, or vice versa only recognising a switch (Device) as a Node.
To prevent discovery of devices,
devices_no configuration setting.
If you don't see links between Devices in Netdisco,
it might be because they're not running a neighbor protocol,
or for some reason not reporting the relationships to Netdisco.
show command to troubleshoot this:
~netdisco/bin/netdisco-do show -d 192.0.2.1 -e c_id
Please read the section above, if you've not yet done so.
Netdisco has four principal job types:
Gather information about a Device, including interfaces, vlans, PoE status, and chassis components (modules). Also learns about potential new Devices via neighbor protocols and adds jobs for their discovery to the queue.
Gather MAC to port mappings from known Devices reporting Layer 2 capability. Wireless client information is also gathered from Devices supporting the 802.11 MIBs.
Gather MAC to IP mappings from known Devices reporting layer 3 capability.
Poll a Node to obtain its NetBIOS name.
The actions as named above will operate on one device only. Complimentary job types
nbtwalk will enqueue one corresponding single-device job for each known device. The Netdisco backend daemon will then process the queue (in a random order).
Besides reading the whole of this manual page for general tips, take a look at the "SNMP Connect Failures" report under the Admin menu. Any devices listed have had multiple SNMP connect failures, indicating a possible configuration error on the device or in Netdisco's configuration.
Netdisco uses neighbor protocols to discover devices and will use as the default identity for a device the interface IP advertised over those neighbor protocols. You can use the
device_identity configuration setting to steer Netdisco towards using a different interface for the canonical device name.
If you upgrade the operating system then your system libraries will change and Netdisco needs to be rebuilt (specifically, C library bindings).
The safest way to do this is set up a new user and follow the same install instructions, connecting to the same database. Stop the web and backend daemon for the old user, and start them for the new user. Then delete the old user account.
Alternatively, if you do not mind the downtime: stop the web and backend daemons then delete the
~netdisco/perl5 directory and reinstall from scratch. The configuration file, database, and MIBs can all be reused in-place.
netdisco-do command has several debug flags which will show what's going on internally. Usually you always add
-D for general Netdisco debugging, then
-I for SNMP::Info logging and
-Q for SQL tracing. For example:
~netdisco/bin/netdisco-do discover -d 192.0.2.1 -DIQ
You will see that SNMP community strings and users are hidden by default, to make the output safe for sending to Netdisco developers. To show the community string and SNMPv3 protocols, set the
SHOW_COMMUNITY environment variable:
SHOW_COMMUNITY=1 ~netdisco/bin/netdisco-do discover -d 192.0.2.1 -DIQ
This is useful when trying to work out why some information isn't displaying correctly (or at all) in Netdisco. It may be that the SNMP response isn't understood. Netdisco can dump any leaf or table, by name:
~netdisco/bin/netdisco-do show -d 192.0.2.1 -e interfaces ~netdisco/bin/netdisco-do show -d 192.0.2.1 -e Layer2::HP::interfaces
You can combine this with SNMP::Info debugging, shown above (
Start an interactive terminal with the Netdisco PostgreSQL database. If you pass an SQL statement in the "-e" option then it will be executed.
~netdisco/bin/netdisco-do psql ~netdisco/bin/netdisco-do psql -e 'SELECT ip, dns FROM device' ~netdisco/bin/netdisco-do psql -e 'COPY (SELECT ip, dns FROM device) TO STDOUT WITH CSV HEADER'
The last example above is useful for sending data to Netdisco developers, as it's more compact and readable than the standard tabular output (second example).
The database schema can be fully redeployed (even over an existing installation), in a safe way, using the following command:
You can see HTTP Headers received by Netdisco, and other information such as how it's parsing the config file, by enabling the Dancer debug plugin. First download the plugin:
~netdisco/bin/localenv cpanm --notest Dancer::Debug
Then run the web daemon with the environment variable to enable the feature:
DANCER_DEBUG=1 ~/bin/netdisco-web restart
A side panel appears in the web page with debug information. Be sure to turn this off when you're done (stop and start without the environment variable) otherwise secrets could be leaked to end users.
If you change the SNMP community string in use on a Device, and update Netdisco's configuration to match, then everything will continue to work fine.
However, if the Device happens to support two community strings then Netdisco can become "stuck" on the wrong one, as it caches the last-known-good community string to improve performance. To work around this, delete the device (either in the web GUI or using
netdisco-do at the command line), and then re-discover it.
Try running the following command for installation:
curl -L http://cpanmin.us/ | CFLAGS="-DPERL_ARGS_ASSERT_CROAK_XS_USAGE" perl - --notest --local-lib ~/perl5 App::Netdisco