So, you're the senior network engineer for a nice large network. Everything is running smoothly until one day a primary link goes down and you seem to be the last one to know about it! Everyone is running to you to find out what's going on and expecting you to know what went down and why. Ever had that happen to you?
Now if you have set up your network properly this should never have happened, right? You are using dynamic routing protocols to alt-route your traffic around any potential problem to avoid this scenario, right? NOTE: If you aren’t running an internal routing protocol like OSPF or EIGRP you had better get a plan together and get it in place today! But I digress
🙂
So back to our scenario… Your network is running smoothly and all of a sudden you have a link that goes down. All traffic is re-routed around the problem (dynamically) and the users NEVER experience any problem at all! No one ever knows about the catastrophe that was just avoided. What a cool thing! Too bad the Captain of the Titanic wasn’t as fortunate.
But let me ask you… if the user never sees the problem, how will YOU know that a problem has occurred or is occurring? If you’re pinging the far end of your link as a test of its usability, these pings will get re-routed with all the other traffic, and you may end up getting caught not knowing an outage has happened.
Here’s a quick tip for proactively monitoring all your links and not depending on traffic flow to see if the other end of your link is up or not.
In each of your routing protocols you need to be logging adjacency changes. Or in the case of BGP, neighbor changes. This way if your router loses contact with its distant end partner, a log entry will be made to document this loss of contact.
The configuration change is simple: only one line needs to be added to your routing protocol section within your config. Here are two examples:
router bgp 65000
 no synchronization
 bgp router-id 1.1.1.1
 bgp log-neighbor-changes
!
router ospf 1
 router-id 1.1.1.1
 log-adjacency-changes
In both of the above examples we are telling the router to create a log entry whenever the associated routing protocol loses contact with its neighbor (for example, missed OSPF hello packets or BGP keepalives) and an adjacency change occurs.
From here what you should be doing is redirecting your logs to a syslog server:
logging 172.16.4.160
Most syslog servers and network management servers will have email or paging mechanisms to alert you if a particular line of text is detected in the logs. Simply have your syslog server notify you if it detects anything in the log with the string “ADJCH” or “NBRCHANGE” or “Neighbor Down”. You may need to experiment to see what the exact log entry is for your particular router and routing protocol.
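If your syslog server can't do the matching for you, the same idea can be sketched in a few lines of Python. This is only a rough illustration, not a finished alerting tool: the function names are made up for this example, the case-insensitive matching is my own assumption, and the sample log lines are just shaped like typical messages. Your router's exact text may differ, so test against your own logs.

```python
import re

# The three strings suggested above. Case-insensitive matching is an
# assumption made for this sketch, not something the router requires.
ALERT_PATTERN = re.compile(r"ADJCH|NBRCHANGE|Neighbor Down", re.IGNORECASE)


def is_adjacency_event(line: str) -> bool:
    """Return True if a syslog line mentions an adjacency or neighbor change."""
    return ALERT_PATTERN.search(line) is not None


def filter_events(lines):
    """Return only the lines worth alerting on, with surrounding whitespace stripped."""
    return [line.strip() for line in lines if is_adjacency_event(line)]


if __name__ == "__main__":
    # Illustrative sample lines only; real message text varies by
    # platform and software version.
    sample = [
        "%OSPF-5-ADJCHG: Process 1, Nbr 2.2.2.2 on Serial0/0 from FULL to DOWN, Neighbor Down: Dead timer expired",
        "%SYS-5-CONFIG_I: Configured from console by admin on vty0",
        "%BGP-5-ADJCHANGE: neighbor 10.0.0.2 Down",
    ]
    for event in filter_events(sample):
        print(event)
```

From here you would feed it the lines from your syslog file and hand anything it returns to whatever email or paging mechanism you already use.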
Remember: always be the first to know what's going on in your network. And don’t be caught off guard!
I hope this is helpful to you all, and as always drop me a line if you have any questions or comments.
Take care,
Freak!