November 29, 2010 / A network engineer's diary..

Configuring SLA monitors and using them for route tracking and graphing WAN conditions.

 

In this article let’s take a look at Cisco IP SLA. IP SLA is a toolkit that lets you monitor and measure network statistics such as packet loss, delay and jitter (variation in delay) in near real time. Should you choose to act on the measurements, for instance change a static route or fail over to a backup interface when the jitter on a tracked route or interface rises above a set threshold, SLA provides a set of tracking commands for that as well. You can even write EEM (Embedded Event Manager) policies on the device to perform more complex actions, such as emailing a technician when jitter increases; we will keep that for a later article.

SLA (or SAA in some older IOS versions), once configured between two routers, sends a constant stream of live traffic, the type of which depends on the configuration. The router at the far end measures the deviation from the expected behavior for this traffic and reports any variation back to the originating router. Should the deviation exceed a set threshold, an alarm is generated on the originating router. To understand the process better, consider an SLA configured to measure jitter between a pair of routers. The originating router generates a constant stream of packets, timestamps each packet and dispatches it to the far end; the far-end router compares the timestamp on each packet with its own reception time, subtracts its own processing time from the total, and sends the resulting value back to the originating router. Should the recorded deviation exceed the threshold, the originating router raises an alarm.

To better understand this, let’s take a look at a live scenario. We have two routers: one at the originating side that generates the live traffic and another at the terminating side that receives it and reports back the results. We will configure the originating router to send two streams of data, one for measuring path loss and another for measuring jitter along the path. The path-loss stream will be used to track connectivity, and the data collected from the jitter stream will be used to analyze network conditions.

The configured network looks like the one shown above. Let’s proceed with the configuration; on the originating router we select the type of traffic to be measured. In the example below we have selected ICMP traffic to measure path loss and a steady stream of traffic on UDP port 10000 to measure the jitter characteristics. The configuration on the originating router is shown below.
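
In plain text, that configuration boils down to something like the sketch below, using the newer ip sla syntax (older 12.x images use ip sla monitor instead); the destination 10.1.1.2 is borrowed from the static-route example later on and the probe frequency is an assumption.

    ip sla 1
     icmp-echo 10.1.1.2
     frequency 10
    !
    ip sla 2
     udp-jitter 10.1.1.2 10000
     frequency 10
    !
    ! each monitor also needs to be scheduled (here: start now, run indefinitely)
    ip sla schedule 1 life forever start-time now
    ip sla schedule 2 life forever start-time now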

 

Note that each type of test needs a monitor statement of its own. Also note that once an SLA monitor is defined, a scheduling command must be added to specify when, and for how long, the SLA measurements are to be taken.

On the terminating end, we enable the IP SLA responder to handle the receiving side of the process, as shown below.
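
On most images this is a single global command (older releases use ip sla monitor responder):

    ! terminating router -- answers udp-jitter control and probe packets
    ip sla responder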

Now, let’s take a look at the IP SLA monitoring statistics on the originating router.
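
Depending on the IOS release, one of these standard commands displays the per-operation statistics and the return code discussed below:

    show ip sla statistics
    ! or, on older 12.x images:
    show ip sla monitor statistics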

Notice that the tests are listed as Index 1 and Index 2; also note that the ‘latest operation return code’ is ‘OK’ for both streams. Should the path loss or the jitter cross the configured threshold, the return code changes to a non-OK value, triggering an alarm.

Using these SLA measurements, let’s track a static route towards the far end, so that if and when the SLA detects a timeout due to path loss, a preconfigured floating static route kicks in to take its place.

Let’s take a look at the tracking configuration now.
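
In plain text it boils down to something like this sketch, assuming the 200.1.1.0/24 destination and the next hops 10.1.1.2 and 172.16.17.2 used in this example (the /24 mask and the track object number are assumptions):

    ! tie a track object to SLA operation 1 (the path-loss probe)
    track 1 rtr 1 reachability
    ! newer images use: track 1 ip sla 1 reachability
    !
    ! primary static route, valid only while the track object is up
    ip route 200.1.1.0 255.255.255.0 10.1.1.2 track 1
    ! floating static route with a worse administrative distance as backup
    ip route 200.1.1.0 255.255.255.0 172.16.17.2 250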

Note that in the configuration above the tracking object references the IP SLA index number. The tracking statement is then tagged onto the static route through 10.1.1.2; we also have a floating static route through 172.16.17.2 which comes into play should the SLA take the tracked route down.

This is how reachability looks for the 200.1.1.0 subnet when the path-loss SLA is healthy (i.e. there is no path loss).

Now, to simulate a loss of connectivity, I have removed the frame relay map; note that the serial interface still stays up, but the route is lost. Below is the debug output on the originating router. Notice the rtr 1 state change from Up to Down.

Now let’s verify the routing table again.

As planned, our floating static route has taken the place of the tracked route and the 200.1.1.0 subnet is reachable again.

That’s the end of SLA and tracking. Now let’s take a look at how the jitter characteristics that we mapped on stream 2 of the originating router can be displayed graphically using Cacti (refer to the links mentioned below for the configuration of this tool; it can be run on just about any available Linux or BSD distribution).

Let’s configure the originating router to honor the SNMP walk requests that it gets from Cacti.
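
A minimal sketch of that SNMP configuration; the community string, the ACL number and the Cacti host address are placeholders, and a read-only community is all Cacti needs for polling:

    ! allow read-only SNMP polling (restricting it to the Cacti host via ACL 10 is optional)
    access-list 10 permit host 192.168.1.50
    snmp-server community MYCOMMUNITY RO 10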

Now we do the corresponding configuration on Cacti as well (refer to the links mentioned below to install and configure Cacti for this purpose).

Here is a graph that charts the round-trip time (RTT) on the two-router network (can you believe a 40 ms delay, and all I have is a frame relay switch sitting between the two routers?).

Graph Template: Cisco – SAA Basic Statistics

Here is another graph that charts the standard deviation and mean of jitter on the network.

Graph Template: Cisco – SAA Jitter Dispersion


Cacti can store and display these logs for years, making it easy to diagnose and resolve the time-sensitive problems that typically plague WAN links. Thanks for reading on, and stay tuned for more articles.

For more information on installing and configuring Cacti for SNMP polling, refer to http://bit.ly/i3l2Jz

For further information on templates that can be used for SLA monitoring refer to http://bit.ly/3P6K5g

August 30, 2010 / A network engineer's diary..

BGP Confederations and Route Reflectors – A design overview.

 

Folks, welcome back. In this article we will design a network that employs confederations and route reflectors. Confederations help us divide a large BGP autonomous system into smaller sub-autonomous systems. Route reflectors, on the other hand, are used to relax the iBGP split-horizon rule (split horizon in BGP stops a router from advertising a route it learned from one iBGP peer to another iBGP peer, because iBGP expects full-mesh connectivity between all the routers running it).

To illustrate how these concepts come into play in a real network design, check out the diagram above. There are two main autonomous systems (AS) here, AS 10000 and AS 20000 (indicated by a red dotted line). I have split AS 10000 into two confederation sub-autonomous systems: AS 65010 on the left and AS 65020 on the right. Routers R1, R2 and R3 form part of confederation AS 65010, and routers R4 and R5 form part of confederation AS 65020 (indicated by the black dotted lines encircling these routers). R6 sits in the external AS 20000 and peers with AS 10000 at two points, on router R2 and on router R4. Note that the confederation sub-AS numbers should be taken from the private range of 64512 to 65535 and are locally significant (think private IP address blocks).

Mentioned below is R1’s configuration:
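
A minimal sketch of what R1’s configuration would contain; the neighbor address and the network statement are placeholders:

    router bgp 65010
     ! intra-confederation iBGP peering towards R2 (address is a placeholder)
     neighbor 192.168.12.2 remote-as 65010
     ! locally originated prefixes (exact statements not shown here)
     network 10.1.0.0 mask 255.255.252.0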

Note that the BGP process on a member router takes the AS number of its confederation sub-AS, hence R1 runs BGP with AS 65010.

R3 and R5 have configurations similar to R1’s; all are internal confederation routers with no direct peering to external networks.

Now let’s take a look at R2’s configuration.
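
Again a sketch rather than the original config; the neighbor addresses are placeholders, but the two confederation commands are the ones discussed below:

    router bgp 65010
     bgp confederation identifier 10000
     bgp confederation peers 65020
     ! iBGP inside confederation 65010 (placeholder addresses)
     neighbor 192.168.12.1 remote-as 65010     ! R1
     neighbor 192.168.23.3 remote-as 65010     ! R3
     ! confederation-external peering towards R4 in sub-AS 65020
     neighbor 192.168.24.4 remote-as 65020
     ! true external peering towards R6 in AS 20000
     neighbor 192.168.26.6 remote-as 20000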

Most of the statements on R2 are self-explanatory except for “bgp confederation identifier 10000” and “bgp confederation peers 65020”. The confederation identifier presents R2 as part of the main autonomous system 10000 to the external router R6 in AS 20000. Note that this command is needed only on the routers that interface with external networks (for instance, R1, R3 and R5 do not need it). The bgp confederation peers command, on the other hand, identifies R2 as part of confederation 65010 to R4; a similar statement is needed on R4 to identify itself as part of confederation 65020 to R2.

Taking a look at R4’s configuration, it is essentially a mirror image of R2’s. Note that bgp confederation peers on R4 is set to 65010 to mirror R2’s configuration.

Simple enough! Now let’s take a look at R6’s configuration and its routing table.
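
R6’s side of the peering boils down to something like this (addresses are placeholders); note that both sessions simply point at AS 10000:

    router bgp 20000
     ! R6 peers with the confederation border routers and only ever sees AS 10000
     neighbor 192.168.26.2 remote-as 10000     ! towards R2
     neighbor 192.168.46.4 remote-as 10000     ! towards R4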

R6 is unaware of any complexity within AS 10000; it peers with the AS on R2 and R4 and sees it simply as AS 10000.

Looking further at R6’s routing table, we see that all the routes R6 has learned appear to come from autonomous system 10000, regardless of which confederation sub-AS within it they originated from.

That completes our discussion on BGP confederations.

Let’s briefly take a look at route reflectors now. Consider routers R1, R2 and R3 from the architecture; per the BGP split-horizon rule, R2 should not advertise the routes learned from R1 to any of its other iBGP peers, which in this case is R3. The rule was put in place because the creators of BGP expected iBGP to be deployed on fully meshed service-provider cores, which means there is no need for one router to re-advertise routes learned via iBGP to another iBGP neighbor. But when the routers are not fully meshed, this rule becomes a pain in the a**. By making a peer a route-reflector client, the split-horizon rule is selectively suppressed for that peer.
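
On this network the suppression is done on R2; a minimal sketch, reusing the placeholder neighbor addresses from the earlier sketches:

    router bgp 65010
     ! R1 and R3 become route-reflector clients of R2, so R2 reflects
     ! iBGP-learned routes between them
     neighbor 192.168.12.1 route-reflector-client
     neighbor 192.168.23.3 route-reflector-client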

Now let’s remove the route-reflector configuration from R2.

Below is a snapshot of R1’s BGP topology table with the route-reflector-client statements removed from R2. Note that R2 no longer advertises to R1 the routes it learned from its iBGP neighbor R3, although it still advertises the routes learned from its eBGP neighbors. As a result, R3’s 30.1.0.0/21 subnets disappear from R1’s routing table.

Similarly, R3 still receives all the other routes except the ones originated by R1 (which were of course being advertised by R2). Hence the 10.1.0.0/22 subnets vanish from its routing table.

Let’s now put the configuration back and see how it impacts the routing tables.

Once the change is made, R1 starts seeing the 30.1.0.0/21 network once again.

Similarly, we see the 10.1.0.0/22 subnets once again in R3’s routing table.

Note that some route-maps were created on R2 to change the next hop for the networks advertised between R1 and R3; without them, those routes would not be reachable. I am not showing that step, to limit the length of the article. The next-hop-self command on a BGP peering does not apply to route-reflected routes (and Cisco and Juniper consider that a feature, not a bug).

Hope you enjoyed the article; stay tuned for more and have a nice summer.


July 31, 2010 / A network engineer's diary..

Design and deployment of HSRP in a LAN environment

Folks, welcome back! In this session we take a look at HSRP. HSRP, or Hot Standby Router Protocol, was designed primarily to provide redundancy for default gateway (DG) failures on LAN segments. Understandably, for most networks out there, if the default gateway is lost (by a router crash, an interface disconnect, or the like) the LAN loses the ability to communicate with external networks. HSRP offers layer 2/3 redundancy for such failures by providing a virtual IP and a virtual MAC address and binding interfaces on two or more routers to the same virtual IP address (VIP). Only one router can be active at a time. The router chosen to be active answers the requests that come in for the VIP; should the active router become inaccessible, the standby router(s) assume the role of servicing requests for the VIP. Note that to benefit from HSRP, the default gateway on the PCs should be set to the VIP instead of the interface addresses of the routers.

With that in mind, let’s design a network that does this for us. Our goal here is to create an HSRP group with R3 as the active router and R2 as the standby router. We will set the VIP to 192.168.1.100 and configure R3 to preempt (giving it the capability to reclaim the active role should it go inaccessible and come back up). Further, we will track the interface between R3 and R4, so that if that serial interface goes down, R3 loses its active status to R2 (note that if the serial interface is down, it is useless to keep R3 active, as it has no way to route packets towards the core). Should R3 regain the serial interface, it should switch back to active; the preempt statement takes care of that. Refer to the diagram below for further details.

Let’s begin by configuring R2 (refer to the snapshot below). Note that the VIP in this case is 192.168.1.100 and R2 is given a priority of 101; also note that the preempt keyword enables R2 to kick the active router off and assume its role should R2’s priority become the better value.
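
In plain text, R2’s side boils down to something like this, assuming HSRP group 10 (the group number shows up later in the virtual MAC) and a placeholder LAN interface and address:

    interface FastEthernet0/0
     ip address 192.168.1.2 255.255.255.0
     standby 10 ip 192.168.1.100
     standby 10 priority 101
     standby 10 preempt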

Let’s move on to R3. The configuration is almost the same, except that the priority is raised to 255 (making R3 the active router in the group) and a statement is added to track interface Serial 1/0 with a priority decrement of 155. In essence, whenever Serial 1/0 goes down, 155 is subtracted from 255, bringing R3’s priority down to 100 and making R2 the active router (remember, R2’s priority is 101). Again, thanks to the preempt statement in the config, should Serial 1/0 come back up on R3, it returns to its original active state.
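
And the corresponding sketch for R3 (again with a placeholder interface and address):

    interface FastEthernet0/0
     ip address 192.168.1.3 255.255.255.0
     standby 10 ip 192.168.1.100
     standby 10 priority 255
     standby 10 preempt
     ! drop the priority by 155 (255 -> 100) if Serial1/0 goes down
     standby 10 track Serial1/0 155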

 

Once that’s done, we see that the HSRP groups come up.

Also note that R2 and R3 exchange HSRP hellos with each other once every 3 seconds by default. The debug log on R3 below shows hello packets coming into R3 from R2 announcing that it is in standby mode, and hello packets leaving R3 announcing that it is active.

 

Taking a look at the ARP table on router R1 below, note that the VIP now has a separate MAC address of 0000.0c07.ac0a, which is different from the individual BIAs of routers R2 and R3. Note that the 0x0a in the last octet corresponds to decimal 10, the HSRP group number that we created.

 

Now that we are familiar with the inner workings of HSRP, let’s see it in action. As a first test, I unplug the Ethernet cable on R3. As soon as this happens, the router resigns its active role and goes into the Init state because it detects the change on the Ethernet interface.

 

R2 does not learn of the failure until its hold (dead) timer expires a few seconds later; as soon as that happens, it assumes the active role. The debug logs on R2 are shown below.

Note that the virtual MAC address remains unchanged during the switchover, so R1 does not need to update its ARP cache. That is the advantage of using a virtual MAC rather than the router’s BIA. A router reboot or any other loss of connectivity results in a similar outcome (not shown). As soon as I plug the Ethernet cable back in, R3 leaps back to active status (not shown).

As a last exercise, let’s see interface tracking in action. To illustrate this, let’s unplug the serial cable from R3; note that as soon as Serial 1/0 is unplugged, R3’s priority drops to 100, which is one less than R2’s priority, and it therefore loses its active status.

Mentioned below is a debug log of R3 stepping down.

Mentioned here is a debug log of R2 stepping up to be an active router.

Note that as soon as the tracked interface is connected back, R3’s priority surges back to 255 and it becomes active again (not shown).

That concludes this article on HSRP. Thanks for reading on and stay tuned for more in coming days.


July 1, 2010 / A network engineer's diary..

Voice VLANS, Data VLANS and DHCP architectures in centralized and distributed environs.

 

Thanks for tuning in. In this article let’s look at some design considerations for a LAN housing a converged voice and data network. Today we look at how to segregate the data and voice traffic streams using VLANs and dynamically assign them IP addresses from two separate pools in a standalone architecture. Further on, we discuss how this model can be extended to a distributed architecture.

To begin with, I have an IP phone (a Cisco 7940) that is daisy-chained to a PC (my old Dell laptop) and connected to a 3560 switch port, just as shown in the figure below. In the background I am running a VM of a Call Manager server with the TFTP service activated.

 

 

Note that the traffic from both the IP phone and the PC enters the same switch port, so we create two VLANs (assigned with the access keyword for the PC and the voice keyword for the phone) to segregate the traffic streams.
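
A minimal sketch of that switch-port configuration; the interface name and the VLAN IDs (10 for data, 20 for voice) are placeholders chosen to line up with the 10.x and 20.x subnets used later:

    interface FastEthernet0/1
     switchport mode access
     switchport access vlan 10     ! data VLAN for the PC
     switchport voice vlan 20      ! auxiliary voice VLAN for the IP phone
     spanning-tree portfast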

 

 

The router’s Ethernet interface is configured with standard dot1q-trunked sub-interfaces, each carrying the respective VLAN tag; nothing funky there.
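
A sketch of the router-on-a-stick side, reusing the same placeholder VLAN IDs and gateway addresses (the article only pins the subnets down to 10.x and 20.x):

    interface FastEthernet0/0.10
     encapsulation dot1Q 10
     ip address 10.10.10.1 255.255.255.0    ! default gateway for the data VLAN
    !
    interface FastEthernet0/0.20
     encapsulation dot1Q 20
     ip address 20.20.20.1 255.255.255.0    ! default gateway for the voice VLAN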

Once that’s done, we proceed to implement two separate DHCP pools, one for each VLAN (I always forget the commands and the order of typing them; the Cisco Doc CD comes in handy, and below is a link to its DHCP section).

http://tinyurl.com/2ccweco

Note that I have excluded the IP addresses that I assigned to the physical and switched virtual interfaces. Also note that the default router should be an IP address in the same subnet and should be in the up/up state (it is this address that listens for the DHCP requests broadcast in the subnet). The lease period is optional; for a phone network I would typically set an infinite lease. One thing that differs in a DHCP pool for an IP phone, as opposed to a PC, is that the phone’s pool needs option 150 listing the TFTP server address (any IP phone has to go to a TFTP server, typically the Call Manager, to retrieve its configuration files; the IP in option 150 is the address of that TFTP server).
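
Putting that together, the two pools boil down to something like this; the subnets and gateways are the placeholders from the sketches above, while the option 150 address 192.168.1.110 is the TFTP/Call Manager address used in this example:

    ip dhcp excluded-address 10.10.10.1
    ip dhcp excluded-address 20.20.20.1
    !
    ip dhcp pool DATA
     network 10.10.10.0 255.255.255.0
     default-router 10.10.10.1
    !
    ip dhcp pool VOICE
     network 20.20.20.0 255.255.255.0
     default-router 20.20.20.1
     option 150 ip 192.168.1.110      ! TFTP server (Call Manager)
     lease infinite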

At this stage I bring up my devices and the DHCP addresses are issued. It’s that easy. Note that the PC gets an address in the 10.x subnet and the phone in the 20.x subnet, as planned. Further (what I am not able to show you here), the phone reaches the TFTP server at 192.168.1.110 and boots up with an extension, an awesome sight to see!

 

This completes the DHCP allocation for a standalone remote office. Now let’s venture a bit further and take up the case of a VoIP deployment in a distributed architecture, with a main site running the DHCP server and a remote site housing the phones and PCs.

We create the two DHCP pools on the main-site router in this case, using the same process outlined above, and we convert the remote-site router into a DHCP relay agent.

 

 

Note that the DHCP broadcast request is converted into a unicast packet and sent to the router that houses the pool, making it a safe bet on WAN links.

At the remote site we configure the router as the relay with the help of the ip helper-address command. Note that any interface IP address on the main-site router can be used as the helper target, preferably a loopback interface.
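
A sketch of the relay configuration on the remote-site LAN sub-interfaces; 172.16.1.1 stands in for the main-site router’s loopback and, like the rest of the addressing here, is a placeholder:

    interface FastEthernet0/0.10
     encapsulation dot1Q 10
     ip address 10.10.10.1 255.255.255.0
     ip helper-address 172.16.1.1     ! forwards DHCP broadcasts to the main-site router
    !
    interface FastEthernet0/0.20
     encapsulation dot1Q 20
     ip address 20.20.20.1 255.255.255.0
     ip helper-address 172.16.1.1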

 

Further we can see the DHCP bindings being allocated to the clients at the main site.

Note that using the DHCP server at the main site gives us the advantage of centralized pool management in complex networks, but if your WAN goes down, phones can end up without an IP address. Not really a problem if you have a centralized cluster housed at your main site, but it may add another failure point if you are running a distributed CUCM cluster across the main and remote sites.

This completes the section on the basic design of VLAN and DHCP instances for data and voice traffic. Comments and critiques on the article are certainly welcome. Stay tuned for more articles in the coming weeks and months.

June 19, 2010 / A network engineer's diary..

BGP design for a dual homed enterprise – the bare minimums!

 

We are back with another article, and this time I take a look at a typical enterprise network that connects to more than one ISP (I will consider two) for redundancy. One of the main concerns in any dual-homed BGP network is preventing your own AS from becoming a transit network, and in this article we look at how that can be achieved. In addition, most enterprises prefer one ISP over the other for each direction of traffic. For instance, in the diagram shown below, AS 100, which happens to be the enterprise network, wants to use ISP-A (AS 200) for its incoming traffic and ISP-B (AS 300) for its outgoing traffic, while of course keeping each of them capable of handling both directions when the other ISP is unreachable for some reason.

 

With these basic goals in mind, let’s begin designing and implementing this network.


I have already deployed BGP on all four routers R1, R2, R3 and R4, and created some internal routes on each of them to give us some meat to work with.

 

Shown below is the complete BGP table of the host router R1, before any of the BGP attributes are changed.


 

 

 

 

Note that 100.1.1.0/24 exists in AS 400, which has peering relationships with both ISPs, and hence the BGP process on R1 learns it from both AS 200 and AS 300.

 

Coming to the first design aspect, preventing the host network (R1 in AS 100) from becoming a transit network: to achieve that, we must stop the subnets advertised by ISP A from reaching ISP B, and vice versa, through R1. We do this by matching each ISP’s AS paths with as-path access lists, referencing them in deny route-map entries, and applying those route-maps to the respective neighbors in the outbound direction, as shown below.
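
A sketch of one way to build this, assuming for illustration that R2 is the AS 200 peer and R3 the AS 300 peer, with placeholder neighbor addresses; the regular expressions ^200_ and ^300_ match anything learned from the respective ISP:

    ip as-path access-list 1 permit ^200_         ! routes learned from AS 200
    ip as-path access-list 2 permit ^300_         ! routes learned from AS 300
    !
    ! towards AS 300: drop anything learned from AS 200, allow the rest
    route-map NO-TRANSIT-TO-300 deny 10
     match as-path 1
    route-map NO-TRANSIT-TO-300 permit 20
    !
    ! towards AS 200: drop anything learned from AS 300, allow the rest
    route-map NO-TRANSIT-TO-200 deny 10
     match as-path 2
    route-map NO-TRANSIT-TO-200 permit 20
    !
    router bgp 100
     neighbor 192.168.12.2 route-map NO-TRANSIT-TO-200 out   ! R2 in AS 200
     neighbor 192.168.13.3 route-map NO-TRANSIT-TO-300 out   ! R3 in AS 300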

 


 

 


 


That stops R2’s and R3’s routes from being advertised to one another through R1, and hence prevents AS 100 from becoming a transit path between the two ISPs.

 

Now moving on to the second design objective, preferring AS 300 as the outbound ISP and AS 200 as the inbound ISP. Note that there are two parts to it: the former concerns traffic leaving AS 100, and the latter concerns traffic entering AS 100.

 

To make AS 300 the preferred path for all outbound traffic, we match the routes advertised by AS 300 and set their local preference to a value above the default of 100 (the same effect can be achieved with the Cisco-proprietary weight attribute).

 

On R1
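
A sketch of what that looks like (the AS 300 neighbor address is a placeholder, and the local-preference value of 120 matches the output described below):

    route-map PREFER-AS300 permit 10
     set local-preference 120
    !
    router bgp 100
     ! applied inbound so every route learned from AS 300 gets local preference 120
     neighbor 192.168.13.3 route-map PREFER-AS300 in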


 

And

 


 

 

Note that the route with the highest local preference is the one placed in the routing table.

 


 

Note that the LocPref value has changed to 120 on all the routes learned from AS 300.

 


 

Also note that the route 100.1.1.0/24 is now preferred via AS 300 rather than via AS 200.

 

That completes our second goal. Now moving on to the third and last aspect of the design: making AS 300 less preferred (or AS 200 more preferred) for external traffic coming into the host AS.

 

To achieve this, we create another as-path access list that matches the locally originated subnets, reference it in a route-map that prepends our own AS number a couple of times, and apply that route-map outbound on the neighbor statement for the ISP that should be less preferred.
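
Since IOS allows only one outbound route-map per neighbor, the sketch below folds the prepend into the same outbound map already applied towards the AS 300 neighbor in the earlier sketch; the access-list and sequence numbers are arbitrary:

    ip as-path access-list 5 permit ^$            ! locally originated prefixes only
    !
    route-map NO-TRANSIT-TO-300 deny 10
     match as-path 1                              ! still block transit of AS 200 routes
    route-map NO-TRANSIT-TO-300 permit 15
     match as-path 5
     set as-path prepend 100 100                  ! make our prefixes look farther away via AS 300
    route-map NO-TRANSIT-TO-300 permit 20
    !
    router bgp 100
     neighbor 192.168.13.3 route-map NO-TRANSIT-TO-300 out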

 


 


 

 


 

This prepends AS 100 to the AS paths that are advertised, making that path unlikely to be chosen as the return path. Just to be sure, let’s take a look at R4’s routing table to see how this turns out.

 


 

A look at R4 reveals that all the routes learned via AS 300 now carry a longer AS path, while those learned via AS 200 have a much shorter one, hence the routes through AS 200 are preferred.

 

That completes the second part of this objective.

 

Note that this method of influencing the inbound path works only if the AS originating the traffic keeps all other parameters, such as local preference, weight and origin, at their defaults. Should they change those parameters, we might still end up receiving traffic through the ISP we did not want (just as we override such things when sending our own traffic out). It’s best to keep your ISPs in the loop while you set these preferences, and better yet if your ISP can do it for you (in which case you might mark the routes to be treated a certain way with a community tag).

 

That completes the design session; we have achieved all three of our goals. Please stay tuned for more sessions on hot and ‘not so hot’ networking topics.


June 19, 2010 / A network engineer's diary..

Static connected routes and the IGP’s – a weird twist!

Until recently I thought that only the interfaces whose IP subnets fall under the network statements defined under a routing protocol are put into the topology table and therefore advertised to neighboring routers running the same IGP. Folks, that’s a myth. A static route whose next hop is an interface behaves the same way a connected interface does, and if the subnet defined on the static route falls under the networks defined for the routing protocol, well, guess what: the static route is picked up by the IGP and propagated to all its neighbors running the same protocol. Interestingly, this happens only with distance-vector IGPs like EIGRP, RIP and IGRP. Link-state IGPs do not do this with static routes.

To see this in operation, let’s take a look at the example network shown in figure 1.

All the routers are running EIGRP, and on R10 we have a static route defined with the Null0 interface as the next hop, covering R12’s Lo1 address in the 20.1.1.x subnet.
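
A minimal sketch of R10’s side, assuming a /24 mask and EIGRP AS 100 (both are assumptions; the article only refers to the 20.1.1.x subnet):

    ! static route to an interface -- EIGRP treats it like a connected subnet
    ip route 20.1.1.0 255.255.255.0 Null0
    !
    router eigrp 100
     ! the network statement covers 20.1.1.0, so the static route above is
     ! advertised by EIGRP even though nothing is redistributed
     network 20.1.1.0 0.0.0.255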

 

This static connected route is picked up by EIGRP (note that we have NOT redistributed the route into EIGRP) and propagated throughout the network.

On R11 we see that EIGRP receives the advertised static route with a better feasible distance and hence adds it to the routing table.

As a result, even though a path to the loopback address 20.1.1.10 is available through the 16.1.1.x subnet, R11 chooses to go through R10, effectively creating a black hole for all traffic towards the 20.1.1.x subnet.

This caught me off guard; hopefully you will fare better with this pitfall. Thanks for reading on, and stay tuned for more articles.

June 18, 2010 / A network engineer's diary..

Cisco MDIX – a feature that can bring down ports without warning.


It’s a well-known CCNA fact that you need a straight-through cable to connect unlike devices and a crossover cable to connect like ones, so undoubtedly a switch-to-switch connection calls for a crossover cable. But I have met a lot of network engineers who believe this no longer really matters, because the switch has an auto-detection capability that can swap the TX and RX pins on its port to correct itself should it detect a straight-through cable plugged in instead of a crossover. Well, that’s true, until you find out that auto-MDIX (Media Dependent Interface Crossover) works only while the speed and duplex settings on the port are left in auto mode. The moment you hard-code the speed or duplex to a particular value, the auto-MDIX feature is switched off and the port goes down. Refer to the figure below that illustrates this in action (note that I have connected two switches by creating a dot1q trunk port).
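
A sketch of the kind of trunk-port configuration that triggers this; the interface name is a placeholder, and the mdix auto command in the comment is the explicit knob on platforms that support it:

    interface FastEthernet0/24
     switchport trunk encapsulation dot1q
     switchport mode trunk
     ! hard-coding either of the next two lines disables auto-MDIX on this port;
     ! with a straight-through cable in place, the link then goes down
     speed 100
     duplex full
     ! mdix auto   <- the feature is only operational while speed and duplex are auto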

 


Worse yet, the switch does not even throw an error warning you that it is the auto-MDIX feature that has stopped operating and is taking the port down. If you are like me, you would probably start by checking the trunk encapsulation and the dot1q native VLAN configuration before arriving at this one, which makes this auto feature a ‘good to know’. Thanks for reading on. Please stay tuned for more articles.