Redundancy

There are two redundancy stack modes.

  • N+1 redundancy stack mode: Any switch can be active or standby.
  • 1+1 redundancy stack mode: Specific switches are assigned the active and standby roles, and all remaining switches get the member role. If the active switch reboots, the standby switch becomes the active one, and vice versa. The members never change role.

1+1 Redundancy

Prerequisites

  • All the switches in the stack must be running the same license level as the active switch.
  • All the switches in the stack must be running compatible software versions.

Enable 1+1 Mode

Assigning the active and standby roles to specific switches enables the 1+1 stack mode.

Device# switch 1 role active
Device# switch 2 role standby

WARNING: Changing the switch role may result in redundancy mode being configured to designated Active/Standby mode for this stack. If the configured Active or Standby switch numbers do not boot up, then the stack will not be able to boot. Do you want to continue?[y/n]? [yes]:

Device# reload

After the reload, the stack mode looks as follows. If Switch 2 boots before Switch 1, Switch 2 takes the active role, so the active and standby roles are interchangeable in 1+1 mode. The member role doesn't change.

Device# show switch stack-mode 
Switch#   Role      Mac Address      Version   Mode   Configured   State
--------------------------------------------------------------------------
1         Active    6cdd.30dc.fd80   V01       1+1    Active       Ready
*2        Standby   b4a8.b9c0.af80   V01       1+1    Standby      Ready
3         Member    7486.0bc5.7c00   V01       1+1    Member       Ready
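If you want to check the roles from a script instead of by eye, the output above is regular enough to parse. The following is a minimal Python sketch, not a Cisco tool: the `StackModeRow` type and `parse_stack_mode` function are names I made up for illustration, and the regular expression only assumes the column order shown in this post.

```python
import re
from typing import NamedTuple


class StackModeRow(NamedTuple):
    number: int
    starred: bool      # True when the row is prefixed with '*' in the output
    role: str
    mac: str
    version: str
    mode: str
    configured: str
    state: str


# Columns: Switch#, Role, Mac Address, Version, Mode, Configured, State.
# State may contain spaces (e.g. "HA sync in progress"), so it is matched last.
ROW = re.compile(
    r"^(\*?)(\d+)\s+(\S+)\s+([0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(.+?)\s*$"
)


def parse_stack_mode(text: str) -> list[StackModeRow]:
    """Parse the data rows of 'show switch stack-mode'; other lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            star, num, role, mac, ver, mode, conf, state = match.groups()
            rows.append(
                StackModeRow(int(num), star == "*", role, mac, ver, mode, conf, state)
            )
    return rows


SAMPLE = """\
Switch# Role Mac Address Version Mode Configured State
1 Active 6cdd.30dc.fd80 V01 1+1 Active Ready
*2 Standby b4a8.b9c0.af80 V01 1+1 Standby Ready
3 Member 7486.0bc5.7c00 V01 1+1 Member Ready
"""

rows = parse_stack_mode(SAMPLE)
```

With the rows parsed, a check such as `all(r.mode == "1+1" for r in rows)` confirms that every switch reports the expected mode.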

Switch Back to N+1 Mode

Clearing the stack mode changes the stack back to N+1 mode. The change takes effect after the reload.

Device# switch clear stack-mode 

WARNING: Clearing the chassis HA configuration will result in both the chassis move into Stand Alone mode. This involves reloading the standby chassis after clearing its HA configuration and coming up with all interfaces in shutdown mode. Do you wish to continue? [y/n]? [yes]:

Device# show switch stack-mode
Switch#   Role      Mac Address      Version   Mode   Configured   State
--------------------------------------------------------------------------
1         Standby   6cdd.30dc.fd80   V01       1+1'   None'        Ready
*2        Active    b4a8.b9c0.af80   V01       1+1'   None'        Ready
3         Member    7486.0bc5.7c00   V01       1+1'   None'        Ready

Device# reload

Form Switch Stack and Role Selection

Note: Config mode is available only after the role selection is done. Until then, the terminal shows "Config mode locked out until standby initializes or when switch is in recovery mode".

The following examples use a 3-switch stack in N+1 mode.

Power Up Switches One by One

Note: Connect a powered-off switch to a running switch only after the running switch has been up for 2 minutes, then power on the new switch. This way the new switch doesn't compete in the role selection.

Add the 2nd switch to the 1st switch. The 1st switch detects the status change on its stack port, and the stack starts to initialize the 2nd switch as a member.

*Jul 10 08:01:11.215: %STACKMGR-6-STACK_LINK_CHANGE: Switch 1 R0/0: stack_mgr: Stack port 2 on Switch 1 is up
Switch#   Role      Mac Address      Priority   H/W Version   Current State
---------------------------------------------------------------------------
*1        Active    6cdd.30dc.fd80   15         V01           Ready
2         Member    b4a8.b9c0.af80   15                       Initializing

After the initialization is done, the 2nd switch is added to the stack. The sync to the 2nd switch is in progress.

*Jul 10 08:03:14.236: %STACKMGR-4-SWITCH_ADDED: Switch 1 R0/0: stack_mgr: Switch 2 has been added to the stack.
Applying config on Switch 2...[DONE]

The platform starts to add the components of the 2nd switch. The 2nd switch obtains the lowest available switch number, and the stack selects it as the standby. The sync is still in progress.

*Jul 10 08:05:17.433: %IOSXE_REDUNDANCY-6-PEER: Active detected switch 2 as standby.
*Jul 10 08:05:17.430: %STACKMGR-6-STANDBY_ELECTED: Switch 1 R0/0: stack_mgr: Switch 2 has been elected STANDBY.
*Jul 10 08:05:22.469: %REDUNDANCY-5-PEER_MONITOR_EVENT: Active detected a standby insertion (raw-event=PEER_FOUND(4))
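To follow these steps in a long console log, it helps to pull out only the stack manager events. Below is a small Python sketch that does this with a regular expression. The `stack_events` name is my own, and the pattern assumes the exact `%STACKMGR-<severity>-<mnemonic>: Switch N R0/0: stack_mgr: ...` layout seen in the messages in this post; other releases may format them differently.

```python
import re

# Matches the STACKMGR messages as they appear in this post.
STACKMGR = re.compile(
    r"%STACKMGR-(?P<severity>\d)-(?P<mnemonic>[A-Z_]+): "
    r"Switch (?P<reporter>\d+) R0/0: stack_mgr: (?P<text>.*)$"
)


def stack_events(log: str) -> list[tuple[str, str]]:
    """Return (mnemonic, message text) for every STACKMGR line in the log."""
    events = []
    for line in log.splitlines():
        match = STACKMGR.search(line)
        if match:
            events.append((match["mnemonic"], match["text"].strip()))
    return events


LOG = """\
*Jul 10 08:01:11.215: %STACKMGR-6-STACK_LINK_CHANGE: Switch 1 R0/0: stack_mgr: Stack port 2 on Switch 1 is up
*Jul 10 08:03:14.236: %STACKMGR-4-SWITCH_ADDED: Switch 1 R0/0: stack_mgr: Switch 2 has been added to the stack.
*Jul 10 08:05:17.433: %IOSXE_REDUNDANCY-6-PEER: Active detected switch 2 as standby.
*Jul 10 08:05:17.430: %STACKMGR-6-STANDBY_ELECTED: Switch 1 R0/0: stack_mgr: Switch 2 has been elected STANDBY.
"""

events = stack_events(LOG)
```

The three mnemonics come out in the order of the steps above: link change, switch added, standby elected.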

Device# show switch stack-mode
Switch#   Role      Mac Address      Version   Mode   Configured   State
----------------------------------------------------------------------------------------
*1        Active    6cdd.30dc.fd80   V01       N+1    None         Ready
2         Standby   b4a8.b9c0.af80   V01       N+1    None         HA sync in progress

*Jul 10 08:06:49.145: %PLATFORM-4-ELEMENT_WARNING: Switch 1 R0/0: smand: 2/RP/0: limited space - copy corefiles/switch-reports out of flash:core & crashinfo: directories. crashinfo: value 88% (1420 MB) exceeds warning level 80% (1290 MB).
*Jul 10 08:10:09.147: %PLATFORM-4-ELEMENT_WARNING: Switch 2 R0/0: smand: 2/RP/0: limited space - copy corefiles/switch-reports out of flash:core & crashinfo: directories. crashinfo: value 88% (1420 MB) exceeds warning level 80% (1290 MB).

When the sync is done, the status is as follows. In this stack the switch priority is 15, which is the highest value (Cisco documents 1 as the factory default).

Device# show switch 
Switch/Stack Mac Address : 6cdd.30dc.fd80 - Local Mac Address
Mac persistency wait time: Indefinite
Switch#   Role      Mac Address      Priority   H/W Version   Current State
---------------------------------------------------------------------------
*1        Active    6cdd.30dc.fd80   15         V01           Ready
2         Standby   b4a8.b9c0.af80   15         V01           Ready

The priority can be adjusted manually, but it is not stored in the running-config. Switches that are added later only obtain the member role.

Power Up All Switches

Because all switches have the same priority (15 in this stack), the switch with the lowest MAC address takes the active role and the one with the second-lowest MAC address takes the standby role. When the stack reloads, each switch prints a different message.

Device1# Chassis 1 reloading, reason - Reload command
Device3# Chassis 3 reloading, reason - Reload command
Device2# All switches in the stack have been discovered. Accelerating discovery

Switch/Stack Mac Address : 6cdd.30dc.fd80 - Local Mac Address
Mac persistency wait time: Indefinite
Switch#   Role      Mac Address      Priority   H/W Version   Current State
---------------------------------------------------------------------------
*1        Active    6cdd.30dc.fd80   15         V01           Ready
2         Member    b4a8.b9c0.af80   15         V01           Ready
3         Standby   7486.0bc5.7c00   15         V01           Ready
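The role selection described in this section can be sketched in a few lines of Python. This is a simplified model that covers only the two criteria mentioned here (higher priority first, then lower MAC address); the real election on the switch may consider further criteria that are not modeled. `Switch`, `mac_value` and `elect` are illustrative names, not anything from IOS XE.

```python
from typing import NamedTuple


class Switch(NamedTuple):
    number: int
    mac: str        # dotted notation as printed by 'show switch'
    priority: int


def mac_value(mac: str) -> int:
    """Turn '6cdd.30dc.fd80' into an integer so MAC addresses can be compared."""
    return int(mac.replace(".", ""), 16)


def elect(switches: list[Switch]) -> dict[int, str]:
    """Assign roles: highest priority wins, ties go to the lowest MAC address."""
    ranked = sorted(switches, key=lambda s: (-s.priority, mac_value(s.mac)))
    roles = {s.number: "Member" for s in switches}
    if ranked:
        roles[ranked[0].number] = "Active"
    if len(ranked) > 1:
        roles[ranked[1].number] = "Standby"
    return roles


stack = [
    Switch(1, "6cdd.30dc.fd80", 15),
    Switch(2, "b4a8.b9c0.af80", 15),
    Switch(3, "7486.0bc5.7c00", 15),
]
print(elect(stack))  # {1: 'Active', 2: 'Member', 3: 'Standby'}
```

The result matches the output above: 6cdd... is the lowest MAC address, 7486... the second lowest, so Switch 3 rather than Switch 2 becomes the standby.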

Break Stack and Restore Standalone Mode

  1. Power off the switch to be removed.
  2. Remove the StackWise Cable.
  3. Issue the no switch X provision command to remove the switch from the stack provision.

After the switch is powered off, the switch status in show switch changes from Ready to Removed. After removing the provision command from the running config, the switch status in show switch changes from Removed to Unprovisioned.
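The status changes above form a tiny state machine. Here is a Python sketch of it, assuming only the two transitions described in this post; the event names `power_off` and `no_provision` are labels I invented for the two actions.

```python
from enum import Enum


class State(Enum):
    READY = "Ready"
    REMOVED = "Removed"
    UNPROVISIONED = "Unprovisioned"


# (current state, event) -> next state, as observed in 'show switch'.
TRANSITIONS = {
    (State.READY, "power_off"): State.REMOVED,
    (State.REMOVED, "no_provision"): State.UNPROVISIONED,
}


def next_state(state: State, event: str) -> State:
    """Return the new state, or the unchanged state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


state = State.READY
for event in ("power_off", "no_provision"):
    state = next_state(state, event)
print(state.value)  # Unprovisioned
```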

All the switches keep the same config and switch number after the stack is broken. If they are stacked again in the same order, the stack config won't change and the interface numbers remain the same, e.g. Te2/0/1. The switch X provision MODEL command is generated automatically when stacking happens.

However, if the switches need to be restored to standalone mode, issue the no switch X provision command on the non-active switches as well as switch X renumber 1. The interface numbers then revert to the default, e.g. Te1/0/1.
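The first number in an interface name is the switch number, which is why renumbering changes Te2/0/1 to Te1/0/1. The Python sketch below shows that mapping. `renumber_interface` is a hypothetical helper, and the pattern assumes the three-part `<type><switch>/<module>/<port>` naming used in this post.

```python
import re

# <type><switch>/<module>/<port>, e.g. Te2/0/1 or Gi4/0/1
IFACE = re.compile(r"^([A-Za-z]+)(\d+)/(\d+)/(\d+)$")


def renumber_interface(name: str, old: int, new: int) -> str:
    """Rewrite the switch number of an interface name if it equals `old`."""
    match = IFACE.match(name)
    if not match:
        raise ValueError(f"unexpected interface name: {name!r}")
    prefix, switch, module, port = match.groups()
    if int(switch) != old:
        return name  # interface belongs to another switch, leave it alone
    return f"{prefix}{new}/{module}/{port}"


print(renumber_interface("Te2/0/1", old=2, new=1))  # Te1/0/1
```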

Corner Cases

Removing Active and Standby Switches in 1+1 Mode

If both the active and the standby switch are removed, the remaining member switches get stuck in an endless reboot loop.

First, remove the active Switch 1. The standby Switch 2 takes the active role. The member switch remains the same.

Device# show switch 
Switch/Stack Mac Address : b4a8.b9c0.af80 - Local Mac Address
Mac persistency wait time: Indefinite
Switch#   Role      Mac Address      Priority   H/W Version   Current State
---------------------------------------------------------------------------
1         Member    0000.0000.0000   0          V01           Removed
*2        Active    b4a8.b9c0.af80   14         V01           Ready
3         Member    7486.0bc5.7c00   13         V01           Ready

After the now-active Switch 2 is removed as well, Switch 3 reboots and keeps rebooting endlessly.

Chassis 3 reloading, reason - lost both active and standby

Current ROMMON image : Primary
Last reset cause : SoftwareReload
C9300-48U platform with 8388608 Kbytes of main memory

switch: boot flash:packages.conf
boot: attempting to boot from [flash:packages.conf]
boot: reading file packages.conf
################################################################
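The behavior in this corner case can be summed up in a small Python model. It is only a sketch of what I observed in this test, not of how the stack manager is implemented: the standby takes over when the active switch leaves, members are never promoted, and once neither designated switch is left the members reload. `remove_switch` and the "Reloading" label are my own names.

```python
def remove_switch(roles: dict[int, str], removed: int) -> dict[int, str]:
    """Return the roles of the remaining switches after one leaves a 1+1 stack."""
    remaining = {n: r for n, r in roles.items() if n != removed}
    if roles.get(removed) == "Active":
        for number, role in list(remaining.items()):
            if role == "Standby":
                remaining[number] = "Active"  # the standby takes over
    # Members are never promoted in 1+1 mode. Without an active switch
    # they reload ("lost both active and standby").
    if "Active" not in remaining.values():
        remaining = {number: "Reloading" for number in remaining}
    return remaining


stack = {1: "Active", 2: "Standby", 3: "Member"}
step1 = remove_switch(stack, 1)
print(step1)  # {2: 'Active', 3: 'Member'}
step2 = remove_switch(step1, 2)
print(step2)  # {3: 'Reloading'}
```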

Stuck in ROMMON Mode after Reload

Even after I issue reset in ROMMON, the switch still drops into ROMMON mode after the reboot.

switch: reset
Do you wish to reset? [y] : y
Please wait while the system restarts.

Initializing Hardware......

System Bootstrap, Version 17.9.2r, RELEASE SOFTWARE (P)
Compiled Wed 11/23/2022 12:30:48.96 by rel

Current ROMMON image : Primary
Last reset cause : SoftwareReload
C9300-24T platform with 8388608 Kbytes of main memory

switch: set
ABNORMAL_RESET_COUNT=0
AUTOREBOOT_RESTORE=0
AUTO_SWITCH_CONSOLE_DISABLE=0
BAUD=9600
BOARDID=20568
BOOT=flash:packages.conf;
BOOTLDR=
BSI=0
Boot=cat9k_iosxe.17.06.03.SPA.bin
CALL_HOME_DEBUG=0000000000000
......
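Note that the listing contains both BOOT and Boot. When comparing variables across switches, I find it easier to load the `set` output into a dictionary. The Python sketch below does that with case-sensitive keys so the two entries stay separate. `parse_rommon_vars` and `boot_targets` are illustrative names, and treating BOOT as a semicolon-separated list is my assumption based on the trailing `;` in the output.

```python
def parse_rommon_vars(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines from ROMMON 'set' output; other lines are ignored."""
    variables = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:
            variables[key] = value  # keys stay case-sensitive: BOOT != Boot
    return variables


def boot_targets(variables: dict[str, str]) -> list[str]:
    """Split BOOT into its entries (assumed to be separated by ';')."""
    return [entry for entry in variables.get("BOOT", "").split(";") if entry]


SET_OUTPUT = """\
ABNORMAL_RESET_COUNT=0
BAUD=9600
BOOT=flash:packages.conf;
BOOTLDR=
Boot=cat9k_iosxe.17.06.03.SPA.bin
......
"""

variables = parse_rommon_vars(SET_OUTPUT)
print(boot_targets(variables))  # ['flash:packages.conf']
```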

Booting manually from the packages.conf file solves the issue.

switch: boot flash:packages.conf
boot: attempting to boot from [flash:packages.conf]
boot: reading file packages.conf
#############################################################

The same issue occurs after the next reload, and the following message is shown in the terminal. After I enable boot manual, the warning message is gone, but the switch still drops into ROMMON mode after the reboot. I power it off and on and boot from packages.conf. Finally the switch is up and works normally. Back in the stack, I configure no boot manual again.

Device# reload
AUTOBOOT: First boot statement uses file flash:packages.conf
WARNING: Missing MANUAL_BOOT romvar or manual boot enabled on Switch 1
Please correct boot statement or copy image to flash:.

Device(config)#boot manual

Device# reload
Reload command is being issued on Active unit, this will reload the whole stack
Proceed with reload? [confirm]

Switch Numbers Are Messed Up

After switching back to N+1 mode, one switch is stuck in ROMMON mode. After I fix that issue, the switch numbers are messed up and no longer match the physical order of the switches. If I renumber the switches from 1,2,3 to 4,5,6, the stack doesn't remove the old members and their interface config after the reload.

Stack1#show switch 
Switch/Stack Mac Address : 6cdd.30dc.fd80 - Local Mac Address
Mac persistency wait time: Indefinite
Switch#   Role      Mac Address      Priority   H/W Version   Current State
---------------------------------------------------------------------------
1         Member    0000.0000.0000   0                        Provisioned
2         Member    0000.0000.0000   0                        Provisioned
3         Member    0000.0000.0000   0                        Provisioned
4         Standby   7486.0bc5.7c00   15         V01           Ready
5         Member    b4a8.b9c0.af80   15         V01           Ready
*6        Active    6cdd.30dc.fd80   15         V01           Ready

After the provisioning of switches 1, 2 and 3 is removed, the interfaces and their config are gone.

Device(config)#no switch 4 provision 
......

Device#show switch
Switch/Stack Mac Address : 6cdd.30dc.fd80 - Local Mac Address
Mac persistency wait time: Indefinite
Switch#   Role      Mac Address      Priority   H/W Version   Current State
---------------------------------------------------------------------------
1         Member    0000.0000.0000   0                        Unprovisioned
2         Member    0000.0000.0000   0                        Unprovisioned
3         Member    0000.0000.0000   0                        Unprovisioned
4         Standby   7486.0bc5.7c00   14         V01           Ready
5         Member    b4a8.b9c0.af80   15         V01           Ready
*6        Active    6cdd.30dc.fd80   13         V01           Ready

Device(config-if)#int gi4/0/1
^
% Invalid input detected at '^' marker.

Renumber the switches back to 1,2,3 and reload. Then remove the provisioning of switches 4, 5 and 6. After that, the stack is back to normal.
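The cleanup can be written down as an ordered command list. The Python sketch below only builds that list as text; it does not talk to a switch. The `renumber_plan` helper is hypothetical, and the 6→1, 5→2, 4→3 mapping is my reading of the MAC addresses in the outputs above (6cdd... was Switch 1, b4a8... Switch 2, 7486... Switch 3). The commands themselves are the ones used in this post: switch X renumber Y, reload and no switch X provision.

```python
def renumber_plan(mapping: dict[int, int]) -> list[str]:
    """Build the command sequence for renumbering: renumber, reload, unprovision."""
    changes = {old: new for old, new in mapping.items() if old != new}
    if not changes:
        return []
    plan = [f"switch {old} renumber {new}" for old, new in sorted(changes.items())]
    plan.append("reload")
    # Old numbers that are not reused stay behind as provisioned entries.
    stale = sorted(set(changes) - set(mapping.values()))
    plan += [f"no switch {old} provision" for old in stale]
    return plan


for command in renumber_plan({6: 1, 5: 2, 4: 3}):
    print(command)
```

Keep in mind that the renumber and reload commands are entered in exec mode, while no switch X provision belongs in config mode.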
