Because of bug CSCuj67614 / CSCub63571, I had to perform a software update on a Cisco 4500X VSS cluster today. The bug concerns the following problem:
The version the customer is running is the last release still affected by it.
cat4500e-universalk9.SPA.03.04.02.SG.151-2.SG2.bin
A look at the download options shows that there is a current suggested release within the 3.4 train.
cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin
The upgrade procedure is fairly simple.
- Upload the files
- Adjust the boot variable
- Reload each chassis individually
Still, there are one or two things you absolutely must keep in mind. What follows is a guide to upgrading the 4500X VSS cluster.
Load the files into the bootflash of the active 4500X:
4500x# copy tftp bootflash:
Address or name of remote host []? x.x.x.x
Source filename []? cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin
Destination filename [cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin]?
Accessing tftp://x.x.x.x/cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin...
Loading cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin from x.x.x.x (via Vlan203): !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[OK - 119576292 bytes]
126089600 bytes copied in 346.999 secs (335324 bytes/sec)
Copy the files from the bootflash of the active 4500X to the slavebootflash of the standby 4500X:
4500x# copy bootflash: slavebootflash:
Source filename []? cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin
Destination filename [cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin]?
Copy in progress...CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
126089600 bytes copied in 103.554 secs (1302482 bytes/sec)
Verify that the files are present in both bootflashes:
4500x#dir bootflash: | in cat
96963  -rw-   125276868   Oct 7 2013 11:02:25 +02:00  cat4500e-universalk9.SPA.03.04.02.SG.151-2.SG2.bin
97005  -rw-   126089600  Apr 25 2015 20:02:29 +02:00  cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin
4500x#dir slavebootflash: | in cat
96963  -rw-   125276868   Oct 7 2013 11:02:25 +02:00  cat4500e-universalk9.SPA.03.04.02.SG.151-2.SG2.bin
97005  -rw-   126089600  Apr 25 2015 20:02:29 +02:00  cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin
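Comparing the two listings by eye works, but it can also be scripted. The following is a minimal sketch in Python, assuming the output of both `dir` commands has been saved as text. The helper name `image_sizes` is hypothetical and not part of any Cisco tooling. Matching sizes do not prove the copy is intact, they only catch an obviously truncated file.

```python
import re

# One line of IOS `dir` output: index, permissions, size, date/time, filename.
DIR_LINE = re.compile(r"^\s*\d+\s+\S+\s+(\d+)\s+.*?(\S+\.bin)\s*$")

def image_sizes(dir_output):
    """Return {filename: size_in_bytes} for every .bin file in a `dir` listing."""
    sizes = {}
    for line in dir_output.splitlines():
        match = DIR_LINE.match(line)
        if match:
            sizes[match.group(2)] = int(match.group(1))
    return sizes

# Listing as shown above; both flashes returned identical entries.
active = """\
96963 -rw- 125276868 Oct 7 2013 11:02:25 +02:00 cat4500e-universalk9.SPA.03.04.02.SG.151-2.SG2.bin
97005 -rw- 126089600 Apr 25 2015 20:02:29 +02:00 cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin
"""
standby = active

new_image = "cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin"
a, s = image_sizes(active), image_sizes(standby)
same = new_image in a and a.get(new_image) == s.get(new_image)
print("new image present with identical size on both chassis:", same)
```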
Check the boot variable:
4500x#show bootvar
BOOT variable = flash:03.04.02.SG.151-2.SG2.bin,12;
CONFIG_FILE variable does not exist
BOOTLDR variable does not exist
Configuration register is 0x2101
Standby BOOT variable = flash:03.04.02.SG.151-2.SG2.bin,12;
Standby CONFIG_FILE variable does not exist
Standby BOOTLDR variable does not exist
Standby Configuration register is 0x2101
Two things need to be changed here:
- Boot variable to cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin
- Config register to 0x2102
By default the config register is set to 0x2101. That is bad, because this setting ignores the configured boot variable and simply boots the first file it finds in flash. In theory the upgrade would still work if you renamed the files accordingly or deleted the old file beforehand, but changing the register is the cleaner solution.
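The difference between the two register values comes down to the boot field. As a small illustration in Python, assuming the usual IOS convention that the lowest four bits of the configuration register form the boot field (0 = stay in ROMMON, 1 = first image in flash, 2 to F = honour the BOOT variable), the two values decode as follows. The function names are made up for this sketch.

```python
def boot_field(config_register):
    """Return the boot field, i.e. the lowest four bits of the register."""
    return config_register & 0xF

def describe(config_register):
    """Describe the boot behaviour implied by the boot field (assumed convention)."""
    field = boot_field(config_register)
    if field == 0x0:
        return "stay in ROMMON"
    if field == 0x1:
        return "boot the first image found in flash, BOOT variable ignored"
    return "boot the image named in the BOOT variable"

for reg in (0x2101, 0x2102):
    print(hex(reg), "->", describe(reg))
```

Only the last hex digit differs between 0x2101 and 0x2102, so the other register settings stay the same.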
4500x(config)#config-register 0x2102
4500x(config)#no boot system
4500x(config)#boot system flash bootflash:cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin
4500x(config)#do write memory
Building configuration...
Compressed configuration from 21832 bytes to 9355 bytes[OK]
A "write memory" at this point is important, because only then does the new boot variable show up in "show bootvar":
4500x#show bootvar
BOOT variable = bootflash:cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin,1;
CONFIG_FILE variable does not exist
BOOTLDR variable does not exist
Configuration register is 0x2101 (will be 0x2102 at next reload)
Standby BOOT variable = bootflash:cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin,1;
Standby CONFIG_FILE variable does not exist
Standby BOOTLDR variable does not exist
Standby Configuration register is 0x2101 (will be 0x2102 at next reload)
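Both the active and the standby entry have to show the new image and the pending register value. A hedged sketch of that check in Python, assuming the "show bootvar" output was captured as text; `bootvar_ok` is a hypothetical helper name and the patterns are only fitted to the output shown above.

```python
import re

BOOT_RE = re.compile(r"BOOT variable = (\S+)")
# Captures the current register and, if present, the value pending for the next reload.
REG_RE = re.compile(
    r"Configuration register is (0x[0-9A-Fa-f]+)"
    r"(?: \(will be (0x[0-9A-Fa-f]+) at next reload\))?"
)

def bootvar_ok(output, image, register="0x2102"):
    """True if active and standby both boot `image` with `register` after reload."""
    boot_vars = BOOT_RE.findall(output)
    registers = [pending or current for current, pending in REG_RE.findall(output)]
    return (
        len(boot_vars) == 2
        and all(image in var for var in boot_vars)
        and len(registers) == 2
        and all(reg.lower() == register for reg in registers)
    )

image = "cat4500e-universalk9.SPA.03.04.05.SG.151-2.SG5.bin"
sample = f"""\
BOOT variable = bootflash:{image},1;
Configuration register is 0x2101 (will be 0x2102 at next reload)
Standby BOOT variable = bootflash:{image},1;
Standby Configuration register is 0x2101 (will be 0x2102 at next reload)
"""
print("safe to reload:", bootvar_ok(sample, image))
```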
The prerequisites for a reload are now met. To be on the safe side, check the VSS status once more:
4500x#show redundancy states
my state = 13 -ACTIVE
peer state = 8 -STANDBY HOT
Mode = Duplex
Unit = Primary
Unit ID = 1
Redundancy Mode (Operational) = Stateful Switchover
Redundancy Mode (Configured) = Stateful Switchover
Redundancy State = Stateful Switchover
Manual Swact = enabled
Communications = Up
client count = 77
client_notification_TMR = 240000 milliseconds
keep_alive TMR = 9000 milliseconds
keep_alive count = 0
keep_alive threshold = 18
RF debug mask = 0
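The lines that matter before reloading a chassis are "my state", "peer state" and "Communications". As a rough sketch, assuming the output has been captured as text with one field per line, the readiness check could look like this in Python. The function name `ready_for_peer_reload` is hypothetical.

```python
import re

def ready_for_peer_reload(output):
    """True if this unit is ACTIVE, the peer is STANDBY HOT and communication is up."""
    mine = re.search(r"my state\s*=\s*\d+\s*-\s*(.+)", output)
    peer = re.search(r"peer state\s*=\s*\d+\s*-\s*(.+)", output)
    comm = re.search(r"Communications\s*=\s*(\S+)", output)
    return bool(
        mine and peer and comm
        and mine.group(1).strip() == "ACTIVE"
        and peer.group(1).strip() == "STANDBY HOT"
        and comm.group(1) == "Up"
    )

states = """\
my state = 13 -ACTIVE
peer state = 8 -STANDBY HOT
Mode = Duplex
Communications = Up
"""
print("VSS healthy before reload:", ready_for_peer_reload(states))
```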
In my case I reloaded the standby 4500X first:
4500x#redundancy reload peer
Reload peer [confirm]
The switch now reloads. Provided the topology is built redundantly with MEC (Multichassis EtherChannel), there should be no noticeable interruption.
You can follow the boot process with show module:
4500x#show module
Switch Number: 1 Role: Virtual Switch Standby

Chassis Type : WS-C4500X-16
Power consumed by backplane : 0 Watts

Mod Ports Card Type                              Model              Serial No.
---+-----+--------------------------------------+------------------+-----------
 1     16 4500X-16 10GE (SFP+)
 2      8 10GE SFP+

 M MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 f0f7.55c4.xxxx to f0f7.55c4.xxxx                                   Provision
 2 d072.dc91.xxxx to d072.dc91.xxxx                                   Provision

Mod  Redundancy role     Operating mode      Redundancy status
----+-------------------+-------------------+----------------------------------
 1   Standby Supervisor  SSO                 Disabled

Switch Number: 2 Role: Virtual Switch Active

Chassis Type : WS-C4500X-16
Power consumed by backplane : 0 Watts

Mod Ports Card Type                              Model              Serial No.
---+-----+--------------------------------------+------------------+-----------
 1     16 4500X-16 10GE (SFP+)                   WS-C4500X-16       JAE173xxxx
 2      8 10GE SFP+                              C4KX-NM-8          JAE171xxxx

 M MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 c08c.60e0.xxxx to c08c.60e0.27cf 1.1 15.0(1r)SG10 03.04.02.SG      Ok
 2 4c4e.358d.xxxx to 4c4e.358d.1c87 1.0                               Ok

Mod  Redundancy role     Operating mode      Redundancy status
----+-------------------+-------------------+----------------------------------
After a while the first signs of life from the standby switch should appear:
Apr 26 09:06:25: %C4K_IOSINTF-5-LMPHWSESSIONSTATE: Lmp HW session UP on slot 11 port 2.
Apr 26 09:06:25: %C4K_IOSINTF-5-LMPHWSESSIONSTATE: Lmp HW session UP on slot 11 port 1.
Apr 26 09:06:25: %C4K_IOSINTF-5-LMPHWSESSIONSTATE: Lmp HW session DOWN on slot 11 port 1.
Apr 26 09:06:26: %C4K_IOSINTF-5-LMPHWSESSIONSTATE: Lmp HW session UP on slot 11 port 1.
Apr 26 09:06:41: %VSLP-5-VSL_UP: Ready for control traffic
Apr 26 09:06:45: %VSLP-5-RRP_ROLE_RESOLVED: Role resolved as ACTIVE by VSLP
Apr 26 09:06:45: %EC-5-BUNDLE: Interface TenGigabitEthernet2/1/1 joined port-channel Port-channel20
Apr 26 09:06:45: %EC-5-BUNDLE: Interface TenGigabitEthernet2/1/2 joined port-channel Port-channel20
Apr 26 09:06:45: %C4K_REDUNDANCY-6-DUPLEX_MODE: The peer Supervisor has been detected
Apr 26 09:07:09: %SW_LEVEL-6-RESULT: Operational redundancy mode is UNKNOWN, due to software license-level mismatch at ACTIVE and STANDBY. Software Level on Active: entservices; on Standby: entservices.
Apr 26 09:07:23: %C4K_REDUNDANCY-6-MODE: ACTIVE supervisor initializing for sso mode
Apr 26 09:07:23: %C4K_REDUNDANCY-3-COMMUNICATION: Communication with the peer Supervisor has been established
Apr 26 09:07:34: %C4K_REDUNDANCY-5-CONFIGSYNC: The bootvar has been successfully synchronized to the standby supervisor
Apr 26 09:07:34: %C4K_REDUNDANCY-5-CONFIGSYNC: The config-reg has been successfully synchronized to the standby supervisor
Apr 26 09:07:34: %C4K_REDUNDANCY-5-CONFIGSYNC: The startup-config has been successfully synchronized to the standby supervisor
Apr 26 09:07:35: %C4K_REDUNDANCY-5-CONFIGSYNC: The private-config has been successfully synchronized to the standby supervisor
Apr 26 09:07:36: %C4K_REDUNDANCY-5-CONFIGSYNC_RATELIMIT: The vlan database has been successfully synchronized to the standby supervisor
The license message was somewhat unsettling, but it disappeared again without causing any problems.
Another look at show module shows that the reload went smoothly:
4500x#show module
Switch Number: 1 Role: Virtual Switch Standby

Chassis Type : WS-C4500X-16
Power consumed by backplane : 0 Watts

Mod Ports Card Type                              Model              Serial No.
---+-----+--------------------------------------+------------------+-----------
 1     16 4500X-16 10GE (SFP+)                   WS-C4500X-16       JAE1733xxxx
 2      8 10GE SFP+                              C4KX-NM-8          JAE1714xxxx

 M MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 f0f7.55c4.xxxx to f0f7.55c4.xxxx 1.1 15.0(1r)SG10 03.04.05.SG      Ok
 2 d072.dc91.xxxx to d072.dc91.xxxx 1.0                               Ok

Mod  Redundancy role     Operating mode      Redundancy status
----+-------------------+-------------------+----------------------------------
 1   Standby Supervisor  SSO                 Standby hot

Switch Number: 2 Role: Virtual Switch Active

Chassis Type : WS-C4500X-16
Power consumed by backplane : 0 Watts

Mod Ports Card Type                              Model              Serial No.
---+-----+--------------------------------------+------------------+-----------
 1     16 4500X-16 10GE (SFP+)                   WS-C4500X-16       JAE173xxxx
 2      8 10GE SFP+                              C4KX-NM-8          JAE171xxxx

 M MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 c08c.60e0.xxxx to c08c.60e0.xxxx 1.1 15.0(1r)SG10 03.04.02.SG      Ok
 2 4c4e.358d.xxxx to 4c4e.358d.xxxx 1.0                               Ok

Mod  Redundancy role     Operating mode      Redundancy status
----+-------------------+-------------------+----------------------------------
 1   Active Supervisor   SSO                 Active
A show switch virtual likewise indicates that the VSS is fully functional again:
4500x#show switch virtual

Executing the command on VSS member switch role = VSS Active, id = 2

Switch mode                  : Virtual Switch
Virtual switch domain number : 100
Local switch number          : 2
Local switch operational role: Virtual Switch Active
Peer switch number           : 1
Peer switch operational role : Virtual Switch Standby

Executing the command on VSS member switch role = VSS Standby, id = 1

Switch mode                  : Virtual Switch
Virtual switch domain number : 100
Local switch number          : 1
Local switch operational role: Virtual Switch Standby
Peer switch number           : 2
Peer switch operational role : Virtual Switch Active
Now you basically just need to repeat the process on the active 4500X. To do so, issue a "redundancy force-switchover". This hands the active role over to the standby and reloads the previously active switch at the same time.
4500x#redundancy force-switchover
This will reload the active unit and force switchover to standby[confirm]
Preparing for switchover..
Again, you can follow this with the "show module" and "show redundancy states" commands mentioned above. Eventually this switch should also show its first signs of life (booting felt like it took about 5 minutes).
Apr 26 09:17:04: %C4K_IOSINTF-5-LMPHWSESSIONSTATE: Lmp HW session UP on slot 1 port 2.
Apr 26 09:17:04: %C4K_IOSINTF-5-LMPHWSESSIONSTATE: Lmp HW session UP on slot 1 port 1.
Apr 26 09:17:20: %VSLP-5-VSL_UP: Ready for control traffic
Apr 26 09:17:25: %VSLP-5-RRP_ROLE_RESOLVED: Role resolved as ACTIVE by VSLP
Apr 26 09:17:25: %EC-5-BUNDLE: Interface TenGigabitEthernet1/1/1 joined port-channel Port-channel10
Apr 26 09:17:25: %EC-5-BUNDLE: Interface TenGigabitEthernet1/1/2 joined port-channel Port-channel10
Apr 26 09:17:25: %C4K_REDUNDANCY-6-DUPLEX_MODE: The peer Supervisor has been detected
Apr 26 09:17:45: %SW_LEVEL-6-RESULT: Operational redundancy mode is UNKNOWN, due to software license-level mismatch at ACTIVE and STANDBY. Software Level on Active: entservices; on Standby: entservices.
Apr 26 09:18:04: %C4K_REDUNDANCY-6-MODE: ACTIVE supervisor initializing for sso mode
Apr 26 09:18:04: %C4K_REDUNDANCY-3-COMMUNICATION: Communication with the peer Supervisor has been established
Apr 26 09:18:14: %C4K_REDUNDANCY-5-CONFIGSYNC: The bootvar has been successfully synchronized to the standby supervisor
Apr 26 09:18:14: %C4K_REDUNDANCY-5-CONFIGSYNC: The config-reg has been successfully synchronized to the standby supervisor
Apr 26 09:18:14: %C4K_REDUNDANCY-5-CALENDAR: The calendar has been successfully synchronized to the standby supervisor for the first time
Apr 26 09:18:14: %C4K_REDUNDANCY-5-CONFIGSYNC: The startup-config has been successfully synchronized to the standby supervisor
Apr 26 09:18:14: %C4K_REDUNDANCY-5-CONFIGSYNC: The private-config has been successfully synchronized to the standby supervisor
Apr 26 09:18:15: %C4K_REDUNDANCY-5-CONFIGSYNC_RATELIMIT: The vlan database has been successfully synchronized to the standby supervisor
In the end you should see the following output:
4500x#show module
Switch Number: 1 Role: Virtual Switch Active

Chassis Type : WS-C4500X-16
Power consumed by backplane : 0 Watts

Mod Ports Card Type                              Model              Serial No.
---+-----+--------------------------------------+------------------+-----------
 1     16 4500X-16 10GE (SFP+)                   WS-C4500X-16       JAE1733xxx
 2      8 10GE SFP+                              C4KX-NM-8          JAE1714xxxx

 M MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 f0f7.55c4.xxx to f0f7.55c4.xxx   1.1 15.0(1r)SG10 03.04.05.SG      Ok
 2 d072.dc91.xxx to d072.dc91.xxxx  1.0                               Ok

Mod  Redundancy role     Operating mode      Redundancy status
----+-------------------+-------------------+----------------------------------
 1   Active Supervisor   SSO                 Active

Switch Number: 2 Role: Virtual Switch Standby

Chassis Type : WS-C4500X-16
Power consumed by backplane : 0 Watts

Mod Ports Card Type                              Model              Serial No.
---+-----+--------------------------------------+------------------+-----------
 1     16 4500X-16 10GE (SFP+)                   WS-C4500X-16       JAE1733xxxx
 2      8 10GE SFP+                              C4KX-NM-8          JAE171xxxx

 M MAC addresses                    Hw  Fw           Sw               Status
--+--------------------------------+---+------------+----------------+---------
 1 c08c.60e0.xxxx to c08c.60e0.xxxx 1.1 15.0(1r)SG10 03.04.05.SG      Ok
 2 4c4e.358d.xxx to 4c4e.358d.xxxx  1.0                               Ok

Mod  Redundancy role     Operating mode      Redundancy status
----+-------------------+-------------------+----------------------------------
 1   Standby Supervisor  SSO                 Standby hot
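The final check is that both chassis report 03.04.05.SG in the Sw column. A minimal sketch of that comparison in Python, assuming the "show module" output was saved as text and that the software version is the token directly in front of the "Ok" status, as in the output above. `versions_by_switch` is a hypothetical helper, and the sample is a shortened excerpt, not full switch output.

```python
import re

def versions_by_switch(output):
    """Map each 'Switch Number' to the software version shown in its Sw column."""
    result = {}
    blocks = re.finditer(
        r"Switch Number:\s*(\d+)(.*?)(?=Switch Number:|\Z)", output, re.S
    )
    for block in blocks:
        version = re.search(r"(\d{2}\.\d{2}\.\d{2}\.[A-Z]+)\s+Ok", block.group(2))
        result[int(block.group(1))] = version.group(1) if version else None
    return result

# Shortened excerpt: only the lines the sketch actually looks at.
excerpt = """\
Switch Number: 1 Role: Virtual Switch Active
 1 f0f7.55c4.xxxx to f0f7.55c4.xxxx 1.1 15.0(1r)SG10 03.04.05.SG      Ok
Switch Number: 2 Role: Virtual Switch Standby
 1 c08c.60e0.xxxx to c08c.60e0.xxxx 1.1 15.0(1r)SG10 03.04.05.SG      Ok
"""
versions = versions_by_switch(excerpt)
upgraded = set(versions.values()) == {"03.04.05.SG"}
print(versions, "upgrade complete:", upgraded)
```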