20118 Inter-Board HIG Communication Failure

Description

No heartbeat packet is received on the HiGig link.

Attribute

Alarm ID    Alarm Severity    Alarm Type
20118       Major             Fault

Parameters

ID  Name                     Meaning
1   Subrack No.              Subrack No.
2   Slot No.                 Slot No.
3   CPU No.                  CPU No.
4   Backplane HIG Link No.   No. of the disconnected backplane HiGig link.
5   Alarm Attribute          See the values below.

    0 = Normal: the alarm lasts longer than the Transient Threshold.

    1 = Transient Count: alarms that last shorter than the Transient Threshold are accumulated, both by number of occurrences and by total duration. If the accumulated duration exceeds the Alarm Occurrence Period Threshold, or the number of occurrences exceeds the Alarm Occurrence Times Threshold, a Transient Count alarm is raised. Because a Transient Count alarm is based on accumulated results, it may not be cleared until at least one Summing Cycle later. The Summing Cycle can be queried with the MML command "LST STATSLDWIN".
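The Alarm Attribute rule above can be sketched as a small classifier for one Summing Cycle. This is only an illustration under stated assumptions: the names `Thresholds` and `classify` are hypothetical, durations are taken in seconds, an alarm whose duration exactly equals the Transient Threshold is treated as transient, and at most one Transient Count alarm is raised per cycle. None of these details is confirmed by this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Thresholds:
    """Hypothetical container for the three thresholds named in the text."""
    transient_s: float  # Transient Threshold
    period_s: float     # Alarm Occurrence Period Threshold
    times: int          # Alarm Occurrence Times Threshold


def classify(durations_s: List[float], th: Thresholds) -> List[int]:
    """Return the Alarm Attribute values raised within one Summing Cycle.

    0 = Normal alarm, 1 = Transient Count alarm.
    """
    raised: List[int] = []
    short_count = 0        # number of short (transient) alarms so far
    short_total = 0.0      # accumulated duration of short alarms
    transient_raised = False

    for duration in durations_s:
        if duration > th.transient_s:
            # Long enough to be reported directly as a Normal alarm.
            raised.append(0)
            continue
        # Short alarm: accumulate occurrence count and duration.
        short_count += 1
        short_total += duration
        exceeded = short_total > th.period_s or short_count > th.times
        if exceeded and not transient_raised:
            raised.append(1)
            transient_raised = True  # assumption: one per Summing Cycle
    return raised
```

With a Transient Threshold of 5 s, a period threshold of 10 s, and a times threshold of 3, four one-second alarms in the same cycle would raise a single Transient Count alarm in this sketch, because the occurrence count exceeds 3 even though the accumulated duration does not exceed 10 s.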

Impact on the System

  • If only one or two HIGs fail, the service between the active and standby GSCUs decreases.
  • If three HIGs fail, the standby GSCU is unavailable for service, and the switching capability of the system decreases.

System Actions

  • The services carried by the faulty HIG are switched to the other normal HIGs.
  • If all HIGs fail, all external GE ports on the standby board are inhibited.
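The impact and system actions described above can be summarized as a function of the number of failed links. The following is a minimal sketch, not the system's actual logic: the function name, the dictionary keys, and the default of three HiGig links in total are assumptions inferred from the "three HIGs fail" case in the text.

```python
def assess_higig_failure(failed_links: int, total_links: int = 3) -> dict:
    """Map the number of failed HiGig links to the documented consequences.

    total_links=3 is an assumption; the document does not state the total.
    """
    if not 0 <= failed_links <= total_links:
        raise ValueError("failed_links must be between 0 and total_links")

    all_failed = failed_links == total_links
    return {
        # Service between the active and standby GSCUs decreases
        # as soon as any link fails.
        "service_reduced": failed_links > 0,
        # The standby GSCU is unavailable once every link has failed.
        "standby_available": not all_failed,
        # Traffic moves to the remaining normal links while any remain.
        "reroute_to_normal_links": 0 < failed_links < total_links,
        # With no link left, external GE ports on the standby are inhibited.
        "inhibit_standby_ge_ports": all_failed,
    }
```

For example, `assess_higig_failure(2)` reports reduced service with rerouting still possible, while `assess_higig_failure(3)` reports the standby board as unavailable with its external GE ports inhibited.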

Possible Causes

  • The peer board is reset.
  • The CPU is busy and cannot process the heartbeat packets between the active and standby switching network boards.
  • The local board is faulty.
  • The peer board is faulty.
  • The backplane (including slots and circuit) is faulty.

Procedure

  1. Check whether the peer board is reset.

    Y => The peer board is reset. The alarm is caused by the restart of the peer board. The alarm handling is complete.

    N => The peer board is not reset. Go to 2.

  2. Ensure that the local board and the peer board are securely installed, and then check whether the alarm is cleared.

    Y => The alarm is cleared. The alarm handling is complete.

    N => The alarm persists. Go to 3.

  3. Reset the standby board, and check whether the alarm is cleared after the standby board restarts.

    Y => The alarm is cleared. The alarm handling is complete.

    N => The alarm persists. Go to 4.

  4. Replace the local board and wait until the board is successfully restarted. Check whether the alarm is cleared.

    Y => The alarm is cleared. The alarm handling is complete.

    N => The alarm persists. Go to 5.

  5. Switch the active board to the standby state on the LMT, and then check whether the switchover succeeds.

    Y => The switchover succeeds. Go to 6.

    N => The switchover fails. Go to 7.

  6. Wait until the standby board is successfully started. Check whether the alarm is cleared.

    Y => The alarm is cleared. The alarm handling is complete.

    N => The alarm persists. Go to 8.

  7. Reset the active board and wait until the standby board is successfully started. Check whether the alarm is cleared.

    Y => The alarm is cleared. The alarm handling is complete.

    N => The alarm persists. Go to 8.

  8. Replace the current standby board and wait until the board is successfully restarted. Check whether the alarm is cleared.

    Y => The alarm is cleared. The alarm handling is complete.

    N => The alarm persists. Please contact Huawei Customer Service Center.
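The branching in the procedure above can be sketched as a lookup table, which makes the Y/N transitions easy to audit. This is only an illustrative model: `STEPS`, `walk`, and the `answer` callback are hypothetical names, and the code performs no board operations; it only records which steps would be visited for a given set of answers.

```python
from typing import Callable, List, Tuple, Union

DONE = "alarm handling complete"
ESCALATE = "contact Huawei Customer Service Center"

# step number -> (question, next step on Y, next step on N)
STEPS = {
    1: ("Is the peer board reset?", DONE, 2),
    2: ("Boards securely installed; alarm cleared?", DONE, 3),
    3: ("Standby board reset; alarm cleared?", DONE, 4),
    4: ("Local board replaced; alarm cleared?", DONE, 5),
    5: ("Active board switched to standby; switchover succeeded?", 6, 7),
    6: ("Standby board started; alarm cleared?", DONE, 8),
    7: ("Active board reset; alarm cleared?", DONE, 8),
    8: ("Current standby board replaced; alarm cleared?", DONE, ESCALATE),
}


def walk(answer: Callable[[int, str], bool]) -> Tuple[List[int], str]:
    """Follow the procedure, asking `answer(step, question)` at each step.

    Returns the list of visited step numbers and the final outcome.
    """
    step: Union[int, str] = 1
    path: List[int] = []
    while isinstance(step, int):
        path.append(step)
        question, on_yes, on_no = STEPS[step]
        step = on_yes if answer(step, question) else on_no
    return path, step
```

Passing a callback that always answers N walks steps 1, 2, 3, 4, 5, 7, and 8 and ends with escalation, which matches the longest failure path in the procedure.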


Huawei Proprietary and Confidential
Copyright © Huawei Technologies Co., Ltd.