Troubleshooting system-level problems

Recovering from a system panic

To recover from a system panic:

  1. Copy the full PANIC message and the EIP number (address of the instruction being executed) from the console screen to your system log book. See ``Getting the EIP number'' for instructions on determining the EIP number.

  2. Reset or power-cycle the machine and press <Enter> at the Boot: prompt to reboot the system.

  3. When the system prompts you, respond yes to save a copy of memory at the time of the PANIC.

    NOTE: We recommend that you save this dump to tape. This gives you a record of the panic to compare against the dump from any later PANIC, and a copy to send to SCO OpenServer Support as discussed in ``Additional help from Technical Support''.

    The following sample session shows how to save the dump to tape. This example uses /dev/rct0, but /dev/rctmini also works well if you have such a device on your system.

       There may be a system dump memory image in the swap device.
       Do you want to save it? (y/n) y

       Use Floppy Drive 0 (/dev/rfd0) by default
       Press ENTER to use default device.
       Enter valid Floppy Drive number to use if different.
       Enter "t" to use tape.
       > t

       Enter choice of tape drive :
               1 - /dev/rct0
               2 - /dev/rctmini
               n - no, QUIT
       > 1

       Insert tape cartridge and press return, or enter q to quit. > <insert tape

       Wait.
       dd if=/dev/swap of=/dev/rct0 bs=120b count=751 skip=0

       Done. Use /etc/ldsysdump to copy dump from tape or diskettes
       Press return to continue >

    We strongly recommend that you use tapes rather than floppy disks to save system dump images. The typical SCO OpenServer system has many megabytes of memory, so it takes several floppy disks to save a single image. Problems can arise if you do not have enough floppy disks, or if you insert them in the wrong order. You can run crash(ADM) on the dump from the dumpdev device, or reboot the system and copy this data to disk for study. See ``Examining a memory dump with crash(ADM)''.

    When it panics, the system writes the kernel image to the dumpdev device, which is usually the same as the swap device. The data will be overwritten as soon as any paging occurs on the system. See ``Defining the default dump device'' for more information.

  4. If you want to study the dump image with the crash(ADM) command, use the ldsysdump(ADM) command to copy the image to disk. In the sample session that follows, 06May94 is the name of the file to which the dump will be copied, but you can use any name that is meaningful:
       # cd /tmp
       # ldsysdump 06May94

       Use Floppy Drive 0 (/dev/rfd0) by default.
       Press ENTER to use the default.
       Enter valid Floppy Drive number to use if different than default.
       Enter "t" to use tape drive.
       > t

       Enter choice of tape drive:
               1 - /dev/rct0
               2 - /dev/rctmini
               n - no, QUIT
       > 1

       Insert tape cartridge and press return, or enter q to quit.
       >

       Wait.
       dd if=/dev/rct0 bs=120b count=751

       System dump copied into image.
       Use crash(ADM) to analyze the dump.

  5. At the prompt to check the root filesystem, answer ``y.'' This will check and, in most cases, fix any corruption on the root filesystem. In rare cases, the operating system becomes corrupted and must be restored or reinstalled. See ``Cleaning filesystems'' for more information.

  6. Run fsck(ADM) on those filesystems that were mounted when the system panicked. This happens automatically for all filesystems that are marked dirty when the system is brought up to multiuser state, but by running fsck manually, you can control the response to problems that are found. See the fsck(ADM) manual page for more information.

  7. Verify the integrity of the security system. See ``System file integrity checking: integrity(ADM)''.
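
The dd line in the step 3 sample session also indicates how large the dump is, and why floppy disks are impractical for saving it. The following sketch works through the arithmetic in the shell. It assumes that the b suffix denotes 512-byte blocks, as in standard dd usage, and that each diskette is a 1.44MB floppy holding 1474560 bytes. The diskette capacity and the variable names are illustrative assumptions, not values reported by the system.

```shell
# Values taken from the sample dd line: bs=120b count=751
blocks_per_record=120
records=751
block_size=512          # assumption: dd's "b" suffix means 512-byte blocks

total_bytes=`expr $blocks_per_record \* $block_size \* $records`

# Assumption: 1.44MB diskettes with 1474560 bytes of raw capacity each
floppy_bytes=1474560
# Round up, because a partly filled diskette still counts as one diskette
floppies=`expr \( $total_bytes + $floppy_bytes - 1 \) / $floppy_bytes`

echo "dump size: $total_bytes bytes"
echo "1.44MB diskettes needed: $floppies"
```

Under these assumptions the sample dump is 46141440 bytes (about 44MB) and would span 32 diskettes.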

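The dump file name used in step 4 (06May94) follows a day-month-year pattern. As a minimal sketch, a name in the same style can be generated from the current date with the date command. The variable name dumpfile is hypothetical, and the ldsysdump command is only echoed here because the real command is interactive and should be run by hand.

```shell
# Build a name such as 06May94 from the current date:
#   %d = two-digit day of month, %b = abbreviated month name, %y = two-digit year
# The month abbreviation depends on the current locale.
dumpfile=`date +%d%b%y`

# Print the commands to run; ldsysdump itself prompts for the source device
echo "cd /tmp"
echo "ldsysdump $dumpfile"
```

Naming each image by date makes it easier to tell dumps apart when you need to compare a later PANIC against an earlier one.
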

© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003