Environment: Sun Fire E6900

After applying the latest patches and issuing the reboot command, the system reported "ERROR: Fast Data Access MMU Miss" and could not boot.

{c} ok boot device
/
Can't open device
TL = 1, TT = 68. ERROR: Fast Data Access MMU Miss
TSTATE= 0x1402 [ccr = 0x0, asi = 0x0, pstate = 0x14, cwp = 0x2]
TPC= 00000000f0036c88
TNPC= 00000000f0036c8c
SFSR= 000000000080800b, TAGACCESS = 00000000ffffe000
D-SFAR = 00000000ffffffff
TICK= 800000a82fa025a7, TICKCMP = ffffffffffffffff
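
The fields in a trap dump like the one above can be pulled apart with a short script. This is only a sketch: the function names `decode_tstate` and `parse_trap_dump` are made up for illustration, the bit positions follow the SPARC V9 TSTATE layout, and it assumes the TL and TT values are printed in hexadecimal (which is consistent with 0x68 being the fast data access MMU miss trap type on UltraSPARC).

```python
import re

def decode_tstate(tstate: int) -> dict:
    """Split a SPARC V9 TSTATE value into its fields."""
    return {
        "ccr": (tstate >> 32) & 0xFF,     # bits 39:32, condition codes
        "asi": (tstate >> 24) & 0xFF,     # bits 31:24, address space identifier
        "pstate": (tstate >> 8) & 0xFFF,  # bits 19:8, processor state
        "cwp": tstate & 0x1F,             # bits 4:0, current window pointer
    }

def parse_trap_dump(text: str) -> dict:
    """Extract TL, TT and TSTATE from an OBP trap dump (hypothetical helper)."""
    levels = re.search(r"TL\s*=\s*([0-9a-fA-F]+),\s*TT\s*=\s*([0-9a-fA-F]+)", text)
    tstate = re.search(r"TSTATE\s*=\s*0x([0-9a-fA-F]+)", text)
    if levels is None or tstate is None:
        raise ValueError("no trap dump found in text")
    return {
        "tl": int(levels.group(1), 16),  # assumption: printed in hex
        "tt": int(levels.group(2), 16),  # assumption: printed in hex
        "tstate": decode_tstate(int(tstate.group(1), 16)),
    }

dump = """TL = 1, TT = 68. ERROR: Fast Data Access MMU Miss
TSTATE= 0x1402 [ccr = 0x0, asi = 0x0, pstate = 0x14, cwp = 0x2]"""
info = parse_trap_dump(dump)
print(hex(info["tt"]), info["tstate"])
```

The decoded values agree with the bracketed breakdown that OBP prints itself (pstate = 0x14, cwp = 0x2), which is a quick sanity check on the field layout.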

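An online search suggests that this error has many possible causes, mainly:

1. Hardware problems: a faulty system board, CPU, or memory

2. Connection problems: reseating the system boards is recommended

3. Software problems: OBP settings, specifically a version synchronization problem between boards; the recommended procedure is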

init 0 (or STOP+A)
set-defaults
reset-all
boot (or boot cdrom)

Because the failure appeared on a plain reboot, a hardware fault seemed unlikely, so I ran reset-all:


{c} ok
{c} ok reset-all
Resetting ...
{/N0/SB3/P0/C1} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB3/P0/C1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB3/P0/C1} Use is subject to license terms.
{/N0/SB3/P1/C1} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB3/P1/C1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB3/P1/C1} Use is subject to license terms.
{/N0/SB3/P2/C1} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB3/P2/C1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB3/P2/C1} Use is subject to license terms.
{/N0/SB3/P3/C1} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB3/P3/C1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB3/P3/C1} Use is subject to license terms.
{/N0/SB5/P0/C1} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB5/P0/C1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB5/P0/C1} Use is subject to license terms.
{/N0/SB5/P1/C1} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB5/P1/C1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB5/P1/C1} Use is subject to license terms.
{/N0/SB5/P2/C1} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB5/P2/C1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB5/P2/C1} Use is subject to license terms.
{/N0/SB5/P3/C1} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB5/P3/C1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB5/P3/C1} Use is subject to license terms.
Copying IO PROM to CPU DRAM
.{/N0/SB3/P0/C0} @(#) lpost     5.20.8  2007/11/20 10:33
{/N0/SB3/P0/C0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB3/P0/C0} Use is subject to license terms.
{/N0/SB3/P1/C0} @(#) lpost      5.20.8  2007/11/20 10:33
.{/N0/SB3/P1/C0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
.{/N0/SB3/P1/C0} Use is subject to license terms.
..........................................
{/N0/SB3/P0/C0} Running PCI IO Controller Basic Tests
{/N0/SB3/P0/C0} Jumping to memory 00000000.00000020 [00000010]
{/N0/SB3/P0/C0} System PCI IO post code running from memory
{/N0/SB3/P0/C0} @(#) lpost      5.20.8  2007/11/20 10:37
{/N0/SB3/P0/C0} Running PCI IO Controller Functional Tests
{/N0/SB3/P0/C0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB3/P0/C0} Use is subject to license terms.
{/N0/SB3/P0/C0} Subtest: PCI IO Controller Register Initialization for aid 0x18
{/N0/SB3/P0/C0} Running PCI IO Controller Ecc Tests
{/N0/SB3/P0/C0} Running SBBC Basic Tests
{/N0/SB3/P0/C0} Subtest: SBBC PCI Reg Initialization for aid 0x18
{/N0/SB3/P0/C0} Running Probe io Devices
{/N0/SB3/P0/C0} Running PCI IO Controller Basic Tests
{/N0/SB3/P0/C0} Subtest: PCI IO Controller Register Initialization for aid 0x19
{/N0/SB3/P0/C0} Running PCI IO Controller Functional Tests
{/N0/SB3/P0/C0} Running PCI IO Controller Ecc Tests
{/N0/SB3/P0/C0} Running Probe io Devices
{/N0/SB3/P0/C0} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB3/P0/C0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB3/P0/C0} Use is subject to license terms.
{/N0/SB3/P0/C1} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB3/P0/C1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB3/P0/C1} Use is subject to license terms.
{/N0/IB6/P0} Passed
{/N0/IB6/P1} Passed
{/N0/SB3/P0/C0} Running Domain Level Tests
{/N0/SB3/P2/C0} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB3/P3/C0} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB5/P0/C0} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB5/P1/C0} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB5/P2/C0} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB5/P3/C0} @(#) lpost      5.20.8  2007/11/20 10:33
{/N0/SB3/P2/C0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB3/P3/C0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB5/P0/C0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB5/P1/C0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB5/P2/C0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB5/P3/C0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB3/P0/C0} Running Domain Basic Tests
{/N0/SB3/P2/C0} Use is subject to license terms.
{/N0/SB3/P3/C0} Use is subject to license terms.
{/N0/SB5/P0/C0} Use is subject to license terms.
{/N0/SB5/P1/C0} Use is subject to license terms.
{/N0/SB5/P2/C0} Use is subject to license terms.
{/N0/SB5/P3/C0} Use is subject to license terms.
{/N0/SB3/P0/C0} Running Domain Advanced Tests
{/N0/SB3/P0/C0} Running Domain Stick Sync Tests
{/N0/SB3/P0/C0} Running Domain Verify Stick Sync Tests
{/N0/SB3/P0/C0}  DCB_DECOMP_OBP command succeeded
{/N0/SB3/P0/C0}  CORE 13 clearing 00000000.00000000 to 00000000.80000000
{/N0/SB3/P0/C0}  CORE 14 clearing 00000000.80000000 to 00000001.00000000
{/N0/SB3/P0/C0}  CORE 15 clearing 00000001.00000000 to 00000001.80000000
{/N0/SB3/P0/C0}  CORE 20 clearing 00000001.80000000 to 00000002.00000000
{/N0/SB3/P0/C0}  CORE 21 clearing 00000002.00000000 to 00000002.80000000
{/N0/SB3/P0/C0}  CORE 22 clearing 00000002.80000000 to 00000003.00000000
{/N0/SB3/P0/C0}  CORE 23 clearing 00000003.00000000 to 00000003.80000000
{/N0/SB3/P0/C0}  CORE 12 clearing 00000003.80000000 to 00000004.00000000
{/N0/SB3/P0/C0}  CORE 13 clearing 00000020.00000000 to 00000020.80000000
{/N0/SB3/P0/C0}  CORE 14 clearing 00000020.80000000 to 00000021.00000000
{/N0/SB3/P0/C0}  CORE 15 clearing 00000021.00000000 to 00000021.80000000
{/N0/SB3/P0/C0}  CORE 20 clearing 00000021.80000000 to 00000022.00000000
{/N0/SB3/P0/C0}  CORE 21 clearing 00000022.00000000 to 00000022.80000000
{/N0/SB3/P0/C0}  CORE 22 clearing 00000022.80000000 to 00000023.00000000
{/N0/SB3/P0/C0}  CORE 23 clearing 00000023.00000000 to 00000023.80000000
{/N0/SB3/P0/C0}  CORE 12 clearing 00000023.80000000 to 00000024.00000000
{/N0/SB3/P0/C0} Decompress OBP done
{/N0/SB3/P0/C0}  DCB_ENTER_OBP  command succeeded
{/N0/SB3/P1/C0}  DCB_ENTER_OBP  command succeeded
{/N0/SB3/P0/C1}  DCB_ENTER_OBP  command succeeded
{/N0/SB3/P1/C1}  DCB_ENTER_OBP  command succeeded
{/N0/SB3/P2/C0}  DCB_ENTER_OBP  command succeeded
{/N0/SB3/P3/C0}  DCB_ENTER_OBP  command succeeded
{/N0/SB3/P2/C1}  DCB_ENTER_OBP  command succeeded
{/N0/SB3/P3/C1}  DCB_ENTER_OBP  command succeeded
{/N0/SB5/P0/C0}  DCB_ENTER_OBP  command succeeded
{/N0/SB5/P1/C0}  DCB_ENTER_OBP  command succeeded
{/N0/SB5/P0/C1}  DCB_ENTER_OBP  command succeeded
{/N0/SB5/P1/C1}  DCB_ENTER_OBP  command succeeded
{/N0/SB5/P2/C0}  DCB_ENTER_OBP  command succeeded
{/N0/SB5/P3/C0}  DCB_ENTER_OBP  command succeeded
{/N0/SB5/P2/C1}  DCB_ENTER_OBP  command succeeded
{/N0/SB5/P3/C1}  DCB_ENTER_OBP  command succeeded

ChassisSerialNumber 0825MM2013

Sun Fire E6900
OpenFirmware version 5.20.8 (11/20/07 10:32)
Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
SmartFirmware, Copyright (C) 1996-2001.  All rights reserved.
32768 MB memory installed, Serial #71419714.
Ethernet address 0:14:4f:41:c7:42, Host ID: 8441c742.
 

Running the boot command again produced no errors, and the system started normally.
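
When a POST log like the one above has been captured to a file, a small script can confirm that every core on every system board reached OBP. The following is a minimal sketch; the function name `cores_entered_obp` is hypothetical, and the regular expression assumes the `{/N0/SBx/Py/Cz}  DCB_ENTER_OBP  command succeeded` line format seen in this log.

```python
import re
from collections import defaultdict

# Matches lines such as: {/N0/SB3/P0/C0}  DCB_ENTER_OBP  command succeeded
ENTER_OBP = re.compile(
    r"\{/N\d+/(SB\d+)/(P\d+)/(C\d+)\}\s+DCB_ENTER_OBP\s+command succeeded"
)

def cores_entered_obp(log: str) -> dict:
    """Group the cores that reported DCB_ENTER_OBP success by system board."""
    boards = defaultdict(set)
    for match in ENTER_OBP.finditer(log):
        board, proc, core = match.groups()
        boards[board].add(f"{proc}/{core}")
    return {board: sorted(cores) for board, cores in boards.items()}

sample = (
    "{/N0/SB3/P0/C0}  DCB_DECOMP_OBP command succeeded\n"
    "{/N0/SB3/P0/C0}  DCB_ENTER_OBP  command succeeded\n"
    "{/N0/SB3/P1/C0}  DCB_ENTER_OBP  command succeeded\n"
    "{/N0/SB5/P0/C1}  DCB_ENTER_OBP  command succeeded\n"
)
for board, cores in cores_entered_obp(sample).items():
    print(board, len(cores), cores)
```

Run against the full log above, this should list eight cores each for SB3 and SB5; a board with fewer entries than expected would be a reason to look at the hardware again.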

====================================================================================

PS: other causes and solutions I found:

There are two common Solaris[TM] 2.5.1 or 2.6 boot problems that are similar, but
have different solutions. There is some confusion as to what solution to use for
what problem. Both problems are that systems fail to boot to a Solaris 2.5.1 or
2.6 CD-ROM or JumpStart[TM] image. SRDB documents have been written for each
problem and this is a description of each of these and how to tell the difference.



PROBLEM #1

(see also SRDB 20576)

Any system that has a 450MHz UltraSPARC(R) II CPU or the 440MHz UltraSPARC IIi CPU
will not be able to boot Solaris 2.5.1 HW 11/97, Solaris 2.6 HW 3/98 or Solaris
2.6 HW 5/98 CD-ROM or JumpStart images of these Operating Systems. It will fail
with "hme0:link down" or with keyboard input repeating multiple times.

These systems must be booted with a cdrom called the Operating Environment
Installation CD February 2000 (OECD), which is Part No. 704-7076-10. Booting
this OECD will then prompt for the 2.5.1 or 2.6 Solaris CD for the install
of the OS. This OECD can also be used to modify a JumpStart image to boot/load
2.5.1 or 2.6 on these systems. See the manual "Installing Solaris Software for
Selected Hardware" Part No. 806-4005-10 for the procedure for modifying a JumpStart
image.

Once a JumpStart image is modified with the OECD, this image can be used to load
Solaris 2.5.1/2.6 on these systems. If the method of installation is via a cdrom,
then the OECD must always be used. The manual (included with the OECD) also
describes how to use the OECD for CD-ROM-based installs. Here is a list of the
systems affected:

        Ultra 5         Ultra 10
        Netra T105      Ultra 60
        Netra T1120     Netra T1125
        Ultra 80        Enterprise 420R
        Netra T1400     Netra T1405
        Ultra 450       Enterprise 220R

Only the above systems with 450MHz or 440MHz CPUs are affected by this problem.
You can check the speed of the cpu(s) by running the command ".speed" from the
OK prompt. The problem is permanently fixed by loading the most current kernel
patch. Patching the kernel to this level is one of the functions of the OECD. The
OECD actually has a Solaris 7 kernel that it boots to, then it kicks off an install
that prompts for the 2.5.1 or 2.6 cdrom for the OS install. It will load the kernel
patch during this install, so the kernel will be patched upon the first boot to
the new OS on the disk.

This CD-ROM is not included with Solaris and is usually included with the system
in the "Binary Code License" package. The OECD can be ordered by Sun personnel
from this website: http://acac.central/SAG/Templates/CD_Zero.html

NOTE: The OECD can also be used to fix the "NOTICE: Can't find driver for console
frame buffer" install/boot problem mentioned in SRDB 19271.

NOTE: The Ultra Enterprise[TM] 450 with a 480MHz CPU requires a special procedure.
See SRDB 24408.



PROBLEM #2

(see also SRDB 20149)

Enterprise Server systems that use the 400MHz UltraSPARC II with 8MB ecache CPU
module(s) will fail to boot to Solaris 2.5.1 or Solaris 2.6 cdrom (or JumpStart
image) with "Fast Data Access MMU Miss" and/or "mutex_enter: bad mutex". SRDB
20149 describes this problem and has the exact procedure to boot/load these
systems with 2.5.1/2.6.

The problem is that the kernel on the 2.5.1/2.6 CD-ROM cannot load with the
400MHz/8MB ecache CPU until the kernel patch is loaded. The workaround (seen
in the SRDB) is to run the command "limit-ecache-size" from the OK prompt before
booting to the 2.5.1/2.6 cdrom or JumpStart image. It will be able to boot the
cdrom or image immediately after this command is run. After the OS is loaded,
you will need to run this command again from the OK prompt to boot the system
until you load the latest kernel patch. Here are the systems (that potentially
have the 400MHz 8MB cache cpu installed) affected by this problem:

        E3000   E3500   E4000
        E4500   E5000   E5500
        E6000   E6500   

Only systems that have the 400MHz/8MB cache cpu(s) are affected.



Final Note:

Do not confuse these two different problems. Do not use the OECD on an Enterprise
server in the model range E3000 - E6500 (with 400MHz/8MB cache). These servers
are fixed with the limit-ecache-size workaround to boot/load 2.5.1/2.6. The
Ultra[TM] 5 - Ultra 450 group of systems (with 440/450 MHz CPU) are the ones that
need the OE cdrom to boot/load 2.5.1/2.6.
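
The distinction between the two problems can be sketched as a simple lookup. This is only an illustration of the rule described in the note, not a Sun tool: the function name `workaround` and its parameters are hypothetical, and the model lists are copied from the two tables above.

```python
from typing import Optional

# Systems from PROBLEM #1 (need the OECD when fitted with 440/450MHz CPUs).
OECD_SYSTEMS = {
    "Ultra 5", "Ultra 10", "Netra T105", "Ultra 60",
    "Netra T1120", "Netra T1125", "Ultra 80", "Enterprise 420R",
    "Netra T1400", "Netra T1405", "Ultra 450", "Enterprise 220R",
}

# Systems from PROBLEM #2 (need limit-ecache-size with 400MHz/8MB ecache CPUs).
ECACHE_SYSTEMS = {
    "E3000", "E3500", "E4000", "E4500",
    "E5000", "E5500", "E6000", "E6500",
}

def workaround(model: str, cpu_mhz: int, ecache_mb: Optional[int] = None) -> str:
    """Return which workaround the note prescribes for booting Solaris 2.5.1/2.6."""
    if model in OECD_SYSTEMS and cpu_mhz in (440, 450):
        return "OECD"
    if model in ECACHE_SYSTEMS and cpu_mhz == 400 and ecache_mb == 8:
        return "limit-ecache-size"
    return "not covered"

print(workaround("Ultra 60", 450))
print(workaround("E4500", 400, ecache_mb=8))
```

Any combination outside the two tables returns "not covered", which matches the note's warning not to apply either fix to the wrong group of systems.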