Editing Syscon Error Codes
== Description ==
[[SC EEPROM|Syscon memory]] contains a 0x100-byte table intended to store error codes. Each entry consists of a 4-byte error code plus a 4-byte timestamp, so the table can store 32 errors in total. When the table is full and a new error needs to be stored, syscon deletes the oldest entry.<br>
The timestamps are in UTC (number of seconds elapsed since 2000).
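The table arithmetic and the timestamp conversion described above can be sketched in Python. This is only an illustration: the names are hypothetical, and the exact epoch (taken here as 2000-01-01 00:00:00 UTC) is an assumption, since the text only says "since 2000". The byte order and the layout of the entries inside the table are not covered here.

```python
from datetime import datetime, timedelta, timezone

TABLE_SIZE = 0x100          # size of the error table in bytes
CODE_SIZE = 4               # bytes per error code
TIMESTAMP_SIZE = 4          # bytes per timestamp
ENTRY_SIZE = CODE_SIZE + TIMESTAMP_SIZE
MAX_ENTRIES = TABLE_SIZE // ENTRY_SIZE   # 0x100 / 8 = 32 errors

# Assumption: the counter starts at midnight UTC on 1 January 2000.
EPOCH_2000 = datetime(2000, 1, 1, tzinfo=timezone.utc)


def timestamp_to_datetime(seconds: int) -> datetime:
    """Convert a syscon timestamp (seconds since 2000) to a UTC datetime."""
    return EPOCH_2000 + timedelta(seconds=seconds)


if __name__ == "__main__":
    print(MAX_ENTRIES)
    print(timestamp_to_datetime(86400).isoformat())
```

For example, a timestamp of 86400 (one day in seconds) would correspond to 2 January 2000 under this assumption.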
== How to get the syscon error log ==
If the PS3 still boots to the XMB and is able to install and run apps, you can use programs like the ones mentioned at the top of the [[Platform_ID#Apps|Platform ID]] page.<br>
If the PS3 doesn't boot, it is still possible to retrieve the syscon error log by connecting a PC to the syscon UART port using a "USB to TTL UART adapter" and running the command '''errlog'''. There is also the command '''clearerrlog''' to empty the error table (handy to avoid confusion with old error codes that may have accumulated over months or years and are unrelated to the actual problem).
== Error code format ==
The error codes follow the format: '''<span style="background:#000000; color:#ffffff;">A</span><span style="background:#909090; color:#ffffff;">R</span><span style="background:#ffff80;">SS</span><span style="background:#a0a0ff;">C</span><span style="background:#ff8080;">EEE</span>''', where:
*'''<span style="background:#000000; color:#ffffff;">A</span>''' (Fixed)
**A = This is always "A"
*'''<span style="background:#909090; color:#ffffff;">R</span>''' (Reserved)
**0-E = Unknown
**F = Frequent error (for example, motherboard damage/breakdown, etc.)
*'''<span style="background:#ffff80;">SS</span>''' (Step Number)
**00-7F = Step Number of the Power On Sequence (POS). This is the Power On Self Test (POST) process. If successful, the BOOT process begins, which loads the OS.
**80 = Static State (Power ON). The console completed the POST and was in a static state. The error happened while the PS3 was powered on. You can get an error with Step No. 80 if your error occurs in game. For example, 80 1002 errors can happen if your NEC/TOKINs are going bad.
**90 = Static State (Power OFF). The error happened while the PS3 was powering off. For example, if a problem causes the system to hang while shutting down, the console will beep before powering off and an error with Step No. 90 will be recorded in the errorlog.
**A0 = Immediately after SYSCON reset. A reset pulse is sent to the console's main chipset to coordinate and synchronize the chips. If an error occurs immediately after SYSCON reset, it means it occurred before anything else could happen. For example, if the CPU is completely dead it will not respond to the reset pulse and an error will be generated immediately after reset.
*'''<span style="background:#a0a0ff;">C</span>''' (Category)
**1 = System Error
**2 = Fatal Error
**3 = Boot Error
**4 = Data Error
*'''<span style="background:#ff8080;">EEE</span>''' (Error)
**Any number in hex
Examples:

<span style="background:#000000; color:#ffffff;">A</span><span style="background:#909090; color:#ffffff;">0</span><span style="background:#ffff80;">80</span><span style="background:#a0a0ff;">1</span><span style="background:#ff8080;">002</span>
= System Error 002 (RSX VRAM Power Fail), which occurred after the system had powered on successfully.

<span style="background:#000000; color:#ffffff;">A</span><span style="background:#909090; color:#ffffff;">0</span><span style="background:#ffff80;">40</span><span style="background:#a0a0ff;">3</span><span style="background:#ff8080;">034</span>
= Boot Error 034 (RSX/CELL BE Communication Error), which occurred at Step No. 40, before the Power On Sequence completed.

While the Reserved digit and Step Number can be useful to figure out when the error occurred and how frequent it is, the last four digits are the most important for figuring out what the error means. So the following section only lists the last 4 digits (category + error).
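The field layout above can be illustrated with a small decoder. This is a minimal sketch, not an official tool: the function and class names are hypothetical, and it only encodes the step and category meanings listed in this section.

```python
import string
from typing import NamedTuple

# Category digit -> meaning, as listed in the format description.
CATEGORIES = {
    "1": "System Error",
    "2": "Fatal Error",
    "3": "Boot Error",
    "4": "Data Error",
}


class DecodedError(NamedTuple):
    reserved: str       # R digit (0-E unknown, F frequent error)
    step: int           # SS as an integer
    step_meaning: str   # description of SS
    category: str       # meaning of the C digit
    error: str          # EEE, kept as a 3-digit hex string


def describe_step(step: int) -> str:
    """Map the step number (SS) to the meanings listed above."""
    if step <= 0x7F:
        return "Power On Sequence step"
    if step == 0x80:
        return "Static State (Power ON)"
    if step == 0x90:
        return "Static State (Power OFF)"
    if step == 0xA0:
        return "Immediately after SYSCON reset"
    return "Unknown"


def decode_error(code: str) -> DecodedError:
    """Split an 8-digit code of the form A R SS C EEE into its fields."""
    code = code.strip().upper()
    if len(code) != 8 or any(c not in string.hexdigits for c in code):
        raise ValueError("expected 8 hexadecimal digits")
    if code[0] != "A":
        raise ValueError("first digit is expected to be 'A'")
    step = int(code[2:4], 16)
    return DecodedError(
        reserved=code[1],
        step=step,
        step_meaning=describe_step(step),
        category=CATEGORIES.get(code[4], "Unknown"),
        error=code[5:8],
    )


if __name__ == "__main__":
    print(decode_error("A0801002"))
    print(decode_error("A0403034"))
```

The error field is kept as a string so that codes such as 4FF keep their hex form; the last four digits used in the list of error codes are simply the category digit followed by this field.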
== Error codes ==
----
==== 1001 ====
Cell VRAM Power

Speculation:<br>
1001 errors happen when the system encounters an unexpected shutdown. They often occur in testing, when the console is turned on and off repeatedly instead of being shut down gracefully. They have been associated with other errors, but there doesn't appear to be any single cause.

The hypothesis that this error is associated with insufficient filtering on the CPU's core voltage (VDDC) has not been confirmed. There is a range of voltage ripple/noise that "should" cause errors before it gets so bad it causes a CELL BE VDDC Power Failure (3003). There are numerous SMD components involved in filtering, but the main concern is the NEC/TOKIN Proadlizers (capacitors). 1002 errors are the fingerprint of bad TOKINs on the GPU, but 1001 has not been shown to have the same association with the CPU's filter. However, a connection is strongly suspected.
==== 1002 ====
RSX VRAM Power

This error has been associated with insufficient filtering on RSX_VDDC. There is a range of voltage ripple/noise that will cause this error before it gets so bad it causes an RSX_VDDC Power Failure (3004). YLODs that log 1002 errors vary widely: some occur within 2 seconds, others only during intense games.<br>
<br>
There are numerous SMD components involved in filtering, but the main concern is the NEC/TOKIN Proadlizers (capacitors). 1002 errors are the fingerprint of bad TOKINs.
==== 1004 ====
PSU Power
==== 1103 ====
Thermal
==== 1200 ====
CELL BE Thermal Error

CPU Overheat. This is a common error. The usual culprit is failed Thermal Interface Material (TIM). As the material ages it "dries", allowing air inside. Air is a heat insulator, reducing the TIM's ability to transfer enough heat away from the processor. The system fan will steadily get louder over time until it cannot keep up. Once the processor approaches its Thermal Shutdown Temperature, a yellow LED begins flashing on the console (early Phat models). Once it reaches the Thermal Shutdown Temperature, the console will beep three times and hard shutdown, flashing red until the console is unplugged and the error state is reset. Error 1200 is generated in the SYSCON errorlog.
If that still doesn't work, it could be an issue with the temperature monitor chip (IC1101). Beyond that, some users have noted that dead CPUs can throw error 1200. However, that's the limit of our current understanding. It could be dead, or have another unexplained issue, but usually reflowing or reballing is the last-ditch effort to revive such a console.
==== 1201 ====
RSX Thermal Error

GPU Overheat. This is the same as error 1200 above, except it's for the GPU. The same repair steps apply, except that its temperature monitor chip is IC2101.
==== 1203 ====
Cell voltage regulators thermal
==== 1204 ====
Southbridge thermal
==== 1205 ====
EE/GS thermal
==== 1301 ====
Cell PLL
==== 14FF ====
Check stop
==== 1601 ====
BE Livelock Detection

Speculation:
If a YLOD turns into a GLOD after reball/reflow, then 1601 (with or without 1701) could mean the RSX RAM was damaged. This is a loose association based on a few user reports.
==== 1701 ====
Cell attention
==== 1802 ====
RSX init
==== 1900 ====
RTC voltage
==== 1901 ====
RTC oscillator
==== 1902 ====
RTC access
----
=== Fatal ===
----
==== 2001 ====
Cell
==== 2002 ====
RSX
==== 2003 ====
Southbridge
==== 2010 ====
Clock 1
==== 2011 ====
Clock 3
==== 2012 ====
Clock 2
==== 2013 ====
Clock 4
==== 2020 ====
HDMI
==== 2022 ====
DVE Error (CXM4024R MultiAV controller for analog out)
==== 2024 ====
This error tends to cause a delayed Yellow Light Of Death (10s - 1min). Sometimes described as a Green Light Of Death (GLOD) or Red Light Of Death (RLOD).

2124 and 2024 errors have been fixed by replacing both the AV and HDMI encoders. One user reported 2024/2124 errors resolved by replacing the HDMI encoder. Another removed the HDMI encoder and tested the console without it. That console primarily filled the errorlog with 2124 errors, but a few 2024s as well. So it is unclear if 2124 is specific to the HDMI Encoder or AV Encoder. It seems it could be either.
==== 2030 ====
Thermal Sensor Error (IC1101) CELL BE Temp. Monitor

Speculation: 2030-33 errors reported in case of dodgy PWR/EJT daughter board.
==== 2031 ====
Thermal Sensor Error (IC2101) RSX Temp. Monitor
==== 2033 ====
Thermal sensor 3
==== 2101 ====
Cell
==== 2102 ====
RSX
==== 2103 ====
Southbridge
==== 2110 ====
Clock 1
==== 2111 ====
Clock 3
==== 2112 ====
Clock 2
==== 2113 ====
Clock 4
==== 2120 ====
HDMI
==== 2122 ====
DVE
==== 2124 ====
This error tends to cause a delayed Yellow Light Of Death (10s - 1min). Sometimes described as a Green Light Of Death (GLOD) or Red Light Of Death (RLOD).

2124 and 2024 errors have been fixed by replacing both the AV and HDMI encoders. One user reported 2024/2124 errors resolved by replacing the HDMI encoder. Another removed the HDMI encoder and tested the console without it. That console primarily filled the errorlog with 2124 errors, but a few 2024s as well. So it is unclear if 2124 is specific to the HDMI Encoder or AV Encoder. It seems it could be either.
==== 2130 ====
Thermal sensor 1
==== 2131 ====
Thermal sensor 2
==== 2133 ====
Thermal sensor 3
==== 2203 ====
Southbridge
----
=== Boot ===
----
==== 3000 ====
Power
==== 3001 ====
12v Power Failure

Usually this is caused by a bad Power Supply Unit (PSU).

Alternatively, a failure on the 12v_main line can cause it. Check fuses, capacitors, resistors, and ICs on the 12v line. Measure the resistance of the large 2-prong 12v connector on the motherboard. It should read in the kilohm range if there is sufficient separation. Otherwise you may have a short on the line that needs to be found and repaired.
==== 3002 ====
Power
==== 3003 ====
VDDC CELL BE Power Failure

This error will occur in the case of a power failure on the main core voltage of the CPU. For example, if the filtering capacitors (NEC/TOKINs) are severely damaged. There are other SMDs in that filter, so it could be related to them as well.
==== 3004 ====
VDDC RSX Power Failure

This error will occur in the case of a power failure on the main core voltage of the GPU. For example, if the filtering capacitors (NEC/TOKINs) are severely damaged. There are other SMDs in that filter, so it could be related to them as well.
==== 3010 ====
Cell BE Error

Observation: A user triggered this error by injecting 3.3V into PWRGD (power good) of IC6103 (NCP5318 CPU Buck Controller). It generated error 20 1001 and 20 3010.
==== 3011 ====
Cell
==== 3012 ====
Cell
==== 3013 ====
BE_SPI DI/DO ERROR

CELL is not communicating with syscon via SPI (1.2V MC2_VDDIO and 1.2V BE_VCS have no output). Possible shorts on the line; check C4001 and trailing caps. Possible dead CPU?

Another user got this error on a CPU he damaged while delidding.
==== 3020 ====
Cell
==== 3030 ====
Cell
==== 3031 ====
Cell
==== 3032 ====
CELL BE Error

+1.2v_YC_RC_VDDIO PWR Fail?
==== 3033 ====
Cell
==== 3034 ====
Cell BE / RSX Communication Error

This is the most common error seen in early Phat model PS3s with the hottest 90nm RSX and CELL processors. It is the hallmark of a BGA defect (such as a cracked solder ball). It is by no means limited to the early models, however. These errors have been seen in every model of PS3 with varying frequency. The most reliable consoles appear to be those with a CPU/GPU of smaller manufacturing process, such as the Super Slim (SS) models (42xx and later), which have a 45nm CELL BE and 28nm RSX. The least reliable are the PS2 Backwards Compatible A-E models, which have a 90nm RSX/CELL BE.

The root cause is mechanical fatigue due to thermal cycling. The materials used to construct the motherboard and processors have different properties. For example, the coefficient of thermal expansion of the FR4 fiberglass used in the motherboard and processor substrate is different from that of the copper BGA pads, which is different from that of the lead-free solder used to join them. This means they expand and contract at different rates as the chip heats up and cools down, which applies shearing force to the BGA. Over many thermal cycles this deforms the solder balls and causes a defect (such as a solder crack, a torn trace, or a ball pulling away from its pad).

3034 is triggered when the voltage or data lines connecting the CPU/GPU are broken. There is often a data error (4XXX) that also appears, but not always. The most common cause is a BGA defect on the RSX, which usually requires a reball/reflow to repair. Something about the RSX construction or workload causes it to fail more frequently, but the CPU can fail too. However, it's not always a BGA defect. The bumps on either chip can fail, Flex IO traces (the data lines that connect the CPU/GPU) can be broken or scratched, or accumulated damage from wear and tear (electromigration) can also cause this error. The true percentage of consoles with BGA defects that can be fixed with a reball/reflow is unknown. However, there is evidence to suggest that the underfill used to reinforce the CPU/GPU die and RSX RAM bumps was not as effective when the PS3 was manufactured. This could explain many of the consoles whose reball fails prematurely afterwards.

If a reflow/reball of both the CPU and GPU fails, then the chip is beyond repair and needs to be replaced. The RSX can be replaced with the same model without modification. It can be replaced with a different model using a modchip that injects the correct RSX ID during boot. This has been nicknamed a "Frankenstein Mod." Since they are married to each other, the CPU can only be replaced if the chipset (NAND/NOR and SYSCON chips) is replaced as well. Since the CPU can't be replaced as easily, a dead CPU is usually considered unrepairable.
==== 3035 ====
Cell and RSX
==== 3036 ====
Cell and RSX
==== 3037 ====
Cell and RSX
==== 3038 ====
Cell and RSX
==== 3039 ====
Cell and RSX
==== 3040 ====
Flash
----
=== Data ===
----
==== 4001 ====
Cell
==== 4002 ====
RSX
==== 4003 ====
Southbridge
==== 4011 ====
Cell
==== 4101 ====
Cell
==== 4102 ====
RSX
==== 4103 ====
Southbridge
==== 4111 ====
Cell
==== 4201 ====
Cell
==== 4202 ====
RSX
==== 4203 ====
Southbridge
==== 4211 ====
Cell
==== 4212 ====
RSX
==== 4221 ====
Cell
==== 4222 ====
RSX
==== 4231 ====
Cell
==== 4261 ====
Cell
==== 4301 ====
Cell
==== 4302 ====
RSX
==== 4303 ====
Southbridge
==== 4311 ====
Cell
==== 4312 ====
RSX
==== 4321 ====
Cell
==== 4322 ====
RSX
==== 4332 ====
RSX
==== 4341 ====
Cell
==== 4401 ====
Cell or RSX
==== 4402 ====
Cell or RSX
==== 4403 ====
Cell or RSX
==== 4411 ====
Cell or RSX
==== 4412 ====
Cell or RSX
==== 4421 ====
Cell or RSX
==== 4422 ====
Cell or RSX
==== 4432 ====
Cell or RSX
==== 4441 ====
Cell or RSX
{{Hardware Modification}}<noinclude>[[Category:Main]]</noinclude>