Syscon Error Codes

== Description ==
[[SC EEPROM|Syscon memory]] includes a 0x100-byte table for storing error codes. Each entry consists of a 4-byte error code plus a 4-byte timestamp, so the table can hold up to 32 errors. When the table is full and a new error needs to be stored, syscon deletes the oldest error.<br>
The timestamps store the number of seconds elapsed since the year 2000.


== How to get the syscon error log ==
If the PS3 still boots to the XMB and is able to install and run applications, you can use programs such as those listed at the top of the [[Platform_ID#Apps|Platform ID]] page.<br>
If the PS3 fails to boot, it is still possible to retrieve the syscon error log by connecting a PC to the syscon UART port using a "USB to TTL UART adapter" and running the '''errlog''' command. There is also the '''clearerrlog''' command to empty the error table (handy to avoid confusion with old error codes that may have accumulated over months or years and are unrelated to the current problem).
 
== Error log format ==
There are 2 error log formats, depending on the syscon type: one for [[Mullion]] and one for [[Sherwood]].<br>
The error codes and the timestamps are stored in little endian (the bytes of each 4-byte value are read right to left).<br>
The timestamps are described as J2000 format (number of elapsed seconds since 2000/1/1 12:00:00). To convert one to Unix time, add the offset between the two epochs. The offset quoted here, 946684800 seconds, is exactly the Unix time of 2000/1/1 00:00:00 UTC (30 years including 7 leap days), and it is the value that yields the 2005/12/31 00:00:00 date mentioned below; counting from 12:00:00 would instead require 946728000 seconds. See this link for more information: [https://stackoverflow.com/questions/35763357/conversion-from-unix-time-to-timestamp-starting-in-january-1-2000 1]<br>
If the battery was empty or removed when the error was triggered, the timestamp will be recorded as FFFFFFFF.<br>
If the battery is replaced but the time is not configured in GameOS, either manually or by network, the error log will seem to store timestamps starting with a date around 2005/12/31 00:00:00 (0x0B488680).<!-- this needs confirmation--><br>
More information about error log timestamp formats and loops is available in the [[Talk:Syscon_Error_Codes | Talk page]].<br>
*[https://www.epochconverter.com/ Unix epoch] starts counting at 1970/1/1 00:00:00
*[https://en.wikipedia.org/wiki/Julian_year_%28astronomy%29#Epochs J2000 epoch] starts counting at 2000/1/1 12:00:00
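
The timestamp conversion described above can be sketched in Python. This is a minimal sketch, not a reference implementation: the function name <code>errlog_time</code> is hypothetical, and it assumes the counter starts at 2000/1/1 00:00:00 UTC (the 946684800-second offset, which reproduces the 0x0B488680 example). If the epoch is strictly J2000 noon, add 43200 more seconds.

```python
from datetime import datetime, timezone

# Assumption: seconds between 1970/1/1 00:00:00 and 2000/1/1 00:00:00 UTC.
EPOCH_OFFSET = 946684800
NO_TIMESTAMP = 0xFFFFFFFF  # recorded when the battery was empty or removed


def errlog_time(raw: bytes):
    """Convert a 4-byte little-endian errlog timestamp to a UTC datetime.

    Returns None when the entry carries no timestamp (FFFFFFFF).
    """
    if len(raw) != 4:
        raise ValueError("a timestamp is exactly 4 bytes")
    value = int.from_bytes(raw, "little")
    if value == NO_TIMESTAMP:
        return None
    return datetime.fromtimestamp(value + EPOCH_OFFSET, tz=timezone.utc)


# The bytes 80 86 48 0B are the little-endian form of 0x0B488680.
print(errlog_time(bytes.fromhex("8086480B")))
```

The bytes are passed in dump order (as they appear in the hex view), so no manual byte swapping is needed before calling the function.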
 
<!-- It seems the error code FFFFFFFF represents the end of the loop. When the errorlog is cleared (empty) the first error is stored at most top, and the next errors follows an order from top to bottom. But after the errorlog is filled with 32 errors syscon needs to start overwriting the old errors by storing an aditional control error (FFFFFFFF) to indicate where the loop ends. In other words, when the errorlog is filled syscon needs to write 2 errors always, the real one and the control error (FFFFFFFF) displaced to a different offset -->
 
<div style="float:left"><div style="float:top">
{{boxcodelite|float=left|title=Syscon Errorlog from [[CECHAxx]], [[COK-001]], [[CXR713120-201GB]]|code=
Offset(h) 00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
                                               
00003700  01 10 80 A0  01 10 80 A0  01 10 80 A0  01 10 80 A0
00003710  01 10 80 A0  01 10 80 A0  04 10 80 A0  01 10 80 A0
00003720  01 10 80 A0  01 10 80 A0  01 10 80 A0  01 10 80 A0
00003730  01 10 80 A0  04 10 80 A0  01 10 80 A0  01 10 80 A0
00003740  01 10 80 A0  01 10 80 A0  01 10 80 A0  01 10 80 A0
00003750  01 10 80 A0  01 10 80 A0  04 30 09 A0  04 30 09 A0
00003760  04 30 09 A0  04 30 09 A0  FF FF FF FF  01 10 80 A0
00003770  01 10 80 A0  01 10 80 A0  01 10 80 A0  01 10 80 A0
                                               
00003780  20 CF 6D 16  13 23 A7 16  3E D6 D9 16  87 13 2A 17
00003790  17 3C 7C 17  E4 A2 A3 17  A2 15 D4 17  13 FB EB 17
000037A0  CD 7D EF 17  33 85 EF 17  12 8C EF 17  A7 D9 FB 17
000037B0  58 5E 0E 18  BB C9 66 18  CD 25 B5 18  49 C4 29 19
000037C0  75 D5 F9 19  04 8B 61 1B  17 67 D0 22  2D 67 D0 22
000037D0  03 07 6C 27  12 09 6C 27  FF FF FF FF  FF FF FF FF
000037E0  FF FF FF FF  FF FF FF FF  F0 E7 27 16  06 BD 33 16
000037F0  E5 DE 38 16  DD D4 5C 16  C4 AC 6C 16  EA C7 6D 16
}}
*In the error log sample above:
**Error log looped at least 1 time (1 errorcode FFFFFFFF)
**Timestamps are valid, and the time was configured
**Contains errors: A080<span style="background:#bbbbff;">&thinsp;1&thinsp;</span><span style="background:#ff8080;">&thinsp;001&thinsp;</span>, A080<span style="background:#bbbbff;">&thinsp;1&thinsp;</span><span style="background:#ff8080;">&thinsp;004&thinsp;</span>, A009<span style="background:#bbbbff;">&thinsp;3&thinsp;</span><span style="background:#ff8080;">&thinsp;004&thinsp;</span>
</div></div>
 
<div style="float:left"><div style="float:top">
{{boxcodelite|float=left|title=Syscon Errorlog from [[CECHHxx]], [[DIA-001]], [[CXR714120-301GB]]|code=
Offset(h) 00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
                                               
00003700  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003710  22 43 40 A0  34 30 40 A0  FF FF FF FF  34 30 40 A0
00003720  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003730  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003740  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003750  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003760  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003770  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
                                               
00003780  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
00003790  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037A0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037B0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037C0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037D0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037E0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037F0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
}}
*In the errorlog sample above:
**Errorlog looped at least 1 time (1 errorcode FFFFFFFF)
**Timestamps are invalid
**Contains errors: A040<span style="background:#bbbbff;">&thinsp;4&thinsp;</span><span style="background:#ff8080;">&thinsp;322&thinsp;</span>, A040<span style="background:#bbbbff;">&thinsp;3&thinsp;</span><span style="background:#ff8080;">&thinsp;034&thinsp;</span>
</div></div>
{{clear}}
 
<div style="float:left"><div style="float:top">
{{boxcodelite|float=left|title=Syscon Errorlog from [[CECHLxx]]/[[CECHMxx|Mxx]]/[[CECHPxx|Pxx]]/[[CECHQxx|Qxx]], [[VER-001]], [[SW-301]]|code=
Offset(h) 00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
                                               
00000900  31 21 80 A0  73 D4 50 0B  30 20 40 A0  89 D4 50 0B
00000910  31 21 80 A0  8C D4 50 0B  30 20 80 A0  8E D4 50 0B
00000920  30 21 80 A0  8E D4 50 0B  31 21 80 A0  8F D4 50 0B
00000930  01 30 00 A0  FF FF FF FF  FF FF FF FF  FF FF FF FF
00000940  30 20 80 A0  75 D1 50 0B  30 20 80 A0  78 D1 50 0B
00000950  30 20 80 A0  B2 D1 50 0B  31 20 80 A0  B3 D1 50 0B
00000960  30 21 80 A0  BD D1 50 0B  30 21 80 A0  D5 D1 50 0B
00000970  30 21 80 A0  DF D1 50 0B  30 20 80 A0  E0 D1 50 0B
00000980  31 21 80 A0  84 D2 50 0B  30 21 80 A0  DC D2 50 0B
00000990  31 21 32 A0  4F D3 50 0B  31 20 40 A0  50 D3 50 0B
000009A0  31 21 80 A0  51 D3 50 0B  30 21 80 A0  57 D3 50 0B
000009B0  31 21 80 A0  59 D3 50 0B  31 21 80 A0  FF D3 50 0B
000009C0  31 21 80 A0  05 D4 50 0B  30 20 80 A0  06 D4 50 0B
000009D0  30 20 80 A0  07 D4 50 0B  31 21 80 A0  2D D4 50 0B
000009E0  31 20 80 A0  3A D4 50 0B  30 21 80 A0  42 D4 50 0B
000009F0  30 21 80 A0  72 D4 50 0B  31 21 80 A0  72 D4 50 0B
}}
*In the errorlog sample above:
**Errorlog looped at least 1 time (1 errorcode FFFFFFFF)
**Timestamps are valid, but the time was not configured
**Contains errors: A080<span style="background:#bbbbff;">&thinsp;2&thinsp;</span><span style="background:#ff8080;">&thinsp;131&thinsp;</span>, A040<span style="background:#bbbbff;">&thinsp;2&thinsp;</span><span style="background:#ff8080;">&thinsp;030&thinsp;</span>, A080<span style="background:#bbbbff;">&thinsp;2&thinsp;</span><span style="background:#ff8080;">&thinsp;030&thinsp;</span><br>A080<span style="background:#bbbbff;">&thinsp;2&thinsp;</span><span style="background:#ff8080;">&thinsp;130&thinsp;</span>, A000<span style="background:#bbbbff;">&thinsp;3&thinsp;</span><span style="background:#ff8080;">&thinsp;001&thinsp;</span>, A080<span style="background:#bbbbff;">&thinsp;2&thinsp;</span><span style="background:#ff8080;">&thinsp;031&thinsp;</span><br>A032<span style="background:#bbbbff;">&thinsp;2&thinsp;</span><span style="background:#ff8080;">&thinsp;131&thinsp;</span>, A040<span style="background:#bbbbff;">&thinsp;2&thinsp;</span><span style="background:#ff8080;">&thinsp;031&thinsp;</span>
</div></div>
 
<div style="float:left"><div style="float:top">
{{boxcodelite|float=left|title=Syscon Errorlog from [[CECH-42xx]], [[PQX-001]], [[SW3-304]]|code=
Offset(h) 00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
                                               
00000900  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000910  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000920  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000930  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000940  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000950  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000960  02 18 61 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
00000970  34 30 40 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
00000980  34 30 40 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
00000990  34 30 40 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
000009A0  34 30 40 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
000009B0  34 30 40 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
000009C0  34 30 40 A0  FF FF FF FF  FF FF FF FF  FF FF FF FF
000009D0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000009E0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000009F0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
}}
*In the errorlog sample above:
**Errorlog not looped (more than 1 errorcode FFFFFFFF)
**Timestamps are invalid
**Contains errors: A061<span style="background:#bbbbff;">&thinsp;1&thinsp;</span><span style="background:#ff8080;">&thinsp;802&thinsp;</span>, A040<span style="background:#bbbbff;">&thinsp;4&thinsp;</span><span style="background:#ff8080;">&thinsp;002&thinsp;</span>, A040<span style="background:#bbbbff;">&thinsp;3&thinsp;</span><span style="background:#ff8080;">&thinsp;034&thinsp;</span>
</div></div>{{clear}}
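
The four samples above suggest two table layouts. This is inferred from the dumps, not from official documentation: in the first two samples (offset 0x3700) the 32 error codes come first and the 32 timestamps follow, while in the last two (offset 0x900) each error code is immediately followed by its timestamp. The Python sketch below splits a 0x100-byte table under that assumption. The names <code>parse_errlog</code> and <code>has_looped</code>, and the layout labels, are hypothetical. The loop check only mirrors the sample annotations (exactly one FFFFFFFF error code means the log has looped).

```python
import struct

EMPTY = 0xFFFFFFFF  # unused slot, or the marker left behind once the log loops


def parse_errlog(table: bytes, layout: str):
    """Split a 0x100-byte errlog table into 32 (error_code, timestamp) pairs.

    layout "mullion":  32 codes followed by 32 timestamps (assumed from samples)
    layout "sherwood": 32 interleaved code/timestamp pairs (assumed from samples)
    """
    if len(table) != 0x100:
        raise ValueError("the error table is exactly 0x100 bytes")
    words = struct.unpack("<64I", table)  # 64 little-endian 32-bit values
    if layout == "mullion":
        codes, stamps = words[:32], words[32:]
    elif layout == "sherwood":
        codes, stamps = words[0::2], words[1::2]
    else:
        raise ValueError("layout must be 'mullion' or 'sherwood'")
    return list(zip(codes, stamps))


def has_looped(entries) -> bool:
    """Heuristic from the sample notes: exactly one FFFFFFFF code means looped."""
    return sum(1 for code, _ in entries if code == EMPTY) == 1


# First row of the offset-0x900 sample, padded with FF to a full table.
row = bytes.fromhex("312180A073D4500B302040A089D4500B")
entries = parse_errlog(row + b"\xff" * (0x100 - len(row)), "sherwood")
for code, stamp in entries[:2]:
    print(f"{code:08X} {stamp:08X}")
```

The entries are returned in storage order. Finding the newest entry in a looped log requires locating the FFFFFFFF marker, which this sketch does not attempt.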


== Error code format ==
The error codes follow the format: '''<span style="background:#000000; color:#ffffff;">&thinsp;A&thinsp;</span><span style="background:#909090; color:#ffffff;">&thinsp;R&thinsp;</span><span style="background:#ffff80;">&thinsp;ST&thinsp;</span><span style="background:#bbbbff;">&thinsp;C&thinsp;</span><span style="background:#ff8080;">&thinsp;ERR&thinsp;</span>''', where:
 
'''<span style="background:#000000; color:#ffffff;">&thinsp;A&thinsp;</span>''' (Fixed)
*'''A''' = This is always "A"
 
'''<span style="background:#909090; color:#ffffff;">&thinsp;R&thinsp;</span>''' (Reserved)
*'''0-E''' = Unknown
*'''F''' = Frequent error (for example, motherboard damage/breakdown)
 
'''<span style="background:#ffff80;">&thinsp;ST&thinsp;</span>''' (Step Number)

'''00-7F''' = Step number of the Power On Sequence (POS) where the error happened. This is the Power On Self Test (POST) process. If it succeeds, the boot process begins, which loads the OS.
 
'''80''' = Static State (Power ON). The console had completed the POST and was powered on when the error occurred, for example during boot-up or gameplay. For instance, if your NEC/TOKINs are failing, you may experience 80 1002 errors.
 
'''90''' = Static State (Power OFF). The error happened when the PS3 was powering off. For example, if a problem causes the system to hang while shutting down, the console will beep before powering off, and an error with step no. 90 will be recorded in the errorlog.
 
 
'''A0''' = Immediately after SYSCON reset. When power is supplied to the PS3, it should enter Standby mode. Early "Phat" models will have a solid red LED illuminated to indicate Standby mode. The PS3 contains a Standby circuit that requires constant power so that it can wait for the user to turn on the console. Many other electronics also utilize this feature. This is also referred to as a Vampire circuit, as it uses constant power even when a user is not using the console. It enables the ability to turn it on remotely, thus mitigating the need to physically flip a switch.
 
The PS3 reset circuit comprises the SYSCON and its clock-generating crystal, the Bluetooth/WIFI card, the front power/eject and LED panel, and the thermal monitor ICs. The SYSCON must detect whether the console is being started manually with the power button or over Bluetooth using the controller, hence these modules must be powered. It is important to ensure the thermal monitors are functioning properly before releasing power to the Southbridge, CPU, or GPU. Neglecting this step could pose a fire hazard. The thermal monitor ICs serve as the PS3's fire alarms and are critical safety equipment.
 
If there is a hardware issue in this circuit, an error will occur after the SYSCON reset, preventing the console from powering on. Instead of a solid red LED, the front LED will flash red indefinitely upon plugging in the console, indicating an error with the standby circuit. There will also be an associated entry in the syscon error log, which can help identify the specific component causing the issue.
 
5v_MISC powers the reset circuit, which consists of SMD/SMT components and ICs that power the modules above. Please refer to the service manual (if available) for details. In the COK-001 service manual, the circuit diagram is provided on page 23/45. From the diagram, it is evident that DC/DC converter IC6005 generates +3.3v_EVER, while IC6006 produces +1.8V_EVER, and IC6009 is responsible for generating +3.3V_THERMAL. These are the primary voltages used by the components in the reset circuit, such as the Wi-Fi/Bluetooth card needed to start the console remotely by pressing the PS button on the controller.
 
 
'''<span style="background:#bbbbff;">&thinsp;C&thinsp;</span>''' (Category)
*'''1''' = System Error
*'''2''' = Fatal Error
*'''3''' = Booting Error
*'''4''' = Data Error
 
 
'''<span style="background:#ff8080;">&thinsp;ERR&thinsp;</span>''' (Error)
*This is a 3-digit hexadecimal error code that gives specific information about the issue. For example, System error 002 (1002) means "RSX VRM Failure."
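
The field layout above can be illustrated with a short Python sketch that splits an 8-digit code into its parts. The names <code>SysconError</code>, <code>decode</code>, <code>describe_step</code> and <code>short_code</code> are hypothetical and exist only for this example. The category and step descriptions are taken from the lists above, and any step value not listed there is reported as unknown.

```python
from typing import NamedTuple

# Category names as listed above.
CATEGORIES = {1: "System", 2: "Fatal", 3: "Boot", 4: "Data"}


class SysconError(NamedTuple):
    fixed: str      # A: always "A"
    reserved: int   # R: 0-E unknown, F frequent error
    step: int       # ST: step number
    category: int   # C: category
    error: int      # ERR: 3-digit error number


def decode(code: int) -> SysconError:
    """Split a 32-bit error code (already read as little endian) into fields."""
    text = f"{code:08X}"
    return SysconError(
        fixed=text[0],
        reserved=int(text[1], 16),
        step=int(text[2:4], 16),
        category=int(text[4], 16),
        error=int(text[5:], 16),
    )


def describe_step(step: int) -> str:
    """Describe the ST field using only the values documented above."""
    if step <= 0x7F:
        return f"power on sequence step {step:02X}"
    named = {0x80: "powered on", 0x90: "powered off", 0xA0: "after syscon reset"}
    return named.get(step, "unknown")


def short_code(err: SysconError) -> str:
    """Return the 4-digit category + error form used in the error code list."""
    return f"{err.category:X}{err.error:03X}"


err = decode(0xA0801002)
print(short_code(err), CATEGORIES[err.category], describe_step(err.step))
```

Keeping the step separate from the 4-digit code makes it straightforward to group log entries by "what" happened (category + error) and then compare "when" each one happened (step number).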
 
'''Discussion'''
----
The three-digit error code may repeat in other categories, but that does not necessarily indicate the same issue. System error <span style="background:#ff8080;">001</span> and Fatal error <span style="background:#ff8080;">001</span> do not mean the same thing: <span style="background:#bbbbff;">1</span><span style="background:#ff8080;">001</span> is "BE VRM Power Failure" and <span style="background:#bbbbff;">2</span><span style="background:#ff8080;">001</span> is "BE Error." The <span style="background:#ff8080;">error</span> number alone does not carry enough information to identify the problem; the <span style="background:#bbbbff;">category</span> gives the <span style="background:#ff8080;">error</span> its context. For that reason, the 4-digit code <span style="background:#bbbbff;">C</span><span style="background:#ff8080;">ERR</span> is used to tell them apart.


Likewise, the <span style="background:#ffff80;">Step Number (ST)</span> provides context to the 4-digit code. For example, a CELL VRM Power Failure can occur while playing an intense game (Static State, Power ON). That would generate an <span style="background:#ffff80;">80</span><span style="background:#bbbbff;">1</span><span style="background:#ff8080;">001</span> (step number <span style="background:#ffff80;">80</span>). However, the same error can also occur when the console is turned on, during the Power On Self Test (POST), before the boot loader has had a chance to start and load the operating system. In that case it might generate error <span style="background:#ffff80;">10</span><span style="background:#bbbbff;">1</span><span style="background:#ff8080;">001</span>. Step number <span style="background:#ffff80;">10</span> is lower than step number <span style="background:#ffff80;">80</span>, telling you this <span style="background:#bbbbff;">1</span><span style="background:#ff8080;">001</span> occurred earlier. The <span style="background:#bbbbff;">Category</span> + <span style="background:#ff8080;">Error</span> tells you "what" happened. The <span style="background:#ffff80;">Step Number</span> tells you "when" it happened. Together they build the context that can help you figure out what is causing the error.
 
It's crucial to examine the entire error log, not just the 4-digit <span style="background:#bbbbff;">C</span><span style="background:#ff8080;">ERR</span>, to understand "why" the error occurred. In the previous example, the <span style="background:#000000; color:#ffffff;">A</span><span style="background:#909090; color:#ffffff;">0</span><span style="background:#ffff80;">80</span><span style="background:#bbbbff;">1</span><span style="background:#ff8080;">001</span> may indicate an issue with the NEC/TOKIN Proadlizers, a type of capacitor in the CELL CPU's VRM (voltage regulation module). Conversely, <span style="background:#000000; color:#ffffff;">A</span><span style="background:#909090; color:#ffffff;">0</span><span style="background:#ffff80;">10</span><span style="background:#bbbbff;">1</span><span style="background:#ff8080;">001</span> may result from various other factors, simply because there is a larger number of things that can go wrong. Therefore, it's possible that the NEC/TOKINs are not the source of the error.
 
All of this means you must be knowledgeable about the hardware to interpret the errors stored in the SYSCON. Unfortunately, there is not a unique error code for every potential issue. For example, no error code indicates that capacitor C6900 is shorted. Nonetheless, there are a few exceptions, where a specific code typically has the same cause. For example, a blown fuse "usually" produces one particular code. Even then, we cannot dismiss the possibility that a capacitor failing on the same line is the cause, however unlikely that may be.
 
----
'''Examples''':<br>
 
<span style="background:#000000; color:#ffffff;">A</span><span style="background:#909090; color:#ffffff;">0</span><span style="background:#ffff80;">80</span><span style="background:#bbbbff;">1</span><span style="background:#ff8080;">002</span>
*System Error 002 ([[RSX]] VRM Power Fail), which occurred while the system was successfully powered on.
*1002 errors are known to be caused by faulty NEC/TOKINs, but other factors could also be involved. For additional information, refer to the Error Code section below.
<span style="background:#000000; color:#ffffff;">A</span><span style="background:#909090; color:#ffffff;">0</span><span style="background:#ffff80;">40</span><span style="background:#bbbbff;">3</span><span style="background:#ff8080;">034</span>
*Booting Error 034 ([[RSX]]/[[CELL BE|CELL]] Communication Error), which occurred at step no. 40 (BitTraining), before the Power On Sequence had completed.
*3034 errors are often caused by defects on the BGA or Bump (among other issues). Experienced PS3 repair technicians have noted that it is almost exclusive to the RSX. Although we cannot completely exclude the possibility of a CELL BE BGA/Bump defect being the cause, it has been the exception to the rule. Time and experience have demonstrated that 3034 errors are primarily an RSX problem. A repair technician needs to decide which processor to reball/replace based upon the more likely candidate and proceed accordingly, using their judgement.
*See Error Code section below for more details.
<span style="background:#000000; color:#ffffff;">A</span><span style="background:#909090; color:#ffffff;">0</span><span style="background:#ffff80;">21</span><span style="background:#bbbbff;">3</span><span style="background:#ff8080;">013</span>
*3013 errors are caused by a dead CELL BE CPU.
 
----
The following error code section lists only the last 4 digits (category + error). However, remember that the Reserved digit and the Step Number can help you figure out "when" the error occurred and how frequent it is. The last four digits are the most important for identifying the specific error, but you still need to interpret it in the context of your issue so that you can diagnose the problem and then fix it.


== Error codes ==

=== System Errors ===
----

==== 1001 (Power CELL) ====
*Components Involved:
**[[CELL BE|CELL]] (IC1001 on [[COK-001]])
**NEC/TOKIN Proadlizers (C6140/C6141/C6142/C6143 on [[COK-001]])
**Other nearby components of the power block


Speculation:<br>
This error may result from inadequate filtering on the CPU's core voltage (VDDC) or from an unexpected system shutdown. The link to VDDC filtering has not been confirmed, although a connection is strongly suspected. Voltage ripple or noise within a certain range could cause errors before it worsens into a [[CELL BE|CELL]] VDDC Power Failure (3003). Several SMD components are involved in filtering, but the most critical ones are the NEC/TOKIN Proadlizers (capacitors). Bad NEC/TOKINs are responsible for 1002 errors on the GPU, but diagnosing 1001 errors is not as straightforward. You must observe the console experiencing YLOD while under load and confirm the generation of a new 1001 error. Otherwise, the 1001 code may only indicate that the console was not turned off correctly.

1001 errors can be recorded automatically when the system experiences an unexpected shutdown or loss of AC power. They often arise during testing, when the console is repeatedly turned on and off instead of being shut down gracefully. They have been associated with other errors, but there does not appear to be any single cause. A0801001 errors alone cannot prove that NEC/TOKINs are failing. Such errors are regularly found in the logs of properly functioning machines and are not cause for concern unless the system is shutting down unexpectedly on its own.


A machine that can power on but displays graphical artifacts or no video may lead to misinterpretation of 1001 errors. In such cases, the console must be turned off forcibly, using the power switch at the back of Phat/Fat models or by pulling the power cord on Slim and Super Slim models. This may cause 1001 or 1004 errors, which can be ignored if they were not generated under normal circumstances. If a console is showing artifacts/GLOD, fix the larger problem first (usually a GPU problem requiring a reball/replacement). Only after that, if stress testing results in 1001 errors, should you troubleshoot the CPU NEC/TOKINs.

Anecdote: One console with faulty CPU NEC/TOKINs logged an A0901001 error only during shutdown. Running The Last of Us, a demanding game, showed none of the typical bad NEC/TOKIN behavior, and the system remained stable. However, shutting down took a prolonged time and ended in a YLOD (3 beeps and flashing red light), and the console required a reset to power back on. Replacing the NEC/TOKINs resolved the problem.
 
==== 1002 (Power RSX) ====
*Components Involved:
**[[RSX]] (IC2001 on [[COK-001]])
**NEC/TOKIN Proadlizers (C6229/C6230/C6231/C6232 on [[COK-001]])
**Other nearby components of the power block
 
This error is linked to inadequate filtering on the '''RSX_VDDC''' power line. A certain range of voltage ripple and noise can trigger this error before it becomes severe enough to cause an RSX_VDDC Power Failure (3004). The YLODs that produce 1002 errors range from occurring about 2 seconds after power on to occurring only during intense games.<br>
<br>
A number of SMD components are involved in filtering, but the main concern is the NEC/TOKIN Proadlizers (capacitors). 1002 errors are the fingerprint of faulty NEC/TOKINs.
 
==== 1004 (Power AC/DC) ====
*Components Involved:
**[[Power Supply]]
When a console loses AC power, it may generate error A0801004. During a normal shutdown sequence, the voltage regulators are turned off sequentially, in reverse order to the Power On Sequence (POS). This gives components enough time to enter a power-off state and helps prevent data corruption and voltage spikes or discharges that can damage sensitive components. 1004 errors commonly occur in machines that can power on but have graphical artifacts or no video, also known as GLOD. In such cases, the user has no choice but to power off improperly. Forcing a shutdown using the power rocker at the back of the console (Phat models), or by pulling the power cord (Slim and Super Slim models), causes an unexpected loss of AC power and prevents the SYSCON from completing the shutdown sequence.
 
 
This error may be disregarded if it occurred due to abnormal circumstances, such as a power outage or accidental unplugging. Since it did not result from a hardware malfunction, it is not a significant concern. If a console displays artifacts or the GLOD, the main issue should be addressed first, typically involving a GPU problem requiring a reball or replacement. Afterward, if 1004 errors reoccur, the AC/DC line, PSU, and its connection to the DC-DC converters should be diagnosed.
 
==== 1005 ====
On an NPX-001 Super Slim motherboard I was experiencing an A0091005 error. I started with the CELL power rail and began replacing components; after replacing all 5 aluminum polymer capacitors on the CELL side, the error was gone. This seems to be a Super Slim-only error, but more testing is needed to see whether it occurs on other motherboards. Another user reported fixing the same error code by replacing the same capacitors. Sometimes this error is paired with a 3003 error. I also had a PQX-001 board with A0801005 followed by A0043005 and found a short on the RSX on what I believe to be FBVDDQ for the 28nm; I was unable to fix it, as the line was still shorted after removing every component.


==== 1103 (Thermal Alert SYSTEM) ====
*Components Involved:
**[[CELL BE|CELL]]
**[[Thermal|CELL temperature monitor]] (only with mullion syscons; the CELL temperature monitors in PS3 slim and super slim models are unable to send this error code)
Syscons have a pad/pin specifically for this signal. It was given an official generic name (not indicating anything that triggers it) because several components can send it. In the first PS3 models (with mullion syscon?), the signal can be sent by CELL or the CELL temperature monitor using the official function names SYS_THR_ALRT or THERMAL_OVERLOAD.<br>


This electrical design is not specific to the PS3, however. There could be other devices based on the IBM CELL and developed by SONY in which this error code is sent by other components, and which could have more than one CELL. In general, this error code indicates that one (or more) of the CELL processors (and maybe other components not present in retail PS3 models, or their temperature monitor chips) is overheating.
 
At least 3 consoles that had CPU trace damage resulting from failed IHS delid attempts exhibited A0801103 and A0902203 errors.
 
==== 1200 (Thermal CELL) ====
*Components Involved:
**[[CELL BE|CELL]]


CPU Overheat. This is a common error. The usual culprit is failed Thermal Interface Material (TIM). As the material ages it "dries" allowing air inside. Air is a heat insulator, reducing the TIM's ability to transfer enough heat away from the processor. The system fan will steadily get louder over time until it cannot keep up. Once the processor approaches it's Thermal Shutdown Temperature a Yellow LED begins flashing on the console (Early Phat Models). Once it reaches the Thermal Shutdown Temperature the console will beep three times and hard shutdown, flashing red until the console is unplugged and the error state reset. Error 1200 is generated in the SYSCON errorlog.
CPU Overheat. This is a common error. The usual culprit is failed Thermal Interface Material (TIM). As the material ages it "dries" allowing air inside. Air is a heat insulator, reducing the TIM's ability to transfer enough heat away from the processor. The system fan will steadily get louder over time until it cannot keep up. Once the processor approaches it's Thermal Shutdown Temperature a Yellow LED begins flashing on the console (Early Phat Models). Once it reaches the Thermal Shutdown Temperature the console will beep three times and hard shutdown, flashing red until the console is unplugged and the error state reset. Error 1200 is generated in the SYSCON errorlog.
If that still doesn't work, it could be an issue with the temperature monitor chip (IC1101). Beyond that, some users have noted that dead CPUs can throw error 1200. However, that's the limit of our current understanding. The CPU could be dead, or have another unexplained issue, but usually reflowing or reballing is the last-ditch effort to revive such a console.


==== 1201 (Thermal RSX) ====
RSX Thermal Error
*Components Involved:
**[[RSX]]
 
GPU Overheat. This is the same as error 1200 above, except it's for the GPU. The same repair steps apply, except its Temperature Monitor Chip is IC2101. This error is rare. Out of hundreds of consoles and years of user reports, this error has only occurred when the user forgot to replace the RSX heatsink when testing the console. It has not been reported under normal circumstances. The RSX tends to fail long before the TIM degrades to the point where thermal shutdown is reached.
 
==== 1203 (Thermal CELL VR) ====
*Components Involved:
**[[CELL BE|CELL]] voltage regulators
**[[Thermal#Temperature_Monitors|Temperature Monitors]]
 
Thermal monitor "No Command tag Specified" error, raised while the thermal monitor was communicating with the SYSCON, or Thermal Shutdown (Hardware Initialized). Possibly caused by a failing thermal monitor or unreliable connections to it. It has been seen together with CPU or GPU overheats (1200 or 1201). A0801103/A0902203 have been seen with CPU trace damage (failed delid). It is sometimes seen with a 1001/1002 combo associated with bad NEC/TOKINs (rare), or when the PS3 was powered on while hot (after using a heat gun).
 
Some PS3 motherboards ([[TMU-520]], [[COK-001]], [[COK-002]]) have a temperature monitor located somewhere in the CELL power block. The other retail PS3 motherboard models do not measure the temperature of the CELL VR.

All the PS3 temperature monitor chips have an integrated internal thermal sensor plus 2 pins for an optional external sensor. The temperature monitors for CELL and RSX are configured to use the external sensor, but the one for CELL VR probably uses the internal sensor.
 
==== 1204 (Thermal South Bridge) ====
*Components Involved:
**[[South Bridge]]
 
==== 1205 (Thermal EE/GS) ====
*Components Involved:
**[[CXD2953AGB]] or [[CXD2972GB]]
**See also: [[PS2 Compatibility|Emotion Engine / Graphics Synthesizer]]
 
This error is specific to [[COK-001]]/[[CXD2953AGB]] (with full PS2 hardware compatibility, EE+GS) or [[COK-002]]/[[CXD2972GB]] (with partial PS2 hardware compatibility, GS only).
 
==== 1301 ([[CELL BE|CELL]] PLL Unlock) ====
 
Has been reported in a console where the CPU die was chipped during a delid attempt gone wrong. The console exhibited a Green Light of Death (GLOD) and shut down periodically with A0801301.

On another console 1301 occurred after both the CPU and RSX were reballed. The reball of the CELL likely failed, or damaged it.
 
On a third console it was reported after nearly every chip on the Motherboard was heat gunned. They probably didn't achieve the necessary temperatures to reflow the CPU, and if they did they probably damaged it by using too much heat.
 
In all 3 cases the CPU was damaged or heated in some way.
 
401001/401301/402120 occurred when IC6002 pins 17/18 were accidentally shorted. It blew out C6025 & PS6001. Dead CPU possible. Double check CELL_PLL Voltage and clock generators. Rare code.

Occurred while overvolting a 40nm CXD5300 RSX to achieve a higher overclock on a CECH-2501A. An A0801002/A0801301 occurred when the voltage regulation module exceeded its design specifications. SB UART reported that a Busy loop was detected, suggesting that poor filtering under these extreme conditions can cause a PLL unlock. It did not occur on every YLOD. Note that the 25xx models do not have NEC/Tokins.
 
==== 14FF (Check Stop) ====


This error can occur when the console was on at the time the YLOD occurred. On consoles exhibiting this error, subsequent attempts to start the console resulted in a GLOD with 1601/1701 errors, or a YLOD within 2 seconds. SYSCON errors usually show one A0801601/A0801701 occurring at the same timestamp, followed thereafter by 3034/4xxx errors for all subsequent attempts to power it on. Otherwise it will GLOD and throw more 1701/1601 errors. The working theory is that there is a precarious solder joint (BGA or bump defect) teetering on the edge of breaking. It will soon switch to 3034 with or without 4xxx errors.


Complicating the issue is the fact that sometimes people will get a 1301 or 1802 also. It likely has to do with where the joints are failing, with those subsystems briefly involved before the joint fully breaks.


Unlike Livelocks, which can be caused by both hardware and software conditions, Checkstops are a hardware issue. A checkstop occurs when the CPU or GPU, cache, memory, or I/O bus controller finds something in an impossible state (impossible unless the hardware is broken). The error isn't identified as a particular bus transfer in progress, or the CPU/GPU detects that the console is stuck (frozen, with no progress being made on that operation). When nothing can be done for a long enough period of time, the checkstop error is logged and BE ATTENTION is driven High. SYSCON immediately shuts the console down with errors A08014FF and A0801701.


The most likely cause of the error is a failing GPU (RSX) solder joint (BGA or Bumps). A distant second is a failing CPU (CELL) solder joint (BGA).


==== 1601 ([[CELL BE|CELL]] Livelock Detection) ====


CPU is deadlocked and cannot proceed. Some kind of error occurred, preventing a process from completing. It is the software equivalent of trying to pass someone in a hallway when you both keep choosing the same direction to swerve. Now imagine you had exactly 30s to make it to the other end of the hallway to catch an elevator, and it takes 29s to get there. Neither of you can pass, and you both miss your elevators because of it. Now imagine you were supposed to pass an envelope to a person on the 3rd floor, who had 30s to read it and enter it in a spreadsheet. Now he misses his deadline too. And imagine the entire organization was micromanaged like this. One disruption can cause the whole operation to grind to a halt! That's roughly how this works.


Basically this means the console froze and had to reboot. In the PS3 this is often preceded by graphical artifacting. The cause is often a solder joint on the RSX (BGA or Bumps). Generally these errors are seen in the early stages of a GPU failure. However, CPU failures cannot be ruled out. They are just less likely.


Speculation:


As the impedance of propagating solder cracks increases, the digital logic core has a harder time calibrating the FlexIO during BitTraining. Once impedance reaches the limits of the compensation network, interference causes random issues during software execution. LiveLock conditions cause BE Attention signal to be driven High and the SYSCON shuts the console down (YLOD) with errors A0801601 and A0801701.
 
As the console cools the microscopic gaps in the solder can be physically reconnected by thermal warping. Warping is due to differences in the Coefficients of Thermal Expansion (CTE) between materials in the motherboard and processor. This expansion and contraction can reconnect the solder joints just enough to allow the console to boot. Or it may disconnect them.
*If they reconnected, the console will boot until it experiences another 1601/1701 event.
*If they do not reconnect, the console cannot complete BitTraining and will fail in POST with error A0403034, often with an associated Data error such as A0404401 (if the broken solder joint affected a Data line on one of the SPI lines). If there is no Data error, the broken joint only affected the voltage for the SPI line: either RSX_VDDR or YC_RC_VDDIO.
 
If a YLOD turns into a GLOD after reball/reflow then 1601 (with or without 1701) could mean the [[RSX]] RAM was damaged. This is a loose association based on a few user reports.
 
==== 1701 ([[CELL BE|CELL]] BE ATTENTION) ====
 
BE ATTENTION is an active-high output flag sent by the CPU to the SYSCON. During initialization & configuration it is used to request an operation by the SYSCON. When ATTN goes High the syscon reads the SPI Status Register to determine the cause of the Attention signal. It remains high until software resets the condition that caused it.
 
After Power On Reset the BE attention signal is driven low and is supposed to stay there! If there is a Checkstop error (14FF), Livelock Detection (1601), or PLL Unlock (1301) the CPU enters a fault condition and raises the Attention signal (1701) during operation. The SYSCON sees this and immediately shuts the console down with error code A0801701 and usually another error indicating the cause. One common way this happens is when a solder connection breaks while the system is on. This could be the BGA (Ball Grid Array) or the Solder Bumps under the die.
 
Going into more detail, BE Attention is used during Power On Reset (POR)...
*To load CPU VID voltage from the VRM internal registers.
*To Write configuration-ring data (Important CPU Config settings that should only be modified at boot, otherwise errors can occur).
*To calibrate the FlexIO interface (BitTraining).


If Attention occurs during the Power ON State (Step# 80) it indicates an error condition. Basically, something is flagged by the Processor as abnormal. It is forced to attempt to resolve the problem before it can continue with whatever it was trying to do. If the error condition cannot be resolved, the CPU sends the ATTENTION signal to the SYSCON. The SYSCON immediately shuts off the console, then reads the SPI Status Register to determine the cause. Then it records A0801701 in its errorlog along with the specific cause (if it determined one). Errors that can cause the Attention include:
*Unresolved Checkstop errors (14FF)
*Livelock Detection (1601)
*PLL Unlock Condition (1301)
*BGA/Bump Defect that occurs while the Console was On (Step# 80). Subsequent attempts to power on the console would result in 3034/4xxx errors.
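
The relationship between 1701 and its companion codes listed above can be sketched in a few lines of Python. This is only an illustration: the function name and the input format (a list of the 8-digit codes logged for one shutdown event, such as A08014FF) are assumptions, not part of any syscon tool.

```python
# Companion codes that, per the list above, can raise BE ATTENTION (1701).
ATTENTION_CAUSES = {
    "14FF": "Checkstop",
    "1601": "Livelock Detection",
    "1301": "PLL Unlock",
}

def attention_causes(event_codes):
    """Return the likely causes of a 1701 found in one shutdown event.

    event_codes: hypothetical list of errlog codes such as "A08014FF".
    Only the last 4 hex digits (the error code itself) are inspected.
    """
    short = [code.strip().upper()[-4:] for code in event_codes]
    if "1701" not in short:
        return []  # no BE ATTENTION was logged in this event
    return [ATTENTION_CAUSES[c] for c in short if c in ATTENTION_CAUSES]

print(attention_causes(["A08014FF", "A0801701"]))
```

An empty result with a 1701 present would correspond to the last bullet: no specific cause was recorded, so a BGA/bump defect remains a candidate.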


One user got this error code with a damaged hard drive. He was transferring some games via FTP when his console turned off with a YLOD. When he tried to turn it on again, he got a GLOD. The problem was fixed just by changing the HDD.
 
1701 has been reported from using homebrew apps that caused a software conflict. Uninstalling the software can resolve the issue. If that's not possible because the system is locked up, it may be necessary to restore the operating system (OS).
 
1701 and 1601 can also occur when exiting heavier games on a DEX firmware paired with multiMAN. One user's guess is that DEX and webMAN allocate 2 more megabytes of RAM (the webMAN FPS counter shows 18 allocated to the XMB on CEX and 20 on DEX), which interferes with the syscon fan curve and causes it to get "stuck". The console then overheats and YLODs upon exiting the game, although it reboots fine, which may trick people into thinking webMAN or DEX overheats their consoles. This has not been investigated in depth, but the issue arose when converting to DEX and converting between CEX and DEX fixed it, so there appear to be fringe cases where this is a problem.
 
==== 1802 ([[RSX]] Initialization) ====
 
A0801802 can occur after the console has booted (step# 80), together with a Checkstop error (14FF) that raises the BE Attention (1701) alarm. Likely the 1802 was the hardware failure that caused the checkstop error. That causes BE ATTENTION to be driven High, and the SYSCON shuts the console down with A0801802, A08014FF, and A0801701. That makes sense because the CPU couldn't continue with its process when the RSX interrupt occurred. These errors have been seen in consoles that were repaired by an RSX reball/replacement.

1802 is confirmation that the RSX was involved, if there's any doubt about what's causing a 3034.
 
==== 1900 (RTC Voltage) ====
RTC voltage


==== 1901 (RTC Oscillator) ====
RTC oscillator


==== 1902 (RTC Access) ====
RTC access
'''1b01 ([[CELL]] Initialization)'''
CPU Thermal Sense Error. Thermal Monitor (IC1101) external sense line. Check C1103, R1106/7 & replace IC1101 (COK-00x) before reballing CPU. If all else fails the CPU's thermal diode is dead.
This error tends to occur at step number 20, during core initialization.
It is suspected that removing R1106/7 or the CPU itself will cause A0201b01/A0A02030 errors. This hasn't been confirmed through sabotage testing.
==== 1b02 ([[RSX]] Initialization) ====
RSX Thermal Sense Error. Thermal Monitor (IC2101) external sense line. Check C2103, R2101/2 & replace IC2101 (COK-00x) before reballing/replacing RSX. If all else fails the GPU's thermal diode is dead.
This error tends to occur at step number 20, during core initialization.
Confirmed that removing R2101/2 or the GPU itself causes A0201b02/A0A02031 errors.


----

=== Fatal Errors ===
----
*These fatal error codes seem to be repeated up to 3 times for 3 special cases. For example, errors '''20'''03, '''21'''03, and '''22'''03 are all related to the southbridge; the only thing that changes in the error code is the second digit (located immediately after the category digit 2). If at some point we find out what that second digit means, we can join the wiki page sections together (with titles: "2001 & 2101", "2002 & 2102", "2003 & 2103", etc...)<br>
In other words, there are 3 groups: '''20xx''' (composed of 13 errors), '''21xx''' (composed of 13 errors), and '''22xx''' (composed of 1 error). See {{Talk}}
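
The grouping described above can be illustrated with a small Python sketch. The helper name is hypothetical, and treating the second digit as a pure "group" selector is exactly the unconfirmed assumption discussed in this section, so this only shows how related codes would line up.

```python
def split_fatal_code(code):
    """Split a fatal error code into (group, base code).

    Assumption: the second hex digit (0, 1 or 2) only selects the group,
    so 2003, 2103 and 2203 all share the base code 2003.
    """
    code = code.strip().upper()
    if len(code) != 4 or code[0] != "2" or code[1] not in "012":
        raise ValueError("not a 20xx/21xx/22xx fatal error code: " + code)
    int(code, 16)  # raises ValueError if the code is not hexadecimal
    group = code[:2] + "xx"
    base = "20" + code[2:]
    return group, base

for example in ("2003", "2103", "2203"):
    print(example, split_fatal_code(example))
```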
==== 2001 (CELL) ====
[[CELL BE|CELL]] (IC1001)
==== 2002 (RSX) ====
[[RSX]] (IC2001)
==== 2003 (South Bridge) ====
[[South Bridge]] Error (IC3001)
==== 2010 (Clock Subsystems) ====
Clock Generator Error (IC5001)
==== 2011 (Clock CELL) ====
Clock Generator Error (IC5003)


==== 2012 (Clock CELL) ====
Clock Generator Error (IC5002)


==== 2013 (Clock CELL, RSX, South Bridge) ====
Clock Generator Error (IC5004)


'''2014 (Unknown)'''
 
Bad GPU NEC/TOKINs if associated with 1002. Was reported in a DIA-002 with other errors: A0091002/A0102014, A0101002/A0102113, and also A0101002 + 10x A0202120. Presumably caused by failed RSX tokins.
 
==== 2020 (HDMI) ====
HDMI Error (IC2502)
 
This code is not diagnostic on its own. When coinciding with 1601/1701, 14FF, 1301, and 3034 it usually means a GPU issue. When coinciding with a 1002 it's usually the NEC/TOKIN proadlizers. When several occur in bunches within the same power on AND without more diagnostic codes, it may be the MultiAV or HDMI Transmitter ICs. The presence of other codes gives you context for their meaning.
 
'''2021 (Unknown)'''
 
Rare code. Occurred in a CECHBxx model with 10x A0202121 occurring throughout the power on sequencing before starting the bootloader. It did not prevent boot before the console overheated. The errorlog shows two 10x 2121 + 1200 error combos, one of which also had A0802021 coincide. The log shows the console originally had a NEC/Tokin issue (A0801002) before this started. While this is an unknown error combo, it may be similar to 2120. On another console, timestamps showed A0101002 occurred first and then 10x 2120s occurred over the next 10 seconds. Replacing the tokins fixed that console. It's possible this is a similar situation, but with a new code.
 
==== 2022 (DVE) ====
 
DVE Error (IC2406, CXM4024R MultiAV controller for analog out)


This error may be normal in an otherwise working console. It has been observed in the errorlogs of perfectly operational units and can occur naturally from AV issues.


This error has been observed with no video out using HDMI on a Samsung Smart TV. The user reproduced the error by making the TV detect another console first (a PS4), turning off the TV, swapping the HDMI cable from the PS4 to the PS3, and turning the TV back on.


This error is also present when the console produces graphical artifacts on the screen. The console freezes and cannot be used, forcing the user to turn off the console. This produces the 2022 error code and is an early sign of GLOD.


It is often seen coinciding with 1601/1701, 14FF, 1301, and 3034 in case of Bad GPU (Common). DVE or HDMI Transmitter possible. If so, multiple errors at the same timestamp allow you to distinguish between causes.
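
As a rough illustration of comparing codes that share a timestamp, the Python sketch below groups log entries by time. The input format (pairs of seconds-since-2000 and code) and the function name are assumptions made for this example, and the epoch is assumed to be 2000-01-01 00:00:00 UTC.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumed epoch for syscon timestamps (seconds elapsed since 2000, UTC).
EPOCH_2000 = datetime(2000, 1, 1, tzinfo=timezone.utc)

def group_by_timestamp(entries):
    """Group (seconds_since_2000, code) pairs by their timestamp.

    Returns a dict mapping each datetime to the list of codes logged at
    that moment, ordered from oldest to newest.
    """
    groups = defaultdict(list)
    for seconds, code in entries:
        when = EPOCH_2000 + timedelta(seconds=seconds)
        groups[when].append(code.strip().upper())
    return dict(sorted(groups.items()))

# Hypothetical log: a 2022 logged together with 1601/1701, then a later 3034.
log = [(1000, "A0801601"), (1000, "A0801701"), (1000, "A0802022"), (1060, "A0403034")]
for when, codes in group_by_timestamp(log).items():
    print(when.isoformat(), codes)
```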


This error can also show up when opening and closing a PS2 emulated game on a CFW console, both in Evilnat and Rebug. The errors appear in pairs. If this is the case there is no cause for concern.


==== 2024 (AV) ====
This code is not diagnostic on its own. When coinciding with 1601/1701, 14FF, 1301, and 3034 it usually means a GPU issue. When coinciding with a 1002 it's usually the NEC/TOKIN proadlizers. When several occur in bunches within the same power on AND without more diagnostic codes, it may be the MultiAV or HDMI Transmitter ICs. The presence of other codes gives you context for their meaning.


This error tends to cause a delayed Yellow Light Of Death (10s - 1min). Sometimes described as a Green Light Of Death (GLOD) or Red Light Of Death (RLOD).


2124 and 2024 errors occurring in random bunches, registering several per power on attempt, have been fixed by replacing both the AV and HDMI encoders. One user reported 2024/2124 errors resolved by replacing the HDMI encoder. Another removed the HDMI encoder and tested the console without it. That console primarily filled the errorlog with 2124 errors, but a few 2024s as well. So it is unclear if 2124 is specific to the HDMI Encoder or AV Encoder. It seems it could be either.
 
A0A02024 occurred in a KTE-001 with a failed Bluetooth/Wifi module step-up voltage converter. A0002024/A0002124/A0003001 occurred when attempting to power on without 12v connected. A0A02024 was also recorded. When 12v was connected the same codes would occur at step no. 09 instead of 00.
 
==== 2030 (Thermal Sensor, CELL) ====
*Components Involved:
**[[CELL BE|CELL]]
**[[CELL BE|CELL]] [[Thermal#Temperature_Monitors|Temperature Monitor]] (IC1101 on [[COK-001]])


Thermal Monitor (IC1101) external sense line. Check C1103, R1006/7 & replace IC1101 before reballing CPU. If all else fails the CPU's thermal diode is dead. Was seen in a PS3 that was destroyed by a heatgun. Also had A0A02031/2033 & A0902031.


Speculation: 2030-33 errors reported in case of dodgy PWR/EJT daughter board.


==== 2031 (Thermal Sensor, RSX) ====
*Components Involved:
**[[RSX]]
**[[RSX]] [[Thermal#Temperature_Monitors|Temperature Monitor]] (IC2101 on [[COK-001]])
GPU Thermal Monitor (IC1002) external sense line. Check C2103, R2101/2 & replace IC2101 before reballing/Replacing RSX. If all else fails the GPU's thermal diode is dead. Confirmed when the RSX is removed, you'll get 1b02/2031 at step number 20.  Was seen in a PS3 that was destroyed by a heatgun, which also had A0A02030/2033 & A0902031. Once reported to be caused by a checksum mismatch at address 3dfe.
 
==== 2033 (Thermal Sensor, South Bridge) ====
*Components Involved:
**[[South Bridge]]
**[[South Bridge]] [[Thermal#Temperature_Monitors|Temperature Monitor]] (IC3101 on [[COK-001]])
Typically a dead SB Thermal Monitor IC. Check nearby SMDs & traces. Was seen in a PS3 that was destroyed by a heatgun. Also had A0A02030/2031 & A0902031.
 
==== 2040 ====
Found during sabotage testing on a KTE-001 board that removing F6300 caused an A0012040 error. This fuse appears to be on the 12v line.

For Super Slim models: reflow or reball the CPU.
 
==== 2044 (Super Slim short circuit - BT/Wi-Fi and 5Volt) ====
 
==== 2101 (CELL) ====
[[CELL BE|CELL]] (IC1001)
 
Often coincides with A0403034, usually indicating that the GPU needs to be replaced. Delidding can cause this; look for trace damage. In one case, errors A0402101 / A0403034 occurred because RSX TX1 was shorted to ground by a nicked RSX trace during the delid. TX is the transmit line, so the CPU didn't receive data from it, and noted the error (BitTraining BE:RRAC:BX0:BX:FLEXIO_ID).
 
==== 2102 (RSX) ====
[[RSX]] (IC2001)
 
In several reports IC6301 replacement fixed it. In one case, a 10x 2120 / 1x 2102 combo was fixed by replacing the RSX_VDDIO voltage controller (IC6317). RSX_FBVDDQ (VRAM voltage) implicated. In most cases, it's an RSX failure, sometimes coinciding with A0403034 or other codes indicating GPU failure, often after a reflow attempt.


==== 2103 (South Bridge) ====
Southbridge Error (IC3001)


==== 2110 (Clock Subsystems) ====
Clock Generator Error (IC5001)


This error can be caused by a 5V_MISC short to ground. One user had an A0022110 after replacing IC6105 (Buck Converter) and accidentally bridging the 5V voltage input. So check the 5V line for shorts.
 
This error has been resolved by a number of users who had a short on F6001. It is important to note that something usually causes that fuse to blow, like a short. So it's important to troubleshoot the board to find and repair the shorting component before replacing the fuse. Otherwise the new one will blow too.
 
One user, who resolved this error on his C model PS3, noted "very short YLOD. Error code shows 2110[...]Some earlier code shows 1001 and 1002." The 1001 & 1002 errors he noted in the log before the 2110 appeared may have been a clue that C6019 or C6020 (as they are in parallel) was deteriorating. Further investigation is needed to confirm this hypothesis, however. In his case, C6019 was shorting and caused F6001 to blow. This short overloaded F6001 and cut power to many Subsystems, such as the HDD, USB ports, South bridge, CPU, GPU, etc.
Another user confirmed this. The error log was showing code 2110 and one entry earlier was showing code 1001. Checking both capacitors after removing them from the board, confirmed that one capacitor was reading 140 ohms and not reading as a capacitor, so it was working as a resistor causing extra load in the fuse.
 
One particularly noteworthy component is IC6020, which supplies +3.3v_MK_Vdd to the clock generator (IC5001). When F6001 blows, a 02 2110 is generated. A step number of 02 is very early in the power on sequence (POS), which explains why 2110 is triggered instead of another error code. Since the clock generator is critical for timing, it is one of the first things the SYSCON checks during the POS.
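
To make the "02 2110" reading concrete, here is a minimal Python sketch that splits a full 8-digit errlog code into its parts. The layout used (2-digit prefix, 2-digit step number, 4-digit error code) is inferred from the examples on this page, such as A0022110 and A0801701, and the names are made up for the illustration.

```python
from typing import NamedTuple

class SysconError(NamedTuple):
    prefix: str  # leading byte, "A0" in the examples on this page
    step: str    # step number, e.g. "02" (early power on) or "80" (power on state)
    code: str    # 4-digit error code, e.g. "2110"

def decode_error(raw):
    """Split an 8-hex-digit errlog entry into prefix, step number and error code."""
    text = raw.strip().upper()
    if len(text) != 8:
        raise ValueError("expected 8 hex digits, got: " + raw)
    int(text, 16)  # raises ValueError if the entry is not hexadecimal
    return SysconError(text[0:2], text[2:4], text[4:8])

print(decode_error("A0022110"))
```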
 
==== 2111 (Clock CELL) ====
Clock Generator Error (IC5003)
 
Once reported in a console with a bad RSX Thermal monitor. It had mostly 2031 errors at various step numbers. The 2111 was a rare occurrence. SYSCON reported it as an "Unrecoverable FATAL ERROR by thermal." Check C2103, R2101/2 & replace IC2101 before reballing/replacing RSX. If all else fails the GPU's thermal diode is dead.
 
==== 2112 (Clock CELL) ====
Clock Generator Error (IC5002)
 
==== 2113 (Clock CELL, RSX, South Bridge) ====
Clock Generator Error (IC5004)
 
Analog Voltage for the core PLL of IC5004, which is an ICS9214 Clock Generator used to support the Rambus XDR memory subsystem and Redwood logic interface.
 
SW_1_B enables control Pin 5 on IC6013, which generates +2.5V_LREG_XCG_500_MEM. If that fails it generates A0092113.

Reportedly fixed by replacing IC5001. One person tried replacing X5301, but shorted C5142 (2.5v to GND). This killed power to IC5004 (RSX/CELL/SB Clock Generator for FlexIO) and caused error A0092113. IC5004 relies on the +1.2V_YC_RC_VDDIO reference voltage to carry the signals. That can be affected by RSX/CPU faults. Another possibility is F6302 or nearby SMDs, which supply 1.7V_MISC to IC6303 to generate +1.2V_YC_RC_VDDIO, among other voltages required to start CPU/SB/GPU.

'''2114 (Unknown)'''

Bad GPU NEC/TOKINs if associated with 1002 and/or 3004. Has been reported in VER-001 and DYN-001 motherboard revisions. Related codes have been reported in a DIA-002, with A0091002/A0102014, A0101002/A0102113. That was presumably caused by failed RSX tokins.

A lone 2114, or one associated with 2124, 3020, 1301, and/or 3034, may be GPU/BGA related. Possibly the HDMI encoder (MN864709), or the Texas Instruments 88J9LKK C5714 G4 clock generator, but evidence for both of those cases is weak. However, given its similarity to error code A0092113, which is related to the clock generators, a connection is suspected.
 
==== 2120 (HDMI I/O Error) ====
NOTE: Context matters with this error code! The step number and the number of codes per YLOD is different. Careful observation allows you to diagnose the most likely cause.


2120 means an issue has occurred with the high speed data bus connection between the DVE <--> RSX <--> HDMI transmitter. This I/O error DOES NOT mean the HDMI encoder (IC2502) is bad. It is context based: associative, not diagnostic by itself. You must infer the diagnosis by using other, more diagnostic codes and by observing console behavior to identify the cause.


Count the number of 2120s your SYSCON records per YLOD event; look at the timestamp. 10x A0202120 + A0213013 error combinations appear to be related to VDDIO, the reference voltage powering the I/O bus. IC6301 is involved in the formation of +1.7V_MISC, which among other things provides input power to the DC-DC converters that output +1.2V_YC_RC_VDDIO, +1.5V_YC_RC_VDDA, +1.2V_SB_VDDC and +1.2V_SB_VDDR. Lack of voltage to these DC/DC converters downstream of IC6301 suggests F6302 has blown. A number of people have fixed these 2120/3013 error combos by finding shorts at or near C6320 and replacing Fuse F6302. But there are many other SMDs nearby that might cause these fuses to blow. So you will need to track the source of the short and fix it, or the fuse will just blow again.


A bad thermistor (TH2501) has been reported to cause A0002120. It provides over current protection for the HDMI transmitter and output device in case there's a 5v short. This might happen if pins 17 (GND) and 18 (+5v) are damaged on your HDMI port or cable. Or if C2558 or C2570 short. See the service manual circuit diagrams as there are other SMDs that could malfunction and cause this error.


A0802120 and A0902120 errors can be caused by BGA or Bump defects that affect I/O, either the RSX or CELL. BGA defects on RSX VDDIO pads have been confirmed with a pressure test to have caused 2120 errors, but usually only one of them occurs per YLOD event. For example, one YLOD event may generate A0403034, A0404412 and an A0902120 error. This would indicate a bad GPU, not a bad HDMI transmitter. And since it occurred during the shutdown state (step number 90) this excludes issues that would have generated an error earlier in POST, like a fuse or a short in the Voltage regulation module (VRM).


The HDMI transmitter (IC2502) can also cause A0802120 and A0902120 errors. It can be the IC itself or any of the SMDs between it and the RSX. You can tell a genuine HDMI transmitter issue apart because there are multiple A0802120 errors occurring during the bootloader, after the console has completed the power on self test (POST). This excludes fuse and VRM issues, as indicated by the step number 80 (power on state). You will usually see a different number of 2120s at random, like 4 or 6 of them. This is different from the 10x 2120/3013 error combo or the 3034/4xxx/2120 combo described earlier.
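
The context rules for 2120 described in this section can be summarised as a rough Python sketch. It is only a reading aid under stated assumptions: the function name is made up, the input is assumed to be the list of 8-digit codes logged in one YLOD event, and the thresholds simply mirror the wording above, not any official diagnostic procedure.

```python
def classify_2120(event_codes):
    """Suggest the most likely cause of 2120 errors within one YLOD event."""
    codes = [c.strip().upper() for c in event_codes]
    hits = [c for c in codes if c[4:] == "2120"]
    others = {c[4:] for c in codes if c[4:] != "2120"}
    if not hits:
        return "no 2120 in this event"
    steps = {c[2:4] for c in hits}  # step numbers at which 2120 was logged
    if len(hits) >= 10 and "3013" in others:
        return "VDDIO supply: check F6302 and shorts near C6320"
    if "3034" in others or any(o.startswith("4") for o in others):
        return "RSX/CELL BGA or bump defect"
    if steps == {"00"}:
        return "HDMI 5v protection: check TH2501, C2558, C2570 and the port"
    if len(hits) > 1 and steps <= {"80", "90"} and not others:
        return "HDMI transmitter (IC2502) or SMDs between it and the RSX"
    return "inconclusive: look for more diagnostic codes"

print(classify_2120(["A0403034", "A0404412", "A0902120"]))
```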


====2122 (DVE)====
DVE Error (IC2406, CXM4024R MultiAV controller for analog out)


====2124 (AV) ====
This error tends to cause a delayed Yellow Light Of Death (10s - 1min). Sometimes described as a Green Light Of Death (GLOD) or Red Light Of Death (RLOD).


2124 and 2024 errors have been fixed by replacing both the AV and HDMI encoders. One user reported 2024/2124 errors resolved by replacing the HDMI encoder. Another removed the HDMI encoder and tested the console without it. That console primarily filled the errorlog with 2124 errors, but a few 2024's as well. So it is unclear if 2124 is specific to the HDMI Encoder or AV Encoder. It seems it could be either.


====2130 (Thermal Sensor, CELL)====
*Components Involved:
**[[CELL BE|CELL]]
** [[CELL BE|CELL]] [[Thermal#Temperature_Monitors|Temperature Monitor]] (IC1101 on [[COK-001]])
CPU Thermal Monitor (IC1101) external sense line. Check C1002, R1003/4 & replace IC1002 before reballing CPU. If all else fails the CPU's thermal diode is dead.


====2131 (Thermal Sensor, RSX)====
*Components Involved:
** [[RSX]]
**[[RSX]] [[Thermal#Temperature_Monitors|Temperature Monitor]] (IC2101 on [[COK-001]])
GPU Thermal Monitor (IC1002). Check C2103, R2101/2 & replace IC2101 before reballing/Replacing RSX. If all else fails the GPU's thermal diode is dead.


====2133 (Thermal Sensor, South Bridge)====
*Components Involved:
** [[South Bridge]]
**[[South Bridge]] [[Thermal#Temperature_Monitors|Temperature Monitor]] (IC3101 on [[COK-001]])


==== 2203 ([[South Bridge]])====
 
From sabotage tests it was found that disabling +2.5V_SB_PLL_VDDC produced four A0802203 errors. Also, disabling +1.2V_SB_VDDR produced A0302203 & A0403034.

Sometimes seen with a "SB Counter Error -  Explicit Bug" in the bringup log. Often accompanies CPU (1200) or GPU (1201) overheats. Once occurred in a GLOD, after holding power: "SB (FATAL) XDR Link not initilized."


====2310====
----

===Fatal Boot Errors===
----


====3000====
Power Failure


==== 3001 ====
====3001====
12v Power Failure
12v Power Failure


Usually this is caused by a bad Power Supply Unit (PSU).


Alternatively, a failure on the 12v_main line can cause it. Check fuses, capacitors, resistors, and ICs on the 12v line. Measure the resistance of the large 2-prong 12v connector on the motherboard. It should read in the kilohm range if there is sufficient separation. Otherwise you may have a short on the line that needs to be found and repaired.
Alternatively, a failure on the 12v_main line can cause it. Check fuses, capacitors, resistors, and ICs on the 12v line. Measure the resistance of the large 2-prong 12v connector on the motherboard. It should read in the kilohm range if there is sufficient separation. Otherwise you may have a short somewhere on the line.


==== 3002 ====
====3002====
Power
Power Failure


==== 3003 ====
====3003 ([[CELL BE|CELL]] Core Power Failure)====
VDDC CELL BE Power Failure


This error will occur in the case of a PWR failure on the main core voltage of the CPU. For example, if the filtering capacitors (NEC/TOKIN's) are severely damaged. There are other SMD's in that filter, so it could be related to them as well.
This error occurs when power fails on the main core voltage of the CPU (VDDC). Possible culprits are the CPU bulk filter capacitors (e.g. NEC/TOKIN) or any SMD in the feedback and compensation network of the Voltage Regulation Module (VRM), including the buck converters (AKA IOR power blocks).


==== 3004 ====
A shorted Blu-ray drive can cause this error as well. Make sure your drive is working properly before doing anything else to your console.
VDDC RSX Power Failure


This error will occur in the case of a PWR failure on the main core voltage of the GPU. For example, if the filtering capacitors (NEC/TOKIN's) are severely damaged. There are other SMD's in that filter, so it could be related to them as well.
====3004 ([[RSX]] Core Power Failure)====


==== 3010 ====
This error occurs when power fails on the main core voltage of the GPU (VDDC). Possible culprits are the bulk filter capacitors (e.g. NEC/TOKIN) or any SMD in the feedback and compensation network of the Voltage Regulation Module (VRM), including the buck converters (AKA IOR power blocks).
Cell BE Error


Observation: A user triggered this error by injecting 3.3V into PWRGD (power good) of IC6103 (NCP5318 CPU Buck Controller). It generated error 20 1001 and 20 3010.
====3005====


==== 3011 ====
One user had A0043005 on a PQX-001 and found that a shorted RSX was causing it. The error could not be fixed.
Cell


==== 3012 ====
This error will occur if fuse F7601 has blown on a PQX-001.
Cell


==== 3013 ====
====3010====
[[CELL BE|CELL]] Error
 
Observations:
A user triggered this error by injecting 3.3V into PWRGD (power good) of IC6103 (NCP5318 CPU Buck Controller). It generated error 20 1001 and 20 3010.
Another user (Razmann4k) got this error on their CECHL04 by attempting the eraser mod on an already delidded Cell and noticed a crack running down the middle of the Cell die. It caused 20 3010. A 20 3010 error was also observed on a CELL that was physically damaged during a delidding attempt by the console owner.
 
 
This problem may be related to the PLL signal generator circuit, open resistors, the crystal oscillator, or even the integrated circuit itself (CDC735/CDC736/4227ANLG).
 
RSX FBVDDQ shorts, BE thermal/PLL VDDA open line, PWM signal disruption to CPU Buck Converters at startup have all been known to cause A0203010 errors. Seen in consoles that also had or developed 3034/4412.
 
====3011====
[[CELL BE|CELL]]
 
====3012 ====
[[CELL BE|CELL]]
 
====3013====
BE_SPI DI/DO ERROR
BE_SPI DI/DO ERROR


CELL not communicating to syscon via SPI (1.2V MC2_VDDIO and 1.2V BE_VCS no output) = Possible shorts on the line, check C4001 and trailing caps. Possible dead CPU?  
[[CELL BE|CELL]] not communicating to syscon via SPI (1.2V MC2_VDDIO and 1.2V BE_VCS no output) = Possible shorts on the line, check C4001 and trailing caps. Possible dead CPU?  


Another user had one on a CPU he damaged while delidding.


==== 3020 ====
A0212120/A0213013 error combinations are common. They appear to be related to VDDIO. IC6301 is involved in the formation of +1.7V_MISC, which among other things provides input power to the DC-DC converters that output +1.2V_YC_RC_VDDIO, +1.5V_YC_RC_VDDA, +1.2V_SB_VDDC and +1.2V_SB_VDDR. Lack of voltage to these DC/DC converters downstream of IC6301 suggests F6302 has blown. A number of people have fixed these 2120/3013 errors by finding shorts at or near C6320 and replacing Fuse F6302. But there are many other SMD nearby that might cause these fuses to blow. So you will need to track the source of the short and fix it, or the fuse will just blow again.
Cell


==== 3030 ====
One person reported A0202120/A0213013 when his CPU substrate (interposer) was cracked in half by a failed delid attempt.
Cell


==== 3031 ====
Through sabotage testing it was found that disabling +1.2V_YC_RC_VDDIO caused A0213013.
Cell


==== 3032 ====
Also through sabotage testing, it was found that when L6305 is removed it cuts off +1.8V_RSX_FBVDDQ (VRAM voltage). It caused a 10x A0202120 & 1x A0213013 error combo.
CELL BE Error


+1.2v_YC_RC_VDDIO PWR Fail?
====3020====
[[CELL BE|CELL]]


==== 3033 ====
A0233020 occurred during the readiness check after VDDC is formed. It suggests a voltage instability or error preventing the CPU from reporting power good back to the SYSCON. It has occurred in a console where every chip was heatgunned. Associated errors in the log were A0A02031/A0201802 (RSX thermal monitor and interrupt). Before the heat gun treatment it had A0801301/A0802120 (BE_PLL & VDDIO error). In another console it coincided with A0231002 (RSX VDDC filtering). That console had A0003001, A0002120/A0221002, A0221002/A0222120, A0231002/A0233020, indicating a more serious issue with the PSU, fuses or core voltage ICs.
Cell


==== 3034 ====
====3030====
Cell BE / RSX Communication Error
[[CELL BE|CELL]]


This is the most common error seen in early Phat model PS3s with the hottest 90nm RSX and CELL processors. It is the hallmark of a BGA defect (such as a cracked solder ball). It is by no means limited to the early models, however. These errors have been seen in every model of PS3 with varying frequency. The most reliable consoles appear to be those with a CPU/GPU of smaller manufacturing process, such as the Super Slim (SS) models (42xx and later) which have a 45nm CELL BE and 28nm RSX. The least reliable are the PS2 Backwards Compatible A-E Models, which have 90nm RSX/CELL BE.
Reportedly, a CPU BGA defect caused by delid. No trace damage or knocked SMD's observed.


The root cause is mechanical fatigue due to thermal cycling. The materials used to construct the motherboard and processors have different properties. For example, the coefficient of thermal expansion for the FR4 fiberglass used in the motherboard and processor substrate is different than that of the copper BGA pads, which is different than that of the lead-free solder used to join them. This means they will expand and contract at different rates as the chip heats up and cools down, which applies shearing force to the BGA. Over many thermal cycles this deforms the solder balls and causes a defect (such as a solder crack, a torn trace, or a ball pulling away from the pad).
====3031====
[[CELL BE|CELL]] XGC REF Voltage Error


3034 is triggered when the voltage or data lines connecting the CPU/GPU are broken. There is often a data error (4XXX) that also appears, but not always. The most common cause is a BGA defect on the RSX, which usually requires a reball/reflow to repair. Something about the RSX construction or workload causes it to fail more frequently, but the CPU can fail too. However, it's not always a BGA defect. The bumps on either chip can fail, Flex IO traces (the data lines that connect the CPU/GPU) can be broken/scratched, or accumulated damage from wear and tear (electromigration) can also cause this error. The true percentage of consoles with BGA defects that can be fixed with a reball/reflow is unknown. However, there is evidence to suggest that the underfill used to reinforce the CPU/GPU die and RSX RAM bumps was not as effective when the PS3 was manufactured. This could explain the many consoles whose reball fails prematurely afterwards.
Error during CPU initialization. This error appears to be a CPU BGA defect. In one PS3, it was caused by an "eraser mod," which puts pressure underneath the CPU (bad idea). In another, an A0313032 was reported after R5167 was knocked off while delidding the GPU/CPU. R5167 is part of the +1.2V_YC_RC_VDDIO reference voltage for the CPU's Redwood FlexIO ADC differential reference clock pair (BE_RC_REFCLK_P), so this was an open line fault. He replaced the resistor and got A0402101 / A0403034 because RSX TX1 was shorted to ground by a nicked RSX trace during the delid. TX is the transmit line, so the CPU didn't receive data from it and noted the error (BitTraining BE:RRAC:BX0:BX:FLEXIO_ID). He messed with the nick and the error changed to A0313031.


If a reflow/reball of both the CPU/GPU fails, then the chip is beyond repair and needs to be replaced. The RSX can be replaced with the same model without modification. It can be replaced with a different model using a modchip that injects the correct RSX ID during boot. This has been nicknamed a "Frankenstein Mod." Since they are married to each other, the CPU can only be replaced if the chipset (NAND/NOR and SYSCON chips) is replaced too. Since the CPU can't be replaced as easily, a dead CPU is usually considered unrepairable.
====3032====
[[CELL BE|CELL]] BE XGC REF Voltage Error


==== 3035 ====
Error during CPU initialization. This error appears to be a CPU BGA defect. In one PS3, A0313031 was caused by an "eraser mod," which puts pressure underneath the CPU (bad idea). In another, an A0313032 was reported after R5167 was knocked off while delidding the GPU/CPU. R5167 is part of the +1.2V_YC_RC_VDDIO reference voltage for the CPU's Redwood FlexIO ADC differential reference clock pair (BE_RC_REFCLK_P), so this was an open line fault. He replaced the resistor and got A0402101 / A0403034 because RSX TX1 was shorted to ground by a nicked RSX trace during the delid. TX is the transmit line, so the CPU didn't receive data from it and noted the error (BitTraining BE:RRAC:BX0:BX:FLEXIO_ID). He messed with the nick and the error changed to A0313031.
Cell and RSX


==== 3036 ====
It was discovered through sabotage testing that disabling +1.5V_YC_RC_VDDA caused error A0313032
Cell and RSX


==== 3037 ====
====3033====
Cell and RSX
[[CELL BE|CELL]]


==== 3038 ====
This error has been triggered when pad N12 (RSXVRM_VID0) was damaged, preventing RSX VDDC voltage from being set correctly. SYSCON sets the CPU VID just before the Config ring data is loaded. Apparently, SYSCON sets RSX VID on IC6201 (Buck Controller) at step number 32, which is just after. These voltages must be stable before the FlexIO can be calibrated (BitTraining at Step No. 40 & ByteTraining at 50 & 51).
Cell and RSX


==== 3039 ====
====3034 ====
Cell and RSX
[[CELL BE|CELL]] / [[RSX]] / [[South Bridge]] error during Bit-Training


==== 3040 ====
This error occurs when Bit Training fails. Bit Training, also known as bit calibration, is a critical process during the power-on-reset (POR) sequence of the CELL BE processor. It fine-tunes the behavior of individual bits within the 8-bit-wide Rambus channels. This adjustment accounts for variations in circuitry, wiring, and loading delays. Bit training plays a pivotal role in optimizing signal quality by calibrating the signal driver current, driver impedance, and ensuring that the timing of each of the eight data bits aligns with clock edges, effectively centering the data "eye" and allowing for more accurate and reliable data transmission.
 
'''Remember, IT'S NOT ALWAYS a bad connection between the CPU and GPU.''' Bit training calibrates the connection between the GPU, CPU AND South Bridge. For example, A0403034 occurred on a VER-001 with a probable BGA defect. By putting pressure on the South Bridge the console would boot. Look at the data error and other information of the console before assuming a bad GPU.
 
This is the most common error seen in early Phat model PS3s with the 90nm [[RSX]]. It is the hallmark of solder fatigue (such as a cracked solder ball or bump defect) which affects the Flex IO interface that allows the CPU, GPU, and SB to communicate. It is by no means limited to the early models, however. These errors have been seen in every model of PS3 with varying frequency. However, it's most common in the earliest models, likely due to a manufacturing defect in the 90nm RSX material set, namely a CTE mismatch between underfill and bump material that leads to premature solder fatigue and GPU failure. Dubbed "BumpGate," this is a well known failure modality among GPUs manufactured from 2005-2008. Although it has not been proven unequivocally that the 90nm RSX is affected by Bumpgate, members of the community have shown the 90nm RSX has an increased failure rate, similar material set, and exhibits similar symptoms to known bumpgate affected chipsets - such as black screens (GLOD), graphical artifacts like lines, double images, color splotches and pixelation.
 
While Bumpgate is a plausible explanation, it's not the only one. The materials used to construct the motherboard and processors have different coefficients of thermal expansion (CTE). This means they will expand and contract at different rates as the chip heats up and cools down, which applies force to solder connections. Over many thermal cycles this deforms the solder and causes a defect. That may affect the bumps, which attach the silicon die to the interposer (sometimes referred to as substrate), or the Ball-Grid Array (BGA), which connects the interposer to the motherboard.
 
3034 is triggered when Bit calibration, also known as BitTraining, cannot complete correctly. So it is not limited to a singular cause. BGA defects from thermal cycling, drop damage, pulling force from separating the heat sink from the processors while disassembling, or delidding can occur. The bumps on CPU, GPU, or SB can fail, Flex IO traces that connect them can be broken/scratched, or accumulated damage from wear and tear (electromigration) can also cause BitTraining to fail. Anything that can disrupt the impedance of the FlexIO can cause BitTraining to fail. A skilled technician will need to use deductive reasoning to diagnose the cause and choose the appropriate repair.
 
A qualitative test known as a "pressure test" may be used to help make a diagnosis. Applying slight pressure to the processor, within reason (not your body weight or clamping force, which could cause a BGA defect), flexes the motherboard beneath the BGA and "may" temporarily reconnect a solder ball with its pad, like holding 2 wires together. This can cause flickering on screen, a console to power on when it couldn't before, etc. If the console or error responds differently when pressure is applied, this may be taken as evidence of a BGA defect. It is not definitive, but tips the odds in favor of that diagnosis. A reball in that case may be successful. However, if it does not respond to pressure, the BGA is not likely to be the cause and another explanation, such as the bumps, is more likely. It should be noted that bumps can be affected by force as well, but because the underfill supports them, it generally requires more force to reconnect them using this method. This is what the "Bolt mod," commonly performed on the XBOX 360, did. That much force permanently deforms the motherboard and causes irreparable damage. DO NOT DO THIS! But it illustrates the point. You don't need much force to see if the BGA is affected, and if it responds to light pressure, it's unlikely to be the bumps. Therefore, taken together with other clues, it can be helpful to a skilled technician gathering evidence for a diagnosis.
 
In consoles with a 90nm [[RSX]] (CECH-Axx/Bxx/Cxx/Exx/Gxx/Hxx, M03 and Q00 models) the most likely cause of a 3034 is the GPU itself. It can be replaced with another 90nm RSX without modification. However, it can also be replaced with a more reliable 65nm or 40nm model, using a process nicknamed a "Frankenstein Mod." SONY service technicians performed this modification in some officially refurbished consoles. The PS3 community has developed a method as well. Since there is a question about the 90nm RSX's reliability and both a reball and Frankenstein mod require the 90nm to be desoldered, it is advisable to replace the 90nm GPU with a more reliable model instead of risking another 90nm GPU. Rework is hard on the motherboard and surrounding components, so choosing a repair with the fewest uncertainties is wise.
 
In models without the 90nm RSX, 3034 is still possible, but far less likely to be caused by the GPU. CPU BGA defects are common in dropped consoles, those that have been delidded or have trace damage to the area around the processors. So troubleshooting is necessary to make a diagnosis.
 
====3035====
[[CELL BE|CELL]] and [[RSX]] error during Byte-Training
 
Failing GPU. RSX BGA or Bump Defect. Gradual decline in the solder connection affected Byte Calibration, but it managed to pass bit calibration 1st. A0403034 is soon to follow.  As electromigration wears down RSX Core, A0801601/A0801701 become A0501802/A0503037, A0503035, and finally A0403034.
 
====3036====
[[CELL BE|CELL]] and [[RSX]]
 
==== 3037====
[[CELL BE|CELL]] and [[RSX]]
 
RSX BGA or bump defects have caused A0503037/1802. A gradual decline in the solder connection affected Byte Calibration, but it managed to pass bit calibration first. A0403034 is soon to follow.
 
====3038====
[[CELL BE|CELL]] and [[RSX]]
 
====3039====
[[CELL BE|CELL]] and [[RSX]]
 
Occurred in a CECHL04 coinciding with a check stop error (14FF) during IO initialization at step# 52, which is after Byte-Training, but before the flash firmware sequence at step# 60. So maybe it's starship 2 related? Or it could be CPU/GPU related. Unknown.
 
====3040====
Flash
Flash


----
A0603040 is known to be caused by not soldering the flash (NAND/NOR) back on properly. It happens when the flash is not powered. Step #60 is when the StarShip 2 flash controller and NAND/NOR are initialized, kicking off the firmware sequence that loads the Operating System. Check their voltages and be sure the FW is not corrupt. If you have a backup, you could try replacing the Flash to see if a module failed.
=== Data ===
 
----
====3041====
A0523041 has only been reported once. Step numbers 50-60 are when South Bridge peripherals are initialized. Step 52 is the last step before 60, when the flash and controller (SS2) are initialized and where A0603040 occurs. Speculation: perhaps 3041 is related to the SS2 or another SB peripheral, a flash solder connection, or a corruption issue. We don't know. There are too few reports.
 
===Data Errors===
----  
*These error codes seem to be repeated up to 5 times for 5 special cases. For example, errors 4'''0'''01, 4'''1'''01, 4'''2'''01, 4'''3'''01, 4'''4'''01 are related to CELL; the only thing that changes in the error code is the second digit (located immediately after the category). If at some point we find out what that digit means, we can join the wiki page sections together (with titles: "4001, 4101, 4201, 4301, 4401", etc...)


==== 4001 ====
====4001====
Cell
[[CELL BE|CELL]]


==== 4002 ====
====4002====
RSX
[[RSX]]


==== 4003 ====
====4003====
Southbridge
Southbridge


==== 4011 ====
====4011====
Cell
[[CELL BE|CELL]]


==== 4101 ====
====4101====
Cell
[[CELL BE|CELL]]


==== 4102 ====
====4102====
RSX
[[RSX]]


==== 4103 ====
====4103====
Southbridge
Southbridge


==== 4111 ====
====4111====
Cell
[[CELL BE|CELL]]


==== 4201 ====
====4201====
Cell
[[CELL BE|CELL]]


==== 4202 ====
====4202====
RSX
[[RSX]]


==== 4203 ====
====4203====
Southbridge
Southbridge


==== 4211 ====
====4211====
Cell
[[CELL BE|CELL]]


==== 4212 ====
====4212====
RSX
[[RSX]]


==== 4221 ====
====4221====
Cell
[[CELL BE|CELL]]


==== 4222 ====
====4222====
RSX
[[RSX]]


==== 4231 ====
====4231====
Cell
[[CELL BE|CELL]]


==== 4261 ====
==== 4261====
Cell
[[CELL BE|CELL]]


==== 4301 ====
====4301====
Cell
[[CELL BE|CELL]]


==== 4302 ====
====4302====
RSX
[[RSX]]


==== 4303 ====
==== 4303====
Southbridge
Southbridge


==== 4311 ====
====4311====
Cell
[[CELL BE|CELL]]
 
====4312====
[[RSX]]
 
====4321====
[[CELL BE|CELL]]
 
====4322====
[[RSX]]
 
====4332====
[[RSX]]


==== 4312 ====
====4341====
RSX
[[CELL BE|CELL]]


==== 4321 ====
====4401====
Cell
[[CELL BE|CELL]] or [[RSX]]


==== 4322 ====
====4402====
RSX
[[CELL BE|CELL]] or [[RSX]]


==== 4332 ====
====4403====
RSX
[[CELL BE|CELL]] or [[RSX]]


==== 4341 ====
====4411====
Cell
[[CELL BE|CELL]] or [[RSX]]


==== 4401 ====
====4412====
Cell or RSX
[[CELL BE|CELL]] or [[RSX]]


==== 4402 ====
====4421====
Cell or RSX
[[CELL BE|CELL]] or [[RSX]]


==== 4403 ====
====4422====
Cell or RSX
[[CELL BE|CELL]] or [[RSX]]


==== 4411 ====
====4432====
Cell or RSX
[[CELL BE|CELL]] or [[RSX]]


==== 4412 ====
====4441====
Cell or RSX
[[CELL BE|CELL]] or [[RSX]]


==== 4421 ====
====5FFF====
Cell or RSX
[[CELL BE|CELL]] or [[RSX]]


==== 4422 ====
In recent times, this error has been associated with the CPU (CELL), but it is actually due to an error in the NOR of the PlayStation 3 Slim/Super Slim. A failure while performing the exploit can leave the console bricked. To recover it, use an E3 Flasher, Teensy, etc.
Cell or RSX


==== 4432 ====
For 3XXX and 4XXX consoles, the brick can be solved with a Teensy. For 4XXX, keep in mind that the NOR can be eMMC (12GB), in which case it cannot be solved (for now...)
Cell or RSX


==== 4441 ====
For Super Slim, reflow or reball the RSX.
Cell or RSX


{{Hardware Modification}}<noinclude>[[Category:Main]]</noinclude>
{{Hardware Modification}}<noinclude>
[[Category:Main]]
</noinclude>

Revision as of 08:13, 21 September 2024

Description

Syscon memory includes a 0x100-byte table for storing error codes. Each error code consists of 4 bytes and an additional 4 bytes for its timestamp. The table can hold up to 32 errors. When the table reaches its maximum capacity and a new error needs to be stored, Syscon will delete the oldest error.

How to get the syscon error log

If the PS3 still boots to the XMB and is able to install and run applications, you can use programs such as those listed at the top of the Platform ID page. If the PS3 fails to boot, you can still get the error log by connecting a PC to the PS3's UART port using a "USB to TTL UART adapter" and running the errlog command. There is also the clearerrlog command to clear the error table (handy to avoid confusion with old error codes that might have accumulated over the months/years and are not related to the actual problem).

Error log format

There are 2 error log formats, depending on the syscon type: one for Mullion and one for Sherwood.
The error codes and the timestamps are stored in little endian (read the bytes right to left).
The timestamps count the number of seconds elapsed since the start of 2000 (the J2000 epoch of 2000/1/1 12:00:00 minus 12 hours). To convert one to a standard Unix timestamp, add 946684800 seconds (30 years minus 12 hours), which is the Unix time of 2000/1/1 00:00:00 UTC. Check the link to the right for information: 1
If the battery was empty or removed when the error was triggered, the timestamp will be recorded as FFFFFFFF.
If the battery is replaced but the time is not configured in GameOS either manually or by network, the error log will seem to store timestamps starting with a date around 2005/12/31 00:00:00 (0x0B488680).
More info about error log timestamp formats and loops in the Talk page
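
The timestamp rules above can be sketched in a few lines of Python. This is only an illustration, not a tool from the wiki: the function and constant names are made up here, and the only facts used are the 946684800 second offset, the little endian storage, and the FFFFFFFF marker described in the list.

```python
from datetime import datetime, timezone

# Offset quoted in the text: Unix time of 2000/1/1 00:00:00 UTC.
SYSCON_EPOCH_OFFSET = 946684800  # hypothetical constant name


def syscon_timestamp_to_utc(raw):
    """Convert a raw 32-bit error log timestamp to an aware UTC datetime.

    Returns None for FFFFFFFF, which the log stores when the battery
    was empty or removed.
    """
    if raw == 0xFFFFFFFF:
        return None
    return datetime.fromtimestamp(raw + SYSCON_EPOCH_OFFSET, tz=timezone.utc)


# Stored little endian, so the bytes "80 86 48 0B" read as 0x0B488680.
raw = int.from_bytes(bytes.fromhex("8086480B"), "little")
print(syscon_timestamp_to_utc(raw))
```

With the 0x0B488680 value mentioned above, this prints 2005-12-31 00:00:00+00:00, which matches the date the text gives for a console whose clock was never configured.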


Syscon Errorlog from CECHAxx, COK-001, CXR713120-201GB
Offset(h) 00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
                                                 
00003700  01 10 80 A0  01 10 80 A0  01 10 80 A0  01 10 80 A0
00003710  01 10 80 A0  01 10 80 A0  04 10 80 A0  01 10 80 A0
00003720  01 10 80 A0  01 10 80 A0  01 10 80 A0  01 10 80 A0
00003730  01 10 80 A0  04 10 80 A0  01 10 80 A0  01 10 80 A0
00003740  01 10 80 A0  01 10 80 A0  01 10 80 A0  01 10 80 A0
00003750  01 10 80 A0  01 10 80 A0  04 30 09 A0  04 30 09 A0
00003760  04 30 09 A0  04 30 09 A0  FF FF FF FF  01 10 80 A0
00003770  01 10 80 A0  01 10 80 A0  01 10 80 A0  01 10 80 A0
                                                 
00003780  20 CF 6D 16  13 23 A7 16  3E D6 D9 16  87 13 2A 17
00003790  17 3C 7C 17  E4 A2 A3 17  A2 15 D4 17  13 FB EB 17
000037A0  CD 7D EF 17  33 85 EF 17  12 8C EF 17  A7 D9 FB 17
000037B0  58 5E 0E 18  BB C9 66 18  CD 25 B5 18  49 C4 29 19
000037C0  75 D5 F9 19  04 8B 61 1B  17 67 D0 22  2D 67 D0 22
000037D0  03 07 6C 27  12 09 6C 27  FF FF FF FF  FF FF FF FF
000037E0  FF FF FF FF  FF FF FF FF  F0 E7 27 16  06 BD 33 16
000037F0  E5 DE 38 16  DD D4 5C 16  C4 AC 6C 16  EA C7 6D 16
  • In the error log sample above:
    • Error log looped at least 1 time (1 errorcode FFFFFFFF)
    • Timestamps are valid, and the time was configured
    • Contains errors: A0801001, A0801004, A0093004
Syscon Errorlog from CECHHxx, DIA-001, CXR714120-301GB
Offset(h) 00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
                                                 
00003700  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003710  22 43 40 A0  34 30 40 A0  FF FF FF FF  34 30 40 A0
00003720  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003730  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003740  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003750  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003760  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
00003770  22 43 40 A0  34 30 40 A0  22 43 40 A0  34 30 40 A0
                                                 
00003780  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
00003790  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037A0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037B0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037C0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037D0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037E0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000037F0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
  • In the errorlog sample above:
    • Errorlog looped at least 1 time (1 errorcode FFFFFFFF)
    • Timestamps are invalid
    • Contains errors: A0404322, A0403034
Syscon Errorlog from CECHLxx/Mxx/Pxx/Qxx, VER-001, SW-301
Offset(h) 00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
                                                 
00000900  31 21 80 A0  73 D4 50 0B  30 20 40 A0  89 D4 50 0B
00000910  31 21 80 A0  8C D4 50 0B  30 20 80 A0  8E D4 50 0B
00000920  30 21 80 A0  8E D4 50 0B  31 21 80 A0  8F D4 50 0B
00000930  01 30 00 A0  FF FF FF FF  FF FF FF FF  FF FF FF FF
00000940  30 20 80 A0  75 D1 50 0B  30 20 80 A0  78 D1 50 0B
00000950  30 20 80 A0  B2 D1 50 0B  31 20 80 A0  B3 D1 50 0B
00000960  30 21 80 A0  BD D1 50 0B  30 21 80 A0  D5 D1 50 0B
00000970  30 21 80 A0  DF D1 50 0B  30 20 80 A0  E0 D1 50 0B
00000980  31 21 80 A0  84 D2 50 0B  30 21 80 A0  DC D2 50 0B
00000990  31 21 32 A0  4F D3 50 0B  31 20 40 A0  50 D3 50 0B
000009A0  31 21 80 A0  51 D3 50 0B  30 21 80 A0  57 D3 50 0B
000009B0  31 21 80 A0  59 D3 50 0B  31 21 80 A0  FF D3 50 0B
000009C0  31 21 80 A0  05 D4 50 0B  30 20 80 A0  06 D4 50 0B
000009D0  30 20 80 A0  07 D4 50 0B  31 21 80 A0  2D D4 50 0B
000009E0  31 20 80 A0  3A D4 50 0B  30 21 80 A0  42 D4 50 0B
000009F0  30 21 80 A0  72 D4 50 0B  31 21 80 A0  72 D4 50 0B
  • In the errorlog sample above:
    • Errorlog looped at least 1 time (1 errorcode FFFFFFFF)
    • Timestamps are valid, but the time was not configured
    • Contains errors: A0802131, A0402030, A0802030,
      A0802130, A0003001, A0802031,
      A0322131, A0402031
Syscon Errorlog from CECH-42xx, PQX-001, SW3-304
Offset(h) 00 01 02 03  04 05 06 07  08 09 0A 0B  0C 0D 0E 0F
                                                 
00000900  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000910  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000920  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000930  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000940  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000950  02 18 61 A0  FF FF FF FF  02 18 61 A0  FF FF FF FF
00000960  02 18 61 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
00000970  34 30 40 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
00000980  34 30 40 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
00000990  34 30 40 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
000009A0  34 30 40 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
000009B0  34 30 40 A0  FF FF FF FF  02 40 40 A0  FF FF FF FF
000009C0  34 30 40 A0  FF FF FF FF  FF FF FF FF  FF FF FF FF
000009D0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000009E0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
000009F0  FF FF FF FF  FF FF FF FF  FF FF FF FF  FF FF FF FF
  • In the errorlog sample above:
    • Errorlog not looped (more than 1 errorcode FFFFFFFF)
    • Timestamps are invalid
    • Contains errors: A0611802, A0404002, A0403034
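
The samples above show two layouts for the 0x100-byte table: in the CXR71x dumps all 32 codes come first and the 32 timestamps follow, while in the SW/SW3 dumps each code is immediately followed by its timestamp. A minimal Python sketch of reading both, assuming that reading of the samples is right (the function names are invented for this example):

```python
import struct


def parse_split_table(table):
    """Layout seen in the CXR71x samples: 32 codes, then 32 timestamps."""
    words = struct.unpack("<64I", table)  # 64 little-endian 32-bit words
    return list(zip(words[:32], words[32:]))


def parse_interleaved_table(table):
    """Layout seen in the SW/SW3 samples: code, timestamp, code, timestamp..."""
    words = struct.unpack("<64I", table)
    return list(zip(words[0::2], words[1::2]))


def unique_errors(entries):
    """Distinct error codes in table order, skipping empty FFFFFFFF slots."""
    seen = []
    for code, _timestamp in entries:
        if code != 0xFFFFFFFF and code not in seen:
            seen.append(code)
    return [f"{code:08X}" for code in seen]


# First row of the VER-001 sample, padded with FF to the full 0x100 bytes.
row = bytes.fromhex("312180A073D4500B302040A089D4500B")
table = row + b"\xff" * (0x100 - len(row))
print(unique_errors(parse_interleaved_table(table)))
```

For that single row the output is ['A0802131', 'A0402030'], the first two codes listed for the VER-001 sample. Both parsers expect exactly 0x100 bytes, so the table has to be cut out of the full dump first.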

Error code format

The error codes follow the format A R ST C ERR, where:

 A  (Fixed) A = This is always "A"


 R  (Reserved) 0-E = Unknown

  • F = Frequent error (For example, Motherboard Damage/Breakdown, etc.)


 ST  (Step Number) 00-7F = Step Number of the Power On Sequence (POS). This is the Power On Self Test (POST) process. If successful, the BOOT process begins, which loads the OS.


80 = Static State (Power ON). The console completed the POST and was in a static state. An error occurred during the PS3 boot-up process. If you encounter an error during gameplay, it may happen with Step No. 80. For instance, if your NEC/TOKINs are failing, you may experience 80 1002 errors.


90 = Static State (Power OFF). The error happened when the PS3 was powering off. For example, if a problem causes the system to hang while shutting down the console will beep before powering off. An error with step no. 90 will be recorded in the errorlog.


A0 = Immediately after SYSCON reset. When power is supplied to the PS3, it should enter Standby mode. Early "Phat" models will have a solid red LED illuminated to indicate Standby mode. The PS3 contains a Standby circuit that requires constant power so that it can wait for the user to turn on the console. Many other electronics also utilize this feature. This is also referred to as a Vampire circuit, as it uses constant power even when a user is not using the console. It enables the ability to turn it on remotely, thus mitigating the need to physically flip a switch.

The PS3 reset circuit comprises the SYSCON and its clock generating crystal, Bluetooth/WIFI card, front power/eject and LED panel, and thermal monitor ICs. The SYSCON must detect whether the console is being manually started with the power button or over Bluetooth using the controller, hence these modules must be powered. It is important to ensure proper functioning of the thermal monitors before releasing power to the Southbridge, CPU, or GPU. Neglecting this step could pose a fire hazard. The thermal monitors ICs serve as the PS3's fire alarms, and are critical safety equipment.

If there is a hardware issue in the circuit, an error will occur after the SYSCON reset, preventing the console from powering on. Instead of a solid red LED, the front LED will flash red indefinitely upon plugging in the console, indicating an error with the standby circuit. There will also be associated error log in the syscon which can help identify the specific component causing the issue.

5v_MISC powers the reset circuit, consisting of SMD/SMT components and ICs that power the modules above. Please refer to the Service manual (if available) for details. In particular, in the COK-001 service manual, the circuit diagram is provided on page 23/45. From the diagram, it's evident that DC/DC converter IC6005 generates +3.3v_EVER, while IC6006 produces +1.8V_EVER, and IC6009 is responsible for generating +3.3V_THERMAL. These are the primary voltages used by the components in the reset circuit, such as the Wi-Fi/Bluetooth card necessary for starting the console remotely by pressing the PS button on the controller.


 C  (Category)

  • 1 = System Error
  • 2 = Fatal Error
  • 3 = Booting Error
  • 4 = Data Error


 ERR  (Error)

  • This is a 3-Digit error code that gives specific information about the issue. For example, System error 002 (1002) means "RSX VRM Failure."

Discussion


The three-digit error code may repeat in other categories, but that does not necessarily indicate the same issue. System error 001 and Fatal error 001 don't mean the same thing: 1001 is "BE VRM Power Failure" and 2001 is "BE Error." The three digits alone do not carry enough information to tell these apart. The category gives the error context, so we use the 4-digit code CERR to differentiate them from one another.

Likewise, the Step Number (ST) provides context to the 4-digit code. For example, you can have a CELL VRM Power Failure occur while playing an intense game (Static State, Power on). That would generate an 801001 (Step number 80). However, this error can also occur when the console is turned on, during the Power On Sequence Testing (POST), before it has a chance to start the boot loader, which loads the operating system, which in turn allows you to load the game. This time, it might generate error 101001. Step number 10 is lower than Step number 80, telling you this 1001 occurred earlier. The Category + Error tells you "what" happened. The Step Number tells you "when" it happened. Together they build context that can help you figure out what is causing the error.

It's crucial to examine the entire error log, not just the 4-digit CERR, to understand "why" the error occurred. In the previous example, the A0801001 may indicate an issue with the NEC/TOKIN Proadlizers, a type of capacitor in the CELL CPU's VRM (voltage regulation module). Conversely, A0101001 may result from various other factors, simply because there is a larger number of things that can go wrong. Therefore, it's possible that the NEC/TOKINs are not the source of the error.

All of this means you must be knowledgeable about the hardware to comprehend the errors stored in the SYSCON. Unfortunately, there is not a unique error code for every potential issue. For example, there is no error that can indicate if Capacitor C6900 is shorted. Nonetheless, there are a few exceptions, where a specific code typically conveys the same meaning. For example, a particular blown fuse "usually" causes one particular code. But we cannot dismiss the possibility that a cap blowing on the same line could also be the cause, however unlikely that may be.


Examples:

A0801002

  • System Error 002 (RSX VRM Power Fail) which occurred while the System was successfully powered On.
  • 1002 errors are known to be caused by faulty NEC/TOKINs, but other factors could also be involved. For additional information, refer to the Error Code section below.

A0403034

  • Booting Error 034 (RSX/CELL Communication Error) which occurred at step no. 40 (BitTraining), before the Power On Sequence had completed.
  • 3034 errors are often caused by defects on the BGA or Bump (among other issues). Experienced PS3 repair technicians have noted that it is almost exclusive to the RSX. Although we cannot completely exclude the possibility of a CELL BE BGA/Bump defect being the cause, it has been the exception to the rule. Time and experience have demonstrated that 3034 errors are primarily an RSX problem. A repair technician needs to decide which processor to reball/replace based upon the more likely candidate and proceed accordingly, using their judgement.
  • See Error Code section below for more details.

A0213013

  • 3013 errors are caused by a dead CELL BE CPU.
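The breakdown used in the examples above can be sketched in Python. This is only an illustration, not an official tool: the names `parse_syscon_code` and `SysconCode` are hypothetical, and the sketch assumes the 8-character layout seen in the examples (2-character reserved area, 2-character step number, 1-character category, 3-character error).

```python
from typing import NamedTuple

# Category names as listed in this section.
CATEGORIES = {
    "1": "System Error",
    "2": "Fatal Error",
    "3": "Booting Error",
    "4": "Data Error",
}


class SysconCode(NamedTuple):
    reserved: str   # e.g. "A0"
    step: str       # e.g. "80"
    category: str   # e.g. "1"
    error: str      # e.g. "002"

    @property
    def cerr(self) -> str:
        """Category + error, the 4-digit code used in the sections below."""
        return self.category + self.error

    @property
    def category_name(self) -> str:
        return CATEGORIES.get(self.category, "Unknown")


def parse_syscon_code(code: str) -> SysconCode:
    """Split an 8-character code such as 'A0801002' into its fields."""
    code = code.strip().upper()
    if len(code) != 8:
        raise ValueError(f"expected 8 characters, got {len(code)}: {code!r}")
    return SysconCode(code[0:2], code[2:4], code[4], code[5:8])
```

For example, `parse_syscon_code("A0403034")` yields step `"40"` and `cerr` `"3034"`. Note that the sketch upper-cases its input, so a code written as `1b01` comes back as `1B01`.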

The following error code section will only list the last 4 numbers (category + error). However, remember the Reserved Area and Step Number can be useful to figure out "when" the error occurred and how frequent it is. The last four numbers are the most important for figuring out what a specific error means, but you still need to work out what it means in the context of your issue before you can diagnose the error and fix it.

Error codes

System Errors


1001 (Power CELL)

  • Components Involved:
    • CELL (IC1001 on COK-001)
    • NEC/TOKIN Proadlizers (C6140/C6141/C6142/C6143 on COK-001)
    • Other nearby components of the power block

This error may result from inadequate filtering on the CPU's core voltage (VDDC) or an unexpected system shutdown. Voltage ripple or noise within a certain range could cause errors before they worsen into a CELL VDDC Power Failure (3003). There are several SMD filters, but the most critical ones are the NEC/TOKIN Proadlizers (capacitors). Bad NEC/TOKINs are responsible for 1002 errors on the GPU, but diagnosing 1001 errors is not as straightforward. You must observe the console experiencing YLOD while under load and confirm the generation of a new 1001 error. Otherwise, the 1001 code may indicate the console was not turned off correctly.

1001 errors can be recorded automatically when the system experiences an unexpected shutdown or loss of AC power. These errors often arise during testing, when the console is repeatedly switched on and off instead of being shut down gracefully. A0801001 errors alone cannot prove that NEC/TOKINs are failing. Such errors are regularly found in the log of properly functioning machines and are not cause for concern unless the system is unexpectedly shutting down on its own.

A machine that can power on but displays graphical artifacts or no video may lead to misinterpretation of 1001 errors. In such cases, the console must be turned off forcibly using the power switch at the back of Phat/Fat models or by pulling the power cord in Slim and Super Slim models. This may cause 1001 or 1004 errors, which can be ignored if they were not generated under normal circumstances. If a console is showing artifacts/GLOD, fix the larger problem first (usually a GPU problem requiring a reball/replacement). Only after that, if stress testing results in 1001 errors, should you troubleshoot the CPU NEC/TOKINs.

Anecdote: One console, with faulty CPU NEC/TOKINs, displayed an A0901001 error only during shutdown. The Last of Us, a strenuous game, showed no signs of typical bad NEC/TOKIN behavior, and the system remained stable. However, it remained in shutdown for a prolonged period, resulting in the YLOD (3 beeps and flashing red light). It required a reset to power back on. Replacing the NEC/TOKINs resolved the problem.

1002 (Power RSX)

  • Components Involved:
    • RSX (IC2001 on COK-001)
    • NEC/TOKIN Proadlizers (C6229/C6230/C6231/C6232 on COK-001)
    • Other nearby components of the power block

This error is linked to inadequate filtering on the RSX_VDDC power line. A certain range of voltage ripple and noise can trigger this error before it becomes severe enough to cause an RSX_VDDC Power Failure (3004). YLODs that lead to 1002 errors vary in timing, from occurring within 2 seconds of power on to only happening during intense games.

There are a number of SMD components used in filtering, but the major concern is the NEC/TOKIN proadlizers (capacitors). 1002 errors indicate faulty NEC/TOKINs.

1004 (Power AC/DC)

When a console loses AC power, it may generate error A0801004. During the shutdown sequence, voltage regulators are turned off sequentially in reverse order to the Power On Sequence (POS). This allows components enough time to enter a power off state. It helps prevent data corruption and voltage spikes or discharges that can damage sensitive components. 1004 errors commonly occur in machines that can power on but have graphical artifacts or no video, also known as GLOD. In such cases, the user has no choice but to improperly power off. Forcing a shutdown using the power rocker at the back of the console (for Phat models), or by pulling the power cord (for Slim and Super Slim models), will cause an unexpected loss of AC power and prevent the SYSCON from completing the shutdown sequence.


This error may be disregarded if it occurred due to abnormal circumstances, such as a power outage or accidental unplugging. Since it did not result from a hardware malfunction, it is not a significant concern. If a console displays artifacts or the GLOD, the main issue should be addressed first, typically involving a GPU problem requiring a reball or replacement. Afterward, if 1004 errors reoccur, the AC/DC line, PSU, and its connection to the DC-DC converters should be diagnosed.

1005

On an NPX-001 Super Slim motherboard, I was experiencing an A0091005 error. I started with the CELL power rail and began replacing components; after replacing all 5 aluminum polymer capacitors on the CELL side, the error was gone. It seems to be a Super Slim only error, but more testing is needed to see if it occurs on other motherboards. Another user reported fixing the same error code by replacing the same capacitors. Sometimes this error is paired with a 3003 error. I also had a PQX-001 board with A0801005 followed by A0043005 and found a short on the RSX, on what I believe to be FBVDDQ for the 28nm RSX. I was unable to fix this issue: after removing every component, the line was still shorted.

1103 (Thermal Alert SYSTEM)

  • Components Involved:
    • CELL
    • CELL temperature monitor (only with Mullion syscons; the CELL temperature monitors in PS3 Slim and Super Slim models are unable to send this error code)

Syscons have a pad/pin specifically for this signal. It was given an official generic name (not indicating anything that triggers it) because several components can send it. In the first PS3 models (with mullion syscon?), the signal can be sent by CELL or the CELL temperature monitor using the official function names SYS_THR_ALRT or THERMAL_OVERLOAD.

However, this electrical design is not specific to the PS3. There could be other devices based on the IBM CELL and developed by Sony in which this error code is sent by other components, and which could have more than one CELL. In general, we could say this error code indicates that one (or more) of the CELL processors (and maybe other components not present in retail PS3 models, or their temperature monitor chips) is overheating.

At least 3 consoles that had CPU trace damage resulting from failed IHS delid attempts exhibited A0801103 and A0902203 errors.

1200 (Thermal CELL)

  • Components Involved:

CPU Overheat. This is a common error. The usual culprit is failed Thermal Interface Material (TIM). As the material ages it "dries", allowing air inside. Air is a heat insulator, reducing the TIM's ability to transfer enough heat away from the processor. The system fan will steadily get louder over time until it cannot keep up. Once the processor approaches its Thermal Shutdown Temperature, a Yellow LED begins flashing on the console (Early Phat Models). Once it reaches the Thermal Shutdown Temperature, the console will beep three times and hard shutdown, flashing red until the console is unplugged and the error state reset. Error 1200 is generated in the SYSCON error log.

First be sure the system fan is working. If so, apply new TIM Between the Internal Heat Spreader (IHS) and Heatsink (HS). If that does not resolve the problem, carefully remove the IHS (Delid) and replace the TIM between the IHS and processor DIE.

If that still doesn't work, it could be an issue with the temperature monitor chip (IC1101). Beyond that, some users have noted that dead CPUs can throw error 1200. However, that's the limit of our current understanding. It could be dead, or have another unexplained issue, but usually reflowing or reballing is the last ditch effort to revive such a console.

1201 (Thermal RSX)

  • Components Involved:

GPU Overheat. This is the same as error 1200 above, except it's for the GPU. The same repair steps apply, except its Temperature Monitor Chip is IC2101. This error is rare. Out of hundreds of consoles and years of user reports, this error has only occurred when the user forgot to replace the RSX heatsink when testing the console. It has not been reported under normal circumstances. The RSX tends to fail long before the TIM degrades to the point that thermal shutdown is reached.

1203 (Thermal CELL VR)

A "No Command Tag Specified" error raised while the thermal monitor was communicating with the SYSCON, or a Thermal Shutdown (Hardware Initialized). Possibly caused by a failing thermal monitor or poor connections to it. Has been seen alongside CPU or GPU overheats (1200 or 1201). A0801103/A0902203 have been seen with CPU trace damage (failed delid). Sometimes seen with the 1001/1002 combo associated with bad NEC/TOKINs (rare), or when the PS3 was powered on while hot (after heat gun use).

Some PS3 motherboards (TMU-520, COK-001, COK-002) have a temperature monitor located somewhere in the CELL power block. The other retail PS3 motherboard models do not measure the temperature of the CELL VR.

All the PS3 temperature monitor chips have an integrated internal thermal sensor plus 2 pins for an optional external sensor. The temperature monitors for CELL and RSX are configured to use the external sensor, but the one for the CELL VR probably uses the internal sensor.

1204 (Thermal South Bridge)

1205 (Thermal EE/GS)

This error is specific for COK-001/CXD2953AGB (with full PS2 hardware compatibility, EE+GS) or COK-002/CXD2972GB (with partial PS2 hardware compatibility, GS only)

1301 (CELL PLL Unlock)

Has been reported in a console where the CPU die was chipped during a delid attempt gone wrong. The console exhibited a Green Light of Death (GLOD) and shut down periodically with A0801301.

On another console 1301 occurred after both the CPU and RSX were reballed. The reball of the CELL likely failed, or damaged it.

On a third console it was reported after nearly every chip on the Motherboard was heat gunned. They probably didn't achieve the necessary temperatures to reflow the CPU, and if they did they probably damaged it by using too much heat.

In all 3 cases the CPU was damaged or heated in some way.

401001/401301/402120 occurred when IC6002 pins 17/18 were accidentally shorted. It blew out C6025 & PS6001. Dead CPU possible. Double check CELL_PLL Voltage and clock generators. Rare code.

Occurred while overvolting a 40nm CXD5300 RSX to achieve a higher overclock on a CECH-2501A. An A0801002/A0801301 occurred when the voltage regulation module exceeded its design specifications. SB UART reported that a busy loop was detected, suggesting that poor filtering under these extreme conditions can cause a PLL unlock. It did not occur on every YLOD. Note, the 25xx models do not have NEC/TOKINs.

14FF (Check Stop)

This error can occur when the console was on at the time the YLOD occurred. On consoles exhibiting this error, subsequent attempts to start the console resulted in a GLOD with 1601/1701 errors, or a YLOD within 2 seconds. SYSCON errors usually show one A0801601/A0801701 occurring at the same timestamp, followed thereafter by 3034/4xxx errors for all subsequent attempts to power it on. Or it will GLOD and throw more 1701/1601 errors. The working theory is that there is a precarious solder joint (BGA or bump defect) teetering on the edge of breaking. It will soon switch to 3034 with or without 4xxx errors.

Complicating the issue is the fact that sometimes people will get a 1301 or 1802 also. It likely has to do with where the joints are failing and it's involving those sub systems briefly before fully breaking.

Unlike Livelocks, which can be caused by both hardware and software conditions, Checkstops are a hardware issue. A checkstop occurs when the CPU or GPU, cache, memory, or I/O bus controller finds something in an impossible state (impossible unless the hardware is broken). Either the error cannot be attributed to a particular bus transfer in progress, or the CPU/GPU detects that the console is stuck (frozen, with no progress being made on that operation). When nothing can be done for a long enough period of time, the checkstop error is logged and BE ATTENTION is driven High. The SYSCON immediately shuts the console down with errors A08014FF and A0801701.

The most likely cause of the error is a failing GPU (RSX) solder joint (BGA or Bumps). A distant second is a failing CPU (CELL) solder joint (BGA).

1601 (CELL Livelock Detection)

CPU is deadlocked and cannot proceed. Some kind of error occurred, preventing a process from completing. It is the software equivalent of trying to pass someone in a hallway and you both keep choosing the same direction to swerve. Now imagine you had exactly 30s to make it to the other end of the hallway to catch an elevator, and it takes 29s to get there. Neither of you can pass, and you both miss your elevators because of it. Now imagine you were supposed to pass an envelope to a person on the 3rd floor, who had 30s to read it and enter it in a spreadsheet. Now he misses his deadline too. And imagine the entire organization was micromanaged like this. One disruption can cause the whole operation to grind to a halt! That's kinda how this works.

Basically this means the console froze and had to reboot. In the PS3 this is often preceded by graphical artifacting. The cause is often a solder joint on the RSX (BGA or Bumps). Generally these errors are seen in the early stages of a GPU failure. However CPU failures cannot be ruled out. They are just less likely.

Speculation:

As the impedance of propagating solder cracks increases, the digital logic core has a harder time calibrating the FlexIO during BitTraining. Once impedance reaches the limits of the compensation network, interference causes random issues during software execution. LiveLock conditions cause BE Attention signal to be driven High and the SYSCON shuts the console down (YLOD) with errors A0801601 and A0801701.

As the console cools the microscopic gaps in the solder can be physically reconnected by thermal warping. Warping is due to differences in the Coefficients of Thermal Expansion (CTE) between materials in the motherboard and processor. This expansion and contraction can reconnect the solder joints just enough to allow the console to boot. Or it may disconnect them.

  • If they reconnected, the console will boot until it experiences another 1601/1701 event.
  • If they do not reconnect, the console cannot complete BitTraining and will fail in POST with error A0403034, often with an associated Data error such as A0404401 (if the broken solder joint affected a Data line on one of the SPI lines). If there is no Data error, the broken joint only affected the voltage for the SPI line: either RSX_VDDR or YC_RC_VDDIO.

If a YLOD turns into a GLOD after reball/reflow then 1601 (with or without 1701) could mean the RSX RAM was damaged. This is a loose association based on a few user reports.

1701 (CELL BE ATTENTION)

BE ATTENTION is an active-high output flag sent by the CPU to the SYSCON. During initialization & configuration it is used to request an operation by the SYSCON. When ATTN goes High the syscon reads the SPI Status Register to determine the cause of the Attention signal. It remains high until software resets the condition that caused it.

After Power On Reset the BE attention signal is driven low and is supposed to stay there! If there is a Checkstop error (14FF), Livelock Detection (1601), or PLL Unlock (1301) the CPU enters a fault condition and raises the Attention signal (1701) during operation. The SYSCON sees this and immediately shuts the console down with error code A0801701 and usually another error indicating the cause. One common way this happens is when a solder connection breaks while the system is on. This could be the BGA (Ball Grid Array) or the Solder Bumps under the die.

Going into more detail, BE Attention is used during Power On Reset (POR)...

  • To load CPU VID voltage from the VRM internal registers.
  • To Write configuration-ring data (Important CPU Config settings that should only be modified at boot, otherwise errors can occur).
  • To calibrate the FlexIO interface (BitTraining).

If Attention occurs during the Power ON State (Step# 80) it indicates an error condition. Basically, something is flagged by the Processor as abnormal. It's forced to attempt to resolve the problem before it can continue with whatever it was trying to do. If the error condition cannot be resolved, the CPU sends the ATTENTION signal to the SYSCON. The SYSCON immediately shuts off the console, then reads the SPI Status Register to determine the cause. Then it records the A0801701 in its error log along with the specific cause (if it determined one). Errors that can cause the Attention include:

  • Unresolved Checkstop errors (14FF)
  • Livelock Detection (1601)
  • PLL Unlock Condition (1301)
  • BGA/Bump Defect that occurs while the Console was On (Step# 80). Subsequent attempts to power on the console would result in 3034/4xxx errors.
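As a rough illustration of reading a 1701 together with its companion code, the sketch below groups log entries that share a timestamp and reports which of the causes listed above accompany each 1701. The entry format (a `(timestamp, code)` pair) and the function name `attention_causes` are assumptions made for this example; real tools print the log in their own formats.

```python
from collections import defaultdict

# Companion causes named in this section (category + error).
ATTENTION_CAUSES = {
    "14FF": "Checkstop",
    "1601": "Livelock Detection",
    "1301": "PLL Unlock",
}


def attention_causes(entries):
    """Map each timestamp that has a 1701 entry to its companion causes.

    `entries` is an iterable of (timestamp, code) pairs, where `code` is an
    8-character string such as 'A0801701'. This layout is hypothetical.
    """
    by_time = defaultdict(list)
    for timestamp, code in entries:
        by_time[timestamp].append(code.upper()[4:8])  # keep category + error

    result = {}
    for timestamp, cerrs in by_time.items():
        if "1701" in cerrs:
            result[timestamp] = [
                ATTENTION_CAUSES[c] for c in cerrs if c in ATTENTION_CAUSES
            ]
    return result
```

An empty list for a timestamp means the 1701 was logged without one of the three named companion codes, which is the case where the later 3034/4xxx entries become the more useful clue.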

A user got this error code with a damaged hard drive. He was transferring some games via FTP, and his console turned off with a YLOD. When he tried to turn it on again, he got a GLOD. The problem was fixed just by changing the HDD.

1701 has been reported from using homebrew apps that caused a software conflict. Uninstalling the software can resolve the issue. If that's not possible because the system is locked up, it may be necessary to restore the operating system (OS).

1701 and 1601 can also occur when exiting heavier games on a DEX firmware paired with multiMAN. My guess is that DEX and webMAN allocate 2 more megabytes of RAM (the webMAN FPS counter shows 18 allocated to the XMB on CEX and 20 on DEX), which interferes with the SYSCON fan curve and causes it to get "stuck". The console then overheats and YLODs upon exiting the game, although it reboots fine, which may trick people into thinking webMAN or DEX overheats their consoles. I haven't looked into this issue much, but it arose when converting to DEX and converting back to CEX fixed it, so I believe there are fringe cases where this is an issue.

1802 (RSX Initialization)

A0801802 occurs after the console has booted (step# 80) and is accompanied by a BE Attention alarm (1701) raised when a Checkstop error (14FF) occurs. The 1802 was likely the hardware failure that caused the checkstop error. That causes BE ATTENTION to be driven High, and the SYSCON shuts the console down with A0801802, A08014FF, and A0801701. This makes sense because the CPU could not continue with its process when the RSX interrupt occurred. These errors have been seen in consoles that were repaired by an RSX reball/replacement.

1802 is confirmation that the RSX was involved, if there's any doubt about what's causing a 3034.

1900 (RTC Voltage)

RTC voltage

1901 (RTC Oscillator)

RTC oscillator

1902 (RTC Access)

RTC access

1b01 (CELL Initialization)

CPU Thermal Sense Error. Thermal Monitor (IC1101) external sense line. Check C1103, R1106/7 & replace IC1101 (COK-00x) before reballing CPU. If all else fails the CPU's thermal diode is dead.

This error tends to occur at step number 20, during core initialization.

It is suspected that removing R1106/7 or the CPU itself will cause A0201b01/A0A02030 errors. This hasn't been confirmed through sabotage testing.

1b02 (RSX Initialization)

RSX Thermal Sense Error. Thermal Monitor (IC2101) external sense line. Check C2103, R2101/2 & replace IC2101 (COK-00x) before reballing/Replacing RSX. If all else fails the GPU's thermal diode is dead.

This error tends to occur at step number 20, during core initialization.

Confirmed that removing R2101/2 or the GPU itself causes A0201b02/A0A02031 errors.


Fatal Errors


  • These fatal error codes seem to be repeated up to 3 times for 3 special cases. For example, errors 2003, 2103, and 2203 are all related to the southbridge; the only thing that changes in the error code is the second digit (located immediately after the category 2). If at some point we find out what that second digit means, we can join the wiki page sections together (with titles: "2001 & 2101", "2002 & 2102", "2003 & 2103", etc...)

In other words, there are 3 groups: 20xx (composed of 13 errors), 21xx (composed of 13 errors), and 22xx (composed of 1 error). See Discussion
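The grouping described above can be expressed as a small helper that separates the second digit from the rest of a fatal code, so that 2003, 2103, and 2203 all map to the same base. This is only a sketch for pairing up codes; the function name `split_fatal_code` is hypothetical, and the meaning of the second digit remains unknown.

```python
def split_fatal_code(cerr: str):
    """Split a 4-digit fatal code into (group digit, base code).

    '2103' -> ('1', '2003'); '2003' -> ('0', '2003').
    The meaning of the group digit is not known; this only pairs up codes.
    """
    if len(cerr) != 4 or cerr[0] != "2":
        raise ValueError(f"not a fatal error code: {cerr!r}")
    group = cerr[1]                  # second digit: 0, 1, or 2
    base = cerr[0] + "0" + cerr[2:]  # same code with the group digit zeroed
    return group, base
```

With this, two log entries can be treated as siblings when their base codes match, for example 2020 and 2120.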

2001 (CELL)

CELL (IC1001)

2002 (RSX)

RSX (IC2001)

2003 (South Bridge)

South Bridge Error (IC3001)

2010 (Clock Subsystems)

Clock Generator Error (IC5001)

2011 (Clock CELL)

Clock Generator Error (IC5003)

2012 (Clock CELL)

Clock Generator Error (IC5002)

2013 (Clock CELL, RSX, South Bridge)

Clock Generator Error (IC5004)

2014 (Unknown)

Bad GPU NEC/TOKINs if associated with 1002. Was reported in a DIA-002 with other errors: A0091002/A0102014, A0101002/A0102113, and also A0101002 + 10x A0202120s. Presumably caused by failed RSX tokins.

2020 (HDMI)

HDMI Error (IC2502)

This code is not diagnostic on its own. When coinciding with 1601/1701, 14FF, 1301, and 3034 it usually means a GPU issue. When coinciding with a 1002 it's usually NEC/TOKIN proadlizers. When they occur in bunches AND without more diagnostic codes, all in the same power on, it may be the MultiAV or HDMI Transmitter ICs. The presence of other codes give you context to their meaning.
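The rule of thumb above can be written down as a small decision function. This is a hedged sketch, not a diagnostic tool: the function name `interpret_2020`, the returned wording, and the `bunch_size` threshold (how many 2020s count as a "bunch") are all assumptions made for illustration. The input is the list of 4-digit codes logged during a single power on.

```python
from collections import Counter

# Codes that, per the text above, point toward a GPU issue when seen with 2020.
GPU_HINTS = {"1601", "1701", "14FF", "1301", "3034"}


def interpret_2020(cerrs, bunch_size=3):
    """Give a tentative reading of 2020 based on the codes logged with it.

    `cerrs` holds the 4-digit codes from one power on. `bunch_size` is an
    arbitrary threshold chosen for this sketch, not a documented value.
    """
    counts = Counter(c.upper() for c in cerrs)
    if counts["2020"] == 0:
        return "no 2020 present"
    if GPU_HINTS.intersection(counts):
        return "GPU issue likely"
    if counts["1002"]:
        return "NEC/TOKIN Proadlizers likely"
    if counts["2020"] >= bunch_size:
        return "MultiAV or HDMI transmitter IC possible"
    return "inconclusive"
```

The order of the checks mirrors the order of the sentences above: companion codes are considered first, and a bunch of 2020s only counts when nothing more diagnostic was logged in the same power on.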

2021 (Unknown)

Rare code. Occurred in a CECHBxx model, with 10x A0202121 occurring throughout the power on sequence before the bootloader started. It did not prevent boot before the console overheated. The error log shows two combos of 10x 2121 + 1200, one of which also coincided with A0802021. The log shows the console originally had a NEC/TOKIN issue (A0801002) before this started. While this is an unknown error combo, it may be similar to 2120. On another console, timestamps showed A0101002 occurred first and then 10x 2120s occurred over the next 10 seconds. Replacing the tokins fixed that console. It's possible this is a similar situation, but with a new code.

2022 (DVE)

DVE Error (IC2406, CXM4024R MultiAV controller for analog out)

This error may be normal in an otherwise working console. It has been observed in the error logs of perfectly operational units and can occur naturally from AV issues.

This error has been observed with no video output over HDMI on a Samsung Smart TV. The user reproduced the error by making the TV detect another console first (a PS4), turning off the TV, swapping the HDMI cable from the PS4 to the PS3, and turning the TV back on.

This error is also present when the console produces graphical artifacts on the screen. The console freezes and cannot be used, forcing the user to turn off the console. This produces the 2022 error code and is an early sign of GLOD.

It is often seen coinciding with 1601/1701, 14FF, 1301, and 3034 in case of Bad GPU (Common). DVE or HDMI Transmitter possible. If so, multiple errors at the same timestamp allow you to distinguish between causes.

This error can also appear when opening and closing a PS2 emulated game on a CFW console, in both Evilnat and Rebug. The errors appear in pairs. If this is the case, there is no reason for concern.

2024 (AV)

This code is not diagnostic on its own. When coinciding with 1601/1701, 14FF, 1301, and 3034 it usually means a GPU issue. When coinciding with a 1002 it's usually NEC/TOKIN proadlizers. When they occur in bunches AND without more diagnostic codes, all in the same power on, it may be the MultiAV or HDMI Transmitter ICs. The presence of other codes give you context to their meaning.

This error tends to cause a delayed Yellow Light Of Death (10s - 1min). Sometimes described as a Green Light Of Death (GLOD) or Red Light Of Death (RLOD).

2124 and 2024 errors occurring in random bunches, with several registered per power on attempt, have been fixed by replacing both the AV and HDMI encoders. One user reported 2024/2124 errors resolved by replacing the HDMI encoder. Another removed the HDMI encoder and tested the console without it. That console primarily filled the error log with 2124 errors, but a few 2024s as well. So it is unclear if 2124 is specific to the HDMI Encoder or AV Encoder. It seems it could be either.

A0A02024 occurred in a KTE-001 with a failed Bluetooth/Wifi module step-up voltage converter. A0002024/A0002124/A0003001 occurred when attempting to power on without 12v connected; A0A02024 was also recorded. When 12v was connected the same codes would occur at step no. 09 instead of 00.

2030 (Thermal Sensor, CELL)

Thermal Monitor (IC1101) external sense line. Check C1103, R1106/7 & replace IC1101 before reballing CPU. If all else fails the CPU's thermal diode is dead. Was seen in a PS3 that was destroyed by a heatgun. Also had A0A02031/2033 & A0902031.

Speculation: 2030-33 errors reported in case of dodgy PWR/EJT daughter board.

2031 (Thermal Sensor, RSX)

GPU Thermal Monitor (IC2101) external sense line. Check C2103, R2101/2 & replace IC2101 before reballing/replacing RSX. If all else fails the GPU's thermal diode is dead. Confirmed: when the RSX is removed, you'll get 1b02/2031 at step number 20. Was seen in a PS3 that was destroyed by a heatgun, which also had A0A02030/2033 & A0902031. Once reported to be caused by a checksum mismatch at address 3dfe.

2033 (Thermal Sensor, South Bridge)

Typically a dead SB Thermal Monitor IC. Check nearby SMDs & traces. Was seen in a PS3 that was destroyed by a heatgun. Also had A0A02030/2031 & A0902031.

2040

Sabotage testing on a KTE-001 board found that removing F6300 caused an A0012040 error. This fuse appears to be on the 12v line.

On Super Slim models, reflow or reball the CPU.

2044 (Super Slim short circuit - BT/Wi-Fi and 5Volt)

2101 (CELL)

CELL (IC1001)

Often coincides with A0403034, indicating the GPU needs to be replaced (usually). Delidding can cause this, so look for trace damage. In one case, errors A0402101 / A0403034 occurred because RSX TX1 was shorted to ground by a nicked RSX trace during the delid. TX is the transmit line, so the CPU didn't receive data from it, and noted the error (BitTraining BE:RRAC:BX0:BX:FLEXIO_ID).

2102 (RSX)

RSX (IC2001)

In several reports IC6301 replacement fixed it. In one case, a 10x 2120 / 1x 2102 combo was fixed by replacing the RSX_VDDIO voltage controller (IC6317). RSX_FBVDDQ (VRAM voltage) implicated. In most cases, it's an RSX failure, sometimes coinciding with A0403034 or other codes indicating GPU failure. Often seen after a reflow attempt.

2103 (South Bridge)

Southbridge Error (IC3001)

2110 (Clock Subsystems)

Clock Generator Error (IC5001)

This error can be caused by a 5V_MISC short to ground. One user had an A0022110 after replacing IC6105 (Buck Converter) and accidentally bridging the 5V voltage input. So check the 5V line for shorts.

This error has been resolved by a number of users who had a short on F6001. It is important to note that something usually causes that fuse to blow, like a short. So it's important to troubleshoot the board to find and repair the shorting component before replacing the fuse. Otherwise the new one will blow too.

One user, who resolved this error on his C model PS3, noted "very short YLOD. Error code shows 2110[...]Some earlier code shows 1001 and 1002." The 1001 & 1002 errors he noted in the log before the 2110 appeared may have been a clue that C6019 or C6020 (as they are in parallel) was deteriorating. Further investigation is needed to confirm this hypothesis, however. In his case, C6019 was shorting and caused F6001 to blow. This short overloaded F6001 and cut power to many Subsystems, such as the HDD, USB ports, South bridge, CPU, GPU, etc. Another user confirmed this. The error log was showing code 2110 and one entry earlier was showing code 1001. Checking both capacitors after removing them from the board, confirmed that one capacitor was reading 140 ohms and not reading as a capacitor, so it was working as a resistor causing extra load in the fuse.

One particularly noteworthy component is IC6020, which supplies +3.3v_MK_Vdd to the clock generator (IC5001). When F6001 blows, a 02 2110 is generated. A step number of 02 is very early in the power on sequence (POS), which explains why 2110 is triggered instead of another error code. Since the clock generator is critical for timing, it is one of the first things the SYSCON checks during the POS.

2111 (Clock CELL)

Clock Generator Error (IC5003)

Once reported in a console with a bad RSX Thermal monitor. It had mostly 2031 errors at various step numbers; the 2111 was a rare occurrence. SYSCON reported it as an "Unrecoverable FATAL ERROR by thermal." Check C2103, R2101/2 & replace IC2101 before reballing/replacing RSX. If all else fails the GPU's thermal diode is dead.

2112 (Clock CELL)

Clock Generator Error (IC5002)

2113 (Clock CELL, RSX, South Bridge)

Clock Generator Error (IC5004)

Analog Voltage for the core PLL of IC5004, which is an ICS9214 Clock Generator used to support the Rambus XDR memory subsystem and Redwood logic interface.

SW_1_B enables control Pin 5 on IC6013, which generates +2.5V_LREG_XCG_500_MEM. If that fails it generates A0092113.

Reportedly fixed by replacing IC5001. One person tried replacing X5301 but shorted C5142 (2.5V to GND). This killed power to IC5004 (RSX/CELL/SB Clock Generator for FlexIO) and caused error A0092113. IC5004 relies on the +1.2V_YC_RC_VDDIO reference voltage to carry the signals, which can be affected by RSX/CPU faults. Another possibility is F6302 or nearby SMDs, which supply 1.7V_MISC to IC6303 to generate +1.2V_YC_RC_VDDIO, among other voltages required to start the CPU/SB/GPU.

2114 (Unknown)

Bad GPU NEC/TOKINs if associated with 1002 and/or 3004. Has been reported in VER-001 and DYN-001 motherboard revisions. Related codes have been reported in a DIA-002, with A0091002/A0102014, A0101002/A0102113. That was presumably caused by failed RSX TOKINs.

A lone 2114 or one associated with 2124, 3020, 1301, and/or 3034 may be GPU/BGA related. Possibly the HDMI encoder (MN864709), or the Texas Instruments 88J9LKK C5714 G4 clock generator, but evidence for both of those cases is weak. However, given its similarity to error code A0092113, which is related to the clock generators, a connection is suspected.

2120 (HDMI I/O Error)

NOTE: Context matters with this error code! The step number and the number of codes per YLOD is different. Careful observation allows you to diagnose the most likely cause.

2120 means an issue has occurred with the high speed data bus connection between the DVE <--> RSX <--> HDMI transmitter. This I/O error DOES NOT mean the HDMI encoder (IC2502) is bad. It is context based: associative, not diagnostic by itself. You must infer the diagnosis by using other, more diagnostic codes and observing console behavior to identify the cause.

Count the number of 2120s your SYSCON records per YLOD event; look at the timestamp. 10x A0202120 + A0213013 error combinations appear to be related to VDDIO, the reference voltage powering the I/O bus. IC6301 is involved in the formation of +1.7V_MISC, which among other things provides input power to the DC-DC converters that output +1.2V_YC_RC_VDDIO, +1.5V_YC_RC_VDDA, +1.2V_SB_VDDC and +1.2V_SB_VDDR. Lack of voltage to these DC/DC converters downstream of IC6301 suggests F6302 has blown. A number of people have fixed these 2120/3013 error combos by finding shorts at or near C6320 and replacing fuse F6302. But there are many other SMDs nearby that might cause these fuses to blow, so you will need to track down the source of the short and fix it, or the fuse will just blow again.

A bad thermistor (TH2501) has been reported to cause A0002120. It provides over current protection for the HDMI transmitter and output device in case there's a 5v short. This might happen if pins 17 (GND) and 18 (+5v) are damaged on your HDMI port or cable. Or if C2558 or C2570 short. See the service manual circuit diagrams as there are other SMDs that could malfunction and cause this error.

A0802120 and A0902120 errors can be caused by BGA or Bump defects that affect I/O, either the RSX or CELL. BGA defects on RSX VDDIO pads have been confirmed with a pressure test to have caused 2120 errors, but usually only one of them occurs per YLOD event. For example, one YLOD event may generate A0403034, A0404412 and an A0902120 error. This would indicate a bad GPU, not a bad HDMI transmitter. And since it occurred during the shutdown state (step number 90) this excludes issues that would have generated an error earlier in POST, like a fuse or a short in the Voltage regulation module (VRM).

The HDMI transmitter (IC2502) can also cause A0802120 and A0902120 errors, either the IC itself or any of the SMDs between it and the RSX. You can tell a genuine HDMI transmitter issue apart because there are multiple A0802120 errors occurring during the bootloader, after the console has completed the power on self test (POST). The step number 80 (power on state) excludes fuse and VRM issues. You will usually see a varying number of 2120s, such as 4 or 6 of them. This is different from the 10x 2120/3013 error combo or the 3034/4xxx/2120 combo described earlier.
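The counting approach described for 2120 can be sketched in code. This is only an illustration of the reasoning above, not a diagnostic tool: it assumes the log has already been read into (timestamp in seconds, code) pairs, that entries within 60 seconds of each other belong to the same YLOD event (an arbitrary window), and that entries follow the A0 + step + error layout seen in the examples. The function names are hypothetical.

```python
from collections import Counter


def group_events(entries, window=60):
    """Group (timestamp, code) pairs into events by timestamp proximity.

    The 60 second window is an assumption, not a documented SYSCON value.
    """
    events = []
    for ts, code in sorted(entries):
        if events and ts - events[-1][-1][0] <= window:
            events[-1].append((ts, code))
        else:
            events.append([(ts, code)])
    return events


def classify_2120(event):
    """Return a hint for one event, following the patterns described above."""
    codes = [code.upper() for _, code in event]
    errors = Counter(code[4:] for code in codes)  # last 4 digits = error code
    count = errors["2120"]
    if count == 0:
        return None
    if count >= 10 and errors["3013"] > 0:
        return "VDDIO supply: check F6302 and shorts near C6320"
    if count == 1 and (errors["3034"] > 0 or any(e.startswith("4") for e in errors)):
        return "RSX/CELL BGA or bump defect"
    if count > 1 and all(c.startswith("A080") for c in codes if c.endswith("2120")):
        return "HDMI transmitter (IC2502) or nearby SMDs"
    return "unclear: check other codes and console behavior"


log = [(100, "A0202120")] * 10 + [(100, "A0213013")]
log += [(5000, "A0403034"), (5000, "A0404412"), (5001, "A0902120")]
for event in group_events(log):
    print(len(event), classify_2120(event))
```

The hints are deliberately coarse. As noted above, 2120 is associative, so the result should be weighed against the other codes and the console's behavior.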

2122 (DVE)

DVE Error (IC2406, CXM4024R MultiAV controller for analog out)

2124 (AV)

This error tends to cause a delayed Yellow Light Of Death (10s - 1min). Sometimes described as a Green Light Of Death (GLOD) or Red Light Of Death (RLOD).

2124 and 2024 errors have been fixed by replacing both the AV and HDMI encoders. One user reported 2024/2124 errors resolved by replacing the HDMI encoder. Another removed the HDMI encoder and tested the console without it. That console primarily filled the errorlog with 2124 errors, but a few 2024's as well. So it is unclear if 2124 is specific to the HDMI Encoder or AV Encoder. It seems it could be either.

2130 (Thermal Sensor, CELL)

CPU Thermal Monitor (IC1101) external sense line. Check C1002, R1003/4 & replace IC1002 before reballing CPU. If all else fails the CPU's thermal diode is dead.

2131 (Thermal Sensor, RSX)

GPU Thermal Monitor (IC1002). Check C2103, R2101/2 & replace IC2101 before reballing/Replacing RSX. If all else fails the GPU's thermal diode is dead.

2133 (Thermal Sensor, South Bridge)

2203 (South Bridge)

From sabotage tests it was found that disabling +2.5V_SB_PLL_VDDC produced four A0802203 errors. Also, disabling +1.2V_SB_VDDR produced A0302203 & A0403034.

Sometimes seen with a "SB Counter Error -  Explicit Bug" in the bringup log. Often accompanies CPU (1200) or GPU (1201) overheats. Once occurred in a GLOD, after holding power: "SB (FATAL) XDR Link not initilized."

2310


Fatal Boot Errors


3000

Power Failure

3001

12v Power Failure

Usually this is caused by a bad Power Supply Unit (PSU).

Alternatively, a failure on the 12v_main line can cause it. Check fuses, capacitors, resistors, and ICs on the 12v line. Measure the resistance of the large 2 prong 12v connector on the motherboard. It should read in the kilo-ohm range if there is sufficient separation. Otherwise you may have a short somewhere on the line.

3002

Power Failure

3003 (CELL Core Power Failure)

This error will occur in the case of a power failure on the main core voltage of the CPU (VDDC). Suspects include the CPU bulk filter caps (e.g. NEC/TOKIN) or any SMD in the feedback and compensation network of the Voltage Regulation Module (VRM), including the Buck Converters (AKA IOR Power Blocks).

A shorted Blu-ray drive can cause this error as well. Make sure your drive is working properly before doing anything else to your console.

3004 (RSX Core Power Failure)

This error will occur in the case of a power failure on the main core voltage of the GPU (VDDC). Suspects include the bulk filter caps (e.g. NEC/TOKIN) or any SMD in the feedback and compensation network of the Voltage Regulation Module (VRM), including the Buck Converters (AKA IOR Power Blocks).

3005

One user had A0043005 on a PQX-001 and found that the RSX was shorted out, which caused the error. The user was unable to fix it.

This error will also occur when fuse F7601 has blown on a PQX-001.

3010

CELL Error

Observations: A user triggered this error by injecting 3.3V into PWRGD (power good) of IC6103 (NCP5318 CPU Buck Controller). It generated error 20 1001 and 20 3010. Another user (Razmann4k) got this error on their CECHL04 by attempting the eraser mod on an already delidded Cell and noticed a crack running down the middle of the Cell die. It caused 20 3010. A 20 3010 error was also observed on a CELL that was physically damaged during a delidding attempt by the console owner.


This problem may be related to the PLL signal generator circuit, open resistors, the crystal oscillator or even the integrated circuit itself (CDC735/CDC736/4227ANLG).

RSX FBVDDQ shorts, BE thermal/PLL VDDA open line, PWM signal disruption to CPU Buck Converters at startup have all been known to cause A0203010 errors. Seen in consoles that also had or developed 3034/4412.

3011

CELL

3012

CELL

3013

BE_SPI DI/DO ERROR

CELL is not communicating with Syscon via SPI (no output on 1.2V MC2_VDDIO and 1.2V BE_VCS). Possible shorts on the line; check C4001 and trailing caps. Possibly a dead CPU.

Another user had one on a CPU he damaged while delidding.

A0212120/A0213013 error combinations are common. They appear to be related to VDDIO. IC6301 is involved in the formation of +1.7V_MISC, which among other things provides input power to the DC-DC converters that output +1.2V_YC_RC_VDDIO, +1.5V_YC_RC_VDDA, +1.2V_SB_VDDC and +1.2V_SB_VDDR. Lack of voltage to these DC/DC converters downstream of IC6301 suggests F6302 has blown. A number of people have fixed these 2120/3013 errors by finding shorts at or near C6320 and replacing fuse F6302. But there are many other SMDs nearby that might cause these fuses to blow, so you will need to track down the source of the short and fix it, or the fuse will just blow again.

One person reported A0202120/A0213013 when his CPU substrate (interposer) was cracked in half by a failed delid attempt.

Through sabotage testing it was found that disabling +1.2V_YC_RC_VDDIO caused A0213013.

Also through sabotage testing, it was found that when L6305 is removed it cuts off +1.8V_RSX_FBVDDQ (VRAM voltage). It caused a 10x A0202120 & 1x A0213013 error combo.

3020

CELL

A0233020 occurred during the readiness check after VDDC is formed. It suggests a voltage instability or error preventing the CPU from reporting power good back to the SYSCON. It has occurred in a console where every chip was heatgunned. Associated errors in that log were A0A02031/A0201802 (RSX thermal monitor and interrupt). Before the heatgun treatment it had A0801301/A0802120 (BE_PLL & VDDIO error). In another console it coincided with A0231002 (RSX VDDC filtering). That console had A0003001, A0002120/A0221002, A0221002/A0222120 and A0231002/A0233020, indicating a more serious issue with the PSU, fuses or core voltage ICs.

3030

CELL

Reportedly, a CPU BGA defect caused by delid. No trace damage or knocked SMD's observed.

3031

CELL XGC REF Voltage Error

Error during CPU initialization. This error appears to be a CPU BGA defect. In one PS3, it was caused by an "eraser mod," which puts pressure underneath the CPU (bad idea). In another, after delidding the GPU/CPU, an A0313032 was reported after R5167 was knocked off. That resistor carries the +1.2V_YC_RC_VDDIO reference voltage for the CPU's Redwood FlexIO ADC differential reference clock pair (BE_RC_REFCLK_P), so this was an open line fault. The user replaced the resistor and got A0402101 / A0403034 because RSX TX1 was shorted to ground by a nicked RSX trace during the delid. TX is the transmit line, so the CPU didn't receive data from it and noted the error (BitTraining BE:RRAC:BX0:BX:FLEXIO_ID). After he disturbed the nick, the error changed to A0313031.

3032

CELL BE XGC REF Voltage Error

Error during CPU initialization. This error appears to be a CPU BGA defect. In one PS3, A0313031 was caused by an "eraser mod," which puts pressure underneath the CPU (bad idea). In another, after delidding the GPU/CPU, an A0313032 was reported after R5167 was knocked off. That resistor carries the +1.2V_YC_RC_VDDIO reference voltage for the CPU's Redwood FlexIO ADC differential reference clock pair (BE_RC_REFCLK_P), so this was an open line fault. The user replaced the resistor and got A0402101 / A0403034 because RSX TX1 was shorted to ground by a nicked RSX trace during the delid. TX is the transmit line, so the CPU didn't receive data from it and noted the error (BitTraining BE:RRAC:BX0:BX:FLEXIO_ID). After he disturbed the nick, the error changed to A0313031.

It was discovered through sabotage testing that disabling +1.5V_YC_RC_VDDA caused error A0313032.

3033

CELL

This error has been triggered when pad N12 (RSXVRM_VID0) was damaged, preventing RSX VDDC voltage from being set correctly. SYSCON sets the CPU VID just before the Config ring data is loaded. Apparently, SYSCON sets RSX VID on IC6201 (Buck Controller) at step number 32, which is just after. These voltages must be stable before the FlexIO can be calibrated (BitTraining at Step No. 40 & ByteTraining at 50 & 51).

3034

CELL / RSX / South Bridge error during Bit-Training

This error occurs when Bit Training fails. Bit Training, also known as bit calibration, is a critical process during the power-on-reset (POR) sequence of the CELL BE processor. It fine-tunes the behavior of individual bits within the 8-bit-wide Rambus channels. This adjustment accounts for variations in circuitry, wiring, and loading delays. Bit training plays a pivotal role in optimizing signal quality by calibrating the signal driver current and driver impedance, and by ensuring that the timing of each of the eight data bits aligns with clock edges, effectively centering the data "eye" and allowing for more accurate and reliable data transmission.

Remember, it's NOT ALWAYS a bad connection between the CPU and GPU. Bit training calibrates the connection between the GPU, CPU AND South bridge. For example, A0403034 occurred on a VER-001 with a probable BGA defect: by putting pressure on the southbridge, the console would boot. Look at the data error and other information from the console before assuming a bad GPU.

This is the most common error seen in early Phat model PS3s with the 90nm RSX. It is the hallmark of solder fatigue (such as a cracked solder ball or bump defect) which affects the Flex IO interface that allows the CPU, GPU, and SB to communicate. It is by no means limited to the early models, however. These errors have been seen in every model of PS3 with varying frequency. However, it's most common in the earliest models, likely due to a manufacturing defect in the 90nm RSX material set, namely a CTE mismatch between underfill and bump material that leads to premature solder fatigue and GPU failure. Dubbed "BumpGate," this is a well known failure modality among GPUs manufactured from 2005-2008. Although it has not been proven unequivocally that the 90nm RSX is affected by Bumpgate, members of the community have shown the 90nm RSX has an increased failure rate, similar material set, and exhibits similar symptoms to known bumpgate affected chipsets - such as black screens (GLOD), graphical artifacts like lines, double images, color splotches and pixelation.

While Bumpgate is a plausible explanation, it's not the only one. The materials used to construct the motherboard and processors have different coefficients of thermal expansion (CTE). This means they will expand and contract at different rates as the chip heats up and cools down, which applies force to solder connections. Over many thermal cycles this deforms the solder and causes a defect. That may affect the bumps, which attach the silicon die to the interposer (sometimes referred to as substrate), or the Ball-Grid Array (BGA), which connects the interposer to the motherboard.

3034 is triggered when Bit calibration, also known as BitTraining, cannot complete correctly. So it is not limited to a singular cause. BGA defects from thermal cycling, drop damage, pulling force from separating the heat sink from the processors while disassembling, or delidding can occur. The bumps on CPU, GPU, or SB can fail, Flex IO traces that connect them can be broken/scratched, or accumulated damage from wear and tear (electromigration) can also cause BitTraining to fail. Anything that can disrupt the impedance of the FlexIO can cause BitTraining to fail. A skilled technician will need to use deductive reasoning to diagnose the cause and choose the appropriate repair.

A qualitative test known as a "pressure test" may be used to help make a diagnosis. Applying slight pressure, within reason (not your body weight or clamping force, which could cause a BGA defect), to the processor flexes the motherboard beneath the BGA and "may" temporarily reconnect a solder ball with its pad, like holding 2 wires together. This can cause flickering on screen, a console to power on when it couldn't before, etc. If the console or error responds differently when pressure is applied, this may be taken as evidence of a BGA defect. It is not definitive, but tips the odds in favor of that diagnosis. A reball in that case may be successful. However, if it does not respond to pressure, the BGA is not likely to be the cause and another explanation, such as the bumps, is more likely. It should be noted that bumps can be affected by force as well, but because the underfill supports them, it generally requires more force to reconnect them using this method. This is what the "Bolt mod," commonly performed on the XBOX 360, did. That much force permanently deforms the motherboard and causes irreparable damage. DO NOT DO THIS! But it illustrates the point. You don't need much force to see if the BGA is affected, and if it responds to light pressure, it's unlikely to be the bumps. Therefore, taken together with other clues, it can be helpful to a skilled technician gathering evidence for a diagnosis.

In consoles with a 90nm RSX (CECH-Axx/Bxx/Cxx/Exx/Gxx/Hxx, M03 and Q00 models) the most likely cause of a 3034 is the GPU itself. It can be replaced with another 90nm RSX without modification. However, it can also be replaced with a more reliable 65nm or 40nm model, using a process nicknamed a "Frankenstein Mod." SONY service technicians performed this modification in some officially refurbished consoles. The PS3 community has developed a method as well. Since there is a question about the 90nm RSX's reliability and both a reball and Frankenstein mod require the 90nm to be desoldered, it is advisable to replace the 90nm GPU with a more reliable model instead of risking another 90nm GPU. Rework is hard on the motherboard and surrounding components, so choosing a repair with the fewest uncertainties is wise.

In models without the 90nm RSX, 3034 is still possible, but far less likely to be caused by the GPU. CPU BGA defects are common in dropped consoles, those that have been delidded or have trace damage to the area around the processors. So troubleshooting is necessary to make a diagnosis.

3035

CELL and RSX error during Byte-Training

Failing GPU. RSX BGA or Bump Defect. A gradual decline in the solder connection affected Byte Calibration, but it managed to pass bit calibration first. A0403034 is soon to follow. As electromigration wears down the RSX core, A0801601/A0801701 become A0501802/A0503037, A0503035, and finally A0403034.

3036

CELL and RSX

3037

CELL and RSX

RSX BGA or Bump Defects have caused A0503037/1802. A gradual decline in the solder connection affected Byte Calibration, but it managed to pass bit calibration first. A0403034 is soon to follow.

3038

CELL and RSX

3039

CELL and RSX

Occurred in a CECHL04 coinciding with a check stop error (14FF) during IO initialization at step# 52, which is after Byte-Training, but before the flash firmware sequence at step# 60. So maybe it's starship 2 related? Or it could be CPU/GPU related. Unknown.

3040

Flash

A0603040 is known to be caused by not soldering the flash (NAND/NOR) back on properly. It happens when the flash is not powered. Step #60 is when the StarShip 2 flash controller and NAND/NOR are initialized, kicking off the firmware sequence that loads the Operating System. Check their voltages and be sure the FW is not corrupt. If you have a backup, you could try replacing the Flash to see if a module failed.

3041

A0523041 has only been reported once. Step #s 50-60 are when Southbridge peripherals are initialized. Step 52 is the last step before 60, when the flash and controller (SS2) are initialized and A0603040 can occur. Speculation: perhaps 3041 is related to the SS2 or another SB peripheral, or perhaps to a flash solder connection or corruption issue. We don't know. Too few reports.
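The step numbers mentioned throughout the entries above can be collected into a small lookup, sketched here. It only contains steps that this page itself describes (some of them hedged as "apparently" in the text), it assumes the A0 + step + error layout seen in the examples, and the names `STEP_NOTES` and `describe_step` are hypothetical.

```python
# Step numbers as described on this page only; this is not a complete list.
STEP_NOTES = {
    "02": "very early in the power-on sequence (clock generator check)",
    "32": "SYSCON sets RSX VID on IC6201 (apparently)",
    "40": "BitTraining",
    "50": "ByteTraining",
    "51": "ByteTraining",
    "52": "IO initialization, after Byte-Training and before flash",
    "60": "StarShip 2 flash controller and NAND/NOR initialization",
    "80": "power on state (after POST)",
    "90": "shutdown state",
}


def describe_step(entry):
    """Look up the step number (digits 3-4) of an entry such as 'A0603040'."""
    step = entry.strip().upper()[2:4]
    return STEP_NOTES.get(step, "not described on this page")


for entry in ("A0603040", "A0403034", "A0A02031"):
    print(entry, "->", describe_step(entry))
```

Keeping the step as a two-character string avoids guessing whether it is decimal or hexadecimal; entries such as A0A02031 show that non-decimal digits do occur.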

Data Errors


  • These error codes seem to be repeated up to 5 times for 5 special cases. For example, errors 4001, 4101, 4201, 4301 and 4401 are related to CELL; the only thing that changes in the error code is the second digit (located immediately after the category). If at some point we find out what that digit means, we can join the wiki page sections together (with titles: "4001, 4101, 4201, 4301, 4401", etc...)

4001

CELL

4002

RSX

4003

Southbridge

4011

CELL

4101

CELL

4102

RSX

4103

Southbridge

4111

CELL

4201

CELL

4202

RSX

4203

Southbridge

4211

CELL

4212

RSX

4221

CELL

4222

RSX

4231

CELL

4261

CELL

4301

CELL

4302

RSX

4303

Southbridge

4311

CELL

4312

RSX

4321

CELL

4322

RSX

4332

RSX

4341

CELL

4401

CELL or RSX

4402

CELL or RSX

4403

CELL or RSX

4411

CELL or RSX

4412

CELL or RSX

4421

CELL or RSX

4422

CELL or RSX

4432

CELL or RSX

4441

CELL or RSX
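The pattern in the data error entries above can be sketched as a digit-by-digit split. The mapping of the last digit (1 = CELL, 2 = RSX, 3 = Southbridge) is only an observation drawn from the 40xx-43xx entries listed here, and the 44xx entries are all listed as "CELL or RSX"; the meaning of the second digit is unknown, as noted above. The function name and field names are hypothetical.

```python
# Observed on this page for 40xx-43xx codes; not an official mapping.
LAST_DIGIT_COMPONENT = {"1": "CELL", "2": "RSX", "3": "Southbridge"}


def describe_data_error(code):
    """Split a 4xxx data error into its digits and the listed component."""
    code = code.strip()
    if len(code) != 4 or not code.isdigit() or code[0] != "4":
        raise ValueError(f"not a 4xxx data error: {code!r}")
    case, sub, last = code[1], code[2], code[3]
    if case == "4":
        component = "CELL or RSX"  # every 44xx entry is listed this way
    else:
        component = LAST_DIGIT_COMPONENT.get(last, "unknown")
    return {"case": case, "subcode": sub, "component": component}


for code in ("4001", "4203", "4332", "4412"):
    print(code, describe_data_error(code))
```

Codes not listed on this page may not follow the same pattern, so the result is a hint rather than a lookup of confirmed meanings.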

5FFF

CELL or RSX

This error has recently been attributed to the CPU (CELL), but it is actually due to an error in the NOR of the PlayStation 3 Slim/Super Slim. A failure while performing the exploit can leave the console bricked. To recover, use a hardware flasher such as E3 Flasher, Teensy, etc.

For 3XXX and 4XXX consoles, the brick can be solved with a Teensy. On 4XXX consoles, keep in mind that the flash can be eMMC (12GB), in which case it will not be possible to solve it (for now...).

For Super Slim, reflow or reball the RSX.