User talk:Masterzorag

From PS3 Developer wiki
Revision as of 12:13, 11 July 2014 by Masterzorag (talk | contribs)
Jump to navigation Jump to search

SPU Problems on Linux > 3.2, OpenCL related

As far as I know, I'm the only coding OpenCL on the Cell here, if someone want to test something be warned that due some spufs changes that ppc-kernel-devs are (maybe) trying to fix, latest 3.3/3.4/3.5 branches falls into 'possible circular locking dependency detected' and slowdown runtime.

  • It's stable until 3.2 branch.
  • Even disabling lock debugging it slowdowns without warnings, it happens even with OpenCL samples from IBM.

http://permalink.gmane.org/gmane.linux.ports.ppc.embedded/50547

Latest tested kernels:

  • 3.2.55 works fine
# ./perlin
OpenCL took 22.496168 seconds to compute 1000 frames. Pixel Rate = 46.611316 Mpixels/sec, Frame Rate = 44.452015 frames/sec
Host code took 12.620616 seconds to compute 10 frames. Pixel Rate = 0.830844 Mpixels/sec, Frame Rate = 0.792354 frames/sec
OpenCL provided a 56.101182 speedup
  • 3.3.3/3.4.6/3.5.3 falls into 'possible circular locking dependency detected' and slowdown runtime

Here the slowdown effect:

# ./perlin
OpenCL took 93.280273 seconds to compute 1000 frames. Pixel Rate = 11.241133 Mpixels/sec, Frame Rate = 10.720380 frames/sec
Host code took 12.948244 seconds to compute 10 frames. Pixel Rate = 0.809821 Mpixels/sec, Frame Rate = 0.772305 frames/sec
OpenCL provided a 13.881010 speedup

In this specific case time spent is 4x to do the same thing!
When program runs something is going weird, e.g. in my program I'm used to query an OpenCL builtin function to tell me how many available SPEs there are, and its reply 8.
Using spu_base.enum_shared=1 parameter it should reply 7, so seems that the issue is OpenCL related.

OtherOS region

OtherOS/OtherOS++ region is on HDD (ps3dd), we have new linux tools (ps3sed) and drivers.
To resize ps3da I've tried new ps3sed (manually), unsuccesfully: GameOS always detect corruption and redo its own things.

I've found a way to force resize on 4.46, no emer_init patch, no downgrading: GameOS respect standards.
I can now resize ps3da at arbitrary size.
Swapping HDD on pc is necessary to me to send a couple to SET MAX ADDRESS ata commands to get the job done: set the limit, left GameOS (partition and) format, then reset size back the same way.
On boot all regions are fine, plus empty space as tail, nice to fit a fouth region.

Here I've forced ps3da to use 1216709344 sectors, this left me about 16G for ps3dd.
After that GameOS do it own things, I've resetted ps3da to its real geometry (1250263728) and booted a new petitboot.

root@ps3-linux:~# dmesg | grep ps3disk
[    3.220526] ps3disk_init:601: registered block device major 254
[    3.220549] ps3_system_bus_match:369: dev=6.0(sb_04), drv=6.0(ps3disk): match
[    3.220856] ps3disk sb_04: accessible region 0 start 0 size 1250263728
[    3.220952] ps3disk sb_04: accessible region 1 start 32 size 1212515008
[    3.221045] ps3disk sb_04: accessible region 2 start 1212515040 size 4194296
[    3.221051] ps3disk sb_04: ps3stor_probe_access:133: 3 accessible regions found
[    3.227341] ps3disk sb_04: ps3da is a SAMSUNG HM641JI (610480 MiB total, 610480 MiB region)
[    3.229035] ps3disk sb_04: ps3db is a SAMSUNG HM641JI (610480 MiB total, 592048 MiB region)
[    3.230008] ps3disk sb_04: ps3dc is a SAMSUNG HM641JI (610480 MiB total, 2047 MiB region)

root@ps3-linux:~# ps3sed print_region 3
   0                0       1250263728    1
   1               32       1212515008    8
   2       1212515040          4194296    8

root@ps3-linux:~# create_hdd_region.sh
INFO: device id 3
INFO: number of regions 3
INFO: total number of blocks 1250263728
INFO: last region start block 1212515040
INFO: last region number of blocks 4194296
INFO: new region start block 1216709344
INFO: new region number of blocks 33554376
INFO: new region id 3

root@ps3-linux:~# ps3sed print_region 3
   0                0       1250263728    1
   1               32       1212515008    8
   2       1212515040          4194296    8
   3       1216709344         33554376    1

root@ps3-linux:~# reboot && exit

Last number 1 is wrong, it says that last region has only one acl entry, we need to fix it at 8 entries:

  • manually with ps3sed
  • rebooting

Petitboot finally detect a new ps3dd device, the fourth region, of (33554376 * 512 =) 17179840512 bytes.
All of this with a 3.10.26 kernel and new tools: no vflash hacking involved, linux on vflash7 is deprecated.

Sometimes HDD is reported as second device (something buggy in my kernel?):

root@ps3-linux:~# ps3sed print_device
     flash    1      512           491008        7
     cdrom    3     2048       2147483647        1
      disk    2      512       1250263728        4

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 0f ac e0 ff  00 00 00 00 de ad fa ce  |................|
00000020  00 00 00 00 00 00 00 03  00 00 00 00 00 00 00 02  |................|
00000030  00 00 00 00 00 00 00 20  00 00 00 00 48 45 82 c0  |....... ....HE..|
00000040  10 70 00 00 02 00 00 01  00 00 00 00 00 00 00 03  |.p..............|
00000050  10 70 00 00 01 00 00 01  00 00 00 00 00 00 00 03  |.p..............|
00000060  10 20 00 00 03 00 00 01  00 00 00 00 00 00 00 03  |. ..............|
00000070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  00 00 00 00 48 45 82 e0  00 00 00 00 00 3f ff f8  |....HE.......?..|
000000d0  10 70 00 00 02 00 00 01  00 00 00 00 00 00 00 03  |.p..............|
000000e0  10 70 00 00 01 00 00 01  00 00 00 00 00 00 00 03  |.p..............|
000000f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000150  00 00 00 00 48 85 82 e0  00 00 00 00 01 ff ff c8  |....H...........|
00000160  10 70 00 00 02 00 00 01  00 00 00 00 00 00 00 03  |.p..............|
00000170  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400

ps3vuart-tools

We miss some stuff from old ps3sm-utils, looking to port: temperature, get_fan_policy and set_fan_policy to new ps3vuart-tools.
We need to enable some sort of fan control on petitboot now.

root@fedora_clone ~]# /home/ps3vuart-tools-2012-09-01/ps3sm/ps3sm get_fan_policy 0
0x01 0x01 0x48 0x00

[root@fedora_clone ~]# /home/ps3vuart-tools-2012-09-01/ps3sm/ps3sm temperature 0
01 00 00 00 3f 49 00 00

Updating the Real Time Clock with hwclock results in error:

Mar 31 18:09:18 fedora_clone kernel: os_area_queue_work_handler: Could not update FLASH ROM

UPL.xml.pkg

tar -t -f update_files.tar

ls UPL.xml.unpkg/
-rw-r--r-- 1 0 0 2.8K Jun 27 13:40 content
-rw-r--r-- 1 0 0   64 Jun 27 13:40 info0
-rw-r--r-- 1 0 0   64 Jun 27 13:40 info1
...
-rwxr-xr-x 1 0 0  640 Jun 27 15:20 UPL.xml.pkg.spkg_hdr.1

UPL.xml.unpkg/content:                XML document text
UPL.xml.unpkg/info0:                  data
UPL.xml.unpkg/info1:                  data
UPL.xml.unpkg/UPL.xml.pkg.spkg_hdr.1: data

scetool -v -i update_files.untar/UPL.xml.pkg [*] Using keyset [pkg 0x0000 03.55] [*] Header decrypted. [*] Data decrypted. [*] SCE Header: Magic 0x53434500 [OK] Version 0x00000002 Key Revision 0x0000 Header Type [PKG] Metadata Offset 0x00000000 Header Length 0x0000000000000280 Data Length 0x0000000000000B9D // 2973 bytes, content + info0 + info1 [*] Metadata Info: Key 87 EE 46 44 60 DA DA EA 49 74 58 F9 02 1D 6D 11 IV F4 9F 43 D8 D0 6A F0 FC 33 AF 5E 6E CF 2F 30 1E [*] Metadata Header: Signature Input Length 0x0000000000000250 unknown_0 0x00000001 Section Count 0x00000003 Key Count 0x00000014 Optional Header Size 0x00000000 unknown_1 0x00000000 unknown_2 0x00000000 [*] Metadata Section Headers: Idx Offset Size Type Index Hashed SHA1 Encrypted Key IV Compressed 000 00000280 00000040 01 01 [YES] 00 [NO ] -- -- [NO ] 001 000002C0 00000040 02 02 [YES] 06 [NO ] -- -- [NO ] 002 00000300 0000016B 03 03 [YES] 0C [YES] 12 13 [YES] [*] SCE File Keys: n 14

hexdump -C UPL.xml.pkg

00000280  00 00 00 03 00 00 00 04  00 00 00 00 00 00 00 0a  |................|
00000290  20 14 06 19 01 15 45 00  00 00 00 00 00 00 0b 1d  | .....E.........|
000002a0  00 00 00 00 00 00 01 6b  00 00 00 00 00 00 00 00  |.......k........|
000002b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

000002c0  00 00 00 00 00 00 00 03  00 00 00 00 00 00 00 40  |...............@|
000002d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 0b 1d  |................|
000002e0  00 00 00 00 00 00 00 01  00 00 00 00 00 00 00 01  |................|
000002f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

We got info0 + info1, 64bytes each
content is 2845bytes, total is data lenght 2973
info0, info1 are not encrypted

UPL.xml.pkg is decrypted using pkg keyset [0x0000 03.55], UPL.xml.pkg.spkg_hdr.1 is decrypted using spkg keyset for > 3.55: only Metadata Info and Signature changes.

I think that on 3.55 all .spkg_hdr.1 files are not involved in updating, on 3.56 all of them are used as in overlayfs, so all .spkg_hdr.1 files are readed as metadata headers, decrypted by newer spkg keyset to get the same SCE keys to get rest of data decrypted.

.spkg_hdr.1

strip metadata info

# dd if=CORE_OS_PACKAGE.pkg.spkg_hdr.1 of=metainfo.crypt skip=32 count=64 bs=1 
64+0 records in
64+0 records out
64 bytes (64 B) copied, 0.0021009 s, 30.5 kB/s
# hexdump -C metai.cry 
00000000  d7 f9 82 9e 75 0a 3f 20  8b 6f e7 41 b1 bb 52 15  |....u.? .o.A..R.|
00000010  e1 8f d2 86 43 b5 4f 56  4c 42 a0 10 e1 1a 25 38  |....C.OVLB....%8|
00000020  9c 28 c7 fd 38 31 24 3b  1b 2b 9f 3f dc 72 4f c4  |.(..81$;.+.?.rO.|
00000030  95 34 b8 0a af 25 a1 05  b6 8f ce 2c 88 e9 2b 7b  |.4...%.....,..+{|

verify metadata info decryption with standard tool

# openssl enc -d -aes-256-cbc -in metai.crypt -K erk -iv riv | hexdump -C
00000000  7c f2 9a 4b 96 de 5f 75  a1 32 87 c0 42 ec 8f cf  ||..K.._u.2..B...|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  6f 85 6a 60 2a 8d b4 3f  2a 81 1b 1a 9c a3 02 f6  |o.j`*..?*.......|
00000030

strip rest of crypted metadata

# dd if=CORE_OS_PACKAGE.pkg.spkg_hdr.1 of=metarest.crypt skip=96 bs=1
544+0 records in
544+0 records out
544 bytes (544 B) copied, 0.00626423 s, 86.8 kB/s

decrypt rest of metadata

get signature

compute digest of the whole metadata

verify digest / signature