The fail0verflow group revealed at 33c3 how they initially hacked the PS4 kernel. Below I speculate in some detail about how I think they did it. I recommend watching the presentation and reading the postscript. I am pretty sure this is still a promising hack, even though newer kernels patch the sflash IOMMU stack exposure. It might be better to hack the Aeolia alone, because this hack takes a lot of time to write and set up. I wrote this very late at night, so there are probably typos and errors. If other people take interest in this, I might buy some equipment.
See the Transaction layer part of the PCI Express page on Wikipedia.
Soldering to the board
On the SAA-001 motherboard, RX going to Aeolia is on side A and TX going to APU is on side B. They desoldered all the coupling caps on the PCIe x4 LVDS RX line between the Aeolia and APU. They also removed the coupling caps on the PCIe clock line. I don't know where the clock is located, but pins 31/32 or 35 on the Aeolia clock generator IDT6V41265 are probably the original source location. The DC offset in the LVDS signals might be negligible in this setup, but you might also want to make sure that there are 100 nF caps inline with the RX/TX signal pairs. I would solder some thin, high gauge solid core copper wire (28-30 gauge kynar/enamel) to these pads and hold it in place with hot glue.
Setting up the FPGA
They used an ECP3 Versa Devkit from Lattice, but the newer ECP5 Versa Devkit would also work. Some good Xilinx Virtex-7 boards are out there, but then again anyone can buy a very expensive devkit that has a PCIe card slot. I presume they chose the ECP3 Versa Devkit because it has a very nice example for PCIe DMA. They hooked up one of the TX pairs to the FPGA, cutting the link from x4 to x1. I am not sure if this specific ECP3 kit already has the 100 nF caps; simply look and you will see (maybe only on RX). They then hooked up the PCIe clock from the APU side to the FPGA.
Setting up the computer
They then hooked up the RX line to some generic x86 single board computer. I have no clue what they used here, but it could be something as simple as a small board with a mini-PCIe to PCIe adapter, with the wires soldered right to the adapter. The board they bought probably supported PCIe 2.0 and had a capable serial/UART port. They then hooked up wires for the PCIe clock from the Aeolia side to the x86 computer. I will discuss more important factors in the software section, but they were vague on this part.
If you are working with Xilinx, then have a field day. If you are working with Lattice, then you have a nice example but annoying licensing. Contact me if you need help with Lattice licensing, as I know it's stupid. On the generic x86 board, they have some custom (?) Linux setup running with some sort of shim that takes in PCIe TLPs and queues them in a FIFO. The data then goes out of the FIFO over serial to the FPGA, which transmits it to the APU. The shim seems to be a custom Linux driver written in C or something similar. I have no clue here, but there might be a generic driver that can just take in the raw data from the separate TLP types; try a tty configuration? The kernel side would then pass all this raw data to, maybe, a Python program that manages the FIFO and inserts new TLPs into it. This management program would then send the data over serial to the FPGA, which would handle sending the TLP. Set the baud rate to 115200, or higher if you feel lucky on the UART.
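To make the FIFO idea a bit more concrete, here is a minimal Python sketch of what the management program might look like. None of this comes from the talk: the `TlpFifo` class, the `frame`/`unframe` helpers, the `SYNC` byte, and the whole wire format (sync byte, 16-bit length, payload, 8-bit checksum) are my own assumptions, chosen only so the UART receiver on the FPGA has something simple to resynchronise on.

```python
import struct
from collections import deque

SYNC = 0xA5  # hypothetical start-of-frame marker, not from the talk


class TlpFifo:
    """Queue of raw TLPs waiting to be sent to the FPGA."""

    def __init__(self):
        self._queue = deque()

    def push(self, tlp):
        """Append a TLP captured from the Aeolia side (normal traffic)."""
        self._queue.append(bytes(tlp))

    def inject(self, tlp):
        """Put a crafted TLP at the head of the queue so it goes out next."""
        self._queue.appendleft(bytes(tlp))

    def pop(self):
        """Return the next TLP, or None when the queue is empty."""
        return self._queue.popleft() if self._queue else None

    def __len__(self):
        return len(self._queue)


def frame(tlp):
    """Wrap a TLP as: SYNC | length (big-endian u16) | payload | checksum."""
    checksum = sum(tlp) & 0xFF
    return struct.pack(">BH", SYNC, len(tlp)) + bytes(tlp) + bytes([checksum])


def unframe(buf):
    """Inverse of frame(); raises ValueError on a malformed frame."""
    if len(buf) < 4:
        raise ValueError("frame too short")
    sync, length = struct.unpack_from(">BH", buf, 0)
    if sync != SYNC:
        raise ValueError("bad sync byte")
    if len(buf) < 4 + length:
        raise ValueError("truncated frame")
    payload = bytes(buf[3:3 + length])
    if (sum(payload) & 0xFF) != buf[3 + length]:
        raise ValueError("bad checksum")
    return payload


def drain(fifo, write):
    """Send every queued TLP through write(); returns the number sent."""
    sent = 0
    while len(fifo):
        write(frame(fifo.pop()))
        sent += 1
    return sent
```

Here `write` is any callable that takes bytes; with pyserial it could be the `write` method of a `serial.Serial` object opened at the chosen baud rate. Keep in mind that, assuming 8N1 framing, 115200 baud is only about 11.5 kB/s, so the UART will be the bottleneck by a wide margin.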
The FPGA setup would look like this...
I would look at the ECP3 setup for how to work with Lattice IP. This is really just about linking together IP blocks with a little glue logic. Go grab some random HDL UART implementation and set it up at the same baud rate as the PC. If you are dealing with Xilinx, then start reading documentation, because you might need to build a solution completely from scratch; I have no clue if examples exist. After this is all done, I would look into different attack vectors by inserting or modifying transaction layer packets, specifically in the wrapped protocols.
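For inserting or modifying TLPs you first need to be able to build and decode their headers. Below is a small Python sketch for the simplest case, a 32-bit memory request with a 3DW header, following the generic PCIe transaction layer layout (Fmt/Type/Length in the first DW, requester ID/tag/byte enables in the second, address in the third). The function names, the requester ID, and the address are made up for illustration. The sketch only covers the transaction layer bytes; I am assuming the PCIe IP core on the FPGA takes care of sequence numbers and the LCRC.

```python
import struct

FMT_3DW_NO_DATA = 0b000  # 3DW header, no payload (e.g. memory read)
FMT_3DW_DATA = 0b010     # 3DW header with payload (e.g. memory write)
TYPE_MEM = 0b00000       # memory request


def build_mem_write32(requester_id, tag, address, data):
    """Build a 32-bit memory write TLP (header + payload, no LCRC)."""
    if address & 0x3:
        raise ValueError("address must be DW aligned")
    if not data or len(data) % 4:
        raise ValueError("payload must be a non-empty multiple of 4 bytes")
    length_dw = len(data) // 4
    if length_dw > 1023:
        raise ValueError("payload too large for this sketch")
    dw0 = (FMT_3DW_DATA << 29) | (TYPE_MEM << 24) | length_dw
    # Last DW byte enables must be 0 for a single-DW request.
    last_be = 0xF if length_dw > 1 else 0x0
    first_be = 0xF
    dw1 = ((requester_id & 0xFFFF) << 16) | ((tag & 0xFF) << 8) \
        | (last_be << 4) | first_be
    dw2 = address & 0xFFFFFFFC
    return struct.pack(">III", dw0, dw1, dw2) + bytes(data)


def parse_mem_request32(tlp):
    """Decode the 3DW header of a 32-bit memory read or write TLP."""
    if len(tlp) < 12:
        raise ValueError("TLP shorter than a 3DW header")
    dw0, dw1, dw2 = struct.unpack_from(">III", tlp, 0)
    fmt = dw0 >> 29
    tlp_type = (dw0 >> 24) & 0x1F
    if tlp_type != TYPE_MEM or fmt not in (FMT_3DW_NO_DATA, FMT_3DW_DATA):
        raise ValueError("not a 3DW memory request")
    return {
        "is_write": fmt == FMT_3DW_DATA,
        "length_dw": dw0 & 0x3FF,
        "requester_id": dw1 >> 16,
        "tag": (dw1 >> 8) & 0xFF,
        "last_be": (dw1 >> 4) & 0xF,
        "first_be": dw1 & 0xF,
        "address": dw2 & 0xFFFFFFFC,
        "payload": bytes(tlp[12:]),
    }


# Arbitrary example values: requester 01:00.0, tag 0x12, address 0x1000.
example = build_mem_write32(0x0100, 0x12, 0x00001000, b"\xde\xad\xbe\xef")
```

Other TLP types (completions, 64-bit requests with a 4DW header, messages) use different field layouts and would need their own handling.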
Other avenues of exploitation over PCIe
It would be interesting to write a fuzzer with this hack to look into other drivers and protocols. The Aeolia controls a lot of stuff, like Ethernet, Bluetooth, Wi-Fi, HDMI, the optical drive, and other system components that have corresponding protocols and drivers in the APU's FreeBSD kernel that could be buggy, especially given the fail that Sony has brought us before. These protocols communicate with the Aeolia over PCIe. The USB stack also might hold some promise. Look at the kernel and see which drivers are tainted by Sony engineers; I would expect there to be exploitable bugs. But there could also be fail in common FreeBSD drivers! I have no clue what the IOMMU blocks and what it doesn't.
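As a starting point for such a fuzzer, here is a deliberately dumb Python sketch that flips random bits in the payload of a captured TLP while leaving the header alone. It assumes a 3DW (12-byte) header, and `mutate_payload` and `test_cases` are names I made up. A real fuzzer would want to understand the wrapped protocol instead of flipping bits blindly, but this is enough to exercise the plumbing.

```python
import random


def mutate_payload(tlp, header_len=12, flips=1, seed=None):
    """Return a copy of tlp with `flips` random bit flips in the payload.

    The first header_len bytes are never touched, so the packet still
    routes the same way. A given seed always produces the same mutation,
    which makes crashes reproducible.
    """
    rng = random.Random(seed)
    out = bytearray(tlp)
    if len(out) <= header_len:
        return bytes(out)  # nothing to mutate
    for _ in range(flips):
        index = rng.randrange(header_len, len(out))
        out[index] ^= 1 << rng.randrange(8)
    return bytes(out)


def test_cases(tlp, count, header_len=12, flips=1, base_seed=0):
    """Yield (seed, mutated_tlp) pairs so each case can be replayed later."""
    for seed in range(base_seed, base_seed + count):
        yield seed, mutate_payload(tlp, header_len, flips, seed)
```

Logging the seed next to whatever the console prints is the cheap way to get back to a packet that made the kernel fall over.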
PCI Device List
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pci0: found aeolia_pcie
iommu0: <AMD IOMMU> at device 0.2 on pci0
gc0: <Starsha> port 0x6000-0x60ff mem 0xe0000000-0xe3ffffff,0xe4000000-0xe47fffff,0xe4800000-0xe483ffff at device 1.0 on pci0
hdac0: <GPU/DEHT Audio Controller> mem 0xe4840000-0xe4843fff at device 1.1 on pci0
apcie0: <Aeolia PCI Express glue> mem 0xd0200000-0xd03fffff at device 20.4 on pci0
apcie0: Misc Peripherals base:0xfffffe00d0200000, start:0xd0200000 end:0xd03fffff size:0x200000
apcie0: Chip revision: 00000300
apcie0: Chip ID0: 41b30130
apcie0: Chip ID1: 52024d44
icc0: <Aeolia ICC> at device 20.4 on pci0
hpet_pci0: <Aeolia High Precision Event Timer> at device 20.4 on pci0
sflash0: <Aeolia Serial Flash I/F> at device 20.4 on pci0
rtc0: <Aeolia RTC> at device 20.4 on pci0
twsi0: <Aeolia TWSI> at device 20.4 on pci0
xhci0: <XHCI (ORBIS) USB 3.0 controller> mem 0xdc000000-0xdc1fffff at device 20.7 on pci0
xhci1: <XHCI (ORBIS) USB 3.0 controller> mem 0xdc200000-0xdc3fffff at device 20.7 on pci0
xhci2: <XHCI (ORBIS) USB 3.0 controller> mem 0xdc400000-0xdc5fffff at device 20.7 on pci0
aeolia_acpi0: <Aeolia acpi> port 0x1000-0x10ff at device 20.0 on pci0
mskc0: <Aeolia GBE controller> mem 0xc4000000-0xc4003fff at device 20.1 on pci0
ahci0: <Orbis Aeolia AHCI SATA controller> mem 0xc8000000-0xc8000fff,0xc8001000-0xc8001fff at device 20.2 on pci0
sdhci0: <Aeolia SDHCI> mem 0xcc000000-0xcc000fff at device 20.3 on pci0
dmac0: <Aeolia DMA controller> mem 0xd4000000-0xd4000fff at device 20.5 on pci0
dmac1: <Aeolia DMA controller> mem 0xd4001000-0xd4001fff at device 20.5 on pci0
spm0: <Aeolia Scratch Pad Memory> mem 0xd9000000-0xd903ffff at device 20.6 on pci0
sbram0: <Aeolia DDR3 memory> mem 0x80000000-0xbfffffff at device 20.6 on pci0
for full log, see Console Bootlog
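When staring at captured TLPs it helps to know which device a given address belongs to. The following Python sketch pulls the `mem` windows out of boot log lines in the format shown in the device list and looks up the owner of an address. `parse_mem_ranges`, `owner`, and the regular expression are my own helpers, and the sample only contains three lines copied from the list, so feed it the full log for real use.

```python
import re

# Matches e.g. "gc0: <Starsha> port ... mem 0xe0000000-0xe3ffffff,... at ..."
LINE_RE = re.compile(
    r"^(?P<name>\w+): <(?P<desc>[^>]*)>.*?\bmem (?P<ranges>[0-9a-fx,\-]+)"
)

SAMPLE = """\
gc0: <Starsha> port 0x6000-0x60ff mem 0xe0000000-0xe3ffffff,0xe4000000-0xe47fffff,0xe4800000-0xe483ffff at device 1.0 on pci0
aeolia_acpi0: <Aeolia acpi> port 0x1000-0x10ff at device 20.0 on pci0
mskc0: <Aeolia GBE controller> mem 0xc4000000-0xc4003fff at device 20.1 on pci0
sbram0: <Aeolia DDR3 memory> mem 0x80000000-0xbfffffff at device 20.6 on pci0
"""


def parse_mem_ranges(log):
    """Return a sorted list of (start, end, device, description) tuples."""
    windows = []
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # line has no memory window (port-only, info lines)
        for window in match.group("ranges").split(","):
            start, end = (int(value, 16) for value in window.split("-"))
            windows.append(
                (start, end, match.group("name"), match.group("desc"))
            )
    return sorted(windows)


def owner(windows, address):
    """Return the device name whose window contains address, else None."""
    for start, end, name, _desc in windows:
        if start <= address <= end:
            return name
    return None


windows = parse_mem_ranges(SAMPLE)
```

With the sample above, `owner(windows, 0xc4000100)` gives `"mskc0"`, the Aeolia GBE controller. Note these are the addresses as the APU sees them; I have no clue how they map through the IOMMU.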