raspbian wheezy 20140726 available

I updated my Raspbian Wheezy minimal image; raspbian_wheezy_20130926.img.7z is available here:
http://www.linuxsystems.it/raspbian-wheezy-armhf-raspberry-pi-minimal-image/

- New kernel.
- New firmware.
- Works with the Raspberry Pi Model B+.

raspbian wheezy 20140701 available

I updated my Raspbian Wheezy minimal image; raspbian_wheezy_20130923.img.7z is available here:
http://www.linuxsystems.it/raspbian-wheezy-armhf-raspberry-pi-minimal-image/

- New kernel.
- New firmware.
- rpi-update
- wireless-tools
- wpa_supplicant

radeonsi vs Catalyst 3DMark wine benchmarks

If a big team like CD Projekt RED thinks that using a wrapper layer like eON by Virtual Programming is a suitable solution for a AAA game port like The Witcher 2, who am I to ditch wine (which performs even better than eON)? I might even speak about benchmarking a native Linux 3DMark version according to CD Projekt RED standards…

Anyway, I wanted to benchmark something different, and in particular to test wine’s D3D command stream patches, so I decided to try the famous 3DMarks: 3DMark 2001, 3DMark 2005 and 3DMark 2006. I did not test 3DMark 2003 because of this regression.

I’m using default 3DMark settings and my video card is an AMD Radeon HD 7950. For radeonsi I’m using kernel 3.15-rc5 + PTE patches (VRAM page table entry compression) + hyperz (R600_DEBUG=hyperz). I’m also using libdrm git, xf86-video-ati git, llvm 3.5 git with a rebased Tom Stellard’s si-spill-fixes-v4 branch, mesa git (OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.3.0-devel (git-57730d6)) and Keith Packard’s xorg-server glamor-server branch (1.16.0 RC 2). Catalyst version is 14.4 (kernel 3.14.3, xorg-server 1.15.1 because of compatibility issues). Wine version is 1.7.18 + Stefan Dösinger’s D3D command stream patches.

[Image: 3DMark 2001 results]

[Image: 3DMark 2005 results]

[Image: 3DMark 2006 results]

What about you? Please share your 3DMark results and tell me which card/driver/wine version you are using.

A new linuxsystems overlay: wine-d3dstream

This overlay allows you to build the latest git version of wine with the D3D command stream patches, which create a separate command stream / worker thread for WineD3D. This work moves OpenGL calls into a separate thread, which can improve performance by 50 to 100% and in some cases makes games run faster under Wine than on Windows.

This patchset is made by Stefan Dösinger (here you can find his work).

You can find my app-emulation/wine-9999 ebuild with d3d stream patches in the new wine-d3dstream overlay: http://www.linuxsystems.it/overlay/

Please make sure you have HKCU/Software/Wine/Direct3D/CSMT = “enabled” in the registry. To do so, open a shell, run regedit, browse to HKCU/Software/Wine, right-click and select new->key “Direct3D”, then right-click and select new->string “CSMT”, double-click it and set it to “enabled”.
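If you prefer not to click through regedit, the same key can be imported from a registry file. This is only a minimal sketch: the function name csmt_reg and the file name csmt.reg are my own illustrative choices, and it assumes the plain REGEDIT4 text format.

```shell
# Print a registry file that sets HKCU\Software\Wine\Direct3D\CSMT to "enabled".
# csmt_reg is a hypothetical helper name, not part of wine or the overlay.
csmt_reg() {
    printf '%s\n' \
        'REGEDIT4' \
        '' \
        '[HKEY_CURRENT_USER\Software\Wine\Direct3D]' \
        '"CSMT"="enabled"'
}
```

Something like `csmt_reg > csmt.reg && wine regedit csmt.reg` should then import it into the current wine prefix.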

Here is a quick benchmark of 3DMark 2001 with the radeonsi driver with and without d3d command stream patches:

[Image: wine_comparison]

A new linuxsystems overlay: radeonsi

In my previous post someone asked about my radeonsi ebuilds, so I decided to create a new overlay.
This overlay allows you to use Keith Packard’s xorg-server glamor-server branch and Tom Stellard’s si-spill-fixes-v4 llvm branch (rebased from time to time by me). You will have to use my x11-drivers/xf86-video-ati and x11-drivers/xf86-video-intel ebuilds because otherwise they will try to install x11-libs/glamor which conflicts with >=x11-base/xorg-server-1.15.901.

Here you can find the new radeonsi overlay: http://www.linuxsystems.it/overlay/

Radeonsi is faster than Catalyst with Steam games

As I said in my previous post, radeonsi is becoming faster than Catalyst in several scenarios. Some people on Phoronix didn’t think it was actually possible and blamed “old games”. So I decided to benchmark Steam games, in particular Half-Life 2: Lost Coast, Team Fortress 2 and Portal. Unfortunately these are the only Steam games with a phoronix-test-suite profile available and they all use the Source engine. Hopefully the Phoronix Test Suite’s author Michael Larabel will provide more profiles in the future, covering games with different engines.

My video card is an AMD Radeon HD 7950. For radeonsi I’m using kernel 3.15-rc4 + PTE patches (VRAM page table entry compression) + hyperz (R600_DEBUG=hyperz). I’m also using libdrm git, xf86-video-ati git, llvm 3.5 git, mesa git (OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.3.0-devel (git-cf93f86)) and Keith Packard’s xorg-server glamor-server branch (1.16.0 RC 2). Catalyst version is 14.4 (kernel 3.14.3, xorg-server 1.15.1).


Radeonsi is 21% faster than Catalyst with Half-Life 2: Lost Coast.


Radeonsi is 3% faster than Catalyst with Team Fortress 2.


Radeonsi runs at 81% of Catalyst with Portal.

Here is the link on openbenchmarking: http://openbenchmarking.org/result/1405092-SO-1405097SO80
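The results above are phrased in two ways ("21% faster" and "runs at 81%"), and both are just the ratio of two average frame rates. Here is a small sketch of that arithmetic; rel_perf is a hypothetical helper, and the FPS values in the usage note are made-up inputs chosen only to reproduce the wording, not measured numbers.

```shell
# Relative performance of driver A against driver B from two average FPS values.
# rel_perf is a hypothetical helper name used only for illustration.
rel_perf() {
    # $1 = average FPS of driver A, $2 = average FPS of driver B
    awk -v a="$1" -v b="$2" 'BEGIN {
        r = 100 * a / b
        if (r >= 100) printf "%.0f%% faster\n", r - 100
        else          printf "runs at %.0f%%\n", r
    }'
}
```

For example, `rel_perf 121 100` prints "21% faster" and `rel_perf 81 100` prints "runs at 81%".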

Radeonsi is awesome, beats Catalyst!

Edit: see also Radeonsi is faster than Catalyst with Steam games.

I did some benchmarks of my AMD Radeon HD 7950 using kernel 3.15-rc4 + PTE patches (VRAM page table entry compression) + hyperz (R600_DEBUG=hyperz). I’m also using libdrm git, xf86-video-ati git, llvm 3.5 git, mesa git (OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.3.0-devel (git-cf93f86)) and Keith Packard’s xorg-server glamor-server branch (1.16.0 RC 2). Catalyst version is 14.4 (kernel 3.14.3, xorg-server 1.15.1).


Radeonsi is 14% faster than Catalyst with Xonotic.


Radeonsi is 4% faster than Catalyst with OpenArena.


Radeonsi runs at 76% of Catalyst with Unigine Heaven.


Radeonsi runs at 62% of Catalyst with Unigine Valley.


Something’s wrong with radeonsi and Unvanquished: it’s barely 30% of Catalyst.

Here is the link on openbenchmarking: http://openbenchmarking.org/result/1405084-SO-1405083SO83

I also compared 2D acceleration using Keith Packard’s xorg-server glamor-server branch between radeonsi, Catalyst and Intel’s HD 4000 SNA: http://openbenchmarking.org/result/1405080-SO-1405080SO26

Intel’s HD 4000 SNA 2D acceleration is much faster in almost every test compared to both Catalyst and radeonsi, but glamor is much faster using Keith Packard’s xorg-server glamor-server branch compared to the old standalone glamor lib.

drm-next-3.14.0-20140407.patch.xz available

New patch against 3.14 which backports drm from 3.15-rc1-pre:

http://www.linuxsystems.it/linux-drm-graphic-stack-backports/

A new screenshots comparator and loads of new x264 x265 vp8 vp9 tests

I bought the Blu-ray of “The Hobbit: An Unexpected Journey” (which is considered to be a reference for video quality) and I decided to do more codec tests. If you want to know something about my methodology please read my previous article. I ripped a scene from the Blu-ray and cropped it to 1920×800, then I compressed it using x264 LOSSLESS compression. Except for the crop it is bit-identical to the Blu-ray.

I will not go into the details because I already explained everything. I also coded a new screenshot comparator which eases the job: use it to find the details about sources, codecs, etc., like the exact command line with the parameters I used. It also allows you to download the sample videos. It took quite a while to code it, it uses XML to store data and it quickly grew to ~200 lines of code, so please let me know if something’s wrong. I know, the graphics suck but I hate web programming :)

Just some considerations… First of all: where is VP9? There isn’t any. Why? Because after a whole week it still hadn’t finished encoding. Seriously, using a wooden abacus would be faster! If someone wants to waste their CPU cycles to encode it, please send me the resulting files. You can download the source video using the screenshot comparator and encode it with this command:

ffmpeg -v 0 -i hobbit_lossless.mkv -threads 4 -vsync 0 -an -pix_fmt yuv420p -f yuv4mpegpipe - | ./vpxenc - -o hobbit.vp9 --codec=vp9 --cpu-used=0 --best --target-bitrate=756 -p 2 --pass=1 --fpf=vp9.stat
ffmpeg -v 0 -i hobbit_lossless.mkv -threads 4 -vsync 0 -an -pix_fmt yuv420p -f yuv4mpegpipe - | ./vpxenc - -o hobbit.vp9 --codec=vp9 --cpu-used=0 --best --target-bitrate=756 -p 2 --pass=2 --fpf=vp9.stat

Also, someone ranted about the lack of PSNR/SSIM graphs. Who needs SSIM when you can use your eyes? Anyway, here is the shiny graph and here is how I created it:

./qpsnr --ignore-fps -a avg_ssim -o blocksize=16:fpa=24 -r hobbit_lossless.mkv hobbit_crf28_756kbps.h265 hobbit_756kbps.h264 hobbit_756kbps.vp8 > ssim-fpa24.csv

Then you can use LibreOffice Calc to make the graph.
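If you only want one number per encoder instead of a graph, the CSV can also be averaged from the shell. This is a rough sketch under an assumption I have not checked against qpsnr itself: that the file has one header row with the file names, the sample number in the first column and one SSIM column per encoded file. csv_column_means is a hypothetical helper name.

```shell
# Print the mean of every column except the first, labelled with its header.
# Assumes (unverified) a comma-separated file with a single header row.
csv_column_means() {
    awk -F, '
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; cols = NF; next }
        { for (i = 2; i <= cols; i++) if ($i != "") { sum[i] += $i; n[i]++ } }
        END { for (i = 2; i <= cols; i++) if (n[i]) printf "%s %.4f\n", name[i], sum[i] / n[i] }
    ' "$1"
}
```

It would be called as `csv_column_means ssim-fpa24.csv`; empty fields (for example after a trailing comma) are skipped.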

Some notes about the graph: yeah, this is a completely different scenario from the previous test and x265 wins hands down. But beware, this is just with a VERY low bitrate. I encoded the source with x265 using CRF 28, which resulted in a 756 kbps bitrate, then I did two-pass encodes with x264 and vp8 at a 756 kbps target bitrate: this is the resulting SSIM graph. If you raise the bitrate, x265 no longer has such a big advantage over x264, which is a pity. Link. Source vs x265.

I did another encoding starting with x265 CRF 23 which resulted in a 1733 kbps bitrate. Link. Source vs x265.

Finally I did two more tests with x265 16 bit variables at CRF 28 and 23: 943 kbps and 1186 kbps. I used x264 hi10p to match the file size. I have since discovered that “16bpp” doesn’t mean high bit depth in x265; in any case it’s bugged and shouldn’t be used right now.

Here is the SCREENSHOT COMPARATOR. Press F11 to switch your browser to the full screen mode.

Who washed my video? x264 vs x265 ‘placebo’ comparison

I wanted to know something about the actual x265 development state and since I didn’t find anything up to date I decided to do my own tests.
First of all I compiled a recent snapshot of x265:

./x265 --version
x265 [info]: HEVC encoder version 0.7+259-5e2043f89aa1
x265 [info]: build info [Linux][GCC 4.8.2][64 bit] 8bpp
x265 [info]: using cpu capabilities: MMX2 SSE SSE2Fast SSSE3 SSE4.2 AVX

x264 --version
x264 0.142.2389 956c8d8
built on Feb 21 2014, gcc: 4.8.2
configuration: --bit-depth=8 --chroma-format=all
x264 license: GPL version 2 or later

This is my 112s / 216MB test video:
https://mega.co.nz/#!eQhSjJQR!EEe8-taN5IspIu-RW0WQzmvKzc5fkCn282kS5ugZ_as

Let’s encode it using x265 first:

ffmpeg -v 0 -i ~/PlanetEarthBirds.mkv -threads 4 -vsync 0 -an -pix_fmt yuv420p -f yuv4mpegpipe - | ./x265 - --y4m --preset placebo --crf 28 -o prova.h265

As you can see, I used ffmpeg to decode the video and then piped the raw stream to the x265 encoder. I did a quality-based encode with the default CRF (28), which gives low-quality but small output files (considering my source isn’t lossless, this is exactly what I want in order to easily spot differences with x264).

Then I generated a file of the very same size using two pass encoding with x264:

ffmpeg -v 0 -i ~/PlanetEarthBirds.mkv -threads 4 -vsync 0 -an -pix_fmt yuv420p -f yuv4mpegpipe - | x264 - --demuxer y4m --preset placebo -o prova.h264 --pass 1 --bitrate 3310
ffmpeg -v 0 -i ~/PlanetEarthBirds.mkv -threads 4 -vsync 0 -an -pix_fmt yuv420p -f yuv4mpegpipe - | x264 - --demuxer y4m --preset placebo -o prova.h264 --pass 2 --bitrate 3310
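One way to pick a target bitrate that matches an existing file is to divide its size by the clip length. The helper below is only a sketch of that arithmetic (avg_kbps is a hypothetical name) and not necessarily how the 3310 figure above was obtained.

```shell
# Average bitrate in kbit/s (1 kbit = 1000 bits) for a given size and duration.
# avg_kbps is a hypothetical helper name used only for illustration.
avg_kbps() {
    # $1 = file size in bytes, $2 = duration in seconds
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.0f\n", bytes * 8 / secs / 1000 }'
}
```

With the x265 file size of 46238768 bytes and the rounded 112 s duration, `avg_kbps 46238768 112` prints 3303, which is close to the 3310 used here; the small gap is presumably down to the rounded duration.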

Here is the result:

-rw-r--r-- 1 niko niko 46094455 22 feb 04.19 prova.h264
-rw-r--r-- 1 niko niko 46238768 21 feb 21.26 prova.h265

As you can see, prova.h264 is 46094455 bytes (~44 MiB) while prova.h265 is 46238768 bytes (~44 MiB). The video encoded with x264 is just 0.14 MiB smaller, which is a 0.3% difference.
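The size difference can be double-checked with a few lines of awk. This is just a sketch of the arithmetic; size_diff is a hypothetical helper name.

```shell
# Difference between two file sizes in bytes, in MiB, and as a percentage
# of the larger file. size_diff is a hypothetical helper name.
size_diff() {
    # $1 = smaller size in bytes, $2 = larger size in bytes
    awk -v a="$1" -v b="$2" 'BEGIN {
        d = b - a
        printf "%d bytes, %.2f MiB, %.2f%%\n", d, d / 1048576, 100 * d / b
    }'
}
```

`size_diff 46094455 46238768` prints "144313 bytes, 0.14 MiB, 0.31%".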

By using the placebo presets I picked the slowest settings, which are supposed to give the best quality for both encoders.
Since I had to use two-pass encoding for x264 to make a meaningful comparison, there is no sense in comparing encoding speed, so I will not do it.

Here are the encoded files:
prova.h264
prova.h265

Anyway we can compare DECODING speeds :)
Let’s decode them as fast as we can using ffh264 (FFmpeg H.264) and ffhevc (FFmpeg HEVC / H.265):

mplayer -benchmark -nosound -lavdopts threads=16 prova.h264 -vo null 2> /dev/null

BENCHMARKs: VC: 5.640s VO: 0.003s A: 0.000s Sys: 0.333s = 5.976s
BENCHMARK%: VC: 94.3730% VO: 0.0482% A: 0.0000% Sys: 5.5788% = 100.0000%

mplayer -benchmark -nosound -lavdopts threads=16 prova.h265 -vo null 2> /dev/null

BENCHMARKs: VC: 26.936s VO: 0.003s A: 0.000s Sys: 0.360s = 27.300s
BENCHMARK%: VC: 98.6680% VO: 0.0123% A: 0.0000% Sys: 1.3197% = 100.0000%

As you can see, the video encoded with x264 took 5.976s to decode while the x265 one took 27.300s.
h265 takes about 4.6 times as long as h264 to decode. Not bad, really.
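To avoid copying the totals by hand, the last field of the BENCHMARKs line can be pulled out with awk. This is a minimal sketch that assumes the output format shown above; bench_total and ratio are hypothetical helper names.

```shell
# Read mplayer output on stdin and print the total time (in seconds)
# taken from the "BENCHMARKs:" line, without the trailing "s".
bench_total() {
    awk '/BENCHMARKs:/ { t = $NF; sub(/s$/, "", t); print t }'
}

# Ratio of two durations, rounded to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}
```

Piping each mplayer run into `bench_total` gives 5.976 and 27.300 for the runs above, and `ratio 27.300 5.976` prints 4.6.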

Let’s take some screenshots:

mplayer -lavdopts threads=16 -nosound -benchmark -fps 23.976 -vf framestep=275 -vo png:z=9 prova.h264; for i in [0-9]*.png; do mv "$i" "x264-test1-$i"; done
mplayer -lavdopts threads=16 -nosound -benchmark -fps 23.976 -vf framestep=275 -vo png:z=9 prova.h265; for i in [0-9]*.png; do mv "$i" "x265-test1-$i"; done
mplayer -lavdopts threads=16 -nosound -benchmark -fps 23.976 -vf framestep=275 -vo png:z=9 ~/PlanetEarthBirds.mkv; for i in [0-9]*.png; do mv "$i" "source-test1-$i"; done

Finally, let’s compare them:

x264 vs x265
http://files.linuxsystems.it/functions/comparison.php?test=test1&codec1=x264&codec2=x265&number=1
http://files.linuxsystems.it/functions/comparison.php?source=planetearthbirds&bitrate1=3310&codec1=x265-0.7-259-5e2043f89aa1-8bpp-placebo&bitrate2=3310&codec2=x264-0.142.2389-956c8d8-8bpp-placebo&n=10&current=1

source vs x265
http://files.linuxsystems.it/functions/comparison.php?test=test1&codec1=source&codec2=x265&number=1
http://files.linuxsystems.it/functions/comparison.php?source=planetearthbirds&bitrate1=source&codec1=x265-0.7-259-5e2043f89aa1-8bpp-placebo&bitrate2=3310&codec2=x265-0.7-259-5e2043f89aa1-8bpp-placebo&n=10&current=1

Press F11 to switch your browser to the full screen mode and click on the image to switch from the x265 screenshots to source or x264. I also put some previous/next buttons (there are 10 png screenshots in total).

I don’t want to be critical because x265 is still in the early stages of development, but as you can see it completely washed the video: details are gone. Sometimes x265 did a better job, but overall x264 definitely wins.
Keep up the good work devs ;)