Optane 900p 480G: zfs vs btrfs vs ext4 benchmarks

I recently bought a new server with an Optane 900p 480G and decided to give zfs a try instead of my usual btrfs (I will not use raid or other devices, just a single 900p).

I will use my Optane drive to host several KVM virtual machines.

I was fooled into thinking that the native sector size was 512B by the fact that we aren’t allowed to reformat the NVMe to 4K/8K:



This seems to be just a marketing move to sell the more expensive datacenter disks. In fact, some reviews suggest that 512B is emulated, and so is 4K on the datacenter disks:



Regular NVMe SSDs present an emulated 512B sector by slicing the larger (4K/8K/etc.) flash pages into smaller sectors. Optane, on the other hand, is byte (bit?) addressable by design, so all of its “sector sizes” are emulated by assembling a sector from each individual component. Since we can freely choose both the sector size and the record size, the question is: which combination to choose?

Since I plan to use compression, I basically need to rule out all combinations where the sector size equals the record size. Recordsize refers to the uncompressed size, so if you take an 8k record and compress it to 5.3k, you still have to store that data in an 8k sector and you save nothing. I will therefore consider only record sizes that are at least 4 times the sector size.
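To make the arithmetic concrete, here is a minimal sketch. It assumes that a compressed record is simply rounded up to a whole number of sectors, which is a simplification of what zfs really does. The function name and the 5427 byte figure (about 5.3k) are mine, chosen for illustration only.

```python
def allocated_bytes(compressed_size, sector_size):
    """Round a compressed record up to whole sectors (simplified model)."""
    sectors = (compressed_size + sector_size - 1) // sector_size
    return sectors * sector_size


record = 8192      # the 8k record from the text
compressed = 5427  # roughly 5.3k after compression (illustrative value)

for sector in (512, 4096, 8192):
    stored = allocated_bytes(compressed, sector)
    print(f"sector {sector:>4}: stored {stored} bytes, saved {record - stored} bytes")
```

Under this model the 8k record saves nothing with either 8k or 4k sectors, and only the 512B sector lets the compressed data occupy less space than the original record.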

I also decided to throw raw device, btrfs and ext4 values into the mix, just to make things more fun.

I used the fio 3.6 benchmark for the worst case: queue depth 1, single job. I also used direct=1 for the raw values, but I didn’t find a way to completely bypass caches for zfs.
Disk partitions were aligned at 1MiB by zfs itself.
For a 512 sector size you need to set ashift=9 for your whole zpool, ashift=12 for 4K and ashift=13 for 8K.
The record size, on the other hand, can be set on a per-dataset basis: recordsize=512, recordsize=4K or recordsize=8K.
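
The ashift values above are just the base 2 logarithm of the sector size. Here is a small sketch of that mapping. The helper name, the pool name, the device path and the dataset name are all hypothetical and used only for illustration. The printed commands are examples, not the exact ones I ran.

```python
def ashift_for(sector_size):
    """Return log2 of a power-of-two sector size, i.e. the zpool ashift value."""
    if sector_size <= 0 or sector_size & (sector_size - 1):
        raise ValueError("sector size must be a power of two")
    return sector_size.bit_length() - 1


for size in (512, 4096, 8192):
    print(f"{size} byte sectors -> ashift={ashift_for(size)}")

# Hypothetical names, for illustration only.
pool, device, dataset = "tank", "/dev/nvme0n1", "tank/vm"
print(f"zpool create -o ashift={ashift_for(512)} {pool} {device}")  # pool-wide setting
print(f"zfs create -o recordsize=4K {dataset}")                     # per-dataset setting
```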

Here are the results:

I suggest you download the calc file:

And the fio output, along with the commands I used:

The official zfs wiki suggests a 4K recordsize to store virtual machine images, so I will probably opt for a 512 sector size with a 4K recordsize for VMs and a 32K recordsize for everything else.

EDIT: the ‘none’ scheduler was used. It wasn’t clear from the previous graphs, but going from the default s 512 / r 128k (22.08 MiB/s) to s 512 / r 4k (32.57 MiB/s) leads to a 48% improvement in 4k randwrite, while s 512 / r 32k still retains a very good 45% increase in performance.
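
For anyone who wants to check the percentage, here is a minimal sketch of the calculation using the two throughput values quoted above (the function name is mine):

```python
def percent_gain(baseline, value):
    """Relative improvement of value over baseline, in percent."""
    return (value - baseline) / baseline * 100.0


default_mib_s = 22.08  # s 512 / r 128k, 4k randwrite
r4k_mib_s = 32.57      # s 512 / r 4k, 4k randwrite

print(f"{percent_gain(default_mib_s, r4k_mib_s):.1f}% faster than the default")
```

Rounded to a whole number this gives the 48% quoted above.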

IKEA: Swedish inefficiency at its best

IKEA's customer service.


This is a somewhat technical blog, so I’m sorry to bother you with a personal complaint like this one, but as an engineer I can’t help being stunned by such impractical implementations.

Some IKEA furnishings are sold as two (or more) separate pieces, which is fine. The funny part is how they handle this. I went to IKEA to help my mother load a piece of furniture into her car, so we asked the department manager which one might suit our needs and she gave us a piece of paper with the product code of the pack to pick up from the warehouse. Then came my first mistake: I didn’t pay enough attention when picking the pack from the shelf. There was a single product code and the boxes were all similar (identical except for a small label with a number), so it didn’t occur to me that I might have to pick up two different pieces from the very same shelf under the very same product code. The fault was mine, so shame on me. But let’s move on.

We headed towards the self-service checkout, and while I was going crazy trying to find my IKEA Family card (I don’t know whether they have already made one, but in 2016 an app would definitely help; I’m tired of carrying dozens of loyalty cards) my mother found hers and started checking out. The system showed a warning that the barcode was associated with a two-piece article, and in the best “point-and-click” Windows tradition my mother clicked “OK” without even noticing the message. Moral of the story: we took home a single piece but paid for the whole furnishing. We didn’t notice the missing parts until Monday, when I began to assemble it.

What left me astonished is how they handle multipart articles: they put almost identical boxes on the very same shelf, without tying them together, under the very same product code, but only the first one has a barcode. So if you pick only the first one and accidentally miss the warning at the self-service checkout, you’re doomed. This is a big problem when paired with their inventory mess, and I really don’t understand why they don’t simply print a different barcode on each piece and prevent the self-service checkout from completing the purchase until you have scanned all of them. Simple and easy, but if their outdated systems cannot manage this they could simply tie the two pieces together, which they didn’t.

Couple this with one of the worst customer services I have ever seen and the omelet is served. They didn’t seem surprised by what had happened. In fact, at first they told me “don’t worry, it happens very often”, but their inventory is such a mess that they couldn’t find the missing piece, so they told me I would have to pay for the whole furnishing again if I wanted it. I asked to talk to the customer service supervisor, to ask whether they were really afraid of me trying to steal half of the cheapest furnishing in IKEA, and it turned out that they cannot afford to run such a risk: when in doubt, they prefer to leave customers unhappy and make them buy the very same article twice. Talking with their customer service was a painful experience, but at least it taught me something: do not rely on such big companies when you need to buy something for which you may need assistance in the future. Just save a little more money and spare yourself future headaches by buying from someone who knows the meaning of “customer satisfaction”.


Radeonsi with si scheduler humiliates Catalyst in all tests

Following my last article I decided to test Axel Davy’s si scheduler and run the very same OpenGL4+ tests with both radeonsi+si scheduler and Catalyst.
The si scheduler is such a huge performance boost! Not only is it faster, but radeonsi is now faster than Catalyst in *all* tests, sometimes by a wide margin!
The Catalyst version is the latest and greatest 15.7, while the radeonsi stack is from git (including linux 4.2, xorg-server git and llvm 3.8 git). I also use modesetting instead of xf86-video-ati. The distro is gentoo.

Unfortunately neither Bioshock Infinite nor Dirt Showdown worked for me with Catalyst, which is quite ironic considering that they both work flawlessly with radeonsi (plus a small patch)!

But now let’s have a look at some simpler foss games. Ignore the other cards’ results, because they were run at 4K while my monitor is a simple full hd (1920×1080). Just compare HD 7950 radeonsi vs HD 7950 si scheduler vs HD 7950 catalyst. I asked Michael whether it was possible to filter out some results, but he has yet to answer me. I may update the graphs later.

Catalyst got completely humiliated! Radeonsi is so much faster that I will no longer consider Catalyst a reference for future performance improvements: we are aiming at Nvidia performance now!

I would like someone else with the very same card to reproduce my results. If you want to test the si scheduler, just apply this patch on top of llvm git master and uncomment:

//else //(uncomment to turn default for SI)
// return createSIMachineScheduler(C);

To run Bioshock Infinite with mesa you need to apply this patch and set this environment variable:

EDIT: as I stated on irc, the boost was largely due to a big regression in mesa that was reverted while I was doing the first test. Only a small part of the boost is attributable to the SI scheduler.

Radeonsi vs AMD Catalyst vs NVIDIA proprietary on GL4+ workloads





Counter Strike Global Offensive: radeonsi is on par with Catalyst

AMD Radeon HD 7950 using kernel 3.17-rc5-drm-next-3.18-wip + hyperz (R600_DEBUG=hyperz). I’m also using libdrm git, xf86-video-ati git, llvm 3.6 git, mesa git and xorg-server 1.17.0 RC 1. Catalyst version is 14.6 beta2 (kernel 3.14.3, xorg-server 1.15.2).

You can find all the info on my system here: http://openbenchmarking.org/result/1409232-DARK-140923107

wine: vanilla vs CSMT (d3dstream) vs Gallium nine vs Catalyst

How do you achieve the best possible performance with wine? I compared vanilla wine using the latest radeonsi open source drivers, wine with the CSMT (d3dstream) patchset and wine with the Gallium nine patchset. I also compared the results to the latest Catalyst drivers using wine patched with CSMT (d3dstream). Surprisingly, radeonsi + gallium nine beats Catalyst + CSMT (d3dstream) in 3DMark2005 and reaches 86% of Catalyst + CSMT (d3dstream) in Tropics!

Soon the open source radeonsi drivers with the gallium nine state tracker will be the best available solution to get the most out of wine: users aiming for the best performance should get rid of proprietary blobs in favor of open source drivers.

My card is an AMD HD7950 and I used the latest graphics stack from git, including drm-next-3.18. To use gallium nine you will need FOSS drivers with a patched mesa and a patched wine. You can’t use gallium nine with proprietary drivers.

Wine has to translate DirectX => OpenGL => Gallium, which adds complications and brings inefficiency. Thanks to the gallium nine state tracker we simply skip the OpenGL translation. More info here: http://ixit.cz/faster-wine-games-with-open-source-drivers-d3d9-aka-gallium-nine/

Both 3DMark2005 and Unigine Tropics run at 2560×1600, here are some screenshots:



You can find my wine vanilla, wine CSMT (d3dstream) and wine gallium nine ebuilds in my overlay.

A new linuxsystems overlay: wine-nine

This overlay allows you to build the latest git version of mesa and wine with the gallium nine patches. Wine has to translate DirectX => OpenGL => Gallium, which adds complications and brings inefficiency. Thanks to the gallium nine state tracker we simply skip the OpenGL translation. More info here: http://ixit.cz/faster-wine-games-with-open-source-drivers-d3d9-aka-gallium-nine/

This patchset is maintained by David Heidelberger (here you can find his work).

You can find my media-libs/mesa-9999 and app-emulation/wine ebuilds with gallium nine patches in the new wine-nine overlay: http://www.linuxsystems.it/overlay/

New ebuild: app-emulation/wine-1.7.24 CSMT (d3dstream)

You can find it in the wine-d3dstream overlay: http://www.linuxsystems.it/overlay/

A new linuxsystems overlay: wine

The latest wine version is currently 1.7.26, while the latest version available in the gentoo repositories is only 1.7.21.
This overlay allows you to build the latest version of wine with pulseaudio, pipelight (compholio) and gstreamer support.

You can find my app-emulation/wine-1.7.26 ebuild in the new wine overlay: http://www.linuxsystems.it/overlay/

raspbian wheezy 20140726 available

I updated my Raspbian Wheezy minimal image; raspbian_wheezy_20130926.img.7z is available here:

– New kernel.
– New firmware.
– Works with Raspberry Pi model B+