Friday 24 July 2015

Effectiveness of ZFS usage for OpenVZ

This article is a detailed answer to a discussion about simfs vs ploop vs ZFS on the OpenVZ mailing list.

I have finished very detailed tests of ZFS for VPS storage, and I would like to share them with my dear community.

Source data (real clients, real data, different OS templates, different OS template versions, server running for about a year, full copy of a production server):
Size: 5,4T
Used: 3,5T
Avail: 1,7T
Usage: 69%
This data comes from a hardware node running OpenVZ with ploop.

Our internal toolkit fastvps_ploop_compacter shows the following details about this server:
Total wasted space due to ploop bugs: 205.1 Gb
Wasted space is the difference between the real data size and the ploop disk size, i.e. ploop overhead.
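For reference, you can roughly see this overhead yourself by comparing the ploop image size on the hardware node with the usage reported inside the container; a small sketch, assuming the standard OpenVZ layout and a hypothetical container ID 101:

# size of the ploop image on the hardware node
du -sh /vz/private/101/root.hdd/
# real data usage as reported inside the container
vzctl exec 101 df -h /

The difference between these two numbers is roughly the wasted space for that container.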

gzip compression on ZFS

We enabled ZFS gzip compression and moved all data to a newly created ZFS volume.
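If you want to reproduce this step, here is a minimal sketch; the pool name data and the devices are assumptions, not our exact configuration:

# create the pool and enable gzip compression before copying the data
zpool create data /dev/sdb /dev/sdc
zfs set compression=gzip data
# after the copy, check how well the data compressed
zfs get compressratio data

Note that compression only applies to blocks written after the property is set, which is why the data has to be copied onto the new volume.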

We got the following result:
NAME   SIZE   ALLOC  FREE   CAP  DEDUP  HEALTH  ALTROOT
data   8,16T  3,34T  4,82T  40%  1.00x  ONLINE  -

As you can see, we saved about 160 GB of data. The new size of this data on ZFS is 3340 GB, versus 3500 GB before.

lz4 compression on ZFS

However, the ZFS developers do not recommend gzip, so we tried lz4 compression, which is considered the best option.
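Switching to lz4 is just as simple; a sketch, again assuming a pool called data:

# lz4 is cheap on CPU and is the recommended compression for ZFS
zfs set compression=lz4 data
zfs get compression,compressratio data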

We copied the same data to a new ZFS storage with lz4 compression enabled and got the following result:
ALLOC: 2,72Tb
Wow! Amazing! We saved about 600 GB of data!

ZFS deduplication

As you know, ZFS has another killer feature: deduplication! And it is a great fit when we store so many containers built from a fixed set of distributions (Debian, Ubuntu, CentOS).

But please keep in mind that we disabled compression for this step!

We enabled deduplication on a new storage and copied all production data to it.
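A sketch of that step, again with the assumed pool name data:

# disable compression and enable deduplication for this test
zfs set compression=off data
zfs set dedup=on data
# after the copy, the dedup ratio shows up in zpool list
zpool list data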

When the data copy finished, we got these breathtaking results:

zpool list
NAME   SIZE   ALLOC  FREE   CAP  DEDUP  HEALTH  ALTROOT
data   8,16T  2,53T  5,62T  31%  1.33x  ONLINE  -

We saved 840 GB of data with deduplication! We are really close to saving 1 TB!

For very curious readers, here is some internal data about the ZFS dedup tables:
zdb -D data
DDT-sha256-zap-duplicate: 5100040 entries, size 581 on disk, 187 in core
DDT-sha256-zap-unique: 27983716 entries, size 518 on disk, 167 in core
dedup = 1.34, compress = 1.00, copies = 1.02, dedup * compress / copies = 1.31

ZFS compression and deduplication simultaneously

ZFS is amazingly flexible, and we can use compression and deduplication at the same time to save even more storage. This test is your homework :) Please share your results here!
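If you want to try it, a minimal sketch for the homework, with the same assumed pool name data:

# enable both features, then copy the data onto the pool again
zfs set compression=lz4 data
zfs set dedup=on data
# the dedup ratio is reported by zpool list, the compress ratio by zfs get
zpool list data
zfs get compressratio data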

Conclusion

That is why we have only a single file system that is ready for the 21st century, and ext4 and its derivative file systems should be avoided wherever possible.

So it would be great if you helped the ZFS on Linux community with bug reporting and development!
