From 336b288615055395cd2ff197a2de413cd88cdf27 Mon Sep 17 00:00:00 2001 From: "Scot W. Stevenson" Date: Sat, 13 Apr 2019 15:56:36 +0200 Subject: [PATCH] First rewrite --- docs/index.md | 3 +- docs/overview.md | 47 ++++----- docs/zfs_configuration.md | 117 +++++++++++----------- docs/zfs_overview.md | 206 +++++++++++++++++++++----------------- 4 files changed, 189 insertions(+), 184 deletions(-) diff --git a/docs/index.md b/docs/index.md index 2d4550cc..c8f9310f 100644 --- a/docs/index.md +++ b/docs/index.md @@ -27,4 +27,5 @@ Head to [installation](installation.md) if you're ready to roll, or to you're done, check out the [post-installation](post_installation.md) steps. If this is all very confusing, there is also an [overview](overview.md) of the -project and what is required for complete beginners. +project and what is required for complete beginners. If you're only confused +about ZFS, we'll help you [get started](zfs_overview.md) as well. diff --git a/docs/overview.md b/docs/overview.md index e29ba00c..3332f0b7 100644 --- a/docs/overview.md +++ b/docs/overview.md @@ -48,20 +48,13 @@ technologies involved and be able to set up the basic stuff yourself. As a to-do list, before you can even install Ansible-NAS, you'll have to: 1. Choose, buy, configure, and test your own **hardware**. Note that ZFS loves - RAM - it will run [with 1 GB](https://wiki.freebsd.org/ZFSTuningGuide), but - it won't be happy. The ZFS on Linux (ZoL) people - [recommend](https://github.com/zfsonlinux/zfs/wiki/FAQ#hardware-requirements) - at least 8 GB for best performance, but the more, the better. As robust as - ZFS is, it assumes the data in memory is correct, so [very bad - things](http://research.cs.wisc.edu/adsl/Publications/zfs-corruption-fast10.pdf) - happen to your data if there is memory corruption. For this reason, it is - [strongly - recommended](https://github.com/zfsonlinux/zfs/wiki/FAQ#do-i-have-to-use-ecc-memory-for-zfs) - to use ECC RAM. 
ZFS also prefers to have the hard drives all to itself. If - you're paranoid (a good mindset when dealing with servers), you'll probably - want an uninterruptible power supply (UPS) of some sort as well and SMART - monitoring for your hard drives. See the [FreeNAS hardware - requirements](https://freenas.org/hardware-requirements/) as a guideline. + RAM, and it is [recommended](zfs_overview.md) you use ECC RAM. ZFS also + prefers to have the hard drives all to itself. If you're paranoid (a good + mindset when dealing with servers), you'll probably want an uninterruptible + power supply (UPS) of some sort as well and SMART monitoring for your hard + drives. See the [FreeNAS hardware + requirements](https://freenas.org/hardware-requirements/) as a guideline, but + remember you'll also be running Docker. 1. Install **Ubuntu Server**, preferably a Long Term Support (LTS) edition such as 18.04, and keep it updated. You'll probably want to perform other basic @@ -69,19 +62,13 @@ As a to-do list, before you can even install Ansible-NAS, you'll have to: [various guides](https://devanswers.co/ubuntu-18-04-initial-server-setup/) for this, but if you're just getting started, you'll probably need a book. -1. Install **ZFS** and set up storage. This includes creating data sets for - various parts of the system, some form of automatic snapshot handling, and - possibly automatic backups to another server or an external hard drive. - Currently on Linux, it is [something of a - hassle](https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS) to - use ZFS on the root file system. If you are completely new to ZFS, expect a - brutal learning curve. There is a slightly dated (2012) but extensive - [introduction to ZFS on - Linux](https://pthree.org/2012/04/17/install-zfs-on-debian-gnulinux/) by - Aaron Toponce to get you started, or you can watch [this - video](https://www.youtube.com/watch?v=MsY-BafQgj4) that introduces the - philosophy and big picture of ZFS. - +1. 
Install **ZFS** and set up storage. You can use a different file system and + volume manager, but Ansible-NAS historically tends towards ZFS. You'll have + to create datasets for various parts of the system, some form of automatic + snapshot handling, and possibly automatic backups to another server or an + external hard drive. If you are completely new to ZFS, expect a brutal + learning curve. A [brief introduction](zfs_overview.md) is included here, as + well as a [basic example](zfs_configuration.md) of a very simple ZFS setup. After that, you can continue with the actual [installation](installation.md) of Ansible-NAS. @@ -91,6 +78,6 @@ Ansible-NAS. The easiest way to take Ansible-NAS for a spin is in a virtual machine, for instance in [VirtualBox](https://www.virtualbox.org/). You'll want to create three virtual hard drives for testing: One of the actual NAS, and the two others -to create a mirrored ZFS pool. Note because of the RAM requirements of ZFS, -you might run into problems with a virtual machine, but this will let you -experiment with installing, configuring, and running a complete system. +to create a mirrored ZFS pool. A virtual machine will probably not be happy or +fast, but this will let you experiment with installing, configuring, and running +a complete system. diff --git a/docs/zfs_configuration.md b/docs/zfs_configuration.md index 487d3f7b..f0d51339 100644 --- a/docs/zfs_configuration.md +++ b/docs/zfs_configuration.md @@ -5,35 +5,34 @@ overview](zfs_overview.md) introduction first. ## Just so there is no misunderstanding Unlike other NAS variants, Ansible-NAS does not install, configure or manage the -disks or file systems for you. It doesn't care which file system you use -- ZFS, -Btrfs, XFS or EXT4, take your pick. It also provides no mechanism for external -backups, snapshots or disk monitoring. As Tony Stark said to Loki in _Avengers_: -It's all on you. +disks or file systems for you. 
It doesn't care which file system you use - ZFS, +Btrfs, XFS or EXT4, take your pick. Nor does it provide a mechanism for +external backups, snapshots or disk monitoring. As Tony Stark said to Loki in +_Avengers_: It's all on you. -However, Ansible-NAS has traditionally been used with with the powerful ZFS -filesystem ([OpenZFS](http://www.open-zfs.org/wiki/Main_Page), to be exact). -Since [ZFS on Linux](https://zfsonlinux.org/) is comparatively new, this text -provides a very basic example of setting up a simple storage configuration with -scrubs and snapshots. To paraphrase Nick Fury from _Winter Soldier_: We do -share. We're nice like that. +However, Ansible-NAS has traditionally been used with the powerful ZFS +filesystem. Since out-of-the-box support for [ZFS on +Linux](https://zfsonlinux.org/) with Ubuntu is comparatively new, this text +shows how to set up a simple storage configuration. To paraphrase Nick Fury from +_Winter Soldier_: We do share. We're nice like that. > Using ZFS for Docker containers is currently not covered by this document. See -> [the Docker ZFS -> documentation](https://docs.docker.com/storage/storagedriver/zfs-driver/) for -> details. +> [the official Docker ZFS +> documentation](https://docs.docker.com/storage/storagedriver/zfs-driver/) +> instead. ## The obligatory warning We take no responsibility for any bad thing that might happen if you follow this -guide. We strongly suggest you test these procedures in a virtual machine. +guide. We strongly suggest you test these procedures in a virtual machine first. Always, always, always backup your data. ## The basic setup -For this example, we're assuming two identical spinning rust hard drives for all +For this example, we're assuming two identical spinning rust hard drives for Ansible-NAS storage. These two drives will be **mirrored** to provide redundancy. The actual Ubuntu system will be on a different drive and is not our -concern here. +concern. 
> [Root on ZFS](https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS) > is currently still a hassle for Ubuntu. If that changes, this document might @@ -42,13 +41,12 @@ concern here. The Ubuntu kernel is already ready for ZFS. We only need the utility package which we install with `sudo apt install zfsutils`. +### Creating a pool -### Creating the pool - -We assume you don't mind totally destroying whatever data might be on your +We assume you don't mind totally destroying whatever data might be on your two storage drives, have used a tool such as `gparted` to remove any existing -partitions, and have installed a GPT partition table. To create our ZFS pool, we -will use a command of the form +partitions, and have installed a new GPT partition table on each drive. To +create our ZFS pool, we will use a command in this form: ``` sudo zpool create -o ashift= mirror @@ -61,35 +59,36 @@ The options from simple to complex are: common are `tank` and `dozer`. Whatever you use, it should be short. 1. ****: The Linux command `lsblk` will give you a quick overview of the - hard drives in the system. However, we don't want to pass a drive - specification in the format `/dev/sde` because this is not persistant. - Instead, [we - use](https://github.com/zfsonlinux/zfs/wiki/FAQ#selecting-dev-names-when-creating-a-pool) + hard drives in the system. However, we don't pass the drive specification in + the format `/dev/sde` because this is not persistent. Instead, + [use](https://github.com/zfsonlinux/zfs/wiki/FAQ#selecting-dev-names-when-creating-a-pool) the output of `ls /dev/disk/by-id/` to find the drives' IDs. 1. ****: This is required to pass the [sector size](https://github.com/zfsonlinux/zfs/wiki/FAQ#advanced-format-disks) of the drive to ZFS for optimal performance. You might have to do this by hand because some drives lie: Whereas modern drives have 4k sector sizes (or 8k in - case of many SSDs), they will report 512 bytes for backward compatibility. 
+ case of many SSDs), they will report 512 bytes because Windows XP [can't + handle 4k + sectors](https://support.microsoft.com/en-us/help/2510009/microsoft-support-policy-for-4k-sector-hard-drives-in-windows). ZFS tries to [catch the liars](https://github.com/zfsonlinux/zfs/blob/master/cmd/zpool/zpool_vdev.c) - and use the correct value. However, that sometimes fails, and you have to add + and use the correct value. However, this sometimes fails, and you have to add it by hand. The `ashift` value is a power of two, so we have **9** for 512 bytes, **12** for 4k, and **13** for 8k. You can create a pool without this parameter and then use `zdb -C | grep ashift` to see what ZFS generated - automatically. If it isn't what you think, you can destroy the pool (see - below) and add it manually when creating it again. + automatically. If it isn't what you think, destroy the pool again and add it + manually. -In our pretend case, we use 3 TB WD Red drives. Listing all drives by ID gives -us something like this, but with real serial numbers: +In our pretend case, we use two 3 TB WD Red drives. Listing all drives by ID +gives us something like this, but with real serial numbers: ``` ata-WDC_WD30EFRX-68EUZN0_WD-WCCFAKESN01 ata-WDC_WD30EFRX-68EUZN0_WD-WCCFAKESN02 ``` -The actual command to create the pool would be: +WD Reds have a 4k sector size. The actual command to create the pool would then be: ``` sudo zpool create -o ashift=12 tank mirror ata-WDC_WD30EFRX-68EUZN0_WD-WCCFAKESN01 ata-WDC_WD30EFRX-68EUZN0_WD-WCCFAKESN02 @@ -97,15 +96,15 @@ The actual command to create the pool would be: Our new pool is named `tank` and is mirrored. To see information about it, use `zpool status tank` (no `sudo` necessary). If you screwed up (usually with -`ashift`), use `sudo zpool destroy tank` and start over _now_, before it's too +`ashift`), use `sudo zpool destroy tank` and start over _now_ before it's too late. 
### Pool default parameters Setting pool-wide default parameters makes life easier when we create our -datasets. To see them all, you can use the command `zfs get all tank`. Most are -perfectly sensible. Some you'll [want to -change](https://jrs-s.net/2018/08/17/zfs-tuning-cheat-sheet/) are: +filesystems. To see them all, you can use the command `zfs get all tank`. Most +are perfectly sensible, but some you'll [want to +change](https://jrs-s.net/2018/08/17/zfs-tuning-cheat-sheet/): ``` sudo zfs set atime=off tank @@ -113,30 +112,29 @@ change](https://jrs-s.net/2018/08/17/zfs-tuning-cheat-sheet/) are: sudo zfs set autoexpand=on tank ``` -The `atime` parameter means that your system updates an attribute of a file -every time the file is accessed, which uses a lot of resources. Usually, you -don't care. Compression is a no-brainer on modern CPUs and should be on by -default (we will discuss exceptions for compressed media files later). -`autoexpand` lets the pool grow when you add larger hard drives. +The `atime` parameter means that your system updates a time stamp every time a +file is accessed, which uses a lot of resources. Usually, you don't care. +Compression is a no-brainer on modern CPUs and should be on by default (we will +discuss exceptions for compressed media files later). The `autoexpand` parameter lets the +pool grow when you add larger hard drives. - -## Creating the filesystems +## Creating filesystems To actually store the data, we need filesystems (also known as "datasets"). For -our very simple default Ansible-NAS setup, we will create two examples: One -filesystem for movies (`movies_root` in `all.yml`) and one for downloads +our very simple default Ansible-NAS setup, we will create two: One filesystem +for movies (`movies_root` in `all.yml`) and one for downloads (`downloads_root`). 
### Movies (and other large, pre-compressed files) -We first create the basic file system for movies: +We first create the basic filesystem: ``` sudo zfs create tank/movies ``` -Movie files are usually rather large, already in a compressed format, and the -files stored there shouldn't be executed for security reasons. We change the +Movie files are usually rather large and already in a compressed format, and for +security reasons, the files stored there shouldn't be executable. We change the properties of the filesystem accordingly: ``` @@ -147,11 +145,11 @@ properties of the filesystem accordingly: The **recordsize** here is set to the currently largest possible value [to increase performance](https://jrs-s.net/2019/04/03/on-zfs-recordsize/) and save -storage. Recall that we used `ashift` during the creation of the pool to match +storage. Recall that we used `ashift` during the creation of the pool to match the ZFS block size with the drives' sector size. Records are created out of these blocks. Having larger records reduces the amount of metadata that is -required, and various aspects of ZFS such as caching and checksums work on this -level. +required, because various parts of ZFS such as caching and checksums work on +this level. **Compression** is unnecessary for movie files because they are usually in a compressed format anyway. ZFS is good about recognizing this, and so if you @@ -159,9 +157,10 @@ happen to leave compression on as the default for the pool, it won't make much of a difference. [By default](https://zfsonlinux.org/manpages/0.7.13/man8/zfs.8.html#lbAI), ZFS -stores pools directly under the root directory and do not have to be listed in -`/etc/fstab` to be mounted. This means that our filesystem will appear as -`/tank/movies`. We need to change the line in `all.yml` accordingly: +mounts pools directly under the root directory. Also, the filesystems don't have +to be listed in `/etc/fstab` to be mounted. 
This means that our filesystem will +appear as `/tank/movies` if you don't change anything. We need to change the +line in `all.yml` accordingly: ``` movies_root: "/tank/movies" @@ -172,8 +171,7 @@ property. Setting this to `none` prevents the file system from being automatically mounted at all. The filesystems for TV shows, music files and podcasts - all large, -pre-compressed files - should take the exact same parameters as the one for -movies. +pre-compressed files - should probably take the exact same parameters. ### Downloads @@ -184,7 +182,7 @@ For downloads, we can leave most of the default parameters the way they are. sudo zfs set exec=off tank/downloads ``` -The recordsize stays at the 128k default. In `all.yml`, the new line is +The recordsize stays at the 128 KB default. In `all.yml`, the new line is ``` downloads_root: "/tank/downloads" @@ -192,8 +190,9 @@ The recordsize stays at the 128k default. In `all.yml`, the new line is ### Other data -Depending on the use case, you might want to tune your filesystems. For example, -[Bit Torrent](http://open-zfs.org/wiki/Performance_tuning#Bit_Torrent), +Depending on the use case, you might want to create and tune more filesystems. +For example, [Bit +Torrent](http://open-zfs.org/wiki/Performance_tuning#Bit_Torrent), [MySQL](http://open-zfs.org/wiki/Performance_tuning#MySQL) and [Virtual Machines](http://open-zfs.org/wiki/Performance_tuning#Virtual_machines) all have known best configurations. diff --git a/docs/zfs_overview.md b/docs/zfs_overview.md index 5d6877f9..ca8053f4 100644 --- a/docs/zfs_overview.md +++ b/docs/zfs_overview.md @@ -1,148 +1,157 @@ This is a general overview of the ZFS file system for people who are new to it. -If you have some experience and are looking for specific information about how -to configure ZFS for Ansible-NAS, check out the [ZFS example -configuration](zfs_configuration.md) instead. 
+If you have some experience and are actually looking for specific information +about how to configure ZFS for Ansible-NAS, check out the [ZFS example +configuration](zfs_configuration.md). ## What is ZFS and why would I want it? [ZFS](https://en.wikipedia.org/wiki/ZFS) is an advanced filesystem and volume -manager originally created by Sun Microsystems from 2001 onwards. First released -in 2005 for OpenSolaris, Oracle later bought Sun and started developing ZFS as -closed source software. An open source fork took the name +manager originally created by Sun Microsystems starting in 2001. First released +in 2005 for OpenSolaris, Oracle later bought Sun and switched to developing ZFS +as closed source software. An open source fork took the name [OpenZFS](http://www.open-zfs.org/wiki/Main_Page), but is still called "ZFS" for short. It runs on Linux, FreeBSD, illumos and other platforms. ZFS aims to be the ["last word in filesystems"](https://blogs.oracle.com/bonwick/zfs:-the-last-word-in-filesystems) -- a system so future-proof that Michael W. Lucas and Allan Jude famously stated -that the _Enterprise's_ computer on _Star Trek_ probably runs it. The design -was based on [four principles](https://www.youtube.com/watch?v=MsY-BafQgj4): +- a technology so future-proof that Michael W. Lucas and Allan Jude famously +stated that the _Enterprise's_ computer on _Star Trek_ probably runs it. The +design was based on [four + principles](https://www.youtube.com/watch?v=MsY-BafQgj4): -1. "Pooled" storage to completely eliminate the notion of volumes. You can - add more storage the same way you just add a RAM stick to memory. +1. "Pooled" storage to eliminate the notion of volumes. You can add more storage + the same way you just add a RAM stick to memory. 1. Make sure data is always consistant on the disks. There is no `fsck` command - for ZFS. + for ZFS and none is needed. 1. Detect and correct data corruption ("bitrot"). 
ZFS is one of the few storage - systems that checksums everything and is "self-healing". + systems that checksums everything, including the data itself, and is + "self-healing". 1. Make it easy to use. Try to "end the suffering" for the admins involved in managing storage. -ZFS includes a host of other features such as snapshots, transparent -compression, and encryption. During the early years of ZFS, this all came with -hardware requirements which only enterprise users could afford. By now, however, -computers have become so powerful that ZFS can run (with some effort) on a -[Raspberry +ZFS includes a host of other features such as snapshots, transparent compression +and encryption. During the early years of ZFS, this all came with hardware +requirements only enterprise users could afford. By now, however, computers have +become so powerful that ZFS can run (with some effort) on a [Raspberry Pi](https://gist.github.com/mohakshah/b203d33a235307c40065bdc43e287547). FreeBSD and FreeNAS make extensive use of ZFS. What is holding ZFS back on Linux are -[licensing conflicts](https://en.wikipedia.org/wiki/OpenZFS#History) beyond the +[licensing issues](https://en.wikipedia.org/wiki/OpenZFS#History) beyond the scope of this document. -Ansible-NAS doesn't actually specify a filesystem - you can use EXT4, XFS, Btrfs -or pretty much anything you like. However, ZFS not only provides the benefits -listed above, but also lets you use your hard drives with different operating -systems. Some people now using Ansible-NAS originally came from FreeNAS, and -were able to `export` their ZFS pools there and `import` them to Ubuntu. On the -other hand, if you ever decide to switch back to FreeNAS or maybe try FreeBSD -instead of Linux, you should be able to do so using the same ZFS pools. +Ansible-NAS doesn't actually specify a filesystem - you can use EXT4, XFS or +Btrfs as well. 
However, ZFS not only provides the benefits listed above, but +also lets you use your hard drives with different operating systems. Some people +now using Ansible-NAS came from FreeNAS, and were able to `export` their ZFS +storage drives there and `import` them to Ubuntu. On the other hand, if you ever +decide to switch back to FreeNAS or maybe want to use FreeBSD instead of Linux, +you should be able to use the same ZFS pools. -## A small taste of ZFS +## An overview and some actual commands Storage in ZFS is organized in **pools**. Inside these pools, you create **filesystems** (also known as "datasets") which are like partitions on -steroids. For instance, you can keep each user's `/home/` files in a separate -filesystem. ZFS systems tend to use lots and lots of specialized filesystems. -They share the available storage in their pool. +steroids. For instance, you can keep each user's `/home` directory in a separate +filesystem. ZFS systems tend to use lots and lots of specialized filesystems +with tailored parameters such as record size and compression. All filesystems +share the available storage in their pool. Pools do not directly consist of hard disks or SSDs. Instead, drives are organized as **virtual devices** (VDEV). This is where the physical redundancy in ZFS is located. Drives in a VDEV can be "mirrored" or combined as "RaidZ", roughly the equivalent of RAID5. These VDEVs are then combined into a pool by the -administrator. - -To give you some idea of how this works, this is how to create a pool: +administrator. The command might look something like this: ``` sudo zpool create tank mirror /dev/sda /dev/sdb ``` This combines `/dev/sba` and `/dev/sdb` to a mirrored VDEV, and then defines a -new pool named `tank` consisting of this single VDEV. We can now create a -filesystem in this pool to hold our books: +new pool named `tank` consisting of this single VDEV. 
You can now create a +filesystem in this pool for, say, all of your _Mass Effect_ fan fiction: ``` - sudo zfs create tank/books + sudo zfs create tank/mefanfic ``` -You can then enable automatic and transparent compression on this filesystem -with `sudo zfs set compression=lz4 tank/books`. To take a **snapshot**, use +You can then enable automatic compression on this filesystem with `sudo zfs set +compression=lz4 tank/mefanfic`. To take a **snapshot**, use ``` - sudo zfs snapshot tank/books@monday + sudo zfs snapshot tank/mefanfic@21540411 ``` -Now, if evil people were somehow to encrypt your book files with ransomware on -Wednesday, you can laugh and revert to the old version: +Now, if evil people were somehow able to encrypt your precious fan fiction files +with ransomware, you can laugh maniacally and revert to the old version: ``` - sudo zfs rollback tank/books@monday + sudo zfs rollback tank/mefanfic@21540411 ``` -Of course, you did lose any work from Tuesday unless you created a snapshot then -as well. Usually, you'll have some form of **automatic snapshot -administration**. +Of course, you would lose any texts you might have added to the filesystem +between that snapshot and now. Usually, you'll have some form of **automatic +snapshot administration** configured. -To detect bitrot and other defects, ZFS periodically runs **scrubs**: The system -compares the available copies of each data record with their checksums. If there -is a mismatch, the data is repaired. +To detect bitrot and other data defects, ZFS periodically runs **scrubs**: The +system compares the available copies of each data record with their checksums. +If there is a mismatch, the data is repaired. ## Known issues -Constructing the pools out of virtual devices creates some problems. You can't -just detach a drive (or a VDEV) and have the pool reconfigure itself. 
To -reorganize the pool, you'd have to create a new, temporary pool out of separate -hard drives, move the data over, destroy and reconfigure the original pool, and -then move the data back. Increasing the size of a pool involves either adding -more VDEVs (_not_ just additional disks) or replacing each disk in a VDEV by a -larger version with the `autoexpand` parameter set. +> At time of writing (April 2019), ZFS on Linux does not yet offer native +> encryption, TRIM support, or device removal, which are all scheduled to be +> included in the upcoming [0.8 +> release](https://www.phoronix.com/scan.php?page=news_item&px=ZFS-On-Linux-0.8-RC1-Released). -> At time of writing (April 2019), ZFS on Linux does not offer native encryption, -> trim support, or device removal, which are all scheduled to be included in the -> [0.8 release](https://www.phoronix.com/scan.php?page=news_item&px=ZFS-On-Linux-0.8-RC1-Released) -> in the near future. +ZFS' original design for enterprise systems and redundancy requirements can make +some things more difficult. You can't just add individual drives to a pool and +tell the system to reconfigure automatically. Instead, you have to either add a +new VDEV, or replace each of the existing drives with one of higher capacity. In +an enterprise environment, of course, you would just _buy_ a bunch of new drives +and move the data from the old pool to the new pool. Shrinking a pool is even +harder - put simply, ZFS is not built for this. + +If you need to be able to add or remove single drives, ZFS might not be the +filesystem for you. ## Myths and misunderstandings -There are a bunch of false or simply outdated information about ZFS. To clear up -the worst of them: +Information on the internet about ZFS can be outdated, conflicting, or +simply wrong. 
Partially this is because it has been in use for almost 15 years +now and things change, partially it is the result of being used on different +operating systems which have minor differences under the hood. Also, Google +searches tend to return the Sun/Oracle documentation for their closed source ZFS +variant, which is increasingly diverging from the open source OpenZFS standard. +To clear up some of the most common misunderstandings: ### No, ZFS does not need at least 8 GB of RAM This myth is especially common [in FreeNAS circles](https://www.ixsystems.com/community/threads/does-freenas-really-need-8gb-of-ram.38685/). -Note that FreeBSD, the basis of FreeNAS, will run with as little [as 1 +Curiously, FreeBSD, the basis of FreeNAS, will run with [1 GB](https://wiki.freebsd.org/ZFSTuningGuide). The [ZFS on Linux FAQ](https://github.com/zfsonlinux/zfs/wiki/FAQ#hardware-requirements), which is -more relevant here, states under "suggested hardware": +more relevant for Ansible-NAS, states under "suggested hardware": > 8GB+ of memory for the best performance. It's perfectly possible to run with > 2GB or less (and people do), but you'll need more if using deduplication. -(Deduplication is only useful in [very special +(Deduplication is only useful in [special cases](http://open-zfs.org/wiki/Performance_tuning#Deduplication). If you are reading this, you probably don't need it.) -What everybody agrees on is that ZFS _loves_ RAM, and you should have as much of -it as you possibly can. So 8 GB is in fact a sensible lower limit you shouldn't -go below unless for testing. When in doubt, add more RAM, and even more, and -them some, until your motherboard's capacity is reached. +What everybody agrees on is that ZFS _loves_ RAM and works better the more it +has, so you should have as much of it as you possibly can. When in doubt, add +more RAM, and even more, and then some, until your motherboard's capacity is +reached. 
Experience shows that 8 GB of RAM is in fact a sensible minimal amount +for continuous use. But it's not a requirement. ### No, ECC RAM is not required for ZFS -This again is a case where a recommendation has been taken as a requirement. To +This is another case where a recommendation has been taken as a requirement. To quote the [ZFS on Linux FAQ](https://github.com/zfsonlinux/zfs/wiki/FAQ#do-i-have-to-use-ecc-memory-for-zfs) again: @@ -154,46 +163,55 @@ again: > filesystem) will write the damaged data to disk and be unable to automatically > detect the corruption. -It is _always_ better to have ECC RAM on all computers if you can afford it, and -ZFS is no exception. However, there is absolutely no requirement for ZFS to have -ECC RAM. +ECC corrects [single bit errors](https://en.wikipedia.org/wiki/ECC_memory) in +memory. It is _always_ better to have it on _any_ computer if you can afford it, +and ZFS is no exception. However, there is absolutely no requirement for ZFS to +have ECC RAM. ### No, the SLOG is not really a write cache -You'll hear the suggestion that you add a fast SSD or NVMe as a "SLOG" -(mistakingly also called "ZIL") drive for write caching. This isn't what would -happen, because ZFS already includes [a write -cache](https://linuxhint.com/configuring-zfs-cache/). It is located in RAM. -Since RAM is always faster than any drive, adding a disk as a write cache -doesn't make sense. +You'll read the suggestion to add a fast SSD or NVMe as a "SLOG" (mistakenly +also called "ZIL") drive for write caching. This isn't what happens, because ZFS +already includes [a write cache](https://linuxhint.com/configuring-zfs-cache/) +in RAM. Since RAM is always faster, adding a disk as a write cache doesn't make +sense. What the ZFS Intent Log (ZIL) does, with or without a dedicated drive, is handle synchronous writes. These occur when the system refuses to signal a successful -write until the data is actually on a physical disk somewhere. This keeps it -safe. 
By default, the ZIL initially shoves a copy of the data on a normal VDEV +write until the data is actually stored on a physical disk somewhere. This keeps +the data safe, but is slower. + +By default, the ZIL initially shoves a copy of the data on a normal VDEV somewhere and then gives the thumbs up. The actual write to the pool is -performed later from the normal write cache, _not_ the temporary copy. The data -there is only ever read if the power fails before the last step. +performed later from the write cache in RAM, _not_ the temporary copy. The data +there is only ever read if the power fails before the last step. The ZIL is all +about protecting data, not making transfers faster. -A Separate Intent Log (SLOG) is a fast drive for the ZIL's temporary synchronous -writes. It allows the ZIL give the thumbs up quicker. This means that SLOG is -never read unless the power has failed before the final write to the pool. -Asynchronous writes just go through the normal write cache. If the power fails, -the data is gone. +A Separate Intent Log (SLOG) is an additional fast drive for these temporary +synchronous writes. It simply allows the ZIL to give the thumbs up quicker. This +means that a SLOG is never read unless the power has failed before the final +write to the pool. Asynchronous writes just go through the normal write cache, +by the way. If the power fails, the data is gone. -In summary, the ZIL is concerned with preventing data loss for synchronous -writes, not with speed. You always have a ZIL. A SLOG will make the ZIL faster. -You'll need to [do some +In summary, the ZIL prevents data loss during synchronous writes. You always +have a ZIL. A SLOG will make the ZIL faster. You'll probably need to [do some research](https://www.ixsystems.com/blog/o-slog-not-slog-best-configure-zfs-intent-log/) -to figure out if your system would benefit from a SLOG. NFS for instance uses -synchonous writes, SMB usually doesn't. If in doubt, add more RAM instead. 
+and some testing to figure out if your system would benefit from a SLOG. NFS for +instance uses synchronous writes, SMB usually doesn't. If in doubt, add more RAM +instead. ## Further reading and viewing -- One of the best books around is _FreeBSD Mastery: ZFS_ by Michael W. +- In 2012, Aaron Toponce wrote a now slightly dated, but still very good + [introduction](https://pthree.org/2012/04/17/install-zfs-on-debian-gnulinux/) + to ZFS on Linux. If you only read one part, make it the [explanation of the + ARC](https://pthree.org/2012/12/07/zfs-administration-part-iv-the-adjustable-replacement-cache/), + the ZFS read cache. + +- One of the best books on ZFS around is _FreeBSD Mastery: ZFS_ by Michael W. Lucas and Allan Jude. Though it is written for FreeBSD, the general guidelines - apply for all variants. There is a second book for advanced users. + apply for all variants. There is a second volume for advanced use. - Jeff Bonwick, one of the original creators of ZFS, tells the story of how ZFS came to be [on YouTube](https://www.youtube.com/watch?v=dcV2PaMTAJ4).