diff --git a/docs/installation.md b/docs/installation.md
index dab5d948..362b52ca 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -8,13 +8,17 @@ You can run Ansible-NAS from the computer you plan to use for your NAS, or from
 
 1. Copy `group_vars/all.yml.dist` to `group_vars/all.yml`.
 
-1. Open up `group_vars/all.yml` and follow the instructions there for configuring your Ansible NAS.
+1. Open up `group_vars/all.yml` and follow the instructions there for
+   configuring your Ansible NAS.
 
-1. If you plan to use Transmission with OpenVPN, also copy `group_vars/vpn_credentials.yml.dist` to
-`group_vars/vpn_credentials.yml` and fill in your settings.
+1. If you plan to use Transmission with OpenVPN, also copy
+   `group_vars/vpn_credentials.yml.dist` to `group_vars/vpn_credentials.yml`
+   and fill in your settings.
 
 1. Copy `inventory.dist` to `inventory` and update it.
 
-1. Install the dependent roles: `ansible-galaxy install -r requirements.yml` (you might need sudo to install Ansible roles)
+1. Install the dependent roles: `ansible-galaxy install -r requirements.yml`
+   (you might need sudo to install Ansible roles)
 
-1. Run the playbook - something like `ansible-playbook -i inventory nas.yml -b -K` should do you nicely.
+1. Run the playbook - something like
+   `ansible-playbook -i inventory nas.yml -b -K` should do you nicely.
diff --git a/docs/zfs_configuration.md b/docs/zfs_configuration.md
new file mode 100644
index 00000000..487d3f7b
--- /dev/null
+++ b/docs/zfs_configuration.md
@@ -0,0 +1,232 @@
+This text deals with specific ZFS configuration questions for Ansible-NAS. If
+you are new to ZFS and are looking for the big picture, please read the
+[ZFS overview](zfs_overview.md) introduction first.
+
+## Just so there is no misunderstanding
+
+Unlike other NAS variants, Ansible-NAS does not install, configure or manage
+the disks or file systems for you. It doesn't care which file system you use --
+ZFS, Btrfs, XFS or EXT4, take your pick. It also provides no mechanism for
+external backups, snapshots or disk monitoring. As Tony Stark said to Loki in
+_Avengers_: It's all on you.
+
+However, Ansible-NAS has traditionally been used with the powerful ZFS
+filesystem ([OpenZFS](http://www.open-zfs.org/wiki/Main_Page), to be exact).
+Since [ZFS on Linux](https://zfsonlinux.org/) is comparatively new, this text
+provides a very basic example of setting up a simple storage configuration with
+scrubs and snapshots. To paraphrase Nick Fury from _Winter Soldier_: We do
+share. We're nice like that.
+
+> Using ZFS for Docker containers is currently not covered by this document.
+> See [the Docker ZFS documentation](https://docs.docker.com/storage/storagedriver/zfs-driver/)
+> for details.
+
+## The obligatory warning
+
+We take no responsibility for any bad thing that might happen if you follow
+this guide. We strongly suggest you test these procedures in a virtual machine.
+Always, always, always back up your data.
+
+## The basic setup
+
+For this example, we're assuming two identical spinning rust hard drives for
+all Ansible-NAS storage. These two drives will be **mirrored** to provide
+redundancy. The actual Ubuntu system will be on a different drive and is not
+our concern here.
+
+> [Root on ZFS](https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS)
+> is currently still a hassle for Ubuntu. If that changes, this document might
+> be updated accordingly. Until then, don't ask us about it.
+
+The Ubuntu kernel already ships with ZFS support. We only need the userland
+utilities, which we install with `sudo apt install zfsutils-linux`.
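+
+If you want to double-check that the kernel module and the userland tools are
+actually in place before touching any drives, a quick read-only sanity check
+might look like this:
+
+```
+ modinfo zfs | head -n 3   # the kernel module is installed and visible
+ zpool status              # should report "no pools available" at this point
+```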
+
+
+### Creating the pool
+
+We assume you don't mind totally destroying whatever data might be on your
+storage drives, have used a tool such as `gparted` to remove any existing
+partitions, and have installed a GPT partition table. To create our ZFS pool,
+we will use a command of the form
+
+```
+ sudo zpool create -o ashift=<ashift> <name> mirror <drives>
+```
+
+The options from simple to complex are:
+
+1. **`<name>`**: ZFS pools traditionally take their names from characters in
+   [The Matrix](https://www.imdb.com/title/tt0133093/fullcredits). The two most
+   common are `tank` and `dozer`. Whatever you use, it should be short.
+
+1. **`<drives>`**: The Linux command `lsblk` will give you a quick overview of
+   the hard drives in the system. However, we don't want to pass a drive
+   specification in the format `/dev/sde` because this is not persistent.
+   Instead, [we use](https://github.com/zfsonlinux/zfs/wiki/FAQ#selecting-dev-names-when-creating-a-pool)
+   the output of `ls /dev/disk/by-id/` to find the drives' IDs.
+
+1. **`<ashift>`**: This is required to pass the [sector size](https://github.com/zfsonlinux/zfs/wiki/FAQ#advanced-format-disks)
+   of the drive to ZFS for optimal performance. You might have to do this by
+   hand because some drives lie: Whereas modern drives have 4k sector sizes (or
+   8k in the case of many SSDs), they will report 512 bytes for backward
+   compatibility. ZFS tries to [catch the liars](https://github.com/zfsonlinux/zfs/blob/master/cmd/zpool/zpool_vdev.c)
+   and use the correct value. However, that sometimes fails, and you have to
+   add it by hand. The `ashift` value is the base-2 logarithm of the sector
+   size, so we have **9** for 512 bytes, **12** for 4k, and **13** for 8k. You
+   can create a pool without this parameter and then use `zdb -C | grep ashift`
+   to see what ZFS generated automatically. If it isn't what you think, you can
+   destroy the pool (see below) and add it manually when creating it again.
+
+In our pretend case, we use 3 TB WD Red drives. Listing all drives by ID gives
+us something like this, but with real serial numbers:
+
+```
+ ata-WDC_WD30EFRX-68EUZN0_WD-WCCFAKESN01
+ ata-WDC_WD30EFRX-68EUZN0_WD-WCCFAKESN02
+```
+
+The actual command to create the pool would be:
+
+```
+ sudo zpool create -o ashift=12 tank mirror ata-WDC_WD30EFRX-68EUZN0_WD-WCCFAKESN01 ata-WDC_WD30EFRX-68EUZN0_WD-WCCFAKESN02
+```
+
+Our new pool is named `tank` and is mirrored. To see information about it, use
+`zpool status tank` (no `sudo` necessary). If you screwed up (usually with
+`ashift`), use `sudo zpool destroy tank` and start over _now_, before it's too
+late.
+
+### Pool default parameters
+
+Setting pool-wide default parameters makes life easier when we create our
+datasets. To see them all, you can use the command `zfs get all tank`. Most are
+perfectly sensible. Some you'll [want to change](https://jrs-s.net/2018/08/17/zfs-tuning-cheat-sheet/)
+are:
+
+```
+ sudo zfs set atime=off tank
+ sudo zfs set compression=lz4 tank
+ sudo zpool set autoexpand=on tank
+```
+
+The `atime` parameter means that your system updates an attribute of a file
+every time the file is accessed, which uses a lot of resources. Usually, you
+don't care. Compression is a no-brainer on modern CPUs and should be on by
+default (we will discuss exceptions for compressed media files later).
+`autoexpand` lets the pool grow when you add larger hard drives; note that it
+is a property of the pool itself, which is why it is set with `zpool` instead
+of `zfs`.
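+
+If you want to make sure the new defaults actually took effect, you can read
+the properties back with something like the following; these commands only
+query state and change nothing:
+
+```
+ zfs get atime,compression tank   # should report "off" and "lz4"
+ zpool get autoexpand tank        # pool-level property, should be "on"
+ zdb -C tank | grep ashift        # double-check the sector size exponent
+```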
+
+
+## Creating the filesystems
+
+To actually store the data, we need filesystems (also known as "datasets").
+For our very simple default Ansible-NAS setup, we will create two examples:
+One filesystem for movies (`movies_root` in `all.yml`) and one for downloads
+(`downloads_root`).
+
+### Movies (and other large, pre-compressed files)
+
+We first create the basic file system for movies:
+
+```
+ sudo zfs create tank/movies
+```
+
+Movie files are usually rather large, already in a compressed format, and the
+files stored there shouldn't be executed for security reasons. We change the
+properties of the filesystem accordingly:
+
+```
+ sudo zfs set recordsize=1M tank/movies
+ sudo zfs set compression=off tank/movies
+ sudo zfs set exec=off tank/movies
+```
+
+The **recordsize** here is set to the largest value currently possible [to
+increase performance](https://jrs-s.net/2019/04/03/on-zfs-recordsize/) and save
+storage. Recall that we used `ashift` during the creation of the pool to match
+the ZFS block size with the drives' sector size. Records are created out of
+these blocks. Having larger records reduces the amount of metadata that is
+required, and various aspects of ZFS such as caching and checksums work on this
+level.
+
+**Compression** is unnecessary for movie files because they are usually in a
+compressed format anyway. ZFS is good about recognizing this, and so if you
+happen to leave compression on as the default for the pool, it won't make much
+of a difference.
+
+[By default](https://zfsonlinux.org/manpages/0.7.13/man8/zfs.8.html#lbAI), ZFS
+mounts pools and their filesystems directly under the root directory, and they
+do not have to be listed in `/etc/fstab` to be mounted. This means that our
+filesystem will appear as `/tank/movies`. We need to change the line in
+`all.yml` accordingly:
+
+```
+ movies_root: "/tank/movies"
+```
+
+You can also set a traditional mount point if you wish with the `mountpoint`
+property. Setting this to `none` prevents the file system from being
+automatically mounted at all.
+
+The filesystems for TV shows, music files and podcasts - all large,
+pre-compressed files - should take the exact same parameters as the one for
+movies.
+
+### Downloads
+
+For downloads, we can leave most of the default parameters the way they are.
+
+```
+ sudo zfs create tank/downloads
+ sudo zfs set exec=off tank/downloads
+```
+
+The recordsize stays at the 128k default. In `all.yml`, the new line is
+
+```
+ downloads_root: "/tank/downloads"
+```
+
+### Other data
+
+Depending on the use case, you might want to tune your filesystems. For
+example, [Bit Torrent](http://open-zfs.org/wiki/Performance_tuning#Bit_Torrent),
+[MySQL](http://open-zfs.org/wiki/Performance_tuning#MySQL) and
+[Virtual Machines](http://open-zfs.org/wiki/Performance_tuning#Virtual_machines)
+all have known best configurations.
+
+
+## Setting up scrubs
+
+On Ubuntu, scrubs are configured out of the box to run on the second Sunday of
+every month. See `/etc/cron.d/zfsutils-linux` to change this.
+
+
+## Email notifications
+
+To have the [ZFS daemon](http://manpages.ubuntu.com/manpages/bionic/man8/zed.8.html)
+`zed` send you emails when there is trouble, you first have to
+[install an email agent](https://www.reddit.com/r/zfs/comments/90prt4/zed_config_on_ubuntu_1804/)
+such as postfix. In the file `/etc/zfs/zed.d/zed.rc`, change the three entries:
+
+```
+ZED_EMAIL_ADDR=<your email address>
+ZED_NOTIFY_INTERVAL_SECS=3600
+ZED_NOTIFY_VERBOSE=1
+```
+
+If `zed` is not enabled, you might have to run `sudo systemctl enable zfs-zed`
+(the unit may simply be called `zed` on some systems). You can test the setup
+by manually starting a scrub with `sudo zpool scrub tank`.
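+
+Putting it together, a minimal smoke test might look like this (assuming your
+mail agent can already deliver mail from this host):
+
+```
+ sudo systemctl enable --now zfs-zed   # make sure the event daemon is running
+ sudo zpool scrub tank                 # kick off a scrub by hand
+ zpool status tank                     # watch the progress; zed should mail you the result
+```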
+
+
+## Setting up automatic snapshots
+
+See [sanoid](https://github.com/jimsalterjrs/sanoid/) as a tool for snapshot
+management.
diff --git a/docs/zfs_overview.md b/docs/zfs_overview.md
new file mode 100644
index 00000000..5d6877f9
--- /dev/null
+++ b/docs/zfs_overview.md
@@ -0,0 +1,200 @@
+This is a general overview of the ZFS file system for people who are new to it.
+If you have some experience and are looking for specific information about how
+to configure ZFS for Ansible-NAS, check out the
+[ZFS example configuration](zfs_configuration.md) instead.
+
+## What is ZFS and why would I want it?
+
+[ZFS](https://en.wikipedia.org/wiki/ZFS) is an advanced filesystem and volume
+manager originally created by Sun Microsystems from 2001 onwards. It was first
+released in 2005 for OpenSolaris; Oracle later bought Sun and continued to
+develop ZFS as closed source software. An open source fork took the name
+[OpenZFS](http://www.open-zfs.org/wiki/Main_Page), but is still called "ZFS"
+for short. It runs on Linux, FreeBSD, illumos and other platforms.
+
+ZFS aims to be the
+["last word in filesystems"](https://blogs.oracle.com/bonwick/zfs:-the-last-word-in-filesystems) -
+a system so future-proof that Michael W. Lucas and Allan Jude famously stated
+that the _Enterprise's_ computer on _Star Trek_ probably runs it. The design
+was based on [four principles](https://www.youtube.com/watch?v=MsY-BafQgj4):
+
+1. "Pooled" storage to completely eliminate the notion of volumes. You can add
+   more storage the same way you would add another RAM stick to your computer.
+
+1. Make sure data is always consistent on the disks. There is no `fsck` command
+   for ZFS.
+
+1. Detect and correct data corruption ("bitrot"). ZFS is one of the few storage
+   systems that checksums everything and is "self-healing".
+
+1. Make it easy to use. Try to "end the suffering" for the admins involved in
+   managing storage.
+
+ZFS includes a host of other features such as snapshots, transparent
+compression, and encryption. During the early years of ZFS, this all came with
+hardware requirements which only enterprise users could afford. By now,
+however, computers have become so powerful that ZFS can run (with some effort)
+on a [Raspberry Pi](https://gist.github.com/mohakshah/b203d33a235307c40065bdc43e287547).
+FreeBSD and FreeNAS make extensive use of ZFS. On Linux, ZFS is held back by
+[licensing conflicts](https://en.wikipedia.org/wiki/OpenZFS#History) that are
+beyond the scope of this document.
+
+Ansible-NAS doesn't actually specify a filesystem - you can use EXT4, XFS,
+Btrfs or pretty much anything you like. However, ZFS not only provides the
+benefits listed above, but also lets you use your hard drives with different
+operating systems. Some people now using Ansible-NAS originally came from
+FreeNAS, and were able to `export` their ZFS pools there and `import` them to
+Ubuntu. On the other hand, if you ever decide to switch back to FreeNAS or
+maybe try FreeBSD instead of Linux, you should be able to do so using the same
+ZFS pools.
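+
+Moving a pool between operating systems really does boil down to a `zpool
+export` on the old system and a `zpool import` on the new one, sketched here
+with the example pool name `tank`:
+
+```
+ sudo zpool export tank   # on the old system: cleanly detach the pool
+ sudo zpool import        # on the new system: list pools available for import
+ sudo zpool import tank   # ...then actually import it
+```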
+
+## A small taste of ZFS
+
+Storage in ZFS is organized in **pools**. Inside these pools, you create
+**filesystems** (also known as "datasets") which are like partitions on
+steroids. For instance, you can keep each user's `/home/` files in a separate
+filesystem. ZFS systems tend to use lots and lots of specialized filesystems.
+They share the available storage in their pool.
+
+Pools do not directly consist of hard disks or SSDs. Instead, drives are
+organized as **virtual devices** (VDEVs). This is where the physical redundancy
+in ZFS is located. Drives in a VDEV can be "mirrored" or combined as "RaidZ",
+roughly the equivalent of RAID5. These VDEVs are then combined into a pool by
+the administrator.
+
+To give you some idea of how this works, this is how to create a pool:
+
+```
+ sudo zpool create tank mirror /dev/sda /dev/sdb
+```
+
+This combines `/dev/sda` and `/dev/sdb` into a mirrored VDEV, and then defines
+a new pool named `tank` consisting of this single VDEV. We can now create a
+filesystem in this pool to hold our books:
+
+```
+ sudo zfs create tank/books
+```
+
+You can then enable automatic and transparent compression on this filesystem
+with `sudo zfs set compression=lz4 tank/books`. To take a **snapshot**, use
+
+```
+ sudo zfs snapshot tank/books@monday
+```
+
+Now, if evil people were somehow to encrypt your book files with ransomware on
+Wednesday, you can laugh and revert to the old version:
+
+```
+ sudo zfs rollback tank/books@monday
+```
+
+Of course, you lose any work from Tuesday unless you created a snapshot then as
+well. Usually, you'll have some form of **automatic snapshot administration**.
+
+To detect bitrot and other defects, ZFS periodically runs **scrubs**: The
+system compares the available copies of each data record with their checksums.
+If there is a mismatch, the data is repaired.
+
+
+## Known issues
+
+Constructing the pools out of virtual devices creates some problems. You can't
+just detach a drive (or a VDEV) and have the pool reconfigure itself. To
+reorganize the pool, you'd have to create a new, temporary pool out of separate
+hard drives, move the data over, destroy and reconfigure the original pool, and
+then move the data back. Increasing the size of a pool involves either adding
+more VDEVs (_not_ just additional disks) or replacing each disk in a VDEV by a
+larger version with the `autoexpand` parameter set.
+
+> At the time of writing (April 2019), ZFS on Linux does not offer native
+> encryption, trim support, or device removal, which are all scheduled to be
+> included in the
+> [0.8 release](https://www.phoronix.com/scan.php?page=news_item&px=ZFS-On-Linux-0.8-RC1-Released)
+> in the near future.
+
+## Myths and misunderstandings
+
+There is a lot of false or simply outdated information about ZFS floating
+around. To clear up the worst of it:
+
+### No, ZFS does not need at least 8 GB of RAM
+
+This myth is especially common
+[in FreeNAS circles](https://www.ixsystems.com/community/threads/does-freenas-really-need-8gb-of-ram.38685/).
+Note that FreeBSD, the basis of FreeNAS, will run ZFS with as little
+[as 1 GB](https://wiki.freebsd.org/ZFSTuningGuide). The
+[ZFS on Linux FAQ](https://github.com/zfsonlinux/zfs/wiki/FAQ#hardware-requirements),
+which is more relevant here, states under "suggested hardware":
+
+> 8GB+ of memory for the best performance. It's perfectly possible to run with
+> 2GB or less (and people do), but you'll need more if using deduplication.
+
+(Deduplication is only useful in
+[very special cases](http://open-zfs.org/wiki/Performance_tuning#Deduplication).
+If you are reading this, you probably don't need it.)
+
+What everybody agrees on is that ZFS _loves_ RAM, and you should have as much
+of it as you possibly can. So 8 GB is in fact a sensible lower limit you
+shouldn't go below except for testing. When in doubt, add more RAM, and even
+more, and then some, until your motherboard's capacity is reached.
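+
+If you ever need to keep the ZFS cache (the ARC) from using most of your RAM,
+ZFS on Linux exposes the `zfs_arc_max` module parameter for that. A sketch,
+assuming you want to cap the ARC at roughly 8 GB:
+
+```
+ # 8 GiB = 8 * 1024^3 bytes; takes effect after a module reload or reboot
+ echo "options zfs zfs_arc_max=8589934592" | sudo tee /etc/modprobe.d/zfs.conf
+```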
+
+### No, ECC RAM is not required for ZFS
+
+This again is a case where a recommendation has been taken as a requirement.
+To quote the
+[ZFS on Linux FAQ](https://github.com/zfsonlinux/zfs/wiki/FAQ#do-i-have-to-use-ecc-memory-for-zfs)
+again:
+
+> Using ECC memory for OpenZFS is strongly recommended for enterprise
+> environments where the strongest data integrity guarantees are required.
+> Without ECC memory rare random bit flips caused by cosmic rays or by faulty
+> memory can go undetected. If this were to occur OpenZFS (or any other
+> filesystem) will write the damaged data to disk and be unable to automatically
+> detect the corruption.
+
+It is _always_ better to have ECC RAM on all computers if you can afford it,
+and ZFS is no exception. However, there is absolutely no requirement for ZFS to
+have ECC RAM.
+
+### No, the SLOG is not really a write cache
+
+You'll hear the suggestion that you add a fast SSD or NVMe drive as a "SLOG"
+(mistakenly also called "ZIL") drive for write caching. This isn't what would
+happen, because ZFS already includes
+[a write cache](https://linuxhint.com/configuring-zfs-cache/). It is located in
+RAM. Since RAM is always faster than any drive, adding a disk as a write cache
+doesn't make sense.
+
+What the ZFS Intent Log (ZIL) does, with or without a dedicated drive, is
+handle synchronous writes. These occur when the system refuses to signal a
+successful write until the data is actually on a physical disk somewhere. This
+keeps the data safe. By default, the ZIL initially shoves a copy of the data
+onto a normal VDEV somewhere and then gives the thumbs up. The actual write to
+the pool is performed later from the normal write cache, _not_ the temporary
+copy. The data there is only ever read if the power fails before the last step.
+
+A Separate Intent Log (SLOG) is a fast drive for the ZIL's temporary
+synchronous writes. It allows the ZIL to give the thumbs up more quickly. This
+means that a SLOG is never read unless the power has failed before the final
+write to the pool. Asynchronous writes just go through the normal write cache.
+If the power fails, the data is gone.
+
+In summary, the ZIL is concerned with preventing data loss for synchronous
+writes, not with speed. You always have a ZIL. A SLOG will make the ZIL faster.
+You'll need to
+[do some research](https://www.ixsystems.com/blog/o-slog-not-slog-best-configure-zfs-intent-log/)
+to figure out if your system would benefit from a SLOG. NFS for instance uses
+synchronous writes, SMB usually doesn't. If in doubt, add more RAM instead.
+
+
+## Further reading and viewing
+
+- One of the best books around is _FreeBSD Mastery: ZFS_ by Michael W. Lucas
+  and Allan Jude. Though it is written for FreeBSD, the general guidelines
+  apply to all variants. There is a second book for advanced users.
+
+- Jeff Bonwick, one of the original creators of ZFS, tells the story of how ZFS
+  came to be [on YouTube](https://www.youtube.com/watch?v=dcV2PaMTAJ4).