Update 2017-08-26: While the system was usable during heavy I/O, the lag was fairly annoying. I have since moved my system partitions back to the SSD (losing the raid1 redundancy). My two HDDs are still encrypted with LUKS, but I switched to btrfs’s built-in raid1 support to mirror them. So basically I’ve ditched mdadm and bcache.


Goals

My desktop is currently running a fairly custom configuration of Arch Linux which I’m going to attempt to explain in this post. First, I use my desktop as a multi-purpose system. My primary activities include:

  • Open source software development (Ruby, Node, Go, Elixir, etc.)
  • General desktop use (browser, email, chat, etc.)
  • Gaming (including some Windows-only games, like Overwatch)
  • Personal file server/cold storage (important files I would never want to lose)
  • Media server (Plex, etc.)
  • Kitchen sink (I run whatever appliances I happen to be using at the moment with docker)

I dual-boot Windows 10 so that I can play Overwatch and a few other games. I previously also used Windows as a media/file server, but I prefer Linux for my personal computing: Windows 10 raises a lot of privacy concerns, and it’s kind of a black box to me. For example, I tried to use Storage Spaces and nearly lost some data when a drive failed and Windows wouldn’t mount my mirrored space (even though it was supposed to tolerate that failure transparently). I also wanted to encrypt the space, but after that near-miss I didn’t feel confident letting Windows encrypt data it had almost eaten even without encryption in the picture.

That said, I had some pretty lofty goals for my new desktop setup:

1. At-rest encryption for everything

I encrypt the entire 500GB disk on my laptop with dm-crypt/LUKS (standard full-disk encryption technique on most Linux distros). I use LVM on top of LUKS so that all my volumes are encrypted at-rest, including swap. I find it much better (for the sake of sanity) to encrypt everything rather than try to decide what deserves encryption and what doesn’t.

I wanted to use the same approach on my desktop, except my disk layout is more complex: I have multiple HDDs (for media/archive storage) with a 128GB SSD M.2 card.

2. One encryption password

It would be simple enough to install the OS on my SSD using the same LVM on LUKS approach as my laptop, but that would mean my HDDs must be encrypted as separate volumes. I might be able to store those extra keys on the encrypted SSD (either in a keychain or as keyfiles referenced from /etc/crypttab), but ideally I wanted all my data encrypted using the same key, which is entered once, at boot.

The other issue with using the SSD exclusively for the OS is that there would be no redundancy in case of drive failure (meaning it would take longer to restore my OS volumes from a backup).

3. Full redundancy

I wanted to be able to tolerate at least a single drive failure for my storage as well as my OS and bootloader. That means I need to use a raid1 or raid5 configuration for my OS as well as HDD storage, and at least 2 drives for each volume group.

4. OS Speed

I wanted to leverage my SSD as much as possible when booting the OS and accessing commonly used programs and files. Normally I’d just install the OS on the SSD, but again that offered no redundancy, and files stored on the HDDs would not benefit from it at all.

5. Flexible (and automated) system snapshots

Arch is a rolling release, which means it gets the latest package updates (including the Linux kernel) immediately. I’m pretty sure Archers used “YOLO” before it was cool.

It’s important to have a way to roll back my system if it doesn’t like an update. The alternatives are to drop everything I’m doing to troubleshoot and fix the issue or live with whatever it is until I can get around to it.

With that in mind, I wanted to have regular system snapshots which I could restore effortlessly.

The setup

My available drives consist of 1x128GB SSD (M.2), 2x5TB HDD, 4x2TB HDD. Because I don’t have more than 5TB of data atm, I opted to keep the 2TB drives (which I salvaged from a Seagate NAS) in reserve and go with the SSD and 2 5TB drives in raid1 (for 5TB of mirrored storage). The simplest thing I could think of was to install Arch on my raid1 array and use the SSD as a cache on top of it. I’d been curious if that were possible for some time until I stumbled onto bcache, which is exactly what I wanted.
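The capacity trade-off between those drive sets is simple arithmetic. Here is a small sketch using the usual rules of thumb: raid1 keeps a full copy on every member, raid5 gives up one drive's worth of space to parity. The `usable_tb` helper is made up for illustration and ignores metadata overhead.

```python
def usable_tb(level, drive_tb, count):
    """Approximate usable capacity (in TB) of an array of identical drives."""
    if level == "raid1":
        if count < 2:
            raise ValueError("raid1 needs at least 2 drives")
        # Every member holds a complete copy, so capacity is one drive.
        return drive_tb
    if level == "raid5":
        if count < 3:
            raise ValueError("raid5 needs at least 3 drives")
        # One drive's worth of space is spent on parity.
        return drive_tb * (count - 1)
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    print(usable_tb("raid1", 5, 2))  # the two 5TB drives, mirrored -> 5
    print(usable_tb("raid5", 2, 4))  # the four spare 2TB drives -> 6
```

Both layouts survive a single drive failure. The raid5 option would offer slightly more space but needs all four of the older drives.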

I’m using five storage layers, each serving a single purpose:

  1. Software raid (via mdadm)
  2. SSD cache (via bcache)
  3. Encryption (via LUKS)
  4. Logical volumes (via LVM)
  5. File system (btrfs)

My bootloader (Syslinux atm) is configured on a second raid1 array (same disks, different partitions) so that it should also be able to tolerate a drive failure transparently.
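To make the stacking order concrete, here is a dry-run sketch that only builds and prints the commands for each layer, bottom to top. It is not the exact sequence used on this machine. The array names `/dev/md/boot` and `/dev/md/root` are hypothetical, the device paths are taken from the lsblk output, and `--metadata=1.0` on the boot array is an assumption (it keeps the md superblock at the end of the partition, which Syslinux generally needs in order to read /boot from a raid1 member).

```python
import shlex

BOOT_PARTS = ["/dev/sdb1", "/dev/sdc1"]  # small /boot partitions
DATA_PARTS = ["/dev/sdb2", "/dev/sdc2"]  # large data partitions
SSD = "/dev/sda"                         # cache device


def build_plan():
    """Return the setup commands as argv lists, lowest layer first."""
    return [
        # Separate mirror for /boot (metadata version is an assumption).
        ["mdadm", "--create", "/dev/md/boot", "--level=1",
         "--raid-devices=2", "--metadata=1.0", *BOOT_PARTS],
        # 1. Software raid: mirror the two data partitions.
        ["mdadm", "--create", "/dev/md/root", "--level=1",
         "--raid-devices=2", *DATA_PARTS],
        # 2. bcache: the array is the backing device, the SSD is the cache.
        ["make-bcache", "-B", "/dev/md/root", "-C", SSD],
        # 3. LUKS on top of the cached device.
        ["cryptsetup", "luksFormat", "/dev/bcache0"],
        ["cryptsetup", "open", "/dev/bcache0", "luks"],
        # 4. LVM inside the encrypted container.
        ["pvcreate", "/dev/mapper/luks"],
        ["vgcreate", "vg0", "/dev/mapper/luks"],
        ["lvcreate", "-L", "16G", "-n", "swap", "vg0"],
        ["lvcreate", "-l", "100%FREE", "-n", "root", "vg0"],
        # 5. Swap and the btrfs root file system.
        ["mkswap", "/dev/vg0/swap"],
        ["mkfs.btrfs", "/dev/vg0/root"],
    ]


def print_plan():
    """Print the plan as shell-quoted lines; nothing is executed."""
    for argv in build_plan():
        print(shlex.join(argv))


if __name__ == "__main__":
    print_plan()
```

Because LUKS is created on `/dev/bcache0` rather than on the raw array, a single passphrase at boot unlocks everything above it.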

Here’s what my lsblk looks like:

NAME                 SIZE TYPE  MOUNTPOINT
sda                119.2G disk
sdb                  4.6T disk
├─sdb1               260M part
│ └─md127            260M raid1 /boot
└─sdb2               4.6T part
  └─md126            4.6T raid1
    └─bcache0        4.6T disk
      └─luks         4.6T crypt
        ├─vg0-swap    16G lvm   [SWAP]
        └─vg0-root   4.5T lvm   /
sdc                  4.6T disk
├─sdc1               260M part
│ └─md127            260M raid1 /boot
└─sdc2               4.6T part
  └─md126            4.6T raid1
    └─bcache0        4.6T disk
      └─luks         4.6T crypt
        ├─vg0-swap    16G lvm   [SWAP]
        └─vg0-root   4.5T lvm   /

sda is my SSD and sdb/sdc are my HDDs.

LVM lets me keep a swap volume inside the encrypted LUKS container and makes it easier to resize or rearrange my volumes in the future.

LUKS sits on top of the bcache device (which is why it appears below bcache0 in the lsblk tree), so anything that lands in the SSD cache is already encrypted at rest.

The rest of the drive space is used for a root volume formatted with btrfs, a newer file system for Linux. It lets me create flexible subvolumes and snapshot them using a copy-on-write strategy. I’ve configured my system to take hourly snapshots of the root subvolume (where the OS lives) so that if an Arch update hoses something I can just restore an earlier snapshot. For a really interesting talk about btrfs, check out Why you should consider using btrfs … like Google does.
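The post doesn't say which tool takes the hourly snapshots, so the following is only a minimal sketch of the idea, not the actual configuration. It builds the `btrfs subvolume snapshot -r` command for a timestamped read-only snapshot and works out which older snapshots have aged out. The `/.snapshots` directory, the `root-` naming scheme and the 48-hour retention window are all assumptions.

```python
from datetime import datetime, timedelta

SNAPSHOT_DIR = "/.snapshots"  # hypothetical location for snapshots
PREFIX = "root-"              # hypothetical naming scheme
STAMP = "%Y-%m-%dT%H%M"


def snapshot_command(now, source="/"):
    """Build the command for a read-only snapshot named after `now`."""
    name = PREFIX + now.strftime(STAMP)
    return ["btrfs", "subvolume", "snapshot", "-r", source,
            f"{SNAPSHOT_DIR}/{name}"]


def expired(names, now, keep_hours=48):
    """Return the snapshot names older than the retention window, sorted."""
    cutoff = now - timedelta(hours=keep_hours)
    old = []
    for name in names:
        if not name.startswith(PREFIX):
            continue  # not one of ours
        try:
            taken = datetime.strptime(name[len(PREFIX):], STAMP)
        except ValueError:
            continue  # unexpected name, leave it alone
        if taken < cutoff:
            old.append(name)
    return sorted(old)


def delete_command(name):
    """Build the command that removes one expired snapshot."""
    return ["btrfs", "subvolume", "delete", f"{SNAPSHOT_DIR}/{name}"]
```

An hourly systemd timer or cron job could pass these argument lists to `subprocess.run`. Since the snapshots are copy-on-write, each one only takes up space for blocks that have changed since it was taken.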

Result

So far everything seems to be working great. Arch boots up quickly and everything feels snappy, about the same as on my laptop’s SSD, except I have 5TB available. Writing files to disk is really fast until (I assume) the HDDs get involved, at which point I do get some system lag if I’m transferring a massive amount of data. So far that hasn’t been a big problem for me.

The real test will come when something fails… I still need to set up a full remote system backup and then I plan to run some failure scenarios like unplugging one of my HDDs and trying to boot the system, etc. I’m also very new to btrfs, but I really like what I’ve seen.
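For the unplug-a-drive test, one quick way to confirm what mdadm thinks happened is to look at /proc/mdstat, where a healthy two-disk mirror reports `[2/2] [UU]` and a degraded one reports something like `[2/1] [U_]`. The sketch below parses that text. The `degraded_arrays` helper is made up for illustration, and the sample is invented rather than captured from this machine.

```python
import re

ARRAY_LINE = re.compile(r"^(md\w+) :")
STATUS = re.compile(r"\[(\d+)/(\d+)\] \[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Map each degraded array name to its member status string."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_LINE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                degraded[current] = status.group(3)
            current = None
    return degraded


# Illustrative sample only: md126 has lost one of its two members.
SAMPLE = """\
Personalities : [raid1]
md126 : active raid1 sdb2[0]
      4883638272 blocks super 1.2 [2/1] [U_]

md127 : active raid1 sdc1[1] sdb1[0]
      266176 blocks super 1.0 [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a live system the same function can be fed the contents of `/proc/mdstat`. An empty result means every array has all of its members.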

If you have any questions or would like a more detailed technical guide on how to set up this system from scratch, hit me up on Twitter.