What to do with a Degraded ZFS Pool

Craft Computing

8 months ago

37,014 views

Comments:

Dave Stemerdink
Dave Stemerdink - 23.11.2023 01:13

ZFS is awesome

Reply
inputoutput1126
inputoutput1126 - 21.11.2023 00:01

You should really still be using disk UUIDs when replacing the degraded disk.
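
For reference, the stable names can be listed like this (a sketch; the example entry is made up):

ls -l /dev/disk/by-id/
# each entry (e.g. nvme-ExampleVendor_SSD_1TB_SERIAL123) is a symlink to the current
# /dev node, so it keeps pointing at the same physical disk even if enumeration changes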

Reply
Alisher Beknazarov
Alisher Beknazarov - 20.11.2023 12:17

zpool scrub -v pool_name

invalid option 'v'
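
Indeed, zpool scrub has no -v flag; the verbose detail comes from zpool status. A minimal sketch, assuming the pool is named rpool:

zpool scrub rpool        # starts the scrub in the background
zpool status -v rpool    # shows scrub progress and lists any files with errors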

Reply
OsX86H3AvY
OsX86H3AvY - 20.11.2023 05:46

This is why I use Zabbix.

Reply
Mikhail Sokolovskyy
Mikhail Sokolovskyy - 19.11.2023 02:59

Hi Jeff! Can you recommend any reliable enterprise SSDs for a home server, for storing all my VMs, which are controlled and accessed by Proxmox? Preferably disks that are available in local stores, not only as used parts.

Reply
Martin Schreiber
Martin Schreiber - 18.11.2023 23:45

You can have surprises regarding the disk size. A 1TB SSD is not always the same size as another 1TB SSD. I ran into this problem when trying to replace a faulty SSD with a new one. The replacement I purchased had a little bit less storage capacity than the faulty one, despite coming from the same manufacturer and being the same model and size. In the end I recreated my ZFS pool and restored the data from my backup.
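
One way to catch this before committing to a replacement is to compare exact byte counts rather than labelled sizes. A sketch (device names are placeholders):

blockdev --getsize64 /dev/sda    # exact capacity of the old disk, in bytes
blockdev --getsize64 /dev/sdb    # exact capacity of the candidate replacement
lsblk -b -d -o NAME,SIZE,MODEL   # or list all disks at once, sizes in bytes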

Reply
Fx_Gamer11 and Babygirllvvs
Fx_Gamer11 and Babygirllvvs - 18.11.2023 09:50

Hey man, not to pester you, but what's up with the shop remodel? Any updates? I would love to see more on it.

Reply
Ewen Chan
Ewen Chan - 18.11.2023 06:45

Two things:

1) ZFS on root is good on Proxmox IF you do NOT intend on passing a GPU through, and/or if you are willing to deal with the additional complexities that ZFS root presents when you're updating the GRUB boot command line options (which is required for GPU passthrough).

2) Replacing the disk by the /dev/ name rather than the /dev/disk/by-id/ name is NOT the preferred way to do it, because if you have a system with multiple NVMe controllers, the way Debian/Proxmox enumerates those controllers and slots CAN change. Therefore, replacing the disk via zpool replace <<pool_name>> <<old_disk>> <<new_disk_by_id>> is the preferred way to execute the zpool replace command, as it no longer cares WHERE you put the disk (i.e. on different NVMe controllers), so long as it is present in the system.
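
For illustration, a by-id replacement looks roughly like this (a sketch; the pool and device names are placeholders):

zpool replace rpool nvme0n1p3 /dev/disk/by-id/nvme-ExampleVendor_SSD_1TB_SERIAL123-part3
zpool status rpool    # watch the resilver until it completes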

Reply
Adam Chandler
Adam Chandler - 18.11.2023 06:11

When do we get an updated video on the new studio?

Reply
gsrcrxsi
gsrcrxsi - 18.11.2023 04:07

The scrub command does not accept the -v option.

Reply
eDoc2020
eDoc2020 - 17.11.2023 21:17

This is the type of thing that should have a built-in GUI option. Especially since, as others have said, you did not copy the boot pool or records.

Reply
Adam Rohr
Adam Rohr - 17.11.2023 20:25

I have been watching your channel for a long time, and you have been very helpful on many topics, but this video is very misleading as you skipped a very critical step of replicating the partition setup so you could also mirror the boot partition. You acknowledge this in another comment, but it's not sticky and you have not made any attempt to correct this video. But the Vultr sponsorship is stuck at the top... I'm all for making money, but you should correct your video immediately as you are impacting people who may be using this advice and breaking their systems. The viewers are just as important as sponsors.

Reply
ofacesig
ofacesig - 17.11.2023 17:35

I currently run an NFS share on my TrueNAS and feed 10Gbps to my Proxmox box for VM storage. I don't like what I'm seeing on how Proxmox reports this stuff.

Reply
Klobi for President
Klobi for President - 17.11.2023 15:58

I stopped watching your channel a few years ago, I'm not sure why. Now that I need to store several terabytes of data that I don't want to sacrifice my D:/ drive for I guess I'm back. The more things change, the more they stay the same.

Reply
Feels Bad
Feels Bad - 17.11.2023 06:48

Please more ZFS content!!! Maybe some ZFS on root also? Not BSD but Linux ZFS on root? I would love to see some of that.

Reply
Darrell Patenaude
Darrell Patenaude - 17.11.2023 06:23

I love the show. I have actually thought about trying Proxmox for a while, just haven't yet. I hope in the future Proxmox can incorporate some sort of warning to let you know of ZFS issues, as I assume ZFS was only recently added to Proxmox. Also, for the less technical types, having a ZFS menu with these commands would do wonders, so one would not have to go to the command line and try to remember them to do a drive swap (or at least show the command in a ZFS menu).

Does Proxmox have an email option to notify you when things go wrong? I run two TrueNAS Scale boxes and I have email enabled to let me know when there are problems.
Maybe this is more of a wish list item, hoping Proxmox adds it in future releases.
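
Proxmox will forward root's mail if a mail transport is set up, and the ZFS event daemon can email on pool state changes. A minimal sketch, assuming the zfs-zed package is installed (the address is a placeholder):

# /etc/zfs/zed.d/zed.rc
ZED_EMAIL_ADDR="admin@example.com"   # where ZED sends fault/degraded notices
ZED_NOTIFY_VERBOSE=1                 # also mail on scrub/resilver completion
# then restart the daemon so the new settings take effect
systemctl restart zfs-zed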

Reply
SoulkeepHL
SoulkeepHL - 17.11.2023 02:55

As someone who's never used ProxMox, I was wondering how this was a 15 minute video given that ProxMox has always appeared to have a nice WebUI. And then the command line tools came out and I went "Oh. Ohhh. OHHHHHH."

Reply
frank wong
frank wong - 17.11.2023 02:32

Software has hardware requirements: technically they do exist.
You need a SATA controller or PCIe lanes, depending on drive type.

PCIe lanes are not cheap 😢

Reply
C S
C S - 16.11.2023 19:03

I respect Proxmox but would prefer just a standalone TrueNAS Core. I guess I'm a newb?

Reply
TevisC
TevisC - 16.11.2023 17:20

Unraid supports ZFS natively now. Drive health is very easily monitored.

Reply
BillClintonIsRapist
BillClintonIsRapist - 16.11.2023 16:34

I use Zabbix to monitor Proxmox via "Linux agent" and Proxmox API template. Would make for a good video. Make sure to build your own Zabbix and not use the evaluation VM image

Reply
Greg Liming
Greg Liming - 16.11.2023 16:07

Wow... I could've used this video 3 weeks ago.

Reply
Marco Genovesi
Marco Genovesi - 16.11.2023 15:14

IMHO this video is a bit rough, a bit rushed. No mention of setting up email notifications for events and ZFS (very important to get notified of failures), and no mention of the fact that this is a root pool, so you have to follow a different procedure to create the boot partition and update the config so Proxmox re-generates the boot files there too.

Reply
TAP7a
TAP7a - 16.11.2023 14:52

Haircut looking sharp

Reply
DasBreaker
DasBreaker - 16.11.2023 14:45

I don't like ZFS on Proxmox in particular. Every install keeps about 50% of the drive for the system and the other half for the pool. Since I don't need 500GB for Proxmox itself, I either remove the ZFS pool entirely or extend it, which is definitely not easy. I'd rather go through a Debian netinstall for a clean LVM setup and then install Proxmox over it, instead of dealing with ZFS from the ISO.

ZFS on a dedicated NAS is a completely different thing though, so don't get me wrong. I like ZFS, but not in combination with Proxmox.

Reply
GourmetSaint
GourmetSaint - 16.11.2023 14:13

What about the boot partitions? There's a guide in the Proxmox docs, under ZFS on Linux, which describes the sgdisk copying of the partitions before the zpool replace.

Reply
Richard Parker
Richard Parker - 16.11.2023 13:33

I've been casually looking for a while to try to find a disk format/system that will take care of a bunch of drives, allow you to add or remove a drive at will, of various sizes, and report any errors as they occur. It should try to maximise throughput (read and write) while having redundancy, so long as there are at least 2 drives. Ideally with a web interface.

I know, moon on a stick.

Reply
Marco2G
Marco2G - 16.11.2023 12:47

I've had errors on my large pool and I reset the disk twice in many years. It's still going and it's been months without a single error on that disk. Really interesting. I have a replacement new in box ready if I ever do feel the need to replace a disk but so far... nothing. That's over an array of 12 disks. I mean sure, my disks aren't worked very hard but still I'm impressed.

Reply
Krzysztof Maciejewski
Krzysztof Maciejewski - 16.11.2023 12:11

More Proxmox content is always nice to watch :)

Reply
Wolfgang Demeter
Wolfgang Demeter - 16.11.2023 10:53

Good video, but didn't you forget to replicate the partition table of the old / known-good device (-part3 for rpool), and instead use the whole new NVMe device for rpool? If your remaining old NVMe device fails, you now have no boot partition to boot from on your newly added NVMe. Or am I completely wrong here?

These are the steps I usually take to replace a failed Proxmox ZFS rpool disk:

1) Replace the physical failed/offline drive with /dev/sdc (for example)

Initialize Disk
2) From the WebUI, Servername -> Disks -> Initialize Disk with GPT (/dev/sdc) OR gdisk /dev/sdc --> o (new empty GPT) --> w (write)

Copy the partition table from /dev/sda (known good device) to /dev/sdc
3) sgdisk --replicate=/dev/sdc /dev/sda

Ensure the GUIDs are randomized
4) sgdisk --randomize-guids /dev/sdc

Install GRUB on the new disk
5) grub-install /dev/sdc

Then replace the disk in the ZFS pool
6) zpool replace rpool /dev/sdc3
OR zpool attach rpool /dev/sda3 /dev/sdc3 --> sda3 known good device / sdc3 new device

Maybe detach old disk from the ZFS pool
7) zpool detach rpool /dev/sdx3

Maybe install Proxmox Boot Tool to new device
8) proxmox-boot-tool status
proxmox-boot-tool format /dev/sdc2
proxmox-boot-tool init /dev/sdc2
proxmox-boot-tool clean

Reply
Hicknopunk
Hicknopunk - 16.11.2023 10:29

How you got zfs working in cpm is beyond me 😅

Reply
MarcoZ
MarcoZ - 16.11.2023 09:40

Do NOT use the simple /dev/ path when replacing a disk. There's no guarantee Linux will always use the same number for the same drive. There's a reason why rpool is set up with disk by-id.
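
For a non-root data pool that was already built (or repaired) with plain /dev names, it can be re-pointed at the stable paths afterwards. A sketch, assuming a pool named tank that is not currently in use:

zpool export tank
zpool import -d /dev/disk/by-id tank    # re-imports the pool using the by-id paths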

Reply
Volodymyr Fedorov
Volodymyr Fedorov - 16.11.2023 09:38

Hey @CraftComputing, I do not think it is fair to share with the community how to block the notifications that Proxmox shows when the non-enterprise repo is enabled. The staff of Proxmox Server Solutions GmbH have to pay their bills to continue developing the product.

Reply
bitter Medicine
bitter Medicine - 16.11.2023 09:16

correct answer: run screaming away from zfs and never look back

Reply
Rambozo Clown
Rambozo Clown - 16.11.2023 06:31

While software RAID is the future, I really did like my old Compaq hardware RAID servers. See a drive with a yellow light, pull it and replace it with a new one, job done. The controller does everything in the background, no need to even log in. The green light starts blinking as it rebuilds, when it's solid, it's done. I could even run redundant hot swap controllers in the cluster boxes.

Reply
Computers Cats and More
Computers Cats and More - 16.11.2023 06:21

ZFS may be better, but replacing a drive on a RAID controller is so much easier. I was stumped when I had my first ever drive failure, and all the instructions said to pull the drive and insert a new one. Didn't realize it was that easy.

Reply
pkt1213
pkt1213 - 16.11.2023 06:19

Looks like you got a haircut. For this episode we'll call you Fresh.

Reply
zasweqa pouytew
zasweqa pouytew - 16.11.2023 05:56

you are the only one that can entertain me on this app

Reply
Christ Schlacta
Christ Schlacta - 16.11.2023 05:47

...if you do that exact process on your rpool and nothing more, twice, once for each drive, you'll end up with an unbootable system. For your rpool, you need to partition the drive and properly install and configure the boot loader as well. Read up on proxmox-boot-tool. Your instructions are fine for a data only pool, but insufficient for rpool. I rarely suggest it, but you should take down and edit this video to either show the correct procedure for an rpool or use a standard data only pool and add that qualifier to the video

Reply
Bernd Eckenfels
Bernd Eckenfels - 16.11.2023 05:30

If you replace with the short disk name, will it still find the disk by ID later on when device names change?

Reply
bobruddy
bobruddy - 16.11.2023 04:45

Don't you have to worry about the UEFI partition? Or is that on the ZFS volume?

Reply
unijabnx2000
unijabnx2000 - 16.11.2023 04:43

If you drop to a shell and run mutt/mail, you should have gotten those degraded alerts emailed to the root user.
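
For anyone following along, root's local mailbox can be read straight from the shell. A sketch, assuming a mail client such as bsd-mailx or mutt is installed:

mail                      # run as root; opens root's local mailbox
mutt -f /var/mail/root    # or point mutt at the mailbox file directly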

Reply
SideSweep
SideSweep - 16.11.2023 04:06

Damn I love that Ad intro 😄!
[edit] You look different... Maybe even... Do I dare to say it... handsome? Do you have more hair? Or did you change hairstyle? WTF happened?!!

Reply
Vatharian
Vatharian - 16.11.2023 03:43

zpool-scrub doesn't have the -v argument...?

Reply
S.E. Ong
S.E. Ong - 16.11.2023 03:36

I have checksum errors on my VM image on a ZFS mirror... any idea how to find the cause and reset the counters to 0?
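
A common sequence here (a sketch, assuming the mirror is named tank): identify the affected files, scrub so the healthy mirror copy can repair what it can, and only clear the counters once the underlying cause (cable, RAM, failing disk) has been dealt with.

zpool status -v tank    # lists the files/objects with checksum errors
zpool scrub tank        # re-reads everything and repairs from the healthy mirror side
zpool clear tank        # resets the error counters afterwards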

Reply
Justin Lee
Justin Lee - 16.11.2023 03:34

$12 USD for 4 beers... Tallboys by the look of it... For Canadians, that's not bad. Downright approachable. xD

Reply
Praveen Premaratne
Praveen Premaratne - 16.11.2023 02:57

This video came at the perfect time for me, as my boot disk became degraded a few weeks ago. It'd be pretty good if there were a walk-through of migrating from a single disk to a mirror. AFAIK this isn't a thing, and I can't clone the drive because one specific path causes a disk error when I try to do anything on it, so I need to re-install the OS and migrate the VM specs over to the new boot drives.
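
For what it's worth, attaching a second disk to a single-disk vdev does turn it into a mirror; only the boot partitions need separate handling, as other comments describe. A sketch with placeholder device names:

zpool attach rpool /dev/disk/by-id/OLD-DISK-part3 /dev/disk/by-id/NEW-DISK-part3
zpool status rpool    # the resilver converts the single disk into a two-way mirror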

Reply
Richard25000
Richard25000 - 16.11.2023 02:56

I had a pool randomly decide to corrupt itself, with no drive failures..

It made FreeBSD kernel panic any time the pool was mounted (TrueNAS Core).

I thought maybe I'd boot Linux and try to mount it there (TrueNAS Scale).

Nope, managed to kernel panic Linux too...

Thankfully, I had replicated to another NAS, and recreated and restored the pool..

Painful, but it was a known restore timescale, rather than trying to fix a broken pool..
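
For reference, that kind of replication can be done with snapshots and zfs send/receive. A minimal sketch with placeholder pool, host, and dataset names:

zfs snapshot -r tank@nightly-2023-11-16
zfs send -R tank@nightly-2023-11-16 | ssh backup-nas zfs receive -F backuppool/tank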

Reply