- 26 Jan, 2024 1 commit
-
-
Ameer Hamza authored
If devid or physpath for a vdev changes between imports, ensure it is updated to the new value. Signed-off-by:
Ameer Hamza <ahamza@ixsystems.com>
-
- 23 Jan, 2024 2 commits
-
-
Pawel Jakub Dawidek authored
Descriptor leak can be easily reproduced by doing: # zpool import tank # sysctl kern.openfiles # zpool export tank; zpool import tank # sysctl kern.openfiles We were leaking four file descriptors on every import. Similar leak most likely existed when using file-based VDEVs. External-issue: https://reviews.freebsd.org/D43529 Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15630
-
Brian Behlendorf authored
If block cloning is disabled by default then enable it when running the bclone tests. Follow up to #15529. Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Closes #15796
-
- 19 Jan, 2024 1 commit
-
-
Val Packett authored
musl libc has deprecated LFS64 aliases, so bootstrapping FreeBSD tools under musl distros has been failing with stat64 errors. Apply the aliases under non-glibc Linux to fix this problem. Reviewed-by:
Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Val Packett <val@packett.cool> Closes #15780
-
- 17 Jan, 2024 4 commits
-
-
Tino Reichardt authored
Compiling on arm64 freebsd-13.2 and arm64 almalinux-8 brings currently this error: ``` CC tests/zfs-tests/cmd/clonefile.o tests/zfs-tests/cmd/clonefile.c:166:43: error: result of comparison of \ constant -1 with expression of type 'char' is always true \ [-Werror,-Wtautological-constant-out-of-range-compare] while ((c = getopt(argc, argv, "crfdq")) != -1) { ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~ 1 error generated. gmake[2]: *** [Makefile:8675: tests/zfs-tests/cmd/clonefile.o] Error 1 ``` Fix: use correct variable type `int`. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Rob Norris <robn@despairlabs.com> Signed-off-by:
Tino Reichardt <milky-zfs@mcmilk.de> Closes #15783
-
Tino Reichardt authored
Credential Implementation -> Condition Variables Implementation Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Tino Reichardt <milky-zfs@mcmilk.de> Closes #15782
-
Kevin Jin authored
Switch from cv_wait() to cv_wait_idle() in vdev_autotrim_wait_kick(), which should mitigate the high load average while waiting. Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Signed-off-by:
jxdking <lostking2008@hotmail.com> Closes #15781
-
Pawel Jakub Dawidek authored
If the destination file is mmaped and the mmaped region was already read, so it is cached, we need to update mmaped pages after successful clone using update_pages(). Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Pointed out by: Ka Ho Ng <khng@freebsd.org> Signed-off-by:
Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15772
-
- 16 Jan, 2024 8 commits
-
-
Rob N authored
In db4fc559 I messed up and changed this bit of code to set the inode atime to an uninitialised value, when actually it was just supposed to loading the atime from the inode to be stored in the SA. This changes it to what it should have been. Ensure times change by the right amount Previously, we only checked if the times changed at all, which missed a bug where the atime was being set to an undefined value. Now ensure the times change by two seconds (or thereabouts), ensuring we catch cases where we set the time to something bonkers Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #15762 Closes #15773
-
Lalufu authored
When building (s)rpm files through the Makefile, a directory structure is created in /tmp to hold the various files. In case the user running the command has overridden some of the RPM path settings through their user profile (for example in `~/.rpmmacros`), these paths do not line up with the configuration, and the build fails. Make sure all paths used are properly defined. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ralf Ertzinger <ralf@skytale.net> Closes #15756
-
youzhongyang authored
On Linux x86_64, kmem cache can have size up to 4M, however increasing spl_kmem_cache_slab_limit can lead to crash due to the size check inconsistency. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Youzhong Yang <yyang@mathworks.com> Closes #15757
-
Ameer Hamza authored
If the AUX vdev is added using UUID, importing the pool falls back AUX vdev to open it with disk name instead of UUID due to the absence of path information for AUX vdevs. Since AUX label now have path information, this PR adds path handling for it in `label_path`. Reviewed-by:
Umer Saleem <usaleem@ixsystems.com> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Signed-off-by:
Ameer Hamza <ahamza@ixsystems.com> Closes #15737
-
Ameer Hamza authored
Pool import logic uses vdev paths, so it makes sense to add path information on AUX vdev as well. Reviewed-by:
Umer Saleem <usaleem@ixsystems.com> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Signed-off-by:
Ameer Hamza <ahamza@ixsystems.com> Closes #15737
-
Ameer Hamza authored
When spare or l2cache (aux) vdev is added during pool creation, spa->spa_uberblock is not dumped until that point. Subsequently, the aux label is never synchronized after its initial creation, resulting in the uberblock label remaining undumped. The uberblock is crucial for lib_blkid in identifying the ZFS partition type. To address this issue, we now ensure sync of the uberblock label once if it's not dumped initially. Reviewed-by:
Umer Saleem <usaleem@ixsystems.com> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Signed-off-by:
Ameer Hamza <ahamza@ixsystems.com> Closes #15737
-
Rich Ercolani authored
zdb -R has a minor flaw in which it will not always print the full output of a decompressed block. Oops. While I was in there, I also reworked the logic so it won't try ZLE unless everything else fails, which will hopefully avoid the problem ZDB_NO_ZLE was intended to mitigate of reporting a lot of false positives of ZLE compressed blocks... Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Rich Ercolani <rincebrain@gmail.com> Closes #15723
-
Umer Saleem authored
For block cloning, if we mmap the cloned file and write from the map into the file, it triggers a panic in dbuf_redirty() on Linux. The same scenario causes data corruption on FreeBSD. Both these issues are fixed under PR#15656 and PR#15665. It would be good to add a test for this scenario in ZTS. The test program and issue was produced by @robn. Reviewed-by:
Pawel Jakub Dawidek <pawel@dawidek.net> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Reviewed-by:
Ameer Hamza <ahamza@ixsystems.com> Signed-off-by:
Umer Saleem <usaleem@ixsystems.com> Closes #15717
-
- 12 Jan, 2024 10 commits
-
-
Brian Behlendorf authored
Drop the no_memory() call from zpool_in_use() when reading the label fails and instead return the error to the caller. This prevents a misleading "internal error: out of memory" error when the label can't be read. This will result in is_spare() returning B_FALSE instead of aborting, which is already safely handled. Furthermore, on Linux it's possible for EREMOTEIO to returned by an NVMe device if the device has been low-level formatted and not rescanned. In this case we want to fallback to the legacy scanning method and read any of the labels we can. Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Issue #13538 Closes #15747
-
Benjamin Sherman authored
This change provides rpm spec macros to sign the zfs and spl kmods as the final step after the %install scriptlet. This is needed since the find-debuginfo.sh script strips out debug symbols plus signatures. Kernel module signing only occurs when the required files are present as typically required in the Linux source tree: - certs/signing_key.pem - certs/signing_key.x509 The method for overriding the default __spec_install_post macro is inspired by (and largely copied from) the Fedora kernel.spec. Reviewed-by:
Tony Hutter <hutter2@llnl.gov> Reviewed-by:
Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by:
Benjamin Sherman <benjamin@holyarmy.org> Closes #15744
-
Mark Johnston authored
For FreeBSD sysctls, we don't want the extra newline, since the sysctl(8) utility will format strings appropriately. Reviewed-by:
Rob Norris <robn@despairlabs.com> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Reported-by:
Peter Holm <pho@FreeBSD.org> Signed-off-by:
Mark Johnston <markj@FreeBSD.org> Closes #15719
-
Mark Johnston authored
sbuf_cpy() resets the sbuf state, which is wrong for sbufs allocated by sbuf_new_for_sysctl(). In particular, this code triggers an assertion failure in sbuf_clear(). Simplify by just using sysctl_handle_string() for both reading and setting the tunable. Fixes: 6930ecbb ("spa: make read/write queues configurable") Reviewed-by:
Rob Norris <robn@despairlabs.com> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Reported-by:
Peter Holm <pho@FreeBSD.org> Signed-off-by:
Mark Johnston <markj@FreeBSD.org> Closes #15719
-
Rich Ercolani authored
Profiling zdb -vvvvv on datasets with a lot of zstd blocks, we find ourselves spending quite a lot of time on malloc/free, because we allocate a 16M abd each call, and never free it, so we're leaking 16M per call as well. This seems sub-optimal. So let's just keep the buffer around and reuse it. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Rob Norris <robn@despairlabs.com> Signed-off-by:
Rich Ercolani <rincebrain@gmail.com> Closes #15721
-
Stefan Lendl authored
When running zfs share -a resetting the exports.d/zfs.exports makes sense the get a clean state. Truncating was also called with zfs mount which would not populate the file again. Add test to verify shares persist after mount -a. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Stefan Lendl <s.lendl@proxmox.com> Closes #15607 Closes #15660
-
Brian Behlendorf authored
Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15749
-
Rich Ercolani authored
zdb -R with :d tries to use gzip decompression 9 times per size. There's absolutely no reason for that, they're all the same decompressor. Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Rich Ercolani <rincebrain@gmail.com> Closes #15726
-
Mark Johnston authored
In general, VOPs must not load the "z_log" field until having called zfs_enter_verify_zp(). Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Mark Johnston <markj@FreeBSD.org> Closes #15752
-
Mark Johnston authored
We need to wait until after having done a zfs_enter() to load some fields from the zfsvfs structure. Otherwise a use-after-free is possible in the face of a concurrent rollback. Other functions in this file are careful to avoid this bug, I believe this is the only instance. Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Mark Johnston <markj@FreeBSD.org> Closes #15752
-
- 09 Jan, 2024 7 commits
-
-
gofaster authored
This commit adds the zed_notify_gotify() function and hooks it into zed_notify(). This will allow ZED to send notifications to a self-hosted Gotify service, which can be received on a desktop or mobile device. It is configured with ZED_GOTIFY_URL, ZED_GOTIFY_APPTOKEN and ZED_GOTIFY_PRIORITY variables in zed.rc. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
gofaster <felix.gofaster@gmail.com> Closes #15693
-
Alexander Motin authored
Two block pointers in livelist pointing to the same location may be caused not only by dedup, but also by block cloning. We should not assert D bit set in them. Two block pointers in livelist pointing to the same location may have different logical birth time in case of dedup or cloning. We should assert identical physical birth time instead. Assert identical physical block size between pointers in addition to checksum, since that is what checksums are calculated on. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15732
-
Alexander Motin authored
- Fail if source block is smaller than destination. We can only grow blocks, not shrink them. - Fail if we do not have full znode range lock. In that case grow is not even called. We should improve zfs_rangelock_cb() somehow to know when cloning needs to grow the block size unlike write. - Fail of we tried to resize, but failed. There are many reasons for it to fail that we can not predict at this level, so be ready for them. Unlike write, that may proceed after growth failure, block cloning can't and must return error. This fixes assertion inside dmu_brt_clone() when it sees different number of blocks held in destination than it got block pointers. Builds without ZFS_DEBUG returned EXDEV, so are not affected much. Reviewed-by:
Pawel Jakub Dawidek <pawel@dawidek.net> Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15724 Closes #15735
-
Kent Ross authored
This function decompresses to two buffers and then compares them to check whether the (opaque) decompression process filled the whole buffer. Previously it began with lbuf uninitialized and lbuf2 filled with pseudorandom data. This neither guarantees that any bytes not written by the compressor would be different, nor seems incredibly sound otherwise! After these changes, instead of filling one buffer with generated pseudorandom data we overwrite each buffer with completely different data. This should remove the possibility of low-probability failures, as well as make the process simpler and cheaper. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Rich Ercolani <rincebrain@gmail.com> Signed-off-by:
Kent Ross <k@mad.cash> Closes #15733
-
Jose Luis Duran authored
Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Jose Luis Duran <jlduran@gmail.com> Closes #15727
-
Alexander Motin authored
While picking parts from #14909 I've missed Linux tracing specific ones, that went unnoticed in default configurations, but breaks the build in some. Reviewed-by:
Ameer Hamza <ahamza@ixsystems.com> Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15730
-
Shengqi Chen authored
This patch adds check for `kernel_neon_*` symbols on arm and arm64 platforms to address the following issues: 1. Linux 6.2+ on arm64 has exported them with `EXPORT_SYMBOL_GPL`, so license compatibility must be checked before use. 2. On both arm and arm64, the definitions of these symbols are guarded by `CONFIG_KERNEL_MODE_NEON`, but their declarations are still present. Checking in configuration phase only leads to MODPOST errors (undefined references). Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Shengqi Chen <harry-chen@outlook.com> Closes #15711 Closes #14555 Closes: #15401
-
- 27 Dec, 2023 1 commit
-
-
Mark Johnston authored
- Mark some parameters to zpool_power*() as unused. - Add a stub zpool_disk_wait(). Fixes: a9520e6e ("zpool: Add slot power control, print power status") Signed-off-by:
Mark Johnston <markj@FreeBSD.org> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Reviewed-by:
Tony Hutter <hutter2@llnl.gov>
-
- 26 Dec, 2023 1 commit
-
-
Pawel Jakub Dawidek authored
The test mostly focus on testing various corner cases. The tests take a long time to run, so for the common.run runfile we randomly select a hundred tests. To run all the bclone tests, bclone.run runfile should be used. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Pawel Jakub Dawidek <pawel@dawidek.net> Closes #15631
-
- 21 Dec, 2023 4 commits
-
-
Brian Behlendorf authored
On some systems we already have blkdev_get_by_path() with 4 args but still the old FMODE_EXCL and not BLK_OPEN_EXCL defined. The vdev_bdev_mode() function was added to handle this case but there was no generic way to specify exclusive access. Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Closes #15692
-
chrisperedun authored
While 763ca47f closes the situation of block cloning creating unencrypted records in encrypted datasets, existing data still causes panic on read. Setting zfs_recover bypasses this but at the cost of potentially ignoring more serious issues. Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Chris Peredun <chris.peredun@ixsystems.com> Closes #15677
-
Alexander Motin authored
Track history in context of bursts, not individual log blocks. It allows to not blow away all the history by single large burst of many block, and same time allows optimizations covering multiple blocks in a burst and even predicted following burst. For each burst account its optimal block size and minimal first block size. Use that statistics from the last 8 bursts to predict first block size of the next burst. Remove predefined set of block sizes. Allocate any size we see fit, multiple of 4KB, as required by ZIL now. With compression enabled by default, ZFS already writes pretty random block sizes, so this should not surprise space allocator any more. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #15635
-
Tony Hutter authored
Add `zpool` flags to control the slot power to drives. This assumes your SAS or NVMe enclosure supports slot power control via sysfs. The new `--power` flag is added to `zpool offline|online|clear`: zpool offline --power <pool> <device> Turn off device slot power zpool online --power <pool> <device> Turn on device slot power zpool clear --power <pool> [device] Turn on device slot power If the ZPOOL_AUTO_POWER_ON_SLOT env var is set, then the '--power' option is automatically implied for `zpool online` and `zpool clear` and does not need to be passed. zpool status also gets a --power option to print the slot power status. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Mart Frauenlob <AllKind@fastest.cc> Signed-off-by:
Tony Hutter <hutter2@llnl.gov> Closes #15662
-
- 20 Dec, 2023 1 commit
-
-
Rob N authored
We are finding that as customers get larger and faster machines (hundreds of cores, large NVMe-backed pools) they keep hitting relatively low performance ceilings. Our profiling work almost always finds that they're running into bottlenecks on the SPA IO taskqs. Unfortunately there's often little we can advise at that point, because there's very few ways to change behaviour without patching. This commit adds two load-time parameters `zio_taskq_read` and `zio_taskq_write` that can configure the READ and WRITE IO taskqs directly. This achieves two goals: it gives operators (and those that support them) a way to tune things without requiring a custom build of OpenZFS, which is often not possible, and it lets us easily try different config variations in a variety of environments to inform the development of better defaults for these kind of systems. Because tuning the IO taskqs really requires a fairly deep understanding of how IO in ZFS works, and generally isn't needed without a pretty serious workload and an ability to identify bottlenecks, only minimal documentation is provided. Its expected that anyone using this is going to have the source code there as well. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Rob Norris <rob.norris@klarasystems.com> Closes #15675
-