- 24 Feb, 2021 7 commits
-
-
Brian Behlendorf authored
Calling vdev_free() only requires that we acquire the spa config SCL_STATE_ALL locks, not the SCL_ALL locks. In particular, we need to avoid taking the SCL_CONFIG lock (included in SCL_ALL) as a writer since this can lead to a deadlock. The txg_sync_thread() may block in spa_txg_history_init_io() when taking the SCL_CONFIG lock as a reader when it detects there's a pending writer. Reviewed-by:
Igor Kozhukhov <igor@dilos.org> Reviewed-by:
Mark Maybee <mark.maybee@delphix.com> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Closes #11585
-
Tony Hutter authored
Given a DM device name, the old vdev_id script would extract any text after a 'p' as the partition number. It then appends "-part" + the partition number to the name, giving a by-vdev name like "L0-part5". This works fine if the DM name is like 'dm-2p5', but doesn't work if the DM name is a multipath name like "mpatha". In those cases it incorrectly matches the 'p' in "mpatha", giving by-vdev names like "L0-partatha". This patch fixes the issue by making the partition regex match stricter. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Tony Hutter <hutter2@llnl.gov> Closes #11637
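The difference between the loose and the stricter match can be sketched as follows. This is an illustration in Python, not the vdev_id shell script itself; the function names and the exact pattern (`p` followed only by digits at the end of the name) are assumptions, since the commit message does not quote the new regex.

```python
import re

# Hypothetical stricter pattern: a 'p' followed only by digits at the very end.
STRICT_PART = re.compile(r"p(\d+)$")


def loose_partition_suffix(dm_name):
    """Model of the old behaviour: treat any text after the first 'p' as the partition."""
    _, sep, rest = dm_name.partition("p")
    return f"-part{rest}" if sep else ""


def strict_partition_suffix(dm_name):
    """Model of a stricter match: only accept 'p<digits>' at the end of the name."""
    match = STRICT_PART.search(dm_name)
    return f"-part{match.group(1)}" if match else ""
```

With the loose version, `mpatha` yields `-partatha`, which matches the broken by-vdev name described in the commit message; with the stricter version it yields no suffix, while `dm-2p5` still yields `-part5`.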
-
Brian Behlendorf authored
On Linux increase the maximum allowed size of the src nvlist which can be passed to the /dev/zfs ioctl. Originally, this was set to a maximum of KMALLOC_MAX_SIZE (4M) because it was kmalloc'd. Since that time it's been converted to a vmalloc so that's no longer a hard limit, and it's desirable for `zfs send/recv` to allow larger nvlists so more snapshots can be sent at once. Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Closes #6572 Closes #11638
-
Prakash Surya authored
This change modifies the behavior of how we determine how much slop space to use in the pool, such that now it has an upper limit. The default upper limit is 128G, but is configurable via a tunable. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Prakash Surya <prakash.surya@delphix.com> Closes #11023
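A simplified model of a capped slop-space calculation is shown below. Only the 128G default upper limit comes from the commit message; the shift of 5, the 128M lower bound, and the parameter names are assumptions made for illustration, and the real calculation in the pool code may include further adjustments.

```python
MiB = 1 << 20
GiB = 1 << 30


def slop_space(pool_size, slop_shift=5, min_slop=128 * MiB, max_slop=128 * GiB):
    """Return the reserved slop space in bytes for a pool of pool_size bytes.

    slop_shift and min_slop are assumed values; max_slop mirrors the
    128G default upper limit mentioned in the commit message.
    """
    slop = pool_size >> slop_shift          # a fixed fraction of the pool
    slop = min(slop, max_slop)              # new: never exceed the upper limit
    # Small pools: keep at least min_slop, but never more than half the pool.
    slop = max(slop, min(pool_size >> 1, min_slop))
    return slop
```

Under these assumptions a 1 TiB pool reserves 32 GiB, while an 8 TiB pool, which would otherwise reserve 256 GiB, is capped at 128 GiB.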
-
Ryan Moeller authored
Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ryan Moeller <ryan@iXsystems.com> Closes #11636
-
Ryan Moeller authored
gmake install fails when zpool.d compat links already exist. Force the symlinks to be recreated if already present. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ryan Moeller <ryan@iXsystems.com> Closes #11633
-
Cedric Maunoury authored
The behavior of a NULL fromsnap was inadvertently changed for a doall send when the send/recv logic in libzfs was updated. Restore the previous behavior by correcting send_iterate_snap() to include all the snapshots in the nvlist for this case. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Cedric Maunoury <cedric.maunoury@gmail.com> Closes #11608
-
- 21 Feb, 2021 4 commits
-
-
Adam D. Moss authored
Using zfs-sh -u on Linux will fail with an inaccurate message when the zfs modules are already unloaded. Deal with the case where a module is already unloaded; its USE_COUNT will be the empty string. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Adam Moss <c@yotes.com> Closes #11627
-
fbynite authored
This prevents a panic after a SLOG add/removal on the root pool followed by a zpool scrub. When a SLOG is removed, a hole takes its place - the vdev_ops for a hole is vdev_hole_ops, which defines the handler functions of vdev_op_hold and vdev_op_rele as NULL. This bug has been reported in illumos and FreeBSD, though with a different trigger in the FreeBSD report. Credit for this patch goes to Patrick Mooney <pmooney@pfmooney.com> Obtained from: illumos-gate commit: c65bd18728f34725 External-issue: https://www.illumos.org/issues/12981 External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252396 Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Rob Wing <rob.fx907@gmail.com> Closes #11623
-
Tony Hutter authored
A multipathed disk will have several 'underlying' paths to the disk. For example, multipath disk 'dm-0' may be made up of paths: /dev/{sda,sdb,sdc,sdd}. On many enclosures those underlying sysfs paths will have a symlink back to their enclosure device entry (like 'enclosure_device0/slot1'). This is used by the statechange-led.sh script to set/clear the fault LED for a disk, and by 'zpool status -c'. However, on some enclosures, those underlying paths may not all have symlinks back to the enclosure device. Perhaps only two out of four of them do. This patch updates zfs_get_enclosure_sysfs_path() to favor returning paths that have symlinks back to their enclosure devices, rather than just returning the first path. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Tony Hutter <hutter2@llnl.gov> Closes #11617
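The selection rule described above can be modelled in a few lines. This is a sketch over plain data, not the libzfs implementation: the function name and the `(device, enclosure_link)` tuple layout are hypothetical, with `None` standing in for a path that has no enclosure symlink.

```python
def pick_underlying_path(paths):
    """Choose an underlying path, preferring one with an enclosure symlink.

    paths is a list of (device, enclosure_link) tuples; enclosure_link is
    None when the path has no symlink back to its enclosure device.
    """
    for device, enclosure_link in paths:
        if enclosure_link is not None:
            return device                      # first path with a symlink wins
    # No path has a symlink: fall back to the first path, as before.
    return paths[0][0] if paths else None
```

For a disk whose first two paths lack the symlink, this returns the third path rather than the first, so the fault LED logic has an enclosure entry to work with.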
-
Brian Atkinson authored
Making uio_impl.h the common header interface between Linux and FreeBSD so both OS's can share a common header file. This also helps reduce code duplication for zfs_uio_t for each OS. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Brian Atkinson <batkinson@lanl.gov> Closes #11622
-
- 20 Feb, 2021 6 commits
-
-
Christian Schwarz authored
I think this is the behavior that most users expect. Future work: have a separate flag, e.g., -O, to specify separate set_global_vars for the zdb child than for the ztest children. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by:
Christian Schwarz <me@cschwarz.com> Closes #11602
-
Christian Schwarz authored
Without set_global_var() in the child processes the -o option provides little use. Before this change set_global_var() was called as a side-effect of getopt processing which only happens for the parent ztest process. This change limits the set of options that can be set and makes them available to the child through ztest_shared_opts_t. Future work: support arbitrary option count and length. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by:
Christian Schwarz <me@cschwarz.com> Closes #11602
-
Christian Schwarz authored
Also fixes leak of the dlopen handle in the error case. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by:
Christian Schwarz <me@cschwarz.com> Closes #11602
-
Christian Schwarz authored
Without this patch I get the error "Setting global variables is only supported on little-endian systems" when using `zdb -o` on my amd64 machine. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Pavel Zakharov <pavel.zakharov@delphix.com> Signed-off-by:
Christian Schwarz <me@cschwarz.com> Closes #11602
-
Ryan Moeller authored
Add zfs_racct_* interfaces for platform-dependent read/write accounting. Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Signed-off-by:
Ryan Moeller <ryan@iXsystems.com> Closes #11613
-
Don Brady authored
Fix regression seen in issue #11545 where checksum errors were not being counted or showing up in a zpool event. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Don Brady <don.brady@delphix.com> Closes #11609
-
- 18 Feb, 2021 4 commits
-
-
Mark Johnston authored
First, the crypto request completion handler contains a bug in that it fails to reset fs_done correctly after the request is completed. This is only a problem for asynchronous drivers. Second, some hardware drivers have input constraints which ZFS does not satisfy. For instance, ccp(4) apparently requires the AAD length for AES-GCM to be a multiple of the cipher block size, and with qat(4) the AES-GCM AAD length may not be longer than 240 bytes. FreeBSD's generic crypto framework doesn't have a mechanism to automatically fall back to a software implementation if a hardware driver cannot process a request, and ZFS does not tolerate such errors. The plan is to implement such a fallback mechanism, but with FreeBSD 13.0 approaching we should simply disable the use of hardware drivers for now. Reviewed-by:
Ryan Moeller <ryan@iXsystems.com> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Signed-off-by:
Mark Johnston <markj@FreeBSD.org> Closes #11612
-
Andriy Gapon authored
That happens because of an off-by-one mistake. share_mount_one_cb() calls report_mount_progress(current=sm_done) after having incremented sm_done by one. Then report_mount_progress() increments the parameter again. It appears that that logic became obsolete after commit a10d50f9, parallel zfs mount. On FreeBSD I observe that zfs mount -a -v prints, for example, "(null): (201/248)". That happens because set_progress_header() is never called. With this change the output becomes correct: "Mounting ZFS filesystems: (209/248)". Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Andriy Gapon <avg@FreeBSD.org> Closes #11607
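The double increment can be reproduced with a small model. This is not the libzfs code: `progress_lines` and its `extra_increment` flag are hypothetical, and only the pattern (the caller increments its counter, then the reporting step increments again) follows the commit message.

```python
def progress_lines(total, extra_increment):
    """Return the progress strings printed while mounting `total` filesystems.

    extra_increment=True models the old behaviour, where the reporting
    function added one to a counter the caller had already incremented.
    """
    lines = []
    done = 0
    for _ in range(total):
        done += 1                               # caller bumps its counter (sm_done in the source)
        shown = done + 1 if extra_increment else done
        lines.append(f"({shown}/{total})")
    return lines
```

With the extra increment the count starts at 2 and finishes one past the total; without it the count runs from 1 to the total.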
-
Ryan Libby authored
Remove a function that became unused after the refactoring in e2af2acc. Reviewed-by:
Ryan Moeller <ryan@ixsystems.com> Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ryan Libby <rlibby@FreeBSD.org> Closes #11614
-
Colm authored
Property to allow sets of features to be specified; for compatibility with specific versions / releases / external systems. Influences the behavior of 'zpool upgrade' and 'zpool create'. Initial man page changes and test cases included. Brief synopsis: zpool create -o compatibility=off|legacy|file[,file...] pool vdev... compatibility = off : disable compatibility mode (enable all features) compatibility = legacy : request that no features be enabled compatibility = file[,file...] : read features from specified files. Only features present in *all* files will be enabled on the resulting pool. Filenames may be absolute, or relative to /etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d (/etc checked first). Only affects zpool create, zpool upgrade and zpool status. ABI changes in libzfs: * New function "zpool_load_compat" to load and parse compat sets. * Add "zpool_compat_status_t" typedef for compatibility parse status. * Add ZPOOL_PROP_COMPATIBILITY to the pool properties enum * Add ZPOOL_STATUS_COMPATIBILITY_ERR to the pool status enum An initial set of base compatibility sets are included in cmd/zpool/compatibility.d, and the Makefile for cmd/zpool is modified to install these in $pkgdatadir/compatibility.d and to create symbolic links to a reasonable set of aliases. Reviewed-by: ericloewe Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Richard Laager <rlaager@wiktel.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Colm Buckley <colm@tuatha.org> Closes #11468
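The "only features present in *all* files" rule amounts to a set intersection. The sketch below assumes a simple file format of one feature name per line with `#` comments, which the commit message does not spell out; the function names are hypothetical and are not the libzfs `zpool_load_compat` interface.

```python
def features_from_text(text):
    """Parse one compatibility file body into a set of feature names.

    Assumed format: one feature per line, '#' starts a comment.
    """
    features = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            features.add(name)
    return features


def enabled_features(file_texts):
    """Return the features present in every file (empty set if no files)."""
    sets = [features_from_text(text) for text in file_texts]
    return set.intersection(*sets) if sets else set()
```

Given one file listing `async_destroy` and `lz4_compress` and another listing `async_destroy` and `zstd_compress`, only `async_destroy` would be enabled under this rule.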
-
- 17 Feb, 2021 2 commits
-
-
Brian Behlendorf authored
Rather than conditionally compiling out the edonr code for FreeBSD update zfs_mod_supported_feature() to indicate this feature is unsupported. This ensures that all spa features are defined on every platform, even if they are not supported. Reviewed-by:
Ryan Moeller <ryan@ixsystems.com> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Closes #11605 Issue #11468
-
José Luis Salvador Rufo authored
There are two issues that don't allow ZFS to be compiled using uClibc: `backtrace()`, and `program_invocation_short_name` as a `const`. This patch adds uClibc to the conditionals in the same way as is already done for Glibc for `backtrace()`, and removes the external param `program_invocation_short_name` because it's only used here in the whole project. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
José Luis Salvador Rufo <salvador.joseluis@gmail.com> Closes #11600
-
- 15 Feb, 2021 1 commit
-
-
Ryan Moeller authored
FreeBSD's zfsd fails to build after e2af2acc due to strict type checking errors from the implicit conversion between bool and boolean_t in the inline predicate definitions in abd.h. Use conditionals to return the correct value type from these functions. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Igor Kozhukhov <igor@dilos.org> Signed-off-by:
Ryan Moeller <freqlabs@FreeBSD.org> Closes #11592
-
- 10 Feb, 2021 1 commit
-
-
Brian Behlendorf authored
Increase the Linux-Maximum version in the META file to 5.11. All of the required compatibility patches have been merged. Reviewed-by:
George Melikov <mail@gmelikov.ru> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Closes #11586
-
- 09 Feb, 2021 3 commits
-
-
Arshad Hussain authored
Within function sas_handler() userspace commands like '/usr/sbin/multipath' have been replaced with sourcing device details from within sysfs which reduced a significant amount of overhead and processing time. Multiple JBOD enclosures and their order are sourced from the bsg driver (/sys/class/enclosure) to isolate chassis top-level expanders, which are then dynamically indexed based on host channel of the multipath subordinate disk member device being processed. Additionally added a "mixed" mode for slot identification for environments where a ZFS server system may contain SAS disk slots where there is no expander (direct connect to HBA) while an attached external JBOD with an expander has different slot identifier methods. How Has This Been Tested? ~~~~~~~~~~~~~~~~~~~~~~~~~ Testing was performed on an AMD EPYC based dual-server high-availability multipath environment with multiple HBAs per ZFS server and four SAS JBODs. The two primary JBODs were multipath/cross-connected between the two ZFS-HA servers. The secondary JBODs were daisy-chained off of the primary JBODs using aligned SAS expander channels (JBOD-0 expanderA--->JBOD-1 expanderA, JBOD-0 expanderB--->JBOD-1 expanderB, etc). Pools were created, exported and re-imported, imported globally with 'zpool import -a -d /dev/disk/by-vdev'. Low level udev debug outputs were traced to isolate and resolve errors. Result: ~~~~~~~ Initial testing of a previous version of this change showed how reliance on userspace utilities like '/usr/sbin/multipath' and '/usr/bin/lsscsi' was exacerbated by increasing numbers of disks and JBODs. With four 60-disk SAS JBODs and 240 disks the time to process a udevadm trigger was 3 minutes 30 seconds during which nearly all CPU cores were above 80% utilization. By switching reliance on userspace utilities to sysfs in this version, the udevadm trigger processing time was reduced to 12.2 seconds and negligible CPU load. This patch also fixes a few shellcheck complaints. Reviewed-by:
Gabriel A. Devenyi <gdevenyi@gmail.com> Reviewed-by:
Tony Hutter <hutter2@llnl.gov> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by:
Jeff Johnson <jeff.johnson@aeoncomputing.com> Signed-off-by:
Jeff Johnson <jeff.johnson@aeoncomputing.com> Signed-off-by:
Arshad Hussain <arshad.hussain@aeoncomputing.com> Closes #11526
-
khng300 authored
zfs_znode_update_vfs is a more platform-agnostic name than zfs_inode_update. Besides that, the function's prototype is moved to include/sys/zfs_znode.h as the function is also used in common code. Reviewed-by:
Ryan Moeller <ryan@ixsystems.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ka Ho Ng <khng300@gmail.com> Sponsored by: The FreeBSD Foundation Closes #11580
-
Kleber Tarcísio authored
The first time through the loop prevdb and prevhdl are NULL. They are then both set, but only prevdb is checked. Add an ASSERT to make it clear that prevhdl must be set when prevdb is. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Kleber <klebertarcisio@yahoo.com.br> Closes #10754 Closes #11575
-
- 08 Feb, 2021 1 commit
-
-
Antonio Russo authored
3d40b655 refactored zfs_vnops.c, which shared much code verbatim between Linux and BSD. After a successful write, the suid/sgid bits are reset, and the mode to be written is stored in newmode. On Linux, this was propagated to both the in-memory inode and znode, which is then updated with sa_update. 3d40b655 accidentally removed the initialization of newmode, which happened to occur on the same line as the inode update (which has been moved out of the function). The uninitialized newmode can be saved to disk, leading to a crash on stat() of that file, in addition to a merely incorrect file mode. Reviewed-by:
Ryan Moeller <ryan@ixsystems.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Antonio Russo <aerusso@aerusso.net> Closes #11474 Closes #11576
-
- 05 Feb, 2021 2 commits
-
-
наб authored
When all pools are exported ZFS will generate an empty cache file. This will cause the import service to fail, which is sub-optimal, since this means that dracut fails, and it is necessary to run `zpool import -a` to boot, delete the file, and regenerate+reinstall the initrd. This resolves the issue by treating a zero-length cache file the same as a missing cache file. This aligns the behavior with that of the `zpool` command itself. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Richard Laager <rlaager@wiktel.com> Signed-off-by:
Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #11568
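The check described above, treating an empty cache file like a missing one, can be sketched as follows. This is a Python illustration rather than the actual service or dracut logic, and `cache_file_usable` is a hypothetical name.

```python
import os


def cache_file_usable(path):
    """Return True only if the cache file exists and is non-empty.

    A missing file and a zero-length file are treated the same way.
    """
    try:
        return os.path.getsize(path) > 0
    except OSError:
        # Missing or unreadable file: same outcome as an empty one.
        return False
```

A caller would import from the cache file only when this returns True, and otherwise take the path it already uses when no cache file is present.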
-
nssrikanth authored
The pool guid and vdev guid received by zfs_agent_post_event(), which calls zfs_retire_recv(), are normally non-zero. However, later in this same method they may be unconditionally reset to zero by the code which is intended to handle multipath, spare and l2arc vdevs. This will result in the EC_dev_remove not being handled. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by:
Vipin Kumar Verma <vipin.verma@hpe.com> Signed-off-by:
Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com> Closes #11564
-
- 04 Feb, 2021 1 commit
-
-
Brian Behlendorf authored
Clarify how to include snapshots in the `zpool list` output by referencing the full name of the `listsnapshots` pool property, and the `zpool list -t snapshot` option. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
George Melikov <mail@gmelikov.ru> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Closes #11562 Closes #11565
-
- 02 Feb, 2021 4 commits
-
-
Christian Schwarz authored
Expand the comments to make it clear exactly what is guaranteed by dmu_tx_assign() and txg_hold_open(). Additionally, update the comment which refers to txg_exit() when it should reference txg_rele_to_sync(). Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Christian Schwarz <me@cschwarz.com> Closes #11521
-
George Melikov authored
Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
George Melikov <mail@gmelikov.ru> Closes #11554
-
George Melikov authored
Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
George Melikov <mail@gmelikov.ru> Closes #11554
-
George Melikov authored
Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
George Melikov <mail@gmelikov.ru> Closes #11554
-
- 30 Jan, 2021 2 commits
-
-
Brian Behlendorf authored
This compatibility code is no longer needed. For a while iov_iter_init_compat() was used by zfs_uio_prefaultpages() but this code should have been dropped as part of commit 83b91ae1. Take care of that oversight and remove it. Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Closes #11543
-
Matthew Ahrens authored
ABD's currently track their parent/child relationship. This applies to `abd_get_offset()` and `abd_borrow_buf()`. However, nothing depends on knowing this relationship, it's only used for consistency checks to verify that we are not destroying an ABD that's still in use. When we are creating/destroying ABD's frequently, the performance impact of maintaining these data structures (in particular the atomic increment/decrement operations) can be measurable. This commit removes this verification code on production builds, but keeps it when ZFS_DEBUG is set. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Brian Atkinson <batkinson@lanl.gov> Signed-off-by:
Matthew Ahrens <mahrens@delphix.com> Closes #11535
-
- 29 Jan, 2021 2 commits
-
-
nssrikanth authored
In the ZED zfs_retire agent, add a check to also handle distributed spare replacement for a faulted vdev. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by:
Vipin Kumar Verma <vipin.verma@hpe.com> Signed-off-by:
Mark Maybee <mark.maybee@hpe.com> Closes #11354 Closes #11355
-
Brian Atkinson authored
I originally applied a fix in #11539 to fix a parent's child references when a gang ABD is free'd. However, I did not take into account abd_gang_add_gang(). We still need to make sure to update the child references in this function as well. In order to resolve this I removed decreasing the gang ABD's size in abd_free_gang() as well as moved back the original placement of zfs_refcount_remove_many() in abd_free(). Reviewed-by:
Mark Maybee <mark.maybee@delphix.com> Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Brian Atkinson <batkinson@lanl.gov> Closes #11542
-