- 08 Jun, 2021 1 commit
-
-
Ryan Moeller authored
This enables ZED to auto-online vdevs that are not wholedisk managed by ZFS. Signed-off-by:
Ryan Moeller <ryan@iXsystems.com>
-
- 28 May, 2021 39 commits
-
-
Andrew authored
This implements NFSv4.1 (RFC 5661) ACLs in a manner compatible with vfs_nfs4acl_xattr in Samba and nfs4xdr-acl-tools. There are three key areas of change in this commit:
1) NFSv4 ACL management through the system.nfs4_acl_xdr xattr. Install an xattr handler for "system.nfs4_acl_xdr" that presents an xattr containing full NFSv4.1 ACL structures generated through rpcgen using the specification from the Samba project. This xattr is used by userspace programs to read and set permissions.
2) Add an i_op->permissions endpoint: zpl_permissions(). This is used by the VFS in Linux to determine whether to allow or deny an operation. Wherever possible, we try to avoid having to call zfs_access(). If the kernel has the NFSv4 patch for VFS, then perform a more complete check of the available access mask.
3) Add capability-based overrides to secpolicy_vnode_access2(). There are various situations in which an ACL may need to be overridden based on capabilities. This logic is almost directly copied from the Linux VFS. For instance, root needs to be able to always read and write ACLs (otherwise an admin can get locked out of files).
This commit was initially inspired by work from Paul B. Henson to implement NFSv4.0 (RFC 3530) ACLs in ZFS on Linux. Key areas of divergence are as follows:
- ACL specification, xattr format, xattr name
- Addition of handling for NFSv4 masks from Linux VFS
- Addition of ACL overrides based on capabilities
Signed-off-by:
Andrew Walker <awalker@ixsystems.com>
-
Ryan Moeller authored
"local" is reasonable enough to expect in a shell. Signed-off-by:
Ryan Moeller <ryan@iXsystems.com>
-
Andrew Walker authored
SB_LARGEXATTR is used in TrueNAS SCALE to indicate to the kernel that the filesystem supports large-size xattrs (greater than 64KiB). This flag is used to evaluate whether to allow large xattr read or write requests (up to 2 MiB). Signed-off-by:
Andrew Walker <awalker@ixsystems.com>
-
Andrew Walker authored
Signed-off-by:
Ryan Moeller <ryan@iXsystems.com>
-
Waqar Ahmed authored
Signed-off-by:
Waqar Ahmed <waqarahmedjoyia@live.com>
-
Waqar Ahmed authored
Signed-off-by:
Waqar Ahmed <waqarahmedjoyia@live.com>
-
Waqar Ahmed authored
Signed-off-by:
Ryan Moeller <ryan@iXsystems.com>
-
Ryan Moeller authored
-
Ryan Moeller authored
Signed-off-by:
Ryan Moeller <ryan@iXsystems.com>
-
Ryan Moeller authored
This allows parsing of zfs send progress by checking the process title. Doing so requires some changes to the send code in libzfs_sendrecv.c; primarily these changes move some of the accounting around, to allow for the code to be verbose as normal, or set the process title. Unlike BSD, setproctitle() isn't standard in Linux; I found a reference to it in libbsd, and included autoconf-related changes to test for that. Authored-by:
Sean Eric Fagan <sef@FreeBSD.org> Signed-off-by:
Ryan Moeller <ryan@iXsystems.com>
-
Matt Macy authored
The assert does not account for the case where there is a single buffer in the chain that is decompressed and has a valid checksum. Signed-off-by:
Matt Macy <mmacy@FreeBSD.org>
-
Ryan Moeller authored
FreeBSD historically has not cared about the xattr property; it was always treated as xattr=on. With xattr=on, xattrs are stored as files in a hidden xattr directory. With xattr=sa, xattrs are stored as system attributes and get cached in nvlists during xattr operations. This makes SA xattrs simpler and more efficient to manipulate. FreeBSD needs to implement the SA xattr operations for feature parity with Linux and to ensure that SA xattrs are accessible when migrated or replicated from Linux. Following the example set by Linux, refactor our existing extattr vnops to split off the parts handling dir style xattrs, and add the corresponding SA handling parts. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Signed-off-by:
Ryan Moeller <ryan@iXsystems.com> Closes #11997
-
Ryan Moeller authored
Convert use of ASSERT() to ASSERT0(), ASSERT3U(), ASSERT3S(), ASSERT3P(), and likewise for VERIFY(). In some cases it ended up making more sense to change the code, such as VERIFY on nvlist operations that I have converted to use fnvlist instead. In one place I changed an internal struct member from int to boolean_t to match its use. Some asserts that combined multiple checks with && in a single assert have been split to separate asserts, to make it apparent which check fails. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ryan Moeller <ryan@iXsystems.com> Closes #11971
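The benefit of splitting combined assertions, as described in the commit above, can be illustrated outside of C. The following is a minimal Python sketch, not the ZFS macros themselves: `assert3` and `check_buffer` are hypothetical names, loosely modelled on the `ASSERT3U(left, op, right)` style, where a failure reports both operand values.

```python
import operator

# Hypothetical helper for illustration only; not part of any ZFS API.
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def assert3(left, op, right):
    """Fail with a message that shows both operand values."""
    if not _OPS[op](left, right):
        raise AssertionError(f"{left!r} {op} {right!r} failed")


def check_buffer(offset, size, limit):
    # A single combined check such as
    #     assert offset >= 0 and offset + size <= limit
    # would not say which half failed. One check per condition does.
    assert3(offset, ">=", 0)
    assert3(offset + size, "<=", limit)
```

With separate checks, a failure names the condition that broke and the values involved, which is the stated motivation for splitting `&&` assertions.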
-
Brian Behlendorf authored
Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov>
-
Armin Wehrfritz authored
Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Armin Wehrfritz <dkxls23@gmail.com> Closes #12124
-
Rich Ercolani authored
configure on s390x has a key check fail with an error about a variable being used uninitialized. So let's initialize it. Reviewed-by:
Colin Ian King <colin.king@canonical.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Rich Ercolani <rincebrain@gmail.com> Closes #12126
-
Rich Ercolani authored
Just like #12087, the set_acl signature changed with all the bolted-on *userns parameters, which disabled set_acl usage, and caused #12076. Turn zpl_set_acl into zpl_set_acl and zpl_set_acl_impl, and add a new configure test for the new version. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Rich Ercolani <rincebrain@gmail.com> Closes #12076 Closes #12093
-
Rich Ercolani authored
In case of AIO failure, we should probably fall back to the old behavior and still work. Reviewed-by:
Ryan Moeller <ryan@ixsystems.com> Reviewed-by:
Alan Somers <asomers@gmail.com> Reviewed-by:
Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by:
Rich Ercolani <rincebrain@gmail.com> Closes #12032 Closes #12040
-
наб authored
Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12111
-
наб authored
No changes to the text itself. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12111
-
наб authored
Linux man-pages' mount(8) points at fcntl(2), as does mount(2), and support for it is little-used, deprecated, and configurable since 4.5. As far as I can tell, FreeBSD doesn't support nbmand at all; mandatory locks are mostly dead. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12111
-
наб authored
Reviewed-by:
Tony Hutter <hutter2@llnl.gov> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12103
-
наб authored
Reviewed-by:
Tony Hutter <hutter2@llnl.gov> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #12103
-
Alexander Motin authored
Previous commit added accounting for geom mode, but not for dev. In geom mode we actually have GEOM statistics, while in dev mode additional accounting actually makes more sense. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Ryan Moeller <ryan@iXsystems.com> Signed-off-by:
Alexander Motin <mav@FreeBSD.org> Closes #12097
-
Rich Ercolani authored
The change correctly handles BrokenPipeError and improves the associated tests. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
John Kennedy <john.kennedy@delphix.com> Signed-off-by:
Rich Ercolani <rincebrain@gmail.com> Closes #12037 Closes #12036
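The commit above does not show its code here, so the following is only a generic sketch of how a Python script can tolerate a reader that closes the pipe early (for example when output is piped to `head`). The function name `write_lines` is an assumption made for illustration.

```python
def write_lines(stream, lines):
    """Write lines to stream; return False if the reader went away."""
    try:
        for line in lines:
            stream.write(line + "\n")
        stream.flush()
    except BrokenPipeError:
        # The consumer closed its end of the pipe; stop quietly
        # instead of letting a traceback reach the terminal.
        return False
    return True
```

In a real script writing to `sys.stdout`, the interpreter may try to flush again at exit, so the Python documentation suggests redirecting stdout to `os.devnull` after catching the error.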
-
Alexander Motin authored
Reviewed-by:
Ryan Moeller <ryan@ixsystems.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Alexander Motin <mav@FreeBSD.org> Closes #12049
-
Alexander Motin authored
ZFS does not expect transient errors from crypto. For reads they are counted as checksum errors, while for writes they end in a panic. To avoid panicking on random low-memory conditions, retry ENOMEM errors in the OCF wrapper function. While there, remove the unneeded timeout and priority from msleep(). External-issue: https://reviews.freebsd.org/D30339 Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Mark Maybee <mark.maybee@delphix.com> Signed-off-by:
Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #12077
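The retry idea in the commit above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the FreeBSD OCF wrapper: `retry_on_enomem`, `operation` and `pause` are hypothetical names, and the unbounded loop is a simplification.

```python
import errno
import time


def retry_on_enomem(operation, pause=0.01):
    """Call operation() until it stops failing with ENOMEM.

    Any other error is treated as permanent and re-raised immediately.
    """
    while True:
        try:
            return operation()
        except OSError as exc:
            if exc.errno != errno.ENOMEM:
                raise
            # Transient low-memory condition: wait briefly and retry.
            time.sleep(pause)
```

The point of the design is that the caller never sees the transient error, so it cannot misread it as a checksum failure or a fatal write error.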
-
Rich Ercolani authored
I looked for a bit, and couldn't find any documentation on how to print all logged dbgmsg entries, just messages since the DTrace probe started, until @allanjude kindly pointed me toward the sysctl. So let's add that note where the DTrace probe is mentioned for FreeBSD, so other people can find it. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Ryan Moeller <ryan@iXsystems.com> Reviewed-by:
Allan Jude <allan@klarasystems.com> Signed-off-by:
Rich Ercolani <rincebrain@gmail.com> Closes #12113
-
vermavipinkumar authored
Propagate vdev child state to parents on invalid label. Add VDEV_AUX_BAD_LABEL to print_import_config(). Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Mark Maybee <mark.maybee@delphix.com> Co-authored-by:
Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com> Signed-off-by:
Vipin Kumar Verma <vipin.verma@hpe.com> Closes #12088
-
Rich Ercolani authored
Linux changed the tmpfile() signature again in torvalds/linux@6521f89, which in turn broke our HAVE_TMPFILE detection in configure. Update that macro to include the new case, and change the signature of zpl_tmpfile as appropriate. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Rich Ercolani <rincebrain@gmail.com> Closes: #12060 Closes: #12087
-
Brian Behlendorf authored
This change addresses two distinct scenarios which are possible when performing a sequential resilver to a dRAID pool with vdevs that contain silent unknown damage. In this circumstance the damage took the form of the devices being intentionally overwritten with zeros. However, it could also result from a device returning incorrect data while a sequential resilver was in progress.
Scenario 1) A sequential resilver is performed while all of the dRAID vdevs are ONLINE and there is silent damage present on the vdev being resilvered. In this case, nothing will be repaired by vdev_raidz_io_done_reconstruct_known_missing() because rc->rc_error isn't set on any of the raid columns. To address this, vdev_draid_io_start_read() has been updated to always mark the resilvering column as ESTALE for sequential resilver IO.
Scenario 2) Multiple columns contain silent damage for the same block and a sequential resilver is performed. In this case it's impossible to generate the correct data from parity unless all of the damaged columns are being sequentially resilvered (and thus only good data is used to generate parity). This is as expected and there's nothing which can be done about it. However, we need to be careful not to make the situation worse. Since we can't verify the data is actually good without a checksum, we must only repair the devices which are being sequentially resilvered. Otherwise, an incorrect repair to a device which previously contained good data could effectively lock in the damage and make reconstruction impossible. A check for this was added to vdev_raidz_io_done_verified() along with a new test case.
Lastly, this change updates the redundancy_draid_spare1 and redundancy_draid_spare3 test cases to be more representative of normal dRAID replacement operation. Specifically, what we care about is that the scrub run after a sequential resilver does not find additional blocks which need repair. This would indicate the sequential resilver failed to rebuild a section of one of the devices. Note also the tests were switched to using the verify_pool() function which still checks for checksum errors. Reviewed-by:
Mark Maybee <mark.maybee@delphix.com> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Closes #12061
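The Scenario 2 rule above (only repair devices that are being sequentially resilvered, because the data cannot be checksum-verified) can be sketched as a small filter. This is a simplified Python model, not the logic of vdev_raidz_io_done_verified(): the `Column` type, its fields, and `columns_to_repair` are hypothetical, and the non-sequential branch assumes checksum-verified data.

```python
from dataclasses import dataclass


@dataclass
class Column:
    device: str
    resilvering: bool  # device is a target of the sequential resilver
    mismatch: bool     # stored data differs from the reconstructed data


def columns_to_repair(columns, sequential_resilver):
    """Return the devices that are safe to rewrite."""
    if sequential_resilver:
        # No checksum is available, so the reconstructed data may be wrong.
        # Touch only the devices that are being rebuilt anyway.
        return [c.device for c in columns if c.mismatch and c.resilvering]
    # Assumption: outside a sequential resilver the data was verified,
    # so any mismatching column may be repaired.
    return [c.device for c in columns if c.mismatch]
```

Leaving the non-resilvering columns alone means a later scrub, which does have checksums, still has the original data to work with.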
-
Lauri Tirkkonen authored
Reviewed-by:
John Kennedy <john.kennedy@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Lauri Tirkkonen <lauri@hacktheplanet.fi> Closes #12064
-
Rich Ercolani authored
Renamed _fini too for symmetry. Suggested-by: @ensch Reviewed-by:
Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Rich Ercolani <rincebrain@gmail.com> Closes #12059 Closes: #11987 Closes: #12056
-
Alexander Motin authored
While the use of dynamic taskqs reduces the number of idle threads, a hardcoded 8 taskqs of each kind is overkill for small systems: it complicates CPU scheduling and increases I/O reordering while providing no real locking benefit there. On the other hand, 12*8 worker threads per kind can overload almost any system nowadays. For example, a pool of several fast SSDs with SHA256 checksums makes the system barely responsive during scrub, or, with dedup enabled, during large file deletion. To address both problems this patch introduces the ZTI_SCALE macro, similar to ZTI_BATCH but with multiple taskqs, depending on the number of CPUs, to be used where lock scalability is needed and request ordering matters less. The code creates a new taskq for every ~6 worker threads (fewer for small systems, more for very large ones), up to 80% of CPU cores (the previous 75% did not round down well). Both the number of threads and the threads per taskq are now tunable in case somebody really wants to use all of the system's power for ZFS. While some benchmarks do show a small reduction in peak performance (not large, especially on systems with SMT, where the second hardware threads contribute less than the first), they also show a dramatic latency reduction and much smoother user-space operation under high CPU usage by ZFS. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by:
Alexander Motin <mav@FreeBSD.org> Sponsored-By: iXsystems, Inc. Closes #11966
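The sizing arithmetic described above can be approximated as follows. This is a deliberately simplified Python sketch, not the ZTI_SCALE implementation: it uses a fixed 6 threads per taskq and does not model the "fewer for small systems, more for very large ones" adjustment. The function and parameter names are assumptions.

```python
def scaled_taskqs(ncpus, cpu_pct=80, threads_per_taskq=6):
    """Return (worker_threads, taskq_count) for a given CPU count."""
    # Cap the worker threads at a percentage of the CPU cores, rounding down.
    threads = max(1, ncpus * cpu_pct // 100)
    # Roughly one taskq per threads_per_taskq workers, rounded to nearest.
    taskqs = max(1, (threads + threads_per_taskq // 2) // threads_per_taskq)
    return threads, taskqs
```

Under these assumptions a 4-core system gets 3 threads in a single taskq, while a 64-core system gets 51 threads spread over 9 taskqs. The rounding remark in the commit also shows up here: on 5 cores, 75% rounds down to 3 threads whereas 80% gives 4.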
-
Brian Behlendorf authored
The redundancy_draid.ksh and redundancy_raidz.ksh tests were updated by commit 93c8e91f to additionally verify self-healing. This additional check increased the run time which can now occasionally exceed the default maximum timeout in the CI environment. To prevent this from causing failures increase the default timeout for the redundancy test cases. Reviewed-by:
John Kennedy <john.kennedy@delphix.com> Reviewed-by:
George Melikov <mail@gmelikov.ru> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Closes #12043
-
Paul Zuchowski authored
Use dsl_dataset_has_resume_receive_state() not dsl_dataset_is_zapified() to check if stream is resumable. Reviewed-by:
Matthew Ahrens <mahrens@delphix.com> Reviewed-by:
Alek Pinchuk <apinchuk@axcient.com> Reviewed-by:
Ryan Moeller <ryan@ixsystems.com> Signed-off-by:
Paul Zuchowski <pzuchowski@datto.com> Closes #12034
-
Ryan Moeller authored
Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Signed-off-by:
Ryan Moeller <ryan@iXsystems.com> Closes #11997
-
Ryan Moeller authored
The kernel will use the xattr property by default when not overridden by a mount option. Reviewed-by:
Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by:
Alexander Motin <mav@FreeBSD.org> Signed-off-by:
Ryan Moeller <ryan@iXsystems.com> Closes #11997
-
Brian Behlendorf authored
Commit d1d47691 takes into account the encryption key version to decide if the local_mac could be zeroed out. However, this could lead to failure mounting encrypted datasets created with intermediate versions of ZFS encryption available in master between major releases. In order to prevent this situation revert d1d47691 pending a more comprehensive fix which addresses the mount failure case. Reviewed-by:
George Amanakis <gamanakis@gmail.com> Signed-off-by:
Brian Behlendorf <behlendorf1@llnl.gov> Issue #11294 Issue #12025 Issue #12300 Closes #12033
-