    Improve CTDB private IPs handling (#11643) · 6bd31c65
    Andrew Walker authored
    This PR makes several critical changes to how the ctdb nodes
    file (private IPs) are managed in middleware. Some types of
    issues that are addressed by this PR are:
    
    * Inconsistencies in nodes configuration file on different
      cluster nodes
    * Duplicate nodes entries being added via ctdb.shared.volume.create
      (e.g. same node twice with different IPs)
    * Lack of clear mapping between ctdb nodes and gluster peers
    
    Originally there was no easy way to map entries in the ctdb nodes
    file back to the originating gluster peer. This PR changes the
    nodes file entries from the form
    ```
    <ipaddress>
    <ipaddress>
    <ipaddress>
    ```
    
    to
    ```
    <ipaddress>#{<peer information>}
    <ipaddress>#{<peer information>}
    ```
    
    The above required a change to the nodes file parsing function in
    ctdbd, which means this PR may not be backported to stable/bluefin.
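For illustration, the annotated entry format could be parsed along these lines. This is a hypothetical Python sketch (ctdbd itself does this in C), and it assumes the peer information between the braces is JSON-encoded; the field names are assumptions, not the actual implementation:

```python
import json

def parse_nodes_entry(line: str) -> dict:
    """Parse a ctdb nodes file entry of the form <ipaddress>#{<peer information>}.

    Entries without a '#' separator are treated as legacy, address-only lines.
    """
    addr, sep, meta = line.strip().partition("#")
    entry = {"address": addr, "peer": None}
    if sep:
        # Assumed: peer information is a JSON object, e.g. {"uuid": "..."}
        entry["peer"] = json.loads(meta)
    return entry
```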
    
    Currently only the peer UUID is stored in the file, but in the
    future this can be expanded to include additional information.
    Storing the gluster node UUID in the ctdb nodes file allows us to
    more tightly couple the gluster TSP configuration with our CTDB
    nodes configuration (e.g. by including the peer UUID in the
    ctdb.private.ips.query return).
    
    Since this change requires that the backend maintain the mapping
    between nodes and peers, the ctdb private IPs API was changed to
    require the caller to provide a gluster peer UUID for each nodes entry.
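For illustration, a caller-supplied mapping and the enriched query result might look like the following. The field names here are assumptions for the sketch, not the actual middleware schema:

```python
# Hypothetical payload for creating a ctdb private IP entry: the caller
# must now supply the gluster peer UUID alongside the private address.
create_payload = {
    "ip": "10.20.0.1",
    "node_uuid": "8e2f4c7a-9b1d-4e3a-a5c6-0f9d8b7a6c5e",
}

# Hypothetical entry returned by ctdb.private.ips.query after this change,
# now carrying the peer UUID recovered from the nodes file.
query_entry = {
    "id": 0,
    "address": create_payload["ip"],
    "enabled": True,
    "node_uuid": create_payload["node_uuid"],
}
```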
    
    The ctdb.shared.volume.create method originally would automatically
    append nodes file entries for missing gluster peers based on DNS
    lookup results, but over time this has proven less than reliable
    (users may have DNS misconfigurations that result in duplicate
    nodes file entries or in the wrong interfaces being used).
    
    This PR allows the caller of ctdb.shared.volume.create to optionally
    include a list of private address + gluster peer UUID mappings
    to be added (if necessary) to the nodes file. The method will now
    explicitly fail in the following situations:
    
    * gluster peers that are not present in the resulting nodes file
    * peer UUIDs in the payload that do not exist
    * entries in the nodes file that do not map back to a gluster peer UUID
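The cross-checks above can be sketched roughly as follows. This is an illustrative sketch only, with assumed function and field names, not the middleware's actual validation code:

```python
def validate_nodes_mapping(gluster_peers, nodes_entries):
    """Cross-check gluster peers against ctdb nodes file entries.

    gluster_peers: dict mapping peer UUID -> hostname
    nodes_entries: list of dicts with 'address' and 'node_uuid' keys

    Returns a list of human-readable validation errors; an empty list
    means the configuration is consistent.
    """
    errors = []
    mapped_uuids = {entry["node_uuid"] for entry in nodes_entries}

    # Every gluster peer must end up with a nodes file entry.
    for uuid, hostname in gluster_peers.items():
        if uuid not in mapped_uuids:
            errors.append(f"peer {hostname} ({uuid}) missing from nodes file")

    # Every UUID in the payload must be a real gluster peer, and every
    # nodes file entry must map back to a gluster peer UUID.
    for entry in nodes_entries:
        if entry["node_uuid"] not in gluster_peers:
            errors.append(
                f"entry {entry['address']} references unknown peer "
                f"{entry['node_uuid']}"
            )
    return errors
```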
    
    This has a fairly wide-ranging impact on how our backend APIs are
    used. For instance, if a new gluster peer is added to an existing
    TSP, then a private IP for the nodes file must also be supplied.
    
    A new cluster management service is also being added via this
    pull request to enforce consistency in how SCALE clusters are
    configured (and to ensure that proper validation takes place to
    prevent issue reports about drastically misconfigured clusters).
    It also provides a stub of an API for adding new cluster nodes
    (expansion).
    
    The end-goal of this API is to force the caller to provide us with
    a payload containing gluster peer, brick, and private address (nodes
    file) information for each proposed cluster node.
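A per-node payload for that expansion API might take a shape like the one below. The key names and values are purely hypothetical, based only on the description above (gluster peer, brick, and private address per proposed node):

```python
# Hypothetical per-node payload for the cluster expansion API.
# All key names are assumptions for illustration.
new_node = {
    "peer": {"hostname": "cluster-node-4"},
    "brick": {"path": "/cluster/brick1"},
    "private_address": "10.20.0.4",
}
```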