Online gluster and ovirt expansion

Core problem:

The existing ovirt cluster has only two hosts that can run VMs, and a single host does not have enough memory (and possibly other resources) to run all of the desired VMs by itself. We therefore do not have the intended redundancy: a server hardware failure would leave us unable to keep all VMs running. (In late 2024, the desired VMs were dean, a dashboard1 replacement, and ovirt-engine. The list has since grown to include additional upgraded online database servers, in anticipation of STAR end-of-life and the need to migrate services to SDCC possibly as soon as early 2026.)

Basic structure: Three servers provide the core services for the ovirt cluster: ovirt1, ovirt2, and ovirt4 (not a typo; there was no ovirt3 by the time this got underway). Ovirt1 and ovirt2 act as hosts for the VMs and are part of the gluster pool, while ovirt4 only provides gluster bricks. Each of ovirt[124] has three gluster bricks, for a total of nine, of which three are arbiters (see the gluster volume info below). I believe this configuration is designated "replica 3 arbiter 1", though in fact only two full copies of the data are kept; the arbiter bricks hold only metadata.

Attempting to expand the ovirt cluster to include additional hosts has hit some snags, one of them being expansion of the gluster pool. Apparently it is not possible to go directly from the current 3 x (2 + 1) layout to one with more redundancy: having arbiters prevents adding additional full copies of the data (see the sketch just below).
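
For reference, a sketch of what is and is not possible here, using hypothetical brick4 paths (no such bricks exist yet) and not actually run against this cluster: the volume can still be grown for capacity by adding bricks in complete (2 data + 1 arbiter) sets, one new distribute subvolume at a time, but there is no add-brick step that turns the existing (2 + 1) sets into three full copies of the data, which is the snag.

# Capacity-only expansion: add one more complete 2+1 set; the last brick
# listed in the set becomes the arbiter.
gluster volume add-brick ovirt replica 3 arbiter 1 \
    ovirt1:/data/brick4/gv0 \
    ovirt2:/data/brick4/gv0 \
    ovirt4:/data/brick4/gv0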

Options / questions:

- Was there a reason to build the gluster volume with arbiters instead of pure replicas? I see no benefit to using arbiters in our setup (or really in any setup): only a potential for a small cost savings as far as I can tell, at the price of greater risk and possibly worse performance.
--- Should we convert the arbiters to full replicas? See https://lists.gluster.org/pipermail/gluster-users/2018-July/034385.html (a sketch of that procedure follows this list).

- Can ovirt4 serve as a VM host? I suspect it cannot, possibly because of a hardware (processor) incompatibility with ovirt1 and ovirt2, but this is not confirmed as I write this on April 11, 2025. (A quick CPU comparison is sketched after this list.)

- Build a whole new ovirt cluster

- The gluster data distribution appears to be rather unbalanced: of the three replica sets ("spans"), one is 14% full while the other two are only 5% full. It might be worthwhile to attempt a rebalance (the relevant commands are listed after the volume info output below).

- On ovirt1, in /data/brick1, there is what appears to be an extraneous directory, "gv1".  Can this be removed safely?
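
Regarding the arbiter-to-replica conversion: the procedure in the thread linked above boils down to dropping the three arbiter bricks and then adding full-sized bricks back. A sketch follows, using our actual arbiter brick paths (Brick3, Brick6, and Brick9 in the volume info below) but hypothetical "-full" paths for the replacement bricks, which would each need room for a complete copy of one subvolume's data; this has not been tested against our gluster version.

# 1. Reduce each replica set to a plain 2-way replica by removing its arbiter
#    (one brick per set; force is needed since no data is migrated).
gluster volume remove-brick ovirt replica 2 \
    ovirt4:/data/brick1/gv0 ovirt2:/data/brick2/gv0 ovirt1:/data/brick3/gv0 force

# 2. Go back to replica 3 with three full copies; brick order matters, since
#    each new brick is assigned to a replica set in the order listed.
gluster volume add-brick ovirt replica 3 \
    ovirt4:/data/brick1-full/gv0 ovirt2:/data/brick2-full/gv0 ovirt1:/data/brick3-full/gv0

# 3. Kick off self-heal to populate the new bricks and watch it finish before
#    doing anything else to the volume.
gluster volume heal ovirt full
gluster volume heal ovirt info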

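On the question of whether ovirt4 can host VMs, a quick way to compare the processors on the three machines is below (assumes passwordless root ssh from wherever it is run; plain lscpu, nothing ovirt-specific). ovirt will not activate a host whose CPU cannot provide the cluster's configured CPU type, so a vendor or generation mismatch here would be consistent with the suspicion above.

# Compare CPU vendor, model, and feature flags across the three servers.
for h in ovirt1 ovirt2 ovirt4; do
    echo "== $h =="
    ssh root@$h "lscpu | grep -E '^(Vendor ID|Model name|Flags)'"
done
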
[root@ovirt1 gv0]# gluster volume info

Volume Name: ovirt
Type: Distributed-Replicate
Volume ID: 1356094c-8397-46a9-bf7c-72564eee4318
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: ovirt1:/data/brick1/gv0
Brick2: ovirt2:/data/brick1/gv0
Brick3: ovirt4:/data/brick1/gv0 (arbiter)
Brick4: ovirt4:/data/brick2/gv0
Brick5: ovirt1:/data/brick2/gv0
Brick6: ovirt2:/data/brick2/gv0 (arbiter)
Brick7: ovirt2:/data/brick3/gv0
Brick8: ovirt4:/data/brick3/gv0
Brick9: ovirt1:/data/brick3/gv0 (arbiter)
Options Reconfigured:
cluster.lookup-optimize: off
server.keepalive-count: 5
server.keepalive-interval: 2
server.keepalive-time: 10
server.tcp-user-timeout: 20
server.event-threads: 4
client.event-threads: 4
cluster.choose-local: off
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.eager-lock: enable
performance.strict-o-direct: on
network.remote-dio: disable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.client-io-threads: on
transport.address-family: inet
storage.fips-mode-rchecksum: on
cluster.granular-entry-heal: on
cluster.quorum-type: auto
network.ping-timeout: 20
auth.allow: *
storage.owner-uid: 36
storage.owner-gid: 36
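
Regarding the possible rebalance mentioned above, the relevant commands are listed below for reference (standard gluster CLI; not yet run here). Since features.shard is on for this volume, it would be prudent to check the release notes for the installed gluster version first, as older releases reportedly had problems with rebalance on sharded volumes.

# Confirm the imbalance from gluster's side (per-brick usage):
gluster volume status ovirt detail | grep -E 'Brick|Total Disk Space|Disk Space Free'

# Fix the layout and migrate existing data, then monitor progress:
gluster volume rebalance ovirt start
gluster volume rebalance ovirt status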