Skip to content

fix(vm): allow disk updates when VM is stopped#2407

Open
hardcoretime wants to merge 2 commits into
mainfrom
fix/vm-allow-disk-update-when-vm-stopped
Open

fix(vm): allow disk updates when VM is stopped#2407
hardcoretime wants to merge 2 commits into
mainfrom
fix/vm-allow-disk-update-when-vm-stopped

Conversation

@hardcoretime
Copy link
Copy Markdown
Contributor

@hardcoretime hardcoretime commented May 26, 2026

Description

Fixed the issue where disk changes on a stopped VirtualMachine with WaitForFirstConsumer (wffc) volume binding mode were not being applied to the KVVM (intvirtvm) resource.

Changes made to pkg/controller/kvbuilder/kvvm_utils.go:

  1. Added isVmRunning check to the paravirtualization condition in applyBlockDeviceRefs to allow new disks when VM is stopped
  2. Updated cleanupRemovedStaticDisks function to accept isVmRunning parameter and remove all disks unconditionally when VM is stopped (not just non-hotpluggable ones)
  3. Added VMBDA filtering to prevent removing disks that are attached via VirtualMachineBlockDeviceAttachment - these disks should remain in KVVM spec to be visible in stopped VM status

Added unit tests in pkg/controller/kvbuilder/kvvm_utils_test.go to verify:

  • When VM is stopped: all disks not in VM spec are removed (regardless of hotpluggable flag)
  • When VM is running: only non-hotpluggable disks not in VM spec are removed
  • VMBDA-attached disks are preserved even when VM is stopped

Why do we need it, and what problem does it solve?

The DataExport e2e test was failing when using storage class with WaitForFirstConsumer (wffc) volume binding mode. The scenario:

  1. Create a VM with one disk (vd-data)
  2. Stop the VM
  3. Run DataExport which:
    • Removes the original disk (vd-data)
    • Creates new disks from snapshot/source (vd-restored-from-disk, vd-restored-from-snapshot)
  4. Start the VM

The VM got stuck in starting phase because:

  • Old disk (vd-data) was NOT removed from KVVM (because it was marked as hotpluggable)
  • New restored disks were NOT added to KVVM (because paravirtualization was enabled and VM had existing volumes)

This happened because the code was designed for running VMs with hotplug handler, not for stopped VMs where changes should be applied directly.

What is the expected result?

When a VM is stopped and disks are changed in the VM spec (disks added/removed), the KVVM resource should be updated accordingly. After the fix:

  • Old disks not in spec are removed from KVVM (when VM is stopped)
  • New disks in spec are added to KVVM (when VM is stopped)
  • For running VMs, the behavior remains unchanged (hotplug handler manages disk changes)
  • Disks attached via VMBDA are preserved in KVVM spec to remain visible in stopped VM status

Checklist

  • The code is covered by unit tests.
  • e2e tests passed.
  • Documentation updated according to the changes.
  • Changes were tested in the Kubernetes cluster manually.

Changelog entries

section: vm
type: fix
summary: "Fixed disk update for stopped VMs with WaitForFirstConsumer storage class. Previously, when a VM was stopped and disks were changed, the KVVM was not updated, causing the VM to get stuck in starting phase."

@hardcoretime hardcoretime requested review from loktev-d and removed request for Isteb4k and yaroslavborbat May 26, 2026 23:19
@hardcoretime hardcoretime added this to the v1.9.0 milestone May 26, 2026
@hardcoretime hardcoretime force-pushed the fix/vm-allow-disk-update-when-vm-stopped branch from 0622ef2 to 01e81fd Compare May 26, 2026 23:32
@hardcoretime hardcoretime added the e2e/run Run e2e test on cluster of PR author label May 27, 2026
@deckhouse-BOaTswain
Copy link
Copy Markdown
Contributor

deckhouse-BOaTswain commented May 27, 2026

Workflow has started.
Follow the progress here: Workflow Run

The target step completed with status: failure.

@deckhouse-BOaTswain deckhouse-BOaTswain removed the e2e/run Run e2e test on cluster of PR author label May 27, 2026
Roman Sysoev added 2 commits May 27, 2026 19:13
When a VM with paravirtualization is stopped and disks are changed in
VM spec, the KVVM was not being updated. This caused the old disks
to remain and new disks to not be added.

Additionally, added wait for 'lsblk' command to be ready in the
block_device_hotplug e2e test to prevent intermittent failures.

Changes:
- Add isVmRunning check to paravirtualization condition in applyBlockDeviceRefs
  to allow new disks when VM is stopped
- Update cleanupRemovedStaticDisks to remove all disks unconditionally
  when VM is stopped
- Add unit tests for cleanupRemovedStaticDisks with isVmRunning parameter
- Add util.UntilGuestCommandsReady for lsblk in block_device_hotplug test

Signed-off-by: Roman Sysoev <[email protected]>
When VM is stopped and disks are being cleaned up from KVVM spec,
disks that are attached via VMBDA (VirtualMachineBlockDeviceAttachment)
should not be removed. These disks are visible in the stopped VM status
and removing them would cause inconsistency.

Changes:
- Add vmbdaByBlockDeviceRef parameter to ApplyVirtualMachineSpec
- Add VMBDAByBlockDeviceRef field to BlockDevicesState
- Update cleanupRemovedStaticDisks to filter out VMBDA-attached disks
- Add unit test for VMBDA disk preservation

Signed-off-by: Roman Sysoev <[email protected]>
@hardcoretime hardcoretime force-pushed the fix/vm-allow-disk-update-when-vm-stopped branch from a59a36c to 7993060 Compare May 27, 2026 16:15
@hardcoretime hardcoretime added the e2e/run Run e2e test on cluster of PR author label May 27, 2026
@deckhouse-BOaTswain
Copy link
Copy Markdown
Contributor

deckhouse-BOaTswain commented May 27, 2026

Workflow has started.
Follow the progress here: Workflow Run

The target step completed with status: failure.

@deckhouse-BOaTswain deckhouse-BOaTswain removed the e2e/run Run e2e test on cluster of PR author label May 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants