Describe the bug
All audio routed through the SOF DSP — internal speakers, headphone jack, internal microphones, HDMI audio — stopped working mid-session on a Meteor Lake-P laptop (ThinkPad X1 Carbon Gen 12, SOF IPC4 firmware 2.13.0.1, kernel 6.12.90). USB audio on the dock and Bluetooth audio still work because they don't traverse the DSP.
The trigger was a single DSP IPC timeout during a DSP resume. From that point on, every hw_params against the affected PCMs fails with failed to assign pipeline id for pipeline.N: -28. That error string is emitted from sound/soc/sof/ipc4-topology.c immediately after ida_alloc_max(&pipeline_ida, …) returns -ENOSPC, i.e. the failure is kernel-side IDA pool exhaustion and the IPC is never sent to the DSP. See Root cause analysis below.
The state has persisted across a userspace audio stack restart, a PCI driver unbind/bind that re-uploaded firmware, and an S3 suspend/resume cycle. It is reproducible at will from this point (every hw_params against the affected PCMs fails the same way), but I do not have an isolated reproducer for the triggering IPC timeout itself.
What I tried to diagnose: inspected dmesg for the IPC timeout sequence and the cascading -ENOSPC errors, captured PipeWire user journal at the moments of failure, restarted the audio stack, did a PCI driver rebind (firmware re-uploaded but failure persisted), did an S3 cycle (failure persisted), inspected /sys/kernel/debug/sof/* text entries (firmware is in state 7 = SOF_FW_BOOT_COMPLETE), captured amixer -c 0, and extracted strings from sof-hda-generic-2ch.tplg to map the failing pipeline numbers back to widgets.
Environment
- Kernel: Linux v6.12.90 (linux-6.12.y stable tag, packaged in NixOS 25.11 from the upstream tarball mirror://kernel/linux/kernel/v6.x/linux-6.12.90.tar.xz; no separate downstream SHA).
- SOF firmware: sof-bin v2025.05.1 (prebuilt release tarball from thesofproject/sof-bin). Booted firmware reports ADSPFW 2.13.0.1. No separate firmware-source SHA pinned in nixpkgs.
- soft (tools / topology): Whatever ships inside sof-bin v2025.05.1. Loaded topology: intel/sof-ace-tplg/sof-hda-generic-2ch.tplg, Topology ABI 3:29:1, Kernel ABI 3:23:1.
- Topology file: intel/sof-ace-tplg/sof-hda-generic-2ch.tplg
- Platform: Lenovo ThinkPad X1 Carbon Gen 12 (DMI: LENOVO-21KC00EEMX-ThinkPadX1CarbonGen12). Intel Meteor Lake-P HD Audio Controller, PCI 0000:00:1f.3 (8086:7e28). Realtek ALC287 HDA codec (HDA:10ec0287,17aa231e,00100002). 2 digital microphones; HDMI declared as iec61937-pcm:5,4,3. ALSA card sof-hda-dsp (driver snd_soc_skl_hda_dsp). Userspace: PipeWire 1.4.9, WirePlumber 0.5.x.
Reproducibility Rate
The triggering event (one DSP IPC timeout on a DSP resume) occurred once over ~30 hours of uptime with many lid-driven S3 suspend/resume cycles and several Thunderbolt 3 dock connect/disconnect events. I cannot put a meaningful rate on the trigger.
After it occurred, the downstream symptom is 100% deterministic: every hw_params attempt against the affected PCMs fails identically, and the failure has persisted through one userspace-services restart, one PCI driver unbind/bind, and one S3 suspend/resume. I have not yet rebooted.
Steps to reproduce
I do not have an isolated reproducer for the triggering IPC timeout. The session that hit it:
- Boot normally (2026-05-28 ~08:18 CEST). Audio works.
- Use the laptop, suspending/resuming via the lid many times across the day. A Thunderbolt 3 dock with an attached HDMI display was connected and disconnected several times.
- At 2026-05-28 14:59:18 the dock is disconnected (pciehp: Card not present, undocked from hotplug port replicator). The HDMI display goes away with it.
- ~19 minutes of dmesg silence (laptop presumably idle; runtime PM is not logged in this kernel build).
- At 2026-05-28 15:18:57 a DSP IPC times out (ipc timed out for 0x44000007|0x30000018). The kernel error path emits error: set pcm hw_params after resume, which indicates sof_pcm_prepare() was running with the set_hw_params_upon_resume flag set, i.e. the DSP was being resumed (almost certainly runtime PM, since no PM: suspend exit appears near this event). DSP reports fw_state: SOF_FW_BOOT_COMPLETE (7) and ROM_EXT, state: FW_ENTERED, running: the firmware did not crash, it just failed to respond inside the IPC timeout window. Kernel emits IPC/DSP dumps and abandons the post-resume pcm 0 dir 0 setup with -ETIMEDOUT (-110).
- Continue using the laptop. Many more suspend/resume cycles.
- On a system S3 resume the next day (2026-05-29 08:49:34, 4 s after PM: suspend exit), the kernel tries to set up pcm 0 dir 0 again and fails with failed to assign pipeline id for pipeline.1: -28. Audio is broken from this point on.
- The downstream -ENOSPC symptom then occurs on every subsequent hw_params attempt and persists across the one userspace-services restart, the one PCI driver unbind/bind, and the one S3 cycle I attempted (see Actual Result).
Expected Result
- An IPC timeout during DSP resume should either not occur, or — if it does — should not leave the kernel unable to allocate pipeline IDs on subsequent attempts.
- A PCI driver unbind/bind should be a sufficient recovery without requiring a reboot. Today it isn't; see the Root cause analysis section for why (pipeline_ida is a file-static global, not per-snd_sof_dev).
Actual Result
- After the IPC timeout on 2026-05-28 15:18:57, every later hw_params attempt has failed with failed to assign pipeline id for pipeline.N: -28.
- The failing pipeline number tracks what the host is trying to instantiate: pipeline.1 for pcm 0 dir 0 (HDA Analog / Speaker path) on every post-suspend retry, and pipeline.15 for pcm 31 dir 0 after I restarted pipewire (wireplumber then probed a different PCM).
- Userspace symptom: spa.alsa: set_hw_params: No space left on device; sinks go to error; canberra-gtk-play -i bell hangs ~30 s and exits with Failed to play sound: IO error. (canberra-gtk-play -i bogusname returns File or data not found immediately, confirming the audio server is reachable.)
- After systemctl --user restart pipewire pipewire-pulse wireplumber, the HiFi UCM profile no longer registers: wpctl inspect shows the card's EnumProfile containing only off and pro-audio. All Speaker and HDMI sinks disappear. The dock USB audio (separate ALSA card) and Bluetooth audio (off the SOF path entirely) continue to work; only paths routed through the SOF DSP are affected.
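The pairing of failing pipeline names with PCMs described above can be tallied from a saved kernel log. The following is a minimal sketch, assuming the two message formats shown in the dmesg excerpts in this report; the helper name tally_failures and the sample list are illustrative only and are not part of any SOF tooling.

```python
import re
from collections import Counter

# Message formats copied from the dmesg excerpts in this report.
PIPELINE_FAIL = re.compile(r"failed to assign pipeline id for (pipeline\.\d+): (-\d+)")
PCM_FAIL = re.compile(r"failed widget list set up for pcm (\d+) dir (\d+)")


def tally_failures(lines):
    """Pair each pipeline-ID failure with the next 'pcm N dir M' failure line."""
    counts = Counter()
    pending = None  # (pipeline name, error code) waiting for its pcm line
    for line in lines:
        match = PIPELINE_FAIL.search(line)
        if match:
            pending = (match.group(1), int(match.group(2)))
            continue
        match = PCM_FAIL.search(line)
        if match and pending is not None:
            pipeline, error = pending
            counts[(pipeline, int(match.group(1)), int(match.group(2)), error)] += 1
            pending = None
    return counts


sample = [
    "sof-audio-pci-intel-mtl 0000:00:1f.3: failed to assign pipeline id for pipeline.1: -28",
    "sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets",
    "sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 0 dir 0",
    "sof-audio-pci-intel-mtl 0000:00:1f.3: failed to assign pipeline id for pipeline.15: -28",
    "sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets",
    "sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 31 dir 0",
]
failures = tally_failures(sample)
for key, count in sorted(failures.items()):
    print(key, count)
```

A pcm failure line with no preceding pipeline-ID failure (such as the one after the original IPC timeout) is deliberately ignored, so only the -ENOSPC cases are counted.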
Recovery attempts that did not clear the state:
- Restart of pipewire pipewire-pulse wireplumber: services restarted cleanly; HiFi profile gone afterwards.
- PCI driver unbind + bind:
  echo 0000:00:1f.3 | sudo tee /sys/bus/pci/drivers/sof-audio-pci-intel-mtl/unbind
  echo 0000:00:1f.3 | sudo tee /sys/bus/pci/drivers/sof-audio-pci-intel-mtl/bind
  Firmware re-uploaded (dmesg shows Loaded firmware library: ADSPFW, version: 2.13.0.1 and Booted firmware version: 2.13.0.1). Card 0 came back with five Pro N sinks under the pro-audio profile, but pipeline.15: -28 fired immediately during the topology probe and continued every time any client tried to open a PCM.
- systemctl suspend (S3) and resume. No Booted firmware version / Loaded firmware library line appeared in dmesg on this transition, i.e. the firmware was not re-uploaded by S3 resume on this platform. The -ENOSPC errors continued unchanged.
Recovery attempt that did clear the state, without a reboot:
- Full unload + reload of the SOF kernel modules. After stopping PipeWire/WirePlumber (services and their sockets) and confirming /dev/snd/* had no holders, I ran a sequential rmmod of snd_soc_skl_hda_dsp, snd_sof_probes, snd_hda_intel, snd_sof_pci_intel_mtl (cascading several deps), then snd_sof_intel_hda_common, snd_sof_intel_hda, snd_sof_pci, and finally snd_sof itself, followed by modprobe snd_sof_pci_intel_mtl and a restart of the audio stack. Card 0 came back fully functional (HiFi UCM profile present, Speaker/HDMI sinks restored, playback works). This shows empirically that the leaked state lives in module-level kernel state rather than per-device state, consistent with it residing in the snd_sof module; see the root cause analysis below for why module unload resets it but PCI rebind doesn't.
Impact
Showstopper for all SOF-routed audio (internal speakers, headphone jack, built-in microphones, HDMI audio) until a reboot or a full unload and reload of the SOF kernel modules. USB audio on the dock and Bluetooth audio continue to work because they don't traverse the SOF DSP. In practice it forced me to fall back to Bluetooth or the dock for any audio.
Proof
Pre-trigger context — most recent dmesg activity before the IPC timeout (2026-05-28 14:59:18, dock disconnect, then ~19 min of silence):
maj 28 14:59:18 anka-nixos kernel: pcieport 0000:00:07.0: pciehp: Slot(12): Link Down
maj 28 14:59:18 anka-nixos kernel: pcieport 0000:00:07.0: pciehp: Slot(12): Card not present
maj 28 14:59:18 anka-nixos kernel: xhci_hcd 0000:22:00.0: remove, state 1
maj 28 14:59:18 anka-nixos kernel: usb usb6: USB disconnect, device number 1
maj 28 14:59:18 anka-nixos kernel: thinkpad_acpi: undocked from hotplug port replicator
maj 28 14:59:18 anka-nixos kernel: xhci_hcd 0000:22:00.0: USB bus 6 deregistered
maj 28 14:59:18 anka-nixos kernel: xhci_hcd 0000:22:00.0: remove, state 1
maj 28 14:59:18 anka-nixos kernel: usb usb5: USB disconnect, device number 1
maj 28 14:59:18 anka-nixos kernel: xhci_hcd 0000:22:00.0: USB bus 5 deregistered
maj 28 14:59:18 anka-nixos kernel: pci_bus 0000:22: busn_res: [bus 22] is released
maj 28 14:59:18 anka-nixos kernel: pci_bus 0000:23: busn_res: [bus 23-49] is released
maj 28 14:59:18 anka-nixos kernel: pci_bus 0000:21: busn_res: [bus 21-49] is released
[...no further kernel log entries between 14:59:18 and the IPC timeout at 15:18:57...]
Triggering event — IPC timeout, dmesg, 2026-05-28 15:18:57:
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc timed out for 0x44000007|0x30000018
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump start ]------------
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Host IPC initiator: 0x44000007|0x30000018|0x0, target: 0xe4000000|0x30000018|0x0, ctl: 0x3
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ IPC dump end ]------------
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump start ]------------
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: IPC timeout
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (7)
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: 0x50000005: module: ROM_EXT, state: FW_ENTERED, running
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Firmware state: 0x5, status/error code: 0x0
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Core dump is not available due to invalid separator 0xc0de
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ------------[ DSP dump end ]------------
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: sof_ipc4_set_get_data: large config set failed at offset 0: -110
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set volume update for Pre Mixer Analog Playback Volume
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: kcontrol 4 set up failed for widget gain.1.1
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 0 dir 0
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: error: set pcm hw_params after resume
maj 28 15:18:57 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_prepare on 0000:00:1f.3: -110
Next-day S3 resume, dmesg, 2026-05-29 08:49:30–08:49:34 (first -ENOSPC, with surrounding suspend-exit context):
maj 29 08:49:30 anka-nixos kernel: i915 0000:00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.53.0
maj 29 08:49:30 anka-nixos kernel: i915 0000:00:02.0: [drm] GT1: GuC firmware i915/mtl_guc_70.bin version 70.53.0
maj 29 08:49:30 anka-nixos kernel: i915 0000:00:02.0: [drm] GT1: HuC firmware i915/mtl_huc_gsc.bin version 8.5.4
maj 29 08:49:30 anka-nixos kernel: i915 0000:00:02.0: [drm] GT1: GUC: SLPC enabled
maj 29 08:49:30 anka-nixos kernel: i915 0000:00:02.0: [drm] GT1: GUC: RC enabled
maj 29 08:49:30 anka-nixos kernel: OOM killer enabled.
maj 29 08:49:30 anka-nixos kernel: Restarting tasks ... done.
maj 29 08:49:30 anka-nixos kernel: random: crng reseeded on system resumption
maj 29 08:49:30 anka-nixos kernel: PM: suspend exit
maj 29 08:49:34 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: failed to assign pipeline id for pipeline.1: -28
maj 29 08:49:34 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets
maj 29 08:49:34 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 0 dir 0
maj 29 08:49:34 anka-nixos kernel: sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_hw_params on 0000:00:1f.3: -28
maj 29 08:49:34 anka-nixos kernel: HDA Analog: ASoC: error at __soc_pcm_hw_params on HDA Analog: -28
maj 29 08:49:34 anka-nixos kernel: HDA Analog: ASoC: error at dpcm_fe_dai_hw_params on HDA Analog: -28
Corresponding PipeWire user-journal (journalctl --user -u pipewire) at the same moment:
maj 29 08:49:34 anka-nixos pipewire[1614]: spa.alsa: set_hw_params: No space left on device
maj 29 08:49:34 anka-nixos pipewire[1614]: pw.node: (alsa_output.pci-0000_00_1f.3-platform-skl_hda_dsp_generic.HiFi__Speaker__sink-57) suspended -> error (Start error: No space left on device)
maj 29 08:49:34 anka-nixos pipewire[1614]: pw.link: 0x...: one of the nodes is in error out:suspended in:error
maj 29 08:49:39 anka-nixos pipewire[1614]: pw.node: (alsa_output.pci-0000_00_1f.3-platform-skl_hda_dsp_generic.HiFi__Speaker__sink-57) suspended -> error ((null))
maj 29 08:50:05 anka-nixos pipewire[1614]: spa.alsa: set_hw_params: No space left on device
[same pattern repeats]
Dmesg from the PCI rebind (2026-05-29 14:26:40–14:26:41) showing firmware was re-uploaded but -ENOSPC returned immediately:
sof-audio-pci-intel-mtl 0000:00:1f.3: Firmware paths/files for ipc type 1:
sof-audio-pci-intel-mtl 0000:00:1f.3: Firmware file: intel/sof-ipc4/mtl/sof-mtl.ri
sof-audio-pci-intel-mtl 0000:00:1f.3: Firmware lib path: intel/sof-ipc4-lib/mtl
sof-audio-pci-intel-mtl 0000:00:1f.3: Topology file: intel/sof-ace-tplg/sof-hda-generic-2ch.tplg
sof-audio-pci-intel-mtl 0000:00:1f.3: Loaded firmware library: ADSPFW, version: 2.13.0.1
sof-audio-pci-intel-mtl 0000:00:1f.3: Booted firmware version: 2.13.0.1
sof-audio-pci-intel-mtl 0000:00:1f.3: Topology: ABI 3:29:1 Kernel ABI 3:23:1
sof-audio-pci-intel-mtl 0000:00:1f.3: failed to assign pipeline id for pipeline.15: -28
sof-audio-pci-intel-mtl 0000:00:1f.3: Failed to set up connected widgets
sof-audio-pci-intel-mtl 0000:00:1f.3: error: failed widget list set up for pcm 31 dir 0
sof-audio-pci-intel-mtl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_hw_params on 0000:00:1f.3: -28
amixer -c 0 in the broken state:
Simple mixer control 'Master',0
Mono: Playback 0 [0%] [-65.25dB] [off]
Simple mixer control 'Headphone',0
Front Left: Playback 87 [100%] [0.00dB] [off]
Front Right: Playback 87 [100%] [0.00dB] [off]
Simple mixer control 'Speaker',0
Front Left: Playback 87 [100%] [0.00dB] [off]
Front Right: Playback 87 [100%] [0.00dB] [off]
Simple mixer control 'IEC958',0
Mono: Playback [off]
Simple mixer control 'Capture',0
Front Left: Capture 0 [0%] [-17.25dB] [off]
Front Right: Capture 0 [0%] [-17.25dB] [off]
Simple mixer control 'Auto-Mute Mode',0
Item0: 'Enabled'
Simple mixer control 'Dmic0',0
Front Left: Capture 45 [100%] [0.00dB] [off]
Front Right: Capture 45 [100%] [0.00dB] [off]
[…similar pattern; everything routed through SOF is [off]; full output attached]
Master is muted and every playback/capture switch is [off]. Included for reference per the bug-tracking guide.
Attachments
I will attach (or can attach on request):
- Full journalctl -k -b from the affected boot.
- Full journalctl --user -u pipewire from the affected boot.
- Full amixer -c 0 output.
- Decompressed topology source (sof-hda-generic-2ch.tplg, strings-extracted) showing pipeline → widget mapping used for the analysis below.
- Binary /sys/kernel/debug/sof/{pp,exception,fw_regs,dsp,hda} captured while the bug was active (I do not have a matching .ldc to decode them locally; sof-tools 2.10 on this system is older than the booted firmware 2.13.0.1, and this kernel build does not expose etrace / trace either, so sof-logger cannot be run live).
Missing data
No live sof-logger trace is available for either the original IPC timeout or the current broken state — this kernel build does not expose etrace / trace under /sys/kernel/debug/sof/, and the sof-tools dictionary on the system (sof-tools 2.10) is older than the booted firmware (2.13.0.1) anyway. No core dump was generated: dmesg shows Core dump is not available due to invalid separator 0xc0de.
Root cause analysis (from Claude — read with scepticism)
This entire report — log capture, analysis, the source walk below — was drafted by Claude with me reviewing. Below, "verified" means Claude read the actual source listed against the file paths; "inferred" means it followed from the source plus the observed symptoms but has not been reproduced with a kernel-side patch yet. I have not independently audited the source claims — please verify.
Verified against linux-stable v6.12.90 (the booted kernel) and cross-checked against thesofproject/linux topic/sof-dev HEAD:
- The kernel log line failed to assign pipeline id for pipeline.N: -28 is emitted from sound/soc/sof/ipc4-topology.c:2637, immediately after ida_alloc_max(&pipeline_ida, ipc4_data->max_num_pipelines, GFP_KERNEL) returns -ENOSPC. The IPC for pipeline-create is never sent to the DSP in this path: it returns before the sof_ipc_tx_message_no_reply call at line 2758. So the failure is host-side ID-pool exhaustion, not a DSP-side resource allocator failure.
- pipeline_ida is static DEFINE_IDA(pipeline_ida); at sound/soc/sof/ipc4-topology.c:37, i.e. file-scope and module-global. It is not part of struct snd_sof_dev and is not reinitialised on PCI unbind/bind. Only unloading and reloading the module, which starts again from a freshly initialised static IDA, resets it. This explains why both the PCI rebind and the S3 cycle failed to recover.
- No IPC4 firmware reply status maps to -ENOSPC in the kernel's IPC4 status-to-errno table (sound/soc/sof/ipc4.c:106–127); the default mapping is -EINVAL. So the -28 cannot have come from a DSP reply. It has to be the IDA, which matches the call-site reading.
- The same sof_widget_setup_unlocked code (the suspect leak path described below) is present in current topic/sof-dev HEAD with no behavioural change since v6.12.90, i.e. if the analysis is correct, the upstream SOF tree is also vulnerable.
Inferred (consistent with source and symptoms, not yet reproduced under a patch):
- The timed-out IPC 0x44000007|0x30000018 was a LARGE_CONFIG_SET for the Pre Mixer Analog Playback Volume kcontrol (gain.1.1). This is what the immediately-following kernel lines say (sof_ipc4_set_get_data: large config set failed at offset 0: -110 / Failed to set volume update for Pre Mixer Analog Playback Volume / kcontrol 4 set up failed for widget gain.1.1). So the IPC that timed out was a kcontrol restore during post-resume widget-list setup, not a pipeline-create.
- The leak appears to be in sof_widget_setup_unlocked (sound/soc/sof/sof-audio.c:134). For a dynamic pipeline widget (in this topology, gain.1.1 and its peers), the setup path is:
  - Recursive call at line 167 sets up the scheduler pipe_widget first → pipe_widget.use_count: 0→1, IDA allocated at ipc4-topology.c:2634, Create Pipeline IPC sent and replied OK.
  - swidget.use_count: 0→1, tplg_ops->widget_setup(gain) succeeds.
  - widget_kcontrol_setup(gain) is called at line 208. This is where the LARGE_CONFIG_SET was sent and timed out, returning -ETIMEDOUT.
  - Control jumps to the widget_free: label at line 217.
  - widget_free: calls sof_widget_free_unlocked(swidget=gain):
    - gain.use_count: 1→0, proceeds to free.
    - Sends Delete Module Instance for the gain.
    - Reaches sof-audio.c:109. Because gain.dynamic_pipeline_widget == true and gain.id != snd_soc_dapm_scheduler, recursively calls sof_widget_free_unlocked(pipe_widget=scheduler):
      - scheduler.use_count: 1→0, proceeds.
      - Sends Delete Pipeline, frees the IDA at ipc4-topology.c:2809. ← correct, first free.
  - Falls through (no goto) into the pipe_widget_free: label at line 221.
  - swidget=gain is not a scheduler, so it calls sof_widget_free_unlocked(pipe_widget=scheduler) a second time.
  - At line 58: if (--swidget->use_count) return 0; here scheduler.use_count goes 0 → -1, the early-return fires, no IPC is sent, no IDA is freed. But use_count is now stuck at -1 for the lifetime of this snd_sof_widget.
- After this corruption, every subsequent setup/teardown cycle of that scheduler leaks one ID from the global pipeline_ida:
  - Setup: line 150 increments to 0. if (0 > 1) is false, so it does not treat the widget as already-set-up and proceeds with a fresh ida_alloc_max + Create Pipeline. Use-count is now 0 instead of the expected 1.
  - Teardown: line 58 decrements to -1. if (-1) is truthy → early-returns. Delete Pipeline is not sent and ida_free is not called.
- Per cycle, this leaks exactly one ID from pipeline_ida. After enough resume/teardown cycles the global pool fills and every ida_alloc_max returns -ENOSPC. (Claude estimates max_num_pipelines is roughly mid-double-digits on MTL, and there were many lid suspend/resume cycles in the ~17 h between the trigger and the next-day failure.)
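The hypothesised sequence can be sanity-checked with a toy model. This is a minimal sketch, not kernel code: the PipelineIdPool and SchedulerWidget classes are hypothetical stand-ins that model only the use_count and ID bookkeeping described above, a single scheduler set up once per cycle, and max_id=15 is an assumed pool bound (the real max_num_pipelines was not captured from this boot).

```python
ENOSPC = 28  # errno value behind the "-28" in the kernel log


class PipelineIdPool:
    """Toy stand-in for the module-global pipeline_ida (IDs 0..max_id inclusive)."""

    def __init__(self, max_id):
        self.max_id = max_id
        self.in_use = set()

    def alloc(self):
        for candidate in range(self.max_id + 1):
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                return candidate
        return -ENOSPC

    def free(self, pipeline_id):
        self.in_use.discard(pipeline_id)


class SchedulerWidget:
    """Models only the use_count and pipeline-ID bookkeeping of a scheduler widget."""

    def __init__(self, pool):
        self.pool = pool
        self.use_count = 0
        self.pipeline_id = None

    def setup(self):
        self.use_count += 1
        if self.use_count > 1:  # treated as "already set up"
            return 0
        pipeline_id = self.pool.alloc()
        if pipeline_id < 0:
            self.use_count -= 1  # undo the increment when setup fails
            return pipeline_id
        self.pipeline_id = pipeline_id
        return 0

    def free(self):
        self.use_count -= 1
        if self.use_count:  # any non-zero value, including -1, returns early
            return
        self.pool.free(self.pipeline_id)
        self.pipeline_id = None


def failed_kcontrol_setup(scheduler):
    """Error path from the walk-through: one setup, then two frees of the scheduler."""
    scheduler.setup()  # recursive scheduler setup: 0 -> 1, ID allocated
    scheduler.free()   # widget_free: freeing the gain recurses: 1 -> 0, ID freed
    scheduler.free()   # fall-through into pipe_widget_free: 0 -> -1, early return


def cycles_until_enospc(max_id):
    pool = PipelineIdPool(max_id)
    scheduler = SchedulerWidget(pool)
    failed_kcontrol_setup(scheduler)
    cycles = 0
    while scheduler.setup() == 0:  # each cycle allocates an ID that is never freed
        scheduler.free()
        cycles += 1
    return cycles, scheduler.use_count, len(pool.in_use)


print(cycles_until_enospc(15))
```

With the assumed bound of 15 the model returns (16, -1, 16): sixteen setup/teardown cycles succeed while leaking one ID each, after which allocation fails with the pool full and use_count still at -1. A scheduler that never goes through failed_kcontrol_setup keeps the pool empty across any number of cycles.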
Why this matches every observed symptom:
- The trigger is specifically a failure inside the widget_free: label fall-through (i.e. dai_config or widget_kcontrol_setup failing after the recursive pipe_widget setup succeeded). A failure of tplg_ops->widget_setup (line 186) jumps directly to pipe_widget_free: and is not affected. The user's log shows the failure at widget_kcontrol_setup, which is exactly this path.
- "Survives PCI rebind, survives S3": pipeline_ida is module-global; neither operation resets it.
- "Different pipeline IDs eventually all fail": the pool is global across all schedulers, so once exhausted, every scheduler's ida_alloc_max fails.
- "Failing pipeline number tracks the host's request": the number in the error message comes from swidget->widget->name (the topology name), not from the IDA; the -28 is the negative return value of ida_alloc_max.
Suggested patch (drafted by Claude, untested — neither compiled nor run):
Made against linux-stable tag v6.12.90 (commit 2538fbeff8a94ee2b54eb09d92209e24a1e650d4, the running kernel). Same patch also applies to thesofproject/linux topic/sof-dev at commit 3a0f2aeac2e3a8020488c21afef5b483027514fc (HEAD as of 2026-05-29) — Claude diffed the surrounding region in both trees and the context is identical.
Skip the pipe_widget_free: label when the inner sof_widget_free_unlocked already propagated the free to pipe_widget via the dynamic_pipeline_widget branch at sof-audio.c:109. The non-dynamic and scheduler-itself paths still need pipe_widget_free: for the core-refcount decrement, so the label can't simply be removed.
diff --git a/sound/soc/sof/sof-audio.c b/sound/soc/sof/sof-audio.c
--- a/sound/soc/sof/sof-audio.c
+++ b/sound/soc/sof/sof-audio.c
@@ -215,9 +215,20 @@ static int sof_widget_setup_unlocked(struct snd_sof_dev *sdev,
 	return 0;
 
 widget_free:
-	/* widget use_count will be decremented by sof_widget_free() */
+	/*
+	 * widget use_count will be decremented by sof_widget_free_unlocked().
+	 * For a dynamic non-scheduler widget, that call also recursively
+	 * frees swidget->spipe->pipe_widget (see the dynamic_pipeline_widget
+	 * branch in sof_widget_free_unlocked()), so we must skip the
+	 * pipe_widget_free label below; otherwise pipe_widget is freed
+	 * twice, its use_count underflows to -1, and subsequent
+	 * setup/teardown cycles leak pipeline IDs from pipeline_ida.
+	 */
 	sof_widget_free_unlocked(sdev, swidget);
 	use_count_decremented = true;
+	if (swidget->dynamic_pipeline_widget &&
+	    swidget->id != snd_soc_dapm_scheduler)
+		goto use_count_dec;
 pipe_widget_free:
 	if (swidget->id != snd_soc_dapm_scheduler) {
 		sof_widget_free_unlocked(sdev, swidget->spipe->pipe_widget);
Verified with git apply --check against both trees (clean apply, exit 0 each).
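The intended effect of the patch on the two error paths can be modelled in a few lines. This is a sketch of the bookkeeping as described in this report, not of the kernel code itself: the function name and the failure_point strings are hypothetical, and the model covers only a dynamic, non-scheduler widget whose scheduler starts with use_count 0.

```python
def scheduler_use_count_after_error(failure_point, patched):
    """Scheduler use_count after a dynamic non-scheduler widget fails to set up."""
    use_count = 1  # the recursive scheduler setup succeeded: 0 -> 1
    if failure_point == "kcontrol_setup":
        # widget_free: freeing the widget also frees its scheduler
        # (dynamic_pipeline_widget branch), so the count drops here.
        use_count -= 1
        if patched:
            return use_count  # models the added "goto use_count_dec"
    elif failure_point != "widget_setup":
        raise ValueError(f"unknown failure point: {failure_point}")
    # pipe_widget_free: reached directly when widget_setup fails,
    # or by fall-through from widget_free in the unpatched code.
    use_count -= 1
    return use_count


for point in ("widget_setup", "kcontrol_setup"):
    for patched in (False, True):
        print(point, patched, scheduler_use_count_after_error(point, patched))
```

In this model only the unpatched kcontrol_setup path ends at -1; the widget_setup path ends at 0 with or without the patch, which is the behaviour the patch is meant to preserve.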
Still to confirm (would benefit from someone with SOF familiarity):
- Whether widget_kcontrol_setup is in fact called the way I described in the resume path for HDA-generic-2ch, i.e. whether the LARGE_CONFIG_SET for gain.1.1 really runs from sof_widget_setup_unlocked line 208 in this scenario.
- Whether max_num_pipelines reported by firmware 2.13.0.1 on MTL is small enough that ~tens of leaks suffice to exhaust the pool. (SOF_IPC4_FW_CFG_MAX_PPL_COUNT is reported per-boot; I don't have it captured in dmesg from this boot.)
- Whether the original IPC timeout itself has a separate root cause worth chasing, independent of the leak it triggers. The DSP reported running after the timeout, so it was probably a kernel↔DSP scheduling/timing issue rather than a DSP crash, but I haven't investigated further.
Topology context (for cross-reference):
- Topology strings (extracted from sof-hda-generic-2ch.tplg) confirm pipeline.1 is the Analog Playback front-end (gain.1.1 → mixin.1.1 → pipeline.1 → dai-copier.HDA.Analog.playback), the same path whose volume restore timed out. pipeline.15 is the Deepbuffer HDA Analog playback front-end (gain.15.1 "Pre Mixer Deepbuffer HDA Analog Volume" → mixin.15.1 → pipeline.15). HDMI uses entirely different pipelines (pipeline.50/.51/.60/.61/.70/.71 via dai-copier.HDA.iDisp{1,2,3}.playback).
- The HiFi UCM profile disappearing after the pipewire restart is plausibly a downstream consequence of the analog-playback topology probe failing: if pcm 0 setup must succeed for HiFi to register, the failure would leave only pro-audio. Not verified against UCM source.
- The original timeout happened on a DSP runtime resume, not an S3 system resume: no PM: suspend exit appears near 15:18:57. The most recent external event in dmesg is the dock disconnect 19 minutes earlier at 14:59:18. The next-day failure at 08:49:34 is an S3 resume.
Describe the bug
All audio routed through the SOF DSP — internal speakers, headphone jack, internal microphones, HDMI audio — stopped working mid-session on a Meteor Lake-P laptop (ThinkPad X1 Carbon Gen 12, SOF IPC4 firmware 2.13.0.1, kernel 6.12.90). USB audio on the dock and Bluetooth audio still work because they don't traverse the DSP.
The trigger was a single DSP IPC timeout during a DSP resume. From that point on, every
hw_paramsagainst the affected PCMs fails withfailed to assign pipeline id for pipeline.N: -28. That error string is emitted fromsound/soc/sof/ipc4-topology.cimmediately afterida_alloc_max(&pipeline_ida, …)returns-ENOSPC— i.e. the failure is a kernel-side IDA pool exhaustion, the IPC is never sent to the DSP. See Root cause analysis below.The state has persisted across a userspace audio stack restart, a PCI driver unbind/bind that re-uploaded firmware, and an S3 suspend/resume cycle. It is reproducible at will from this point (every
hw_paramsagainst the affected PCMs fails the same way), but I do not have an isolated reproducer for the triggering IPC timeout itself.What I tried to diagnose: inspected dmesg for the IPC timeout sequence and the cascading
-ENOSPCerrors, captured PipeWire user journal at the moments of failure, restarted the audio stack, did a PCI driver rebind (firmware re-uploaded but failure persisted), did an S3 cycle (failure persisted), inspected/sys/kernel/debug/sof/*text entries (firmware is in state7 = SOF_FW_BOOT_COMPLETE), capturedamixer -c 0, and extracted strings fromsof-hda-generic-2ch.tplgto map the failing pipeline numbers back to widgets.Environment
v6.12.90(linux-6.12.y stable tag, packaged in NixOS 25.11 from the upstream tarballmirror://kernel/linux/kernel/v6.x/linux-6.12.90.tar.xz; no separate downstream SHA).sof-bin v2025.05.1(prebuilt release tarball fromthesofproject/sof-bin). Booted firmware reportsADSPFW 2.13.0.1. No separate firmware-source SHA pinned in nixpkgs.sof-bin v2025.05.1. Loaded topology:intel/sof-ace-tplg/sof-hda-generic-2ch.tplg, Topology ABI 3:29:1, Kernel ABI 3:23:1.intel/sof-ace-tplg/sof-hda-generic-2ch.tplgLENOVO-21KC00EEMX-ThinkPadX1CarbonGen12). Intel Meteor Lake-P HD Audio Controller, PCI0000:00:1f.3(8086:7e28). Realtek ALC287 HDA codec (HDA:10ec0287,17aa231e,00100002). 2 digital microphones; HDMI declared asiec61937-pcm:5,4,3. ALSA cardsof-hda-dsp(driversnd_soc_skl_hda_dsp). Userspace: PipeWire 1.4.9, WirePlumber 0.5.x.Reproducibility Rate
The triggering event (one DSP IPC timeout on a DSP resume) occurred 1 time over ~30 hours of uptime with many lid-driven S3 suspend/resume cycles and several Thunderbolt 3 dock connect/disconnect events. I cannot put a meaningful rate on the trigger.
After it occurred, the downstream symptom is 100% deterministic: every
hw_paramsattempt against the affected PCMs fails identically, and the failure has persisted through one userspace-services restart, one PCI driver unbind/bind, and one S3 suspend/resume. I have not yet rebooted.Steps to reproduce
I do not have an isolated reproducer for the triggering IPC timeout. The session that hit it:
pciehp: Card not present,undocked from hotplug port replicator). The HDMI display goes away with it.ipc timed out for 0x44000007|0x30000018). The kernel error path emitserror: set pcm hw_params after resume, which indicatessof_pcm_prepare()was running with theset_hw_params_upon_resumeflag set — i.e. the DSP was being resumed (almost certainly runtime PM, since noPM: suspend exitappears near this event). DSP reportsfw_state: SOF_FW_BOOT_COMPLETE (7)andROM_EXT, state: FW_ENTERED, running— firmware did not crash, just failed to respond inside the IPC timeout window. Kernel emits IPC/DSP dumps and abandons the post-resumepcm 0 dir 0setup with-ETIMEDOUT(-110).PM: suspend exit), the kernel tries to set uppcm 0 dir 0again and fails withfailed to assign pipeline id for pipeline.1: -28. Audio is broken from this point on.-ENOSPCsymptom then occurs on every subsequenthw_paramsattempt and persists across the one userspace-services restart, the one PCI driver unbind/bind, and the one S3 cycle I attempted (see Actual Result).Expected Result
pipeline_idais a file-static global, not per-snd_sof_dev).Actual Result
hw_paramsattempt has failed withfailed to assign pipeline id for pipeline.N: -28.pipeline.1forpcm 0 dir 0(HDA Analog / Speaker path) on every post-suspend retry, andpipeline.15forpcm 31 dir 0after I restarted pipewire (wireplumber then probed a different PCM).spa.alsa: set_hw_params: No space left on device; sinks go toerror;canberra-gtk-play -i bellhangs ~30 s and exits withFailed to play sound: IO error. (canberra-gtk-play -i bogusnamereturnsFile or data not foundimmediately, confirming the audio server is reachable.)systemctl --user restart pipewire pipewire-pulse wireplumber, the HiFi UCM profile no longer registers —wpctl inspectshows the card'sEnumProfilecontaining onlyoffandpro-audio. AllSpeakerand HDMI sinks disappear. The dock USB audio (separate ALSA card) and Bluetooth audio (off the SOF path entirely) continue to work — only paths routed through the SOF DSP are affected.Recovery attempts that did not clear the state:
pipewire pipewire-pulse wireplumber— services restarted cleanly; HiFi profile gone afterwards.Loaded firmware library: ADSPFW, version: 2.13.0.1andBooted firmware version: 2.13.0.1). Card 0 came back with fivePro Nsinks under thepro-audioprofile, butpipeline.15: -28fired immediately during the topology probe and continued every time any client tried to open a PCM.systemctl suspend(S3) and resume. NoBooted firmware version/Loaded firmware libraryline appeared in dmesg on this transition, i.e. the firmware was not re-uploaded by S3 resume on this platform. The-ENOSPCerrors continued unchanged.Recovery attempt that did clear the state, without a reboot:
- Once /dev/snd/* had no holders, sequential rmmod of snd_soc_skl_hda_dsp, snd_sof_probes, snd_hda_intel, snd_sof_pci_intel_mtl (cascading several deps), then snd_sof_intel_hda_common, snd_sof_intel_hda, snd_sof_pci, and finally snd_sof itself, followed by modprobe snd_sof_pci_intel_mtl and restarting the audio stack. Card 0 came back fully functional (HiFi UCM profile present, Speaker/HDMI sinks restored, playback works). This empirically confirms the leaked state lives entirely inside the snd_sof module; see the root cause analysis below for why module unload resets it but PCI rebind doesn't.
Impact
Showstopper for all SOF-routed audio (internal speakers, headphone jack, built-in microphones, HDMI audio) until a reboot or a full unload and reload of the snd_sof module stack. USB audio on the dock and Bluetooth audio continue to work because they don't traverse the SOF DSP. In practice it forced me to fall back to Bluetooth or the dock for any audio.
Proof
Pre-trigger context — most recent dmesg activity before the IPC timeout (2026-05-28 14:59:18, dock disconnect, then ~19 min of silence):
Triggering event — IPC timeout, dmesg, 2026-05-28 15:18:57:
Next-day S3 resume, dmesg, 2026-05-29 08:49:30–08:49:34 (first -ENOSPC, with surrounding suspend-exit context):
Corresponding PipeWire user-journal (journalctl --user -u pipewire) at the same moment:
Dmesg from the PCI rebind (2026-05-29 14:26:40–14:26:41) showing firmware was re-uploaded but -ENOSPC returned immediately:
amixer -c 0 in the broken state:
Master is muted, every playback/capture switch is [off]. Included for reference per the bug-tracking guide.
Attachments
I will attach (or can attach on request):
- journalctl -k -b from the affected boot.
- journalctl --user -u pipewire from the affected boot.
- amixer -c 0 output.
- Topology (sof-hda-generic-2ch.tplg, strings-extracted) showing the pipeline → widget mapping used for the analysis below.
- /sys/kernel/debug/sof/{pp,exception,fw_regs,dsp,hda} captured while the bug was active (I do not have a matching .ldc to decode them locally; sof-tools 2.10 on this system is older than the booted firmware 2.13.0.1, and this kernel build does not expose etrace/trace either, so sof-logger cannot be run live).
Missing data
No live sof-logger trace is available for either the original IPC timeout or the current broken state: this kernel build does not expose etrace/trace under /sys/kernel/debug/sof/, and the sof-tools dictionary on the system (sof-tools 2.10) is older than the booted firmware (2.13.0.1) anyway. No core dump was generated: dmesg shows Core dump is not available due to invalid separator 0xc0de.
Root cause analysis (from Claude, read with scepticism)
This entire report — log capture, analysis, the source walk below — was drafted by Claude with me reviewing. Below, "verified" means Claude read the actual source listed against the file paths; "inferred" means it followed from the source plus the observed symptoms but has not been reproduced with a kernel-side patch yet. I have not independently audited the source claims — please verify.
Verified against linux-stable v6.12.90 (the booted kernel) and cross-checked against thesofproject/linux topic/sof-dev HEAD:
- failed to assign pipeline id for pipeline.N: -28 is emitted from sound/soc/sof/ipc4-topology.c:2637, immediately after ida_alloc_max(&pipeline_ida, ipc4_data->max_num_pipelines, GFP_KERNEL) returns -ENOSPC. The IPC for pipeline-create is never sent to the DSP in this path: it returns before the sof_ipc_tx_message_no_reply call at line 2758. So the failure is host-side ID-pool exhaustion, not a DSP-side resource allocator failure.
- pipeline_ida is static DEFINE_IDA(pipeline_ida); at sound/soc/sof/ipc4-topology.c:37 (file-scope, module-global). It is not part of struct snd_sof_dev and is not reinitialised on PCI unbind/bind. Only ida_destroy at module unload would reset it. This explains why both the PCI rebind and the S3 cycle failed to recover.
- No firmware status maps to -ENOSPC in the kernel's IPC4 status-to-errno table (sound/soc/sof/ipc4.c:106–127); the default mapping is -EINVAL. So the -28 cannot have come from a DSP reply. It has to be the IDA, which matches the call-site reading.
- The sof_widget_setup_unlocked code (the suspect leak path described below) is present in current topic/sof-dev HEAD with no behavioural change since v6.12.90, i.e. if the analysis is correct, the upstream SOF tree is also vulnerable.
Inferred (consistent with source and symptoms, not yet reproduced under a patch):
- The timed-out IPC 0x44000007|0x30000018 was a LARGE_CONFIG_SET for the Pre Mixer Analog Playback Volume kcontrol (gain.1.1). This is what the immediately-following kernel lines say (sof_ipc4_set_get_data: large config set failed at offset 0: -110 / Failed to set volume update for Pre Mixer Analog Playback Volume / kcontrol 4 set up failed for widget gain.1.1). So the IPC that timed out was a kcontrol restore during post-resume widget-list setup, not a pipeline-create.
- The suspected leak is in the error path of sof_widget_setup_unlocked (sound/soc/sof/sof-audio.c:134). For a dynamic pipeline widget (in this topology, gain.1.1 and its peers), the setup path is:
  - pipe_widget is set up first → pipe_widget.use_count: 0→1, IDA allocated at ipc4-topology.c:2634, Create Pipeline IPC sent and replied OK.
  - swidget.use_count: 0→1, tplg_ops->widget_setup(gain) succeeds.
  - widget_kcontrol_setup(gain) is called at line 208. This is where the LARGE_CONFIG_SET was sent and timed out, returning -ETIMEDOUT.
  - Control jumps to the widget_free: label at line 217.
  - widget_free: calls sof_widget_free_unlocked(swidget=gain):
    - gain.use_count: 1→0, proceeds to free.
    - Delete Module Instance for the gain.
    - The dynamic_pipeline_widget branch at sof-audio.c:109 is reached. Because gain.dynamic_pipeline_widget == true and gain.id != snd_soc_dapm_scheduler, it recursively calls sof_widget_free_unlocked(pipe_widget=scheduler):
      - scheduler.use_count: 1→0, proceeds.
      - Delete Pipeline, then frees the IDA at ipc4-topology.c:2809. ← correct, first free.
  - Control falls through to the pipe_widget_free: label at line 221. swidget=gain is not a scheduler, so it calls sof_widget_free_unlocked(pipe_widget=scheduler) a second time.
  - In that second call, if (--swidget->use_count) return 0; takes scheduler.use_count from 0 to -1, the early-return fires, no IPC is sent, no IDA is freed. But use_count is now stuck at -1 for the lifetime of this snd_sof_widget.
- From then on, every setup/teardown cycle of that pipeline leaks one ID from pipeline_ida:
  - On setup, use_count is incremented to 0. if (0 > 1) is false, so it does not treat the widget as already-set-up and proceeds with a fresh ida_alloc_max + Create Pipeline. Use-count is now 0 instead of the expected 1.
  - On teardown, use_count is decremented to -1. if (-1) is truthy → early-returns. Delete Pipeline is not sent and ida_free is not called.
  - Net result: one ID is never returned to pipeline_ida.
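The sequence above can be modelled outside the kernel. The following is a toy Python sketch, not kernel code: IdPool, PipeWidget, failed_setup and cycles_until_enospc are hypothetical names, the pool limit of 16 is an arbitrary placeholder (the real max_num_pipelines was not captured), and only one widget per pipeline is modelled. It reproduces just the two use_count checks described above, assuming they behave as the walk-through says.

```python
import errno


class IdPool:
    """Toy stand-in for the kernel IDA: hands out the lowest free ID below `limit`."""

    def __init__(self, limit):
        self.limit = limit
        self.used = set()

    def alloc(self):
        for candidate in range(self.limit):
            if candidate not in self.used:
                self.used.add(candidate)
                return candidate
        return -errno.ENOSPC  # pool exhausted, like ida_alloc_max()

    def free(self, ident):
        self.used.discard(ident)


class PipeWidget:
    """Models only use_count and the pipeline ID of one scheduler widget."""

    def __init__(self, pool):
        self.pool = pool
        self.use_count = 0
        self.pipeline_id = None

    def setup(self):
        self.use_count += 1
        if self.use_count > 1:       # "already set up" check
            return 0
        ret = self.pool.alloc()
        if ret < 0:
            self.use_count -= 1      # undo the increment on failure
            return ret
        self.pipeline_id = ret
        return 0

    def free(self):
        self.use_count -= 1
        if self.use_count:           # any non-zero value, including -1, returns early
            return
        self.pool.free(self.pipeline_id)
        self.pipeline_id = None


def failed_setup(pipe, free_twice=True):
    """Setup whose kcontrol restore fails after the pipeline was created."""
    pipe.setup()                     # pipe_widget: 0 -> 1, ID allocated
    pipe.free()                      # first free, via the gain widget: 1 -> 0
    if free_twice:
        pipe.free()                  # second free via pipe_widget_free: 0 -> -1


def cycles_until_enospc(limit, free_twice=True, max_cycles=1000):
    """Count successful setup/teardown cycles after one failed setup."""
    pipe = PipeWidget(IdPool(limit))
    failed_setup(pipe, free_twice)
    for cycle in range(max_cycles):
        if pipe.setup() < 0:
            return cycle
        pipe.free()
    return None                      # never ran out within max_cycles


if __name__ == "__main__":
    print(cycles_until_enospc(limit=16))                    # double free: 16
    print(cycles_until_enospc(limit=16, free_twice=False))  # single free: None
```

With the double free, the model exhausts the pool after exactly `limit` cycles; with a single free on the error path it never does. That is the behaviour the analysis predicts, but the model says nothing about whether the real driver takes this path.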
After enough resume/teardown cycles (Claude estimates max_num_pipelines is roughly mid-double-digits on MTL, and the user did many lid suspend/resume cycles in the ~17 h between the trigger and the next-day failure), the global pool fills and every ida_alloc_max returns -ENOSPC.
Why this matches every observed symptom:
- The leak only occurs on the widget_free: label fall-through (i.e. dai_config or widget_kcontrol_setup failing after the recursive pipe_widget setup succeeded). A failure of tplg_ops->widget_setup (line 186) jumps directly to pipe_widget_free: and is not affected. The user's log shows the failure at widget_kcontrol_setup, which is exactly this path.
- PCI rebind and S3 did not recover: pipeline_ida is module-global; neither operation resets it.
- Every hw_params attempt now fails because ida_alloc_max fails.
- The pipeline.N in the error message is swidget->widget->name (topology name), not the IDA value; the IDA result was the negative return (-28).
Suggested patch (drafted by Claude, untested: neither compiled nor run):
Made against linux-stable tag v6.12.90 (commit 2538fbeff8a94ee2b54eb09d92209e24a1e650d4, the running kernel). The same patch also applies to thesofproject/linux topic/sof-dev at commit 3a0f2aeac2e3a8020488c21afef5b483027514fc (HEAD as of 2026-05-29); Claude diffed the surrounding region in both trees and the context is identical.
Skip the pipe_widget_free: label when the inner sof_widget_free_unlocked already propagated the free to pipe_widget via the dynamic_pipeline_widget branch at sof-audio.c:109. The non-dynamic and scheduler-itself paths still need pipe_widget_free: for the core-refcount decrement, so the label can't simply be removed.
Verified with git apply --check against both trees (clean apply, exit 0 each).
Still to confirm (would benefit from someone with SOF familiarity):
- Whether widget_kcontrol_setup is in fact called the way I described in the resume path for HDA-generic-2ch, i.e. whether the LARGE_CONFIG_SET for gain.1.1 really runs from sof_widget_setup_unlocked line 208 in this scenario.
- Whether max_num_pipelines reported by firmware 2.13.0.1 on MTL is small enough that ~tens of leaks suffice to exhaust the pool. (SOF_IPC4_FW_CFG_MAX_PPL_COUNT is reported per-boot; I don't have it captured in dmesg from this boot.)
- Why the original IPC timed out. The DSP still reported running after the timeout, so it was almost certainly a kernel↔DSP scheduling/timing issue and not a DSP crash, but I haven't investigated further.
Topology context (for cross-reference):
- Strings extracted from the topology (sof-hda-generic-2ch.tplg) confirm pipeline.1 is the Analog Playback front-end (gain.1.1 → mixin.1.1 → pipeline.1 → dai-copier.HDA.Analog.playback), the same path whose volume restore timed out. pipeline.15 is the Deepbuffer HDA Analog playback front-end (gain.15.1 "Pre Mixer Deepbuffer HDA Analog Volume" → mixin.15.1 → pipeline.15). HDMI uses entirely different pipelines (pipeline.50/.51/.60/.61/.70/.71 via dai-copier.HDA.iDisp{1,2,3}.playback).
- If pcm 0 setup must succeed for HiFi to register, the failure would leave only pro-audio. Not verified against UCM source.
- No PM: suspend exit appears near 15:18:57. The most recent external event in dmesg is the dock disconnect 19 minutes earlier at 14:59:18. The next-day failure at 08:49:34 is an S3 resume.
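For anyone cross-checking the timeline, the two intervals quoted in this report (about 19 minutes of quiet before the trigger, and roughly 17 h in which leaked IDs could accumulate) follow directly from the timestamps. A small sketch, assuming all three timestamps are in the same local timezone; the variable and function names are only illustrative:

```python
from datetime import datetime

# Timestamps as quoted in this report (from dmesg).
dock_disconnect = datetime(2026, 5, 28, 14, 59, 18)
ipc_timeout = datetime(2026, 5, 28, 15, 18, 57)
first_enospc = datetime(2026, 5, 29, 8, 49, 34)


def gap_minutes(start, end):
    """Elapsed time between two events, in minutes."""
    return (end - start).total_seconds() / 60


quiet_before_trigger = gap_minutes(dock_disconnect, ipc_timeout)
trigger_to_failure_hours = gap_minutes(ipc_timeout, first_enospc) / 60

print(f"dock disconnect -> IPC timeout: {quiet_before_trigger:.1f} min")
print(f"IPC timeout -> first -ENOSPC:   {trigger_to_failure_hours:.1f} h")
```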