vulkan suite and multiple maintainers
Yuri Victorovich
yuri at freebsd.org
Sat Apr 25 16:14:47 BST 2026
Olivier,
* Correction (use_vulkan should be ON of course):
# ollama
ollama_enable="YES"
ollama_user={your-user}
ollama_use_vulkan=1
I believe that Vulkan support is somewhat broken in ollama, so it
sometimes acts up when models do not fit. A smaller model like qwen2:1.5b
only needs <2 GB and should fit on any device.
It should work, meaning you should be able to chat with it, and the log
at /var/ollama-{user}.log should show that most or all layers were
offloaded to the GPU (grep for the word "offload").
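For reference, a minimal sketch of how one might verify this (the rc.conf variables, model name, and log path come from this thread; the exact `service` and `ollama` invocations are assumptions, not verified on this system):

```shell
# Restart the daemon so the new rc.conf settings take effect
# (assumes the FreeBSD rc script is named "ollama").
service ollama restart

# Pull and chat with a small model that needs <2 GB of VRAM.
ollama pull qwen2:1.5b
ollama run qwen2:1.5b "Say hello"

# Check the log for offloaded layers; substitute your user name
# for {user}. A working Vulkan setup should log lines mentioning
# layers being offloaded to the GPU.
grep -i offload /var/ollama-{user}.log
```

If the grep turns up nothing, or the log shows zero offloaded layers, the daemon has most likely fallen back to CPU inference.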
Thanks,
Yuri
On 4/25/26 08:04, Yuri Victorovich wrote:
> Hi Olivier,
>
>
> Please run the ollama daemon with some smaller models that would fit
> into your Vulkan device to see if they work.
>
> # ollama
> ollama_enable="YES"
> ollama_user={your-user}
> ollama_use_vulkan=0
>
>
> Thanks,
> Yuri
>
>
>
>
> On 4/25/26 02:27, Olivier Cochard-Labbé wrote:
>> Hi,
>>
>> The vulkan suite seems to be upgraded as a whole, but we have three
>> different maintainers:
>> - graphics/spirv-cross : vvd
>> - graphics/spirv-headers : Yuri
>> - graphics/vulkan-*: kde team
>>
>> So, how does it work when we would like to upgrade it? :-)
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=294716
>>
>> And given the sensitivity of this API, what kind of regression tests
>> should we run to seriously test the new version?
>> I'm running vkcube and llama.cpp's llama-bench on my side, but would
>> like to get some input from your side.
>>
>> Thanks!
>> Olivier
>
>
More information about the kde-freebsd mailing list