TurboQuant+ v0.1.0: TQ4_1S native kernel 3.5x faster, AMD RDNA4 arch dispatch, Vulkan turbo3 KV, multi-GPU fix, static build fix