Conversation

@ivarflakstad (Member) commented Sep 20, 2025

Makes the tensor backend generic, which will allow us to support any number of backends. With a few tweaks to the repo, a backend can even be defined outside of the main repo, as opposed to the existing enum approach.

This PR introduces the traits BackendStorage, BackendDevice, and QuantizedBackend (the latter for quantization support).
A candle tensor now has the following definition:

#[derive(Clone)]
pub struct Tensor<B>(Arc<Tensor_<B>>)
where
    B: BackendStorage;

Previously, the backend storage was provided by this enum:

pub enum Storage {
    Cpu(CpuStorage),
    Cuda(CudaStorage),
    Metal(MetalStorage),
}

Now we instead implement BackendStorage for each of the three variants of the enum, as well as for any new backends we would like to support in the future.
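
To sketch what defining a backend outside the main repo could look like (the trait below is a simplified, hypothetical stand-in for illustration only; the real BackendStorage has more required methods):

// Simplified stand-in for the real trait, for illustration only.
pub trait BackendStorage: Sized {
    type Device;
    fn device(&self) -> &Self::Device;
}

pub struct MyAcceleratorDevice;

pub struct MyAcceleratorStorage {
    device: MyAcceleratorDevice,
    // raw buffer handle, dtype, shape metadata, etc. would live here
}

impl BackendStorage for MyAcceleratorStorage {
    type Device = MyAcceleratorDevice;

    fn device(&self) -> &Self::Device {
        &self.device
    }
}

With such an impl in place, Tensor<MyAcceleratorStorage> behaves like any built-in backend.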

The original Storage is kept around; it also implements BackendStorage and can be used just like before. All you have to do is specify your tensor as Tensor<Storage>, and all code that depends on the inner enum etc. will work as before. If you want to try transitioning to the new scheme, pick the backend of your choice, e.g. Tensor<CudaStorage>.
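
For example (the constructor and method names here are illustrative and may not match the final API exactly):

// Existing code keeps compiling against the enum-backed storage:
let t: Tensor<Storage> = Tensor::zeros((2, 2), DType::F32, &device)?;

// Opting into a concrete backend at the type level:
let t: Tensor<CudaStorage> = Tensor::zeros((2, 2), DType::F32, &cuda_device)?;

// And code can be written once, generically over any backend:
fn describe<B: BackendStorage>(t: &Tensor<B>) -> String {
    format!("{:?}", t.shape())
}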

The original Storage is kept for now because:

  1. It makes transitioning easier for projects that depend on candle.
  2. There is no easy or elegant way to use generics with pyo3 (ref issue); see the sketch after this list.
    In my experience, many people who use candle do so partly because they want to avoid Python, so I'm not sure how much traction candle-pyo3 has. If it is not valuable to the community, it would probably be better for the project as a whole to deprecate it. Deprecating the old Storage would be a logical next step as we push users toward the new approach to backends.
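
For context, the pyo3 limitation looks roughly like this: #[pyclass] rejects generic type parameters, so a binding layer has to commit to a concrete storage type. A minimal sketch (PyTensor is a hypothetical wrapper, not an existing candle-pyo3 type):

use pyo3::prelude::*;

// This fails to compile: #[pyclass] cannot be applied to a generic struct.
// #[pyclass]
// struct PyTensor<B: BackendStorage>(Tensor<B>);

// The usual workaround is a monomorphized wrapper, which is exactly what
// keeping the enum-backed Storage makes straightforward:
#[pyclass]
struct PyTensor(Tensor<Storage>);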

@greenrazer (Contributor) commented

Some things I found in examples that use Metal:

  • based
    • Metal conv1d BF16 not implemented
  • beit
    • Metal strided affine I64 not implemented
  • Deepseek
    • thread '<unnamed>' panicked at /Users/kb/Documents/webclones/candle/candle-transformers/src/models/deepseek2.rs:1071:37: called Result::unwrap() on an Err value: Metal error Could not lock kernel map: Command buffer map
      note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
  • Depth Anything has some bigger issues, but I'll work on that

@ivarflakstad (Member, Author) commented

candle-transformers/src/models/deepseek2.rs:1071:37: called Result::unwrap() on an Err value: Metal error Could not lock kernel map: Command buffer map
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

Oh whoops, this was not supposed to be included in this PR. I was testing using rayon inside model code to launch several pieces simultaneously. It cut the initial loading time in half, so we should explore it, but not in this PR :)
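
(For the curious, the experiment was along these lines; vb and the two loader functions are made-up names:)

// Hypothetical sketch: build two independent model pieces on separate threads.
let (attn, mlp) = rayon::join(
    || load_attention_weights(&vb),
    || load_mlp_weights(&vb),
);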

Just for completeness, the fix is simply to ensure the current thread actually has a command buffer:

use std::thread;

pub fn command_buffer(&mut self) -> Result<(bool, CommandBuffer), MetalKernelError> {
    let mut command_buffers = self.command_buffers.lock()?;
    let thread_id = thread::current().id();
    // Lazily create a command buffer the first time this thread asks for one.
    // (Written with contains_key/insert rather than a match on get_mut, since
    // that pattern trips over the borrow checker.)
    if !command_buffers.contains_key(&thread_id) {
        let command_buffer = create_command_buffer(&self.command_queue)?;
        command_buffers.insert(thread_id, command_buffer);
    }
    // The entry is guaranteed to exist now, so this unwrap cannot fail.
    let command_buffer = command_buffers.get_mut(&thread_id).unwrap();
    ...
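
Keying the map by thread ID means each worker thread (e.g. the rayon threads from the loading experiment above) lazily gets its own command buffer rather than contending over a single shared one.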

ivarflakstad and others added 5 commits October 2, 2025 13:26
* got depth anything v2 example working

* fixed tensor::try_from -> tensor::new

* Depth anything v2 example: init device generically. Use anyhow result

---------

Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>
@ivarflakstad changed the base branch from main to v2 on October 9, 2025 at 19:41
@ivarflakstad changed the title from [WIP] Generic Tensor<B> to Generic Tensor<B> on October 9, 2025
@ivarflakstad marked this pull request as ready for review on October 9, 2025 at 19:57
@ivarflakstad merged commit df1a203 into v2 on October 9, 2025
18 of 21 checks passed
@ivarflakstad deleted the generic-tensor branch on October 9, 2025 at 20:12