API reference (Backends)¶
This section lists all array classes that are available in the various backend-specific submodules.
Dr.Jit types derive from drjit.ArrayBase and generally do not
implement any methods beyond those of the base class, which makes this section
rather repetitious.
Scalar array namespace (drjit.scalar)¶
The scalar backend directly operates on individual floating point/integer values without the use of parallelization or vectorization.
For example, a drjit.scalar.Array3f instance represents a
simple 3D vector with 3 float-valued entries. In the JIT-compiled
backends (CUDA, LLVM), the same Array3f type represents an array of 3D
vectors partaking in a parallel computation.
Scalars¶
- drjit.scalar.Bool: type = bool¶
- drjit.scalar.Float16: type = half¶
- drjit.scalar.Float: type = float¶
- drjit.scalar.Float64: type = float¶
- drjit.scalar.Int: type = int¶
- drjit.scalar.Int8: type = int¶
- drjit.scalar.Int64: type = int¶
- drjit.scalar.UInt: type = int¶
- drjit.scalar.UInt8: type = int¶
- drjit.scalar.UInt64: type = int¶
1D arrays¶
- class drjit.scalar.Array0b¶
- class drjit.scalar.Array1b¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array2b¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array3b¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array4b¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.ArrayXb¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array0f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array1f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array2f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array3f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array4f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.ArrayXf16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array0f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array1f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array2f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array3f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array4f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.ArrayXf¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array0u¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array1u¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array2u¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array3u¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array4u¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.ArrayXu¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array0i¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array1i¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array2i¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array3i¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array4i¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.ArrayXi¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array0f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array1f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array2f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array3f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array4f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.ArrayXf64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array0u64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array1u64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array2u64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array3u64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array4u64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.ArrayXu64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array0i64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array1i64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array2i64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array3i64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array4i64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.ArrayXi64¶
Derives from
drjit.ArrayBase.
2D arrays¶
- class drjit.scalar.Array22b¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array33b¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array44b¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array22f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array33f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array44f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array22f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array33f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array44f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array22f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array33f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Array44f64¶
Derives from
drjit.ArrayBase.
Special (complex numbers, etc.)¶
- class drjit.scalar.Complex2f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Complex2f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Quaternion4f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Quaternion4f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Quaternion4f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Matrix2f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Matrix3f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Matrix4f16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Matrix2f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Matrix3f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Matrix4f¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Matrix2f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Matrix3f64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.Matrix4f64¶
Derives from
drjit.ArrayBase.
Tensors¶
- class drjit.scalar.TensorXb¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.TensorXf16¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.TensorXf¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.TensorXu¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.TensorXi¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.TensorXf64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.TensorXu64¶
Derives from
drjit.ArrayBase.
- class drjit.scalar.TensorXi64¶
Derives from
drjit.ArrayBase.
Textures¶
- class drjit.scalar.Texture1f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.scalar.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count.
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.scalar.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.scalar.ArrayXf16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the migrate argument and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.scalar.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the migrate argument and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- update_inplace(self, migrate: bool = False) None¶
Update the texture after applying an indirect update to its tensor representation (obtained with tensor()).
A tensor representation of this texture object can be retrieved with tensor(). That representation can be modified, but for the changes to take effect in the texture, this method must also be called. In short, this method uses the tensor representation to update the texture’s internal state.
In CUDA mode, when both the migrate argument and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
- value(self) drjit.scalar.ArrayXf16¶
Return the texture data as an array object
- tensor(self) drjit.scalar.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) list[float]¶
- eval(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) list[float]
- eval(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) list[float]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
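As a rough illustration of what evaluating the linear interpolant means for a 1D, single-channel texture, the following pure-Python sketch may help. It is not Dr.Jit's implementation: the helper names, the texel-center convention (centers at `(i + 0.5) / n`), and the clamp wrapping are assumptions made for this example.

```python
import math


def clamp_index(i, n):
    """Clamp a texel index to the valid range [0, n - 1] (assumed clamp wrapping)."""
    return min(max(i, 0), n - 1)


def eval_linear_1d(texels, pos):
    """Linearly interpolate a 1D single-channel texture at a normalized position.

    Hypothetical helper: assumes texel centers at (i + 0.5) / n.
    """
    n = len(texels)
    x = pos * n - 0.5        # continuous texel coordinate
    i0 = math.floor(x)       # left neighbor
    w1 = x - i0              # weight of the right neighbor
    left = texels[clamp_index(i0, n)]
    right = texels[clamp_index(i0 + 1, n)]
    return (1.0 - w1) * left + w1 * right


texels = [0.0, 10.0, 20.0, 30.0]
midpoint = eval_linear_1d(texels, 0.5)   # halfway between texels 1 and 2
outside = eval_linear_1d(texels, 2.0)    # beyond the boundary, clamped
```

Positions outside `[0, 1]` simply repeat the boundary texel under the assumed clamp behavior.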
- eval_fetch(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) list[list[float]]¶
- eval_fetch(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) list[list[float]]
- eval_fetch(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) list[list[float]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]¶
- eval_cubic(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
- eval_cubic(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
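The rewrite of a cubic B-spline lookup as a weighted sum of linear lookups can be sketched in pure Python for a 1D, single-channel texture. This only illustrates the idea from the cited GPU Gems chapter and is not Dr.Jit's code: all function names, the texel-center convention, and the clamp wrapping are assumptions.

```python
import math


def _texel(texels, i):
    # Assumed clamp wrapping: out-of-range indices repeat the boundary texel.
    return texels[min(max(i, 0), len(texels) - 1)]


def _linear(texels, x):
    # Linear lookup at a continuous texel coordinate x.
    i0 = math.floor(x)
    t = x - i0
    return (1.0 - t) * _texel(texels, i0) + t * _texel(texels, i0 + 1)


def _bspline_weights(a):
    # Uniform cubic B-spline basis weights for a fractional offset a in [0, 1).
    w0 = (1.0 - a) ** 3 / 6.0
    w1 = (3.0 * a ** 3 - 6.0 * a ** 2 + 4.0) / 6.0
    w2 = (-3.0 * a ** 3 + 3.0 * a ** 2 + 3.0 * a + 1.0) / 6.0
    w3 = a ** 3 / 6.0
    return w0, w1, w2, w3


def cubic_direct(texels, pos):
    # Direct evaluation: four texel fetches weighted by the basis functions.
    x = pos * len(texels) - 0.5
    i = math.floor(x)
    w = _bspline_weights(x - i)
    return sum(w[k] * _texel(texels, i - 1 + k) for k in range(4))


def cubic_via_linear(texels, pos):
    # Same result from two linear lookups at shifted positions.
    x = pos * len(texels) - 0.5
    i = math.floor(x)
    w0, w1, w2, w3 = _bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3
    return (g0 * _linear(texels, i - 1 + w1 / g0)
            + g1 * _linear(texels, i + 1 + w3 / g1))
```

Both functions agree up to rounding. The second form needs only two linear fetches in 1D, which is what makes hardware texture units useful here.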
- eval_cubic_grad(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
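To make the scaling by the spatial extent concrete, here is a 1D pure-Python sketch that differentiates the B-spline basis functions explicitly. The function names, the texel-center convention, and the clamp wrapping are assumptions for this example. It returns only the derivative, whereas the documented method returns a tuple.

```python
import math


def _texel(texels, i):
    # Assumed clamp wrapping at the texture boundary.
    return texels[min(max(i, 0), len(texels) - 1)]


def cubic_grad_1d(texels, pos):
    """Derivative of the cubic B-spline interpolant with respect to pos."""
    n = len(texels)
    x = pos * n - 0.5
    i = math.floor(x)
    a = x - i
    # Derivatives of the four basis weights with respect to the offset a.
    dw = (
        -0.5 * (1.0 - a) ** 2,
        1.5 * a * a - 2.0 * a,
        -1.5 * a * a + a + 0.5,
        0.5 * a * a,
    )
    d_dx = sum(dw[k] * _texel(texels, i - 1 + k) for k in range(4))
    # Chain rule: x = pos * n - 0.5, so d/dpos picks up a factor of n.
    return d_dx * n
```

For a linear ramp stored in eight texels, the derivative with respect to the texel coordinate is 1, so the result in normalized coordinates is 8 away from the boundaries.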
- eval_cubic_hessian(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
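The second derivative follows the same pattern, with the spatial extent entering squared. The sketch below is a 1D pure-Python illustration with assumed names, an assumed texel-center convention, and assumed clamp wrapping. It returns only the second derivative rather than the tuple documented above.

```python
import math


def _texel(texels, i):
    # Assumed clamp wrapping at the texture boundary.
    return texels[min(max(i, 0), len(texels) - 1)]


def cubic_hessian_1d(texels, pos):
    """Second derivative of the cubic B-spline interpolant with respect to pos."""
    n = len(texels)
    x = pos * n - 0.5
    i = math.floor(x)
    a = x - i
    # Second derivatives of the four basis weights with respect to a.
    d2w = (1.0 - a, 3.0 * a - 2.0, 1.0 - 3.0 * a, a)
    d2_dx2 = sum(d2w[k] * _texel(texels, i - 1 + k) for k in range(4))
    # Two applications of the chain rule give a factor of n squared.
    return d2_dx2 * n * n
```

With texels holding `i * i`, the second derivative in texel coordinates is 2, which becomes `2 * n * n` in normalized coordinates.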
- eval_cubic_helper(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) list[float]¶
- eval_cubic_helper(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) list[float]
- eval_cubic_helper(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) list[float]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.scalar.Texture2f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.scalar.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count.
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.scalar.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.scalar.ArrayXf16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the migrate argument and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
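For a 2D texture, the "linearized 1D array" can be pictured as the flattened texel grid. The sketch below assumes a row-major layout with channels varying fastest, i.e. a grid of shape (height, width, channels). That layout and both helper names are assumptions for this example and are not stated in this reference.

```python
def linear_index(y, x, c, width, channels):
    """Index into the flattened array for row y, column x, channel c.

    Hypothetical helper assuming row-major order with channels innermost.
    """
    return (y * width + x) * channels + c


def flatten(grid):
    """Flatten a nested [row][column][channel] list into a 1D list."""
    return [value for row in grid for texel in row for value in texel]


# A 2 x 3 texture with 2 channels per texel.
grid = [
    [[0.0, 0.5], [1.0, 1.5], [2.0, 2.5]],
    [[3.0, 3.5], [4.0, 4.5], [5.0, 5.5]],
]
flat = flatten(grid)
value = flat[linear_index(1, 2, 1, width=3, channels=2)]  # same entry as grid[1][2][1]
```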
- set_tensor(self, tensor: drjit.scalar.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the migrate argument and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- update_inplace(self, migrate: bool = False) None¶
Update the texture after applying an indirect update to its tensor representation (obtained with tensor()).
A tensor representation of this texture object can be retrieved with tensor(). That representation can be modified, but for the changes to take effect in the texture, this method must also be called. In short, this method uses the tensor representation to update the texture’s internal state.
In CUDA mode, when both the migrate argument and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
- value(self) drjit.scalar.ArrayXf16¶
Return the texture data as an array object
- tensor(self) drjit.scalar.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) list[float]¶
- eval(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) list[float]
- eval(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) list[float]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) list[list[float]]¶
- eval_fetch(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) list[list[float]]
- eval_fetch(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) list[list[float]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
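In 2D, a linear lookup references four texels. The following pure-Python sketch separates the fetch from the interpolation for a single-channel grid. The function names, the order in which the four texels are returned, the mapping of the first coordinate to columns, the texel-center convention, and the clamp wrapping are all assumptions for this example. The documented method returns one list of channel values per texel instead.

```python
import math


def fetch_2d(grid, u, v):
    """Return the four texels around (u, v) and the fractional offsets."""
    height, width = len(grid), len(grid[0])
    x = u * width - 0.5
    y = v * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def at(row, col):
        # Assumed clamp wrapping along both dimensions.
        row = min(max(row, 0), height - 1)
        col = min(max(col, 0), width - 1)
        return grid[row][col]

    texels = [at(y0, x0), at(y0, x0 + 1), at(y0 + 1, x0), at(y0 + 1, x0 + 1)]
    return texels, (fx, fy)


def bilinear(texels, frac):
    """Combine four fetched texels into the bilinear interpolant."""
    t00, t10, t01, t11 = texels
    fx, fy = frac
    top = (1.0 - fx) * t00 + fx * t10
    bottom = (1.0 - fx) * t01 + fx * t11
    return (1.0 - fy) * top + fy * bottom


grid = [[0.0, 1.0], [2.0, 3.0]]
texels, frac = fetch_2d(grid, 0.5, 0.5)
center = bilinear(texels, frac)
```

Keeping the fetch separate is useful when the caller wants to apply its own weighting to the neighboring texels.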
- eval_cubic(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]¶
- eval_cubic(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
- eval_cubic(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) list[float]¶
- eval_cubic_helper(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) list[float]
- eval_cubic_helper(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) list[float]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.scalar.Texture3f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.scalar.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count.
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.scalar.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.scalar.ArrayXf16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the migrate argument and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.scalar.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the migrate argument and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- update_inplace(self, migrate: bool = False) None¶
Update the texture after applying an indirect update to its tensor representation (obtained with tensor()).
A tensor representation of this texture object can be retrieved with tensor(). That representation can be modified, but for the changes to take effect in the texture, this method must also be called. In short, this method uses the tensor representation to update the texture’s internal state.
In CUDA mode, when both the migrate argument and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
- value(self) drjit.scalar.ArrayXf16¶
Return the texture data as an array object
- tensor(self) drjit.scalar.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) list[float]¶
- eval(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) list[float]
- eval(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) list[float]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) list[list[float]]¶
- eval_fetch(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) list[list[float]]
- eval_fetch(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) list[list[float]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]¶
- eval_cubic(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
- eval_cubic(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient and Hessian matrix of a cubic B-Spline.
This implementation computes the result directly from explicitly differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) list[float]¶
- eval_cubic_helper(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) list[float]
- eval_cubic_helper(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) list[float]
Helper function to evaluate a clamped cubic B-Spline interpolant.
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.scalar.Texture1f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.scalar.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count.
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.scalar.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
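To make the clamp wrapping behavior concrete, here is a minimal backend-independent sketch in plain Python. The helper name `clamp_index` is hypothetical and not part of Dr.Jit; it only mirrors the description above, where lookups outside the texture reuse the boundary texel.

```python
def clamp_index(i: int, size: int) -> int:
    """Map an arbitrary integer texel index into [0, size - 1] (hypothetical helper)."""
    return max(0, min(i, size - 1))


texels = [10.0, 20.0, 30.0, 40.0]

# Indices left of the texture reuse the first texel, indices right of it the last.
lookups = [texels[clamp_index(i, len(texels))] for i in (-3, 0, 2, 3, 7)]
print(lookups)
```

Other wrap modes would only change how the index is mapped back into range; the fetch itself stays the same.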
- set_value(self, value: drjit.scalar.ArrayXf, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.scalar.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- update_inplace(self, migrate: bool = False) None¶
Update the texture after applying an indirect update to its tensor representation (obtained with tensor()).
A tensor representation of this texture object can be retrieved with tensor(). That representation can be modified, but in order to apply the changes successfully to the texture, this method must also be called. In short, this method will use the tensor representation to update the texture’s internal state.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
- value(self) drjit.scalar.ArrayXf¶
Return the texture data as an array object
- tensor(self) drjit.scalar.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) list[float]¶
- eval(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) list[float]
- eval(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) list[float]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
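The following plain-Python sketch illustrates what a 1D linear lookup computes. It is not Dr.Jit code: the function name is hypothetical, and it assumes normalized positions in [0, 1] with texel `i` centered at `(i + 0.5) / n` and clamp wrapping.

```python
import math


def eval_linear_1d(texels, pos):
    """Linearly interpolate `texels` at normalized position `pos` (sketch, clamp wrapping)."""
    n = len(texels)
    x = pos * n - 0.5            # assumed convention: texel i is centered at (i + 0.5) / n
    i = math.floor(x)
    t = x - i                    # fractional offset between texel i and texel i + 1
    lo = texels[max(0, min(i, n - 1))]
    hi = texels[max(0, min(i + 1, n - 1))]
    return (1.0 - t) * lo + t * hi


texels = [0.0, 1.0, 4.0, 9.0]
center = eval_linear_1d(texels, 0.375)   # exactly at the center of texel 1
midway = eval_linear_1d(texels, 0.5)     # halfway between texels 1 and 2
```

Because all arithmetic happens in the type of `pos`, the precision of the result follows the precision of the query point, which matches the remark above.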
- eval_fetch(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) list[list[float]]¶
- eval_fetch(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) list[list[float]]
- eval_fetch(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) list[list[float]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]¶
- eval_cubic(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
- eval_cubic(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
Evaluate a clamped cubic B-Spline interpolant represented by this texture.
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which is faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
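The reformulation as a weighted sum of linear lookups can be sketched in 1D as follows. This is a plain-Python illustration of the general technique from the cited chapter, not the Dr.Jit implementation; all function names are hypothetical, positions are given as continuous texel indices, and clamp wrapping is assumed.

```python
import math


def clamp(i, n):
    return max(0, min(i, n - 1))


def lerp_at(texels, x):
    """Linear lookup at continuous texel index x with clamp wrapping."""
    n = len(texels)
    i = math.floor(x)
    t = x - i
    return (1 - t) * texels[clamp(i, n)] + t * texels[clamp(i + 1, n)]


def bspline_weights(t):
    """Uniform cubic B-spline basis weights for a fractional offset t in [0, 1)."""
    return ((1 - t) ** 3 / 6,
            (3 * t**3 - 6 * t**2 + 4) / 6,
            (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6,
            t**3 / 6)


def cubic_direct(texels, x):
    """Reference: weighted sum of the four neighboring texels."""
    n = len(texels)
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * texels[clamp(i - 1 + k, n)] for k in range(4))


def cubic_via_linear(texels, x):
    """Same interpolant obtained from two linear lookups at shifted positions."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3          # combined weights of each texel pair
    return (g0 * lerp_at(texels, i - 1 + w1 / g0) +
            g1 * lerp_at(texels, i + 1 + w3 / g1))


texels = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
direct = cubic_direct(texels, 2.3)
fast = cubic_via_linear(texels, 2.3)   # agrees with `direct` up to rounding
```

The lookup positions `i - 1 + w1 / g0` and `i + 1 + w3 / g1` depend nonlinearly on the query position, which is why differentiating through this form with respect to position does not yield the correct derivative.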
- eval_cubic_grad(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline.
This implementation computes the result directly from explicitly differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
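A 1D sketch of this computation, under the assumption that "spatial extents" refers to the texture resolution along each axis, might look as follows. The function names and the texel-center convention `x = pos * n - 0.5` are assumptions made for illustration; this is not the Dr.Jit implementation.

```python
import math


def weights(t):
    """Uniform cubic B-spline basis weights for a fractional offset t."""
    return ((1 - t) ** 3 / 6, (3 * t**3 - 6 * t**2 + 4) / 6,
            (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6, t**3 / 6)


def weight_derivs(t):
    """Derivatives of the four basis weights with respect to t."""
    return (-0.5 * (1 - t) ** 2, 1.5 * t**2 - 2 * t,
            -1.5 * t**2 + t + 0.5, 0.5 * t**2)


def cubic_value_and_grad(texels, pos):
    n = len(texels)
    x = pos * n - 0.5                 # assumed texel-center convention
    i = math.floor(x)
    t = x - i
    f = [texels[max(0, min(i - 1 + k, n - 1))] for k in range(4)]
    value = sum(w * fk for w, fk in zip(weights(t), f))
    # Chain rule: dx/dpos = n, so the derivative is scaled by the resolution.
    grad = n * sum(dw * fk for dw, fk in zip(weight_derivs(t), f))
    return value, grad


texels = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
value, grad = cubic_value_and_grad(texels, 0.4)

# Central finite difference of the value as a sanity check on the gradient
eps = 1e-6
fd = (cubic_value_and_grad(texels, 0.4 + eps)[0] -
      cubic_value_and_grad(texels, 0.4 - eps)[0]) / (2 * eps)
```

The finite-difference check only works because the value itself is smooth in `pos`; the explicit derivative weights avoid building any AD graph.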
- eval_cubic_hessian(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient and Hessian matrix of a cubic B-Spline.
This implementation computes the result directly from explicitly differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
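For the second derivative, the same idea applies with twice-differentiated basis weights. The 1D sketch below is an illustration only: the names are hypothetical, the texel-center convention is assumed, and the factor `n * n` reflects the assumption that each differentiation with respect to the normalized position contributes one factor of the resolution.

```python
import math


def second_derivative_weights(t):
    """Second derivatives of the cubic B-spline basis weights with respect to t."""
    return (1 - t, 3 * t - 2, 1 - 3 * t, t)


def cubic_second_derivative(texels, pos):
    n = len(texels)
    x = pos * n - 0.5                 # assumed texel-center convention
    i = math.floor(x)
    t = x - i
    f = [texels[max(0, min(i - 1 + k, n - 1))] for k in range(4)]
    # Two applications of the chain rule give a factor of n * n in 1D.
    return n * n * sum(w * fk for w, fk in zip(second_derivative_weights(t), f))


# Texels sampled from j**2, whose second difference is constant
texels = [float(j * j) for j in range(8)]
curvature = cubic_second_derivative(texels, 0.4)
```

In higher dimensions the result becomes a matrix, and entry (a, b) would be scaled by the product of the resolutions along axes a and b under the same assumption.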
- eval_cubic_helper(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) list[float]¶
- eval_cubic_helper(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) list[float]
- eval_cubic_helper(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) list[float]
Helper function to evaluate a clamped cubic B-Spline interpolant.
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.scalar.Texture2f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.scalar.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count.
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.scalar.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.scalar.ArrayXf, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
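What "linearized 1D array" means for a 2D texture can be sketched in plain Python. The layout below (row-major, with channels varying fastest) is an assumption made for illustration, and `linear_index` is a hypothetical helper rather than a Dr.Jit function.

```python
def linear_index(y, x, c, width, channels):
    """Offset of channel c of texel (x, y) in a row-major flattened array (assumed layout)."""
    return (y * width + x) * channels + c


height, width, channels = 2, 3, 2

# Nested (height, width, channels) data, standing in for the tensor form of a small texture
tensor = [[[10 * y + x + 0.5 * c for c in range(channels)]
           for x in range(width)] for y in range(height)]

# Flatten in row-major order: channels vary fastest, then x, then y
flat = [v for row in tensor for texel in row for v in texel]
```

Under this layout, `flat[linear_index(y, x, c, width, channels)]` and `tensor[y][x][c]` refer to the same value.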
- set_tensor(self, tensor: drjit.scalar.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- update_inplace(self, migrate: bool = False) None¶
Update the texture after applying an indirect update to its tensor representation (obtained with tensor()).
A tensor representation of this texture object can be retrieved with tensor(). That representation can be modified, but in order to apply the changes successfully to the texture, this method must also be called. In short, this method will use the tensor representation to update the texture’s internal state.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
- value(self) drjit.scalar.ArrayXf¶
Return the texture data as an array object
- tensor(self) drjit.scalar.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) list[float]¶
- eval(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) list[float]
- eval(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) list[float]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) list[list[float]]¶
- eval_fetch(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) list[list[float]]
- eval_fetch(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) list[list[float]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
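The relationship between fetching texels and interpolating them can be sketched for a single-channel 2D texture as follows. This is a plain-Python illustration with hypothetical names; the texel ordering, the texel-center convention, and clamp wrapping are assumptions rather than documented Dr.Jit behavior.

```python
import math


def fetch_2d(tex, u, v):
    """Return the four texels around (u, v) plus the fractional offsets (sketch)."""
    h, w = len(tex), len(tex[0])
    x, y = u * w - 0.5, v * h - 0.5           # assumed texel-center convention
    ix, iy = math.floor(x), math.floor(y)

    def cx(i):
        return max(0, min(i, w - 1))          # clamp wrapping along x

    def cy(i):
        return max(0, min(i, h - 1))          # clamp wrapping along y

    texels = [tex[cy(iy)][cx(ix)], tex[cy(iy)][cx(ix + 1)],
              tex[cy(iy + 1)][cx(ix)], tex[cy(iy + 1)][cx(ix + 1)]]
    return texels, x - ix, y - iy


def bilinear(texels, tx, ty):
    """Combine four fetched texels with bilinear weights."""
    t00, t10, t01, t11 = texels
    top = (1 - tx) * t00 + tx * t10
    bottom = (1 - tx) * t01 + tx * t11
    return (1 - ty) * top + ty * bottom


tex = [[0.0, 1.0],
       [2.0, 3.0]]
texels, tx, ty = fetch_2d(tex, 0.5, 0.5)
value = bilinear(texels, tx, ty)
```

Separating the fetch from the weighting is useful when the caller wants to apply custom weights or inspect the raw texels.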
- eval_cubic(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]¶
- eval_cubic(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
- eval_cubic(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
Evaluate a clamped cubic B-Spline interpolant represented by this texture.
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which is faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline.
This implementation computes the result directly from explicitly differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient and Hessian matrix of a cubic B-Spline.
This implementation computes the result directly from explicitly differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) list[float]¶
- eval_cubic_helper(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) list[float]
- eval_cubic_helper(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) list[float]
Helper function to evaluate a clamped cubic B-Spline interpolant.
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
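A direct evaluation of the B-Spline basis functions in 2D amounts to a tensor-product sum over a 4x4 texel neighborhood. The sketch below illustrates that idea in plain Python for a single channel; the function names, the texel-center convention, and the clamp wrapping are assumptions, and it is not the Dr.Jit helper itself.

```python
import math


def weights(t):
    """Uniform cubic B-spline basis weights for a fractional offset t."""
    return ((1 - t) ** 3 / 6, (3 * t**3 - 6 * t**2 + 4) / 6,
            (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6, t**3 / 6)


def cubic_direct_2d(tex, u, v):
    """Sum 16 texels weighted by the product of per-axis basis weights."""
    h, w = len(tex), len(tex[0])
    x, y = u * w - 0.5, v * h - 0.5           # assumed texel-center convention
    ix, iy = math.floor(x), math.floor(y)
    wx, wy = weights(x - ix), weights(y - iy)
    result = 0.0
    for j in range(4):
        row = tex[max(0, min(iy - 1 + j, h - 1))]
        for k in range(4):
            result += wy[j] * wx[k] * row[max(0, min(ix - 1 + k, w - 1))]
    return result


# A linear ramp is reproduced by cubic B-splines away from the borders
tex = [[2.0 * x + 3.0 * y for x in range(8)] for y in range(8)]
value = cubic_direct_2d(tex, 0.45, 0.55)
```

Every operation here is an ordinary sum or product of the basis weights and texel values, so a derivative taken through this form with respect to either the position or the texels is well defined, unlike the shifted linear lookups used by the accelerated path.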
- class drjit.scalar.Texture3f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.scalar.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count.
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.scalar.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.scalar.ArrayXf, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.scalar.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- update_inplace(self, migrate: bool = False) None¶
Update the texture after applying an indirect update to its tensor representation (obtained with tensor()).
A tensor representation of this texture object can be retrieved with tensor(). That representation can be modified, but in order to apply the changes successfully to the texture, this method must also be called. In short, this method will use the tensor representation to update the texture’s internal state.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
- value(self) drjit.scalar.ArrayXf¶
Return the texture data as an array object
- tensor(self) drjit.scalar.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) list[float]¶
- eval(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) list[float]
- eval(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) list[float]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) list[list[float]]¶
- eval_fetch(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) list[list[float]]
- eval_fetch(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) list[list[float]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]¶
- eval_cubic(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
- eval_cubic(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
Evaluate a clamped cubic B-Spline interpolant represented by this texture.
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which is faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline.
This implementation computes the result directly from explicitly differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient and Hessian matrix of a cubic B-Spline.
This implementation computes the result directly from explicitly differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) list[float]¶
- eval_cubic_helper(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) list[float]
- eval_cubic_helper(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) list[float]
Helper function to evaluate a clamped cubic B-Spline interpolant.
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.scalar.Texture1f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.scalar.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count.
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.scalar.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.scalar.ArrayXf64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.scalar.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- update_inplace(self, migrate: bool = False) None¶
Update the texture after applying an indirect update to its tensor representation (obtained with tensor()).
A tensor representation of this texture object can be retrieved with tensor(). That representation can be modified, but in order to apply the changes successfully to the texture, this method must also be called. In short, this method will use the tensor representation to update the texture’s internal state.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
- value(self) drjit.scalar.ArrayXf64¶
Return the texture data as an array object
- tensor(self) drjit.scalar.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) list[float]¶
- eval(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) list[float]
- eval(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) list[float]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
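The linear interpolant described above can be illustrated with a small pure-Python sketch. It does not use Dr.Jit and is not the library's implementation: the function name eval_linear_1d is hypothetical, and the texel-center convention (texel i centered at (i + 0.5) / n) and clamp-style boundary handling are assumptions made for illustration.

```python
import math


def eval_linear_1d(texels, pos):
    """Linearly interpolate a 1D list of texel values at pos in [0, 1]."""
    n = len(texels)
    # Assumed convention: texel i is centered at (i + 0.5) / n.
    x = pos * n - 0.5
    i0 = math.floor(x)
    w1 = x - i0  # weight of the right-hand neighbor

    def clamp(i):
        # Clamp-style wrapping: out-of-range indices reuse the border texel.
        return min(max(i, 0), n - 1)

    return texels[clamp(i0)] * (1.0 - w1) + texels[clamp(i0 + 1)] * w1


data = [0.0, 1.0, 4.0, 9.0]
print(eval_linear_1d(data, 0.125))  # query at the center of texel 0
print(eval_linear_1d(data, 0.25))   # query halfway between texels 0 and 1
```

Because the arithmetic runs in the precision of the query position, a lower-precision position type would yield a lower-precision result, which mirrors the remark about numerical precision above.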
- eval_fetch(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) list[list[float]]¶
- eval_fetch(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) list[list[float]]
- eval_fetch(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) list[list[float]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]¶
- eval_cubic(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
- eval_cubic(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
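The rewriting of a cubic B-Spline lookup as a weighted sum of linear lookups can be sketched in one dimension as follows. This is a pure-Python illustration rather than Dr.Jit's code: all function names are hypothetical, lerp_lookup stands in for a hardware linear texture fetch, and the texel-center convention and clamped indexing are assumptions.

```python
import math


def bspline_weights(a):
    """Uniform cubic B-spline weights for a fractional offset a in [0, 1)."""
    return (
        (1 - a) ** 3 / 6,
        (3 * a**3 - 6 * a**2 + 4) / 6,
        (-3 * a**3 + 3 * a**2 + 3 * a + 1) / 6,
        a**3 / 6,
    )


def fetch(texels, i):
    """Read a texel with clamped indexing."""
    return texels[min(max(i, 0), len(texels) - 1)]


def lerp_lookup(texels, t):
    """Linear interpolation at continuous texel index t."""
    i = math.floor(t)
    f = t - i
    return fetch(texels, i) * (1 - f) + fetch(texels, i + 1) * f


def cubic_direct(texels, pos):
    """Reference: explicit 4-tap sum over the B-spline basis functions."""
    x = pos * len(texels) - 0.5
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * fetch(texels, i - 1 + k) for k in range(4))


def cubic_two_lookups(texels, pos):
    """Same result using two linear lookups instead of four texel reads."""
    x = pos * len(texels) - 0.5
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3      # combined weights of each texel pair
    h0 = i - 1 + w1 / g0           # lookup position inside pair (i-1, i)
    h1 = i + 1 + w3 / g1           # lookup position inside pair (i+1, i+2)
    return g0 * lerp_lookup(texels, h0) + g1 * lerp_lookup(texels, h1)


data = [0.0, 2.0, 1.0, 5.0, 3.0, 4.0, 0.5, 1.5]
print(cubic_direct(data, 0.33), cubic_two_lookups(data, 0.33))
```

The lookup positions h0 and h1 depend nonlinearly on the query position, which is why differentiating through this formulation with respect to position needs the separate treatment described above.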
- eval_cubic_grad(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
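The scaling by the spatial extent can be read as a chain-rule factor: the basis functions are differentiated with respect to the texel coordinate, and the derivative with respect to the unit-interval position picks up a factor equal to the resolution. The one-dimensional sketch below illustrates this under that interpretation. It is pure Python with hypothetical function names, not Dr.Jit's implementation, and it assumes texel centers at (i + 0.5) / n with clamped indexing.

```python
import math


def fetch(texels, i):
    """Read a texel with clamped indexing."""
    return texels[min(max(i, 0), len(texels) - 1)]


def cubic(texels, pos):
    """Cubic B-spline interpolant evaluated from the basis functions."""
    x = pos * len(texels) - 0.5
    i = math.floor(x)
    a = x - i
    w = ((1 - a) ** 3 / 6,
         (3 * a**3 - 6 * a**2 + 4) / 6,
         (-3 * a**3 + 3 * a**2 + 3 * a + 1) / 6,
         a**3 / 6)
    return sum(w[k] * fetch(texels, i - 1 + k) for k in range(4))


def cubic_grad(texels, pos):
    """Derivative with respect to pos from explicitly differentiated weights."""
    n = len(texels)
    x = pos * n - 0.5
    i = math.floor(x)
    a = x - i
    # Derivatives of the four B-spline weights with respect to a.
    dw = (-((1 - a) ** 2) / 2,
          1.5 * a**2 - 2 * a,
          -1.5 * a**2 + a + 0.5,
          a**2 / 2)
    # Chain rule: the texel coordinate x changes n times as fast as pos.
    return n * sum(dw[k] * fetch(texels, i - 1 + k) for k in range(4))


data = [0.0, 2.0, 1.0, 5.0, 3.0]
print(cubic_grad(data, 0.4))
```

A central finite difference of cubic() around the same position should agree with cubic_grad() up to discretization error, which is a convenient sanity check for such hand-derived gradients.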
- eval_cubic_hessian(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.scalar.Array1f, active: bool | None = Bool(True)) list[float]¶
- eval_cubic_helper(self, pos: drjit.scalar.Array1f16, active: bool | None = Bool(True)) list[float]
- eval_cubic_helper(self, pos: drjit.scalar.Array1f64, active: bool | None = Bool(True)) list[float]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.scalar.Texture2f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.scalar.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.scalar.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.scalar.ArrayXf64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.scalar.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- update_inplace(self, migrate: bool = False) None¶
Update the texture after applying an indirect update to its tensor representation (obtained with tensor()).
A tensor representation of this texture object can be retrieved with tensor(). That representation can be modified, but in order to apply the changes successfully to the texture, this method must also be called. In short, this method uses the tensor representation to update the texture’s internal state.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
- value(self) drjit.scalar.ArrayXf64¶
Return the texture data as an array object
- tensor(self) drjit.scalar.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) list[float]¶
- eval(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) list[float]
- eval(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) list[float]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) list[list[float]]¶
- eval_fetch(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) list[list[float]]
- eval_fetch(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) list[list[float]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
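To make the relation between fetching texels and interpolating them concrete, the sketch below gathers the four texels touched by a 2D bilinear lookup together with their weights. It is a pure-Python illustration for a single-channel grid: the function name fetch_2d, the ordering of the returned texels, the texel-center convention, and the clamped indexing are all assumptions and may differ from what eval_fetch() actually returns.

```python
import math


def fetch_2d(grid, pos):
    """Return the 4 texels and bilinear weights for pos = (u, v) in [0, 1]^2.

    grid is a list of rows (grid[y][x]) holding one channel.
    """
    h, w = len(grid), len(grid[0])
    x = pos[0] * w - 0.5
    y = pos[1] * h - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def cx(i):
        return min(max(i, 0), w - 1)

    def cy(j):
        return min(max(j, 0), h - 1)

    # Assumed ordering: x varies fastest, then y.
    texels = [grid[cy(y0 + dy)][cx(x0 + dx)] for dy in (0, 1) for dx in (0, 1)]
    weights = [(1 - fx) * (1 - fy), fx * (1 - fy), (1 - fx) * fy, fx * fy]
    return texels, weights


grid = [[0.0, 1.0],
        [2.0, 3.0]]
texels, weights = fetch_2d(grid, (0.5, 0.5))
# Combining the fetched texels with the weights gives the bilinear result.
print(sum(t * w for t, w in zip(texels, weights)))
```

Returning the raw texels leaves the caller free to apply a different reconstruction filter or to inspect the contributing values individually.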
- eval_cubic(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]¶
- eval_cubic(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
- eval_cubic(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.scalar.Array2f, active: bool | None = Bool(True)) list[float]¶
- eval_cubic_helper(self, pos: drjit.scalar.Array2f16, active: bool | None = Bool(True)) list[float]
- eval_cubic_helper(self, pos: drjit.scalar.Array2f64, active: bool | None = Bool(True)) list[float]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.scalar.Texture3f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.scalar.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.scalar.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
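The effect of the wrap_mode on lookups outside the texture boundaries can be sketched by looking at how an out-of-range integer texel index is mapped back into the valid range. The snippet below is a pure-Python illustration, not Dr.Jit code: the helper wrap_index and its string mode names are hypothetical, and the exact formulas used by the library for its repeat and mirror modes are assumed rather than taken from its source.

```python
def wrap_index(i, n, mode):
    """Map an arbitrary integer texel index i into the range [0, n)."""
    if mode == "clamp":
        # Extend the border texel indefinitely.
        return min(max(i, 0), n - 1)
    if mode == "repeat":
        # Tile the texture periodically.
        return i % n
    if mode == "mirror":
        # Tile the texture, flipping every other copy.
        m = i % (2 * n)
        return m if m < n else 2 * n - 1 - m
    raise ValueError(f"unknown wrap mode: {mode}")


n = 4
for mode in ("clamp", "repeat", "mirror"):
    print(mode, [wrap_index(i, n, mode) for i in range(-4, 8)])
```

With clamping, every index beyond the last texel maps to the last texel, which matches the description of boundary colors being extended along each dimension.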
- set_value(self, value: drjit.scalar.ArrayXf64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.scalar.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- update_inplace(self, migrate: bool = False) None¶
Update the texture after applying an indirect update to its tensor representation (obtained with tensor()).
A tensor representation of this texture object can be retrieved with tensor(). That representation can be modified, but in order to apply the changes successfully to the texture, this method must also be called. In short, this method uses the tensor representation to update the texture’s internal state.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
- value(self) drjit.scalar.ArrayXf64¶
Return the texture data as an array object
- tensor(self) drjit.scalar.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) list[float]¶
- eval(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) list[float]
- eval(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) list[float]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) list[list[float]]¶
- eval_fetch(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) list[list[float]]
- eval_fetch(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) list[list[float]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]¶
- eval_cubic(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
- eval_cubic(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True), force_nonaccel: bool | None = False) list[float]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.scalar.Array3f, active: bool | None = Bool(True)) list[float]¶
- eval_cubic_helper(self, pos: drjit.scalar.Array3f16, active: bool | None = Bool(True)) list[float]
- eval_cubic_helper(self, pos: drjit.scalar.Array3f64, active: bool | None = Bool(True)) list[float]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
Random number generators¶
- class drjit.scalar.PCG32(*args, **kwargs)¶
Implementation of PCG32, a member of the PCG family of random number generators proposed by Melissa O’Neill.
PCG32 is a stateful pseudorandom number generator that combines a linear congruential generator (LCG) with a permutation function. It provides high statistical quality with a remarkably fast and compact implementation. Details on the PCG family of pseudorandom number generators can be found here.
To create random tensors of different sizes in Python, prefer the higher-level dr.rng() interface, which internally uses the Philox4x32 generator. The properties of PCG32 make it most suitable for Monte Carlo applications requiring long sequences of random variates.
Key properties of the PCG variant implemented here include:
Compact: 128 bits total state (64-bit state + 64-bit increment)
Output: 32-bit output with a period of 2^64 per stream
Streams: Multiple independent streams via the increment parameter (with caveats, see below)
Low-cost sample generation: a single 64 bit integer multiply-add plus a bit permutation applied to the output.
Extra features: provides fast multi-step advance/rewind functionality.
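The properties listed above (64-bit state plus 64-bit increment, one multiply-add per sample, a bit permutation on the output) can be seen in a compact pure-Python sketch of the PCG32 scheme. The class name TinyPCG32 is hypothetical and the code follows the publicly documented reference algorithm; it is meant as an illustration and is not guaranteed to match Dr.Jit's implementation bit for bit.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 6364136223846793005  # multiplier of the 64-bit LCG


class TinyPCG32:
    """Scalar PCG32 sketch: 64-bit LCG state with a permuted 32-bit output."""

    def __init__(self, initstate=0x853C49E6748FEA9B, initseq=0xDA3E39CB94B95BDB):
        self.seed(initstate, initseq)

    def seed(self, initstate, initseq):
        self.state = 0
        # The increment must be odd; the stream ID occupies the upper 63 bits.
        self.inc = ((initseq << 1) | 1) & MASK64
        self.next_uint32()
        self.state = (self.state + initstate) & MASK64
        self.next_uint32()

    def next_uint32(self):
        old = self.state
        # LCG step: a single 64-bit multiply-add.
        self.state = (old * PCG32_MULT + self.inc) & MASK64
        # Output permutation: xorshift followed by a data-dependent rotation.
        xorshifted = (((old >> 18) ^ old) >> 27) & 0xFFFFFFFF
        rot = old >> 59
        return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & 0xFFFFFFFF


rng = TinyPCG32()
print([hex(rng.next_uint32()) for _ in range(4)])
```

The two calls to next_uint32() inside seed() mix the initial state into the generator before the first sample is returned.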
Caveats: PCG32 produces high-quality random variates within each random number stream. For a given initial state, PCG32 can also produce multiple output streams by specifying a different sequence increment (initseq) to the constructor. However, the level of statistical independence across streams is generally insufficient when doing so. To obtain a series of high-quality independent parallel streams, it is recommended to use another method (e.g., the Tiny Encryption Algorithm) to seed the state and inc parameters. This ensures independence both within and across streams.
In Python, the PCG32 class is implemented as a PyTree, which means that it is compatible with symbolic function calls, loops, etc.
Note
Please watch out for the following pitfall when using the PCG32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float()) changes the internal RNG state. If this state is never explicitly evaluated, the computation graph describing the state transformation keeps growing without bound, causing kernel compilation of increasingly large programs to eventually become a bottleneck. To evaluate the RNG, simply run
rng: PCG32 = ....
dr.eval(rng)
For computation involving very large arrays, storing the RNG state (16 bytes per entry) can be prohibitive. In this case, it is better to keep the RNG in symbolic form and re-seed it at every optimization iteration.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 API directly to seed once and reuse the random number generator throughout the loop.
The drjit.rng API avoids these pitfalls by eagerly evaluating the RNG state.
Comparison with Philox4x32:
PCG32: State-based, better for sequential generation, low per-sample cost.
Philox4x32: Counter-based, better for parallel generation, higher per-sample cost.
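The caveat about stream independence suggests deriving the initial state and increment from a hash such as the Tiny Encryption Algorithm (TEA) instead of using consecutive integers. The sketch below shows one way this might look in pure Python. The helper names tea_32 and derive_seed, the round count, the key constants, and the way the two 64-bit seeds are assembled are all assumptions made for illustration and are not taken from Dr.Jit.

```python
MASK32 = 0xFFFFFFFF
TEA_DELTA = 0x9E3779B9  # standard TEA round constant


def tea_32(v0, v1, rounds=4):
    """Mix two 32-bit integers with a few TEA rounds (fixed, assumed key)."""
    s = 0
    for _ in range(rounds):
        s = (s + TEA_DELTA) & MASK32
        # XOR is bitwise, so masking once after the addition is sufficient.
        v0 = (v0 + (((v1 << 4) + 0xA341316C) ^ (v1 + s) ^ ((v1 >> 5) + 0xC8013EA4))) & MASK32
        v1 = (v1 + (((v0 << 4) + 0xAD90777D) ^ (v0 + s) ^ ((v0 >> 5) + 0x7E95761E))) & MASK32
    return v0, v1


def derive_seed(stream_index, base_seed):
    """Hypothetical helper: build (initstate, initseq) for one parallel stream."""
    a0, a1 = tea_32(stream_index & MASK32, base_seed & MASK32)
    b0, b1 = tea_32(a0, a1)  # second pass gives a decorrelated increment
    return (a0 << 32) | a1, (b0 << 32) | b1


for index in range(3):
    initstate, initseq = derive_seed(index, 1234)
    print(index, hex(initstate), hex(initseq))
```

The resulting pairs could then be passed as initstate and initseq when seeding one generator per stream, so that neighboring streams no longer start from nearly identical parameters.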
- __init__(self, size: int = 1, initstate: int = UInt64(0x853c49e6748fea9b), initseq: int = UInt64(0xda3e39cb94b95bdb)) None¶
- __init__(self, arg: drjit.scalar.PCG32) None
Overloaded function.
__init__(self, size: int = 1, initstate: int = UInt64(0x853c49e6748fea9b), initseq: int = UInt64(0xda3e39cb94b95bdb)) -> None
Initialize a random number generator that generates size variates in parallel.
The initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are based on the original implementation.
The implementation of this routine internally calls seed(), with one small twist. When multiple random numbers are being generated in parallel, the constructor adds an offset equal to drjit.arange(UInt64, size) to both initstate and initseq to de-correlate the generated sequences.
__init__(self, arg: drjit.scalar.PCG32) -> None
Copy-construct a new PCG32 instance from an existing instance.
- seed(self, initstate: int = UInt64(0x853c49e6748fea9b), initseq: int = UInt64(0xda3e39cb94b95bdb)) None¶
Seed the random number generator with the given initial state and sequence ID.
The initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are taken from the original implementation.
- next_float(self, dtype: type, mask: object = True) object¶
Generate a uniformly distributed floating point number of the requested precision on the interval \([0, 1)\).
The function analyzes the provided target dtype and either invokes next_float16(), next_float32() or next_float64() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16(self) float¶
- next_float16(self, arg: bool, /) float
Overloaded function.
next_float16(self) -> float
Generate a uniformly distributed half precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float16(self, arg: bool, /) -> float
- next_float32(self) float¶
- next_float32(self, arg: bool, /) float
Overloaded function.
next_float32(self) -> float
Generate a uniformly distributed single precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float32(self, arg: bool, /) -> float
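One common way to turn a 32-bit random integer into a single precision value on \([0, 1)\) is to place its top 23 bits into the mantissa of a float in \([1, 2)\) and subtract one. The sketch below shows that construction in pure Python. The function name is hypothetical, and whether next_float32() uses exactly this bit trick is an assumption.

```python
import struct


def uint32_to_unit_float(x):
    """Map a 32-bit unsigned integer to a float32 value in [0, 1)."""
    # Keep the 23 most significant bits as mantissa and set the exponent
    # bits so that the pattern encodes a number in [1, 2).
    bits = (x >> 9) | 0x3F800000
    (f,) = struct.unpack("<f", struct.pack("<I", bits))
    return f - 1.0


print(uint32_to_unit_float(0x00000000))
print(uint32_to_unit_float(0x80000000))
print(uint32_to_unit_float(0xFFFFFFFF))
```

Every output is a multiple of 2^-23, so the largest value that can be produced is strictly below one.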
- next_float64(self) float¶
- next_float64(self, arg: bool, /) float
Overloaded function.
next_float64(self) -> float
Generate a uniformly distributed double precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float64(self, arg: bool, /) -> float
- next_float_normal(self, dtype: type, mask: object = True) object¶
Generate a (standard) normally distributed floating point number of the requested precision.
The function analyzes the provided target dtype and either invokes next_float16_normal(), next_float32_normal() or next_float64_normal() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16_normal(self) float¶
- next_float16_normal(self, arg: bool, /) float
Overloaded function.
next_float16_normal(self) -> float
Generate a (standard) normally distributed half precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float16_normal(self, arg: bool, /) -> float
- next_float32_normal(self) float¶
- next_float32_normal(self, arg: bool, /) float
Overloaded function.
next_float32_normal(self) -> float
Generate a (standard) normally distributed single precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float32_normal(self, arg: bool, /) -> float
- next_float64_normal(self) float¶
- next_float64_normal(self, arg: bool, /) float
Overloaded function.
next_float64_normal(self) -> float
Generate a (standard) normally distributed double precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float64_normal(self, arg: bool, /) -> float
- next_uint32(self) int¶
- next_uint32(self, arg: bool, /) int
Overloaded function.
next_uint32(self) -> int
Generate a uniformly distributed unsigned 32-bit random number
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_uint32(self, arg: bool, /) -> int
- next_uint64(self) int¶
- next_uint64(self, arg: bool, /) int
Overloaded function.
next_uint64(self) -> int
Generate a uniformly distributed unsigned 64-bit random number
Internally, the function calls next_uint32() twice.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_uint64(self, arg: bool, /) -> int
- next_uint32_bounded(self, bound: int, mask: bool = Bool(True)) int¶
Generate a uniformly distributed 32-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
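An iterative scheme of this kind is typically a rejection loop: raw 32-bit samples that fall into a small biased region are discarded and redrawn, and the remaining ones are reduced modulo the bound. The sketch below illustrates the idea in pure Python. The free function and its next_uint32 callback parameter are hypothetical, Python's random module merely stands in for the PRNG, and the details of Dr.Jit's loop may differ.

```python
import random


def next_uint32_bounded(next_uint32, bound):
    """Return an unbiased integer in [0, bound) from a 32-bit sample source."""
    # Number of raw values to discard so that the count of accepted values
    # is an exact multiple of bound (this equals 2**32 mod bound).
    threshold = (2**32 - bound) % bound
    while True:
        r = next_uint32()
        if r >= threshold:
            return r % bound


source = random.Random(0)
samples = [next_uint32_bounded(lambda: source.getrandbits(32), 10) for _ in range(5)]
print(samples)
```

Because the rejected region contains fewer than bound values out of 2^32, the loop almost always accepts the first sample when the bound is small, and the acceptance probability never drops below one half.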
- next_uint64_bounded(self, bound: int, mask: bool = Bool(True)) int¶
Generate a uniformly distributed 64-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
- __add__(self, arg: int, /) drjit.scalar.PCG32¶
Advance the pseudorandom number generator.
This function implements a multi-step advance function that is equivalent to (but more efficient than) calling the random number generator arg times in sequence.
This is useful to advance a newly constructed PRNG to a certain known state.
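A multi-step advance of a linear congruential generator can be computed in a logarithmic number of steps by repeatedly squaring the affine state update. The sketch below shows this standard technique in pure Python for the 64-bit LCG underlying PCG32. The function names are hypothetical, and the code illustrates the general method rather than reproducing Dr.Jit's implementation.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 6364136223846793005


def lcg_step(state, inc):
    """One LCG step: state -> state * MULT + inc (mod 2^64)."""
    return (state * PCG32_MULT + inc) & MASK64


def lcg_advance(state, inc, delta):
    """Jump delta steps ahead; a negative delta rewinds (arithmetic mod 2^64)."""
    cur_mult, cur_plus = PCG32_MULT, inc
    acc_mult, acc_plus = 1, 0
    delta &= MASK64  # negative offsets wrap around the full period
    while delta > 0:
        if delta & 1:
            # Fold the current power-of-two jump into the accumulated map.
            acc_mult = (acc_mult * cur_mult) & MASK64
            acc_plus = (acc_plus * cur_mult + cur_plus) & MASK64
        # Square the affine map to double the jump length.
        cur_plus = ((cur_mult + 1) * cur_plus) & MASK64
        cur_mult = (cur_mult * cur_mult) & MASK64
        delta >>= 1
    return (acc_mult * state + acc_plus) & MASK64


state, inc = 0x853C49E6748FEA9B, 0xDA3E39CB94B95BDB
print(hex(lcg_advance(state, inc, 1000)))
```

Rewinding falls out of the same routine because the generator has a period of 2^64, so stepping back by k is the same as stepping forward by 2^64 - k.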
- __iadd__(self, arg: int, /) drjit.scalar.PCG32¶
In-place addition operator based on __add__().
- __sub__(self, arg: int, /) drjit.scalar.PCG32¶
- __sub__(self, arg: drjit.scalar.PCG32, /) int
Overloaded function.
__sub__(self, arg: int, /) -> drjit.scalar.PCG32
Rewind the pseudorandom number generator.
This function implements the opposite of __add__ to step a PRNG backwards. It can also compute the difference (as counted by the number of internal next_uint32 steps) between two PCG32 instances. This assumes that the two instances were consistently seeded.
__sub__(self, arg: drjit.scalar.PCG32, /) -> int
- __isub__(self, arg: int, /) drjit.scalar.PCG32¶
In-place subtraction operator based on __sub__().
- property inc¶
Sequence increment of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- property state¶
Sequence state of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- class drjit.scalar.Philox4x32(*args, **kwargs)¶
Philox4x32 counter-based PRNG
This class implements the Philox 4x32 counter-based pseudo-random number generator based on the paper Parallel Random Numbers: As Easy as 1, 2, 3 by Salmon et al. [2011]. It uses strength-reduced cryptographic primitives to realize a complex transition function that turns a seed and a set of counter values into 4 pseudorandom outputs. Incrementing any of the counters or choosing a different seed produces statistically independent samples.
The implementation here uses a reduced number of bits (32) for the arithmetic and sets the default number of rounds to 7. However, even with these simplifications it passes the stringent TestU01 BigCrush tests (a battery of statistical tests for non-uniformity and correlations). Please see the paper Random number generators for massively parallel simulations on GPU by Manssen et al. [2012] for details.
Functions like next_uint32x4() or next_float32x4() advance the PRNG state by incrementing the counter ctr[3].
Key properties include:
Counter-based design: generation from counter + key
192-bit state: 4x32-bit counters, 64-bit key
Trivial jump-ahead capability through counter manipulation
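To make the counter-based design concrete, the following pure-Python sketch maps a 4x32-bit counter and a 2x32-bit key to four 32-bit outputs. It follows the round function and constants of the publicly available Random123 reference formulation of Philox4x32 and is not taken from Dr.Jit; the function name, the key layout (two 32-bit words), and the key schedule are assumptions, and Dr.Jit's variant may differ in detail.

```python
M0, M1 = 0xD2511F53, 0xCD9E8D57  # Philox4x32 round multipliers
W0, W1 = 0x9E3779B9, 0xBB67AE85  # Weyl constants used to bump the key
MASK32 = 0xFFFFFFFF

def philox4x32(counter, key, rounds=7):
    """Map a 4x32-bit counter and a 2x32-bit key to four 32-bit outputs."""
    c0, c1, c2, c3 = counter
    k0, k1 = key
    for _ in range(rounds):
        p0 = M0 * c0  # full 64-bit products
        p1 = M1 * c2
        c0, c1, c2, c3 = (
            (p1 >> 32) ^ c1 ^ k0,  # high word mixed with counter and key
            p1 & MASK32,           # low word
            (p0 >> 32) ^ c3 ^ k1,
            p0 & MASK32,
        )
        # Bump the key before the next round.
        k0 = (k0 + W0) & MASK32
        k1 = (k1 + W1) & MASK32
    return (c0, c1, c2, c3)

key = (0x12345678, 0x9ABCDEF0)
first = philox4x32((0, 0, 0, 0), key)
# Jump-ahead: the millionth block is obtained directly by setting the counter.
millionth = philox4x32((0, 0, 0, 1_000_000), key)
```

Because the output depends only on the counter and key, no sequential state has to be carried along: any block of the stream can be computed independently, which is what makes jump-ahead and parallel streams trivial.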
The Philox4x32 class is implemented as a PyTree, making it compatible with symbolic function calls, loops, etc.
Note
Philox4x32 naturally produces 4 samples at a time, which may be awkward for applications that need individual random values.
Note
For a comparison of use cases between Philox4x32 and PCG32, see the PCG32 class documentation. In brief: use PCG32 for sequential generation with lowest cost per sample; use Philox4x32 for parallel generation where independent streams are critical.
Note
Please watch out for the following pitfall when using the Philox4x32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float32x4()) changes the internal RNG counter value. If this state is never explicitly evaluated, the computation graph describing this change keeps growing, causing kernel compilation of increasingly large programs to eventually become a bottleneck. The drjit.rng API avoids this pitfall by eagerly evaluating the RNG counter when needed.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 PRNG with its lower per-sample cost. You can seed this generator once and reuse it throughout the loop.
- __init__(self, seed: int, counter_0: int, counter_1: int = 0, counter_2: int = 0, iterations: int = 7) None¶
- __init__(self, arg: drjit.scalar.Philox4x32) None
Overloaded function.
__init__(self, seed: int, counter_0: int, counter_1: int = 0, counter_2: int = 0, iterations: int = 7) -> None
Initialize a Philox4x32 random number generator.
The function takes a seed and three of the four counter components. The last component is zero-initialized and incremented by calls to the next_* methods.
- Parameters:
seed – The 64-bit seed value used as the key for the mapping
counter_0 – The first 32-bit counter value (least significant)
counter_1 – The second 32-bit counter value (default: 0)
counter_2 – The third 32-bit counter value (default: 0)
iterations – Number of rounds to apply (default: 7, range: 4-10)
For parallel stream generation, simply use different counter values - each combination of counter values produces an independent random stream.
__init__(self, arg: drjit.scalar.Philox4x32) -> None
Copy constructor
- next_uint32x4(self, mask: bool = True) drjit.scalar.Array4u¶
Generate 4 random 32-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 4 independent 32-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random 32-bit unsigned integers
- next_uint64x2(self, mask: bool = True) drjit.scalar.Array2u64¶
Generate 2 random 64-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 2 independent 64-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random 64-bit unsigned integers
- next_float16x4(self, mask: bool = True) drjit.scalar.Array4f16¶
Generate 4 random half-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to half precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float32x4(self, mask: bool = True) drjit.scalar.Array4f¶
Generate 4 random single-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to single precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
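One common way to turn a 32-bit integer into a float on \([0, 1)\) is to place its upper 23 bits into the mantissa of a float in \([1, 2)\) and subtract 1. The sketch below illustrates that technique in plain Python; whether Dr.Jit performs the conversion in exactly this way is an assumption, and the function name is hypothetical.

```python
import struct

def uint32_to_unit_float(u):
    """Convert a 32-bit unsigned integer to a single-precision value in [0, 1)."""
    # Keep the top 23 bits as the mantissa and set the exponent bits of 1.0.
    bits = (u >> 9) | 0x3F800000
    # Reinterpret the bit pattern as an IEEE 754 float in [1, 2).
    (f,) = struct.unpack("<f", struct.pack("<I", bits))
    return f - 1.0

lowest = uint32_to_unit_float(0)
highest = uint32_to_unit_float(0xFFFFFFFF)
```

The largest representable result is 1 - 2**-23, so the upper bound of the interval is never reached.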
- next_float64x2(self, mask: bool = True) drjit.scalar.Array2f64¶
Generate 2 random double-precision floats in \([0, 1)\).
Generates 2 random 64-bit unsigned integers and converts them to floats uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats on the half-open interval \([0, 1)\)
- next_float16x4_normal(self, mask: bool = True) drjit.scalar.Array4f16¶
Generate 4 normally distributed half-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 half precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float32x4_normal(self, mask: bool = True) drjit.scalar.Array4f¶
Generate 4 normally distributed single-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 single precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float64x2_normal(self, mask: bool = True) drjit.scalar.Array2f64¶
Generate 2 normally distributed double-precision floats
Advances the internal counter and applies the Philox mapping to produce 2 double precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats from a standard normal distribution
- property seed¶
(self) -> drjit.scalar.Array2u
- property counter¶
(self) -> drjit.scalar.Array4u
- property iterations¶
(self) -> int
LLVM array namespace (drjit.llvm)¶
The LLVM backend is vectorized, hence types listed as scalar actually represent an array of scalars partaking in a parallel computation (analogously, 1D arrays are arrays of 1D arrays, etc.).
Scalar¶
- class drjit.llvm.Bool¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Float16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Float¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Float64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.UInt¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.UInt8¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.UInt64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Int¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Int8¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Int64¶
Derives from
drjit.ArrayBase.
1D arrays¶
- class drjit.llvm.Array0b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array1b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array2b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array3b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array4b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ArrayXb¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array0f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array1f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array2f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array3f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array4f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ArrayXf16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array0f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array1f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array2f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array3f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array4f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ArrayXf¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array0u¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array1u¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array2u¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array3u¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array4u¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ArrayXu¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array0i¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array1i¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array2i¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array3i¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array4i¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ArrayXi¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array0f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array1f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array2f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array3f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array4f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ArrayXf64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array0u64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array1u64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array2u64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array3u64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array4u64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ArrayXu64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array0i64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array1i64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array2i64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array3i64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array4i64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ArrayXi64¶
Derives from
drjit.ArrayBase.
2D arrays¶
- class drjit.llvm.Array22b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array33b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array44b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array22f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array33f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array44f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array22f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array33f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Array44f64¶
Derives from
drjit.ArrayBase.
Special (complex numbers, etc.)¶
- class drjit.llvm.Complex2f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Complex2f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Quaternion4f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Quaternion4f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Quaternion4f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Matrix2f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Matrix3f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Matrix4f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Matrix2f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Matrix3f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Matrix4f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Matrix2f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Matrix3f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.Matrix4f64¶
Derives from
drjit.ArrayBase.
Tensors¶
- class drjit.llvm.TensorXb¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.TensorXf16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.TensorXf¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.TensorXu¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.TensorXi¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.TensorXf64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.TensorXu64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.TensorXi64¶
Derives from
drjit.ArrayBase.
Textures¶
- class drjit.llvm.Texture1f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- update_inplace(self, migrate: bool = False) None¶
Update the texture after applying an indirect update to its tensor representation (obtained with tensor()).
A tensor representation of this texture object can be retrieved with tensor(). That representation can be modified, but in order to apply it successfully to the texture, this method must also be called. In short, this method will use the tensor representation to update the texture’s internal state.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
- value(self) drjit.llvm.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
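The equivalence between a cubic B-spline lookup and a weighted sum of linear lookups can be checked with a small 1D sketch in plain Python. All function names here are hypothetical, and the sketch works in integer texel coordinates with a clamped fetch; it ignores Dr.Jit's own coordinate conventions and is not its implementation.

```python
import math

def texel(data, i):
    """Clamped texel fetch: boundary values extend indefinitely."""
    return data[min(max(i, 0), len(data) - 1)]

def eval_linear(data, x):
    """Linear interpolation between the two texels surrounding x."""
    i = math.floor(x)
    t = x - i
    return (1.0 - t) * texel(data, i) + t * texel(data, i + 1)

def bspline_weights(t):
    """Uniform cubic B-spline basis weights for a fractional offset t in [0, 1)."""
    return (
        (1.0 - t) ** 3 / 6.0,
        (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0,
        (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0,
        t**3 / 6.0,
    )

def eval_cubic_direct(data, x):
    """Direct evaluation: weighted sum of four neighboring texels."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * texel(data, i - 1 + k) for k in range(4))

def eval_cubic_via_linear(data, x):
    """Same result from two linear lookups at shifted positions."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3  # combined weight of each texel pair
    return (g0 * eval_linear(data, i - 1 + w1 / g0)
            + g1 * eval_linear(data, i + 1 + w3 / g1))

data = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
pairs = [(eval_cubic_direct(data, x), eval_cubic_via_linear(data, x))
         for x in (0.0, 0.25, 1.5, 2.75, 4.9)]
```

The lookup positions depend nonlinearly on x through the weights, which is why differentiating through the linear lookups with respect to position does not yield the derivative of the spline.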
- eval_cubic_grad(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.Texture2f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.Texture3f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.Texture1f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
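As a rough illustration of what a linear lookup with the Clamp wrap mode computes, the following pure-Python sketch interpolates a single-channel 1D texture. It does not call Dr.Jit; the function name eval_linear_1d is hypothetical, and the texel-center convention (texel i centered at (i + 0.5) / res) is an assumption rather than something stated in this reference.

```python
import math

def eval_linear_1d(texels, pos):
    """Linearly interpolate a 1D single-channel texture at pos (unit domain)."""
    res = len(texels)
    # Assumption: texel i is centered at (i + 0.5) / res
    x = pos * res - 0.5
    i0 = math.floor(x)
    w1 = x - i0  # weight of the right-hand neighbor
    # Clamp wrap mode: out-of-range indices reuse the boundary texel
    lo = min(max(i0, 0), res - 1)
    hi = min(max(i0 + 1, 0), res - 1)
    return (1.0 - w1) * texels[lo] + w1 * texels[hi]

texels = [0.0, 1.0, 4.0, 9.0]
center = eval_linear_1d(texels, 0.375)   # exactly at the center of texel 1
midway = eval_linear_1d(texels, 0.5)     # halfway between texels 1 and 2
outside = eval_linear_1d(texels, 1.5)    # beyond the right edge, clamped
```

Queries past the boundary return the edge texel, which matches the "indefinitely extends the colors on the boundary" description of the Clamp mode.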
- eval_fetch(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
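The rewriting of a cubic B-spline lookup as a weighted sum of linear lookups can be sketched in 1D as follows. This is a minimal pure-Python illustration of the idea from the cited GPU Gems 2 chapter, not Dr.Jit's implementation; all function names are hypothetical, and x is assumed to be a continuous texel index (already scaled from the unit domain).

```python
import math

def bspline_weights(a):
    """Uniform cubic B-spline weights for a fractional offset a in [0, 1)."""
    w0 = (1.0 - a) ** 3 / 6.0
    w1 = (3.0 * a**3 - 6.0 * a**2 + 4.0) / 6.0
    w2 = (-3.0 * a**3 + 3.0 * a**2 + 3.0 * a + 1.0) / 6.0
    w3 = a**3 / 6.0
    return w0, w1, w2, w3

def fetch(texels, i):
    """Texel fetch with clamped indices."""
    return texels[min(max(i, 0), len(texels) - 1)]

def lerp_lookup(texels, c):
    """Linear lookup at continuous texel index c (the operation texture units provide)."""
    j = math.floor(c)
    f = c - j
    return (1.0 - f) * fetch(texels, j) + f * fetch(texels, j + 1)

def cubic_direct(texels, x):
    """Reference: four-tap weighted sum of the B-spline basis functions."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * fetch(texels, i - 1 + k) for k in range(4))

def cubic_two_lookups(texels, x):
    """Same result from two linear lookups at shifted positions."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3
    return (g0 * lerp_lookup(texels, i - 1 + w1 / g0)
            + g1 * lerp_lookup(texels, i + 1 + w3 / g1))
```

The lookup positions depend nonlinearly on x through the ratios w1 / g0 and w3 / g1, which is why differentiating through this form with respect to position is problematic while the direct form is not.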
- eval_cubic_grad(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
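To make the scaling by the spatial extent concrete, here is a 1D pure-Python sketch that differentiates the B-spline basis functions explicitly and multiplies by the resolution. It is an illustration under assumptions (hypothetical function names, texel centers at (i + 0.5) / res, clamped fetches), not the library's code.

```python
import math

def fetch(texels, i):
    """Texel fetch with clamped indices."""
    return texels[min(max(i, 0), len(texels) - 1)]

def cubic_value(texels, pos):
    """Cubic B-spline interpolant at pos in the unit domain."""
    res = len(texels)
    x = pos * res - 0.5          # assumed texel-center convention
    i = math.floor(x)
    a = x - i
    w = ((1.0 - a) ** 3 / 6.0,
         (3.0 * a**3 - 6.0 * a**2 + 4.0) / 6.0,
         (-3.0 * a**3 + 3.0 * a**2 + 3.0 * a + 1.0) / 6.0,
         a**3 / 6.0)
    return sum(w[k] * fetch(texels, i - 1 + k) for k in range(4))

def cubic_grad(texels, pos):
    """Derivative of cubic_value with respect to pos."""
    res = len(texels)
    x = pos * res - 0.5
    i = math.floor(x)
    a = x - i
    # Derivatives of the four basis weights with respect to a
    dw = (-0.5 * (1.0 - a) ** 2,
          1.5 * a * a - 2.0 * a,
          -1.5 * a * a + a + 0.5,
          0.5 * a * a)
    # Chain rule: d(x)/d(pos) = res, hence the multiplication by the extent
    return res * sum(dw[k] * fetch(texels, i - 1 + k) for k in range(4))
```

A central finite difference of cubic_value is a convenient way to sanity-check the analytic derivative.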
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.Texture2f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
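The "linearized 1D array" expected by set_value() can be pictured with the index computation below. This sketch assumes a row-major layout with the channel index varying fastest, i.e. a tensor of shape (*shape, channels) flattened in C order; that layout is an assumption here, and flat_index is a hypothetical helper, not part of Dr.Jit.

```python
def flat_index(shape, channels, coord, channel):
    """Map a texel coordinate and channel to an offset in the linearized array.

    Assumes row-major storage with channels innermost.
    """
    idx = 0
    for extent, c in zip(shape, coord):
        idx = idx * extent + c
    return idx * channels + channel

# A 2 x 3 texture (rows x columns) with 2 channels holds 12 values in total
first = flat_index((2, 3), 2, (0, 0), 0)
last = flat_index((2, 3), 2, (1, 2), 1)
```

Under this assumption, the last channel of the last texel lands at offset 11, the final entry of the 12-element array.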
- set_tensor(self, tensor: drjit.llvm.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
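The separation between fetching texels and interpolating them can be sketched for the 2D case in pure Python. The helper names, the ordering of the returned texels, and the convention that the first position component indexes columns are all assumptions made for this illustration; they are not taken from the Dr.Jit implementation.

```python
import math

def fetch_texels_2d(grid, pos):
    """Return the four texels a bilinear lookup at pos = (u, v) would read,
    together with the fractional weights, without interpolating."""
    height, width = len(grid), len(grid[0])
    x = pos[0] * width - 0.5     # assumed texel-center convention
    y = pos[1] * height - 0.5
    ix, iy = math.floor(x), math.floor(y)
    fx, fy = x - ix, y - iy

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    texels = [grid[clamp(iy + dy, height)][clamp(ix + dx, width)]
              for dy in (0, 1) for dx in (0, 1)]
    return texels, (fx, fy)

def bilinear(texels, frac):
    """Combine four fetched texels into the bilinear interpolant."""
    fx, fy = frac
    t00, t10, t01, t11 = texels
    top = (1.0 - fx) * t00 + fx * t10
    bottom = (1.0 - fx) * t01 + fx * t11
    return (1.0 - fy) * top + fy * bottom

grid = [[0.0, 1.0],
        [2.0, 3.0]]
texels, frac = fetch_texels_2d(grid, (0.5, 0.5))
value = bilinear(texels, frac)
```

Having the raw texels available is useful when a caller wants to apply its own weighting scheme instead of the built-in linear one.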
- eval_cubic(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.Texture3f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
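The effect of a wrap mode on out-of-range texel indices can be sketched as a small index-mapping function. Only the Clamp behavior is described in this reference; the "repeat" and "mirror" branches below show the conventional meaning of such modes and are assumptions for illustration. The function wrap_index and its string mode names are hypothetical and unrelated to the drjit.WrapMode enum values.

```python
def wrap_index(i, res, mode):
    """Map an integer texel index i onto the valid range [0, res - 1]."""
    if mode == "clamp":
        # Extend the boundary texel indefinitely
        return min(max(i, 0), res - 1)
    if mode == "repeat":
        # Tile the texture periodically (assumed conventional behavior)
        return i % res
    if mode == "mirror":
        # Reflect at every boundary (assumed conventional behavior)
        period = 2 * res
        j = i % period
        return j if j < res else period - 1 - j
    raise ValueError(f"unknown wrap mode: {mode}")

clamped = [wrap_index(i, 4, "clamp") for i in (-2, -1, 0, 3, 4, 5)]
```

With a resolution of 4, every index below 0 maps to texel 0 and every index above 3 maps to texel 3 in the clamp case.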
- set_value(self, value: drjit.llvm.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.Texture1f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
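As a rough illustration of what a linear lookup with clamp wrapping computes, here is a minimal pure-Python sketch for a single-channel 1D texture. It is not the Dr.Jit implementation: the helper name `eval_linear_1d` is hypothetical, and the convention that texel centers sit at `(i + 0.5) / resolution` is an assumption made for this example.

```python
import math

def eval_linear_1d(texels, pos):
    """Linearly interpolate `texels` at `pos` in [0, 1] with clamp wrapping.

    Assumption: texel i is centered at (i + 0.5) / len(texels).
    """
    n = len(texels)
    x = pos * n - 0.5              # continuous texel coordinate
    i0 = math.floor(x)             # index of the left neighbor
    w1 = x - i0                    # weight of the right neighbor

    def clamp(i):
        # Clamp wrapping: indices outside the grid reuse the boundary texel
        return min(max(i, 0), n - 1)

    return (1.0 - w1) * texels[clamp(i0)] + w1 * texels[clamp(i0 + 1)]

data = [0.0, 10.0, 20.0, 30.0]
midpoint = eval_linear_1d(data, 0.5)   # halfway between texels 1 and 2
left_edge = eval_linear_1d(data, 0.0)  # clamped to the first texel
```

Because the arithmetic happens in the type of `pos`, a lower-precision query point yields a lower-precision result, which is the behavior the description above refers to.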
- eval_fetch(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
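The equivalence between a direct B-Spline evaluation and a weighted sum of linear lookups can be checked in one dimension with a short pure-Python sketch. All function names below are hypothetical, the code works in continuous texel coordinates, and it only mirrors the idea from the GPU Gems 2 chapter rather than the actual Dr.Jit code.

```python
import math

def bspline_weights(a):
    """Uniform cubic B-spline weights for a fractional offset a in [0, 1)."""
    a2, a3 = a * a, a * a * a
    return (
        (-a3 + 3 * a2 - 3 * a + 1) / 6,
        (3 * a3 - 6 * a2 + 4) / 6,
        (-3 * a3 + 3 * a2 + 3 * a + 1) / 6,
        a3 / 6,
    )

def fetch(texels, i):
    """Clamped texel fetch."""
    return texels[min(max(i, 0), len(texels) - 1)]

def linear(texels, x):
    """Linear interpolation at the continuous texel coordinate x."""
    i = math.floor(x)
    f = x - i
    return (1 - f) * fetch(texels, i) + f * fetch(texels, i + 1)

def cubic_direct(texels, x):
    """Four-tap evaluation using the B-spline basis functions directly."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * fetch(texels, i - 1 + k) for k in range(4))

def cubic_two_lookups(texels, x):
    """Same result from two linear lookups at shifted positions."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3          # weights of the two linear lookups
    return (g0 * linear(texels, i - 1 + w1 / g0)
            + g1 * linear(texels, i + 1 + w3 / g1))

data = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0]
difference = abs(cubic_direct(data, 2.25) - cubic_two_lookups(data, 2.25))
```

The lookup positions `i - 1 + w1 / g0` and `i + 1 + w3 / g1` depend nonlinearly on the query position, which is why differentiating through this formulation does not give the correct positional derivative.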
- eval_cubic_grad(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
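To make the role of the spatial extents concrete, the following pure-Python sketch evaluates a 1D cubic B-Spline together with its first and second derivative from explicitly differentiated basis functions. The function name is hypothetical, and both the texel-center convention and the reading of "spatial extents" as the resolution `n` (so the gradient is scaled by `n` and the second derivative by `n * n`) are assumptions of this example.

```python
import math

def cubic_value_grad_hessian_1d(texels, pos):
    """Return (value, d/dpos, d^2/dpos^2) of a clamped cubic B-spline."""
    n = len(texels)
    x = pos * n - 0.5                  # assumed texel-center convention
    i = math.floor(x)
    a = x - i
    # Basis functions and their first/second derivatives with respect to a
    w = ((1 - a) ** 3 / 6,
         (3 * a**3 - 6 * a**2 + 4) / 6,
         (-3 * a**3 + 3 * a**2 + 3 * a + 1) / 6,
         a**3 / 6)
    dw = (-(1 - a) ** 2 / 2,
          (3 * a**2 - 4 * a) / 2,
          (-3 * a**2 + 2 * a + 1) / 2,
          a**2 / 2)
    d2w = (1 - a, 3 * a - 2, 1 - 3 * a, a)
    t = [texels[min(max(i - 1 + k, 0), n - 1)] for k in range(4)]
    value = sum(wk * tk for wk, tk in zip(w, t))
    # Chain rule: x = pos * n - 0.5, so each derivative picks up a factor n
    grad = n * sum(wk * tk for wk, tk in zip(dw, t))
    hess = n * n * sum(wk * tk for wk, tk in zip(d2w, t))
    return value, grad, hess

quadratic = [float(j * j) for j in range(8)]
value, grad, hess = cubic_value_grad_hessian_1d(quadratic, 0.45)
```

For texels sampled from `j * j`, the spline equals `x * x + 1/3` away from the boundary, so the second derivative with respect to `pos` is `2 * n * n`, which gives a convenient sanity check.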
- eval_cubic_helper(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.Texture2f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
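The term "linearized 1D array" can be pictured with a small pure-Python sketch. It assumes a row-major layout in which the channel index varies fastest and the texture shape is ordered as (height, width); that layout and the helper names `linear_index` and `linearize` are assumptions made for illustration and are not taken from the Dr.Jit sources.

```python
def linear_index(y, x, c, width, channels):
    """Flat index of channel c of texel (y, x) in an assumed row-major layout."""
    return (y * width + x) * channels + c

def linearize(tensor):
    """Flatten a nested [height][width][channels] list into a 1D list."""
    return [v for row in tensor for texel in row for v in texel]

# A 2 x 3 texture with 2 channels per texel
tensor = [
    [[0.0, 0.5], [1.0, 1.5], [2.0, 2.5]],
    [[3.0, 3.5], [4.0, 4.5], [5.0, 5.5]],
]
flat = linearize(tensor)
second_channel = flat[linear_index(1, 2, 1, width=3, channels=2)]
```

Under these assumptions, the flat array has `height * width * channels` entries and `second_channel` refers to `tensor[1][2][1]`.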
- set_tensor(self, tensor: drjit.llvm.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.Texture3f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
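For a 3D texture, a linear lookup references the eight texels surrounding the query point. The sketch below enumerates those texel indices and their interpolation weights in pure Python, for any number of dimensions. The helper name `multilinear_corners`, the texel-center convention, and the ordering of the returned corners are all assumptions of this example and say nothing about the order in which eval_fetch() returns its results.

```python
import itertools
import math

def multilinear_corners(pos, shape):
    """Return [(index tuple, weight)] for the 2**d texels around `pos`.

    Assumption: along each axis, texel i is centered at (i + 0.5) / resolution.
    """
    base, frac = [], []
    for p, n in zip(pos, shape):
        x = p * n - 0.5
        i = math.floor(x)
        base.append(i)
        frac.append(x - i)

    corners = []
    for offsets in itertools.product((0, 1), repeat=len(shape)):
        # Clamp wrapping keeps every index inside the grid
        index = tuple(min(max(b + o, 0), n - 1)
                      for b, o, n in zip(base, offsets, shape))
        weight = 1.0
        for f, o in zip(frac, offsets):
            weight *= f if o else 1.0 - f
        corners.append((index, weight))
    return corners

corners = multilinear_corners((0.3, 0.6, 0.9), (4, 4, 4))
total_weight = sum(weight for _, weight in corners)
```

Fetching the texels at these indices and summing them with the listed weights reproduces the linear interpolant; stopping after the fetch corresponds to the "without actually performing this interpolation" part of the description.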
- eval_cubic(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
Random number generators¶
- class drjit.llvm.PCG32(*args, **kwargs)¶
Implementation of PCG32, a member of the PCG family of random number generators proposed by Melissa O’Neill.
PCG32 is a stateful pseudorandom number generator that combines a linear congruential generator (LCG) with a permutation function. It provides high statistical quality with a remarkably fast and compact implementation. Details on the PCG family of pseudorandom number generators can be found here.
To create random tensors of different sizes in Python, prefer the higher-level dr.rng() interface, which internally uses the Philox4x32 generator. The properties of PCG32 make it most suitable for Monte Carlo applications requiring long sequences of random variates.
Key properties of the PCG variant implemented here include:
Compact: 128 bits total state (64-bit state + 64-bit increment)
Output: 32-bit output with a period of 2^64 per stream
Streams: Multiple independent streams via the increment parameter (with caveats, see below)
Low-cost sample generation: a single 64 bit integer multiply-add plus a bit permutation applied to the output.
Extra features: provides fast multi-step advance/rewind functionality.
Caveats: PCG32 produces high-quality random variates within each random number stream. For a given initial state, PCG32 can also produce multiple output streams by specifying a different sequence increment (initseq) to the constructor. However, the level of statistical independence across streams is generally insufficient when doing so. To obtain a series of high-quality independent parallel streams, it is recommended to use another method (e.g., the Tiny Encryption Algorithm) to seed the state and inc parameters. This ensures independence both within and across streams.
In Python, the PCG32 class is implemented as a PyTree, which means that it is compatible with symbolic function calls, loops, etc.
Note
Please watch out for the following pitfall when using the PCG32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float()) changes the internal RNG state. If this state is never explicitly evaluated, the computation graph describing the state transformation keeps growing without bound, causing kernel compilation of increasingly large programs to eventually become a bottleneck. To evaluate the RNG, simply run
rng: PCG32 = ....
dr.eval(rng)
For computation involving very large arrays, storing the RNG state (16 bytes per entry) can be prohibitive. In this case, it is better to keep the RNG in symbolic form and re-seed it at every optimization iteration.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 API directly to seed once and reuse the random number generator throughout the loop.
The drjit.rng API avoids these pitfalls by eagerly evaluating the RNG state.
Comparison with Philox4x32:
PCG32: State-based, better for sequential generation, low per-sample cost.
Philox4x32: Counter-based, better for parallel generation, higher per-sample cost.
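The key properties listed above (one 64-bit multiply-add per sample, a bit permutation on the output, and multi-step advance/rewind) can be illustrated with a scalar pure-Python model of the generator. This is a sketch based on the publicly documented PCG32 scheme, not the Dr.Jit class: the name `PCG32Sketch` is hypothetical, it handles a single stream instead of an array of them, and the logarithmic-time jump used in `advance` is the standard technique for linear congruential generators, assumed here to be what the "fast multi-step advance/rewind" feature refers to.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 0x5851F42D4C957F2D

class PCG32Sketch:
    """Scalar pure-Python model of PCG32 (hypothetical helper class)."""

    def __init__(self, initstate=0x853C49E6748FEA9B,
                 initseq=0xDA3E39CB94B95BDB):
        self.seed(initstate, initseq)

    def seed(self, initstate, initseq):
        self.state = 0
        self.inc = ((initseq << 1) | 1) & MASK64   # the increment must be odd
        self.next_uint32()
        self.state = (self.state + initstate) & MASK64
        self.next_uint32()

    def next_uint32(self):
        old = self.state
        # LCG step: a single 64-bit multiply-add
        self.state = (old * PCG32_MULT + self.inc) & MASK64
        # Output permutation: xorshift followed by a data-dependent rotation
        xorshifted = (((old >> 18) ^ old) >> 27) & 0xFFFFFFFF
        rot = old >> 59
        return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & 0xFFFFFFFF

    def advance(self, delta):
        """Jump `delta` steps ahead; negative values rewind the generator."""
        cur_mult, cur_plus = PCG32_MULT, self.inc
        acc_mult, acc_plus = 1, 0
        delta &= MASK64                 # a rewind becomes a long forward jump
        while delta > 0:
            if delta & 1:
                acc_mult = (acc_mult * cur_mult) & MASK64
                acc_plus = (acc_plus * cur_mult + cur_plus) & MASK64
            cur_plus = ((cur_mult + 1) * cur_plus) & MASK64
            cur_mult = (cur_mult * cur_mult) & MASK64
            delta >>= 1
        self.state = (acc_mult * self.state + acc_plus) & MASK64

reference = PCG32Sketch(42, 54)
sequence = [reference.next_uint32() for _ in range(8)]

jumper = PCG32Sketch(42, 54)
jumper.advance(5)                       # skip the first five variates
sixth = jumper.next_uint32()
```

Because the state update is an affine map modulo 2^64, composing it with itself by repeated squaring costs only about 64 iterations regardless of the jump distance.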
- __init__(self, size: int = 1, initstate: drjit.llvm.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
- __init__(self, arg: drjit.llvm.PCG32) None
Overloaded function.
__init__(self, size: int = 1, initstate: drjit.llvm.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.UInt64 = UInt64(0xda3e39cb94b95bdb)) -> None
Initialize a random number generator that generates size variates in parallel.
The initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are based on the original implementation.
The implementation of this routine internally calls seed(), with one small twist. When multiple random numbers are being generated in parallel, the constructor adds an offset equal to drjit.arange(UInt64, size) to both initstate and initseq to de-correlate the generated sequences.
__init__(self, arg: drjit.llvm.PCG32) -> None
Copy-construct a new PCG32 instance from an existing instance.
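The per-lane offset described for the first constructor overload can be written out in plain Python. The helper `lane_seeds` is hypothetical and only mirrors the arithmetic stated above (an offset of `0, 1, ..., size - 1` added to both inputs); the wrap-around modulo 2^64 is an assumption based on the `UInt64` type.

```python
MASK64 = (1 << 64) - 1

def lane_seeds(size, initstate=0x853C49E6748FEA9B,
               initseq=0xDA3E39CB94B95BDB):
    """Per-lane (initstate, initseq) pairs after adding the arange offset."""
    return [((initstate + i) & MASK64, (initseq + i) & MASK64)
            for i in range(size)]

seeds = lane_seeds(4)
```

Each pair would then be passed to the seeding routine of the corresponding lane, so that no two lanes start from the same state and increment.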
- seed(self, initstate: drjit.llvm.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
Seed the random number generator with the given initial state and sequence ID.
The initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are taken from the original implementation.
- next_float(self, dtype: type, mask: object = True) object¶
Generate a uniformly distributed floating point number of the requested precision on the interval \([0, 1)\).
The function analyzes the provided target dtype and either invokes next_float16(), next_float32() or next_float64() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16(self) drjit.llvm.Float16¶
- next_float16(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float16
Overloaded function.
next_float16(self) -> drjit.llvm.Float16
Generate a uniformly distributed half precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float16(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float16
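The effect of the mask on the generator state can be modeled with plain Python lists standing in for Dr.Jit arrays. The functions `step` and `next_uint32_masked` are hypothetical; the sketch only demonstrates the documented rule that lanes with a False mask entry keep their state. What value such lanes return is not specified in the text above, so the output computed here for masked lanes should be treated as meaningless.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 0x5851F42D4C957F2D

def step(state, inc):
    """One PCG32 step: returns (new_state, 32-bit output)."""
    new_state = (state * PCG32_MULT + inc) & MASK64
    xorshifted = (((state >> 18) ^ state) >> 27) & 0xFFFFFFFF
    rot = state >> 59
    out = ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & 0xFFFFFFFF
    return new_state, out

def next_uint32_masked(states, incs, mask):
    """Draw one variate per lane, advancing only lanes whose mask is True."""
    outputs = []
    for i, active in enumerate(mask):
        new_state, out = step(states[i], incs[i])
        if active:
            states[i] = new_state       # masked lanes keep their old state
        outputs.append(out)
    return outputs

states = [1, 2, 3]
incs = [1, 3, 5]                        # increments must be odd
before = list(states)
outputs = next_uint32_masked(states, incs, [True, False, True])
```

A lane that was masked out therefore produces the same variate on its next active draw as it would have produced on this one.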
- next_float32(self) drjit.llvm.Float¶
- next_float32(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float
Overloaded function.
next_float32(self) -> drjit.llvm.Float
Generate a uniformly distributed single precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float32(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float
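One common way to turn a 32-bit integer variate into a single precision number on \([0, 1)\) is to place its top 23 bits into the mantissa of a float in \([1, 2)\) and subtract one. The sketch below shows that construction in pure Python; the helper name is hypothetical, and whether Dr.Jit uses exactly this conversion is an assumption that the text above does not confirm.

```python
import struct

def uint32_to_float32(u):
    """Map a 32-bit integer to a float in [0, 1) using its top 23 bits."""
    # 0x3F800000 is the bit pattern of 1.0f; OR-ing in 23 random mantissa
    # bits yields a value in [1, 2)
    bits = (u >> 9) | 0x3F800000
    return struct.unpack("<f", struct.pack("<I", bits))[0] - 1.0

smallest = uint32_to_float32(0x00000000)
largest = uint32_to_float32(0xFFFFFFFF)
```

The result can never reach 1.0 because the largest mantissa corresponds to `2 - 2**-23` before the subtraction, which matches the half-open interval stated in the description.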
- next_float64(self) drjit.llvm.Float64¶
- next_float64(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float64
Overloaded function.
next_float64(self) -> drjit.llvm.Float64
Generate a uniformly distributed double precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float64(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float64
- next_float_normal(self, dtype: type, mask: object = True) object¶
Generate a (standard) normally distributed floating point number of the requested precision.
The function analyzes the provided target dtype and invokes next_float16_normal(), next_float32_normal(), or next_float64_normal() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16_normal(self) drjit.llvm.Float16¶
- next_float16_normal(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float16
Overloaded function.
next_float16_normal(self) -> drjit.llvm.Float16
Generate a (standard) normally distributed half precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float16_normal(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float16
- next_float32_normal(self) drjit.llvm.Float¶
- next_float32_normal(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float
Overloaded function.
next_float32_normal(self) -> drjit.llvm.Float
Generate a (standard) normally distributed single precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float32_normal(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float
- next_float64_normal(self) drjit.llvm.Float64¶
- next_float64_normal(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float64
Overloaded function.
next_float64_normal(self) -> drjit.llvm.Float64
Generate a (standard) normally distributed double precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float64_normal(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float64
- next_uint32(self) drjit.llvm.UInt¶
- next_uint32(self, arg: drjit.llvm.Bool, /) drjit.llvm.UInt
Overloaded function.
next_uint32(self) -> drjit.llvm.UInt
Generate a uniformly distributed unsigned 32-bit random number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_uint32(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.UInt
- next_uint64(self) drjit.llvm.UInt64¶
- next_uint64(self, arg: drjit.llvm.Bool, /) drjit.llvm.UInt64
Overloaded function.
next_uint64(self) -> drjit.llvm.UInt64
Generate a uniformly distributed unsigned 64-bit random number.
Internally, the function calls next_uint32() twice.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_uint64(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.UInt64
- next_uint32_bounded(self, bound: int, mask: drjit.llvm.Bool = Bool(True)) drjit.llvm.UInt¶
Generate a uniformly distributed 32-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
- next_uint64_bounded(self, bound: int, mask: drjit.llvm.Bool = Bool(True)) drjit.llvm.UInt64¶
Generate a uniformly distributed 64-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
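One well-known iterative scheme for unbiased bounded integers is rejection sampling with a threshold, sketched below. The function takes any callable that returns uniform 32-bit integers; the names are hypothetical, and the scheme actually used by Dr.Jit may differ (for instance, it could be based on a bitmask instead).

```python
import random


def next_uint32_bounded(next_uint32, bound):
    """Return an unbiased integer in [0, bound) from a 32-bit source."""
    if bound <= 0:
        raise ValueError("bound must be positive")
    # Values below the threshold would make some remainders more likely
    # than others, so they are rejected. threshold == 2**32 % bound.
    threshold = (2**32 - bound) % bound
    while True:
        r = next_uint32()
        if r >= threshold:
            return r % bound


# Usage with Python's own generator standing in for the PRNG.
rng = random.Random(0)
samples = [next_uint32_bounded(lambda: rng.getrandbits(32), 6) for _ in range(10)]
```

Since the threshold is always smaller than `bound`, a draw is rejected with probability below `bound / 2**32`, which is why the loop usually ends after one iteration for small bounds.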
- __add__(self, arg: drjit.llvm.Int64, /) drjit.llvm.PCG32¶
Advance the pseudorandom number generator.
This function implements a multi-step advance function that is equivalent to (but more efficient than) calling the random number generator arg times in sequence.
This is useful to advance a newly constructed PRNG to a certain known state.
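A multi-step advance of a linear congruential generator can be computed in a logarithmic number of steps by repeatedly squaring the affine state update. The following pure-Python sketch shows this standard technique on the 64-bit LCG underlying PCG32; the function names are hypothetical and it is not taken from the Dr.Jit source.

```python
MASK64 = (1 << 64) - 1
MULT = 6364136223846793005  # PCG32 LCG multiplier


def lcg_step(state, inc):
    """Single state update: state -> MULT * state + inc (mod 2**64)."""
    return (state * MULT + inc) & MASK64


def lcg_advance(state, inc, delta):
    """Apply lcg_step `delta` times using O(log delta) multiplications."""
    delta &= MASK64  # a negative delta wraps around and rewinds
    cur_mult, cur_plus = MULT, inc  # affine map for 2**k steps
    acc_mult, acc_plus = 1, 0       # accumulated affine map
    while delta > 0:
        if delta & 1:
            acc_mult = (acc_mult * cur_mult) & MASK64
            acc_plus = (acc_plus * cur_mult + cur_plus) & MASK64
        # Compose the current map with itself to double its step count.
        cur_plus = ((cur_mult + 1) * cur_plus) & MASK64
        cur_mult = (cur_mult * cur_mult) & MASK64
        delta >>= 1
    return (acc_mult * state + acc_plus) & MASK64
```

Because the state sequence has period `2**64`, advancing by `-n` (taken modulo `2**64`) undoes an advance by `n`.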
- __iadd__(self, arg: drjit.llvm.Int64, /) drjit.llvm.PCG32¶
In-place addition operator based on
__add__().
- __sub__(self, arg: drjit.llvm.Int64, /) drjit.llvm.PCG32¶
- __sub__(self, arg: drjit.llvm.PCG32, /) drjit.llvm.Int64
Overloaded function.
__sub__(self, arg: drjit.llvm.Int64, /) -> drjit.llvm.PCG32
Rewind the pseudorandom number generator.
This function implements the opposite of __add__ to step a PRNG backwards. It can also compute the difference (as counted by the number of internal next_uint32 steps) between two PCG32 instances. This assumes that the two instances were consistently seeded.
__sub__(self, arg: drjit.llvm.PCG32, /) -> drjit.llvm.Int64
- __isub__(self, arg: drjit.llvm.Int64, /) drjit.llvm.PCG32¶
In-place subtraction operator based on
__sub__().
- property inc¶
Sequence increment of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- property state¶
Sequence state of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- class drjit.llvm.Philox4x32(*args, **kwargs)¶
Philox4x32 counter-based PRNG
This class implements the Philox 4x32 counter-based pseudo-random number generator based on the paper Parallel Random Numbers: As Easy as 1, 2, 3 by Salmon et al. [2011]. It uses strength-reduced cryptographic primitives to realize a complex transition function that maps a seed and a set of counter values to 4 pseudorandom outputs. Incrementing any of the counters or choosing a different seed produces statistically independent samples.
The implementation here uses a reduced number of bits (32) for the arithmetic and sets the default number of rounds to 7. Even with these simplifications, it passes the stringent TestU01 BigCrush tests (a battery of statistical tests for non-uniformity and correlations). Please see the paper Random number generators for massively parallel simulations on GPU by Manssen et al. [2012] for details.
Functions like next_uint32x4() or next_float32x4() advance the PRNG state by incrementing the counter ctr[3].
Key properties include:
Counter-based design: generation from counter + key
192-bit state: 4x32-bit counters, 64-bit key
Trivial jump-ahead capability through counter manipulation
The Philox4x32 class is implemented as a PyTree, making it compatible with symbolic function calls, loops, etc.
Note
Philox4x32 naturally produces 4 samples at a time, which may be awkward for applications that need individual random values.
Note
For a comparison of use cases between Philox4x32 and PCG32, see the PCG32 class documentation. In brief: use PCG32 for sequential generation with lowest cost per sample; use Philox4x32 for parallel generation where independent streams are critical.
Note
Please watch out for the following pitfall when using the Philox4x32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float32x4()) changes the internal RNG counter value. If this state is never explicitly evaluated, the computation graph describing this change keeps growing, and the compilation of increasingly large kernels eventually becomes a bottleneck. The drjit.rng API avoids this pitfall by eagerly evaluating the RNG counter when needed.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 PRNG with its lower per-sample cost. You can seed this generator once and reuse it throughout the loop.
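The counter-based design summarized above can be illustrated with a scalar, pure-Python version of the Philox 4x32 round function. The multiplier and key-schedule constants are those of the Random123 reference implementation; the function names, the split of the 64-bit seed into two key words, and the counter layout are assumptions made for this sketch rather than a description of Dr.Jit's internals.

```python
MASK32 = 0xFFFFFFFF
M0, M1 = 0xD2511F53, 0xCD9E8D57  # round multipliers
W0, W1 = 0x9E3779B9, 0xBB67AE85  # key schedule increments


def philox4x32(counter, key, rounds=7):
    """Map a 4x32-bit counter and a 2x32-bit key to 4 32-bit outputs."""
    c0, c1, c2, c3 = counter
    k0, k1 = key
    for _ in range(rounds):
        p0 = M0 * c0  # 64-bit products, split into high/low words below
        p1 = M1 * c2
        c0, c1, c2, c3 = (
            (p1 >> 32) ^ c1 ^ k0,
            p1 & MASK32,
            (p0 >> 32) ^ c3 ^ k1,
            p0 & MASK32,
        )
        k0 = (k0 + W0) & MASK32
        k1 = (k1 + W1) & MASK32
    return (c0, c1, c2, c3)


def sample_stream(seed, counter_0, n, rounds=7):
    """Yield n output blocks by incrementing the last counter word."""
    key = (seed & MASK32, (seed >> 32) & MASK32)  # assumed key layout
    for i in range(n):
        yield philox4x32((counter_0, 0, 0, i), key, rounds)
```

Jump-ahead is trivial in this model: the block at position `i` depends only on the counter value, so it can be computed directly without generating the preceding blocks.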
- __init__(self, seed: drjit.llvm.UInt64, counter_0: drjit.llvm.UInt, counter_1: drjit.llvm.UInt = 0, counter_2: drjit.llvm.UInt = 0, iterations: int = 7) None¶
- __init__(self, arg: drjit.llvm.Philox4x32) None
Overloaded function.
__init__(self, seed: drjit.llvm.UInt64, counter_0: drjit.llvm.UInt, counter_1: drjit.llvm.UInt = 0, counter_2: drjit.llvm.UInt = 0, iterations: int = 7) -> None
Initialize a Philox4x32 random number generator.
The function takes a seed and three of the four counter components. The last component is zero-initialized and incremented by calls to the next_* methods.
- Parameters:
seed – The 64-bit seed value used as the key for the mapping
counter_0 – The first 32-bit counter value (least significant)
counter_1 – The second 32-bit counter value (default: 0)
counter_2 – The third 32-bit counter value (default: 0)
iterations – Number of rounds to apply (default: 7, range: 4-10)
For parallel stream generation, simply use different counter values - each combination of counter values produces an independent random stream.
__init__(self, arg: drjit.llvm.Philox4x32) -> None
Copy constructor
- next_uint32x4(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array4u¶
Generate 4 random 32-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 4 independent 32-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random 32-bit unsigned integers
- next_uint64x2(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array2u64¶
Generate 2 random 64-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 2 independent 64-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random 64-bit unsigned integers
- next_float16x4(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array4f16¶
Generate 4 random half-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to half precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float32x4(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array4f¶
Generate 4 random single-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to single precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float64x2(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array2f64¶
Generate 2 random double-precision floats in \([0, 1)\).
Generates 2 random 64-bit unsigned integers and converts them to floats uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats on the half-open interval \([0, 1)\)
- next_float16x4_normal(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array4f16¶
Generate 4 normally distributed half-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 half precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float32x4_normal(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array4f¶
Generate 4 normally distributed single-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 single precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float64x2_normal(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array2f64¶
Generate 2 normally distributed double-precision floats
Advances the internal counter and applies the Philox mapping to produce 2 double precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats from a standard normal distribution
- property seed¶
(self) -> drjit.llvm.Array2u
- property counter¶
(self) -> drjit.llvm.Array4u
- property iterations¶
(self) -> int
LLVM array namespace with automatic differentiation (drjit.llvm.ad)¶
The LLVM AD backend is vectorized, hence types listed as scalar actually represent an array of scalars partaking in a parallel computation (analogously, 1D arrays are arrays of 1D arrays, etc.).
Scalars¶
- class drjit.llvm.ad.Bool¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Float16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Float¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Float64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.UInt¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.UInt8¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.UInt64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Int¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Int8¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Int64¶
Derives from
drjit.ArrayBase.
1D arrays¶
- class drjit.llvm.ad.Array0b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array1b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array2b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array3b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array4b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.ArrayXb¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array0f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array1f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array2f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array3f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array4f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.ArrayXf16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array0f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array1f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array2f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array3f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array4f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.ArrayXf¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array0u¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array1u¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array2u¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array3u¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array4u¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.ArrayXu¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array0i¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array1i¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array2i¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array3i¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array4i¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.ArrayXi¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array0f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array1f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array2f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array3f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array4f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.ArrayXf64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array0u64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array1u64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array2u64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array3u64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array4u64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.ArrayXu64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array0i64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array1i64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array2i64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array3i64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array4i64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.ArrayXi64¶
Derives from
drjit.ArrayBase.
2D arrays¶
- class drjit.llvm.ad.Array22b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array33b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array44b¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array22f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array33f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array44f16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array22f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array33f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array44f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array22f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array33f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Array44f64¶
Derives from
drjit.ArrayBase.
Special (complex numbers, etc.)¶
- class drjit.llvm.ad.Complex2f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Complex2f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Quaternion4f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Quaternion4f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Matrix2f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Matrix3f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Matrix4f¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Matrix2f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Matrix3f64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.Matrix4f64¶
Derives from
drjit.ArrayBase.
Tensors¶
- class drjit.llvm.ad.TensorXb¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.TensorXf16¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.TensorXf¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.TensorXu¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.TensorXi¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.TensorXf64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.TensorXu64¶
Derives from
drjit.ArrayBase.
- class drjit.llvm.ad.TensorXi64¶
Derives from
drjit.ArrayBase.
Textures¶
- class drjit.llvm.ad.Texture1f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
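The interplay of linear filtering and the clamp wrap mode can be sketched for a single-channel 1D texture in pure Python. The sketch assumes normalized positions on \([0, 1]\) with texel centers at \((i + 0.5)/N\), a convention common to GPU texture units; the function name is hypothetical and the exact convention used by Dr.Jit is an assumption here.

```python
import math


def eval_linear_clamped(texels, pos):
    """Linearly interpolate a 1D texture at normalized position `pos`."""
    n = len(texels)
    # Convert to continuous texel coordinates (texel i is centered at i).
    p = pos * n - 0.5
    i = math.floor(p)
    t = p - i
    # Clamp wrap mode: indices outside the grid reuse the boundary texel.
    i0 = min(max(i, 0), n - 1)
    i1 = min(max(i + 1, 0), n - 1)
    return (1 - t) * texels[i0] + t * texels[i1]
```

Queries far outside \([0, 1]\) return the first or last texel, which is the "indefinitely extended boundary" behavior described for the clamp mode.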
- set_value(self, value: drjit.llvm.ad.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
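The rewrite of a cubic B-Spline lookup as a weighted sum of linear lookups can be checked in 1D with a few lines of pure Python. The sketch works directly in texel coordinates and clamps out-of-range indices; all function names are hypothetical, and it illustrates the idea from the cited GPU Gems chapter rather than Dr.Jit's implementation.

```python
import math


def bspline_weights(t):
    """Uniform cubic B-Spline basis weights for a fractional offset t."""
    w0 = (1 - t) ** 3 / 6
    w1 = (3 * t**3 - 6 * t**2 + 4) / 6
    w2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    w3 = t**3 / 6
    return w0, w1, w2, w3


def fetch(texels, i):
    """Texel fetch with clamping at the boundaries."""
    return texels[min(max(i, 0), len(texels) - 1)]


def lerp_lookup(texels, p):
    """Linear interpolation at continuous texel coordinate p."""
    i = math.floor(p)
    u = p - i
    return (1 - u) * fetch(texels, i) + u * fetch(texels, i + 1)


def cubic_direct(texels, p):
    """Direct evaluation: weighted sum of four neighboring texels."""
    i = math.floor(p)
    w = bspline_weights(p - i)
    return sum(w[k] * fetch(texels, i - 1 + k) for k in range(4))


def cubic_via_linear(texels, p):
    """Same result from two linear lookups at shifted positions."""
    i = math.floor(p)
    w0, w1, w2, w3 = bspline_weights(p - i)
    g0, g1 = w0 + w1, w2 + w3
    return (g0 * lerp_lookup(texels, i - 1 + w1 / g0)
            + g1 * lerp_lookup(texels, i + 1 + w3 / g1))
```

The lookup positions `i - 1 + w1 / g0` and `i + 1 + w3 / g1` depend nonlinearly on the query position, which is the reason the text gives for not differentiating through this formulation.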
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
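The positional gradient can be obtained by differentiating the B-Spline basis weights analytically and applying the chain rule for the mapping from normalized to texel coordinates. The 1D sketch below assumes normalized positions with texel centers at \((i + 0.5)/N\); the function names are hypothetical, and the factor `n` plays the role of the spatial extent mentioned above.

```python
import math


def fetch(texels, i):
    """Texel fetch with clamping at the boundaries."""
    return texels[min(max(i, 0), len(texels) - 1)]


def cubic_value(texels, pos):
    """Cubic B-Spline interpolant at normalized position pos."""
    n = len(texels)
    p = pos * n - 0.5
    i = math.floor(p)
    t = p - i
    w = ((1 - t) ** 3 / 6,
         (3 * t**3 - 6 * t**2 + 4) / 6,
         (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6,
         t**3 / 6)
    return sum(w[k] * fetch(texels, i - 1 + k) for k in range(4))


def cubic_grad(texels, pos):
    """Derivative of cubic_value with respect to pos."""
    n = len(texels)
    p = pos * n - 0.5
    i = math.floor(p)
    t = p - i
    # Analytic derivatives of the four basis weights with respect to t.
    dw = (-((1 - t) ** 2) / 2,
          1.5 * t**2 - 2 * t,
          -1.5 * t**2 + t + 0.5,
          t**2 / 2)
    d_dp = sum(dw[k] * fetch(texels, i - 1 + k) for k in range(4))
    # Chain rule: p = pos * n - 0.5, so dp/dpos = n (the spatial extent).
    return d_dp * n
```

A central finite difference of `cubic_value` is a convenient way to sanity-check the analytic result.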
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.ad.Texture2f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.ad.Texture3f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
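The "linearized 1D array" can be pictured with the following minimal sketch. It assumes a row-major (C-order) layout with the channel index varying fastest, which this reference does not state explicitly; `linear_index` is a hypothetical helper and not part of Dr.Jit.

```python
def linear_index(shape, channels, coord, channel):
    """Offset of `channel` of texel `coord` in a flat, row-major array.

    Assumption: the slowest-varying axis comes first in `shape` and the
    channel index varies fastest.
    """
    index = 0
    for extent, c in zip(shape, coord):
        index = index * extent + c
    return index * channels + channel


shape, channels = (2, 3, 4), 2            # (depth, height, width), 2 channels
flat_size = 2 * 3 * 4 * channels          # number of entries in the flat array
first = linear_index(shape, channels, (0, 0, 0), 0)
last = linear_index(shape, channels, (1, 2, 3), 1)
```

Under this assumption, the flat array passed to set_value() would hold `flat_size` entries, and `last` addresses the final one.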
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.ad.Texture1f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
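As a rough illustration of what a 1D linear interpolant with the default clamp wrap mode computes, here is a plain-Python sketch. It assumes query positions are normalized to [0, 1] and that texel i is centered at (i + 0.5) / n; both conventions are assumptions of this sketch, and `eval_linear_1d` is a hypothetical stand-in rather than the Dr.Jit implementation.

```python
import math


def eval_linear_1d(texels, pos):
    """Linearly interpolate `texels` at normalized position `pos`."""
    n = len(texels)
    # Assumed convention: texel i is centered at (i + 0.5) / n.
    x = pos * n - 0.5
    i0 = math.floor(x)
    w1 = x - i0                      # weight of the right-hand neighbor

    def clamp(i):
        # Clamp wrap mode: out-of-range indices reuse the boundary texel.
        return min(max(i, 0), n - 1)

    return (1.0 - w1) * texels[clamp(i0)] + w1 * texels[clamp(i0 + 1)]


data = [0.0, 1.0, 4.0, 9.0]
center = eval_linear_1d(data, 0.375)   # exactly at the center of texel 1
between = eval_linear_1d(data, 0.5)    # halfway between texels 1 and 2
outside = eval_linear_1d(data, 1.5)    # beyond the boundary, clamped
```

Querying a texel center returns the stored value, a position midway between two centers returns their average, and positions outside [0, 1] return the boundary texel.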
- eval_fetch(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
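The equivalence between the direct B-spline sum and the weighted sum of linear lookups can be checked in 1D with the sketch below. It works in texel-space coordinates (texel i sits at coordinate i) and uses clamped indexing; all function names are hypothetical and the code only illustrates the technique from the cited GPU Gems chapter, not the Dr.Jit source.

```python
import math


def bspline_weights(a):
    """Uniform cubic B-spline weights for a fractional offset a in [0, 1)."""
    return ((1 - a) ** 3 / 6,
            (3 * a**3 - 6 * a**2 + 4) / 6,
            (-3 * a**3 + 3 * a**2 + 3 * a + 1) / 6,
            a**3 / 6)


def texel(f, i):
    return f[min(max(i, 0), len(f) - 1)]      # clamp wrap mode


def linear(f, x):
    """Linear interpolation at texel-space coordinate x."""
    i = math.floor(x)
    t = x - i
    return (1 - t) * texel(f, i) + t * texel(f, i + 1)


def cubic_direct(f, x):
    """Direct evaluation: four texel reads weighted by the basis functions."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * texel(f, i - 1 + k) for k in range(4))


def cubic_two_lookups(f, x):
    """Same result from two linear lookups at shifted positions."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3
    return (g0 * linear(f, i - 1 + w1 / g0) +
            g1 * linear(f, i + 1 + w3 / g1))


f = [1.0, 3.0, 2.0, 5.0, 4.0]
x = 2.3
direct = cubic_direct(f, x)
fast = cubic_two_lookups(f, x)
```

The lookup positions depend on the fractional offset through the ratios `w1 / g0` and `w3 / g1`, which is the nonlinearity in position that the paragraph above refers to.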
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
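A minimal 1D sketch of evaluating the gradient from explicitly differentiated basis functions follows. It assumes normalized positions with texel i centered at (i + 0.5) / n, and it reads "multiplied by the spatial extents" as scaling the texel-space derivative by the resolution n; that reading, like the helper names, is an assumption of this sketch.

```python
import math


def texel(f, i):
    return f[min(max(i, 0), len(f) - 1)]      # clamp wrap mode


def cubic_value(f, pos):
    """Cubic B-spline interpolant at normalized position pos."""
    n = len(f)
    x = pos * n - 0.5
    i = math.floor(x)
    a = x - i
    w = ((1 - a) ** 3 / 6,
         (3 * a**3 - 6 * a**2 + 4) / 6,
         (-3 * a**3 + 3 * a**2 + 3 * a + 1) / 6,
         a**3 / 6)
    return sum(w[k] * texel(f, i - 1 + k) for k in range(4))


def cubic_grad(f, pos):
    """Derivative with respect to pos from differentiated basis functions."""
    n = len(f)
    x = pos * n - 0.5
    i = math.floor(x)
    a = x - i
    dw = (-(1 - a) ** 2 / 2,
          (3 * a**2 - 4 * a) / 2,
          (-3 * a**2 + 2 * a + 1) / 2,
          a**2 / 2)
    d_dx = sum(dw[k] * texel(f, i - 1 + k) for k in range(4))
    return d_dx * n                            # chain rule: dx/dpos = n


f = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
pos, eps = 0.43, 1e-6
analytic = cubic_grad(f, pos)
finite_diff = (cubic_value(f, pos + eps) - cubic_value(f, pos - eps)) / (2 * eps)
```

The central finite difference agrees with the analytic derivative to within discretization error, which is a quick way to sanity-check the differentiated weights.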
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.ad.Texture2f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
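For a 2D texture, a bilinear lookup references four texels. The sketch below returns those texels together with the weights an interpolating lookup would apply, so the caller can combine them as needed. The texel ordering, the mapping of the first position component to the width axis, and the half-texel offset are assumptions of this sketch; `fetch_2d` is a hypothetical helper operating on a single-channel nested list.

```python
import math


def fetch_2d(grid, u, v):
    """Return the four texels around (u, v) and their bilinear weights."""
    h, w = len(grid), len(grid[0])
    # Assumed convention: texel centers sit at (i + 0.5) / resolution.
    x, y = u * w - 0.5, v * h - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def cx(i):
        return min(max(i, 0), w - 1)          # clamp along the width

    def cy(j):
        return min(max(j, 0), h - 1)          # clamp along the height

    texels = [grid[cy(y0)][cx(x0)], grid[cy(y0)][cx(x0 + 1)],
              grid[cy(y0 + 1)][cx(x0)], grid[cy(y0 + 1)][cx(x0 + 1)]]
    weights = [(1 - fx) * (1 - fy), fx * (1 - fy),
               (1 - fx) * fy, fx * fy]
    return texels, weights


grid = [[0.0, 1.0],
        [2.0, 3.0]]
texels, weights = fetch_2d(grid, 0.5, 0.5)
blended = sum(t * w for t, w in zip(texels, weights))
```

Summing the texels with the bilinear weights reproduces the linear interpolant; other weightings can be applied to the same fetched values.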
- eval_cubic(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.ad.Texture3f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
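How a wrap mode maps an out-of-range texel index back into the grid can be sketched as follows. Only the clamp behavior is described in this reference; the "repeat" and "mirror" branches show the usual texture-addressing conventions and are assumptions here, as is the hypothetical `wrap_index` helper with string-valued modes.

```python
def wrap_index(i, n, mode):
    """Map an integer texel index i onto the valid range [0, n - 1]."""
    if mode == "clamp":
        # Extend the boundary texel indefinitely.
        return min(max(i, 0), n - 1)
    if mode == "repeat":
        # Tile the texture periodically (assumed convention).
        return i % n
    if mode == "mirror":
        # Reflect about the texture edges (assumed convention).
        period = 2 * n
        j = i % period
        return j if j < n else period - 1 - j
    raise ValueError(f"unknown wrap mode: {mode}")


n = 4
clamped = [wrap_index(i, n, "clamp") for i in (-2, 1, 5)]
repeated = [wrap_index(i, n, "repeat") for i in (-1, 1, 5)]
mirrored = [wrap_index(i, n, "mirror") for i in (-1, 4, 5)]
```

With clamping, every index past the last texel resolves to the last texel, which matches the description of boundary colors being extended along each dimension.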
- set_value(self, value: drjit.llvm.ad.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.ad.Texture1f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
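The following pure-Python sketch illustrates what a 1D linear lookup with clamp-style wrapping computes. It is not Dr.Jit's implementation: the helper name eval_linear_1d and the convention that texel i is centered at (i + 0.5) / resolution in unit coordinates are assumptions made for this illustration.

```python
import math

def eval_linear_1d(texels, pos):
    """Linearly interpolate a 1D texture at `pos` given in unit coordinates.

    Assumption of this sketch: texel i is centered at (i + 0.5) / len(texels),
    and out-of-range indices are clamped to the boundary texels.
    """
    res = len(texels)
    x = pos * res - 0.5                      # continuous texel coordinate
    i0 = math.floor(x)                       # left neighbor index
    w1 = x - i0                              # weight of the right neighbor
    clamp = lambda i: min(max(i, 0), res - 1)
    return (1.0 - w1) * texels[clamp(i0)] + w1 * texels[clamp(i0 + 1)]

texels = [0.0, 1.0, 2.0, 3.0]
center = eval_linear_1d(texels, 0.375)   # exactly at the center of texel 1
midway = eval_linear_1d(texels, 0.5)     # halfway between texels 1 and 2
outside = eval_linear_1d(texels, 1.5)    # beyond the right edge: boundary value
```

Querying beyond the unit interval returns the boundary texel, which is the "indefinitely extends the colors on the boundary" behavior of the clamp mode.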
- eval_fetch(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
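The reformulation of a cubic B-spline lookup as a weighted sum of linear lookups can be sketched in one dimension with plain Python. This is only an illustration of the idea from the cited GPU Gems chapter, not Dr.Jit's code: all function names are hypothetical, positions are given directly in texel coordinates, and borders are handled by clamping.

```python
import math

def bspline_weights(t):
    """Uniform cubic B-spline basis weights for a fractional offset t in [0, 1)."""
    w0 = (1.0 - t) ** 3 / 6.0
    w1 = (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0
    w2 = (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0
    w3 = t**3 / 6.0
    return w0, w1, w2, w3

def fetch(texels, i):
    """Read texel i, clamping the index to the valid range."""
    return texels[min(max(i, 0), len(texels) - 1)]

def linear(texels, x):
    """Linear interpolation at the continuous texel coordinate x."""
    i = math.floor(x)
    w = x - i
    return (1.0 - w) * fetch(texels, i) + w * fetch(texels, i + 1)

def cubic_direct(texels, x):
    """Direct evaluation: four texel reads weighted by the basis functions."""
    i = math.floor(x)
    weights = bspline_weights(x - i)
    return sum(wk * fetch(texels, i - 1 + k) for k, wk in enumerate(weights))

def cubic_via_linear(texels, x):
    """Same result from two linear lookups at shifted positions."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3          # combined weight of each texel pair
    h0 = (i - 1) + w1 / g0             # lookup position between texels i-1 and i
    h1 = (i + 1) + w3 / g1             # lookup position between texels i+1 and i+2
    return g0 * linear(texels, h0) + g1 * linear(texels, h1)

texels = [0.2, 1.0, -0.5, 3.0, 2.0, 0.7]
direct = cubic_direct(texels, 2.75)
folded = cubic_via_linear(texels, 2.75)
```

The lookup positions h0 and h1 depend nonlinearly on the query position, which is why differentiating through this form with respect to position does not yield the derivative of the interpolant.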
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
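The scaling by the spatial extent can be illustrated with a small pure-Python sketch in 1D. It differentiates the B-spline basis functions explicitly and multiplies by the resolution, then compares against a finite difference taken in unit coordinates. The function names and the texel-center convention (texel i at (i + 0.5) / resolution) are assumptions of this sketch, not part of the Dr.Jit API.

```python
import math

def _fetch(texels, i):
    """Read texel i, clamping the index to the valid range."""
    return texels[min(max(i, 0), len(texels) - 1)]

def cubic_value(texels, u):
    """Cubic B-spline interpolant at unit coordinate u."""
    res = len(texels)
    x = u * res - 0.5
    i = math.floor(x)
    t = x - i
    w = ((1 - t) ** 3 / 6,
         (3 * t**3 - 6 * t**2 + 4) / 6,
         (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6,
         t**3 / 6)
    return sum(wk * _fetch(texels, i - 1 + k) for k, wk in enumerate(w))

def cubic_grad(texels, u):
    """Positional derivative with respect to the unit coordinate u."""
    res = len(texels)
    x = u * res - 0.5
    i = math.floor(x)
    t = x - i
    # Derivatives of the four basis weights with respect to t
    dw = (-0.5 * (1 - t) ** 2,
          1.5 * t**2 - 2 * t,
          -1.5 * t**2 + t + 0.5,
          0.5 * t**2)
    d_dx = sum(dk * _fetch(texels, i - 1 + k) for k, dk in enumerate(dw))
    return d_dx * res   # chain rule: dx/du equals the resolution

texels = [float(i * i) for i in range(6)]
u = 0.4
analytic = cubic_grad(texels, u)
h = 1e-6
numeric = (cubic_value(texels, u + h) - cubic_value(texels, u - h)) / (2 * h)
```

Without the final multiplication by the resolution, the result would be the derivative per texel rather than per unit of the normalized domain.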
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.ad.Texture2f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
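As a rough illustration of what "fetching without interpolating" means in 2D, the sketch below returns the four texels surrounding a query point together with the bilinear weights that a subsequent interpolation step would apply. It is plain Python rather than Dr.Jit code. The name fetch_2d, the corner ordering, the half-integer texel-center convention, and the fact that weights are returned at all are assumptions of this sketch (eval_fetch itself only returns texel values).

```python
import math

def fetch_2d(texels, u, v):
    """Return the four neighboring texels and their bilinear weights.

    `texels` is a list of rows (height x width) with one channel. Texel
    centers are assumed to sit at half-integer offsets, and indices are
    clamped at the borders.
    """
    height, width = len(texels), len(texels[0])
    x, y = u * width - 0.5, v * height - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    cx = lambda i: min(max(i, 0), width - 1)
    cy = lambda j: min(max(j, 0), height - 1)
    # Corner order in this sketch: (x0, y0), (x0+1, y0), (x0, y0+1), (x0+1, y0+1)
    corners = [texels[cy(y0 + dy)][cx(x0 + dx)] for dy in (0, 1) for dx in (0, 1)]
    weights = [(1 - fx) * (1 - fy), fx * (1 - fy), (1 - fx) * fy, fx * fy]
    return corners, weights

grid = [[0.0, 1.0],
        [2.0, 3.0]]
corners, weights = fetch_2d(grid, 0.5, 0.5)
# Applying the weights afterwards reproduces the bilinear interpolant.
blended = sum(c * w for c, w in zip(corners, weights))
```

Having access to the raw texels is useful when a custom reconstruction filter should be applied instead of the built-in linear blend.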
- eval_cubic(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.llvm.ad.Texture3f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
Random number generators¶
- class drjit.llvm.ad.PCG32(*args, **kwargs)¶
Implementation of PCG32, a member of the PCG family of random number generators proposed by Melissa O’Neill.
PCG32 is a stateful pseudorandom number generator that combines a linear congruential generator (LCG) with a permutation function. It provides high statistical quality with a remarkably fast and compact implementation. Details on the PCG family of pseudorandom number generators can be found here.
To create random tensors of different sizes in Python, prefer the higher-level
dr.rng() interface, which internally uses the Philox4x32 generator. The properties of PCG32 make it most suitable for Monte Carlo applications requiring long sequences of random variates.
Key properties of the PCG variant implemented here include:
Compact: 128 bits total state (64-bit state + 64-bit increment)
Output: 32-bit output with a period of 2^64 per stream
Streams: Multiple independent streams via the increment parameter (with caveats, see below)
Low-cost sample generation: a single 64 bit integer multiply-add plus a bit permutation applied to the output.
Extra features: provides fast multi-step advance/rewind functionality.
Caveats: PCG32 produces high-quality random variates within each random number stream. For a given initial state, PCG32 can also produce multiple output streams by specifying a different sequence increment (
initseq) to the constructor. However, the level of statistical independence across streams is generally insufficient when doing so. To obtain a series of high-quality independent parallel streams, it is recommended to use another method (e.g., the Tiny Encryption Algorithm) to seed the state and inc parameters. This ensures independence both within and across streams.
In Python, the
PCG32 class is implemented as a PyTree, which means that it is compatible with symbolic function calls, loops, etc.
Note
Please watch out for the following pitfall when using the PCG32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through
next_float()) changes the internal RNG state. If this state is never explicitly evaluated, the computation graph describing the state transformation keeps growing without bound, causing kernel compilation of increasingly large programs to eventually become a bottleneck. To evaluate the RNG, simply run
rng: PCG32 = ....
dr.eval(rng)
For computation involving very large arrays, storing the RNG state (16 bytes per entry) can be prohibitive. In this case, it is better to keep the RNG in symbolic form and re-seed it at every optimization iteration.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 API directly to seed once and reuse the random number generator throughout the loop.
The
drjit.rng API avoids these pitfalls by eagerly evaluating the RNG state.
Comparison with Philox4x32:
PCG32: State-based, better for sequential generation, low per-sample cost.
Philox4x32: Counter-based, better for parallel generation, higher per-sample cost.
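The structure described above (a 64-bit LCG followed by an output permutation) can be sketched in plain Python as follows. This is a scalar illustration modeled on the reference PCG32 (XSH-RR) algorithm, not the Dr.Jit code itself: the class name ScalarPCG32 and the helper make_lanes are hypothetical, and the assumption is that the Dr.Jit variant follows the reference step and seeding procedure. The default constants are the ones listed in the constructor signature.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 6364136223846793005   # LCG multiplier of the reference PCG32


class ScalarPCG32:
    """Scalar sketch of PCG32: 64-bit state + 64-bit increment, 32-bit output."""

    def __init__(self, initstate=0x853C49E6748FEA9B, initseq=0xDA3E39CB94B95BDB):
        self.seed(initstate, initseq)

    def seed(self, initstate, initseq):
        # Seeding procedure of the reference implementation.
        self.state = 0
        self.inc = ((initseq << 1) | 1) & MASK64   # the increment must be odd
        self.next_uint32()
        self.state = (self.state + initstate) & MASK64
        self.next_uint32()

    def next_uint32(self):
        old = self.state
        # One 64-bit multiply-add advances the LCG ...
        self.state = (old * PCG32_MULT + self.inc) & MASK64
        # ... and a xorshift + random rotation permutes the output.
        xorshifted = (((old >> 18) ^ old) >> 27) & 0xFFFFFFFF
        rot = old >> 59
        return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & 0xFFFFFFFF


def make_lanes(size, initstate=0x853C49E6748FEA9B, initseq=0xDA3E39CB94B95BDB):
    """Hypothetical helper: one generator per lane, offset by the lane index."""
    return [ScalarPCG32(initstate + i, initseq + i) for i in range(size)]
```

The make_lanes helper mirrors the constructor behavior of adding an arange offset to both inputs. As the caveats note, such offsets de-correlate the lanes but do not guarantee statistically independent streams.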
- __init__(self, size: int = 1, initstate: drjit.llvm.ad.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.ad.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
- __init__(self, arg: drjit.llvm.ad.PCG32) None
Overloaded function.
__init__(self, size: int = 1, initstate: drjit.llvm.ad.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.ad.UInt64 = UInt64(0xda3e39cb94b95bdb)) -> None
Initialize a random number generator that generates
size variates in parallel.
The
initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are based on the original implementation.
The implementation of this routine internally calls seed(), with one small twist. When multiple random numbers are being generated in parallel, the constructor adds an offset equal to
drjit.arange(UInt64, size) to both initstate and initseq to de-correlate the generated sequences.
__init__(self, arg: drjit.llvm.ad.PCG32) -> None
Copy-construct a new PCG32 instance from an existing instance.
- seed(self, initstate: drjit.llvm.ad.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.ad.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
Seed the random number generator with the given initial state and sequence ID.
The
initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are those of the original implementation.
- next_float(self, dtype: type, mask: object = True) object¶
Generate a uniformly distributed floating point number of the requested precision on the interval \([0, 1)\).
The function analyzes the provided target
dtype and either invokes next_float16(), next_float32() or next_float64() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
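The documentation does not state how random integers are turned into floats on \([0, 1)\). One common approach, shown below as a plain-Python sketch, places random bits into the mantissa of a float in \([1, 2)\) and subtracts 1. That Dr.Jit uses exactly this mapping is an assumption, and the function names are hypothetical.

```python
import struct


def uint32_to_unit_float32(u):
    """Map a 32-bit integer to a single precision value in [0, 1)."""
    bits = (u >> 9) | 0x3F800000          # exponent of 1.0, 23 random mantissa bits
    (value,) = struct.unpack("<f", struct.pack("<I", bits))
    return value - 1.0                    # shift [1, 2) to [0, 1)


def uint64_to_unit_float64(u):
    """Map a 64-bit integer to a double precision value in [0, 1)."""
    bits = (u >> 12) | 0x3FF0000000000000  # exponent of 1.0, 52 random mantissa bits
    (value,) = struct.unpack("<d", struct.pack("<Q", bits))
    return value - 1.0
```

With this construction the result can never reach 1, which matches the half-open interval stated above.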
- next_float16(self) drjit.llvm.ad.Float16¶
- next_float16(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float16
Overloaded function.
next_float16(self) -> drjit.llvm.ad.Float16
Generate a uniformly distributed half precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float16(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float16
- next_float32(self) drjit.llvm.ad.Float¶
- next_float32(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float
Overloaded function.
next_float32(self) -> drjit.llvm.ad.Float
Generate a uniformly distributed single precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float32(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float
- next_float64(self) drjit.llvm.ad.Float64¶
- next_float64(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float64
Overloaded function.
next_float64(self) -> drjit.llvm.ad.Float64
Generate a uniformly distributed double precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float64(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float64
- next_float_normal(self, dtype: type, mask: object = True) object¶
Generate a (standard) normally distributed floating point number of the requested precision.
The function analyzes the provided target
dtype and either invokes next_float16_normal(), next_float32_normal() or next_float64_normal() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16_normal(self) drjit.llvm.ad.Float16¶
- next_float16_normal(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float16
Overloaded function.
next_float16_normal(self) -> drjit.llvm.ad.Float16
Generate a (standard) normally distributed half precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float16_normal(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float16
- next_float32_normal(self) drjit.llvm.ad.Float¶
- next_float32_normal(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float
Overloaded function.
next_float32_normal(self) -> drjit.llvm.ad.Float
Generate a (standard) normally distributed single precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float32_normal(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float
- next_float64_normal(self) drjit.llvm.ad.Float64¶
- next_float64_normal(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float64
Overloaded function.
next_float64_normal(self) -> drjit.llvm.ad.Float64
Generate a (standard) normally distributed double precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float64_normal(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float64
- next_uint32(self) drjit.llvm.ad.UInt¶
- next_uint32(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.UInt
Overloaded function.
next_uint32(self) -> drjit.llvm.ad.UInt
Generate a uniformly distributed unsigned 32-bit random number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_uint32(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.UInt
- next_uint64(self) drjit.llvm.ad.UInt64¶
- next_uint64(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.UInt64
Overloaded function.
next_uint64(self) -> drjit.llvm.ad.UInt64
Generate a uniformly distributed unsigned 64-bit random number.
Internally, the function calls
next_uint32() twice.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_uint64(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.UInt64
- next_uint32_bounded(self, bound: int, mask: drjit.llvm.ad.Bool = Bool(True)) drjit.llvm.ad.UInt¶
Generate a uniformly distributed 32-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
- next_uint64_bounded(self, bound: int, mask: drjit.llvm.ad.Bool = Bool(True)) drjit.llvm.ad.UInt64¶
Generate a uniformly distributed 64-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
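The text above only says that an iterative scheme is used. A typical way to obtain an unbiased bounded integer is the rejection loop of the reference PCG implementation, sketched here in plain Python. Whether Dr.Jit uses this exact scheme is an assumption. The function takes any callable that returns uniform 32-bit integers, and the name next_uint32_bounded is reused here only for readability.

```python
import random


def next_uint32_bounded(next_uint32, bound):
    """Return an unbiased integer in [0, bound) from a 32-bit uniform source.

    `bound` must be positive. Values below `threshold` are rejected so that
    the remaining range is an exact multiple of `bound`.
    """
    threshold = (2 ** 32 - bound) % bound    # equals 2**32 % bound
    while True:
        r = next_uint32()
        if r >= threshold:
            return r % bound


# Usage with Python's own generator standing in for the PRNG.
source = random.Random(0)
sample = next_uint32_bounded(lambda: source.getrandbits(32), 10)
```

Because fewer than half of the 32-bit values can fall below the threshold, each iteration accepts with probability greater than 1/2, which is consistent with the loop usually finishing after 1-2 iterations.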
- __add__(self, arg: drjit.llvm.ad.Int64, /) drjit.llvm.ad.PCG32¶
Advance the pseudorandom number generator.
This function implements a multi-step advance function that is equivalent to (but more efficient than) calling the random number generator
arg times in sequence.
This is useful to advance a newly constructed PRNG to a certain known state.
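A multi-step advance of a linear congruential generator can be computed in a logarithmic number of steps by repeatedly squaring the affine state update. The plain-Python sketch below shows this idea for the 64-bit LCG of the reference PCG32. The function names are hypothetical, and the assumption is that Dr.Jit's operator follows the same construction.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 6364136223846793005   # LCG multiplier of the reference PCG32


def lcg_step(state, inc):
    """One ordinary LCG step: state -> state * mult + inc (mod 2^64)."""
    return (state * PCG32_MULT + inc) & MASK64


def lcg_advance(state, inc, delta):
    """Jump `delta` steps ahead; negative values of `delta` rewind."""
    acc_mult, acc_plus = 1, 0             # accumulated map: x -> acc_mult*x + acc_plus
    cur_mult, cur_plus = PCG32_MULT, inc  # map for 2^k steps, starting with k = 0
    delta &= MASK64                       # a rewind is a jump of 2^64 - |delta| steps
    while delta:
        if delta & 1:
            acc_mult = (acc_mult * cur_mult) & MASK64
            acc_plus = (acc_plus * cur_mult + cur_plus) & MASK64
        # Compose the current map with itself to double its step count.
        cur_plus = ((cur_mult + 1) * cur_plus) & MASK64
        cur_mult = (cur_mult * cur_mult) & MASK64
        delta >>= 1
    return (acc_mult * state + acc_plus) & MASK64
```

The loop runs at most 64 times regardless of the jump distance, which is why the operator is cheaper than stepping the generator repeatedly.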
- __iadd__(self, arg: drjit.llvm.ad.Int64, /) drjit.llvm.ad.PCG32¶
In-place addition operator based on
__add__().
- __sub__(self, arg: drjit.llvm.ad.Int64, /) drjit.llvm.ad.PCG32¶
- __sub__(self, arg: drjit.llvm.ad.PCG32, /) drjit.llvm.ad.Int64
Overloaded function.
__sub__(self, arg: drjit.llvm.ad.Int64, /) -> drjit.llvm.ad.PCG32
Rewind the pseudorandom number generator.
This function implements the opposite of
__add__ to step a PRNG backwards. It can also compute the difference (as counted by the number of internal next_uint32 steps) between two PCG32 instances. This assumes that the two instances were consistently seeded.
__sub__(self, arg: drjit.llvm.ad.PCG32, /) -> drjit.llvm.ad.Int64
- __isub__(self, arg: drjit.llvm.ad.Int64, /) drjit.llvm.ad.PCG32¶
In-place subtraction operator based on
__sub__().
- property inc¶
Sequence increment of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- property state¶
Sequence state of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- class drjit.llvm.ad.Philox4x32(*args, **kwargs)¶
Philox4x32 counter-based PRNG
This class implements the Philox 4x32 counter-based pseudo-random number generator based on the paper Parallel Random Numbers: As Easy as 1, 2, 3 by Salmon et al. [2011]. It uses strength-reduced cryptographic primitives to realize a complex transition function that turns a seed and a set of counter values into 4 pseudorandom outputs. Incrementing any of the counters or choosing a different seed produces statistically independent samples.
The implementation here uses a reduced number of bits (32) for the arithmetic and sets the default number of rounds to 7. However, even with these simplifications it passes the stringent TestU01
BigCrush tests (a battery of statistical tests for non-uniformity and correlations). Please see the paper Random number generators for massively parallel simulations on GPU by Manssen et al. [2012] for details.
Functions like
next_uint32x4() or next_float32x4() advance the PRNG state by incrementing the counter ctr[3].
Key properties include:
Counter-based design: generation from counter + key
192-bit state: 4x32-bit counters, 64-bit key
Trivial jump-ahead capability through counter manipulation
The
Philox4x32 class is implemented as a PyTree, making it compatible with symbolic function calls, loops, etc.
Note
Philox4x32 naturally produces 4 samples at a time, which may be awkward for applications that need individual random values.
Note
For a comparison of use cases between
Philox4x32 and PCG32, see the PCG32 class documentation. In brief: use PCG32 for sequential generation with lowest cost per sample; use Philox4x32 for parallel generation where independent streams are critical.
Note
Please watch out for the following pitfall when using the Philox4x32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through
next_float32x4()) changes the internal RNG counter value. If this state is never explicitly evaluated, the computation graph describing this change keeps growing, causing kernel compilation of increasingly large programs to eventually become a bottleneck. The drjit.rng API avoids this pitfall by eagerly evaluating the RNG counter when needed.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 PRNG with its lower per-sample cost. You can seed that generator once and reuse it throughout the loop.
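The counter-based design can be illustrated with a plain-Python sketch of the Philox 4x32 round function, following the formulation of Salmon et al. It is not the Dr.Jit code: the names philox4x32 and stream are hypothetical, and the way the 64-bit seed is split into two 32-bit key words is an assumption made for this example.

```python
MASK32 = 0xFFFFFFFF
M0, M1 = 0xD2511F53, 0xCD9E8D57    # round multipliers
W0, W1 = 0x9E3779B9, 0xBB67AE85    # per-round key increments


def philox4x32(counter, key, rounds=7):
    """Map a 4x32-bit counter and a 2x32-bit key to 4 pseudorandom 32-bit words."""
    c0, c1, c2, c3 = counter
    k0, k1 = key
    for r in range(rounds):
        if r > 0:                   # bump the key before every round but the first
            k0 = (k0 + W0) & MASK32
            k1 = (k1 + W1) & MASK32
        p0 = M0 * c0                # full 64-bit products
        p1 = M1 * c2
        c0, c1, c2, c3 = ((p1 >> 32) ^ c1 ^ k0, p1 & MASK32,
                          (p0 >> 32) ^ c3 ^ k1, p0 & MASK32)
    return (c0, c1, c2, c3)


def stream(seed, counter_0, n, rounds=7):
    """Yield n output quadruples by incrementing the last counter word."""
    key = (seed & MASK32, (seed >> 32) & MASK32)   # assumed seed-to-key split
    for i in range(n):
        yield philox4x32((counter_0, 0, 0, i), key, rounds)
```

No state is carried between calls: each output depends only on the counter and key, which is what makes jump-ahead trivial. For a fixed key the mapping is a bijection on the counter, so distinct counters never collide.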
- __init__(self, seed: drjit.llvm.ad.UInt64, counter_0: drjit.llvm.ad.UInt, counter_1: drjit.llvm.ad.UInt = 0, counter_2: drjit.llvm.ad.UInt = 0, iterations: int = 7) None¶
- __init__(self, arg: drjit.llvm.ad.Philox4x32) None
Overloaded function.
__init__(self, seed: drjit.llvm.ad.UInt64, counter_0: drjit.llvm.ad.UInt, counter_1: drjit.llvm.ad.UInt = 0, counter_2: drjit.llvm.ad.UInt = 0, iterations: int = 7) -> None
Initialize a Philox4x32 random number generator.
The function takes a
seed and three of four counter components. The last component is zero-initialized and incremented by calls to the next_* methods.
- Parameters:
seed – The 64-bit seed value used as the key for the mapping
counter_0 – The first 32-bit counter value (least significant)
counter_1 – The second 32-bit counter value (default: 0)
counter_2 – The third 32-bit counter value (default: 0)
iterations – Number of rounds to apply (default: 7, range: 4-10)
For parallel stream generation, simply use different counter values - each combination of counter values produces an independent random stream.
__init__(self, arg: drjit.llvm.ad.Philox4x32) -> None
Copy constructor
- next_uint32x4(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array4u¶
Generate 4 random 32-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 4 independent 32-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random 32-bit unsigned integers
- next_uint64x2(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array2u64¶
Generate 2 random 64-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 2 independent 64-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random 64-bit unsigned integers
- next_float16x4(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array4f16¶
Generate 4 random half-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to half precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float32x4(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array4f¶
Generate 4 random single-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to single precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float64x2(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array2f64¶
Generate 2 random double-precision floats in \([0, 1)\).
Generates 2 random 64-bit unsigned integers and converts them to floats uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats on the half-open interval \([0, 1)\)
- next_float16x4_normal(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array4f16¶
Generate 4 normally distributed half-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 half precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float32x4_normal(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array4f¶
Generate 4 normally distributed single-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 single precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float64x2_normal(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array2f64¶
Generate 2 normally distributed double-precision floats
Advances the internal counter and applies the Philox mapping to produce 2 double precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats from a standard normal distribution
- property seed¶
(self) -> drjit.llvm.ad.Array2u
- property counter¶
(self) -> drjit.llvm.ad.Array4u
- property iterations¶
(self) -> int
CUDA array namespace (drjit.cuda)¶
The CUDA backend is vectorized, hence types listed as scalar actually represent an array of scalars partaking in a parallel computation (analogously, 1D arrays are arrays of 1D arrays, etc.).
Scalars¶
- class drjit.cuda.Bool¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Float¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Float64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.UInt¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.UInt8¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.UInt64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Int¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Int8¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Int64¶
Derives from
drjit.ArrayBase.
1D arrays¶
- class drjit.cuda.Array0b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array1b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array2b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array3b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array4b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ArrayXb¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array0f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array1f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array2f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array3f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array4f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ArrayXf16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array0f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array1f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array2f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array3f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array4f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ArrayXf¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array0u¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array1u¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array2u¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array3u¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array4u¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ArrayXu¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array0i¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array1i¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array2i¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array3i¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array4i¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ArrayXi¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array0f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array1f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array2f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array3f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array4f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ArrayXf64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array0u64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array1u64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array2u64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array3u64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array4u64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ArrayXu64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array0i64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array1i64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array2i64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array3i64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array4i64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ArrayXi64¶
Derives from
drjit.ArrayBase.
2D arrays¶
- class drjit.cuda.Array22b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array33b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array44b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array22f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array33f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array44f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array22f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array33f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array44f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array22f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array33f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Array44f64¶
Derives from
drjit.ArrayBase.
Special (complex numbers, etc.)¶
- class drjit.cuda.Complex2f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Complex2f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Quaternion4f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Quaternion4f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Quaternion4f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Matrix2f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Matrix3f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Matrix4f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Matrix2f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Matrix3f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Matrix4f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Matrix2f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Matrix3f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.Matrix4f64¶
Derives from
drjit.ArrayBase.
Tensors¶
- class drjit.cuda.TensorXb¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.TensorXf16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.TensorXf¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.TensorXu¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.TensorXi¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.TensorXf64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.TensorXu64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.TensorXi64¶
Derives from
drjit.ArrayBase.
Random number generators¶
- class drjit.cuda.PCG32(*args, **kwargs)¶
Implementation of PCG32, a member of the PCG family of random number generators proposed by Melissa O’Neill.
PCG32 is a stateful pseudorandom number generator that combines a linear congruential generator (LCG) with a permutation function. It provides high statistical quality with a remarkably fast and compact implementation. Details on the PCG family of pseudorandom number generators can be found here.
To create random tensors of different sizes in Python, prefer the higher-level
dr.rng() interface, which internally uses the Philox4x32 generator. The properties of PCG32 make it most suitable for Monte Carlo applications requiring long sequences of random variates.
Key properties of the PCG variant implemented here include:
Compact: 128 bits total state (64-bit state + 64-bit increment)
Output: 32-bit output with a period of 2^64 per stream
Streams: Multiple independent streams via the increment parameter (with caveats, see below)
Low-cost sample generation: a single 64 bit integer multiply-add plus a bit permutation applied to the output.
Extra features: provides fast multi-step advance/rewind functionality.
Caveats: PCG32 produces high-quality random variates within each random number stream. For a given initial state, PCG32 can also produce multiple output streams by specifying a different sequence increment (
initseq) to the constructor. However, the level of statistical independence across streams is generally insufficient when doing so. To obtain a series of high-quality independent parallel streams, it is recommended to use another method (e.g., the Tiny Encryption Algorithm) to seed the state and inc parameters. This ensures independence both within and across streams.
In Python, the
PCG32 class is implemented as a PyTree, which means that it is compatible with symbolic function calls, loops, etc.
Note
Please watch out for the following pitfall when using the PCG32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through
next_float()) changes the internal RNG state. If this state is never explicitly evaluated, the computation graph describing the state transformation keeps growing without bound, causing kernel compilation of increasingly large programs to eventually become a bottleneck. To evaluate the RNG, simply run
rng: PCG32 = ....
dr.eval(rng)
For computation involving very large arrays, storing the RNG state (16 bytes per entry) can be prohibitive. In this case, it is better to keep the RNG in symbolic form and re-seed it at every optimization iteration.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 API directly to seed once and reuse the random number generator throughout the loop.
The
drjit.rng API avoids these pitfalls by eagerly evaluating the RNG state.
Comparison with Philox4x32:
PCG32: State-based, better for sequential generation, low per-sample cost.
Philox4x32: Counter-based, better for parallel generation, higher per-sample cost.
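The caveats above recommend seeding the state and inc parameters through a separate mixing function such as the Tiny Encryption Algorithm (TEA). The plain-Python sketch below shows one way to do this per lane. The function names, the round count and the key constants (those commonly used in TEA-based GPU sample generators) are assumptions for illustration and are not taken from this reference.

```python
MASK32 = 0xFFFFFFFF


def tea_32(v0, v1, rounds=4):
    """Mix two 32-bit words with a few rounds of the Tiny Encryption Algorithm."""
    total = 0
    for _ in range(rounds):
        total = (total + 0x9E3779B9) & MASK32
        v0 = (v0 + (((v1 << 4) + 0xA341316C) ^ (v1 + total)
                    ^ ((v1 >> 5) + 0xC8013EA4))) & MASK32
        v1 = (v1 + (((v0 << 4) + 0xAD90777D) ^ (v0 + total)
                    ^ ((v0 >> 5) + 0x7E95761E))) & MASK32
    return v0, v1


def tea_64(v0, v1, rounds=4):
    """Pack the two mixed words into one 64-bit value."""
    lo, hi = tea_32(v0, v1, rounds)
    return lo | (hi << 32)


def lane_seeds(size, seed=0):
    """Hypothetical helper: (initstate, initseq) pairs for `size` parallel lanes."""
    return [(tea_64(i, 2 * seed), tea_64(i, 2 * seed + 1)) for i in range(size)]
```

Each pair returned by lane_seeds could then be passed as initstate and initseq when seeding the corresponding lane, instead of relying on consecutive integer offsets.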
- __init__(self, size: int = 1, initstate: drjit.cuda.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.cuda.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
- __init__(self, arg: drjit.cuda.PCG32) None
Overloaded function.
__init__(self, size: int = 1, initstate: drjit.cuda.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.cuda.UInt64 = UInt64(0xda3e39cb94b95bdb)) -> None
Initialize a random number generator that generates
size variates in parallel.
The
initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are based on the original implementation.
The implementation of this routine internally calls seed(), with one small twist. When multiple random numbers are being generated in parallel, the constructor adds an offset equal to
drjit.arange(UInt64, size) to both initstate and initseq to de-correlate the generated sequences.
__init__(self, arg: drjit.cuda.PCG32) -> None
Copy-construct a new PCG32 instance from an existing instance.
- seed(self, initstate: drjit.cuda.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.cuda.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
Seed the random number generator with the given initial state and sequence ID.
The
initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are those of the original implementation.
- next_float(self, dtype: type, mask: object = True) object¶
Generate a uniformly distributed floating point number of the requested precision on the interval \([0, 1)\).
The function analyzes the provided target
dtype and either invokes next_float16(), next_float32() or next_float64() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16(self) drjit.cuda.Float16¶
- next_float16(self, arg: drjit.cuda.Bool, /) drjit.cuda.Float16
Overloaded function.
next_float16(self) -> drjit.cuda.Float16
Generate a uniformly distributed half precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float16(self, arg: drjit.cuda.Bool, /) -> drjit.cuda.Float16
- next_float32(self) drjit.cuda.Float¶
- next_float32(self, arg: drjit.cuda.Bool, /) drjit.cuda.Float
Overloaded function.
next_float32(self) -> drjit.cuda.Float
Generate a uniformly distributed single precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float32(self, arg: drjit.cuda.Bool, /) -> drjit.cuda.Float
- next_float64(self) drjit.cuda.Float64¶
- next_float64(self, arg: drjit.cuda.Bool, /) drjit.cuda.Float64
Overloaded function.
next_float64(self) -> drjit.cuda.Float64
Generate a uniformly distributed double precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float64(self, arg: drjit.cuda.Bool, /) -> drjit.cuda.Float64
- next_float_normal(self, dtype: type, mask: object = True) object¶
Generate a (standard) normally distributed floating point number of the requested precision.
The function analyzes the provided target
dtype and either invokes next_float16_normal(), next_float32_normal() or next_float64_normal() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
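This reference does not say how normally distributed variates are derived from the uniform generator. One standard option is inverse transform sampling, sketched below with Python's standard library. That Dr.Jit uses this particular mapping is an assumption, and uniform_to_normal is a hypothetical name.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist(0.0, 1.0)


def uniform_to_normal(u):
    """Turn a uniform variate u in [0, 1) into a standard normal variate.

    The input is clamped away from 0 and 1 because the inverse CDF is
    unbounded at both ends of the interval.
    """
    tiny = 2.0 ** -53
    u = min(max(u, tiny), 1.0 - tiny)
    return _STD_NORMAL.inv_cdf(u)
```

One uniform variate is consumed per normal variate with this mapping, so the PRNG state advances by the same amount as for a uniform sample of the same width.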
- next_float16_normal(self) drjit.cuda.Float16¶
- next_float16_normal(self, arg: drjit.cuda.Bool, /) drjit.cuda.Float16
Overloaded function.
next_float16_normal(self) -> drjit.cuda.Float16
Generate a (standard) normally distributed half precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float16_normal(self, arg: drjit.cuda.Bool, /) -> drjit.cuda.Float16
- next_float32_normal(self) drjit.cuda.Float¶
- next_float32_normal(self, arg: drjit.cuda.Bool, /) drjit.cuda.Float
Overloaded function.
next_float32_normal(self) -> drjit.cuda.Float
Generate a (standard) normally distributed single precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float32_normal(self, arg: drjit.cuda.Bool, /) -> drjit.cuda.Float
- next_float64_normal(self) drjit.cuda.Float64¶
- next_float64_normal(self, arg: drjit.cuda.Bool, /) drjit.cuda.Float64
Overloaded function.
next_float64_normal(self) -> drjit.cuda.Float64
Generate a (standard) normally distributed double precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_float64_normal(self, arg: drjit.cuda.Bool, /) -> drjit.cuda.Float64
- next_uint32(self) drjit.cuda.UInt¶
- next_uint32(self, arg: drjit.cuda.Bool, /) drjit.cuda.UInt
Overloaded function.
next_uint32(self) -> drjit.cuda.UInt
Generate a uniformly distributed unsigned 32-bit random number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_uint32(self, arg: drjit.cuda.Bool, /) -> drjit.cuda.UInt
- next_uint64(self) drjit.cuda.UInt64¶
- next_uint64(self, arg: drjit.cuda.Bool, /) drjit.cuda.UInt64
Overloaded function.
next_uint64(self) -> drjit.cuda.UInt64
Generate a uniformly distributed unsigned 64-bit random number.
Internally, the function calls
next_uint32() twice.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries
i where mask[i] == False.
next_uint64(self, arg: drjit.cuda.Bool, /) -> drjit.cuda.UInt64
- next_uint32_bounded(self, bound: int, mask: drjit.cuda.Bool = Bool(True)) drjit.cuda.UInt¶
Generate a uniformly distributed 32-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
- next_uint64_bounded(self, bound: int, mask: drjit.cuda.Bool = Bool(True)) drjit.cuda.UInt64¶
Generate a uniformly distributed 64-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
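One well-known iterative scheme of this kind is the threshold rejection loop used by the reference PCG code. The sketch below illustrates it under assumptions: it is not taken from Dr.Jit, the helper names are hypothetical, and Python's `random` module stands in for the PCG32 word source. Raw words below `2**bits % bound` are rejected so that the accepted range is an exact multiple of `bound`. Fewer than half of all words are ever rejected, which matches the "1-2 iterations" remark.

```python
import random
from collections import Counter


def rejection_threshold(bound, bits=32):
    """Count of low raw values to reject so the accepted range divides evenly."""
    return (1 << bits) % bound


def next_bounded(next_word, bound, bits=32):
    """Draw raw words until one falls into the unbiased range, then reduce it."""
    threshold = rejection_threshold(bound, bits)
    while True:
        r = next_word()
        if r >= threshold:
            return r % bound


# Stand-in word source (assumption): Mersenne Twister instead of PCG32.
source = random.Random(1234)
samples = [next_bounded(lambda: source.getrandbits(32), 6) for _ in range(1000)]

# Exhaustive check on a toy 8-bit word size: every residue occurs equally often.
threshold8 = rejection_threshold(6, bits=8)
histogram = Counter(r % 6 for r in range(256) if r >= threshold8)
```

In the 8-bit toy case, 4 of the 256 raw values are rejected and each of the 6 results is produced by exactly 42 raw values, which is what makes the output unbiased.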
- __add__(self, arg: drjit.cuda.Int64, /) drjit.cuda.PCG32¶
Advance the pseudorandom number generator.
This function implements a multi-step advance function that is equivalent to (but more efficient than) calling the random number generator arg times in sequence.
This is useful to advance a newly constructed PRNG to a certain known state.
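The multi-step advance can be sketched in plain Python. Assuming the generator's state follows the reference PCG32 recurrence `state = state * MULT + inc (mod 2**64)`, a jump by `delta` steps composes the affine step map with itself by repeated squaring, so it needs at most 64 loop iterations. The function names are hypothetical and the code is an illustration of the technique rather than Dr.Jit's implementation.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 0x5851F42D4C957F2D


def lcg_step(state, inc):
    """One step of the underlying linear congruential generator."""
    return (state * PCG32_MULT + inc) & MASK64


def lcg_advance(state, inc, delta):
    """Jump `delta` steps ahead (negative values rewind) in O(log delta) work."""
    delta &= MASK64  # the period is 2**64, so rewinding k equals advancing 2**64 - k
    cur_mult, cur_plus = PCG32_MULT, inc  # map for 2**k steps
    acc_mult, acc_plus = 1, 0             # accumulated map, starts as identity
    while delta:
        if delta & 1:
            acc_mult = (acc_mult * cur_mult) & MASK64
            acc_plus = (acc_plus * cur_mult + cur_plus) & MASK64
        cur_plus = ((cur_mult + 1) * cur_plus) & MASK64
        cur_mult = (cur_mult * cur_mult) & MASK64
        delta >>= 1
    return (acc_mult * state + acc_plus) & MASK64


state0 = 0x853C49E6748FEA9B
inc = (0xDA3E39CB94B95BDB << 1 | 1) & MASK64  # the increment must be odd

stepped = state0
for _ in range(1000):
    stepped = lcg_step(stepped, inc)

jumped = lcg_advance(state0, inc, 1000)
rewound = lcg_advance(jumped, inc, -1000)
```

The jumped state matches the state reached by 1000 single steps, and rewinding by the same amount restores the starting state.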
- __iadd__(self, arg: drjit.cuda.Int64, /) drjit.cuda.PCG32¶
In-place addition operator based on
__add__().
- __sub__(self, arg: drjit.cuda.Int64, /) drjit.cuda.PCG32¶
- __sub__(self, arg: drjit.cuda.PCG32, /) drjit.cuda.Int64
Overloaded function.
__sub__(self, arg: drjit.cuda.Int64, /) -> drjit.cuda.PCG32
Rewind the pseudorandom number generator.
This function implements the opposite of __add__ to step a PRNG backwards. It can also compute the difference (as counted by the number of internal next_uint32 steps) between two PCG32 instances. This assumes that the two instances were consistently seeded.
__sub__(self, arg: drjit.cuda.PCG32, /) -> drjit.cuda.Int64
- __isub__(self, arg: drjit.cuda.Int64, /) drjit.cuda.PCG32¶
In-place subtraction operator based on
__sub__().
- property inc¶
Sequence increment of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- property state¶
Sequence state of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- class drjit.cuda.Philox4x32(*args, **kwargs)¶
Philox4x32 counter-based PRNG
This class implements the Philox 4x32 counter-based pseudo-random number generator based on the paper Parallel Random Numbers: As Easy as 1, 2, 3 by Salmon et al. [2011]. It uses strength-reduced cryptographic primitives to realize a complex transition function that maps a seed and a set of counter values to 4 pseudorandom outputs. Incrementing any of the counters or choosing a different seed produces statistically independent samples.
The implementation here uses a reduced number of bits (32) for the arithmetic and sets the default number of rounds to 7. However, even with these simplifications it passes the stringent TestU01 BigCrush tests (a battery of statistical tests for non-uniformity and correlations). Please see the paper Random number generators for massively parallel simulations on GPU by Manssen et al. [2012] for details.
Functions like next_uint32x4() or next_float32x4() advance the PRNG state by incrementing the counter ctr[3].
Key properties include:
Counter-based design: generation from counter + key
192-bit state: 4x32-bit counters, 64-bit key
Trivial jump-ahead capability through counter manipulation
The Philox4x32 class is implemented as a PyTree, making it compatible with symbolic function calls, loops, etc.
Note
Philox4x32 naturally produces 4 samples at a time, which may be awkward for applications that need individual random values.
Note
For a comparison of use cases between Philox4x32 and PCG32, see the PCG32 class documentation. In brief: use PCG32 for sequential generation with lowest cost per sample; use Philox4x32 for parallel generation where independent streams are critical.
Note
Please watch out for the following pitfall when using the Philox4x32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float32x4()) changes the internal RNG counter value. If this state is never explicitly evaluated, the computation graph describing this change keeps growing, causing kernel compilation of increasingly large programs to eventually become a bottleneck. The drjit.rng API avoids this pitfall by eagerly evaluating the RNG counter when needed.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 PRNG with its lower per-sample cost. You can seed this generator once and reuse it throughout the loop.
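The counter-based design can be illustrated with a plain-Python sketch of the Philox 4x32 round function as published by Salmon et al. Several details are assumptions made for illustration and may differ from Dr.Jit: the `CounterRNG` wrapper is hypothetical, the 64-bit seed is split into two 32-bit key words in low/high order, and the counter is incremented after each draw. The output is therefore not claimed to be bit-identical to `Philox4x32`.

```python
M0, M1 = 0xD2511F53, 0xCD9E8D57  # round multipliers from Salmon et al.
W0, W1 = 0x9E3779B9, 0xBB67AE85  # per-round key increments
MASK32 = 0xFFFFFFFF


def philox4x32(counter, key, rounds=7):
    """Map a 4x32-bit counter and a 2x32-bit key to 4 pseudorandom 32-bit words."""
    c0, c1, c2, c3 = counter
    k0, k1 = key
    for r in range(rounds):
        if r > 0:  # the key is bumped between rounds
            k0 = (k0 + W0) & MASK32
            k1 = (k1 + W1) & MASK32
        p0 = M0 * c0  # 64-bit products, split into high and low words
        p1 = M1 * c2
        hi0, lo0 = p0 >> 32, p0 & MASK32
        hi1, lo1 = p1 >> 32, p1 & MASK32
        c0, c1, c2, c3 = hi1 ^ c1 ^ k0, lo1, hi0 ^ c3 ^ k1, lo0
    return (c0, c1, c2, c3)


class CounterRNG:
    """Hypothetical wrapper: the state is just (key, counter)."""

    def __init__(self, seed, counter_0, counter_1=0, counter_2=0, rounds=7):
        self.key = (seed & MASK32, (seed >> 32) & MASK32)  # assumed key layout
        self.counter = [counter_0, counter_1, counter_2, 0]
        self.rounds = rounds

    def next_uint32x4(self):
        out = philox4x32(self.counter, self.key, self.rounds)
        self.counter[3] = (self.counter[3] + 1) & MASK32  # advance the stream
        return out


rng = CounterRNG(seed=42, counter_0=0)
first = rng.next_uint32x4()
second = rng.next_uint32x4()

# Jump-ahead: setting the last counter directly reproduces the second draw
# without generating the first one.
jumped = CounterRNG(seed=42, counter_0=0)
jumped.counter[3] = 1
```

Because the output depends only on the key and the counter, skipping ahead costs nothing beyond writing a new counter value, which is the "trivial jump-ahead" property listed above.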
- __init__(self, seed: drjit.cuda.UInt64, counter_0: drjit.cuda.UInt, counter_1: drjit.cuda.UInt = 0, counter_2: drjit.cuda.UInt = 0, iterations: int = 7) None¶
- __init__(self, arg: drjit.cuda.Philox4x32) None
Overloaded function.
__init__(self, seed: drjit.cuda.UInt64, counter_0: drjit.cuda.UInt, counter_1: drjit.cuda.UInt = 0, counter_2: drjit.cuda.UInt = 0, iterations: int = 7) -> None
Initialize a Philox4x32 random number generator.
The function takes a seed and three of the four counter components. The last component is zero-initialized and incremented by calls to the next_* methods.
- Parameters:
seed – The 64-bit seed value used as the key for the mapping
counter_0 – The first 32-bit counter value (least significant)
counter_1 – The second 32-bit counter value (default: 0)
counter_2 – The third 32-bit counter value (default: 0)
iterations – Number of rounds to apply (default: 7, range: 4-10)
For parallel stream generation, simply use different counter values - each combination of counter values produces an independent random stream.
__init__(self, arg: drjit.cuda.Philox4x32) -> None
Copy constructor
- next_uint32x4(self, mask: drjit.cuda.Bool = True) drjit.cuda.Array4u¶
Generate 4 random 32-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 4 independent 32-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random 32-bit unsigned integers
- next_uint64x2(self, mask: drjit.cuda.Bool = True) drjit.cuda.Array2u64¶
Generate 2 random 64-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 2 independent 64-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random 64-bit unsigned integers
- next_float16x4(self, mask: drjit.cuda.Bool = True) drjit.cuda.Array4f16¶
Generate 4 random half-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to half precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float32x4(self, mask: drjit.cuda.Bool = True) drjit.cuda.Array4f¶
Generate 4 random single-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to single precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float64x2(self, mask: drjit.cuda.Bool = True) drjit.cuda.Array2f64¶
Generate 2 random double-precision floats in \([0, 1)\).
Generates 2 random 64-bit unsigned integers and converts them to floats uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats on the half-open interval \([0, 1)\)
- next_float16x4_normal(self, mask: drjit.cuda.Bool = True) drjit.cuda.Array4f16¶
Generate 4 normally distributed half-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 half precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float32x4_normal(self, mask: drjit.cuda.Bool = True) drjit.cuda.Array4f¶
Generate 4 normally distributed single-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 single precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float64x2_normal(self, mask: drjit.cuda.Bool = True) drjit.cuda.Array2f64¶
Generate 2 normally distributed double-precision floats
Advances the internal counter and applies the Philox mapping to produce 2 double precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats from a standard normal distribution
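The documentation above does not state how uniform outputs are turned into normal variates. One standard option that pairs naturally with a generator producing 2 or 4 values per call is the Box-Muller transform, sketched here as an assumption rather than as Dr.Jit's method. The helper names are hypothetical and Python's `random` module stands in for the Philox uniform source.

```python
import math
import random


def box_muller(u1, u2):
    """Turn two uniform variates on [0, 1) into two standard normal variates."""
    # 1 - u1 lies in (0, 1], so the logarithm never receives zero.
    radius = math.sqrt(-2.0 * math.log(1.0 - u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def normal_x4(uniform_x4):
    """Map 4 uniform variates to 4 standard normal variates, two pairs at a time."""
    z0, z1 = box_muller(uniform_x4[0], uniform_x4[1])
    z2, z3 = box_muller(uniform_x4[2], uniform_x4[3])
    return (z0, z1, z2, z3)


# Stand-in uniform source (assumption): Python's random module, not Philox.
source = random.Random(7)
draws = [normal_x4([source.random() for _ in range(4)]) for _ in range(5000)]
flat = [z for quad in draws for z in quad]
mean = sum(flat) / len(flat)
variance = sum((z - mean) ** 2 for z in flat) / len(flat)
```

With 20000 samples, the empirical mean and variance should land close to 0 and 1 respectively.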
- property seed¶
(self) -> drjit.cuda.Array2u
- property counter¶
(self) -> drjit.cuda.Array4u
- property iterations¶
(self) -> int
CUDA array namespace with automatic differentiation (drjit.cuda.ad)¶
The CUDA AD backend is vectorized, hence types listed as scalar actually represent an array of scalars partaking in a parallel computation (analogously, 1D arrays are arrays of 1D arrays, etc.).
Scalars¶
- class drjit.cuda.ad.Bool¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Float¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Float64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.UInt¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.UInt8¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.UInt64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Int¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Int8¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Int64¶
Derives from
drjit.ArrayBase.
1D arrays¶
- class drjit.cuda.ad.Array0b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array1b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array2b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array3b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array4b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.ArrayXb¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array0f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array1f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array2f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array3f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array4f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.ArrayXf16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array0f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array1f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array2f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array3f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array4f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.ArrayXf¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array0u¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array1u¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array2u¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array3u¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array4u¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.ArrayXu¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array0i¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array1i¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array2i¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array3i¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array4i¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.ArrayXi¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array0f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array1f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array2f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array3f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array4f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.ArrayXf64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array0u64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array1u64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array2u64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array3u64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array4u64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.ArrayXu64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array0i64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array1i64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array2i64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array3i64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array4i64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.ArrayXi64¶
Derives from
drjit.ArrayBase.
2D arrays¶
- class drjit.cuda.ad.Array22b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array33b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array44b¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array22f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array33f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array44f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array22f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array33f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array44f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array22f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array33f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Array44f64¶
Derives from
drjit.ArrayBase.
Special (complex numbers, etc.)¶
- class drjit.cuda.ad.Complex2f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Complex2f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Quaternion4f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Quaternion4f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Quaternion4f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Matrix2f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Matrix3f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Matrix4f16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Matrix2f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Matrix3f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Matrix4f¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Matrix2f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Matrix3f64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.Matrix4f64¶
Derives from
drjit.ArrayBase.
Tensors¶
- class drjit.cuda.ad.TensorXb¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.TensorXf16¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.TensorXf¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.TensorXu¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.TensorXi¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.TensorXf64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.TensorXu64¶
Derives from
drjit.ArrayBase.
- class drjit.cuda.ad.TensorXi64¶
Derives from
drjit.ArrayBase.
Random number generators¶
- class drjit.cuda.ad.PCG32(*args, **kwargs)¶
Implementation of PCG32, a member of the PCG family of random number generators proposed by Melissa O’Neill.
PCG32 is a stateful pseudorandom number generator that combines a linear congruential generator (LCG) with a permutation function. It provides high statistical quality with a remarkably fast and compact implementation. Details on the PCG family of pseudorandom number generators can be found here.
To create random tensors of different sizes in Python, prefer the higher-level dr.rng() interface, which internally uses the Philox4x32 generator. The properties of PCG32 make it most suitable for Monte Carlo applications requiring long sequences of random variates.
Key properties of the PCG variant implemented here include:
Compact: 128 bits total state (64-bit state + 64-bit increment)
Output: 32-bit output with a period of 2^64 per stream
Streams: Multiple independent streams via the increment parameter (with caveats, see below)
Low-cost sample generation: a single 64 bit integer multiply-add plus a bit permutation applied to the output.
Extra features: provides fast multi-step advance/rewind functionality.
Caveats: PCG32 produces high-quality random variates within each random number stream. For a given initial state, PCG32 can also produce multiple output streams by specifying a different sequence increment (initseq) to the constructor. However, the level of statistical independence across streams is generally insufficient when doing so. To obtain a series of high-quality independent parallel streams, it is recommended to use another method (e.g., the Tiny Encryption Algorithm) to seed the state and inc parameters. This ensures independence both within and across streams.
In Python, the PCG32 class is implemented as a PyTree, which means that it is compatible with symbolic function calls, loops, etc.
Note
Please watch out for the following pitfall when using the PCG32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float()) changes the internal RNG state. If this state is never explicitly evaluated, the computation graph describing the state transformation keeps growing without bound, causing kernel compilation of increasingly large programs to eventually become a bottleneck. To evaluate the RNG, simply run
rng: PCG32 = ....
dr.eval(rng)
For computation involving very large arrays, storing the RNG state (16 bytes per entry) can be prohibitive. In this case, it is better to keep the RNG in symbolic form and re-seed it at every optimization iteration.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 API directly to seed once and reuse the random number generator throughout the loop.
The drjit.rng API avoids these pitfalls by eagerly evaluating the RNG state.
Comparison with Philox4x32:
PCG32: State-based, better for sequential generation, low per-sample cost.
Philox4x32: Counter-based, better for parallel generation, higher per-sample cost.
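The seeding advice in the caveat above can be sketched as follows. The mixing function is a few rounds of the Tiny Encryption Algorithm with the key constants commonly used for GPU random number seeding. Everything else is an assumption for illustration: `tea_mix` and `seed_pair` are hypothetical helpers, and the way two 32-bit words are packed into a 64-bit `initstate` or `initseq` is one arbitrary choice.

```python
MASK32 = 0xFFFFFFFF


def tea_mix(v0, v1, rounds=4):
    """Scramble two 32-bit words with a few rounds of the Tiny Encryption Algorithm."""
    total = 0
    for _ in range(rounds):
        total = (total + 0x9E3779B9) & MASK32
        v0 = (v0 + (((v1 << 4) + 0xA341316C) ^ (v1 + total) ^ ((v1 >> 5) + 0xC8013EA4))) & MASK32
        v1 = (v1 + (((v0 << 4) + 0xAD90777D) ^ (v0 + total) ^ ((v0 >> 5) + 0x7E95761E))) & MASK32
    return v0, v1


def seed_pair(stream_index, base_seed=0):
    """Derive an (initstate, initseq) pair for one stream (hypothetical helper)."""
    s0, s1 = tea_mix(stream_index & MASK32, base_seed & MASK32)
    q0, q1 = tea_mix(stream_index & MASK32, (base_seed + 1) & MASK32)
    return s0 | (s1 << 32), q0 | (q1 << 32)


pairs = [seed_pair(i) for i in range(256)]
```

Each half-round adds a function of one word to the other, so the mapping on word pairs is invertible and distinct stream indices always produce distinct 64-bit seeds. The resulting values would then be passed as `initstate` and `initseq` instead of relying on consecutive increments.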
- __init__(self, size: int = 1, initstate: drjit.cuda.ad.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.cuda.ad.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
- __init__(self, arg: drjit.cuda.ad.PCG32) None
Overloaded function.
__init__(self, size: int = 1, initstate: drjit.cuda.ad.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.cuda.ad.UInt64 = UInt64(0xda3e39cb94b95bdb)) -> None
Initialize a random number generator that generates size variates in parallel.
The initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are based on the original implementation.
The implementation of this routine internally calls seed(), with one small twist. When multiple random numbers are being generated in parallel, the constructor adds an offset equal to drjit.arange(UInt64, size) to both initstate and initseq to de-correlate the generated sequences.
__init__(self, arg: drjit.cuda.ad.PCG32) -> None
Copy-construct a new PCG32 instance from an existing instance.
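The per-lane offset described for the first overload can be modeled in plain Python. The sketch assumes that `seed()` follows the seeding procedure of the reference PCG32 code (set the odd increment, step once, add `initstate`, step again). `make_lanes` is a hypothetical helper that stands in for the vectorized constructor and does not use Dr.Jit.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 0x5851F42D4C957F2D
DEFAULT_STATE = 0x853C49E6748FEA9B
DEFAULT_SEQ = 0xDA3E39CB94B95BDB


def lcg_step(state, inc):
    """One step of the underlying linear congruential generator."""
    return (state * PCG32_MULT + inc) & MASK64


def seed(initstate, initseq):
    """Seeding procedure of the reference PCG32 code; returns (state, inc)."""
    inc = ((initseq << 1) | 1) & MASK64  # force an odd increment
    state = lcg_step(0, inc)
    state = (state + initstate) & MASK64
    state = lcg_step(state, inc)
    return state, inc


def make_lanes(size, initstate=DEFAULT_STATE, initseq=DEFAULT_SEQ):
    """One (state, inc) pair per lane, offset by the lane index."""
    return [seed((initstate + i) & MASK64, (initseq + i) & MASK64)
            for i in range(size)]


lanes = make_lanes(4)
```

Lane 0 is seeded exactly like a scalar generator with the default arguments, while the other lanes start from shifted inputs and therefore follow different sequences.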
- seed(self, initstate: drjit.cuda.ad.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.cuda.ad.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
Seed the random number generator with the given initial state and sequence ID.
The initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are taken from the original implementation.
- next_float(self, dtype: type, mask: object = True) object¶
Generate a uniformly distributed floating point number of the requested precision on the interval \([0, 1)\).
The function analyzes the provided target dtype and either invokes next_float16(), next_float32() or next_float64() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16(self) drjit.cuda.ad.Float16¶
- next_float16(self, arg: drjit.cuda.ad.Bool, /) drjit.cuda.ad.Float16
Overloaded function.
next_float16(self) -> drjit.cuda.ad.Float16
Generate a uniformly distributed half precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float16(self, arg: drjit.cuda.ad.Bool, /) -> drjit.cuda.ad.Float16
- next_float32(self) drjit.cuda.ad.Float¶
- next_float32(self, arg: drjit.cuda.ad.Bool, /) drjit.cuda.ad.Float
Overloaded function.
next_float32(self) -> drjit.cuda.ad.Float
Generate a uniformly distributed single precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float32(self, arg: drjit.cuda.ad.Bool, /) -> drjit.cuda.ad.Float
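A common way to obtain a float on \([0, 1)\) from random integer bits is to place the bits in the mantissa of a value in \([1, 2)\) and subtract 1. The sketch below shows this bit-level technique for single and double precision using Python's `struct` module. It is an assumption that Dr.Jit's conversion works this way, and the function names are hypothetical.

```python
import struct


def uint32_to_float32(bits32):
    """Map a 32-bit integer to a single precision value on [0, 1)."""
    # Keep the top 23 bits as mantissa and attach the exponent of 1.0,
    # which yields a float in [1, 2); subtracting 1 shifts it to [0, 1).
    pattern = (bits32 >> 9) | 0x3F800000
    return struct.unpack("<f", struct.pack("<I", pattern))[0] - 1.0


def uint64_to_float64(bits64):
    """Same construction with a 52-bit mantissa for double precision."""
    pattern = (bits64 >> 12) | 0x3FF0000000000000
    return struct.unpack("<d", struct.pack("<Q", pattern))[0] - 1.0


lowest = uint32_to_float32(0x00000000)
half = uint32_to_float32(0x80000000)
highest = uint32_to_float32(0xFFFFFFFF)
```

The largest possible result is one mantissa step below 1, so the upper bound of the interval is never reached.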
- next_float64(self) drjit.cuda.ad.Float64¶
- next_float64(self, arg: drjit.cuda.ad.Bool, /) drjit.cuda.ad.Float64
Overloaded function.
next_float64(self) -> drjit.cuda.ad.Float64
Generate a uniformly distributed double precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float64(self, arg: drjit.cuda.ad.Bool, /) -> drjit.cuda.ad.Float64
- next_float_normal(self, dtype: type, mask: object = True) object¶
Generate a (standard) normally distributed floating point number of the requested precision.
The function analyzes the provided target dtype and either invokes next_float16_normal(), next_float32_normal() or next_float64_normal() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16_normal(self) drjit.cuda.ad.Float16¶
- next_float16_normal(self, arg: drjit.cuda.ad.Bool, /) drjit.cuda.ad.Float16
Overloaded function.
next_float16_normal(self) -> drjit.cuda.ad.Float16
Generate a (standard) normally distributed half precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float16_normal(self, arg: drjit.cuda.ad.Bool, /) -> drjit.cuda.ad.Float16
- next_float32_normal(self) drjit.cuda.ad.Float¶
- next_float32_normal(self, arg: drjit.cuda.ad.Bool, /) drjit.cuda.ad.Float
Overloaded function.
next_float32_normal(self) -> drjit.cuda.ad.Float
Generate a (standard) normally distributed single precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float32_normal(self, arg: drjit.cuda.ad.Bool, /) -> drjit.cuda.ad.Float
- next_float64_normal(self) drjit.cuda.ad.Float64¶
- next_float64_normal(self, arg: drjit.cuda.ad.Bool, /) drjit.cuda.ad.Float64
Overloaded function.
next_float64_normal(self) -> drjit.cuda.ad.Float64
Generate a (standard) normally distributed double precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float64_normal(self, arg: drjit.cuda.ad.Bool, /) -> drjit.cuda.ad.Float64
- next_uint32(self) drjit.cuda.ad.UInt¶
- next_uint32(self, arg: drjit.cuda.ad.Bool, /) drjit.cuda.ad.UInt
Overloaded function.
next_uint32(self) -> drjit.cuda.ad.UInt
Generate a uniformly distributed unsigned 32-bit random number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_uint32(self, arg: drjit.cuda.ad.Bool, /) -> drjit.cuda.ad.UInt
- next_uint64(self) drjit.cuda.ad.UInt64¶
- next_uint64(self, arg: drjit.cuda.ad.Bool, /) drjit.cuda.ad.UInt64
Overloaded function.
next_uint64(self) -> drjit.cuda.ad.UInt64
Generate a uniformly distributed unsigned 64-bit random number.
Internally, the function calls next_uint32() twice.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_uint64(self, arg: drjit.cuda.ad.Bool, /) -> drjit.cuda.ad.UInt64
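Building a 64-bit value from two 32-bit draws amounts to a shift and a bitwise OR. In the sketch below, the word order (first draw in the lower half) is an assumption that the text above does not specify, and `next_uint64_from` is a hypothetical helper that accepts any 32-bit word source.

```python
def next_uint64_from(next_uint32):
    """Assemble one 64-bit value from two consecutive 32-bit draws."""
    low = next_uint32()
    high = next_uint32()  # assumption: the second draw fills the upper half
    return low | (high << 32)


# Scripted word source so the result is easy to follow by hand.
words = iter([0x11111111, 0x22222222, 0x33333333])
value = next_uint64_from(lambda: next(words))
remaining = list(words)
```

One practical consequence is that every 64-bit draw consumes two steps of the underlying 32-bit sequence, which matters when counting steps for advance or rewind operations.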
- next_uint32_bounded(self, bound: int, mask: drjit.cuda.ad.Bool = Bool(True)) drjit.cuda.ad.UInt¶
Generate a uniformly distributed 32-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
- next_uint64_bounded(self, bound: int, mask: drjit.cuda.ad.Bool = Bool(True)) drjit.cuda.ad.UInt64¶
Generate a uniformly distributed 64-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
- __add__(self, arg: drjit.cuda.ad.Int64, /) drjit.cuda.ad.PCG32¶
Advance the pseudorandom number generator.
This function implements a multi-step advance function that is equivalent to (but more efficient than) calling the random number generator arg times in sequence.
This is useful to advance a newly constructed PRNG to a certain known state.
- __iadd__(self, arg: drjit.cuda.ad.Int64, /) drjit.cuda.ad.PCG32¶
In-place addition operator based on
__add__().
- __sub__(self, arg: drjit.cuda.ad.Int64, /) drjit.cuda.ad.PCG32¶
- __sub__(self, arg: drjit.cuda.ad.PCG32, /) drjit.cuda.ad.Int64
Overloaded function.
__sub__(self, arg: drjit.cuda.ad.Int64, /) -> drjit.cuda.ad.PCG32
Rewind the pseudorandom number generator.
This function implements the opposite of __add__ to step a PRNG backwards. It can also compute the difference (as counted by the number of internal next_uint32 steps) between two PCG32 instances. This assumes that the two instances were consistently seeded.
__sub__(self, arg: drjit.cuda.ad.PCG32, /) -> drjit.cuda.ad.Int64
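The difference between two generator states can be computed without stepping through the sequence. The sketch below follows the bit-by-bit distance algorithm of the reference PCG code and assumes the reference recurrence `state = state * MULT + inc (mod 2**64)`, with both states sharing the same odd increment. The function names are hypothetical, and the code makes no claim about which operand order Dr.Jit's `a - b` uses.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 0x5851F42D4C957F2D


def lcg_step(state, inc):
    """One step of the underlying linear congruential generator."""
    return (state * PCG32_MULT + inc) & MASK64


def lcg_distance(state_from, state_to, inc):
    """Number of steps that take `state_from` to `state_to`."""
    cur_mult, cur_plus = PCG32_MULT, inc  # map for 2**k steps
    cur_state, bit, distance = state_from, 1, 0
    while cur_state != state_to:
        # A 2**k-step jump flips bit k and leaves all lower bits unchanged.
        if (cur_state & bit) != (state_to & bit):
            cur_state = (cur_state * cur_mult + cur_plus) & MASK64
            distance |= bit
        bit <<= 1
        cur_plus = ((cur_mult + 1) * cur_plus) & MASK64
        cur_mult = (cur_mult * cur_mult) & MASK64
    return distance


def to_int64(value):
    """Reinterpret an unsigned 64-bit count as a signed step difference."""
    return value - (1 << 64) if value >= (1 << 63) else value


inc = (0xDA3E39CB94B95BDB << 1 | 1) & MASK64
start = 0x853C49E6748FEA9B
later = start
for _ in range(1000):
    later = lcg_step(later, inc)

forward = lcg_distance(start, later, inc)
backward = to_int64(lcg_distance(later, start, inc))
```

The requirement that both instances be consistently seeded corresponds to the shared increment here: with different increments the two states lie on different sequences and no step count connects them.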
- __isub__(self, arg: drjit.cuda.ad.Int64, /) drjit.cuda.ad.PCG32¶
In-place subtraction operator based on
__sub__().
- property inc¶
Sequence increment of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- property state¶
Sequence state of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- class drjit.cuda.ad.Philox4x32(*args, **kwargs)¶
Philox4x32 counter-based PRNG
This class implements the Philox 4x32 counter-based pseudo-random number generator based on the paper Parallel Random Numbers: As Easy as 1, 2, 3 by Salmon et al. [2011]. It uses strength-reduced cryptographic primitives to realize a complex transition function that maps a seed and a set of counter values to 4 pseudorandom outputs. Incrementing any of the counters or choosing a different seed produces statistically independent samples.
The implementation here uses a reduced number of bits (32) for the arithmetic and sets the default number of rounds to 7. However, even with these simplifications it passes the stringent TestU01 BigCrush tests (a battery of statistical tests for non-uniformity and correlations). Please see the paper Random number generators for massively parallel simulations on GPU by Manssen et al. [2012] for details.
Functions like next_uint32x4() or next_float32x4() advance the PRNG state by incrementing the counter ctr[3].
Key properties include:
Counter-based design: generation from counter + key
192-bit state: 4x32-bit counters, 64-bit key
Trivial jump-ahead capability through counter manipulation
The Philox4x32 class is implemented as a PyTree, making it compatible with symbolic function calls, loops, etc.
Note
Philox4x32 naturally produces 4 samples at a time, which may be awkward for applications that need individual random values.
Note
For a comparison of use cases between Philox4x32 and PCG32, see the PCG32 class documentation. In brief: use PCG32 for sequential generation with lowest cost per sample; use Philox4x32 for parallel generation where independent streams are critical.
Note
Please watch out for the following pitfall when using the Philox4x32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float32x4()) changes the internal RNG counter value. If this state is never explicitly evaluated, the computation graph describing this change keeps growing, causing kernel compilation of increasingly large programs to eventually become a bottleneck. The drjit.rng API avoids this pitfall by eagerly evaluating the RNG counter when needed.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 PRNG with its lower per-sample cost. You can seed this generator once and reuse it throughout the loop.
- __init__(self, seed: drjit.cuda.ad.UInt64, counter_0: drjit.cuda.ad.UInt, counter_1: drjit.cuda.ad.UInt = 0, counter_2: drjit.cuda.ad.UInt = 0, iterations: int = 7) None¶
- __init__(self, arg: drjit.cuda.ad.Philox4x32) None
Overloaded function.
__init__(self, seed: drjit.cuda.ad.UInt64, counter_0: drjit.cuda.ad.UInt, counter_1: drjit.cuda.ad.UInt = 0, counter_2: drjit.cuda.ad.UInt = 0, iterations: int = 7) -> None
Initialize a Philox4x32 random number generator.
The function takes a seed and three of the four counter components. The last component is zero-initialized and incremented by calls to the next_* methods.
- Parameters:
seed – The 64-bit seed value used as the key for the mapping
counter_0 – The first 32-bit counter value (least significant)
counter_1 – The second 32-bit counter value (default: 0)
counter_2 – The third 32-bit counter value (default: 0)
iterations – Number of rounds to apply (default: 7, range: 4-10)
For parallel stream generation, simply use different counter values - each combination of counter values produces an independent random stream.
__init__(self, arg: drjit.cuda.ad.Philox4x32) -> None
Copy constructor
- next_uint32x4(self, mask: drjit.cuda.ad.Bool = True) drjit.cuda.ad.Array4u¶
Generate 4 random 32-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 4 independent 32-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random 32-bit unsigned integers
- next_uint64x2(self, mask: drjit.cuda.ad.Bool = True) drjit.cuda.ad.Array2u64¶
Generate 2 random 64-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 2 independent 64-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random 64-bit unsigned integers
- next_float16x4(self, mask: drjit.cuda.ad.Bool = True) drjit.cuda.ad.Array4f16¶
Generate 4 random half-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to half precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float32x4(self, mask: drjit.cuda.ad.Bool = True) drjit.cuda.ad.Array4f¶
Generate 4 random single-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to single precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float64x2(self, mask: drjit.cuda.ad.Bool = True) drjit.cuda.ad.Array2f64¶
Generate 2 random double-precision floats in \([0, 1)\).
Generates 2 random 64-bit unsigned integers and converts them to floats uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats on the half-open interval \([0, 1)\)
- next_float16x4_normal(self, mask: drjit.cuda.ad.Bool = True) drjit.cuda.ad.Array4f16¶
Generate 4 normally distributed half-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 half precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float32x4_normal(self, mask: drjit.cuda.ad.Bool = True) drjit.cuda.ad.Array4f¶
Generate 4 normally distributed single-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 single precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float64x2_normal(self, mask: drjit.cuda.ad.Bool = True) drjit.cuda.ad.Array2f64¶
Generate 2 normally distributed double-precision floats
Advances the internal counter and applies the Philox mapping to produce 2 double precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats from a standard normal distribution
- property seed¶
(self) -> drjit.cuda.ad.Array2u
- property counter¶
(self) -> drjit.cuda.ad.Array4u
- property iterations¶
(self) -> int
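The `next_float32x4()` entry above states that random 32-bit integers are converted to floats on the half-open interval \([0, 1)\). One common way to do such a conversion is to place random bits into the mantissa of a float in \([1, 2)\) and then subtract 1. The pure-Python sketch below illustrates that general idea only; it is an assumption about the technique, not a description of Dr.Jit's actual implementation, and the helper name `uint32_to_unit_float` is hypothetical.

```python
import struct


def uint32_to_unit_float(x: int) -> float:
    """Map a 32-bit unsigned integer to a single-precision value in [0, 1).

    Hypothetical helper for illustration; not part of Dr.Jit.
    """
    if not 0 <= x <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit unsigned integer")
    # Keep the top 23 bits as the mantissa and combine them with the
    # sign/exponent bits of 1.0 (0x3F800000), giving a float in [1, 2).
    bits = (x >> 9) | 0x3F800000
    # Reinterpret the bit pattern as an IEEE 754 single-precision float.
    (value,) = struct.unpack("<f", struct.pack("<I", bits))
    # Shift the interval [1, 2) down to [0, 1).
    return value - 1.0


if __name__ == "__main__":
    for sample in (0x00000000, 0x80000000, 0xFFFFFFFF):
        print(hex(sample), uint32_to_unit_float(sample))
```

Because only 23 mantissa bits are used, the largest value this sketch can return is `1 - 2**-23`, so the upper bound 1 is never reached.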
Automatic array namespace (drjit.auto)¶
By default, the automatic backend wraps drjit.cuda when a CUDA-capable device is detected; otherwise it wraps drjit.llvm.
You can use the function drjit.set_backend() to redirect this module.
This backend is always vectorized, hence types listed as scalar actually represent an array of scalars partaking in a parallel computation (analogously, 1D arrays are arrays of 1D arrays, etc.).
Scalars¶
- class drjit.auto.Bool¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Float¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Float64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.UInt¶
Derives from
drjit.ArrayBase.
- class drjit.auto.UInt8¶
Derives from
drjit.ArrayBase.
- class drjit.auto.UInt64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Int¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Int8¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Int64¶
Derives from
drjit.ArrayBase.
1D arrays¶
- class drjit.auto.Array0b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array1b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array2b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array3b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array4b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ArrayXb¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array0f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array1f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array2f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array3f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array4f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ArrayXf16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array0f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array1f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array2f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array3f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array4f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ArrayXf¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array0u¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array1u¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array2u¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array3u¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array4u¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ArrayXu¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array0i¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array1i¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array2i¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array3i¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array4i¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ArrayXi¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array0f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array1f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array2f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array3f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array4f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ArrayXf64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array0u64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array1u64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array2u64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array3u64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array4u64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ArrayXu64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array0i64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array1i64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array2i64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array3i64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array4i64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ArrayXi64¶
Derives from
drjit.ArrayBase.
2D arrays¶
- class drjit.auto.Array22b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array33b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array44b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array22f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array33f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array44f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array22f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array33f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array44f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array22f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array33f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Array44f64¶
Derives from
drjit.ArrayBase.
Special (complex numbers, etc.)¶
- class drjit.auto.Complex2f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Complex2f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Quaternion4f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Quaternion4f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Quaternion4f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Matrix2f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Matrix3f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Matrix4f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Matrix2f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Matrix3f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Matrix4f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Matrix2f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Matrix3f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.Matrix4f64¶
Derives from
drjit.ArrayBase.
Tensors¶
- class drjit.auto.TensorXb¶
Derives from
drjit.ArrayBase.
- class drjit.auto.TensorXf16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.TensorXf¶
Derives from
drjit.ArrayBase.
- class drjit.auto.TensorXu¶
Derives from
drjit.ArrayBase.
- class drjit.auto.TensorXi¶
Derives from
drjit.ArrayBase.
- class drjit.auto.TensorXf64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.TensorXu64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.TensorXi64¶
Derives from
drjit.ArrayBase.
Textures¶
- class drjit.auto.Texture1f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
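To make the behavior of `eval()` with linear filtering and the default `WrapMode.Clamp` more concrete, the following pure-Python sketch evaluates a single-channel 1D texture. It is only an illustration: the function name `eval_linear_clamp` is hypothetical, and the convention that `pos` lies in the unit interval with texel `i` centered at `(i + 0.5) / n` is an assumption made for this sketch rather than something stated in this reference.

```python
import math


def eval_linear_clamp(texels, pos):
    """Linearly interpolate a 1D single-channel texture with clamping.

    Hypothetical helper for illustration; not part of Dr.Jit.
    """
    n = len(texels)
    # Assumed convention: texel i is centered at (i + 0.5) / n.
    x = pos * n - 0.5
    i0 = math.floor(x)
    w1 = x - i0  # Weight of the right-hand texel, in [0, 1).

    def fetch(i):
        # Clamp mode: indices outside the texture reuse the boundary texel.
        return texels[min(max(i, 0), n - 1)]

    return (1.0 - w1) * fetch(i0) + w1 * fetch(i0 + 1)


if __name__ == "__main__":
    data = [0.0, 1.0, 2.0, 3.0]
    for p in (-1.0, 0.125, 0.25, 2.0):
        print(p, eval_linear_clamp(data, p))
```

Query positions far outside the unit interval return the boundary value, which matches the description of the clamp mode as extending the boundary colors indefinitely.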
- class drjit.auto.Texture2f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
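The `eval_cubic()` description above mentions rewriting the cubic B-spline evaluation as a weighted sum of linear interpolant evaluations. The pure-Python sketch below shows the 1D version of that identity: four B-spline taps collapse into two linear lookups. It is a hedged illustration rather than Dr.Jit's code; all function names are hypothetical, and `x` is assumed to be given directly in texel-index units with clamped indexing at the borders.

```python
import math


def bspline_weights(t):
    """Uniform cubic B-spline weights for a fractional offset t in [0, 1)."""
    w0 = (1.0 - t) ** 3 / 6.0
    w1 = (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0
    w2 = (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0
    w3 = t**3 / 6.0
    return w0, w1, w2, w3


def fetch(texels, i):
    """Read texel i, clamping the index to the valid range."""
    return texels[min(max(i, 0), len(texels) - 1)]


def lerp_lookup(texels, i, h):
    """Stand-in for one linear lookup between texels i and i + 1."""
    return (1.0 - h) * fetch(texels, i) + h * fetch(texels, i + 1)


def cubic_direct(texels, x):
    """Reference evaluation: explicit sum over the four B-spline taps."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * fetch(texels, i - 1 + k) for k in range(4))


def cubic_two_lookups(texels, x):
    """Same result, expressed as a weighted sum of two linear lookups."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3  # Both sums are strictly positive on [0, 1).
    return (g0 * lerp_lookup(texels, i - 1, w1 / g0)
            + g1 * lerp_lookup(texels, i + 1, w3 / g1))


if __name__ == "__main__":
    data = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
    for x in (0.3, 1.5, 2.75):
        print(x, cubic_direct(data, x), cubic_two_lookups(data, x))
```

The lookup offsets `w1 / g0` and `w3 / g1` depend on the query position, which is consistent with the remark above that the transformation is not linear with respect to position.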
- class drjit.auto.Texture3f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.Texture1f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
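To make the default filter_mode and wrap_mode concrete, here is a minimal pure-Python sketch of a 1D linear lookup with clamping. It does not call Dr.Jit; the helper name eval_linear_clamp and the convention that texel i is centered at (i + 0.5) / resolution are assumptions made for illustration.

```python
import math

def eval_linear_clamp(texels, pos):
    """Linearly interpolate a 1D list of texels at a normalized position.

    Assumption: texel i has its center at (i + 0.5) / len(texels).
    """
    res = len(texels)
    # Map the normalized position to texel space so that integer
    # coordinates coincide with texel centers.
    x = pos * res - 0.5
    i0 = math.floor(x)
    w1 = x - i0      # weight of the right-hand texel
    w0 = 1.0 - w1    # weight of the left-hand texel

    def clamp(i):
        # Clamp wrapping: indices past either end reuse the boundary texel.
        return min(max(i, 0), res - 1)

    return w0 * texels[clamp(i0)] + w1 * texels[clamp(i0 + 1)]

texels = [1.0, 3.0, 2.0, 5.0]
inside = eval_linear_clamp(texels, 0.25)   # halfway between texels 0 and 1
outside = eval_linear_clamp(texels, 1.75)  # beyond the right boundary
```

Under these assumptions, a query outside the texture keeps returning the boundary value, which is what "indefinitely extends the colors on the boundary" describes.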
- set_value(self, value: drjit.llvm.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
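The difference between fetching and interpolating can be sketched in plain Python for the 1D case. The helper fetch_linear below is hypothetical and does not use Dr.Jit; it assumes texel centers at (i + 0.5) / resolution and clamped indexing. Unlike eval_fetch(), it also returns the blend weight so that the caller can finish the interpolation by hand.

```python
import math

def fetch_linear(texels, pos):
    """Return the two texels a linear lookup at pos would read, plus the blend weight."""
    res = len(texels)
    x = pos * res - 0.5          # texel-space coordinate (assumed convention)
    i0 = math.floor(x)
    t = x - i0                   # fractional offset between the two texels
    lo = texels[min(max(i0, 0), res - 1)]
    hi = texels[min(max(i0 + 1, 0), res - 1)]
    return [lo, hi], t

texels = [1.0, 3.0, 2.0, 5.0]
(lo, hi), t = fetch_linear(texels, 0.5)
# The caller is free to combine the fetched texels in any way; a plain
# linear blend reproduces the result of a regular linear lookup.
blended = (1.0 - t) * lo + t * hi
```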
- eval_cubic(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
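The equivalence between the four-tap B-Spline sum and two linear lookups can be checked with a small 1D sketch in plain Python. This is an illustration of the technique from the GPU Gems 2 chapter, not the library's code; all function names are hypothetical, coordinates are in texel space (integers are texel centers), and out-of-range indices are clamped.

```python
import math

def bspline_weights(t):
    """Uniform cubic B-Spline basis weights for a fractional offset t in [0, 1)."""
    w0 = (1.0 - t) ** 3 / 6.0
    w1 = (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0
    w2 = (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0
    w3 = t**3 / 6.0
    return w0, w1, w2, w3

def texel(data, i):
    # Clamped texel access
    return data[min(max(i, 0), len(data) - 1)]

def lerp_lookup(data, x):
    """Linear interpolation at a continuous texel-space coordinate x."""
    i = math.floor(x)
    t = x - i
    return (1.0 - t) * texel(data, i) + t * texel(data, i + 1)

def cubic_direct(data, x):
    """Reference: weighted sum of the four neighboring texels."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * texel(data, i - 1 + k) for k in range(4))

def cubic_via_linear(data, x):
    """Same result from two linear lookups at adjusted positions."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3      # combined weight of each texel pair
    x0 = i - 1 + w1 / g0           # lookup position between texels i-1 and i
    x1 = i + 1 + w3 / g1           # lookup position between texels i+1 and i+2
    return g0 * lerp_lookup(data, x0) + g1 * lerp_lookup(data, x1)

data = [1.0, 3.0, 2.0, 5.0, 4.0, 0.5]
samples = [0.0, 0.3, 1.9, 2.5, 4.75]
pairs = [(cubic_direct(data, x), cubic_via_linear(data, x)) for x in samples]
```

The lookup positions x0 and x1 depend nonlinearly on the query position, which is why differentiating through this formulation with respect to position does not yield the correct derivative.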
- eval_cubic_grad(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
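One way to read the scaling by spatial extents is as a chain-rule factor: if the query position is normalized to the unit interval, the derivative taken in texel space must be multiplied by the resolution. The 1D sketch below works under that assumption, with hypothetical helper names, clamped indexing, and texel centers at (i + 0.5) / resolution; it differentiates the basis functions explicitly rather than relying on automatic differentiation.

```python
import math

def bspline_weights(t):
    """Uniform cubic B-Spline basis weights for t in [0, 1)."""
    return ((1.0 - t) ** 3 / 6.0,
            (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0,
            (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0,
            t**3 / 6.0)

def bspline_weight_derivs(t):
    """Analytic derivatives of the basis weights with respect to t."""
    return (-0.5 * (1.0 - t) ** 2,
            1.5 * t**2 - 2.0 * t,
            -1.5 * t**2 + t + 0.5,
            0.5 * t**2)

def texel(data, i):
    return data[min(max(i, 0), len(data) - 1)]

def cubic_value(data, pos):
    res = len(data)
    x = pos * res - 0.5
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * texel(data, i - 1 + k) for k in range(4))

def cubic_grad(data, pos):
    res = len(data)
    x = pos * res - 0.5
    i = math.floor(x)
    d = bspline_weight_derivs(x - i)
    grad_texel_space = sum(d[k] * texel(data, i - 1 + k) for k in range(4))
    # Chain rule: x = pos * res - 0.5, so dx/dpos = res.
    return grad_texel_space * res

data = [1.0, 3.0, 2.0, 5.0, 4.0, 0.5]
pos, h = 0.4, 1e-6
analytic = cubic_grad(data, pos)
numeric = (cubic_value(data, pos + h) - cubic_value(data, pos - h)) / (2.0 * h)
```

The finite-difference value is only used here as a sanity check on the analytic derivative.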
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.Texture2f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
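The relationship between the tensor form accepted by set_tensor() and the linearized 1D array accepted by set_value() can be sketched without Dr.Jit. The layout used here (row-major order with the channel index varying fastest, i.e. a tensor indexed as [height][width][channels]) is an assumption for illustration, and both helper names are hypothetical.

```python
def linearize(tensor):
    """Flatten a nested [height][width][channels] list into a row-major 1D list."""
    return [c for row in tensor for px in row for c in px]

def texel_offset(y, x, ch, width, channels):
    """Index of channel ch of texel (y, x) in the linearized array."""
    return (y * width + x) * channels + ch

# A 2 x 3 texture with 2 channels per texel
tensor = [
    [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]],
    [[3.0, 3.1], [4.0, 4.1], [5.0, 5.1]],
]
height, width, channels = 2, 3, 2
flat = linearize(tensor)
last = flat[texel_offset(1, 2, 1, width, channels)]
```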
- value(self) drjit.llvm.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.Texture3f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and Hessian have been multiplied by the spatial extents to account for the transformation from the unit-size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.Texture1f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array1f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array1f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array1f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
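The relationship between eval_fetch() and eval() described above can be illustrated with a minimal pure-Python sketch in one dimension. It does not use Dr.Jit; the helper names fetch_linear_1d and eval_linear_1d are hypothetical, and the texel-center convention (texel i centered at (i + 0.5) / n) is an assumption made for illustration.

```python
import math


def fetch_linear_1d(texels, pos):
    """Return the two texels a linear lookup at `pos` references, plus the weight."""
    n = len(texels)
    # Assumption: texel i is centered at (i + 0.5) / n in normalized coordinates.
    u = pos * n - 0.5
    i0 = math.floor(u)
    weight = u - i0  # fractional offset in [0, 1)

    def clamp(i):
        # Clamp wrap mode: out-of-range indices reuse the boundary texel.
        return min(max(i, 0), n - 1)

    return [texels[clamp(i0)], texels[clamp(i0 + 1)]], weight


def eval_linear_1d(texels, pos):
    """Blend the two fetched texels with the linear interpolation weight."""
    (t0, t1), weight = fetch_linear_1d(texels, pos)
    return (1.0 - weight) * t0 + weight * t1


texels = [1.0, 3.0, 2.0, 5.0]
print(fetch_linear_1d(texels, 0.25))  # the two neighboring texels and their blend weight
print(eval_linear_1d(texels, 0.25))   # the interpolated value
```

Queries outside of [0, 1] return the boundary texel in this sketch, which mirrors the Clamp behavior of extending the boundary values indefinitely.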
- class drjit.auto.Texture2f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count.
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array2f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array2f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array2f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
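The transformation used by eval_cubic() (a cubic B-Spline evaluation rewritten as a weighted sum of linear lookups) can be sketched in one dimension as follows. This is a pure-Python illustration of the idea from the GPU Gems 2 chapter cited above, not Dr.Jit's implementation; all function names are hypothetical, and coordinates are expressed directly in texel units to keep the sketch short.

```python
import math


def clamped_fetch(texels, i):
    """Fetch texel i, reusing the boundary texel outside of the valid range."""
    return texels[min(max(i, 0), len(texels) - 1)]


def lerp_lookup(texels, u):
    """Linear interpolation at the continuous texel coordinate u."""
    i = math.floor(u)
    a = u - i
    return (1.0 - a) * clamped_fetch(texels, i) + a * clamped_fetch(texels, i + 1)


def bspline_weights(t):
    """Uniform cubic B-Spline basis functions for a fractional offset t in [0, 1)."""
    w0 = (1.0 - t) ** 3 / 6.0
    w1 = (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0
    w2 = (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0
    w3 = t**3 / 6.0
    return w0, w1, w2, w3


def cubic_direct(texels, u):
    """Reference: weighted sum of the four neighboring texels."""
    i = math.floor(u)
    w = bspline_weights(u - i)
    return sum(w[k] * clamped_fetch(texels, i - 1 + k) for k in range(4))


def cubic_via_linear(texels, u):
    """Same result using two linear lookups instead of four texel fetches."""
    i = math.floor(u)
    w0, w1, w2, w3 = bspline_weights(u - i)
    g0, g1 = w0 + w1, w2 + w3       # weights of the two linear lookups
    h0 = (i - 1) + w1 / g0          # lookup position between texels i-1 and i
    h1 = (i + 1) + w3 / g1          # lookup position between texels i+1 and i+2
    return g0 * lerp_lookup(texels, h0) + g1 * lerp_lookup(texels, h1)


texels = [1.0, 3.0, 2.0, 5.0, 4.0]
for u in (0.25, 1.5, 2.999):
    print(cubic_direct(texels, u), cubic_via_linear(texels, u))
```

The lookup positions h0 and h1 depend nonlinearly on the query position, which is consistent with the remark above that the default AD graph of this formulation cannot be used to differentiate with respect to position.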
- class drjit.auto.Texture3f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count.
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float]]¶
- eval_fetch(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float16]]
- eval_fetch(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[list[drjit.llvm.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float]¶
- eval_cubic(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float16]
- eval_cubic(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). In that case, the implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.Array3f, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.Array3f16, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.Array3f64, active: drjit.llvm.Bool | None = Bool(True)) list[drjit.llvm.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
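The idea behind eval_cubic_grad() (differentiating the B-Spline basis functions explicitly and rescaling by the spatial extent) can be sketched in one dimension. The following pure-Python code is an illustration only: the function names are hypothetical, and the mapping u = pos * n - 0.5 from normalized to texel coordinates is an assumption. Under that assumption, the chain rule contributes the factor n.

```python
import math


def clamped_fetch(texels, i):
    """Fetch texel i, reusing the boundary texel outside of the valid range."""
    return texels[min(max(i, 0), len(texels) - 1)]


def cubic_value(texels, pos):
    """Cubic B-Spline interpolant at the normalized position pos."""
    n = len(texels)
    u = pos * n - 0.5  # assumed normalized-to-texel mapping
    i = math.floor(u)
    t = u - i
    w = (
        (1.0 - t) ** 3 / 6.0,
        (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0,
        (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0,
        t**3 / 6.0,
    )
    return sum(w[k] * clamped_fetch(texels, i - 1 + k) for k in range(4))


def cubic_grad(texels, pos):
    """Derivative of cubic_value with respect to pos, from differentiated bases."""
    n = len(texels)
    u = pos * n - 0.5
    i = math.floor(u)
    t = u - i
    # Derivatives of the four basis functions with respect to t
    dw = (
        -0.5 * (1.0 - t) ** 2,
        1.5 * t**2 - 2.0 * t,
        -1.5 * t**2 + t + 0.5,
        0.5 * t**2,
    )
    d_du = sum(dw[k] * clamped_fetch(texels, i - 1 + k) for k in range(4))
    return d_du * n  # chain rule: du/dpos = n (the spatial extent)


texels = [1.0, 3.0, 2.0, 5.0, 4.0]
h = 1e-6
for pos in (0.3, 0.55, 0.8):
    finite_diff = (cubic_value(texels, pos + h) - cubic_value(texels, pos - h)) / (2 * h)
    print(cubic_grad(texels, pos), finite_diff)
```

The finite-difference comparison is only a sanity check of the sketch; it says nothing about the accuracy of the library routine.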
Random number generators¶
- class drjit.auto.PCG32(*args, **kwargs)¶
Implementation of PCG32, a member of the PCG family of random number generators proposed by Melissa O’Neill.
PCG32 is a stateful pseudorandom number generator that combines a linear congruential generator (LCG) with a permutation function. It provides high statistical quality with a remarkably fast and compact implementation. Details on the PCG family of pseudorandom number generators can be found here.
To create random tensors of different sizes in Python, prefer the higher-level dr.rng() interface, which internally uses the Philox4x32 generator. The properties of PCG32 make it most suitable for Monte Carlo applications requiring long sequences of random variates.
Key properties of the PCG variant implemented here include:
Compact: 128 bits total state (64-bit state + 64-bit increment)
Output: 32-bit output with a period of 2^64 per stream
Streams: Multiple independent streams via the increment parameter (with caveats, see below)
Low-cost sample generation: a single 64 bit integer multiply-add plus a bit permutation applied to the output.
Extra features: provides fast multi-step advance/rewind functionality.
Caveats: PCG32 produces high-quality random variates within each random number stream. For a given initial state, PCG32 can also produce multiple output streams by specifying a different sequence increment (initseq) to the constructor. However, the level of statistical independence across streams is generally insufficient when doing so. To obtain a series of high-quality independent parallel streams, it is recommended to use another method (e.g., the Tiny Encryption Algorithm) to seed the state and inc parameters. This ensures independence both within and across streams.
In Python, the PCG32 class is implemented as a PyTree, which means that it is compatible with symbolic function calls, loops, etc.
Note
Please watch out for the following pitfall when using the PCG32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float()) changes the internal RNG state. If this state is never explicitly evaluated, the computation graph describing the state transformation keeps growing without bound, causing kernel compilation of increasingly large programs to eventually become a bottleneck. To evaluate the RNG, simply run
    rng: PCG32 = ....
    dr.eval(rng)
For computation involving very large arrays, storing the RNG state (16 bytes per entry) can be prohibitive. In this case, it is better to keep the RNG in symbolic form and re-seed it at every optimization iteration.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 API directly to seed once and reuse the random number generator throughout the loop.
The drjit.rng API avoids these pitfalls by eagerly evaluating the RNG state.
Comparison with Philox4x32:
PCG32: State-based, better for sequential generation, low per-sample cost.
Philox4x32: Counter-based, better for parallel generation, higher per-sample cost.
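The "multiply-add plus bit permutation" structure listed among the key properties can be made concrete with a scalar pure-Python sketch of the PCG32 generator (the XSH-RR variant of the PCG family). The class name ScalarPCG32 is hypothetical, and the seeding sequence follows the reference PCG implementation; whether Dr.Jit matches it in every detail is an assumption.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1
PCG32_MULT = 0x5851F42D4C957F2D  # LCG multiplier of the reference PCG32


class ScalarPCG32:
    """Scalar illustration of PCG32: a 64-bit LCG plus an output permutation."""

    def __init__(self, initstate=0x853C49E6748FEA9B, initseq=0xDA3E39CB94B95BDB):
        self.seed(initstate, initseq)

    def seed(self, initstate, initseq):
        self.state = 0
        # The increment must be odd for the LCG to reach its full 2^64 period
        self.inc = ((initseq << 1) | 1) & MASK64
        self.next_uint32()
        self.state = (self.state + initstate) & MASK64
        self.next_uint32()

    def next_uint32(self):
        old = self.state
        # State transition: one 64-bit multiply-add
        self.state = (old * PCG32_MULT + self.inc) & MASK64
        # Output permutation: xorshift followed by a data-dependent rotation
        xorshifted = (((old >> 18) ^ old) >> 27) & MASK32
        rot = old >> 59
        return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & MASK32


rng = ScalarPCG32()
print([hex(rng.next_uint32()) for _ in range(4)])
```

The 128 bits of state mentioned above correspond to the state and inc fields of this sketch.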
- __init__(self, size: int = 1, initstate: drjit.llvm.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
- __init__(self, arg: drjit.llvm.PCG32) None
Overloaded function.
__init__(self, size: int = 1, initstate: drjit.llvm.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.UInt64 = UInt64(0xda3e39cb94b95bdb)) -> None
Initialize a random number generator that generates size variates in parallel.
The initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are based on the original implementation.
The implementation of this routine internally calls seed(), with one small twist. When multiple random numbers are being generated in parallel, the constructor adds an offset equal to drjit.arange(UInt64, size) to both initstate and initseq to de-correlate the generated sequences.
__init__(self, arg: drjit.llvm.PCG32) -> None
Copy-construct a new PCG32 instance from an existing instance.
- seed(self, initstate: drjit.llvm.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
Seed the random number generator with the given initial state and sequence ID.
The initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are taken from the original implementation.
- next_float(self, dtype: type, mask: object = True) object¶
Generate a uniformly distributed floating point number of the requested precision on the interval \([0, 1)\).
The function analyzes the provided target dtype and invokes next_float16(), next_float32(), or next_float64() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16(self) drjit.llvm.Float16¶
- next_float16(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float16
Overloaded function.
next_float16(self) -> drjit.llvm.Float16
Generate a uniformly distributed half precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float16(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float16
- next_float32(self) drjit.llvm.Float¶
- next_float32(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float
Overloaded function.
next_float32(self) -> drjit.llvm.Float
Generate a uniformly distributed single precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float32(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float
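One common way to turn a 32-bit random integer into a single precision value on \([0, 1)\) is to place 23 random bits into the mantissa of a float in \([1, 2)\) and subtract one. The sketch below shows this construction in pure Python; the function name is hypothetical, and it is an assumption that Dr.Jit's next_float32() uses the same bit-level approach.

```python
import struct


def uint32_to_unit_float(r):
    """Map a 32-bit unsigned integer to a float32 value on [0, 1)."""
    # Keep the top 23 bits as mantissa and set the exponent bits of 1.0
    bits = (r >> 9) | 0x3F800000
    # Reinterpret the bit pattern as an IEEE 754 single precision number in [1, 2)
    (value,) = struct.unpack("<f", struct.pack("<I", bits))
    return value - 1.0


print(uint32_to_unit_float(0))           # smallest possible result
print(uint32_to_unit_float(0x80000000))  # result for the mid-range input
print(uint32_to_unit_float(0xFFFFFFFF))  # largest possible result, below 1
```

The result has a resolution of 2^-23, and the upper bound 1 is never produced.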
- next_float64(self) drjit.llvm.Float64¶
- next_float64(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float64
Overloaded function.
next_float64(self) -> drjit.llvm.Float64
Generate a uniformly distributed double precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float64(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float64
- next_float_normal(self, dtype: type, mask: object = True) object¶
Generate a (standard) normally distributed floating point number of the requested precision.
The function analyzes the provided target dtype and invokes next_float16_normal(), next_float32_normal(), or next_float64_normal() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16_normal(self) drjit.llvm.Float16¶
- next_float16_normal(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float16
Overloaded function.
next_float16_normal(self) -> drjit.llvm.Float16
Generate a (standard) normally distributed half precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float16_normal(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float16
- next_float32_normal(self) drjit.llvm.Float¶
- next_float32_normal(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float
Overloaded function.
next_float32_normal(self) -> drjit.llvm.Float
Generate a (standard) normally distributed single precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float32_normal(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float
- next_float64_normal(self) drjit.llvm.Float64¶
- next_float64_normal(self, arg: drjit.llvm.Bool, /) drjit.llvm.Float64
Overloaded function.
next_float64_normal(self) -> drjit.llvm.Float64
Generate a (standard) normally distributed double precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float64_normal(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.Float64
- next_uint32(self) drjit.llvm.UInt¶
- next_uint32(self, arg: drjit.llvm.Bool, /) drjit.llvm.UInt
Overloaded function.
next_uint32(self) -> drjit.llvm.UInt
Generate a uniformly distributed unsigned 32-bit random number
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_uint32(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.UInt
- next_uint64(self) drjit.llvm.UInt64¶
- next_uint64(self, arg: drjit.llvm.Bool, /) drjit.llvm.UInt64
Overloaded function.
next_uint64(self) -> drjit.llvm.UInt64
Generate a uniformly distributed unsigned 64-bit random number
Internally, the function calls next_uint32() twice.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_uint64(self, arg: drjit.llvm.Bool, /) -> drjit.llvm.UInt64
- next_uint32_bounded(self, bound: int, mask: drjit.llvm.Bool = Bool(True)) drjit.llvm.UInt¶
Generate a uniformly distributed 32-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
- next_uint64_bounded(self, bound: int, mask: drjit.llvm.Bool = Bool(True)) drjit.llvm.UInt64¶
Generate a uniformly distributed 64-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
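An iterative scheme of the kind mentioned above can be sketched with rejection sampling: raw 32-bit draws are discarded when they fall into a small region that would otherwise make some remainders more likely than others. The code below is a generic pure-Python illustration with a hypothetical helper name; it is not necessarily the scheme used by Dr.Jit, and Python's random module merely stands in for a 32-bit generator.

```python
import random


def next_uint32_bounded(next_uint32, bound):
    """Draw an unbiased integer on [0, bound) from a 32-bit integer source."""
    # Number of raw values to reject so that the accepted range
    # [threshold, 2^32) has a size that is an exact multiple of `bound`
    threshold = (2**32 - bound) % bound
    while True:
        r = next_uint32()
        if r >= threshold:
            return r % bound


source = random.Random(0)  # stand-in for a 32-bit PRNG


def draw():
    return source.getrandbits(32)


samples = [next_uint32_bounded(draw, 10) for _ in range(1000)]
print(min(samples), max(samples))
```

Since threshold is always smaller than bound, the probability of rejecting a draw is below bound / 2^32, which is why such loops usually terminate after one or two iterations.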
- __add__(self, arg: drjit.llvm.Int64, /) drjit.llvm.PCG32¶
Advance the pseudorandom number generator.
This function implements a multi-step advance function that is equivalent to (but more efficient than) calling the random number generator arg times in sequence.
This is useful to advance a newly constructed PRNG to a certain known state.
- __iadd__(self, arg: drjit.llvm.Int64, /) drjit.llvm.PCG32¶
In-place addition operator based on __add__().
- __sub__(self, arg: drjit.llvm.Int64, /) drjit.llvm.PCG32¶
- __sub__(self, arg: drjit.llvm.PCG32, /) drjit.llvm.Int64
Overloaded function.
__sub__(self, arg: drjit.llvm.Int64, /) -> drjit.llvm.PCG32
Rewind the pseudorandom number generator.
This function implements the opposite of __add__ to step a PRNG backwards. It can also compute the difference (as counted by the number of internal next_uint32 steps) between two PCG32 instances. This assumes that the two instances were consistently seeded.
__sub__(self, arg: drjit.llvm.PCG32, /) -> drjit.llvm.Int64
- __isub__(self, arg: drjit.llvm.Int64, /) drjit.llvm.PCG32¶
In-place subtraction operator based on __sub__().
- property inc¶
Sequence increment of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- property state¶
Sequence state of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
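The multi-step advance and rewind operations provided by __add__() and __sub__() can be understood through the underlying linear congruential generator: composing the step function with itself yields another affine map, so a jump of n steps takes a number of operations proportional to log n via repeated squaring. The pure-Python sketch below illustrates this with hypothetical function names; it operates on the raw 64-bit LCG state only and is not Dr.Jit's code.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 0x5851F42D4C957F2D  # LCG multiplier of the reference PCG32


def lcg_step(state, inc):
    """Advance the 64-bit LCG by a single step."""
    return (state * PCG32_MULT + inc) & MASK64


def lcg_advance(state, inc, delta):
    """Advance the LCG by `delta` steps; negative values rewind."""
    cur_mult, cur_plus = PCG32_MULT, inc  # affine map for 2^k steps
    acc_mult, acc_plus = 1, 0             # accumulated affine map
    delta &= MASK64                       # negative steps wrap around the 2^64 period
    while delta > 0:
        if delta & 1:
            acc_mult = (acc_mult * cur_mult) & MASK64
            acc_plus = (acc_plus * cur_mult + cur_plus) & MASK64
        # Compose the current map with itself (doubles the step count)
        cur_plus = ((cur_mult + 1) * cur_plus) & MASK64
        cur_mult = (cur_mult * cur_mult) & MASK64
        delta >>= 1
    return (acc_mult * state + acc_plus) & MASK64


inc = ((0xDA3E39CB94B95BDB << 1) | 1) & MASK64  # odd increment
start = 12345
stepped = start
for _ in range(1000):
    stepped = lcg_step(stepped, inc)

print(lcg_advance(start, inc, 1000) == stepped)   # jump forward
print(lcg_advance(stepped, inc, -1000) == start)  # jump backward
```

Rewinding works because the generator has a period of 2^64 for an odd increment, so stepping backwards by n is the same as stepping forwards by 2^64 - n.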
- class drjit.auto.Philox4x32(*args, **kwargs)¶
Philox4x32 counter-based PRNG
This class implements the Philox 4x32 counter-based pseudo-random number generator based on the paper Parallel Random Numbers: As Easy as 1, 2, 3 by Salmon et al. [2011]. It uses strength-reduced cryptographic primitives to realize a complex transition function that turns a seed and set of counter values into 4 pseudorandom outputs. Incrementing any of the counters or choosing a different seed produces statistically independent samples.
The implementation here uses a reduced number of bits (32) for the arithmetic and sets the default number of rounds to 7. However, even with these simplifications it passes the stringent TestU01 BigCrush tests (a battery of statistical tests for non-uniformity and correlations). Please see the paper Random number generators for massively parallel simulations on GPU by Manssen et al. [2012] for details.
Functions like next_uint32x4() or next_float32x4() advance the PRNG state by incrementing the counter ctr[3].
Key properties include:
Counter-based design: generation from counter + key
192-bit state: 4x32-bit counters, 64-bit key
Trivial jump-ahead capability through counter manipulation
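The properties listed above can be illustrated with a small scalar sketch in pure Python. It assumes the Philox-4x32 round function and constants from the Salmon et al. paper; how the 64-bit seed is split into the two key words is an assumption, the function names are hypothetical, and no claim is made that the outputs match those of Dr.Jit.

```python
# Illustrative scalar Philox-4x32 sketch in pure Python. This is NOT Dr.Jit's
# implementation; philox_round() and philox4x32() are hypothetical names.
MASK32 = 0xFFFFFFFF
M0, M1 = 0xD2511F53, 0xCD9E8D57  # round multipliers
W0, W1 = 0x9E3779B9, 0xBB67AE85  # per-round key increments


def philox_round(ctr, key):
    """One round: two 32x32 -> 64 bit products, mixed with the key."""
    p0 = M0 * ctr[0]
    p1 = M1 * ctr[2]
    hi0, lo0 = p0 >> 32, p0 & MASK32
    hi1, lo1 = p1 >> 32, p1 & MASK32
    return [hi1 ^ ctr[1] ^ key[0], lo1, hi0 ^ ctr[3] ^ key[1], lo0]


def philox4x32(seed, counter, rounds=7):
    """Map (seed, 4 counters) to 4 pseudorandom 32-bit words, statelessly."""
    # Assumption: the low/high halves of the seed form the two key words
    key = [seed & MASK32, (seed >> 32) & MASK32]
    ctr = [c & MASK32 for c in counter]
    for i in range(rounds):
        if i > 0:
            key = [(key[0] + W0) & MASK32, (key[1] + W1) & MASK32]
        ctr = philox_round(ctr, key)
    return ctr


a = philox4x32(seed=1234, counter=[0, 0, 0, 0])
b = philox4x32(seed=1234, counter=[0, 0, 0, 0])  # same inputs, same outputs
c = philox4x32(seed=1234, counter=[0, 0, 0, 1])  # jump ahead by one block
d = philox4x32(seed=1234, counter=[1, 0, 0, 0])  # a different stream
```

Because the output depends only on the key and the counter, jumping ahead amounts to changing a counter value, and independent streams are obtained by assigning different counter values to each stream.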
The Philox4x32 class is implemented as a PyTree, making it compatible with symbolic function calls, loops, etc.
Note
Philox4x32 naturally produces 4 samples at a time, which may be awkward for applications that need individual random values.
Note
For a comparison of use cases between Philox4x32 and PCG32, see the PCG32 class documentation. In brief: use PCG32 for sequential generation with lowest cost per sample; use Philox4x32 for parallel generation where independent streams are critical.
Note
Please watch out for the following pitfall when using the Philox4x32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float32x4()) changes the internal RNG counter value. If this state is never explicitly evaluated, the computation graph describing this change keeps growing, causing kernel compilation of increasingly large programs to eventually become a bottleneck. The drjit.rng API avoids this pitfall by eagerly evaluating the RNG counter when needed.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 PRNG with its lower per-sample cost. You can seed this generator once and reuse it throughout the loop.
- __init__(self, seed: drjit.llvm.UInt64, counter_0: drjit.llvm.UInt, counter_1: drjit.llvm.UInt = 0, counter_2: drjit.llvm.UInt = 0, iterations: int = 7) None¶
- __init__(self, arg: drjit.llvm.Philox4x32) None
Overloaded function.
__init__(self, seed: drjit.llvm.UInt64, counter_0: drjit.llvm.UInt, counter_1: drjit.llvm.UInt = 0, counter_2: drjit.llvm.UInt = 0, iterations: int = 7) -> None
Initialize a Philox4x32 random number generator.
The function takes a seed and three of the four counter components. The last component is zero-initialized and incremented by calls to the next_* methods.
- Parameters:
seed – The 64-bit seed value used as the key for the mapping
counter_0 – The first 32-bit counter value (least significant)
counter_1 – The second 32-bit counter value (default: 0)
counter_2 – The third 32-bit counter value (default: 0)
iterations – Number of rounds to apply (default: 7, range: 4-10)
For parallel stream generation, simply use different counter values - each combination of counter values produces an independent random stream.
__init__(self, arg: drjit.llvm.Philox4x32) -> None
Copy constructor
- next_uint32x4(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array4u¶
Generate 4 random 32-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 4 independent 32-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random 32-bit unsigned integers
- next_uint64x2(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array2u64¶
Generate 2 random 64-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 2 independent 64-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random 64-bit unsigned integers
- next_float16x4(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array4f16¶
Generate 4 random half-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to half precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float32x4(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array4f¶
Generate 4 random single-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to single precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
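The conversion from a 32-bit integer to a float on \([0, 1)\) can be sketched as follows. This shows one common bit-manipulation approach (placing the top 23 random bits into the mantissa of a float in \([1, 2)\) and subtracting 1); whether Dr.Jit uses exactly this scheme is an assumption, and the function name is hypothetical.

```python
import struct


def uint32_to_unit_float(u):
    """Map a 32-bit unsigned integer to a single-precision value in [0, 1)."""
    # 0x3F800000 is the bit pattern of 1.0f; the top 23 bits of `u` fill the
    # mantissa, producing a float in [1, 2). Subtracting 1 shifts it to [0, 1).
    bits = 0x3F800000 | ((u & 0xFFFFFFFF) >> 9)
    (value,) = struct.unpack("<f", struct.pack("<I", bits))
    return value - 1.0


lo = uint32_to_unit_float(0x00000000)   # smallest possible result
mid = uint32_to_unit_float(0x80000000)  # only the top bit is set
hi = uint32_to_unit_float(0xFFFFFFFF)   # largest possible result, below 1
```

With this scheme, the largest attainable value is 1 - 2^-23, so the upper bound of the interval is never reached.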
- next_float64x2(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array2f64¶
Generate 2 random double-precision floats in \([0, 1)\).
Generates 2 random 64-bit unsigned integers and converts them to floats uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats on the half-open interval \([0, 1)\)
- next_float16x4_normal(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array4f16¶
Generate 4 normally distributed half-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 half precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float32x4_normal(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array4f¶
Generate 4 normally distributed single-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 single precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float64x2_normal(self, mask: drjit.llvm.Bool = True) drjit.llvm.Array2f64¶
Generate 2 normally distributed double-precision floats
Advances the internal counter and applies the Philox mapping to produce 2 double precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats from a standard normal distribution
- property seed¶
(self) -> drjit.llvm.Array2u
- property counter¶
(self) -> drjit.llvm.Array4u
- property iterations¶
(self) -> int
Automatic array namespace with automatic differentiation (drjit.auto.ad)¶
The automatic AD backend by default wraps drjit.cuda.ad when a CUDA-capable device is detected; otherwise, it wraps drjit.llvm.ad.
You can use the function drjit.set_backend() to redirect this module.
This backend is always vectorized, hence types listed as scalar actually represent an array of scalars partaking in a parallel computation (analogously, 1D arrays are arrays of 1D arrays, etc.).
Scalars¶
- class drjit.auto.ad.Bool¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Float¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Float64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.UInt¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.UInt8¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.UInt64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Int¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Int8¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Int64¶
Derives from
drjit.ArrayBase.
1D arrays¶
- class drjit.auto.ad.Array0b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array1b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array2b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array3b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array4b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.ArrayXb¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array0f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array1f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array2f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array3f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array4f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.ArrayXf16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array0f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array1f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array2f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array3f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array4f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.ArrayXf¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array0u¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array1u¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array2u¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array3u¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array4u¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.ArrayXu¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array0i¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array1i¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array2i¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array3i¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array4i¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.ArrayXi¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array0f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array1f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array2f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array3f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array4f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.ArrayXf64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array0u64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array1u64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array2u64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array3u64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array4u64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.ArrayXu64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array0i64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array1i64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array2i64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array3i64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array4i64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.ArrayXi64¶
Derives from
drjit.ArrayBase.
2D arrays¶
- class drjit.auto.ad.Array22b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array33b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array44b¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array22f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array33f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array44f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array22f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array33f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array44f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array22f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array33f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Array44f64¶
Derives from
drjit.ArrayBase.
Special (complex numbers, etc.)¶
- class drjit.auto.ad.Complex2f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Complex2f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Quaternion4f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Quaternion4f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Quaternion4f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Matrix2f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Matrix3f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Matrix4f16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Matrix2f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Matrix3f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Matrix4f¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Matrix2f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Matrix3f64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.Matrix4f64¶
Derives from
drjit.ArrayBase.
Tensors¶
- class drjit.auto.ad.TensorXb¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.TensorXf16¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.TensorXf¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.TensorXu¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.TensorXi¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.TensorXf64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.TensorXu64¶
Derives from
drjit.ArrayBase.
- class drjit.auto.ad.TensorXi64¶
Derives from
drjit.ArrayBase.
Textures¶
- class drjit.auto.ad.Texture1f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
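The default combination of linear filtering and clamped wrapping described for the constructor can be sketched in 1D with pure Python. The sketch assumes that positions lie in the unit interval and that texel centers sit at (i + 0.5) / N, which is the usual GPU texture convention; the function name is hypothetical and this is not Dr.Jit's implementation.

```python
import math


def eval_linear_clamp(texels, pos):
    """Linearly interpolate a 1D texture at `pos`, clamping at the borders."""
    n = len(texels)
    # Assumption: pos is in [0, 1] and texel i is centered at (i + 0.5) / n
    x = pos * n - 0.5
    i0 = math.floor(x)
    t = x - i0

    def clamp(i):
        # Clamp mode: out-of-range lookups reuse the boundary texel
        return min(max(i, 0), n - 1)

    return (1.0 - t) * texels[clamp(i0)] + t * texels[clamp(i0 + 1)]


texels = [0.0, 10.0, 20.0, 30.0]
center = eval_linear_clamp(texels, 0.375)  # exactly at the center of texel 1
between = eval_linear_clamp(texels, 0.5)   # halfway between texels 1 and 2
outside = eval_linear_clamp(texels, 1.5)   # beyond the right edge
```

Lookups outside of the unit interval return the boundary value, which matches the description of the clamp mode as extending the boundary colors indefinitely.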
- set_value(self, value: drjit.llvm.ad.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
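The equivalence between a four-tap cubic B-spline evaluation and a weighted sum of linear lookups can be checked with a 1D sketch in pure Python. Positions are given in texel coordinates (texel i is located at x = i), out-of-range indices are clamped, and all function names are hypothetical; this illustrates the idea from the GPU Gems 2 chapter and is not Dr.Jit's implementation.

```python
import math


def bspline_weights(t):
    """Uniform cubic B-spline basis weights for a fractional offset t in [0, 1)."""
    w0 = (1.0 - t) ** 3 / 6.0
    w1 = (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0
    w2 = (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0
    w3 = t**3 / 6.0
    return w0, w1, w2, w3


def fetch(texels, i):
    """Clamped texel fetch."""
    return texels[min(max(i, 0), len(texels) - 1)]


def linear_lookup(texels, x):
    """Stand-in for a hardware linear texture lookup."""
    i = math.floor(x)
    t = x - i
    return (1.0 - t) * fetch(texels, i) + t * fetch(texels, i + 1)


def cubic_direct(texels, x):
    """Reference: explicit weighted sum over four neighboring texels."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * fetch(texels, i - 1 + k) for k in range(4))


def cubic_via_linear(texels, x):
    """Same result using two linear lookups at shifted positions."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3
    x0 = (i - 1) + w1 / g0  # blends texels i-1 and i with weights w0 : w1
    x1 = (i + 1) + w3 / g1  # blends texels i+1 and i+2 with weights w2 : w3
    return g0 * linear_lookup(texels, x0) + g1 * linear_lookup(texels, x1)


texels = [0.0, 1.0, 4.0, 9.0, 2.0, 7.0]
positions = [0.0, 0.3, 1.5, 2.75, 4.9, -0.4, 5.6]
errors = [abs(cubic_direct(texels, x) - cubic_via_linear(texels, x))
          for x in positions]
```

The lookup positions x0 and x1 depend nonlinearly on the query position, which is why differentiating through this formulation with respect to position does not yield the correct derivative.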
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
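The role of the spatial extent can be sketched in 1D with pure Python: the basis functions are differentiated analytically in texel coordinates, and the result is multiplied by the resolution to obtain a derivative with respect to the unit-interval position. The texel-center convention (x = pos * N - 0.5) and all function names are assumptions made for this illustration.

```python
import math


def fetch(texels, i):
    """Clamped texel fetch."""
    return texels[min(max(i, 0), len(texels) - 1)]


def cubic_value(texels, pos):
    """Cubic B-spline interpolant at a position in the unit interval."""
    n = len(texels)
    x = pos * n - 0.5  # assumption: texel i is centered at (i + 0.5) / n
    i = math.floor(x)
    t = x - i
    w = ((1.0 - t) ** 3 / 6.0,
         (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0,
         (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0,
         t**3 / 6.0)
    return sum(w[k] * fetch(texels, i - 1 + k) for k in range(4))


def cubic_grad(texels, pos):
    """Derivative of cubic_value() with respect to `pos`."""
    n = len(texels)
    x = pos * n - 0.5
    i = math.floor(x)
    t = x - i
    # Analytic derivatives of the four basis weights with respect to t
    dw = (-0.5 * (1.0 - t) ** 2,
          1.5 * t**2 - 2.0 * t,
          -1.5 * t**2 + t + 0.5,
          0.5 * t**2)
    d_dx = sum(dw[k] * fetch(texels, i - 1 + k) for k in range(4))
    return d_dx * n  # chain rule: dx/dpos equals the resolution n


texels = [0.0, 1.0, 4.0, 9.0, 2.0, 7.0]
pos, h = 0.4, 1e-6
analytic = cubic_grad(texels, pos)
numeric = (cubic_value(texels, pos + h) - cubic_value(texels, pos - h)) / (2 * h)

ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ramp_grad = cubic_grad(ramp, 0.5)  # slope of 1 per texel, times 6 texels
```

For the linear ramp, the derivative per texel is 1, so the derivative with respect to the unit-interval position equals the resolution.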
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.ad.Texture2f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.ad.Texture3f16(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf16, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float16, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf16, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float16¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf16¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
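The transformation described above can be illustrated in one dimension. The following is a minimal pure-Python sketch, not the Dr.Jit implementation: the helper names (bspline_weights, texel, lerp_lookup, cubic_direct, cubic_via_linear) are hypothetical, and for simplicity texel i is assumed to sit at the integer coordinate x = i. It shows that two linear lookups at suitably shifted positions reproduce the four-tap B-spline sum.

```python
import math

def bspline_weights(t):
    """Uniform cubic B-spline basis weights for a fractional offset t in [0, 1)."""
    w0 = (1 - t) ** 3 / 6
    w1 = (3 * t**3 - 6 * t**2 + 4) / 6
    w2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    w3 = t**3 / 6
    return w0, w1, w2, w3

def texel(data, i):
    """Read a texel with clamped indexing (boundary values extend outward)."""
    return data[min(max(i, 0), len(data) - 1)]

def lerp_lookup(data, x):
    """Linear interpolation at the continuous texel coordinate x."""
    i = math.floor(x)
    f = x - i
    return (1 - f) * texel(data, i) + f * texel(data, i + 1)

def cubic_direct(data, x):
    """Reference: explicit weighted sum over the four neighboring texels."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * texel(data, i - 1 + k) for k in range(4))

def cubic_via_linear(data, x):
    """Same result, expressed as a weighted sum of two linear lookups."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3      # combined weight of each texel pair
    x0 = (i - 1) + w1 / g0         # lookup between texels i-1 and i
    x1 = (i + 1) + w3 / g1         # lookup between texels i+1 and i+2
    return g0 * lerp_lookup(data, x0) + g1 * lerp_lookup(data, x1)
```

The lookup positions x0 and x1 depend nonlinearly on the query position, which is why differentiating through this formulation with respect to position is problematic.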
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.ad.Texture1f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
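The relationship between eval_fetch() and eval() can be sketched for the 1D case. This is a pure-Python illustration under stated assumptions, not the library's code: the function names fetch_1d and eval_1d are hypothetical, query positions are assumed to lie in the unit interval with texel centers at (i + 0.5) / resolution, and out-of-range indices are clamped as with WrapMode.Clamp.

```python
import math

def fetch_1d(data, pos):
    """Return the two texels a linear lookup would touch, plus their weights.

    Assumption: pos is in [0, 1] and texel i is centered at (i + 0.5) / len(data).
    """
    res = len(data)
    x = pos * res - 0.5              # continuous texel coordinate
    i0 = math.floor(x)
    w1 = x - i0                      # weight of the right-hand texel

    def clamp(i):                    # clamped wrapping at the boundaries
        return min(max(i, 0), res - 1)

    texels = (data[clamp(i0)], data[clamp(i0 + 1)])
    weights = (1.0 - w1, w1)
    return texels, weights

def eval_1d(data, pos):
    """Linear interpolation built on top of the fetch step."""
    (t0, t1), (w0, w1) = fetch_1d(data, pos)
    return w0 * t0 + w1 * t1
```

Separating the fetch from the interpolation is useful when a caller wants to apply its own weights or inspect the raw texel values.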
- eval_cubic(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
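As a rough 1D illustration of computing the gradient from explicitly differentiated basis functions and scaling it by the spatial extent, consider the sketch below. It is not the Dr.Jit implementation: all function names are hypothetical, and the unit-interval convention with texel centers at (i + 0.5) / resolution is an assumption.

```python
import math

def bspline_weights(t):
    """Uniform cubic B-spline basis weights for a fractional offset t."""
    return ((1 - t) ** 3 / 6,
            (3 * t**3 - 6 * t**2 + 4) / 6,
            (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6,
            t**3 / 6)

def bspline_dweights(t):
    """Analytic derivatives of the four basis weights with respect to t."""
    return (-(1 - t) ** 2 / 2,
            (3 * t**2 - 4 * t) / 2,
            (-3 * t**2 + 2 * t + 1) / 2,
            t**2 / 2)

def texel(data, i):
    """Clamped texel read."""
    return data[min(max(i, 0), len(data) - 1)]

def cubic_value(data, pos):
    """Cubic B-spline interpolant at pos in [0, 1] (assumed convention)."""
    res = len(data)
    x = pos * res - 0.5
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * texel(data, i - 1 + k) for k in range(4))

def cubic_grad(data, pos):
    """Positional gradient from the differentiated basis functions."""
    res = len(data)
    x = pos * res - 0.5
    i = math.floor(x)
    dw = bspline_dweights(x - i)
    # Chain rule: x = pos * res - 0.5, so d/dpos = res * d/dx.
    return res * sum(dw[k] * texel(data, i - 1 + k) for k in range(4))
```

The factor res in cubic_grad plays the role of the spatial extent: without it, the derivative would be expressed per texel rather than per unit of the query coordinate.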
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.ad.Texture2f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
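To make the notion of a "linearized 1D array" concrete for a 2D texture, the sketch below flattens nested (height, width, channels) data into a flat list. It assumes row-major ordering with channels varying fastest, which is an assumption made for illustration rather than something stated in this reference; the names linear_index and linearize are hypothetical.

```python
def linear_index(y, x, c, width, channels):
    """Flat index of channel c of the texel at row y, column x.

    Assumption: row-major layout, channels stored contiguously per texel.
    """
    return (y * width + x) * channels + c

def linearize(tensor):
    """Flatten nested [height][width][channels] lists into one 1D list."""
    return [value for row in tensor for texel in row for value in texel]
```

Under this layout, a texture of shape (height, width) with a given channel count holds height * width * channels entries in its linearized form.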
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.ad.Texture3f(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False, then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
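For a 3D texture, a linear lookup touches the eight texels at the corners of the enclosing cell. The sketch below shows how such fetched values could be combined manually into a trilinear result. It is a hypothetical illustration: the dictionary keyed by (dz, dy, dx) offsets and the function name trilinear are assumptions, and the order in which eval_fetch() returns its texels is not specified here.

```python
from itertools import product

def trilinear(corners, frac):
    """Blend eight corner values with trilinear weights.

    corners: dict mapping (dz, dy, dx) offsets in {0, 1} to a texel value
             (hypothetical layout chosen for this sketch).
    frac:    (fz, fy, fx) fractional position inside the cell, each in [0, 1].
    """
    fz, fy, fx = frac
    result = 0.0
    for dz, dy, dx in product((0, 1), repeat=3):
        # Per-axis weight: f toward the far corner, (1 - f) toward the near one.
        weight = ((fz if dz else 1.0 - fz)
                  * (fy if dy else 1.0 - fy)
                  * (fx if dx else 1.0 - fx))
        result += weight * corners[(dz, dy, dx)]
    return result
```

Because the eight weights sum to one, a function that is linear along each axis is reproduced exactly inside the cell.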
- eval_cubic(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.ad.Texture1f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
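As a rough illustration of the default filter_mode and wrap_mode settings described for the constructors, the following plain-Python sketch evaluates a 1D linear interpolant with clamped lookups. It does not use Dr.Jit. The helper names (clamp_index, eval_linear_1d) are hypothetical, and the convention that texel centers sit at (i + 0.5) / n is an assumption made for illustration.

```python
import math


def clamp_index(i, n):
    """Clamp a texel index to [0, n - 1], mimicking a clamp-style wrap mode."""
    return max(0, min(n - 1, i))


def eval_linear_1d(data, pos):
    """Linearly interpolate `data` at the normalized position `pos`.

    Assumes texel centers at (i + 0.5) / n (an illustrative convention).
    """
    n = len(data)
    x = pos * n - 0.5      # continuous texel coordinate
    i = math.floor(x)      # index of the left neighbor
    t = x - i              # fractional offset in [0, 1)
    left = data[clamp_index(i, n)]
    right = data[clamp_index(i + 1, n)]
    return (1.0 - t) * left + t * right


texels = [1.0, 3.0, 2.0, 5.0]
inside = eval_linear_1d(texels, 0.25)    # halfway between texels 0 and 1
outside = eval_linear_1d(texels, 1.75)   # past the boundary: edge value is extended
```

Queries outside the unit interval keep returning the boundary texel, which is the "indefinitely extends the colors on the boundary" behavior in one dimension.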
- set_value(self, value: drjit.llvm.ad.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether the texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
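To illustrate what fetching without interpolating means, this plain-Python sketch returns the two texels that a clamped 1D linear lookup would read and leaves the blending to the caller. The function name fetch_linear_1d, the extra returned weight, and the texel-center convention are assumptions for illustration only; the actual return layout of eval_fetch() is the one given by the signatures above.

```python
import math


def fetch_linear_1d(data, pos):
    """Return the two texels a clamped 1D linear lookup at `pos` would read,
    together with the interpolation weight of the second texel.

    Texel centers are assumed to sit at (i + 0.5) / n (illustrative only).
    """
    n = len(data)
    x = pos * n - 0.5
    i = math.floor(x)
    t = x - i
    lo = max(0, min(n - 1, i))
    hi = max(0, min(n - 1, i + 1))
    return [data[lo], data[hi]], t


texels = [1.0, 3.0, 2.0, 5.0]
fetched, t = fetch_linear_1d(texels, 0.5)

# The caller decides how to combine the raw texels, e.g. to reproduce the
# linear interpolant or to apply a custom filter instead.
blended = (1.0 - t) * fetched[0] + t * fetched[1]
```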
- eval_cubic(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
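The equivalence between the four-tap B-Spline sum and a weighted sum of linear lookups can be checked with a small plain-Python sketch in 1D. It is not Dr.Jit code: all function names are hypothetical, the coordinate x is assumed to be in texel units with texel centers at integer positions, and linear_lookup merely stands in for one hardware linear fetch.

```python
import math


def bspline_weights(a):
    """Uniform cubic B-spline weights for a fractional offset a in [0, 1)."""
    w0 = (1.0 - a) ** 3 / 6.0
    w1 = (3.0 * a**3 - 6.0 * a**2 + 4.0) / 6.0
    w2 = (-3.0 * a**3 + 3.0 * a**2 + 3.0 * a + 1.0) / 6.0
    w3 = a**3 / 6.0
    return w0, w1, w2, w3


def texel(data, i):
    """Clamped texel read."""
    return data[max(0, min(len(data) - 1, i))]


def linear_lookup(data, base, t):
    """Stand-in for one linear fetch between texels base and base + 1."""
    return (1.0 - t) * texel(data, base) + t * texel(data, base + 1)


def cubic_direct(data, x):
    """Four-tap evaluation straight from the basis functions."""
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * texel(data, i - 1 + k) for k in range(4))


def cubic_via_linear(data, x):
    """Same value obtained from two weighted linear lookups."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3   # weights of the two lookups (both positive)
    return (g0 * linear_lookup(data, i - 1, w1 / g0)
            + g1 * linear_lookup(data, i + 1, w3 / g1))
```

The lookup offsets w1 / g0 and w3 / g1 depend nonlinearly on the position, which is why differentiating through this formulation with respect to the position is problematic and a direct basis-function evaluation is used instead in that case.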
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
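A minimal 1D sketch of evaluating the gradient from explicitly differentiated basis functions is given below. It is plain Python with hypothetical names, assumes texel centers at (i + 0.5) / n, and interprets the spatial extent as the texture resolution n, which is an assumption about the scaling described above.

```python
import math


def bspline_weights(a):
    """Uniform cubic B-spline weights for a fractional offset a in [0, 1)."""
    return ((1.0 - a) ** 3 / 6.0,
            (3.0 * a**3 - 6.0 * a**2 + 4.0) / 6.0,
            (-3.0 * a**3 + 3.0 * a**2 + 3.0 * a + 1.0) / 6.0,
            a**3 / 6.0)


def bspline_weight_derivs(a):
    """Derivatives of the four weights with respect to the offset a."""
    return (-0.5 * (1.0 - a) ** 2,
            1.5 * a**2 - 2.0 * a,
            -1.5 * a**2 + a + 0.5,
            0.5 * a**2)


def texel(data, i):
    """Clamped texel read."""
    return data[max(0, min(len(data) - 1, i))]


def eval_cubic_1d(data, pos):
    """Cubic B-spline interpolant at the normalized position pos."""
    n = len(data)
    x = pos * n - 0.5
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * texel(data, i - 1 + k) for k in range(4))


def eval_cubic_grad_1d(data, pos):
    """Derivative of the interpolant with respect to pos."""
    n = len(data)
    x = pos * n - 0.5
    i = math.floor(x)
    dw = bspline_weight_derivs(x - i)
    grad_texel = sum(dw[k] * texel(data, i - 1 + k) for k in range(4))
    return grad_texel * n   # chain rule: d(x) / d(pos) = n
```

Comparing eval_cubic_grad_1d against a central finite difference of eval_cubic_1d is a quick way to sanity-check both the weights and the scaling factor.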
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
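In one dimension the hessian reduces to a single second derivative, which makes the role of the scaling easy to see. The plain-Python sketch below uses hypothetical names, assumes texel centers at (i + 0.5) / n, and again treats the resolution n as the spatial extent, so the second derivative picks up a factor of n squared.

```python
import math


def bspline_weights(a):
    """Uniform cubic B-spline weights for a fractional offset a in [0, 1)."""
    return ((1.0 - a) ** 3 / 6.0,
            (3.0 * a**3 - 6.0 * a**2 + 4.0) / 6.0,
            (-3.0 * a**3 + 3.0 * a**2 + 3.0 * a + 1.0) / 6.0,
            a**3 / 6.0)


def bspline_weight_second_derivs(a):
    """Second derivatives of the four weights with respect to a."""
    return (1.0 - a, 3.0 * a - 2.0, -3.0 * a + 1.0, a)


def texel(data, i):
    """Clamped texel read."""
    return data[max(0, min(len(data) - 1, i))]


def eval_cubic_1d(data, pos):
    """Cubic B-spline interpolant at the normalized position pos."""
    n = len(data)
    x = pos * n - 0.5
    i = math.floor(x)
    w = bspline_weights(x - i)
    return sum(w[k] * texel(data, i - 1 + k) for k in range(4))


def eval_cubic_hessian_1d(data, pos):
    """Second derivative of the interpolant with respect to pos."""
    n = len(data)
    x = pos * n - 0.5
    i = math.floor(x)
    d2w = bspline_weight_second_derivs(x - i)
    second = sum(d2w[k] * texel(data, i - 1 + k) for k in range(4))
    return second * n * n   # chain rule applied twice
```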
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array1f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.ad.Texture2f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether the texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array2f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
- class drjit.auto.ad.Texture3f64(*args, **kwargs)¶
- __init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None¶
- __init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) None
Overloaded function.
__init__(self, shape: collections.abc.Sequence[int], channels: int, use_accel: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Create a new texture with the specified size and channel count
On CUDA, this is a slow operation that synchronizes the GPU pipeline, so texture objects should be reused/updated via set_value() and set_tensor() as much as possible.
When use_accel is set to False in CUDA mode, the texture will not use hardware acceleration (allocation and evaluation). In other modes this argument has no effect.
The filter_mode parameter defines the interpolation method to be used in all evaluation routines. By default, the texture is linearly interpolated. Besides nearest/linear filtering, the implementation also provides a clamped cubic B-spline interpolation scheme in case a higher-order interpolation is needed. In CUDA mode, this is done using a series of linear lookups to optimally use the hardware (hence, linear filtering must be enabled to use this feature).
When evaluating the texture outside of its boundaries, the wrap_mode defines the wrapping method. The default behavior is drjit.WrapMode.Clamp, which indefinitely extends the colors on the boundary along each dimension.
__init__(self, tensor: drjit.llvm.ad.TensorXf64, use_accel: bool = True, migrate: bool = True, filter_mode: drjit.FilterMode = FilterMode.Linear, wrap_mode: drjit.WrapMode = WrapMode.Clamp) -> None
Construct a new texture from a given tensor.
This constructor allocates texture memory with the shape information deduced from tensor. It subsequently invokes set_tensor(tensor) to fill the texture memory with the provided tensor.
When both migrate and use_accel are set to True in CUDA mode, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage. Note that the texture is still differentiable even when migrated.
- set_value(self, value: drjit.llvm.ad.Float64, migrate: bool = False) None¶
Override the texture contents with the provided linearized 1D array.
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- set_tensor(self, tensor: drjit.llvm.ad.TensorXf64, migrate: bool = False) None¶
Override the texture contents with the provided tensor.
This method updates the values of all texels. Changing the texture resolution or its number of channels is also supported. However, on CUDA, such operations have a significantly larger overhead (the GPU pipeline needs to be synchronized for new texture objects to be created).
In CUDA mode, when both the argument migrate and use_accel() are True, the texture exclusively stores a copy of the input data as a CUDA texture to avoid redundant storage.
Note that the texture is still differentiable even when migrated.
- value(self) drjit.llvm.ad.Float64¶
Return the texture data as an array object
- tensor(self) drjit.llvm.ad.TensorXf64¶
Return the texture data as a tensor object
- filter_mode(self) drjit.FilterMode¶
Return the filter mode
- wrap_mode(self) drjit.WrapMode¶
Return the wrap mode
- use_accel(self) bool¶
Return whether the texture uses the GPU for storage and evaluation
- migrated(self) bool¶
Return whether textures with use_accel() set to True only store the data as a hardware-accelerated CUDA texture.
If False then a copy of the array data will additionally be retained.
- property shape¶
Return the texture shape
- eval(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Evaluate the linear interpolant represented by this texture.
When hardware-acceleration is not available, the numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_fetch(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float]]¶
- eval_fetch(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float16]]
- eval_fetch(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[list[drjit.llvm.ad.Float64]]
Fetch the texels that would be referenced in a texture lookup with linear interpolation without actually performing this interpolation.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float]¶
- eval_cubic(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float16]
- eval_cubic(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True), force_nonaccel: bool | None = False) list[drjit.llvm.ad.Float64]
Evaluate a clamped cubic B-Spline interpolant represented by this texture
Instead of interpolating the texture via B-Spline basis functions, the implementation transforms this calculation into an equivalent weighted sum of several linear interpolant evaluations. In CUDA mode, this can then be accelerated by hardware texture units, which runs faster than a naive implementation. More information can be found in:
GPU Gems 2, Chapter 20, “Fast Third-Order Texture Filtering” by Christian Sigg.
When the underlying grid data and the query position are differentiable, this transformation cannot be used as it is not linear with respect to position (thus the default AD graph gives incorrect results). The implementation calls the eval_cubic_helper() function to replace the AD graph with a direct evaluation of the B-Spline basis functions in that case.
The numerical precision of the interpolation is dictated by the floating point precision of the query point type.
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_grad(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient has been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple¶
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
- eval_cubic_hessian(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) tuple
Evaluate the positional gradient and hessian matrix of a cubic B-Spline
This implementation computes the result directly from explicit differentiated basis functions. It has no autodiff support.
The resulting gradient and hessian have been multiplied by the spatial extents to account for the transformation from the unit size volume to the size of its shape.
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float]¶
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f16, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float16]
- eval_cubic_helper(self, pos: drjit.llvm.ad.Array3f64, active: drjit.llvm.ad.Bool | None = Bool(True)) list[drjit.llvm.ad.Float64]
Helper function to evaluate a clamped cubic B-Spline interpolant
This is an implementation detail and should only be called by the eval_cubic() function to construct an AD graph. When only the cubic evaluation result is desired, the eval_cubic() function is faster than this simple implementation.
Random number generators¶
- class drjit.auto.ad.PCG32(*args, **kwargs)¶
Implementation of PCG32, a member of the PCG family of random number generators proposed by Melissa O’Neill.
PCG32 is a stateful pseudorandom number generator that combines a linear congruential generator (LCG) with a permutation function. It provides high statistical quality with a remarkably fast and compact implementation. Details on the PCG family of pseudorandom number generators can be found here.
To create random tensors of different sizes in Python, prefer the higher-level dr.rng() interface, which internally uses the Philox4x32 generator. The properties of PCG32 make it most suitable for Monte Carlo applications requiring long sequences of random variates.
Key properties of the PCG variant implemented here include:
Compact: 128 bits total state (64-bit state + 64-bit increment)
Output: 32-bit output with a period of 2^64 per stream
Streams: Multiple independent streams via the increment parameter (with caveats, see below)
Low-cost sample generation: a single 64 bit integer multiply-add plus a bit permutation applied to the output.
Extra features: provides fast multi-step advance/rewind functionality.
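The properties listed above (64-bit state plus 64-bit increment, one multiply-add per sample, a bit permutation on the output) can be made concrete with a scalar plain-Python sketch of the standard PCG32 XSH-RR scheme. This is not the Dr.Jit implementation: the class name ScalarPCG32 is hypothetical, and the mapping to floats via the top 24 bits is one common choice that is assumed here rather than taken from the reference.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1
PCG32_MULT = 6364136223846793005  # 64-bit LCG multiplier of the reference PCG32


class ScalarPCG32:
    """Scalar sketch of PCG32 (XSH-RR output function, 64-bit LCG state)."""

    def __init__(self, initstate=0x853C49E6748FEA9B, initseq=0xDA3E39CB94B95BDB):
        self.seed(initstate, initseq)

    def seed(self, initstate, initseq):
        self.state = 0
        self.inc = ((initseq << 1) | 1) & MASK64   # the increment must be odd
        self.next_uint32()
        self.state = (self.state + initstate) & MASK64
        self.next_uint32()

    def next_uint32(self):
        old = self.state
        # LCG step: a single 64-bit multiply-add
        self.state = (old * PCG32_MULT + self.inc) & MASK64
        # Output permutation: xorshift, then rotate right by the top 5 bits
        xorshifted = (((old >> 18) ^ old) >> 27) & MASK32
        rot = old >> 59
        return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & MASK32

    def next_float(self):
        """Map 32 random bits to [0, 1) using the top 24 bits (assumed mapping)."""
        return (self.next_uint32() >> 8) * (1.0 / (1 << 24))
```

Two generators seeded identically produce identical sequences, while a different initseq selects a different increment and hence a different stream.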
Caveats: PCG32 produces high-quality random variates within each random number stream. For a given initial state, PCG32 can also produce multiple output streams by specifying a different sequence increment (initseq) to the constructor. However, the level of statistical independence across streams is generally insufficient when doing so. To obtain a series of high-quality independent parallel streams, it is recommended to use another method (e.g., the Tiny Encryption Algorithm) to seed the state and inc parameters. This ensures independence both within and across streams.
In Python, the PCG32 class is implemented as a PyTree, which means that it is compatible with symbolic function calls, loops, etc.
Note
Please watch out for the following pitfall when using the PCG32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float()) changes the internal RNG state. If this state is never explicitly evaluated, the computation graph describing the state transformation keeps growing without bound, causing kernel compilation of increasingly large programs to eventually become a bottleneck. To evaluate the RNG, simply run
    rng: PCG32 = ....
    dr.eval(rng)
For computation involving very large arrays, storing the RNG state (16 bytes per entry) can be prohibitive. In this case, it is better to keep the RNG in symbolic form and re-seed it at every optimization iteration.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 API directly to seed once and reuse the random number generator throughout the loop.
The drjit.rng API avoids these pitfalls by eagerly evaluating the RNG state.
Comparison with Philox4x32:
PCG32: State-based, better for sequential generation, low per-sample cost.
Philox4x32: Counter-based, better for parallel generation, higher per-sample cost.
- __init__(self, size: int = 1, initstate: drjit.llvm.ad.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.ad.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
- __init__(self, arg: drjit.llvm.ad.PCG32) None
Overloaded function.
__init__(self, size: int = 1, initstate: drjit.llvm.ad.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.ad.UInt64 = UInt64(0xda3e39cb94b95bdb)) -> None
Initialize a random number generator that generates size variates in parallel.
The initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are based on the original implementation.
The implementation of this routine internally calls seed(), with one small twist. When multiple random numbers are being generated in parallel, the constructor adds an offset equal to drjit.arange(UInt64, size) to both initstate and initseq to de-correlate the generated sequences.
__init__(self, arg: drjit.llvm.ad.PCG32) -> None
Copy-construct a new PCG32 instance from an existing instance.
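The per-lane offset applied by the sized constructor can be pictured with a few lines of plain Python. The helper lane_seeds is hypothetical, and the wrap-around to 64 bits is an assumption based on the UInt64 argument types; the sketch only shows which (initstate, initseq) pair each parallel lane would start from.

```python
DEFAULT_STATE = 0x853C49E6748FEA9B
DEFAULT_SEQ = 0xDA3E39CB94B95BDB
MASK64 = (1 << 64) - 1


def lane_seeds(size, initstate=DEFAULT_STATE, initseq=DEFAULT_SEQ):
    """Per-lane (initstate, initseq) pairs after adding the offsets 0..size-1."""
    return [((initstate + i) & MASK64, (initseq + i) & MASK64)
            for i in range(size)]


seeds = lane_seeds(4)
# Lane 0 uses the defaults unchanged; lane i is shifted by i in both inputs.
```

As the caveats above point out, streams that differ only in such a simple offset are de-correlated but not guaranteed to be statistically independent.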
- seed(self, initstate: drjit.llvm.ad.UInt64 = UInt64(0x853c49e6748fea9b), initseq: drjit.llvm.ad.UInt64 = UInt64(0xda3e39cb94b95bdb)) None¶
Seed the random number generator with the given initial state and sequence ID.
The initstate and initseq inputs determine the initial state and increment of the linear congruential generator. Their default values are taken from the original implementation.
- next_float(self, dtype: type, mask: object = True) object¶
Generate a uniformly distributed floating point number of the requested precision on the interval \([0, 1)\).
The function analyzes the provided target dtype and either invokes next_float16(), next_float32() or next_float64() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
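The effect of the mask can be sketched with a toy multi-lane generator in plain Python. The class LaneRNG and its simple LCG are hypothetical stand-ins and not the PCG32 algorithm; the only point illustrated is that lanes whose mask entry is False keep their state untouched.

```python
class LaneRNG:
    """Toy multi-lane generator; each lane keeps its own 64-bit LCG state."""

    MULT = 6364136223846793005
    INC = 1442695040888963407
    MASK64 = (1 << 64) - 1

    def __init__(self, size):
        self.state = [(i + 1) & self.MASK64 for i in range(size)]

    def next_float(self, mask=None):
        if mask is None:
            mask = [True] * len(self.state)
        out = []
        for i, active in enumerate(mask):
            if active:
                # Only active lanes advance their state.
                self.state[i] = (self.state[i] * self.MULT + self.INC) & self.MASK64
                out.append((self.state[i] >> 40) * (1.0 / (1 << 24)))
            else:
                # Placeholder: values of masked lanes should not be relied on.
                out.append(0.0)
        return out


rng = LaneRNG(3)
before = list(rng.state)
values = rng.next_float([True, False, True])
```

After the call, lane 1 is still at its initial state, so a later unmasked call resumes that lane exactly where it left off.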
- next_float16(self) drjit.llvm.ad.Float16¶
- next_float16(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float16
Overloaded function.
next_float16(self) -> drjit.llvm.ad.Float16
Generate a uniformly distributed half precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float16(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float16
- next_float32(self) drjit.llvm.ad.Float¶
- next_float32(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float
Overloaded function.
next_float32(self) -> drjit.llvm.ad.Float
Generate a uniformly distributed single precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float32(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float
- next_float64(self) drjit.llvm.ad.Float64¶
- next_float64(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float64
Overloaded function.
next_float64(self) -> drjit.llvm.ad.Float64
Generate a uniformly distributed double precision floating point number on the interval \([0, 1)\).
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float64(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float64
- next_float_normal(self, dtype: type, mask: object = True) object¶
Generate a (standard) normally distributed floating point number of the requested precision.
The function analyzes the provided target dtype and invokes next_float16_normal(), next_float32_normal(), or next_float64_normal() depending on the requested precision.
A mask can be optionally provided. Masked entries do not advance the PRNG state.
- next_float16_normal(self) drjit.llvm.ad.Float16¶
- next_float16_normal(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float16
Overloaded function.
next_float16_normal(self) -> drjit.llvm.ad.Float16
Generate a (standard) normally distributed half precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float16_normal(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float16
- next_float32_normal(self) drjit.llvm.ad.Float¶
- next_float32_normal(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float
Overloaded function.
next_float32_normal(self) -> drjit.llvm.ad.Float
Generate a (standard) normally distributed single precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float32_normal(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float
- next_float64_normal(self) drjit.llvm.ad.Float64¶
- next_float64_normal(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.Float64
Overloaded function.
next_float64_normal(self) -> drjit.llvm.ad.Float64
Generate a (standard) normally distributed double precision floating point number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_float64_normal(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.Float64
- next_uint32(self) drjit.llvm.ad.UInt¶
- next_uint32(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.UInt
Overloaded function.
next_uint32(self) -> drjit.llvm.ad.UInt
Generate a uniformly distributed unsigned 32-bit random number.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_uint32(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.UInt
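The masked behavior can be pictured with a small lane-wise model in pure Python. The function next_uint32_masked below is a hypothetical stand-in that operates on plain lists rather than Dr.Jit arrays. The text does not say what value a masked lane returns, so the sketch assumes that it reports the value the lane would have produced while leaving its state untouched.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 0x5851F42D4C957F2D  # LCG multiplier of the reference PCG32


def pcg32_output(state):
    """PCG32 output permutation applied to a 64-bit state."""
    xorshifted = (((state >> 18) ^ state) >> 27) & 0xFFFFFFFF
    rot = state >> 59
    return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & 0xFFFFFFFF


def next_uint32_masked(states, incs, mask):
    """Draw one value per lane; only lanes with mask[i] == True advance."""
    out = []
    for lane, active in enumerate(mask):
        out.append(pcg32_output(states[lane]))
        if active:
            # Masked-out lanes skip this update and keep their old state.
            states[lane] = (states[lane] * PCG32_MULT + incs[lane]) & MASK64
    return out
```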
- next_uint64(self) drjit.llvm.ad.UInt64¶
- next_uint64(self, arg: drjit.llvm.ad.Bool, /) drjit.llvm.ad.UInt64
Overloaded function.
next_uint64(self) -> drjit.llvm.ad.UInt64
Generate a uniformly distributed unsigned 64-bit random number.
Internally, the function calls next_uint32() twice.
Two overloads of this function exist: the masked variant does not advance the PRNG state of entries i where mask[i] == False.
next_uint64(self, arg: drjit.llvm.ad.Bool, /) -> drjit.llvm.ad.UInt64
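A minimal sketch of building a 64-bit value from two 32-bit draws is shown below. The helper takes any callable that returns 32-bit integers. Placing the first draw in the low half is an assumption, since the text does not specify the order.

```python
def next_uint64(next_uint32):
    """Combine two consecutive 32-bit draws into one 64-bit value."""
    lo = next_uint32()  # first draw: low 32 bits (assumed order)
    hi = next_uint32()  # second draw: high 32 bits
    return lo | (hi << 32)
```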
- next_uint32_bounded(self, bound: int, mask: drjit.llvm.ad.Bool = Bool(True)) drjit.llvm.ad.UInt¶
Generate a uniformly distributed 32-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
- next_uint64_bounded(self, bound: int, mask: drjit.llvm.ad.Bool = Bool(True)) drjit.llvm.ad.UInt64¶
Generate a uniformly distributed 64-bit integer number on the interval \([0, \texttt{bound})\).
To ensure an unbiased result, the implementation relies on an iterative scheme that typically finishes after 1-2 iterations.
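The iterative scheme used by the bounded variants is not spelled out here. A standard way to remove modulo bias, used by the reference PCG implementation, is rejection sampling: draws below a small threshold are discarded so that the remaining range is an exact multiple of bound. The sketch below assumes this approach and takes a hypothetical next_uint32 callable as its source of random bits.

```python
def next_uint32_bounded(next_uint32, bound):
    """Return an unbiased integer in [0, bound) from a 32-bit source."""
    if bound <= 0:
        raise ValueError("bound must be positive")
    # 2^32 mod bound: the number of "excess" values that would bias r % bound.
    threshold = (1 << 32) % bound
    while True:
        r = next_uint32()
        # Accept only draws from a range whose size is a multiple of bound.
        if r >= threshold:
            return r % bound
```

The rejection probability is always below 1/2, so only a small number of iterations is expected on average.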
- __add__(self, arg: drjit.llvm.ad.Int64, /) drjit.llvm.ad.PCG32¶
Advance the pseudorandom number generator.
This function implements a multi-step advance function that is equivalent to (but more efficient than) calling the random number generator arg times in sequence.
This is useful to advance a newly constructed PRNG to a certain known state.
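A multi-step advance of a linear congruential generator can be computed in a logarithmic number of steps by repeatedly squaring the affine state update. The pure-Python sketch below shows this well-known technique for the 64-bit LCG underlying PCG32. The function names are hypothetical, and the sketch is not claimed to be Dr.Jit's exact implementation.

```python
MASK64 = (1 << 64) - 1
PCG32_MULT = 0x5851F42D4C957F2D  # LCG multiplier of the reference PCG32


def lcg_step(state, inc):
    """One ordinary state update: state -> state * MULT + inc (mod 2^64)."""
    return (state * PCG32_MULT + inc) & MASK64


def lcg_advance(state, inc, delta):
    """Apply lcg_step `delta` times using O(log delta) multiplications."""
    delta &= MASK64                      # negative deltas wrap around the period
    cur_mult, cur_plus = PCG32_MULT, inc # affine map for 2^k steps
    acc_mult, acc_plus = 1, 0            # accumulated affine map (identity)
    while delta > 0:
        if delta & 1:
            acc_mult = (acc_mult * cur_mult) & MASK64
            acc_plus = (acc_plus * cur_mult + cur_plus) & MASK64
        # Compose the current map with itself to double its step count.
        cur_plus = ((cur_mult + 1) * cur_plus) & MASK64
        cur_mult = (cur_mult * cur_mult) & MASK64
        delta >>= 1
    return (acc_mult * state + acc_plus) & MASK64
```

With an odd increment the generator has period \(2^{64}\), so advancing by a negative amount (reduced modulo \(2^{64}\)) rewinds the state.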
- __iadd__(self, arg: drjit.llvm.ad.Int64, /) drjit.llvm.ad.PCG32¶
In-place addition operator based on __add__().
- __sub__(self, arg: drjit.llvm.ad.Int64, /) drjit.llvm.ad.PCG32¶
- __sub__(self, arg: drjit.llvm.ad.PCG32, /) drjit.llvm.ad.Int64
Overloaded function.
__sub__(self, arg: drjit.llvm.ad.Int64, /) -> drjit.llvm.ad.PCG32
Rewind the pseudorandom number generator.
This function implements the opposite of __add__ to step a PRNG backwards. It can also compute the difference (as counted by the number of internal next_uint32 steps) between two PCG32 instances. This assumes that the two instances were consistently seeded.
__sub__(self, arg: drjit.llvm.ad.PCG32, /) -> drjit.llvm.ad.Int64
- __isub__(self, arg: drjit.llvm.ad.Int64, /) drjit.llvm.ad.PCG32¶
In-place subtraction operator based on __sub__().
- property inc¶
Sequence increment of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- property state¶
Sequence state of the PCG32 PRNG (an unsigned 64-bit integer or integer array). Please see the original paper for details on this field.
- class drjit.auto.ad.Philox4x32(*args, **kwargs)¶
Philox4x32 counter-based PRNG
This class implements the Philox 4x32 counter-based pseudo-random number generator based on the paper Parallel Random Numbers: As Easy as 1, 2, 3 by Salmon et al. [2011]. It uses strength-reduced cryptographic primitives to realize a complex transition function that turns a seed and a set of counter values into 4 pseudorandom outputs. Incrementing any of the counters or choosing a different seed produces statistically independent samples.
The implementation here uses a reduced number of bits (32) for the arithmetic and sets the default number of rounds to 7. However, even with these simplifications it passes the stringent BigCrush tests of TestU01 (a battery of statistical tests for non-uniformity and correlations). Please see the paper Random number generators for massively parallel simulations on GPU by Manssen et al. [2012] for details.
Functions like next_uint32x4() or next_float32x4() advance the PRNG state by incrementing the counter ctr[3].
Key properties include:
Counter-based design: generation from counter + key
192-bit state: 4x32-bit counters, 64-bit key
Trivial jump-ahead capability through counter manipulation
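To make the counter-based design concrete, the following pure-Python sketch implements the Philox4x32 round structure from Salmon et al. using the constants of the Random123 reference code. It is an illustration under stated assumptions, not Dr.Jit's implementation: the function names are hypothetical, and splitting the 64-bit seed into a low and a high key word in this order is an assumption.

```python
MASK32 = 0xFFFFFFFF
M0, M1 = 0xD2511F53, 0xCD9E8D57  # round multipliers (Random123 reference)
W0, W1 = 0x9E3779B9, 0xBB67AE85  # key schedule increments (Random123 reference)


def philox_round(ctr, key):
    """One Philox4x32 round: two 32x32 -> 64 bit multiplies plus XORs."""
    p0 = M0 * ctr[0]
    p1 = M1 * ctr[2]
    hi0, lo0 = p0 >> 32, p0 & MASK32
    hi1, lo1 = p1 >> 32, p1 & MASK32
    return [hi1 ^ ctr[1] ^ key[0], lo1, hi0 ^ ctr[3] ^ key[1], lo0]


def philox4x32(seed, counter, rounds=7):
    """Map a 64-bit seed and four 32-bit counters to four 32-bit outputs."""
    key = [seed & MASK32, (seed >> 32) & MASK32]  # assumed key word order
    ctr = list(counter)
    for r in range(rounds):
        if r > 0:
            # Bump the key between rounds.
            key = [(key[0] + W0) & MASK32, (key[1] + W1) & MASK32]
        ctr = philox_round(ctr, key)
    return ctr


def nth_block(seed, counter_0, n, counter_1=0, counter_2=0):
    # Jump-ahead: the n-th block of 4 outputs is obtained by setting ctr[3] = n.
    return philox4x32(seed, [counter_0, counter_1, counter_2, n & MASK32])
```

No state has to be carried from one block to the next: any block can be computed directly from the seed and the counter values, which is what makes jump-ahead trivial.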
The Philox4x32 class is implemented as a PyTree, making it compatible with symbolic function calls, loops, etc.
Note
Philox4x32 naturally produces 4 samples at a time, which may be awkward for applications that need individual random values.
Note
For a comparison of use cases between Philox4x32 and PCG32, see the PCG32 class documentation. In brief: use PCG32 for sequential generation with lowest cost per sample; use Philox4x32 for parallel generation where independent streams are critical.
Note
Please watch out for the following pitfall when using the Philox4x32 class in long-running Dr.Jit calculations (e.g., steps of a gradient-based optimizer). Consuming random variates (e.g., through next_float32x4()) changes the internal RNG counter value. If this state is never explicitly evaluated, the computation graph describing this change keeps growing, and kernel compilation of increasingly large programs eventually becomes a bottleneck. The drjit.rng API avoids this pitfall by eagerly evaluating the RNG counter when needed.
In cases where a sampler is repeatedly used in a symbolic loop, it is more efficient to use the PCG32 PRNG with its lower per-sample cost. You can seed this generator once and reuse it throughout the loop.
- __init__(self, seed: drjit.llvm.ad.UInt64, counter_0: drjit.llvm.ad.UInt, counter_1: drjit.llvm.ad.UInt = 0, counter_2: drjit.llvm.ad.UInt = 0, iterations: int = 7) None¶
- __init__(self, arg: drjit.llvm.ad.Philox4x32) None
Overloaded function.
__init__(self, seed: drjit.llvm.ad.UInt64, counter_0: drjit.llvm.ad.UInt, counter_1: drjit.llvm.ad.UInt = 0, counter_2: drjit.llvm.ad.UInt = 0, iterations: int = 7) -> None
Initialize a Philox4x32 random number generator.
The function takes a seed and three of the four counter components. The last component is zero-initialized and incremented by calls to the next_* methods.
- Parameters:
seed – The 64-bit seed value used as the key for the mapping
counter_0 – The first 32-bit counter value (least significant)
counter_1 – The second 32-bit counter value (default: 0)
counter_2 – The third 32-bit counter value (default: 0)
iterations – Number of rounds to apply (default: 7, range: 4-10)
For parallel stream generation, simply use different counter values: each combination of counter values produces an independent random stream.
__init__(self, arg: drjit.llvm.ad.Philox4x32) -> None
Copy constructor
- next_uint32x4(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array4u¶
Generate 4 random 32-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 4 independent 32-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random 32-bit unsigned integers
- next_uint64x2(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array2u64¶
Generate 2 random 64-bit unsigned integers.
Advances the internal counter and applies the Philox mapping to produce 2 independent 64-bit random values.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random 64-bit unsigned integers
- next_float16x4(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array4f16¶
Generate 4 random half-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to half precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float32x4(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array4f¶
Generate 4 random single-precision floats in \([0, 1)\).
Generates 4 random 32-bit unsigned integers and converts them to single precision floats that are uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats on the half-open interval \([0, 1)\)
- next_float64x2(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array2f64¶
Generate 2 random double-precision floats in \([0, 1)\).
Generates 2 random 64-bit unsigned integers and converts them to floats uniformly distributed on the half-open interval \([0, 1)\).
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats on the half-open interval \([0, 1)\)
- next_float16x4_normal(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array4f16¶
Generate 4 normally distributed half-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 half precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float32x4_normal(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array4f¶
Generate 4 normally distributed single-precision floats
Advances the internal counter and applies the Philox mapping to produce 4 single precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 4 random floats from a standard normal distribution
- next_float64x2_normal(self, mask: drjit.llvm.ad.Bool = True) drjit.llvm.ad.Array2f64¶
Generate 2 normally distributed double-precision floats
Advances the internal counter and applies the Philox mapping to produce 2 double precision floats following a standard normal distribution.
- Parameters:
mask – Optional mask to control which lanes are updated
- Returns:
Array of 2 random floats from a standard normal distribution
- property seed¶
(self) -> drjit.llvm.ad.Array2u
- property counter¶
(self) -> drjit.llvm.ad.Array4u
- property iterations¶
(self) -> int