Textures¶
Dr.Jit further provides the ability to perform texture sampling on array types. The Python interface exposes half, single, and double-precision floating-point textures in 1, 2, and 3 dimensions. Tensors can be supplied to initialize these textures:
import drjit as dr
n_channels = 3
tensor = dr.full(dr.cuda.TensorXf, 2, shape=[1024, 768, n_channels])
tex = dr.cuda.Texture2f(tensor)
In C++, the template class dr::Texture can be instantiated with any Dr.Jit array or scalar floating-point type, along with the associated number of dimensions:
using Float = dr::CUDAArray<float>;
size_t shape[2] = { 1024, 768 };
dr::Texture<Float, 2> tex(shape, 3);
Given an array of texture coordinates \(p_i \in [0,1]^d\), we can sample a texture of \(d\) dimensions at positions \(p_i\) using the eval() function:
tex = dr.cuda.Texture2f(tensor)
pos = dr.cuda.Array2f([0.25, 0.5, 0.9], [0.1, 0.3, 0.5])
out = tex.eval(pos)
The texture filtering and wrap modes used for interpolation are specified during initialization:
tex = dr.cuda.Texture2f(tensor_data, filter_mode=dr.FilterMode.Linear,
wrap_mode=dr.WrapMode.Repeat)
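To build intuition for what eval() computes, here is a pure-Python sketch of bilinear filtering for a single-channel 2D texture with the Clamp and Repeat wrap modes. The texel-center convention and edge handling below are assumptions based on standard GPU texturing, not Dr.Jit's exact implementation:

```python
import math

def wrap(i, n, mode):
    # Map an integer texel index into [0, n) according to the wrap mode
    if mode == "repeat":
        return i % n                  # Repeat: coordinates tile periodically
    return min(max(i, 0), n - 1)      # Clamp: edge texels are extended

def bilinear_eval(data, w, h, x, y, mode="clamp"):
    # data: row-major list of floats (single channel), (x, y) in [0, 1]^2
    # Texel centers assumed at ((i + 0.5)/w, (j + 0.5)/h)
    fx, fy = x * w - 0.5, y * h - 0.5
    x0, y0 = math.floor(fx), math.floor(fy)
    tx, ty = fx - x0, fy - y0
    def texel(i, j):
        return data[wrap(j, h, mode) * w + wrap(i, w, mode)]
    top = (1 - tx) * texel(x0, y0)     + tx * texel(x0 + 1, y0)
    bot = (1 - tx) * texel(x0, y0 + 1) + tx * texel(x0 + 1, y0 + 1)
    return (1 - ty) * top + ty * bot
```

A constant texture returns the constant under either wrap mode, while a ramp is reproduced linearly away from the borders; only lookups near the boundary distinguish Clamp from Repeat.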
Moreover, the eval_cubic() function provides an independent interface for sampling a texture using a clamped cubic B-spline interpolant.
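The weights underlying such an interpolant can be sketched in 1D as follows; the clamped boundary handling and texel-center convention here are illustrative assumptions rather than Dr.Jit's exact code path:

```python
import math

def bspline_weights(t):
    # Uniform cubic B-spline basis weights for a fractional offset t in [0, 1)
    t2, t3 = t * t, t * t * t
    return ((1 - t) ** 3 / 6.0,
            (3 * t3 - 6 * t2 + 4) / 6.0,
            (-3 * t3 + 3 * t2 + 3 * t + 1) / 6.0,
            t3 / 6.0)

def cubic_eval_1d(data, x):
    # Sample a 1D texture at x in [0, 1] with a clamped cubic B-spline
    n = len(data)
    f = x * n - 0.5                   # texel centers assumed at (i + 0.5)/n
    i = math.floor(f)
    w = bspline_weights(f - i)
    texel = lambda k: data[min(max(k, 0), n - 1)]  # clamp at the borders
    return sum(wk * texel(i - 1 + k) for k, wk in enumerate(w))
```

The four weights always sum to one, so a constant texture is reproduced exactly; unlike linear filtering, the interpolant smooths the data rather than passing through the texel values.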
Hardware acceleration¶
Dr.Jit textures targeting the CUDA backend can benefit from hardware-accelerated
texture lookups. Internally, textures initialized with use_accel=True
(which is enabled by default) will create an associated CUDA texture object
that leverages GPU hardware intrinsics to perform sampling
tex = dr.cuda.Texture2f(tensor_data, use_accel=True)
Note
Only single and half-precision floating-point CUDA texture objects are supported. Double-precision textures can be initialized but won't benefit from hardware acceleration.
Warning
Hardware-accelerated lookups use a 9-bit fixed-point format with 8 bits of fractional precision to store the weights used for linear interpolation. See the CUDA programming guide for more details.
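The practical impact can be sketched by quantizing the interpolation weight to 8 fractional bits; this is a simplified model of the format, not the GPU's exact rounding behavior:

```python
def lerp(a, b, t):
    # Full-precision linear interpolation, for comparison
    return (1 - t) * a + t * b

def quantized_lerp(a, b, t):
    # The hardware stores the lerp weight in 1.8 fixed point:
    # 8 fractional bits, i.e. the weight is rounded to a multiple of 1/256
    tq = round(t * 256) / 256.0
    return (1 - tq) * a + tq * b
```

The weight error is thus at most 1/512, which is invisible for typical 8-bit content but can matter when interpolating high-precision data or resolving small gradients.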
Migration¶
When CUDA texture objects aren't used, the underlying storage of a Dr.Jit texture is simply a tensor:
tex = dr.cuda.Texture2f(tensor_data, use_accel=False)
tensor_data = tex.tensor()
array_data = tex.value()
However, hardware-accelerated Dr.Jit textures can retain both a copy of the data as a CUDA texture object and as a tensor by disabling migration:
tex = dr.cuda.Texture2f(tensor_data, use_accel=True, migrate=False)
While texture initialization defaults to migrate=True to minimize redundant storage, note that fetching the tensor() or value() data requires converting the CUDA texture object back into a tensor. A side effect of these function calls is therefore to disable migration.
Automatic differentiation¶
Suppose we want to compute the gradient of a lookup with respect to the input tensor of a texture:
import drjit as dr
N = 3
TensorXf = dr.cuda.ad.TensorXf
Texture1f = dr.cuda.ad.Texture1f
Array1f = dr.cuda.ad.Array1f
tensor = TensorXf([3,5,8], shape=(N, 1))
dr.enable_grad(tensor)
tex = Texture1f(tensor)
pos = Array1f(0.4)
out = Array1f(tex.eval(pos))
dr.backward(out)
grad = dr.grad(tensor)
In order to propagate gradients, the associated AD graph needs to track the coordinate wrapping, texel fetching, and filtering operations performed on the underlying tensor as part of sampling. Hardware-accelerated textures rely on GPU intrinsics for these steps, yet they are still differentiable: while the primal lookup operation is hardware-accelerated, a subsequent non-accelerated lookup is additionally performed solely to record each individual operation into the AD graph. Importantly, computing gradients does not require disabling migration, and the texture data can continue to be stored exclusively as a CUDA texture object.
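The chain of operations recorded by the AD graph can be reproduced by hand for a 1D linear-filtering lookup like the example above. The sketch below assumes clamp wrapping and texel centers at (i + 0.5)/n; it illustrates the principle rather than Dr.Jit's exact code path:

```python
import math

def linear_eval_with_grad(data, x):
    # Forward pass: 1D linear filtering; backward pass: d(out)/d(texel) by hand
    n = len(data)
    f = x * n - 0.5                   # texel centers assumed at (i + 0.5)/n
    i = math.floor(f)
    t = f - i
    i0 = min(max(i, 0), n - 1)        # clamp wrap mode
    i1 = min(max(i + 1, 0), n - 1)
    out = (1 - t) * data[i0] + t * data[i1]
    grad = [0.0] * n                  # gradient with respect to each texel
    grad[i0] += 1 - t
    grad[i1] += t
    return out, grad
```

Under these assumptions, sampling the tensor [3, 5, 8] at position 0.4 blends texels 0 and 1 with weights 0.3 and 0.7, giving the value 4.4 and the gradient [0.3, 0.7, 0.0]: the interpolation weights are exactly the partial derivatives with respect to the texel values.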