Fast#

rms_norm(x, weight, eps, *[, stream])

Root Mean Square normalization (RMS norm).

layer_norm(x, weight, bias, eps, *[, stream])

Layer normalization.

rope(a, dims, *, traditinoal, base, scale, ...)

Apply rotary positional encoding to the input.

scaled_dot_product_attention(q, k, v, *, scale)

A fast implementation of multi-head attention: O = softmax(Q @ K.T, dim=-1) @ V.