Fast

Fast#

rms_norm(x, weight, eps, *[, stream])

Root Mean Square normalization (RMS norm).

layer_norm(x, weight, bias, eps, *[, stream])

Layer normalization.

rope(a, dims, *, traditional, base, scale, ...)

Apply rotary positional encoding to the input.

scaled_dot_product_attention(q, k, v, *, scale)

A fast implementation of multi-head attention: O = softmax(Q @ K.T, dim=-1) @ V.

affine_quantize(w, /, scales, biases[, ...])

Quantize the matrix w using the provided scales and biases and the group_size and bits configuration.

metal_kernel(name, input_names, ...[, ...])

A jit-compiled custom Metal kernel defined from a source string.