Function pack_a 

pub unsafe fn pack_a<T: TropicalScalar>(
    m: usize,
    k: usize,
    a: *const T,
    lda: usize,
    layout: Layout,
    trans: Transpose,
    packed: *mut T,
    mr: usize,
)

Pack a panel of matrix A into a contiguous buffer.

The packed format splits the m rows of the panel into blocks of mr rows and stores each block in column-major order, so each column of a block occupies mr consecutive elements. This improves cache locality during the microkernel computation.

§Layout

For A with dimensions m×k:

Original A (row-major, m=6, k=4, mr=4):
[ a00 a01 a02 a03 ]
[ a10 a11 a12 a13 ]
[ a20 a21 a22 a23 ]
[ a30 a31 a32 a33 ]
[ a40 a41 a42 a43 ]
[ a50 a51 a52 a53 ]

Packed (column-major within mr×k blocks; the last block is padded to mr rows, shown as 0):
Block 0 (rows 0-3): a00 a10 a20 a30 | a01 a11 a21 a31 | a02 a12 a22 a32 | a03 a13 a23 a33
Block 1 (rows 4-5): a40 a50 0   0   | a41 a51 0   0   | a42 a52 0   0   | a43 a53 0   0
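
The layout above can be reproduced with a small safe sketch. This is not the crate's implementation: `pack_a_sketch` is a hypothetical name, it handles only the row-major, non-transposed case, and it takes the padding value as an explicit `pad` argument because the value this crate writes into padded slots is not stated here.

```rust
/// Hypothetical, safe illustration of the packed layout (not the crate's `pack_a`).
/// `a` is read as row-major with leading dimension `lda`; rows past `m` in the
/// last block are filled with `pad`.
fn pack_a_sketch<T: Copy>(
    m: usize,
    k: usize,
    a: &[T],
    lda: usize,
    mr: usize,
    pad: T,
) -> Vec<T> {
    assert!(mr > 0, "mr must be non-zero");
    assert!(lda >= k, "row-major input needs lda >= k");
    assert!(m == 0 || a.len() >= (m - 1) * lda + k, "input slice too short");

    // Number of mr-row blocks, rounding up so a partial block is still stored.
    let blocks = (m + mr - 1) / mr;
    let mut packed = Vec::with_capacity(blocks * mr * k);

    for block in 0..blocks {
        let row0 = block * mr;
        // Within a block, walk column by column, emitting mr rows per column.
        for col in 0..k {
            for r in 0..mr {
                let row = row0 + r;
                packed.push(if row < m { a[row * lda + col] } else { pad });
            }
        }
    }
    packed
}
```

With m=6, k=4, mr=4 and an input where element (i, j) is `10 * i + j`, the first eight packed values are 0, 10, 20, 30, 1, 11, 21, 31, and block 1 starts at index 16 with 40, 50, pad, pad.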

§Safety

  • a must point to valid memory for at least m * lda elements
  • packed must have capacity for at least ((m + mr - 1) / mr) * mr * k elements (integer division: m rounded up to a multiple of mr, times k)
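
The capacity requirement can be computed before allocating the destination buffer. The helper below is a sketch under assumptions: `packed_len` is a hypothetical name, not part of this crate, and it treats mr = 0 as an error even though that condition is not listed above.

```rust
/// Hypothetical helper: number of elements `packed` must hold for an m×k panel
/// packed in blocks of `mr` rows. Returns `None` if `mr` is zero or the size
/// overflows `usize`.
fn packed_len(m: usize, k: usize, mr: usize) -> Option<usize> {
    if mr == 0 {
        return None;
    }
    // Ceiling division written so that it cannot overflow for large m.
    let blocks = m / mr + usize::from(m % mr != 0);
    blocks.checked_mul(mr)?.checked_mul(k)
}
```

For the example in the Layout section (m=6, k=4, mr=4) this gives 2 * 4 * 4 = 32 elements. A caller could allocate a `Vec` of that length and pass its `as_mut_ptr()` as `packed`.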