pub unsafe fn pack_a<T: TropicalScalar>(
    m: usize,
    k: usize,
    a: *const T,
    lda: usize,
    layout: Layout,
    trans: Transpose,
    packed: *mut T,
    mr: usize,
)
Pack a panel of matrix A into a contiguous buffer.
The packed format stores the panel's m rows in column-major order within
blocks of mr rows, padding a final partial block to a full mr rows. This
improves cache locality during the microkernel computation.
§Layout
For A with dimensions m×k:
Original A (row-major, m=6, k=4, mr=4):
[ a00 a01 a02 a03 ]
[ a10 a11 a12 a13 ]
[ a20 a21 a22 a23 ]
[ a30 a31 a32 a33 ]
[ a40 a41 a42 a43 ]
[ a50 a51 a52 a53 ]
Packed (column-major within mr×k blocks):
Block 0 (rows 0-3): a00 a10 a20 a30 | a01 a11 a21 a31 | a02 a12 a22 a32 | a03 a13 a23 a33
Block 1 (rows 4-5): a40 a50 0 0 | a41 a51 0 0 | a42 a52 0 0 | a43 a53 0 0
§Safety
- `a` must point to valid memory for at least `m * lda` elements
- `packed` must have capacity for at least `((m + mr - 1) / mr) * mr * k` elements
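The packed layout and the capacity requirement above can be checked with a small safe-Rust sketch. This is not the crate's implementation: `pack_a_reference` and `demo` are hypothetical names, the sketch assumes row-major, non-transposed input (it ignores `layout` and `trans`), and it takes the padding value as an explicit `pad` argument because the value the real function pads with for a given `TropicalScalar` is not stated here.

```rust
/// Hypothetical safe reference for the packed layout (not the crate's `pack_a`).
/// Assumes a row-major, non-transposed `a` with leading dimension `lda`.
fn pack_a_reference<T: Copy>(
    m: usize,
    k: usize,
    a: &[T],
    lda: usize,
    mr: usize,
    pad: T, // assumption: padding value supplied by the caller
) -> Vec<T> {
    assert!(mr > 0 && lda >= k);
    assert!(m == 0 || a.len() >= (m - 1) * lda + k);

    // Number of mr-row blocks, rounding up: ceil(m / mr).
    let blocks = (m + mr - 1) / mr;
    // Same size as the capacity bound in the Safety section.
    let mut packed = Vec::with_capacity(blocks * mr * k);

    for block in 0..blocks {
        let row0 = block * mr;
        // Within a block, walk column by column, then down the mr rows.
        for col in 0..k {
            for r in 0..mr {
                let row = row0 + r;
                packed.push(if row < m { a[row * lda + col] } else { pad });
            }
        }
    }
    packed
}

/// Packs the 6x4 example from the Layout section with mr = 4.
/// Element a[i][j] is encoded as 10 * i + j, so a53 is 53.
fn demo() -> Vec<i32> {
    let (m, k, mr) = (6usize, 4usize, 4usize);
    let a: Vec<i32> = (0..m)
        .flat_map(|i| (0..k).map(move |j| (10 * i + j) as i32))
        .collect();
    pack_a_reference(m, k, &a, k, mr, 0)
}
```

Calling `demo()` yields 32 elements, matching `((6 + 4 - 1) / 4) * 4 * 4`. The first four are a00, a10, a20, a30, and the second block starts with a40, a50 followed by two padding entries.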