Problem in Fortran Matrix Offloading: Transpose Example

02 Dec, 2025

Fortran stores matrix in Column-Major ordering, while normal backends stores matrix in Row-Major ordering, which will brings trouble. I will use matrix transpose as an example to explain.

1 Problem

Consider following example:

In Fortran, we create a matrix $A$, print it, stores in memory and returns a pointer $p_F$.
Pass $p_{F}$ to a different backend, suppose C++, do a transpose and stores the result in $p_{c}$
Back to Fortran, recover data from $p_{C}$, print it.

Question: will step 3 print the correct result as we expected?

2 Case 1: $N \times N$ Matrix

Suppose the original matrix is $3 \times 3$ matrix:

$$ A_{Fortran} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \tag{Printed in Step 1} $$

When Fortran stores it in memory, it will become something like:

$$ [3, 4, 7, 2, 5, 8, 3, 6, 9] \tag{Memory Stored from Fortran} $$

Pass to C++ backend, when recovering, C++ would think it is a matrix like:

$$ A_{C++} = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} $$

Do the transpose, it becomes:

$$ (A_{C++})^T = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} $$

Save to $p_{C}$:

$$ [1, 2, 3, 4, 5, 6, 7, 8, 9] \tag{Memory Stored from Cpp} $$

When recovering from it, Fortran again scans it thinking is stores in columns:

$$ (A_{Fortran})^T = \begin{bmatrix} 1 & 4 & 7\\ 2 & 5 & 8\\ 3 & 6 & 9 \end{bmatrix} \tag{Printed in Step 3} $$

Notice that the result is correct by coincident! Because we scan the memory in a wrong way TWICE.

3 Case 2: $N \times M$ Matrix ($N \neq M$)

We’re not this lucky if we’re going to transpose matrix with different length in size.

Suppose the original matrix is $2 \times 3$ matrix:

$$ B_{Fortran} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \tag{Printed in Step 1} $$

When Fortran stores it in memory, it will become something like:

$$ [1, 4, 2, 5, 3, 6] \tag{Memory Stored from Fortran} $$

Pass to C++ backend, when recovering, C++ would think it is a matrix like:

$$ B_{C++} = \begin{bmatrix} 1 & 4 & 2 \\ 5 & 3 & 6 \end{bmatrix} $$

Do the transpose, it becomes:

$$ (B_{C++})^T = \begin{bmatrix} 1 & 5 \\ 4 & 3 \\ 2 & 6 \end{bmatrix} $$

Save to $p_{C}$:

$$ [1, 5, 4, 3, 2, 6] \tag{Memory Stored from Cpp} $$

When recovering from it, Fortran again scans it thinking is stores in columns:

$$ (B_{Fortran})^T = \begin{bmatrix} 1 & 3\\ 5 & 2\\ 4 & 6 \end{bmatrix} \tag{Printed in Step 3} $$

Notice that this is NOT the correct transpose of step 1.

If you have any feedback to this article, feel free to comment here or send an email to me

Prev: 健康的日子

Next: 甲子园伦理与做题家精神