New River, New Water

Problem in Fortran Matrix Offloading: Transpose Example

Fortran stores matrix in Column-Major ordering, while normal backends stores matrix in Row-Major ordering, which will brings trouble. I will use matrix transpose as an example to explain.

1 Problem

Consider following example:

  1. In Fortran, we create a matrix $A$, print it, stores in memory and returns a pointer $p_F$.
  2. Pass $p_{F}$ to a different backend, suppose C++, do a transpose and stores the result in $p_{c}$
  3. Back to Fortran, recover data from $p_{C}$, print it.

Question: will step 3 print the correct result as we expected?

2 Case 1: $N \times N$ Matrix

Suppose the original matrix is $3 \times 3$ matrix:

$$ A_{Fortran} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \tag{Printed in Step 1} $$

When Fortran stores it in memory, it will become something like:

$$ [3, 4, 7, 2, 5, 8, 3, 6, 9] \tag{Memory Stored from Fortran} $$

Pass to C++ backend, when recovering, C++ would think it is a matrix like:

$$ A_{C++} = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} $$

Do the transpose, it becomes:

$$ (A_{C++})^T = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} $$

Save to $p_{C}$:

$$ [1, 2, 3, 4, 5, 6, 7, 8, 9] \tag{Memory Stored from Cpp} $$

When recovering from it, Fortran again scans it thinking is stores in columns:

$$ (A_{Fortran})^T = \begin{bmatrix} 1 & 4 & 7\\ 2 & 5 & 8\\ 3 & 6 & 9 \end{bmatrix} \tag{Printed in Step 3} $$

Notice that the result is correct by coincident! Because we scan the memory in a wrong way TWICE.

3 Case 2: $N \times M$ Matrix ($N \neq M$)

We’re not this lucky if we’re going to transpose matrix with different length in size.

Suppose the original matrix is $2 \times 3$ matrix:

$$ B_{Fortran} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \tag{Printed in Step 1} $$

When Fortran stores it in memory, it will become something like:

$$ [1, 4, 2, 5, 3, 6] \tag{Memory Stored from Fortran} $$

Pass to C++ backend, when recovering, C++ would think it is a matrix like:

$$ B_{C++} = \begin{bmatrix} 1 & 4 & 2 \\ 5 & 3 & 6 \end{bmatrix} $$

Do the transpose, it becomes:

$$ (B_{C++})^T = \begin{bmatrix} 1 & 5 \\ 4 & 3 \\ 2 & 6 \end{bmatrix} $$

Save to $p_{C}$:

$$ [1, 5, 4, 3, 2, 6] \tag{Memory Stored from Cpp} $$

When recovering from it, Fortran again scans it thinking is stores in columns:

$$ (B_{Fortran})^T = \begin{bmatrix} 1 & 3\\ 5 & 2\\ 4 & 6 \end{bmatrix} \tag{Printed in Step 3} $$

Notice that this is NOT the correct transpose of step 1.


If you have any feedback to this article, feel free to comment here or send an email to me