Refactoring Session #2a: Matrix Calculation – Code Smells

A while ago, Twitter user @vaughncato sent me the piece of code for this refactoring session. Since there is a lot of things to do on this code, I split this session in two parts. This part will deal with general code smells.

Like the last time, I have done a step by step refactoring of the original code and put it all on GitHub. Each step is a single commit, so you can follow along in the file history. So let’s start with the original code (with some indentation changes applied for the blog):

#include <vector>
#include <cmath>
#include <cassert>
#include <iostream>

using std::vector;
using std::cout;

struct Matrix : vector<vector<float>> {
  using vector<vector<float>>::vector;
  int rows() const { return size(); }
  int cols() const { return (*this)[0].size(); }
};

typedef vector<float> Vector;

// Solve y=m*x for x using Gauss-Jordan Elimination.
// Result is placed back in y
// Identity is placed back in m
void solve(Matrix &m,Vector &y) {
  int n = m.rows();
  assert(n==m.cols());
  vector<int> ref(n);

  for (int i=0;i<n;++i) {
    ref[i] = i;
  }

  for (int row=0; row<n; ++row) {
    // Find a row that has a non-zero value in the current column
    {
      int i = row;
      for (;;++i) {
        assert(i<n);
        if (m[i][row]!=0) {
          break;
        }
      }
      for (int j=0; j!=n; ++j) {
        float temp = m[row][j];
        m[row][j] = m[i][j];
        m[i][j] = temp;
      }
      {
        float temp = y[i];
        y[i] = y[row];
        y[row] = temp;
      }
      {
        int temp = ref[i];
        ref[i] = ref[row];
        ref[row] = temp;
      }
    }
    {
      // Normalize row to have diagonal element be 1.0
      float v = m[row][row];
      for (int j=row;j<n;++j) {
        m[row][j] /= v;
      }
      y[row] /= v;
    }
    // Make all lower rows have zero in this column
    for (int j=0;j<n;++j) {
      if (j!=row) {
        float v = m[j][row];
        for (int k=row;k<n;++k) {
          m[j][k] -= m[row][k]*v;
        }
        y[j] -= y[row]*v;
      }
    }
  }
  for (int i=0;i<n;++i) {
    float temp = y[i];
    y[i] = y[ref[i]];
    y[ref[i]] = temp;
  }
}

static void print_vector(const char *name,const Vector &b) {
  cout << name << "=" << "\n";
  for (int i=0, n=b.size(); i!=n; ++i) {
    cout << "  " << b[i] << "\n";
  }
  cout << "\n";
}

static void print_matrix(const char *name,const Matrix &temp) {
  cout << name << "=\n";
  for (int i=0, m=temp.size(); i!=m; ++i) {
    for (int j=0, n=temp[i].size(); j!=n; ++j) {
      cout << "  " << temp[i][j];
    }
    cout << "\n";
  }
  cout << "\n";
}

static bool is_near(float actual,float expected,float tolerance) {
  float delta = fabsf(actual-expected);
  return delta<=tolerance;
}

static Vector product(const Matrix &m,const Vector &x) {
  Vector a(x.size());

  for (int i=0; i!=3; ++i) {
    float sum = 0;
    for (int j=0; j!=3; ++j) {
      sum += m[i][j]*x[j];
    }
    a[i] = sum;
  }

  return a;
}

int main() {
  Matrix m = {
    {1.1, 2.4, 3.7},
    {1.2, 2.5, 4.8},
    {2.3, 3.6, 5.9},
  };

  Vector y = {0.5,1.2,2.3};

  Matrix temp = m;
  Vector x = y;
  solve(temp,x);

  Vector mx = product(m,x);

  print_matrix("m",m);
  print_vector("y",y);
  print_vector("x",x);
  print_vector("m*x",mx);

  float tolerance = 1e-5;

  for (int i=0, n=y.size(); i!=n; ++i) {
    assert(is_near(mx[i],y[i],tolerance));
  }
}

This is  a lot of stuff. I won’t go into smaller details this time, like includes and helper functions. Instead, I will concentrate on the central function of this code example – except one major pet peeve.

Refactoring some code smells

When I skim this code there are two code smells that immediately attract my attention: Poor naming and Matrix deriving from std::vector.  The poor naming is ubiquitous: there are a lot of one-letter names  for variables that are not simple loop counters, and ref isn’t very descriptive either.

Renaming the central function

The name that irritates me most is not any variable name inside the function, but the name of the function itself. It has global visibility and therefore should really say what the function does. solve does not tell us anything.

So the first order of the day is to find a better name for the function. It is more important than any of the other issues in the code,  because it is the part that affects the maintainability of any code that calls the function. That is a potentially larger area than the code we are refactoring.

As the comment at the top of the function suggests, it calculates the inverse of a matrix to solve the equation m*x=y for x with given m and y. If it was about numbers, this would be a division, but the concept does not exist for matrices. So, in want of a better name, I renamed the function to invertMatrixMultiplication.

Pet peeve: Deriving from standard containers

The next step was the pet peeve I mentioned earlier: Matrix deriving from std::vector. Standard library containers are not designed to be derived from, and inheritance is a far too close coupling.

Instead, aggregation is the appropriate thing to do here. So, I redesigned the Matrix class to have the vector as a class member:

class Matrix {
  typedef vector<float> Row;
  vector<Row> values;
public:
  Matrix(std::initializer_list<vector<float>> matrixValues)
    : values{matrixValues}
  {}

  int rows() const { return values.size(); }
  int cols() const { return values[0].size(); }
  Row& operator[](std::size_t index) { 
    return values[index]; 
  }
  Row const& operator[](std::size_t index) const { 
    return values[index]; 
  }
};

It has the same interface as before, as far as it had been used. There is only one exception: The print_matrix function used the size method inherited from std::vector before. In the refactoring I changed it to a call to rows, which is consistent with the rest of the code.

I did not make any further changes to the class, although it definitely could use some more refactoring. Instead, I went back to the central function.

Prefer standard algorithms over manual implementations

The next point I found was a bunch of blocks that looked not only similar, but very familiar:

{
  float temp = y[i];
  y[i] = y[row];
  y[row] = temp;
}

If we look closely, this code simply swaps two variables. Instead of doing it manually, which is hard to read and possibly introduces subtle bugs, we should just use std::swap(y[i], y[row]). That is what I refactored next – missing one occurrence that looked slightly more complicated:

for (int j=0; j!=n; ++j) {
  float temp = m[row][j];
  m[row][j] = m[i][j];
  m[i][j] = temp;
}

This code swaps m[row][j] with m[i][j] for all j. m[row] and m[i] are just vectors, and swapping all their members is just the same as swapping the vectors themselves (which also happens to be more performant). So the whole loop can be replaced by a single swap, which I did a few steps later:

std::swap(m[i], m[row]);

Manually implementing well known algorithms is one of the code smells that can have a serious impact on the readability of our code. Therefore it is important to not only know our language but also the libraries that are available to us.

Out-parameters

The central function had two out-parameters, i.e. parameters that were taken by non-const reference and changed inside the function. That way the changed values are made available to the caller. However, this form of hidden return values is not very intuitive. A normal return value for the function should be preferred.

One side effect of the out-parameters is that callers of the function who wish to preserve the arguments the pass to it, have to create copies and pass those to the function. This has to be done regardless whether the changed value is of interest or not:

Matrix temp = m;
Vector x = y;
invertMatrixMultiplication(temp, x);
//temp is never used...

So, the next two steps are to refactor each parameter to be a pure input parameter. Since copies of the arguments are used and modified inside the function, I decided to take the arguments by value. The calculated vector is needed, so I return it, other than the matrix, which seems to be of no interest.

Vector invertMatrixMultiplication(Matrix m, Vector y) {
  // ...
  return y;
}

// ...

//no unneeded temp matrix here:
Vector x = invertMatrixMultiplication(m, y);

I did the refactoring in two steps, first one for the Matrix, then one for the Vector.

The intermediate code

For now, the most obvious code smells have been handled. I have not touched the helper functions – and I won’t, since they are of little interest for now. The Matrix class could be done better, but the best way to design it depends largely on its use, including outside the function. Lacking that information, I’ll leave it as it is.

The central function is a bit shorter due to the use of std::swap, but it’s still too long. The refactorings done here did not need a closer look at the algorithm used. Instead they only fixed superficial smells that can be seen without much knowledge of the Gauss-Jordan elimination algorithm.

You can see the current state of the code below. Next week I’ll dig deeper into the implementation details of the function with a focus on what @vaughncato originally asked me: Extracting a class for the algorithm.

#include <vector>
#include <cmath>
#include <cassert>
#include <iostream>
#include <algorithm>

using std::vector;
using std::cout;

class Matrix {
  typedef vector<float> Row;
  vector<Row> values;
public:
  Matrix(std::initializer_list<vector<float>> matrixValues)
    : values{matrixValues}
  {}

  int rows() const { return values.size(); }
  int cols() const { return values[0].size(); }
  Row& operator[](std::size_t index) { 
    return values[index]; 
  }
  Row const& operator[](std::size_t index) const { 
    return values[index]; 
  }
};

typedef vector<float> Vector;

// Solve y=m*x for x using Gauss-Jordan Elimination.
// Result is placed back in y
// Identity is placed back in m
Vector invertMatrixMultiplication(Matrix m, Vector y) {
  int n = m.rows();
  assert(n==m.cols());
  vector<int> ref(n);

  for (int i=0;i<n;++i) {
    ref[i] = i;
  }

  for (int row=0; row<n; ++row) {
    // Find a row that has a non-zero value in the current column
    {
      int i = row;
      for (;;++i) {
        assert(i<n);
        if (m[i][row]!=0) {
          break;
        }
      }
      std::swap(m[i], m[row]);
      std::swap(y[i], y[row]);
      std::swap(ref[i], ref[row]);
    }
    {
      // Normalize row to have diagonal element be 1.0
      float v = m[row][row];
      for (int j=row;j<n;++j) {
        m[row][j] /= v;
      }
      y[row] /= v;
    }
    // Make all lower rows have zero in this column
    for (int j=0;j<n;++j) {
      if (j!=row) {
        float v = m[j][row];
        for (int k=row;k<n;++k) {
          m[j][k] -= m[row][k]*v;
        }
        y[j] -= y[row]*v;
      }
    }
  }
  for (int i=0;i<n;++i) {
    std::swap(y[i], y[ref[i]]);
  }
  return y;
}

static void print_vector(const char *name,const Vector &b) {
  cout << name << "=" << "\n";
  for (int i=0, n=b.size(); i!=n; ++i) {
    cout << "  " << b[i] << "\n";
  }
  cout << "\n";
}

static void print_matrix(const char *name,const Matrix &temp) {
  cout << name << "=\n";
  for (int i=0, m=temp.rows(); i!=m; ++i) {
    for (int j=0, n=temp[i].size(); j!=n; ++j) {
      cout << "  " << temp[i][j];
    }
    cout << "\n";
  }
  cout << "\n";
}

static bool is_near(float actual,float expected,float tolerance) {
  float delta = fabsf(actual-expected);
  return delta<=tolerance;
}

static Vector product(const Matrix &m,const Vector &x) {
  Vector a(x.size());

  for (int i=0; i!=3; ++i) {
    float sum = 0;
    for (int j=0; j!=3; ++j) {
      sum += m[i][j]*x[j];
    }
    a[i] = sum;
  }

  return a;
}

int main() {
  Matrix m = {
    {1.1, 2.4, 3.7},
    {1.2, 2.5, 4.8},
    {2.3, 3.6, 5.9},
  };

  Vector y = {0.5,1.2,2.3};

  Vector x = invertMatrixMultiplication(m, y);

  Vector mx = product(m,x);

  print_matrix("m",m);
  print_vector("y",y);
  print_vector("x",x);
  print_vector("m*x",mx);

  float tolerance = 1e-5;

  for (int i=0, n=y.size(); i!=n; ++i) {
    assert(is_near(mx[i],y[i],tolerance));
  }
}
Previous Post
Next Post

14 Comments






  1. Good example, a real world case as noted in one comment.

    Before refactoring such code, I would try to make (unit) tests so that I know I am not breaking anything. That would give more confidence in the process.

    Reply

    1. Hey Bartlomiej, thanks for tuning in. You’re right if course that unit tests are crucial for refactorings.
      In this case I was confident that the single test provided by main() would be sufficient to secure the trivial refactoring steps I used.

      Reply

  2. Why not call the central function GaussJordanElimination?

    The ‘documentation’ for the central function is now out-of-date, as you return the result and pass in the parameters by value.

    Wouldn’t it be better if the code was templated on the underlying data type? Using float might for certain applications be fine, but double is often more appropriate.

    Is having a vector of vectors such a great idea for the Matrix layout? The underlying buffer ofvector will be larger than the size of the vector itself.

    Actually, having an interface for Matrix might be preferable (or the matrix type as a policy template parameter).

    Reply

    1. Thanks for the function name suggestion!
      You are right that the comments are out of date, I’ll add a cleanup step before I get to the core of the next session. There can be said a lot about the Matrix class, it’s far from good code. I’d probably toss it out and rewrite it from scratch. If I don’t get my hands on another piece of interesting code in the next weeks, that could be the topic of another refactoring session – or a “class design session”. Replacing the Matrix with another class would need a separation of the algorithm from the original class and that indeed would be achieved best by using an interface.
      Whether a templated version would be better or not depends largely on the project the function and Matrix class are used in. If all you have to deal with is float, there is no need to pay the price for it, e.g. additional compile time for header only. See also an earlier post about over-generaization: https://arne-mertz.de/2015/01/over-generalization/

      Reply

  3. There are many issues still with the code, (nobody doing linear algebra would use this code) but, I praise you for dealing with a real world problem this time.

    Real problems are usually left out of serious C++ discussions. (Stepanov is an exception.)

    Reply

    1. I will revisit the code next week and add some more refactorings. I will however concentrate on the clean code and design aspects. I imagine that there are better approaches from the algebraic/numerical perspective, but I won’t touch them since it is not my area of expertise. I should probably mention that fact next week 🙂

      Reply

  4. Is there a more compact way to access the internal std::vector without assessing its first member via operator[0]? This question arise because the constructor that takes an initializer list can be constructed as thus:

    Matrix matrix {};
    matrix.cols(); // would result in a runtime error.

    Reply

    1. If there is no inner vector, there is no safe way to access it. I’d consider the Matrix class as it is still sloppily designed. The least would be an assertion that neither the outer nor the inner initializer lists are empty. Depending on how it is actually used one could even make the Matrix dimensions compile time constant.

      Reply

      1. Although, there could be better ways, but one of the ways that came to mind was to check if the initializer list’s size is 0 in the ctor’s body and then reserve* at least one element in the vector. Then the call to operator[] with index 0 would be safe**.

        *Since std::vector::reserve() merely allocates enough memory and affects the capacity( not the size ), and doesn’t perform construction of the objects, so if T is a movable-only type, we would be still be OK.

        ** TDM-GCC behaves weirdly with this behavior but Clang, GCC on Linux, MSVS 2013 gives the desired output.

        Reply

        1. While it may work in most implementations it is not defined, so it’s not advisable. cols() could just check for 0 size of the outer vector: return values.empty() ? 0 : values[0].size();
          But that would only remedy one symptom, not the causes.

          Reply

Leave a Reply

Your email address will not be published. Required fields are marked *