Contents

It’s time for the next refactoring session, and I happen to have been lazy on my current project. So, today I’ll start to write about some changes I did in a longer refactoring session.

Introducing a new type

One of the first items on my TODO-list was to get rid of the use of JSON data and replace it with a proper class with well-defined elements. I had run into problems with these weakly typed data earlier.

This is the original code snippet in question:

...
try {
  auto requestedIssue = Json::parse(requestContent);
  if (requestedIssue.count("data") == 0) {
    return status400();
  }
  if (requestedIssue["data"].count("ID") != 0) {
    return status400();
  }
  for (auto const& attribute : {"summary"s, "description"s}) {
    if (requestedIssue["data"].count(attribute) == 0) {
      return status400();
    }
  }
  return {storage.insertIssueIncreasedID(requestedIssue), 200};
} catch(std::invalid_argument &) {
  return status400();
}
...

The requestedIssue is the JSON data. The snippet is part of a larger function that returns a JSON and an HTML code. In status400() that JSON is empty.

1. Create the class

The first step is fairly simple: create a plain struct that wraps the JSON data. The most challenging task here is to find a good name. You could say, if a refactoring step contains something more complicated than finding a good name or a good spot to put something, then the step should be broken down into simpler steps.

Fleshing out the class to have proper attributes instead of the JSON data, therefore, will be done in a later step, to make necessary changes to the surrounding code as simple as possible.

2. Extract function

The above snippet is already quite large, but only part of a larger function. In addition, we’re going to apply several changes during this refactoring, so it should get its own function as a second step. Doing this resulted to be a small challenge: The snippet does a lot of checks but also extracts the data from the requestContent string. The new function would need to return two values: The requestedIssue and a boolean if the parsing went OK. For this step, I settled with an ugly out-parameter that has to be fixed in the next steps.

Here is the relevant new code, you can find the whole commit on GitHub, containing both steps:

struct IssueData {
  Json content;
};

bool RestApi::parseIssue(const std::string &requestContent, IssueData &requestedIssue) const {
  auto issueJson = Json::parse(requestContent);
  if (issueJson.count("data") == 0) {
    return false;
  }
  if (issueJson["data"].count("ID") != 0) {
    return false;
  }
  for (auto const &attribute : {"summary"s, "description"s}) {
    if (issueJson["data"].count(attribute) == 0) {
      return false;
    }
  }
  requestedIssue.content = issueJson;
  return true;
}

...
try {
  IssueData requestedIssue;
  if (!parseIssue(requestContent, requestedIssue)) {
    return status400();
  }

  return {storage.insertIssueIncreasedID(requestedIssue.content), 200};
} catch(std::invalid_argument &) {
  return status400();
}
...

3. Add encapsulation

Currently, the parsing function directly accesses the wrapped JSON. If we want to change the JSON to proper attributes, we have to make the data private and provide the needed functionality to access it. Currently, we only need to convert JSON to IssueData and get it back out. Since the JSON that we get out is only used to be stored I explicitly named the function toStorageJSON.

class IssueData {
  Json content;
public:
  IssueData(Json requestJson) : content{std::move(requestJson)} {}
  IssueData() = default;
  Json toStorageJson() const {
    return content;
  }
};

4. Remove the out-parameter

The parseIssue function creates two values, so it should return them both instead of obfuscating that fact using the out-parameter. This step is pretty straightforward: Create a small helper class and adapt the parsing function and its call sites.

If you’re wondering why not simply use an std::pair: Expressive naming is key. Having the function return a pair with a first and a second member can be confusing: does first denote the success of the parsing or second? Something like this should only be done in a very small context where no confusion can appear. Whether the context is small enough is, of course, a matter of convention and personal taste – I opted for the named structure in this case.

struct IssueParseResult {
  IssueData issueData;
  bool success;
};

With these two steps the parsing function and the original snippet look like this:

IssueParseResult RestApi::parseIssue(const std::string &requestContent) const {
  auto issueJson = Json::parse(requestContent);

  if (issueJson.count("data") == 0) {
    return {{}, false};
  }
  if (issueJson["data"].count("ID") != 0) {
    return {{}, false};
  }
  for (auto const &attribute : {"summary"s, "description"s}) {
    if (issueJson["data"].count(attribute) == 0) {
      return {{}, false};
    }
  }
  return {IssueData{issueJson}, true};
}    

...
try {
  auto parsedIssue = parseIssue(requestContent);
  if (!parsedIssue.success) {
    return status400();
  }

  auto requestedIssue = parsedIssue.issueData;
  return {storage.insertIssueIncreasedID(requestedIssue.toStorageJson()), 200};
} catch(std::invalid_argument &) {
  return status400();
}
...

The commit containing these two steps can be found here on GitHub.

5. Move the parsing function into the IssueData class

The parsing function we extracted in step 2 conceptually belongs to the IssueData class. Therefore the next commit simply moves the function to that class, with a slight change: Instead of directly parsing from a string I decided to parse from a JSON object. That means, that parsing the string to JSON goes back into the try block in the original snippet.

This kind of back and forth is common in refactoring sessions. A change that was a good idea at one point can seem less appropriate a little later and then is reversed.

class IssueParseResult;

class IssueData {
  Json content;
public:
  Json toStorageJson() const {
    return content;
  }

  static IssueParseResult parse(Json const& issueJson);
};

...
try {
  auto issueJson = Json::parse(requestContent);
  auto parsedIssue = IssueData::parse(issueJson);
  if (!parsedIssue.success) {
    return status400();
  }
...

Apart from the call to Json::parse, the parsing function remains the same. See the full commit on GitHub.

6. Finally use the typed data attributes

This is the step that actually does what I wanted in the first place: get rid of the JSON as the main data structure. From the parsing function we can see that we expect the JSON to have the following format:

{
  "data": {
    "summary" : "some summary",
    "description" : "some description"
  }
}

There is also mentioned an “ID” member that may not be present in this case, it is used in other locations. For now, the IssueData only needs the two attributes “summary” and “description”. Of course, the toStorageJson function now needs to actually do something (create a JSON object from the attributes). The parsing function also needs to read the attributes from the provided JSON instead of only checking for their existence.

class IssueData {
  std::string summary;
  std::string description;
public: 
  Json toStorageJson() const {
    return Json{
      {"data", {
        {"summary", summary},
        {"description", description}
      }}
    };
  }    
};

IssueParseResult IssueData::parse(Json const& issueJson) {
  // same checks as before...
  auto const& dataJson = issueJson["data"];
  // ...
  IssueData issueData;
  issueData.summary = dataJson["summary"];
  issueData.description = dataJson["description"];
  return {issueData, true};
}

The whole commit can be found here. If you are wondering about the Json class, I use Niels Lohmann’s “JSON for modern C++” library.

7. Last but not least: Adapt some tests

Having the issue data in their own class and a proper parsing function enabled me to test the parsing on a finer granularity. Until now, I had to call the larger function containing the original snippet for this behavior. Now, I can write a dedicated test for the parsing. That’s the last commit for today, but an important one: If refactoring your code improves testability, use that fact to improve your tests!

Conclusion

Obviously, this is not the last time I touch the IssueData class. The stored data has to be read and used at other places, where I still use the plain JSON. But now that the class exists, I can use it in those places as well.

This is by far not the end of the refactorings I had to apply to Fix: There still are some large functions to be broken down, names to be found for recurring smaller tasks and repeated code patterns to be generalized and unified.

5 Comments

Jacek
8 years ago Permalink

I immediately thought about some helper like:

template<typename T, typename JO, typename... STR> T extract(const JO& json_dict, STR ... strs) { return T{json_dict[strs], ...}; }

which could be used like…

return extract<IssueData>(dataJson, "summary", "description");

…although I do not know that JSON lib, so it is likely that my idea just doesn’t play well with it.

1. Arne Mertz
  8 years ago Permalink
  
  Thanks! This looks nice, will try it out the next time I get to play with the code 🙂
  
Jeff V
8 years ago Permalink

Sure thing… really enjoying your posts so I’ll try to keep engaged. 🙂

Jeff V
8 years ago Permalink

Nice post Arne!

I am happy to see that’s very much how I would have done it as well. It’s still useful to see your walk through though. I like the refactoring walk through format.

1. Arne Mertz
  8 years ago Permalink
  
  Thanks, Jeff!
  Always good to hear that my approach is not complete rubbish 😉
  
  If you spot anything you would had have done differently, please let me know. I still need to learn as much as anyone else 🙂

Write clean and maintainable C++

Refactoring Session #3 – from JSON to a Proper Class.