Prefixed Names

Arne Mertz, March 15, 2017

Contents

1 Why does it matter?
2 Hungarian notation
3 Class prefixes
4 If a name really needs that addition
5 Conclusion

Prefixes are a rather controversial topic. Taking everything into account, I think we should not use prefixed names. Here’s why.

A few days ago I touched the topic of prefixed names for variables and types in a discussion. Today I’ll write a few points about the topic, and why I am mostly against the use of prefixed names.

Why does it matter?

To have easily understandable code, names should be meaningful and readable. We already have enough complexity to deal with, so it is not worth having names that are harder to read than they have to be. Most people mentally vocalize any text they read, and reading a PrfxSomething name just adds unnecessary noise to that process.

Prefixes also add an extra burden when we search generated documentation: many documentation generators sort classes, for example, in alphabetical order. An alphabetical index of classes where almost every letter is empty except for C forgoes some of the usefulness of these tools.

Some of those tools have features to ignore a set of known prefixes, but then there’s still the human reading through that index: We search for words starting with the first letter, and our brains do not have the feature to skip the prefix.

Hungarian notation

There are basically two flavors of Hungarian notation: Prefixes that denote the type of a variable, and prefixes that denote some kind of intent or semantics. In modern day languages like C++ where you define hundreds of types in a single program, putting the type in a variable name is not feasible, regardless of whether you use an abbreviated prefix or something else.

I have seen co as the variable name prefix for anything that is not of built-in type (for “complex object” – even if it was not complex at all). In well-designed programs that would mean that almost every variable should have a prefix like that, which makes it pretty useless. Most people use modern IDEs these days where you just hover the mouse over a variable and the IDE tells you what type it has.

In addition, having the type encoded in a variable name makes maintenance cumbersome: imagine you have to change the type of a variable. It is not much fun to also have to change its name everywhere, or worse, to leave it unchanged and have the variable name lie to you from then on. In C++, we also have type aliases, which make finding a proper prefix hard or impossible. It’s just not worth the effort.
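A small sketch of how a type prefix goes stale. The names `ItemCount`, `countItems` and `iItemCount` are made up for illustration, and the "history" in the comments is an assumed scenario:

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical alias: which prefix should a variable of this type get?
using ItemCount = std::size_t;

// Assume this function once used `int iItemCount`. The type was later changed
// to ItemCount, but the variable kept its old name, so the "i" prefix now lies.
inline ItemCount countItems(const std::vector<int>& items) {
    ItemCount iItemCount = items.size();
    return iItemCount;
}

// The compiler knows the real type; the prefix does not.
static_assert(!std::is_same<decltype(countItems(std::vector<int>{})), int>::value,
              "the result is not an int, whatever the prefix suggests");
```

A plain name like `itemCount` would have survived the type change without any renaming.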

When it comes to semantic prefixes, there is another problem. Encoding the semantics of a variable in its name means the compiler cannot enforce the semantics. It’s usually better to solve this by using the type system, i.e. use strong types instead of weak naming.
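One possible way to move semantics from the name into the type could look like the following minimal sketch. The types `Meters`, `Seconds` and `MetersPerSecond` and the function `averageSpeed` are hypothetical names chosen for this example:

```cpp
// Weak naming: both would be plain doubles, so nothing stops us from mixing
// them up, no matter which prefix we pick.
//   double mDistance = 100.0;
//   double sDuration = 10.0;

// Strong types: the unit is part of the type, not of the name.
struct Meters          { double value; };
struct Seconds         { double value; };
struct MetersPerSecond { double value; };

constexpr MetersPerSecond averageSpeed(Meters distance, Seconds duration) {
    return MetersPerSecond{distance.value / duration.value};
}

// averageSpeed(Seconds{10.0}, Meters{100.0});  // does not compile: arguments swapped
```

With the wrapper types, swapping the arguments is a compile error instead of a silent bug that a prefix could only hint at.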

Class prefixes

There are frameworks that prefix every single class with one or two characters, e.g. Qt where everything is a QSomething. Others use C as a prefix – for “class”. It might be to distinguish the name from other classes, but that’s what namespaces are for. It might be to distinguish classes from enums or structs, but I don’t buy that.
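As a rough illustration of namespaces doing the job of a class prefix, here is a sketch with two unrelated types that share a name. The namespaces `gui` and `hardware` and everything in them are invented for this example:

```cpp
namespace gui {
// A clickable widget on screen (hypothetical example type).
struct Button {
    int width;
    int height;
};

constexpr int area(Button button) { return button.width * button.height; }
}  // namespace gui

namespace hardware {
// A physical push button on a device; unrelated to gui::Button.
struct Button {
    bool pressed;
};

constexpr bool isPressed(Button button) { return button.pressed; }
}  // namespace hardware
```

Inside `gui`, code simply says `Button`. Only code that deals with both kinds has to spell out `gui::Button` and `hardware::Button`, and even then the qualifier is a readable word rather than one or two letters.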

Structs and classes are pretty much the same, and they are all just types. What is important about a type is its interface, which determines how you use it. What language features you used to achieve that interface is less important.

Another convention that some people have taken over from C# and Java is to prefix interface names with an I. To me, this does not make much sense. If I use a type, I should not have to care whether it is an abstract interface or the real thing. I just use its methods, and I expect them to do what their names say they will do. If on the other hand, I am implementing the interface itself or a class that derives from it, then I hopefully know that it’s an interface class and don’t need the I-prefix to remind me.
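A minimal sketch of what this could look like without the I-prefix. The classes `Logger` and `InMemoryLogger` and the function `greet` are hypothetical and only serve as an illustration:

```cpp
#include <string>
#include <vector>

// The abstract interface gets the plain name, because that is the name
// client code works with.
class Logger {
public:
    virtual ~Logger() = default;
    virtual void log(const std::string& message) = 0;
};

// An implementation is named after what makes it special.
class InMemoryLogger : public Logger {
public:
    void log(const std::string& message) override { entries.push_back(message); }
    const std::vector<std::string>& messages() const { return entries; }

private:
    std::vector<std::string> entries;
};

// Client code neither knows nor cares whether Logger is abstract.
inline void greet(Logger& logger) {
    logger.log("hello");
}
```

If `Logger` later turned into a concrete class, `greet` and every other user would not have to change, which would not be the case for a name like `ILogger`.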

If a name really needs that addition

In some cases, it might actually be necessary to add some information to a name. I still don’t think that should be done via a prefix. Those cases should be rather rare, which means the added information will not be very common. Having to know what a prefix for such uncommon cases means is just an unnecessary burden.

In addition, if it really is important to add information to a name, it should be worth a little more than just one or two letters.

Conclusion

Like braces and indentation, prefixes are a styling topic that is mostly a matter of taste. Even if you agree with what I think about prefixed names, you might not want to stray from naming conventions you have been used to for years.

I’d like to hear what you think. Do you like prefixes? What peculiar prefixing conventions have you seen? Please leave a comment!

13 Comments

Beth
6 years ago Permalink

I would agree with you if the only kind of semantic prefix was just general type names. But these are only semantic in the sense that they avoid implementation details. They are a step up from Microsoft’s overly specific naming conventions such as LPSZ meaning “long pointer to string”, but not by much.

However, in our code prefixes represent the role of the variable and not just the type. They are a shorthand for commonly used suffixes in names without prefixes. They are a lot less typing than the full name.

Thus in a function dealing with person data, “sPerson” is short for “personName”, “iPerson” is short for “personIdentifier” (i.e. the unintelligent primary key identifier associated with the person), “aPerson” would be short for “personTupleData” (i.e. the row array returned by a database query), and “oPerson” would be short for “personObject” (i.e. the row tuple converted into an object).

In a single query routine all four variables may co-exist: sPerson to look up a person record with a human readable name, iPerson to join the person record to detail records via foreign keys, aPerson when the data is retrieved and oPerson when it is converted into an object.

Sometimes variables that have the same physical type (both integer) will have a different prefix depending on the role. An integer serving as a count will begin with “q” – it’s short for “fooCount”. An integer acting as an index or identifier will begin with “i” – it’s short for “fooIdentifier”, or “fooIndex” if looping through an array of foos.

Or consider a graphics algorithm – I frequently find it helpful to determine points by thinking out how one axis at a time changes as the figure drawing progresses. Therefore, it is more important to me to see all the x-axis variables grouped together, and I prefer to have my variables prefixed by coordinate, i.e. xCenter, xRadius, yCenter, yRadius rather than centerX, centerY, radiusX, radiusY. That way I can see all the data I have available to calculate the next X coordinate value or the next Y coordinate value.

As for this causing problems with sorting variable names in the IDE – I question that. Typically the variables with common role based prefixes play common roles in the algorithm. If your code is well modularized, each routine is dedicated to a single algorithm. Role names are much more important than the fact that one has four variables that are “people” variables. So I want things sorted by role.

Christos Kontas
7 years ago Permalink

There is only a single place where Hungarian Notation is helpful: in API documentation.
I believe this is what confused so many people and led them to mimic the naming conventions: reading the Windows documentation.

You forgot to mention the other horrible convention: the m_ prefix for member variables. Another atrocity that probably stems again from old documentation. If an author would like to be explicit about the scope of a variable, then this->variable is more than enough! Smaller scopes, though, are even better and make even these explicit references useless.

Petar Petrov
7 years ago Permalink

What if you don’t use an IDE!

BANG!

this post goes into oblivion …..

1. Arne Mertz
  7 years ago Permalink
  
  If you don’t use an IDE, one or two points of the post may not apply. The rest stays.
  
  But then again, when you don’t use an IDE, there’s a lot of other stuff that’s not as easy as it would be with one.
  
  And as I wrote, it’s a controversial topic. Of course, people have different opinions about it. That has nothing to do with oblivion though.
  
Juan
7 years ago Permalink

I still use some prefixes occasionally, f.e. n for variables holding the count of something, p for raw pointers and k for constants.

Markus Palcer
7 years ago Permalink

That’s a rather interesting topic actually.
I can very much follow your reasoning, yet prefixing interfaces with I is part of muscle memory so strongly that it feels really strange not to do it.
Still, modern IDEs quickly show (e.g. via a symbol before the class name) whether it’s an interface or not.

I’d like to ask about your opinion about suffixes that denote a special meaning of a class.
I’ve often seen the schema “FooBase” as class name for base classes for Foos (Like WindowBase being an abstract base class for all Window-Implementations).

In our project we’ve got lots of code that’s generated for each service from a DSL.
If I create a service called “MyService” and specify its public interface (accessible via a UART communication line between PC and the embedded device), the following things are created according to our project specific conventions:

An interface that contains C++ declarations for all publicly visible functions that are defined in the DSL in a File called SMyService.h
A “Dispatcher” which unpacks a message received on the UART line addressed to that service and calls the appropriate function in the service implementation (which inherits from SMyService) in files called DMyService.h and DMyService.cpp
A “Responder” used to trigger sending response messages that are encoded in the DSL (communication is asynchronous, so the response might not be a synchronous result of the function in SMyService) in files called RMyService.h and RMyService.cpp
If I want to add functions to the interface of the service which is visible to other services (services running on the device can talk directly to each other without the message bus through UART), I create a file called IMyService.h
The actual implementation is done in CMyService.h and CMyService.cpp

I think it’s obvious that the classes in those files have the same names as the files and they all reside in a namespace that’s an abbreviation of the service name (also specified in the DSL – e.g., MYSE for MyService)

Now without looking at the architecture itself – this comment and my question only focuses on the names as your post does too – would it be better to use the following names:

MyServicePublicInterface.h
MyServiceDispatcher.h/.cpp
MyServiceResponder.h/.cpp
MyService.h (The former IMyService.h)
MyServiceImplementation.h/.cpp

I personally think that the same arguments apply to this naming schema – it doesn’t matter if I add a prefix or a suffix, the thing that I do is encode the “category” of a class in its name.

In my personal opinion the classes should rather be named:

MyService::PublicInterface
MyService::Dispatcher
MyService::Responder
MyService::MyService
MyService::Implementation

Even though that means I have different name spaces with the same class names, but – isn’t that the actual reason namespaces exist?

What do you think about these three naming schemes?

1. Arne Mertz
  7 years ago Permalink
  
  At first sight, I definitely agree with you that all those generated classes should be put together in their own namespace, just as you describe it at the end. How these classes have to be used in your project should determine some additional steps: From what I understand you probably won’t use the main service class, dispatcher and responder directly apart from wiring them together in some place (probably also in generated code). The public interface is probably the thing that gets used in your manually written code. I’d probably use type aliases to give all those PublicInterface classes more usable names, but that’s about it.
  
  Here the issue with alphabetical sorting can also be seen: With all the service detail classes in the same namespace and probably the same directory you will see all classes that play the same role in different services grouped together instead of all classes belonging to one service. With namespaces (and an associated directory structure) the classes will just be grouped naturally.
  
Bartek F.
7 years ago Permalink

I am used to simplified Hungarian notation, and I even like this. But I prefer to have a good name without any type information than have a weak name with tons of type prefixes…

Probably the only thing that I really need is ‘m_’ and ‘s_’ – for members and statics inside classes. ‘I’ for the interface is also a good thing since it means that the class is really something abstract that shouldn’t/cannot be a final object…

Fred
7 years ago Permalink

I like the m_varName prefix for class members. Not to know when I am dealing with a class member but because it’s easier that way than to come up with two different names for e.g. the parameter passed into a member function and the member variable itself (m_size = size instead of size = size_param or something). Other than that I prefer long descriptive variable names without a prefix.

While I can imagine people using semantic prefixes, I consider type prefixes useless artefacts of some past where the tools were not there yet.

1. HappyCactus
  7 years ago Permalink
  
  I agree with Fred of the previous comment. I would like to have prefix-less members but I hate it when I can’t immediately understand if a variable is a local variable, a member or a parameter. Using the “m” prefix (with camel names) is the only exception to the “no prefixes” rule, and I have been using it for many years.
  
  1. Arne Mertz
    7 years ago Permalink
    
    When you can’t immediately understand whether a variable is a member or local to the function, you might have a different problem: Functions should be small enough and use few enough variables that it is easy to keep track of what is local and what is not.
    
2. Arne Mertz
  7 years ago Permalink
  
  I agree that it’s sometimes not easy to come up with some parameter names for methods. In setter-like functions I often use something like size = newSize and similar.
  
  1. 'No Bugs' Hare
    7 years ago Permalink
    
    FWIW: For the last 10 years I’ve been using an underscore-suffix convention: size = size_; (prefixes are technically illegal in some cases, but suffixes seem to be ok, and collisions are ultra-plus-unlikely).
    