Fanning the Flames: Prefixing Variable/Attribute Names

This is a repost of an article posted on SCN.


Trigger Warning: This blog will probably annoy a number of people. It's been lurking in the back of my mind for some time now, originally inspired by Ralf Wenzel's article Hungarian beginner's course - a polemic scripture against Hungarian Notation. Among others, I wouldn't be too surprised to receive critical comments from Matthew Billingham in particular - but then again, maybe not. As Tolkien put it a long time ago, "So many strange things have chanced that to learn the praise of a fair lady under the loving strokes of a Dwarf's axe will seem no great wonder." I figured I might as well give it a try, even though it might spoil my birthday festivities.

To cut to the point: I use prefixes for variables and parameters myself, and I encourage people to use a consistent prefix system throughout their code. In my experience, a sensible naming scheme for variables does help in developing and maintaining ABAP code. I don't for a moment believe that I will convince anyone who doesn't want to accept that, but I do believe that in this discussion, there are a number of arguments that are too easily forgotten or cast aside. I'm well aware of the fact that this subject is guaranteed to trigger a lively - hum - discussion, so I will try to keep the tone a bit more formal and a little less emotional than Ralf did in his post. (To be fair - his trigger warning is in the title as well as the introduction.)

Perhaps, first and foremost - what is the prefix system I use? It boils down to a combination of one or two letters, an underscore and a sensible name for the variable. I'll focus on the assembly of the prefix here, since that's what the 'Hungarian notation' discussion is about - whether you want to name your variables according to the dictionary structures or rather have some 'English' name is another discussion.

The first letter designates the origin or scope of the variable or parameter:

  • class members:
    • g = instance attribute
    • s = static attribute
  • methods and (where applicable) function modules:
    • l = local variable
    • i = importing parameter
    • e = exporting parameter
    • c = changing parameter
    • r = returning parameter
  • programs, function groups
    • g = global variable
    • p = PARAMETERS (report only)

The second letter distinguishes between the data types:

  • (none) - elementary data type
  • s = structure
  • t = internal table
  • r = reference

For WebDynpro coding, I use a slightly different notation - basically because I'm too lazy to go through the generated code and adapt it all:

  • v = elementary data type (WebDynpro coding since that's the default in the generated code)
  • o = object reference (WebDynpro)
  • r = data reference

Exceptions and inconsistencies I haven't overcome yet:

  • so_ = SELECT-OPTIONS
  • co_ = constants
  • no prefix for public read-only attributes of persistence classes
  • no prefix for constants in "enum interfaces" that only contain constants
  • TABLES - hello, dynpros! - for structures EXCLUSIVELY, NEVER for transparent tables, same name as the structure

So we get, for example

  • gt_foo in a class - that would be an instance attribute that is an internal table
  • ls_bar - a structured local variable
  • ir_baz - an importing reference parameter

For FIELD-SYMBOLS, I tend to use the same notation, so you'll see stuff like <ls_foo> or <lr_bar> in my code. Since I tend to avoid global FIELD-SYMBOLS, I could omit the 'l', but decided to keep it for simplicity of the ruleset.
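To see how these rules read in context, here is a minimal sketch of a report with a local class. The report name, the class lcl_flight_list, its types and its methods are all made up for illustration; only the prefixes follow the scheme described above.

```abap
REPORT z_prefix_demo.

" Illustrative only: report, class, type and method names are hypothetical.
CLASS lcl_flight_list DEFINITION.
  PUBLIC SECTION.
    TYPES: BEGIN OF ty_s_flight,
             carrid TYPE c LENGTH 3,
             connid TYPE n LENGTH 4,
           END OF ty_s_flight,
           ty_t_flights TYPE STANDARD TABLE OF ty_s_flight WITH EMPTY KEY.

    " is_ = importing parameter, structure
    METHODS add
      IMPORTING is_flight TYPE ty_s_flight.

    " i_ = importing parameter, elementary; r_ = returning parameter, elementary
    METHODS count_carrier
      IMPORTING i_carrid       TYPE ty_s_flight-carrid
      RETURNING VALUE(r_count) TYPE i.

  PRIVATE SECTION.
    " gt_ = instance attribute, internal table
    DATA gt_flights TYPE ty_t_flights.
ENDCLASS.

CLASS lcl_flight_list IMPLEMENTATION.
  METHOD add.
    APPEND is_flight TO gt_flights.
  ENDMETHOD.

  METHOD count_carrier.
    " <ls_...> = local field symbol pointing to a structure
    LOOP AT gt_flights ASSIGNING FIELD-SYMBOL(<ls_flight>).
      IF <ls_flight>-carrid = i_carrid.
        r_count = r_count + 1.
      ENDIF.
    ENDLOOP.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  " Data declared in an event block of a report is program-global,
  " hence the 'g' prefix: gr_ = global reference, g_ = global elementary.
  DATA(gr_list) = NEW lcl_flight_list( ).
  gr_list->add( VALUE #( carrid = 'LH' connid = '0400' ) ).
  gr_list->add( VALUE #( carrid = 'LH' connid = '0402' ) ).
  gr_list->add( VALUE #( carrid = 'AA' connid = '0017' ) ).

  DATA(g_count) = gr_list->count_carrier( 'LH' ).
  WRITE / g_count.
```

Even in a toy like this, every identifier in the method bodies tells you where it comes from and what kind of thing it is without a single double-click.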

To sum it up: nothing too special here. I don't claim to have invented this prefix system, only adopted and adapted it slightly - although it's so generic that probably hundreds of developers around the world use either this system or something very similar. It's slightly more verbose than the scheme presented by Kai Meder in ABAP Modern Code Conventions, but I believe that the differences do justify an extra character.

Before discussing the advantages and disadvantages in detail, I would like to point out a simple and important fact that appears to be overlooked in many discussions: There is no perfect life. There is just life. We have to make do with what we have, and while we can (and should!) always aspire to refine our skills and improve ourselves as well as our environment, there are shortcomings, deficiencies and historically-explainable-but-nonetheless-ugly-kludges that we just have to deal with. Dreaming of a world without global variables, methods no longer than (insert number here) lines of code and pure object orientation may be an enjoyable pastime, but it doesn't get the job done. The world is out there, and it contains loooooong methods (many of which started out fairly short and grew through years of maintenance by many hands and a lack of refactoring), insanely complex systems and a weird mix of object-oriented and procedural programming, so we have to deal with stuff like that on a daily basis.

We also have to deal with another important fact: Humans make errors. Even the most intelligent and experienced developer will eventually produce a bug (other blog authors and moderators excepted, of course). Since a naming convention is of no use whatsoever to the machine, it has to be optimized to support the human in front of the machine. Ideally, it should aid in understanding the code that is already present and prevent coding errors wherever possible - or at least make them more obvious. So let's take a look at some common mistakes and how the naming conventions above aid in preventing or at least spotting these.

Let's start with a common controversy - why use prefixes for local and global variables at all? There are really two arguments that make me recommend using the prefixes 'g' and 'l' respectively, despite many notable voices arguing against their use. First off - shadowing and name masking. These concepts exist in many programming languages, and I have yet to find a real-world application that does not involve a decent amount of sadism towards the poor souls who have to maintain the code afterwards. From my experience, wherever you see variable shadowing actually happening outside of a programming class, something has usually gone south when refactoring or maintaining the code. Variable shadowing produces particularly nasty bugs that can easily take hours or even days to hunt down. Splitting the namespace into distinct partitions for global and local variables neatly solves that problem - if all class attributes start with 'g' (or 's') and all method-internal variables start with 'l', shadowing won't ever be a problem. Other languages cope by adding compiler warnings and smart IDE extensions to help the developer spot variables shadowing each other, which is just another workaround for the same problem, and one we don't have yet (the last time I checked, even the splendid ATC didn't warn about shadowed variables).
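A minimal sketch of the kind of shadowing bug meant here. The report name, the class lcl_counter and its members are invented for the example, and the stray local declaration stands in for whatever a botched refactoring might leave behind.

```abap
REPORT z_shadow_demo.

" Illustrative only: all names are hypothetical.
CLASS lcl_counter DEFINITION.
  PUBLIC SECTION.
    METHODS increment.
    METHODS get_total
      RETURNING VALUE(r_total) TYPE i.
  PRIVATE SECTION.
    DATA total TYPE i.       " unprefixed instance attribute
ENDCLASS.

CLASS lcl_counter IMPLEMENTATION.
  METHOD increment.
    DATA total TYPE i.       " leftover local declaration: hides the attribute
    total = total + 1.       " changes the local variable only
  ENDMETHOD.

  METHOD get_total.
    r_total = total.         " reads the attribute, which was never touched
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  DATA(gr_counter) = NEW lcl_counter( ).
  gr_counter->increment( ).
  gr_counter->increment( ).

  " The attribute still has its initial value although increment( ) ran twice.
  DATA(g_total) = gr_counter->get_total( ).
  WRITE / g_total.
```

With the attribute named g_total and the local variable named l_total, the two could not collide in the first place, and an assignment to l_total inside increment( ) would look suspicious at first glance.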

The second argument for "scope prefixes" I'd like to present is related to the counter-argument "variables are always local, and who gets to say what's called global in an object-oriented application anyway?" It certainly isn't easy to answer the latter question, but it helps to approach the issue from a different angle: When I change the contents of that foo thingy over there - then what is the scope (or lifetime, if you prefer) of that change? Am I only affecting the execution context of the method I'm currently in, and whatever I change gets discarded when the method is processed anyway - or am I permanently changing the state of the current instance, or even the entire class? You may want to call these changes "local" and "global", for want of a better name. Of course, this can be easily determined with a simple double-click, but it's much easier to write and maintain code if you've got that information embedded in the identifier. If you use this consistently, especially as an entry-level programmer, you will find that after a time, using something prefixed with 'g' will start to feel slightly awkward - should this really be global, do I need this anywhere else, can't I just make this a local variable? - whereas accessing something prefixed with 's' outside of some specially designated factory methods will set off alarm bells at once just by reading it. I've seen many bugs that ultimately could be traced back to an accidental modification of an attribute value (often after an improperly executed refactoring operation), and again, these bugs are really hard to find because they usually only surface much later in the application, and even then only under particular circumstances.

Now for the data type designator - what's the point in distinguishing between elementary fields, structures, tables and references? Convenience, mostly, but there are a few cases where these prefixes really can help you spot problems. ABAP has a history of peculiar conversion rules, and the one I'm after here is the conversion between flat structures and single fields. Let's assume you're coding a data object for some translatable entity (something with a separate language-dependent table that contains the translatable parts of the object), and you're keeping the text in an attribute named description. Some day you note that you've forgotten to add an accessor method for that, so you write a getter that returns a STRING, something as simple as r_result = description. Clean and obvious - until you notice that your application displays '100DEFoobar' instead of 'Foobar'. Whoops. Looks like description wasn't only the text, but you decided to store the entire text table entry instead (which makes perfect sense, especially if you've got multiple translatable fields). If you had named that attribute gs_description, you would have noted that immediately and prevented that mistake. Now this is an easy example, but I've seen this error happen to experienced developers in large applications, and again, it takes time and patience to hunt down that unwanted client number in the outbound message stream.
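The conversion trap can be reproduced in a few lines. This is only a sketch: the structure type ty_s_text with its component names and lengths is an assumption standing in for a real text table line, and it relies on the structure being flat and purely character-like, which is what lets the assignment pass without complaint.

```abap
REPORT z_struct_conv_demo.

" Illustrative only: type, component and variable names are hypothetical.
TYPES: BEGIN OF ty_s_text,
         client   TYPE c LENGTH 3,
         language TYPE c LENGTH 2,
         text     TYPE c LENGTH 20,
       END OF ty_s_text.

" Unprefixed on purpose: nothing in the name says "this is a structure".
DATA description TYPE ty_s_text.

DATA g_wrong TYPE string.
DATA g_right TYPE string.

START-OF-SELECTION.
  description = VALUE #( client = '100' language = 'DE' text = 'Foobar' ).

  " Looks harmless, but the whole structure is treated as one character
  " field, so client and language end up in front of the text.
  g_wrong = description.

  " What was actually intended: only the text component.
  g_right = description-text.

  WRITE: / g_wrong,
         / g_right.
```

Had the variable been called gs_description, the first assignment would read g_wrong = gs_description, and the 's' sitting right next to a string target is exactly the kind of hint that makes you stop and reach for the component instead.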

With internal tables, this kind of conversion error will not occur, but there's another quirk in the ABAP syntax that virtually every newbie has to master. Consider the following two examples:

DATA: sflight TYPE TABLE OF sflight.

" somehow, magically, fill that table with some values

LOOP AT sflight ASSIGNING FIELD-SYMBOL(<sflight>).
  " some condition, don't care, whatever...
  DELETE TABLE sflight FROM <sflight>.
ENDLOOP.

as opposed to

DATA: sflight TYPE TABLE OF sflight.

" somehow, magically, fill that table with some values

LOOP AT sflight ASSIGNING FIELD-SYMBOL(<sflight>).
  " some condition, don't care, whatever...
  DELETE sflight FROM <sflight>.
ENDLOOP.

Now granted, that's taking it to the limits, but the experienced ones among you will know how inventive the inexperienced among you can get when introduced to ABAP. I might have turned that into another trapdoor article, but just to reiterate: the first example will modify the internal table while the second example will actually kill selected contents of the database table. The issue here isn't only that this will be dangerous for the application data - bugs like these are usually so catastrophic in their results that they tend to be found early on - but that it is much harder to determine the intent of the developer and pinpoint the bug. Now let's compare this to a prefixed version:

DATA: lt_sflight TYPE TABLE OF sflight.

" somehow, magically, fill that table with some values

LOOP AT lt_sflight ASSIGNING FIELD-SYMBOL(<ls_sflight>).

  DELETE TABLE lt_sflight FROM <ls_sflight>. " OK - we're modifying the local table
                                             " contents here

  DELETE TABLE sflight FROM <ls_sflight>.    " syntax error: The field "SFLIGHT" is
                                             " unknown, but there is a field with
                                             " the similar name "LT_SFLIGHT".


  DELETE lt_sflight FROM <ls_sflight>.       " ATC complains about "Non-numeric index
                                             " access on internal table"
                                             " and suggests correct variant above.

  DELETE sflight FROM <ls_sflight>.          " OK - changing the database contents

ENDLOOP.

Also, after a while, typing LOOP AT lt_ followed by Ctrl+Space becomes a habit, and the code completion will present only the local tables - without robbing you of the flexibility to simply type LOOP AT <Ctrl+Space> and still have access to the full list of options.

All things considered, prefixes for variable names aren't as bad as some articles make them. Prefixes are a workaround, a compromise between what we have, what we can get, what we need and most importantly what we don't want. If you've ever run into any of the bugs I mentioned above, either in your own code or someone else's, you know what I'm talking about. If you haven't (yet?), still please consider the junior members of your team and the people who succeed you and have to maintain your code. Someone may be very grateful for those few extra characters that really take no time either typing or reading.
