The difference between data and actuallyUsefulReference.
This is the 13th post in my series: Coding Without the Jargon, where I show how I clean up and gradually improve real-world code.
You’ll often hear that it is important to add comments to your code. I don’t think that is quite true.
I see comments as a means to an end. The goal is readable code, and comments are just one way to try to get there.
Rule #13: Put effort into naming things.
One of my recurring messages is: have some coding convention and be consistent. A convention for naming things is part of the coding convention.
For whatever they’re worth I’ll describe some of my naming conventions, which reflect my personal quirks and preferences. Don’t take what follows as gospel.
Comments
The problem with comments is that they are not coupled to the code.
Their only relation to the code is their position within the code. That’s it.
Initially, a comment will clarify the code.
Often the code then changes, and the interspersed comment gets overlooked and remains unchanged.
Finally, what starts out as a helpful comment inadvertently becomes an unhelpful lie.
At best, comments are a liability: the software developer is responsible for keeping them updated as the code evolves.
At worst, comments are booby traps that cause harm at undesirable moments.
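A minimal JavaScript sketch of how that drift can happen. The function names, the constant and the numbers are hypothetical, made up for illustration:

```javascript
// The comment below was accurate when the limit was 3. Later the limit
// was raised to 5 and the comment was overlooked, so now it lies.

// Allow at most 3 login attempts
function canRetryLoginWithStaleComment(attemptCount) {
    return attemptCount < 5;
}

// Moving the number into a named constant makes the comment unnecessary,
// so there is nothing left that can fall out of sync with the code.
const MAX_LOGIN_ATTEMPTS = 5;

function canRetryLogin(attemptCount) {
    return attemptCount < MAX_LOGIN_ATTEMPTS;
}
```

Both functions behave identically; only the second one explains itself without a comment that can go stale.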
Better Than Comments
I put a lot of effort into judiciously naming and renaming the various elements in my code: variables, functions, constants…
I find it useful to re-read and tidy up my code at most a few days after writing it. By then, I still remember enough about how it all works, but any details that are not obvious will become apparent to me as I try to make sense of what I’ve written.
A helpful use of comments is to use structured comments to mix API documentation into the code. I can run a documentation-generation tool to extract that info into readable documentation. That works great.
In rare circumstances, I’ll use a comment to point out a ‘gotcha’: an area where the code looks like other code, but is subtly different. Such a comment alerts the reader to a detail they might not expect.
Apart from structured documentation and gotcha comments, I will only add a comment if I really, really cannot make the code self-explanatory by reshuffling statements, renaming elements, adding white space, dividing into smaller chunks, grouping larger actions into bite-sized thoughts…
Naming
Things I consider when naming things: I’ll rarely, if ever use variable names like i, j, idx, count,… Those names convey very little meaning.
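To make the contrast concrete, here is a small sketch in JavaScript. The data and all variable names are hypothetical; the two loops compute the same number and differ only in how much the names tell the reader:

```javascript
// Hypothetical data for illustration: each paragraph is a list of words.
const paragraphList = [
    ["naming", "matters"],
    ["be", "consistent", "always"]
];

// Terse names: the reader has to work out what i, j and count stand for.
let count = 0;
for (let i = 0; i < paragraphList.length; i++) {
    for (let j = 0; j < paragraphList[i].length; j++) {
        count++;
    }
}

// Descriptive names: each name says what it indexes or what it tallies.
let wordCount = 0;
for (let paragraphIdx = 0; paragraphIdx < paragraphList.length; paragraphIdx++) {
    const wordList = paragraphList[paragraphIdx];
    for (let wordIdx = 0; wordIdx < wordList.length; wordIdx++) {
        wordCount++;
    }
}
```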
I don’t have a rigid approach to how the names are spelled out. UPPER_CASE, lower_case, camelCase, first letter upper- or lowercase… that all depends on the project and language at hand. There might be existing ‘best practices’, which I’ll try to follow.
Like As Like
What I am rigid about is the consistency. I want similar elements to be named similarly, to draw attention to the fact that they’re similar. Below, the pairs marked yes are variable names I might use together in my code; the pairs marked no are inconsistently named variables I’ll try to avoid using within the same project.
It’s not about the names themselves, but rather about the consistency in how the names are chosen.
yes: authorPtr and bookPtr
no: authorPtr and pBook
yes: numParagraphs and numCharacters
no: numParagraphs and charCount
yes: paraCount and charCount
no: numPara and charCount
yes: STATE_IDLE and STATE_EXPECTING_LETTER
no: IdleState and STATE_EXPECTING_LETTER
Depending on the context (language, complexity of the project), I might also cram some ‘meta-info’ into the variable name.
Parameters
Most of the time, I prefer to use call-by-value behavior, and use return to hand back results, but for some projects, I use call-by-reference and have some more complicated mechanism for passing data in and out of methods.
In those cases, I might use a name prefix (in_…, io_…, out_…) on function parameters to differentiate parameters that are ‘by value’ vs parameters that are ‘by reference’.
Pseudocode example:
function doSomething(in_someValue, io_someReference, out_someResult) {
    ...
}
in_... means: used for data being passed in.
out_... means: used for data being returned.
io_... means: data being passed in, then modified in the method, so different data is being returned.
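A runnable JavaScript sketch of these prefixes. The function countWords and its data are hypothetical, made up for illustration. JavaScript passes object references by value, so a function can modify the properties of an object it is handed, which is what the io_ and out_ parameters rely on here:

```javascript
// in_text:    only read, never modified
// io_stats:   object that is read and then updated in place
// out_result: object that receives the result; prior content is ignored
function countWords(in_text, io_stats, out_result) {
    var wordList = in_text.split(/\s+/).filter(function (word) {
        return word.length > 0;
    });
    io_stats.totalWordCount += wordList.length;
    out_result.wordCount = wordList.length;
}

var stats = { totalWordCount: 10 };
var result = {};
countWords("put effort into naming", stats, result);
// stats.totalWordCount is now 14, result.wordCount is 4
```

At the call site nothing marks which arguments get modified; the prefixes in the function signature are the only hint, which is why they earn their place in the name.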
I will express things like ‘const-ness’ through language features of the language at hand.
Meta-Information In Names
I don’t use Hungarian Notation, but I do use some similar ideas, where the name of the entity reflects some additional meta-information about it, meta-information that helps a human interpreter of the code understand some of the less-obvious relations in the code.
An example: I have variables that store JSON strings, and corresponding variables that store a deserialized object. In my code I will use names like thingJSON and thing, which indicate to me that thingJSON is a string which can be passed into some JSON parser, and thing is the deserialized result.
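A short sketch of that convention using the standard JSON.parse and JSON.stringify functions. The book data and the variable names are hypothetical:

```javascript
// The ...JSON suffix marks a string that still needs parsing.
const bookJSON = '{"title": "Naming Things", "pageCount": 120}';

// The bare name marks the deserialized object.
const book = JSON.parse(bookJSON);

// Going the other way, the suffix returns along with the string type.
const updatedBookJSON = JSON.stringify(book);
```

With this naming, an expression like bookJSON.title looks wrong at a glance, because the suffix says the variable holds a string and not an object.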
For pointers to things I might use either a p... prefix or a ...Ptr suffix, mostly depending on whatever convention is already in force throughout the project. I have no real preference for one over the other. The main thing is: it needs to be consistent throughout the project.
I’ve spent a lot of time writing C and C++ code, and one habit I formed is to use ALL_UPPERCASE for constants and macros. It’s a habit, in my opinion neither better nor worse than using prefixes like c... or k... for global constants.
Booleans
In the same vein, I will try to name variables and functions that handle boolean values such that their name reflects their boolean nature: isValid, hasDelimiter, isCachedValueAvailable,…
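A small JavaScript sketch of how such names read at the point of use. The helper body and the sample data are assumptions made for illustration:

```javascript
// Hypothetical helper: the has... prefix signals a true/false answer.
function hasDelimiter(text, delimiter) {
    return text.indexOf(delimiter) >= 0;
}

const settingLine = "name=value";
const isValid = hasDelimiter(settingLine, "=");

// With boolean-sounding names, the condition reads almost like a sentence.
const lineStatus = isValid ? "ok" : "missing delimiter";
```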
Collections
With collections, I’ll try to reflect the type and structure of the collection. Instead of a bland variable name like users, which does not hint at how those users are stored, I’ll prefer to use variable names like userList, userMapByName, userSet… to hint at the underlying data structure.
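A sketch of the same user data held in three different structures, using the built-in Array, Map and Set. The user records are hypothetical, and userNameSet is an assumed variation on the userSet name for a set that holds only the names:

```javascript
// Hypothetical user records for illustration.
const userList = [
    { name: "ann", email: "ann@example.com" },
    { name: "bob", email: "bob@example.com" }
];

// ...MapByName says both the structure (Map) and the key (name).
const userMapByName = new Map(
    userList.map(function (user) { return [user.name, user]; })
);

// ...NameSet says this holds names only, with no duplicates.
const userNameSet = new Set(
    userList.map(function (user) { return user.name; })
);

// The name of each variable hints at how to access it.
const firstUser = userList[0];
const bob = userMapByName.get("bob");
const isKnownUser = userNameSet.has("ann");
```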
Namespacing
I also like using some namespacing mechanism, especially for stuff that is globally shared.
Rather than make everything global, I prefer to use namespacing techniques. Some languages (e.g. C++, Java) have namespacing features built-in and then I’ll use those.
Other languages don’t have namespacing in which case I’ll use some approximation – e.g. in JavaScript I’ll use global objects or functions and stash stuff in them. In Xojo I’ll use modules or classes to stash global objects.
This increases the odds that the code is re-usable and reduces the risk of a clash with other code.
Some sample code from a project generated by CEPSparker:
if (! SPRK.C) {
    SPRK.C = {}; // stash constants here
}
...
SPRK.C.APP_CODE_AFTER_EFFECTS = "AEFT";
SPRK.C.APP_CODE_BRIDGE = "KBRG";
SPRK.C.APP_CODE_DREAMWEAVER = "DRWV";
SPRK.C.APP_CODE_FLASH_PRO = "FLPR";
SPRK.C.APP_CODE_ILLUSTRATOR = "ILST";
SPRK.C.APP_CODE_INCOPY = "AICY";
SPRK.C.APP_CODE_INDESIGN = "IDSN";
SPRK.C.APP_CODE_PHOTOSHOP = "PHXS";
SPRK.C.APP_CODE_PHOTOSHOP_OLD = "PHSP";
SPRK.C.APP_CODE_PRELUDE = "PRLD";
SPRK.C.APP_CODE_PREMIERE_PRO = "PPRO";
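The sample above assumes the SPRK object already exists. A self-contained sketch of the same global-object technique follows; MYAPP and everything stashed inside it are hypothetical names, not part of CEPSparker:

```javascript
// MYAPP is a hypothetical namespace name, made up for this sketch.
// Reuse the namespace object if some other file already created it.
var MYAPP = MYAPP || {};

if (! MYAPP.C) {
    MYAPP.C = {}; // stash constants here
}

MYAPP.C.DEFAULT_PAD_CHAR = " ";
MYAPP.C.MAX_LINE_LENGTH  = 80;

// Functions can live in the same namespace object.
MYAPP.isLineTooLong = function (line) {
    return line.length > MYAPP.C.MAX_LINE_LENGTH;
};
```

Only one name, MYAPP, ends up in the global scope, so the odds of clashing with other code stay low.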
Aligning and Sorting
Some people prefer their code editor to do the formatting for them, but I like to format by hand. I like vertically aligned code and monospaced fonts. I also like to have similar things alphabetically sorted.
Modern code editors like Sublime Text or VSCode make it very easy to keep things sorted and aligned.
This helps me visually spot discrepancies. Imagine I added a new constant and made a consistency mistake, like:
SPRK.C.AP_CODE_EXPRESS = "EXPRESS";
Note AP_ instead of APP_: among the sorted, aligned lines, it would stand out like a sore thumb.
Intermediate Results
A powerful technique is to store intermediate results into temporary variables with meaningful names.
Rather than write out long, complicated expressions, I’ll evaluate and store the subexpressions and then combine them in a final expression.
This has two advantages:
– it can explain the code better, without need for a comment
– it makes the code easier to debug
During a debug session, I can inspect the intermediate result, rather than being forced into an all-or-nothing situation.
Sample snippet in JavaScript: instead of
var padding = new Array(len - retVal.length + 1).join(padChar);
retVal = padding + retVal;
I’ll write:
var padLength = len - retVal.length;
var padding = new Array(padLength + 1).join(padChar);
retVal = padding + retVal;
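For a version that can be run and stepped through in a debugger, the snippet can be wrapped in a function. The name leftPad, its parameter order and the guard against negative lengths are assumptions added for this sketch, not part of the original code:

```javascript
// Hypothetical wrapper around the snippet above.
function leftPad(text, padChar, len) {
    var retVal = String(text);
    var padLength = len - retVal.length;
    // new Array() throws a RangeError on a negative length,
    // so only pad when the text is shorter than len.
    if (padLength > 0) {
        var padding = new Array(padLength + 1).join(padChar);
        retVal = padding + retVal;
    }
    return retVal;
}

var invoiceNumber = leftPad(42, "0", 5); // "00042"
```

A breakpoint on the padding line shows padLength directly, which is the intermediate value most likely to be wrong.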
Regular Expressions
Naming regular expressions can also be helpful.
Regular expressions are notoriously ‘write once, read never’ constructs.
Once you’ve figured one out, you never want to dissect it again. Using named constants will help make the code readable. JavaScript example:
const REGEXP_TRIM = /^\s*(\S?.*?)\s*$/;
Depending on the context, there can also be a performance benefit: regular expressions need to be compiled into an internal representation, which can be performance-intensive, and using a named constant rather than a literal can be faster because the regular expression only needs to be compiled once.
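A sketch of the named constant in use. The helper name trimText is hypothetical; the regular expression is the one shown earlier, repeated so the block runs on its own:

```javascript
// Defined once, at the top level, and reused on every call.
const REGEXP_TRIM = /^\s*(\S?.*?)\s*$/;

// Hypothetical helper: keep only the captured group, which is the
// text without its leading and trailing white space.
function trimText(text) {
    return text.replace(REGEXP_TRIM, "$1");
}

const trimmedTitle = trimText("  hello world  "); // "hello world"
```

At the call site the reader sees REGEXP_TRIM and can move on without decoding the pattern. Whether the constant is also faster depends on the JavaScript engine, so any speed-up is worth measuring before relying on it.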
Next
Putting some effort into naming coding elements in a consistent and helpful manner pays off.