Hacker News Clone

Syntax highlighters are wrong

by jamesfisher on 5/11/2014, 2:30 PM with 124 comments

by pcmonk on 5/11/2014, 3:47 PM
I disagree about insertions and deletions. The primary purpose of colors should be to communicate more information without us having to think about it. Having standard color schemes is really helpful. If every service uses different insertion/deletion colors, that's just more headache for us.
Additionally, to me, red deletion doesn't mean "this was a bad action". It means "this was bad code, so we're crossing it out". Semantically, that makes total sense to me. Red is bad old code, green is new good code. Seeing a lot of red in code means the same to me as seeing a lot of red on a marked up copy of an essay (as long as it wasn't a professor who marked it up) -- I've identified a lot of improvements to make.
by artursapek on 5/11/2014, 3:57 PM
I think the author is overthinking it in the second section: we're not discouraging deleting code by marking it red. We're simply denoting that it was removed, or killed off (he drew the connection of red to blood, and green to new life).
However I don't understand why diff views abandon syntax highlighting. I understand there's sometimes a challenge of breaking up highlighting that depends on syntax that spans multiple lines, but it seems like we could do better than marking entire lines bright red and green and leaving the text black. This is the biggest thing that confuses me about Github - they've left it like this for years and it's probably the most looked-at part of their site.
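One way to picture keeping both kinds of information is to treat the diff status as a per-line attribute (a background or gutter mark) and the syntax class as a per-token attribute (a foreground color). This is only a rough sketch under my own assumptions: the `classify` tokenizer is a toy that knows Python keywords and `#` comments, and `highlighted_diff` is a hypothetical name, not how GitHub or any real diff viewer works.

```python
import difflib
import keyword
import re

# Toy tokenizer: a '#' comment to end of line, a word, a run of whitespace,
# or any single other character. It does not understand string literals.
TOKEN = re.compile(r"#.*|\w+|\s+|.")


def classify(token):
    """Return a syntax class for one token (hypothetical class names)."""
    if token.startswith("#"):
        return "comment"
    if keyword.iskeyword(token):
        return "keyword"
    return "plain"


def highlighted_diff(old_lines, new_lines):
    """Pair each diff line's status with its syntax-classified tokens."""
    status_by_marker = {"+": "added", "-": "removed", " ": "unchanged"}
    rows = []
    for line in difflib.ndiff(old_lines, new_lines):
        marker, text = line[0], line[2:]
        if marker == "?":
            # ndiff's intraline hint rows carry no source text; skip them.
            continue
        tokens = [(classify(tok), tok) for tok in TOKEN.findall(text)]
        rows.append((status_by_marker[marker], tokens))
    return rows
```

A renderer could then draw the status as a faint line background and still color the tokens, so neither channel has to replace the other.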
by timr on 5/11/2014, 5:17 PM
"Every comment here is redundant. The textual decoration attempts to assert its own importance, adding two non-semantic lines to every comment. And indeed the eye is drawn to the useless comments instead of the code. So, after looking at these useless comments all day, what do we do? We use our syntax highlighter to turn them off!"
Is it really such a burden to scan a few comments? Come on.
The real problem here is that most developers are total brats about reading code. They'd rather cut off their hand than read anything, and once you're in that mindset, anything that you can latch onto is the target for excessive, self-affirming hostility about the stupidity of the author: those comments are just useless! The idiot who wrote this code must be terrible! I now feel slightly more justified in re-writing everything!
Given that developers almost never write comments for anything, I'd rather see someone erring on the side of over-commenting. If I see code that has been commented in that way, and I see that the other methods are well-commented, I know that the developer was probably more careful than the person who wrote no comments at all. I'm more likely to trust the code. Moreover, having your editor insert a comment on every new method is really a best practice -- it gets you in the habit of documenting, because you always have that empty comment template, staring at you.
by pyre on 5/11/2014, 5:12 PM
The author seems to brush off criticisms of his/her diff color ideas:
```
  [Edit: quite a few people disagree, which isn’t
  surprising, coming from people that already read
  diffs all day. It would take some time to retrain
  your eye.]
```
Comments like "well these people are just stuck in their ways," don't really disprove criticisms of your ideas.
Also, the author seems to choose the colorscheme for comments based on his/her ideas of how one should use comments. This is less an argument of how right/wrong current colorschemes are and more of a battle of "More Comments, Be Explicit" vs "Less Comments, The Code is Documentation".[1]
[1] I'll weigh in my anecdotes on the "code should be documentation" idea. I've been places where this was the mantra, and the code was a complex mess of spaghetti. The idea that all I had to do was "follow the code" is laughable when "following the code" could be a several hour affair just trying to determine what one option to a class did. Or usage of variable naming schemes from the 90's where everything was an abbreviation (presumably to keep variable names as close to 8 characters as possible) or avoiding name collisions by prepending characters ('ppc' already exists, so use 'qppc', but in other source files use 'rppc' or 'PPC').
by JoshTriplett on 5/11/2014, 5:03 PM
The syntax highlighting I use strongly emphasizes comments, making them bold and bright; I'm not at all a fan of highlighting schemes that fade out comments, like github's.
And I definitely agree about redundant comments. One of my earliest CS classes made it a requirement to have a comment for every line of code; those homework assignments looked like abominations (above and beyond being written in C++). These days, I try to avoid writing code that's too clever for its own good, and on the rare occasions when I can't make code self-documenting, I comment it instead.
As for the insertion/deletion item, though: I read so many diffs that I have a strongly established association for red->deletion->yay. Doubly so for the diffstat: always great to see -bignum +smallnum.
Regarding the linked article about "semantic highlighting", I've seen numerous complaints equating existing syntax highlighters to highlighting all the verbs in English prose, and "why on Earth would you want to do that?". Code is not prose; highlighting the verbs makes perfect sense given that code is a set of instructions. A better analogy would be to a step-by-step instruction manual, and I've seen many instruction manuals that highlight the verbs in each step.
by dionidium on 5/11/2014, 3:42 PM
I believe the reason for this strange color scheme is the lack of a revision control system. Back in the dark ages of programming, we didn’t use them. We edited files on disk, and that was that. In that environment, a deletion is dangerous. If you decide you want it again after you delete it, well, that’s tough.
I don't think this explanation makes any sense. What was this person from the dark ages diffing against? Presumably, something on disk (i.e. something whose contents are recoverable).
by ma_mazmaz on 5/11/2014, 4:20 PM
This is blatantly wrong: "red means bad and green means good. This association is cross-cultural, probably universal, and probably as old as the hills: red as blood, green as grass. "
http://youtu.be/z2exxj4COhU?t=16m6s
by jroseattle on 5/11/2014, 4:47 PM
Because there is only one way for it to be "correct", eh?
Seriously, to each his own. My IDEs all support customizable syntax highlighting to use whatever I want. They all ship with some default settings, but I mod them every time to support what works for me.
That's the correct answer -- whatever works best for the individual is what is "correct".
by Velox on 5/11/2014, 3:46 PM
One of the main rants in this is about superfluous comments. I completely agree that comments which are superfluous should be removed. However all his examples are JavaDoc. Yes, in the code they are useless because of the variable declarations, however they are of great use in an IDE.
by pubby on 5/11/2014, 3:53 PM
Aw, I was hoping the article was going to be about the highlighter's grammar rather than its colors. Perhaps there exists a system other than "colorize keywords and comments". Or maybe build color into the language itself, as done by colorForth.
by juliendorra on 5/11/2014, 4:02 PM
About comments being washed out: in the quest of writing good source code for kids (easy for kids to read, adapt and use) I found that comments should not only be an integral part of the source, but probably should not be highlighted as a single block with a uniform color. For kids and beginners, we should have a highlighter for comments that bolds words between *, that puts some emphasis on titles in comments, and that makes function/variable names in comments jump out. Also, we could highlight inline comments (at the end of a line of code, generally frowned upon but potentially useful for beginners and kids) as yellow stickies.
by bayonetz on 5/11/2014, 5:20 PM
I agree that syntax highlighters are wrong...except when they are right!
There are times when you want the comments to recede into the background, such as when you are actively creating some new code and aren't reading your own comments. This is a great time to have comments greyed out. In fact, it is probably the most common case for a syntax highlighting IDE - to help you WRITE code.
Of course there are times when you want comments to stand out and scream at you, such as when revisiting your old code or trying to penetrate someone else's code. This is a great time to have them highlighted in pink as the OP suggests.
You could imagine a hybrid, where inline comments are pink but big multiline comments, like those that precede methods and classes, are greyed out. Or a million other variations, depending on your context.
Of course, you can sort of do this already by customizing the colors in the IDE's settings area. But you only get to pick one context. Or maybe you can create a couple, but then you have to go back to settings and at least toggle between them.
Bottom line, I think this article could be better if it developed this idea that IDEs/Syntax Highlighters could be improved by giving users a good way to EASILY choose different code reading contexts/modes on the fly, such as "authoring mode", "reviewing someone else's code mode", etc.
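As a rough sketch of what such switchable modes could look like, here is a style table keyed by mode, with a one-call toggle. Everything in it is made up for illustration: the mode names, the token kinds, the style strings, and the `Highlighter` class do not correspond to any real editor's API.

```python
# Hypothetical style tables: one mapping of token kind -> style per reading mode.
STYLES_BY_MODE = {
    "authoring": {"comment": "grey", "keyword": "blue"},
    "reviewing": {"comment": "bold pink", "keyword": "blue"},
}


class Highlighter:
    """Looks up a style for a token kind under the currently active mode."""

    def __init__(self, mode="authoring"):
        self.set_mode(mode)

    def set_mode(self, mode):
        # Switching modes is a single call, with no trip to a settings dialog.
        if mode not in STYLES_BY_MODE:
            raise ValueError(f"unknown mode: {mode!r}")
        self.mode = mode

    def style_for(self, token_kind):
        # Token kinds a mode does not mention fall back to the default style.
        return STYLES_BY_MODE[self.mode].get(token_kind, "default")
```

Bound to a keyboard shortcut, `set_mode` would let the same file go from greyed-out comments while writing to loud comments while reviewing.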

by vorg on 5/11/2014, 10:10 PM

Why stop at highlighting? The syntax should be customizable also. Who needs to repeat the `protected` keyword, initial null value, or semicolons before the line ends? Perhaps instead of...

  protected HashMap children = new HashMap();
  protected int backgroundProcessorDelay = -1;
  protected LifecycleSupport lifecycle = new LifecycleSupport(this);
  protected ArrayList listeners = new ArrayList();
  protected Loader loader = null;
  protected Log logger = null;
  protected String logName = null;
  protected Manager manager = null;
  protected Cluster cluster = null;
  protected String humanReadableName = null;
  protected Container parent = null;
  protected ClassLoader parentClassLoader = null;

...we could write...

  protected(defaultInitialValue: null){
    HashMap children = new HashMap()
    int backgroundProcessorDelay = -1
    LifecycleSupport lifecycle = new LifecycleSupport(this)
    ArrayList listeners = new ArrayList()
    Loader loader
    Log logger
    String logName
    Manager manager
    Cluster cluster
    String humanReadableName
    Container parent
    ClassLoader parentClassLoader
  }

And that's just for starters!

by unfamiliar on 5/11/2014, 6:58 PM
In practice, I usually choose a colour scheme based on how important the comments are. There exist plenty which either emphasise or de-emphasise the comments; just choose one that is appropriate for the way your project is commented. And I disagree that all of those comments you removed are useless. They make the code far more readable and remove any ambiguity about what is represented by the variables.
>Why, then, are we psychologically rewarding additions with green, and punishing deletions with red?
I'm sorry, but this is absurd. And with regards to the replacement blue/yellow scheme:
> Perhaps it’s not as pretty, but it’s more usable
It is far less usable. I glance at the red/green and know immediately which is which. I have no idea whether yellow or blue is deletion or insertion at first glance.
This post could have touched on much more, for example: should we really be highlighting keywords, shouldn't we just be emphasising flow control and identifiers? There are many interesting ideas to be explored in this area, such as highlighting rvalues/lvalues in C++, highlighting scope instead of keywords, etc.
by buro9 on 5/11/2014, 4:54 PM
I think the source of the problem with comments is that we try to make them serve two purposes:
1) Give the code clarity
2) Be the source of documentation
These goals aren't easily reconciled: too many comments reduce clarity whilst increasing the likelihood that they are inaccurate and not maintained.
Documentation generated from comments seldom explains how to use the APIs that a library or program exposes, which usually results in the addition of more comments, further reducing code clarity.
The real problem with comments is that too few of us write useful documentation.
I can't remember who said that the best thing a programmer can do is to learn to write.
Our job isn't just to write code, it's also to make sure the code is used... meaning we need to learn how to sell our code to people who will maintain and implement things against our interfaces. We need to sell to other devs, by documenting better and explaining how to use our code. Only then can we reduce the in-line comments that reduce readability of code.
by harsh1618 on 5/11/2014, 4:29 PM
I agree with most of the first section but there are scenarios where washed out comments are better. If I comment out a block of code while debugging, I don't want it to be bold and highlighted. Maybe highlighters should be able to differentiate between natural language and code.
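A crude way a highlighter might tell the two apart is to check whether the comment body parses as code. This is only a heuristic sketch for Python sources, and `looks_like_code` is a hypothetical helper name: short prose that happens to be valid syntax (for example "not done") would be misclassified.

```python
import ast


def looks_like_code(comment_text):
    """Guess whether a comment body is commented-out Python rather than prose."""
    stripped = comment_text.strip()
    if not stripped:
        return False
    try:
        tree = ast.parse(stripped)
    except SyntaxError:
        # Most natural-language sentences are not valid Python.
        return False
    for node in tree.body:
        # A lone name or literal ("TODO", "42") parses fine but is more likely prose.
        if isinstance(node, ast.Expr) and isinstance(node.value, (ast.Name, ast.Constant)):
            continue
        return True
    return False
```

A highlighter could then keep prose comments prominent while fading anything for which `looks_like_code` returns True.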
by robert_tweed on 5/11/2014, 4:26 PM
I completely disagree that comments should be called out in bold because they are supposed to add important information. Even in code where that is the case, they are still just metadata. Comments are there to help you understand something. Comments should never be shouting at you, drowning out the code itself. We only have so much attention to spare, and most of that attention should be focused on the actual code, not the comments.
The biggest danger of comments (especially in heavily over-commented code) is that they can be misleading. If you get too used to relying on the comments as a true indication of what the code does, it's easy to be misled. It can especially cause you to miss subtle errors in the code (like an = instead of ==, or > instead of <).
Your eye should be drawn to the code first and foremost. Hence it is most important that it is formatted neatly and really, syntax highlighting is just a good way to catch typos quickly.
Comments should be what you read second, not what you read first. I.e., I just read this bit of code and it seems a bit weird, so I'll check the comment - oh yeah, now I understand what it's doing. Most comments should be ignored most of the time.
"we have collectively decided that the comment is less important than the code" - this is simply because the comment is less important than the code. Comments are not necessarily correct and don't always accurately represent what the code does. The code however, is always 100% accurate. That's why experienced programmers rely on the code first, comments second and try to avoid commenting ideas that can be expressed equally well in the code itself.
The use of Javadoc style comments is another strawman. As much as we might wish it were not necessary, using comments to generate documentation is actually useful. Similarly, having a coding standard requiring a comment block per class/function for the docs is also useful. A standardised structure is helpful when reading lots of code, and it helps delineate long source files, even if many of those comments turn out to be redundant. Having a highlighter scream about all those comments being super important isn't helping by making people write more concise comments - it's merely trying to solve a problem that doesn't exist and creating a new problem in doing so.
Sometimes I need to comment something that really is super important like the example in the OP. It turns out, there's a way to do that which doesn't even rely on having any syntax highlighting whatsoever. You just write your comment like this:
```
  // !!!!! IMPORTANT WARNING !!!!! //
  // This does something really dangerous,
  // so don't change it unless you understand it...
  // !!!!! BEGIN CRITICAL SECTION !!!!! //
    if(foo) {
      bar();
    }
  // !!!!! END CRITICAL SECTION !!!!! //
```
I write code that needs something like this maybe once every 6 months. Do I want every single comment that I write called out with equal importance? Nope. Is limiting comments to only comments of such dramatic urgency a good idea? Also no.
A related rookie mistake that I see a lot is equating less code with less complexity. Inexperienced programmers like to cram as much logic onto one line as possible, whereas better programmers often write the same thing as one-statement-per-line with a bunch of temporary variables, whose names encapsulate what each operation is doing. Which one do you think requires several lines of comments to explain what it does, and which requires no comments whatsoever? One has less code, but both have the same complexity. Except the one with more code breaks that complexity out into smaller, less complex individual chunks which makes reasoning about the whole much, much easier.
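To make that concrete, here is a small sketch of the same calculation written both ways. The order fields (`price`, `qty`, `coupon`) and the 20% VAT rate are invented for the example.

```python
VAT_RATE = 0.2  # assumed tax rate, purely illustrative


def total_dense(order):
    # Price times quantity, minus the coupon fraction, plus VAT, all in one expression.
    return round(order["price"] * order["qty"] * (1 - order["coupon"]) * (1 + VAT_RATE), 2)


def total_stepwise(order):
    # The same arithmetic, one named step per line; the names do the commenting.
    subtotal = order["price"] * order["qty"]
    discount = subtotal * order["coupon"]
    taxable_amount = subtotal - discount
    vat = taxable_amount * VAT_RATE
    return round(taxable_amount + vat, 2)
```

The dense version needs its comment to be readable, while the stepwise version reads fine with the comment deleted.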
by owenversteeg on 5/11/2014, 4:48 PM
Although I like the idea of comments in a stand-out color, red and green for deletion and insertion is absolutely necessary. It's intuitive and the standard, and using other colors (ugly tones of yellow and purple) is both ugly and confusing. If I showed the red/green screen to someone, they'd intuitively know what's being deleted and not. If I showed the purple/yellow screen to someone, they'd scratch their heads.
Also, the proposal that reviewers will determine whether or not to reject a patch based on how the colors look is preposterous. Imagine the Heartbleed patch being rejected because the colors look ugly. (This is, hilariously, the example the author uses.)
by gojomo on 5/11/2014, 7:53 PM
Fainter doesn't necessarily mean "less important": it may just indicate a separate channel-of-interleaved-information, sometimes skippable. (And, since comments are less rigorous and less-definitive than the code, code readthroughs often want to toggle between considering them, and not considering them.)
Red doesn't necessarily mean danger or disfavor. Its use for removed-ranges owes mostly to longstanding use in editorial review (red pens), emphasis (rubrication), and as indication of stopping/ending (as in "discontinue this text") – without any inherent 'badness' evaluation.
by chenglou on 5/11/2014, 5:04 PM
Related to the linked article at bottom (coding in color: https://medium.com/p/3a6db2743a1e/): I've been using the Sublime Colorcoder plugin (https://github.com/vprimachenko/Sublime-Colorcoder) and beside some instabilities, it's been working well. It's really nice to see the flow of a variable (or more) through a function, without manually highlighting the occurrences.
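The core idea can be sketched in a few lines: derive a stable color from each identifier's name, so every occurrence of the same variable gets the same hue. This is my own guess at the general approach, not a description of how Colorcoder is implemented; `color_for_identifier` and the fixed lightness and saturation values are assumptions.

```python
import colorsys
import hashlib


def color_for_identifier(name):
    """Map an identifier to a stable '#rrggbb' color derived from its name."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    # Use the first digest byte as the hue; keep lightness and saturation
    # fixed so every identifier stays readable on the same background.
    hue = digest[0] / 256.0
    red, green, blue = colorsys.hls_to_rgb(hue, 0.6, 0.7)
    return "#{:02x}{:02x}{:02x}".format(
        round(red * 255), round(green * 255), round(blue * 255)
    )
```

Because only one byte drives the hue, two different names can land on the same color; a real tool would presumably spread hues more carefully within a scope.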
by ryangallen on 5/11/2014, 5:11 PM
Colors are certainly a significant part of user experience, so although I agree that these issues exist, I disagree with the solutions.
Comments shouldn't be pink/red because they are not necessarily a bad thing. Just make them bold and opaque.
Changing to blue and yellow for diffs is just as wrong as red/green but more confusing. Blue signifies safe information and yellow signifies warning. It would be better to use the same color for deletions and additions but make deletions darker to show they are stale/rotted and additions brighter to show freshness.
by jackbauer on 5/11/2014, 5:08 PM
Interesting article. I definitely try to avoid comments and let the code do the talking; I wish more devs would. My eyes ALWAYS skim over comments, unless I can't make sense of some agency dev's code, then I have to look at the comments if they exist.
Although I detest comments, making them stand out more and consequently wanting (hopefully) to write better code, is a plus. There will be some cases for comments, and it should be few and far between, and only super necessary (read Clean Code for contextual examples).
by restlessdesign on 5/11/2014, 5:34 PM
I might take the author more seriously if he didn’t brush off every criticism in the most dismissive, holier-than-thou tone I’ve come across since I was in Williamsburg the other weekend.
by gooseyard on 5/11/2014, 4:41 PM
These "<blank> is wrong" or "you're doing <whatever> wrong" titles are tedious. Tell me about the better one you've written or STFU.
by Patrick_Devine on 5/11/2014, 4:53 PM
This will sound weird, but I actually turn off the syntax highlighter most of the time. I usually use a black background in my terminal, and often the syntax highlighter (or even directory colours in a shell) are set to dark blue, which I have a really hard time distinguishing.
I could remap my colours, I suppose, however I used to often find myself troubleshooting at someone else's terminal. Since then I just found it easier to have it turned off completely.
by rwbcxrz on 5/11/2014, 4:16 PM
Most editors allow you to customize the syntax highlighting color scheme. It's a bit more difficult in a web editor, but is possible with some JavaScript and an extension like Greasemonkey. Then again, how often are you really using the GitHub code browser -- if it's a lot, wouldn't it be easy enough to clone the repo and explore it in the editor of your choice?
by seth1010 on 5/11/2014, 6:40 PM
I disagree with the comment syntax highlighting. If I'm trying to fix a bug in some code, I'll read all the comments and then try to focus on what the code is actually doing. I don't need the comments trying to steal my focus away. Sometimes, I'll get vim to color comments the same as my background so they'll be invisible.
by Semaphor on 5/11/2014, 7:33 PM
I use the default dark scheme for visual studio. Comments are in a well visible green [0].
So much for "virtually every highlighting scheme".
[0] http://imgur.com/Nfw7Lyd
by jayd16 on 5/11/2014, 4:37 PM
So I guess we're just going to have a completely barren Javadoc then?
by matthuggins on 5/11/2014, 3:51 PM
The comment talk I can get behind. However, the change from red/green to blue/yellow for file change is non-intuitive, and I hope to never see it.
by Kiro on 5/11/2014, 4:33 PM
I don't agree with a single thing in this article so please don't tell me what's wrong when there clearly are different views on this matter.
by ttty on 5/11/2014, 6:10 PM
I don't think that in github the additions are green because they are good, but because + = green and - = red
by robobro on 5/11/2014, 5:38 PM
I missed where he suggests a better color scheme. He references it for pink commenting.
by lispm on 5/11/2014, 5:34 PM
I wonder when literature will come standard with syntax highlighting.
by noonespecial on 5/11/2014, 4:36 PM
The code itself is the "truth". The comments are commentary.
```
  //Set x to be 4
  x=5;
```
Which would you rather have bold and jumping out at you when skimming? Comments can be wrong because tweaks happen.
by walshemj on 5/11/2014, 4:40 PM
1st world problems - some people have obviously never had to debug on a 3 inch thick printout on fan fold paper where the only highlighting was the highlight pens you used.
by masterleep on 5/11/2014, 3:36 PM
Almost everyone intuitively knows that red = deletion and green = insertion. Nobody knows what yellow and blue refer to. So how is that more usable?
by ebbv on 5/11/2014, 6:02 PM
This article is way, way off base.
First of all, most syntax coloring de-emphasizes comments because they are not code. They are not necessary. They are potentially helpful hints which may or may not be needed. Ideally they aren't. So, they should not be attention grabbing.
The idea that making them attention grabbing and ugly is great because it encourages you to have as few of them as possible is ridiculous. It's trying to accomplish a reasonable task (minimal comments) by totally ass-backwards means (making the comments the most obvious thing on the screen.) It's like replacing the click of the turn signal noise in your car with a blaring klaxon to try to get old people to turn them off with more regularity.
Here's a tip for the author, who I can only assume is a novice programmer; if there's a universal practice which has been a certain way for several decades and you think it's wrong, re-examine your assumptions.
by mantrax5 on 5/11/2014, 6:20 PM
Looking for a problem where there is none, this author is...
The reason why comments are pale compared to code is because a comment always belongs to a given piece/block of code.
A comment doesn't exist for itself, it exists to clarify the next lines of code.
You scan code, and when the code doesn't say enough, you read the comment. So code is the primary thing, comments are important, but hierarchically they're subservient to code not the other way around.
Also, why on earth would doc comments be "redundant" and have to be removed? Why would green/red be bad for insertions/deletions? Give me a break...
by mc_hammer on 5/11/2014, 3:49 PM
I think this may be correct. Personally I always wanted the syntax highlighter to highlight only function calls and object method calls.. ex: .Trim() would get highlighted in mystr.Trim()
It also annoys me endlessly that words like 'var' and 'def' and 'function' get the brightest highlight. I know it's a variable, it has a name, and I put it there.
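For what it's worth, finding only the called names is not hard if the highlighter has a real parse tree. Here is a rough sketch for Python source using the standard `ast` module; `call_name_spans` is a hypothetical helper, and it assumes ASCII source because the column offsets `ast` reports are UTF-8 byte offsets.

```python
import ast


def call_name_spans(source):
    """Return sorted (line, start_col, end_col) spans covering only called names."""
    spans = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Attribute):
            # For obj.method(...), cover just "method", which ends the attribute expression.
            start = func.end_col_offset - len(func.attr)
            spans.append((func.end_lineno, start, func.end_col_offset))
        elif isinstance(func, ast.Name):
            # For plain calls like print(...), cover the whole name.
            spans.append((func.lineno, func.col_offset, func.end_col_offset))
    return sorted(spans)
```

Given `mystr.strip()`, the only span returned covers `strip`, and keywords such as `def` never show up because they are not call targets.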
by Fasebook on 5/11/2014, 4:47 PM
I thought this was going to be about the ambiguity of some languages when being parsed. Go and Perl are good examples of such languages. Pretty disappointed with the article even forgiving that. Importance is only a matter of perspective.
by mantrax5 on 5/11/2014, 6:25 PM
Hey look, Uncle Bob said something about comments.
DELETE ALL COMMENTS AND CHANGE SYNTAX HIGHLIGHTING!