C++ Question

The answer here is: don’t. Define each variable on its own line. I make a habit of defining everything, even base types, on its own line. It removes any ambiguity and makes the code easier to manage later if you need to remove variables or what have you. The syntax problem you’re having right now is one that someone else could run into when trying to read it. Multiple variables defined on a single line (especially if initialized) are just messy to read.

By the way, the ‘&’ or ‘*’ goes with the variable, not the type. In your first example you’ve declared two variables of different types: one a reference, the other an instance.

They’re tricky.

  1. It’s only a reference in a definition. Effectively, references are just alternate names: any time you touch one, it’s exactly the same as performing the operation on the original.
    int value = 0;
    int& valueAlias = value;
  2. Applied anywhere else, it’s the address-of operator, which returns a pointer.
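To make the two meanings concrete, here is a minimal sketch. The function name ampersandDemo and the variable valuePtr are made up for illustration; value and valueAlias follow the snippet above.

```cpp
// Returns true if the alias and the pointer both refer to the same int.
inline bool ampersandDemo() {
    int value = 0;
    int& valueAlias = value;   // '&' in a declaration: valueAlias is another name for value
    int* valuePtr = &value;    // '&' in an expression: address-of, yields an int*

    valueAlias = 42;           // writes straight through to value

    // Taking the address of the alias gives the address of the original.
    return value == 42 && *valuePtr == 42 && &valueAlias == valuePtr;
}
```

The last line is the part that tends to surprise people: a reference has no address of its own that you can observe, so &valueAlias and &value are the same pointer.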

There’s a stylistic convention in some circles that the ‘&’ or ‘*’ should be aligned with the type (“int* x” instead of “int *x”), but it does lead to that kind of confusion. I guess they think it makes it clearer what the real type of the variable is, but the compiler really treats it as a per-variable modifier, not a modifier of the base type, so now it’s just confusing in a different way.

I never really minded working in C++ once I got used to its quirks, but I can’t really say I’m missing it either now that I’ve been mostly working in Java lately…

But that’s arbitrary. Functions that are created just as notational shorthand end up polluting your global symbol space and increase your interactional surface area, neither of which are good things. “100 lines” or “one page” (which are rules of thumb usually taught in college) are not particularly good metrics for changing the architecture of a program.

That said, yes, it can look gross if you don’t have an editor that supports code folding. If you do, then it’s the difference between:


GatherInputs();
DoSimulation();
RenderScene();

and:


[+] // gather inputs
[+] // do simulation
[+] // render scene

To combat local symbol space explosion, I also aggressively micro-scope with braces and use const variables almost exclusively. I differentiate between the concept of a constant and a variable, even though C/C++ kind of make this hard.


// This is a tight local scope
{
   float const kX = ... // some computation
   float const kY = ... // some computation

   callSomeFunction(kX,kY);
}

Between the scoping and aggressive const usage the likelihood of fucking up and reusing a variable declared 1000 lines earlier is very low. I also get the benefit that I can share information across scopes whereas before it would be either argument passing galore or a global variable.
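A compilable sketch of that style, for anyone who wants to poke at it. Everything here is a stand-in: callSomeFunction is a hypothetical helper that just adds its arguments, and the "computations" are placeholders, since the snippet above leaves them elided.

```cpp
#include <cmath>

// Hypothetical stand-in for callSomeFunction from the snippet above.
inline float callSomeFunction(float x, float y) { return x + y; }

inline float scopedConstDemo(float angle) {
    float total = 0.0f;   // deliberately visible to every scope below

    // Tight local scope: kX and kY exist only inside these braces.
    {
        float const kX = std::cos(angle);
        float const kY = std::sin(angle);
        total += callSomeFunction(kX, kY);
    }

    // The names can be reused safely; the earlier kX and kY are gone.
    {
        float const kX = 2.0f;
        float const kY = 3.0f;
        // kX = 4.0f;     // would not compile: kX is const
        total += callSomeFunction(kX, kY);
    }

    return total;
}
```

Note that total is shared across both blocks without being passed anywhere, which is the "share information across scopes" benefit, while the constants cannot leak out or be reassigned.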

But that’s just my style, and while I recognize a lot of people disdain it stylistically, I’m totally comfortable with that, in the same way I was totally comfortable hating on C++ back in 1997 when everyone thought it was the way of the future. =)

I tend to put the modifiers with the type also, but I was referring to what is actually happening as far as the compiler is concerned, not the style of the code. “int* x, y” and “int *x, *y” are both valid and result in different types of “y”.
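If you want the compiler to confirm it, something like this sketch works (C++11 or later; the function and variable names are invented for the example):

```cpp
#include <type_traits>

// Returns whether the two "y" variables ended up with the same type.
inline bool ysHaveSameType() {
    int* x1 = nullptr, y1 = 0;         // x1 is int*, y1 is a plain int
    int *x2 = nullptr, *y2 = nullptr;  // x2 and y2 are both int*

    static_assert(std::is_same<decltype(y1), int>::value, "y1 is an int");
    static_assert(std::is_same<decltype(y2), int*>::value, "y2 is a pointer");

    (void)x1; (void)x2; (void)y1; (void)y2;  // silence unused-variable warnings
    return std::is_same<decltype(y1), decltype(y2)>::value;
}
```

Both declarations compile cleanly, and the function returns false, which is exactly why the one-declaration-per-line habit is worth having.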

I never really minded working in C++ once I got used to its quirks, but I can’t really say I’m missing it either now that I’ve been mostly working in Java lately…

I’m missing it. I’ve spent most of my career working in C/C++. My current job has me switching between languages depending on the platform (iPhone vs BlackBerry/J2ME). I’m finding that in this sort of environment I really miss the better hold on memory and performance that C/C++ gives me over Java.

I always disdain code arguments that depend on editor features. You’re not always looking at code in a fancy editor, nor should an editor be required to make reading your code not suck. If it’s not readable in Notepad, it’s not readable.

But that’s just my style, and while I recognize a lot of people disdain it stylistically, I’m totally comfortable with that, in the same way I was totally comfortable hating on C++ back in 1997 when everyone thought it was the way of the future. =)

You realize your style effectively re-invents functions by way of scoping? How on earth do you unit test something like that?

You must hate any text files that don’t use CR/LF then right?

Anyway, the issue of ‘readability’ has a local and a global meaning when performing static code analysis. While you can argue that heavy scoping and lack of function calls makes the local flow less visible (if you refuse to use code folding – again, I don’t really care how ‘readable’ my code is when viewed in Notepad or less/more since that is not a real world use case), it actually improves system wide code understanding (due to reduced symbol pollution) and local code dependencies (since side effects are immediately visible).

No, it doesn’t, since a function is not the same as a scope. In a lexically scoped language, functions can be used AS a scope, but functions also have implications on usability – specifically, the appearance of a non-anonymous/lambda function implies that it can (and likely will) be called by more than a single caller.

In my experience it is much better to just remove the call or to locally scope the function until it is demonstrated that reuse is actually required, at which point you can go back and guarantee that the code fragment has all the properties necessary to be safely called from other locations.

Well, I’ve never seen a 5000 line function that didn’t have a lot of copypasta in it, or that could have been written differently (e.g., really big switch/case state machines).

I code regularly and reliably in vi/vim/gvim. To the best of my knowledge it doesn’t support code folding. I wouldn’t think that using standard unix editors (last I used emacs, it too didn’t do code folding, though that’s been a while) would be an unlikely real-world use case.

Another nice feature of local scoping is that if you DO have a piece of code you need to reuse, the work required to yank it out and make a function out of it has mostly already been done (outside of passing in the context as parameters).

If you’re using emacs, there’s at least one way to do it. But then the question becomes, “Why aren’t you using Eclipse or Kdevelop or something modern?”

Well, because my code execution environment is remote from any modern code development environment and there’s a lot of tweaking that gets done in the execution environment. There, the only available choice is vi/vim due to the fact that our only way in is via SSH with a non-graphical installed environment (heavy number-crunchery machines, no overhead available for niceties like an X server) and the fact that, well, I’m also the sysadmin (so there -may- be solutions that don’t place a load on the machine, but sysadmin/programmer/full-time scientist is plenty of work without trying to solve problems that aren’t that important; ah, academia!). As a result, it’s generally easiest to use the least common denominator.

Apparently emacs, due to its ability to execute actual code with its lisp interpreter, posed a security risk at some point, so we don’t use emacs (this was before my time), and again, it’s not worth the headache of trying to fix the security issue. When I undertake my next actual code building (instead of tweaking/massaging/fixing) project, I’ll probably look into at least a local setup that lets me do the bulk first-phase building with a better environment. I actually installed one in my linux VM when I had to rebuild my laptop recently, but, well, I can’t figure out how the hell it’s supposed to work (Anjuta; adding source files and the like confuses the fuck out of me - Visual Studio is a lot easier to work with). I’ll look into the others you mentioned when I get a chance to see if they’re better.

If you’re doing copy/paste to 5000 lines, then clearly either you’re aggressively (and probably misguidedly) doing loop unrolling, or you’re not putting in functions where they should go.

For example, if you have a bunch of matrix multiplies and you unroll them for no reason, then a Mat4x4CatMat4x4() function is probably called for.

In my case, the areas where I have massive functions are straight line pieces of code that are expressly generating side effects.

For example, let’s say I have a file reader that pulls in some chunked style file format. I could do something like:


FILE* fp = openFile(fname);

readHeader(fp,&hdr);
readMeshes(fp,&meshes);
readBlah(fp,&blah);
readFoo(fp,&foo);
...
closeFile(fp);

In my current code that explodes into each of those things inlined.

Again, code folding and proper scoping make it manageable.


FILE* fp = openFile( fname );

// read header
[+] { ... }
// read meshes
[+] { ... }
// read blah
[+] { ... }
// read foo
[+] { ... }
closeFile( fp );

The advantage I gain is that until I need to reuse that functionality, it stays local and all side effects are visible in the surrounding code, e.g. any operations on the FILE* are immediately obvious and I don’t have to guess where the file position will be after a call, if memory was allocated, which parameters to pass, etc.
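Here is roughly what the unfolded version looks like, as a small sketch rather than real production code. Assumptions: the two-chunk file format, the Scene struct, and the function names are all invented for this example, and it reads from a std::istream instead of a FILE* so the demo can run against an in-memory buffer.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical format, invented for this sketch:
//   header: uint32 magic, uint32 meshCount
//   meshes: meshCount x uint32 vertex counts
struct Scene {
    std::uint32_t magic = 0;
    std::vector<std::uint32_t> vertexCounts;
};

// Reads a Scene from the stream; returns false on a short read.
// No validation of meshCount here; a real reader would bound it.
inline bool readScene(std::istream& in, Scene& scene) {
    std::uint32_t meshCount = 0;   // shared between the two scopes below

    // read header
    {
        std::uint32_t header[2] = {0, 0};
        if (!in.read(reinterpret_cast<char*>(header), sizeof(header))) return false;
        scene.magic = header[0];
        meshCount = header[1];
    }

    // read meshes: the stream position is wherever the header block left it
    {
        scene.vertexCounts.resize(meshCount);
        auto const kBytes = static_cast<std::streamsize>(meshCount * sizeof(std::uint32_t));
        if (meshCount > 0 &&
            !in.read(reinterpret_cast<char*>(scene.vertexCounts.data()), kBytes)) {
            return false;
        }
    }

    return true;
}

// Builds a tiny in-memory "file" and reads it back.
inline bool roundTripDemo() {
    std::uint32_t const kData[] = {0xC0FFEEu, 2u, 3u, 8u};
    std::string const kBytes(reinterpret_cast<char const*>(kData), sizeof(kData));
    std::istringstream in(kBytes);

    Scene scene;
    return readScene(in, scene) && scene.magic == 0xC0FFEEu &&
           scene.vertexCounts.size() == 2 &&
           scene.vertexCounts[0] == 3u && scene.vertexCounts[1] == 8u;
}
```

Every read on the stream is visible in one place, in order, and meshCount flows from the header block to the mesh block without becoming a parameter or a global.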

If I find that I need to use a piece of code multiple times but only local to that function, I will create a local function. For example, a comparison function that is only applicable or relevant to one other function.

Finally, this has significant implications for multithreading/multicore software. The more entry points you have, the more potential areas for race conditions, deadlocks and lock churn.

If C/C++ had anonymous/lambda functions then a lot of my use cases would go away for that particular hack.
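For what it’s worth, C++11 and later do have lambdas, so the local comparison function case can be sketched like this. The function sortByDistance and the comparison closerToPivot are made-up names, just to show the shape.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Sorts values by their distance from pivot. The comparison only makes
// sense inside this function, so it lives here as a lambda (C++11 or later)
// instead of becoming a separate named function.
inline std::vector<int> sortByDistance(std::vector<int> values, int pivot) {
    auto const closerToPivot = [pivot](int a, int b) {
        return std::abs(a - pivot) < std::abs(b - pivot);
    };
    std::sort(values.begin(), values.end(), closerToPivot);
    return values;
}
```

Nothing outside sortByDistance can call closerToPivot, so it adds no entry point and no symbol to the surrounding namespace.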

Note that I’m not saying “this is the best style”, I’m saying that it addresses a lot of the issues I have with larger code bases. For me, personally, it has proven to be hugely effective for maintainability and minimizing defect rate. I do not think trading a 5000 line function for 50 x 100 line functions is the right trade off, but if others do, rock out.

Again, I’m totally comfortable with others not liking this style, you either get it or you don’t, in the same way I hate freeform jazz, olives, and bell peppers – I totally understand why others love that shit but I simply can’t. =)

Based on the description of your development environment, I would still argue that is not a real-world use case. =)

OK, that’s legit, but it’s not common.

I should probably have clarified that this is code I’ve seen, not code I’ve written. And you are being far too kind to the original developers with your assumptions here. There are other reasons why people do copy/paste to 5000 lines, such as, “Doesn’t know how to write a for loop” or “Thinks that inlining will dramatically improve performance” and shit like that.

I’ve worked with a lot of folks who are brilliant in their field, but not trained as professional software engineers – you need them to implement their ideas, but the code is awful.

I don’t know that our setup is all that unrepresentative of the academic science subset. There are exceptions where academia is more in line with modern code generation, but there are still a whole lot of scientists out there who program their own stuff as a kind of unintended byproduct of their work. Our setup is probably equally or less restrictive than most government labs, for example.

I think a lot of the resistance to that mega-function style would be from things like the modern emphasis on unit testing; when functions become that big, it becomes very difficult to test them in isolation.

If one of our junior programmers checked in a 5000-line function, I’d be sending them an “Um, no, rewrite it. And where are the unit tests?” note. But that’s just our corporate standards and you probably wouldn’t want to work for us anyway. :)

For me it is more of just a bit of an OCD thing. As Brian said, it IS arbitrary (IMO anyway), but a lot of the things I like/dislike about code style are arbitrary. There are even cases where one style of code convention has some real world benefit but I just can’t bring myself to use it (e.g., when comparing against a constant, list the constant first so that if you mistype the == as = you’ll get a compiler error). I totally get why that is a good idea, and yet because (in natural-language terms) it feels so unnatural to me to list the constant first, I can’t bring myself to follow that convention in my code.
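For anyone who hasn’t seen the constant-first convention, a minimal sketch (kMaxRetries and reachedLimit are hypothetical names for illustration):

```cpp
int const kMaxRetries = 3;   // hypothetical constant for illustration

inline bool reachedLimit(int retries) {
    // Constant first: mistyping '==' as '=' would give "kMaxRetries = retries",
    // which does not compile because kMaxRetries is const.
    if (kMaxRetries == retries) {
        return true;
    }
    // Variable first: the same typo would give "retries = kMaxRetries", which
    // compiles (many compilers only warn), overwrites retries, and is always
    // true here since the assigned value is nonzero.
    return false;
}
```

Turning on compiler warnings catches most of the variable-first typos too, which is one reason plenty of people skip the convention.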

Also, FWIW, I’m not down with code folding, even when I’m coding in languages where code folding is explicitly supported (like C#). There are many modern IDE advancements I love like continuous compiling, advanced syntax coloring, refactoring tools, etc, but I’m old school when it comes to code folding. I always build this big mental map of where things are pseudo-physically in my project files based on distance from other things and code folding screws with that so much I just can’t do it even if the IDE has great folding support and an associated project browser.