The real issue is that in practice it is often difficult to separate these two changes. You may correct code in one place which requires correcting code elsewhere in a cascading effect. If you limit your correction to just one area (in a single commit) then you introduce the possibility of breaking the build. The other issue is that if you fix all the whitespace stuff first you may find the owners are not willing to pull in your commit for various reasons. I often find in open source the smaller the change the more likely it is to be merged. It's a complicated area and unfortunately often the "two different commits" strategy doesn't encompass the full complexity.
Oh and another cool feature is worddiffs for when you're editing text.
Whitespace changes in an open source project are the kind of thing that depends on the code style enforced at the project level - certainly not something you would submit as a newcomer. Do you happen to have an example of what you're explaining here?
My editor is setup to automatically trim trailing spaces. When I edit files from other devs in the group, it's not uncommon for my commits to have whitespace removal in addition to code changes. I really don't think that's a big deal.
Now doing a code format to tweak indenting and brace positioning, etc., that should definitely be put into a separate commit.
Indeed, when I come across empty lines, white space at the end of the lines etc. I remove it all, commit it and then start actually changing the code. White space in diffs is distracting from the actual changes.
Well, if people used a decent editor to start with . . . but that's not a flame war I want to start (at least not today). I've been tempted to turn on checks in Jenkins that would automatically reject certain "broken" changesets: whitespace at EOL (easy enough to check by using the already provided .git/hooks/pre-commit.sample), reduction in number of unit/regression tests (can be overridden by hand, after conferring with at least one other developer), etc.
We have these awesomely powerful systems, with this awesomely powerful software, that oftentimes I'm depressed by people doing things by hand. Heck, if I was really not willing to deal with whitespace retards, I'd just make a hook on the server auto indent and whitespace cleanup all changes before commit.
Can you do that with a server hook? I've looked into this before and thought it had to be a hook in the client repository, as it would modify the commit hash. This is problematic because it requires everyone to add it to all of their repositories themselves.
I could be wrong (I haven't delved too deeply into this), but I think it requires a code review system (such as Gerrit) hooked up to a CI (such as Jenkins). Then you have the code review system require the CI "sign off" (whatever criteria you please) on changesets before they can be approved by a human; no sign off by the CI, no merging the changeset. Of course, if the CI tests are dependent upon something in the repo, this can be defeated . . .
As usual, it comes down to trust: if you can't trust your developers to write good code or not subvert the system, you either need to educate them, or work with better developers (fire them, or quit and work somewhere better). The point at which you have to start enforcing things by machine fiat can be helpful to catch honest mistakes, but it can also be a sign of a dysfunctional team.
This is what you call a "boil the ocean" solution.
All that it takes is that every single user behaves exactly correctly. That's an indication of a poor design. Tools should naturally guide people to the correct usage and be moderately forgiving of minor examples of incorrect usage.
For whitespace fixes (making a mix of tabs vs spaces consistent), I agree with you.
For "I wrapped these 5 lines in a conditional and adjusted the indentation per code standard" in a language like C# I disagree with you, the first commit is ugly and leaves the codebase (temporarily) out of sync with codebase style, which may fail the build or fail to submit in the first place if you enforce such things via precommit hook or build process.
For the same operation in python, where semantics changes are attached to indentation level changes, the first commit is BROKEN, WRONG, AND BAD. The build not failing the first commit reflects poorly on the unit tests.
Increasing the readability of a diff is a noble goal. Splitting the diff in 'twain can help, but it's not a full answer. Neither is keeping commits as small as possible: simple renames can easily touch 30 files at a time, to say nothing of adding parameters or more complicated refactorings which must be committed together or leaves the codebase in a broken state.
Other tools include highlighting difference within a line instead of merely at a per-line level. Ignoring whitespace changes in languages which aren't whitespace sensitive can also help: I personally find the two highlighted lines of an if statement and it's end more informative than the seven highlighted lines which include it's functionally unchanged body that's merely changed scope.
I assume if you're using `git diff`, you're not comparing adjacent commits. And asking git for "diff these two commits but ignore this third one" is nontrivial.
Only reason why is to make pretty diffs. Now git has options so people don't have to deal with anal retentive details like that any more. The best option is of course to have a pre-commit hook that strips trailing whitespace from lines. The absolutely worst option is to force people to spend brain cycles dealing with that stuff.
Or use a diff tool that can do this automatically..? [1] Diff without the ability to edit the changes you see is only about 10% as useful as the ability to edit those diffs inline, change how it's decided to compare the two (or three) files (match THIS line with THIS OTHER line...), or trivially revert the changes a line at a time. [2]
I hop back and forth between GUI-based editors and the command line. Some things are just better in a GUI, and diff is one of those things--especially any nontrivial diff.
Well I don't disagree, even if everybody else does. I've long been a user of Araxis Merge, which is not only visual, but also lets you edit the diffs as you go, which comes in super handy for all the reasons that you state.
I didn't realise how spoiled I was until I started using git, and found the hard way how limiting the text diff format can be. Column-oriented changes in particular are very difficult to figure out, and even for more readable sets of changes I always found it a bit hard to get a rough idea of how spread out or clustered together the changes are. git gui and SourceTree try their best to help, but ultimately they're just displaying a prettified version of the same sometimes-unhelpful view of things.
On the other hand, if you're making small changes to a lot of files, a text diff is just the thing. It would be hyperbole to state that there's nothing worse than telling Araxis Merge to do a folder compare and find out that 1,000+ files have had 1 line changed in them - but it is rather annoying, and very time-consuming to investigate thoroughly. You need a text diff for that sort of thing.
As I said, I hop back and forth between GUI and command line; I haven't personally needed to deal with a folder diff with 1000+ files with one line changed each, but if I did, command line seems like a reasonable option.
Beyond Compare (linked above) lets you set patterns to ignore, which has been incredibly useful for when an autogenerated time/date stamp is changed in 1000+ files; I can say "ignore that change; tell me what changed that DOESN'T match the pattern".
You can do the same thing with grep on the command line of course. But then you lose out on all the GUI awesomeness.
I've used Araxis, and it's Even More Awesome, but 5x the cost (last I checked) as Beyond Compare, and I didn't think it was 5x as awesome. BC has gotten much cooler since then too (with comparison plug-ins, and bitmap compare -- let's see diff compare two bitmaps and tell me which pixels have changed! -- and DOC file compare, and normalized XML file compare, and so on ;).
You should make two different commits, changing indentation and changing a feature are different changes and should be tracked independently.