[Solved] How does git generate diff in files?


Your original question asked about binary files, which in Git means “files that Git has decided are not text”. For such files, unless you provide a special diff driver, Git does not attempt to generate a diff; it only reports whether the two files are the same or different. (A diff driver is an external program: you can instruct Git to run this program instead of its built-in diff, and the program can do whatever it wants with the pair of files to generate a usable diff.)
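
To make "Git has decided are not text" a bit more concrete: as far as I know, when no attribute overrides the decision, Git's built-in check simply looks for a NUL byte near the start of the file. The sketch below mimics that heuristic in Python. The function name `looks_binary` is made up for illustration, and the 8000-byte window is my recollection of Git's source, so treat both as assumptions rather than a specification.

```python
# Assumption: Git inspects roughly the first 8000 bytes of the content.
FIRST_FEW_BYTES = 8000


def looks_binary(data: bytes) -> bool:
    """Hypothetical helper: guess "binary" if a NUL byte appears early on."""
    return b"\0" in data[:FIRST_FEW_BYTES]


if __name__ == "__main__":
    print(looks_binary(b"plain text\n"))       # no NUL byte anywhere
    print(looks_binary(b"\x89PNG\r\n\0\0\0"))  # NUL byte near the start
```

A file that passes this check is treated as text and goes through the normal line-based diff; one that fails it gets only the "same or different" answer unless a diff driver says otherwise.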

Your updated question, at least as of this time, asks about diffing text files. Git has built into it a modified version of LibXDiff. The main algorithm here is due to Eugene Myers. See also Myers diff algorithm vs Hunt–McIlroy algorithm. For a somewhat more user-friendly introduction to diff algorithms, see the last section of chapter 3 of my stalled book. You are in fact onto something with the idea of line hashes: these diff algorithms compare sequences of symbols, and using the hash of each line as one symbol in the diff matrix is how they find line-by-line diffs.
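
Here is a minimal sketch of the line-hash idea, not Git's actual code. It hashes each line to get a symbol, then diffs the two symbol sequences. To keep it short I use a plain dynamic-programming longest-common-subsequence table instead of the Myers algorithm; Myers finds the same kind of edit script with far less time and memory. The names `line_symbols` and `lcs_diff` are hypothetical, and the choice of SHA-1 as the line hash is just for illustration.

```python
import hashlib


def line_symbols(lines):
    """Turn each line into a hash so the diff compares symbols, not strings."""
    return [hashlib.sha1(line.encode("utf-8")).hexdigest() for line in lines]


def lcs_diff(old_text, new_text):
    """Return a list of ' ', '-', '+' prefixed lines (simple LCS, not Myers)."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    a = line_symbols(old_lines)
    b = line_symbols(new_lines)
    n, m = len(a), len(b)

    # lcs[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    # Walk the table from the top-left corner, emitting the edit script.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(" " + old_lines[i])
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            out.append("-" + old_lines[i])
            i += 1
        else:
            out.append("+" + new_lines[j])
            j += 1
    out.extend("-" + line for line in old_lines[i:])
    out.extend("+" + line for line in new_lines[j:])
    return out


if __name__ == "__main__":
    for row in lcs_diff("a\nb\nc\n", "a\nx\nc\n"):
        print(row)
```

Comparing hashes rather than whole lines makes each cell of the matrix a cheap fixed-size comparison. A careful implementation would also confirm that two lines with equal hashes really have equal content, which this sketch skips.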
