[Solved] Can Git store file containers as trees and blobs? [duplicate]


Most Git commands expect each object to begin with one of four type words (blob, tree, commit or tag), so adding a new object type would be close to impossible.

Here is a manual experiment:

# I created an object with a new type 'foo':
$ cat .git/objects/70/c52a28ff2b01f46ccc0cdd03c61c569fd6fd54 | pigz -dz; echo
foo10.abcdefghij    # the '.' is actually '\0'

# all regular git commands fail with an "unable to parse header of [object]" error:
$ git show 70c52a28ff2b01f46ccc0cdd03c61c569fd6fd54
error: unable to parse 70c52a28ff2b01f46ccc0cdd03c61c569fd6fd54 header
error: unable to parse 70c52a28ff2b01f46ccc0cdd03c61c569fd6fd54 header
fatal: loose object 70c52a28ff2b01f46ccc0cdd03c61c569fd6fd54 (stored in .git/objects/70/c52a28ff2b01f46ccc0cdd03c61c569fd6fd54) is corrupt

$ git fsck
error: unable to parse header of .git/objects/70/c52a28ff2b01f46ccc0cdd03c61c569fd6fd54
error: 70c52a28ff2b01f46ccc0cdd03c61c569fd6fd54: object corrupt or missing: .git/objects/70/c52a28ff2b01f46ccc0cdd03c61c569fd6fd54
Checking object directories: 100% (256/256), done.

# etc ...
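
For context, a loose object is the zlib-compressed form of a header (the type word, a space, the content size and a NUL byte) followed by the content, and the object ID is the hash of those uncompressed bytes. The Python sketch below illustrates that layout. It assumes a SHA-1 repository, and `encode_object` and `loose_path` are illustrative names, not part of Git.

```python
import hashlib
import zlib
from typing import Tuple


def encode_object(obj_type: str, content: bytes) -> Tuple[str, bytes]:
    """Return (object id, zlib-compressed bytes) for a loose object.

    Assumption: the repository uses SHA-1 object IDs.
    """
    # Header layout: "<type> <size>" followed by a NUL byte.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    store = header + content
    oid = hashlib.sha1(store).hexdigest()
    return oid, zlib.compress(store)


def loose_path(oid: str) -> str:
    """Path of a loose object: first two hex digits form the directory."""
    return f".git/objects/{oid[:2]}/{oid[2:]}"


# Building an object with an unknown type is easy; reading it back with
# git is what fails, because the type word is not one git recognises.
oid, compressed = encode_object("foo", b"abcdefghij")
print(loose_path(oid))
print(zlib.decompress(compressed))  # b'foo 10\x00abcdefghij'
```

Nothing in the storage format prevents writing such a file; the restriction comes from the commands that parse the header.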

One possibility would be to write a more complete smudge/clean filter that stores not only the zip's actual content but also all of the extra data (such as timestamps, comments, …).

Here is a first idea:

If archive.zip contains dir\file.txt:

  • create a tree named dir
  • store the directory header in a blob with a known name (dheader for example)
  • store the header and the content for file.txt in two distinct blobs (hfile.txt and _file.txt for example)
  • etc for other zip metadata

Using distinct prefixes should give you a clear separation between each type of data you need to store.
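
As a rough illustration of this layout, the Python sketch below walks an archive with the standard `zipfile` module and produces a mapping from path to blob content, using the `dheader`, `h` and `_` names suggested above. The function name `explode_zip` and the choice of JSON for the header blobs are assumptions made for the example. Zip entries use forward slashes internally, so `dir\file.txt` appears as `dir/file.txt`.

```python
import io
import json
import posixpath
import zipfile
from typing import Dict


def explode_zip(data: bytes) -> Dict[str, bytes]:
    """Map each zip entry to prefixed blobs (hypothetical layout)."""
    blobs: Dict[str, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            # Per-entry metadata that the content alone does not carry.
            header = json.dumps({
                "filename": info.filename,
                "date_time": list(info.date_time),
                "compress_type": info.compress_type,
                "external_attr": info.external_attr,
                "comment": info.comment.decode("latin-1"),
                "extra": info.extra.hex(),
            }, sort_keys=True).encode("utf-8")
            if info.is_dir():
                # Directory entry: one header blob with a known name.
                blobs[posixpath.join(info.filename, "dheader")] = header
            else:
                # File entry: header and content in two distinct blobs.
                parent, name = posixpath.split(info.filename)
                blobs[posixpath.join(parent, "h" + name)] = header
                blobs[posixpath.join(parent, "_" + name)] = zf.read(info)
    return blobs


# Small in-memory archive to try it on.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("dir/", b"")
    zf.writestr("dir/file.txt", b"hello\n")

for path in sorted(explode_zip(buf.getvalue())):
    print(path)
```

The prefixes keep header blobs and content blobs from colliding, though a real implementation would still need a rule for archive entries whose own names happen to start with `h` or `_`.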

A second idea would be:

  • manage to pack all of the archive’s metadata into a single blob

etc …

The smudge filter (the one that runs on checkout) would then have enough data to rebuild the same archive.

Note that “rebuilding the zip file” would require the smudge filter to implement all possible features of a zip archive (e.g. being able to compress in all known formats, …).
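
To make the rebuilding step concrete, here is a minimal Python sketch of the single-metadata-blob idea: one function splits an archive into a JSON manifest plus the uncompressed contents, and another rebuilds an archive from them. The names `split_archive` and `rebuild_archive` and the manifest format are assumptions for the example. It only handles the compression methods Python's `zipfile` supports, and the rebuilt file is not guaranteed to be byte-identical to the original, which is exactly the limitation described above.

```python
import io
import json
import zipfile
from typing import Dict, Tuple


def split_archive(data: bytes) -> Tuple[bytes, Dict[str, bytes]]:
    """Return (manifest blob, {entry name: uncompressed content})."""
    entries = []
    contents: Dict[str, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            entries.append({
                "filename": info.filename,
                "date_time": list(info.date_time),
                "compress_type": info.compress_type,
                "external_attr": info.external_attr,
                "comment": info.comment.decode("latin-1"),
            })
            if not info.is_dir():
                contents[info.filename] = zf.read(info)
        manifest = {"comment": zf.comment.decode("latin-1"),
                    "entries": entries}
    return json.dumps(manifest, sort_keys=True).encode("utf-8"), contents


def rebuild_archive(manifest_blob: bytes, contents: Dict[str, bytes]) -> bytes:
    """Rebuild an archive with the same entries, order and metadata."""
    manifest = json.loads(manifest_blob)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for entry in manifest["entries"]:
            info = zipfile.ZipInfo(entry["filename"],
                                   tuple(entry["date_time"]))
            info.compress_type = entry["compress_type"]
            info.external_attr = entry["external_attr"]
            info.comment = entry["comment"].encode("latin-1")
            zf.writestr(info, contents.get(entry["filename"], b""))
        zf.comment = manifest["comment"].encode("latin-1")
    return buf.getvalue()


# Round trip on a small in-memory archive.
src = io.BytesIO()
with zipfile.ZipFile(src, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.comment = b"demo archive"
    zf.writestr("dir/file.txt", b"hello\n")

manifest_blob, contents = split_archive(src.getvalue())
rebuilt = rebuild_archive(manifest_blob, contents)
```

The round trip preserves names, timestamps, comments and content, but the compressed bytes depend on the compressor and its settings, so a checksum of the rebuilt archive may differ from the original.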
