FAMF: Files As Metadata Format

Perma@programming.dev · 5 months ago

FAMF: Files As Metadata Format

Dark Arc@social.packetloss.gg · 5 months ago

I’m a bit skeptical about the performance penalty. I know there’s a benchmark but I didn’t see any details of what was actually benchmarked and where. Windows (AFAIK) still has notoriously slow directory traversal operations. God forbid you’re using SSHFS or even NFS. I’ve seen things with hundreds of YAML nodes before.

Benchmarking this is also tricky because the OS file cache will almost certainly make the second time faster than the first (and probably by a lot).

Also just the usability… I think opening a file to change one value is extreme. You also still have the problem of documentation… Which sure you can solve by putting that in another file, but… You can also do that with just plain old JSON.

I think in the majority of languages, writing a library to process these files would also be more complicated than writing a JSON parser or using an existing library.

Also how do you handle trailing whitespace inserted by a text editor? Do you drop it? Keep it? It probably doesn’t matter as long as the configuration is just for a particular program. The program just needs to document it… But then you’ve got ambiguities between programs that you just don’t have to worry about with TOML or JSON.

Perma@programming.dev · 5 months ago

OK, so you are very much right. You should definitely benchmark it using a simulation of what your data might look like. It should not be that hard: just write a script that creates a bunch of files similar to your data.

About the trailing whitespace: in the terminal I just use sed to remove the final ‘\n’, in Rust I just use .trim(), and in Go I think there is strings.TrimSpace(). It is honestly not that hard.

The data structure and parser are not built the same way as with JSON, where you have to parse the whole thing. Here you don’t have to: you just open the files you need and read their content. It is a bit more difficult at first since you can’t just translate a whole struct directly, but it pays for itself when you want to migrate the data to a new format. So if your structure never changes, those formats are probably easier.
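The benchmark script and the whitespace handling described above could look roughly like this in Python. It is only a sketch under assumptions: the data shape (20 sections of 10 string values) and every function name are made up for illustration, and none of it comes from the FAMF page itself. Values are written with a trailing newline, as an editor would leave them, and stripped again on read.

```python
import json
import tempfile
import time
from pathlib import Path


def write_famf(root: Path, data: dict) -> None:
    """Write each key as its own file; nested dicts become directories."""
    root.mkdir(parents=True, exist_ok=True)
    for key, value in data.items():
        if isinstance(value, dict):
            write_famf(root / key, value)
        else:
            # Editors usually append a newline, so simulate one.
            (root / key).write_text(f"{value}\n", encoding="utf-8")


def read_famf(root: Path) -> dict:
    """Read the tree back, stripping trailing whitespace from every value."""
    result = {}
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            result[entry.name] = read_famf(entry)
        else:
            result[entry.name] = entry.read_text(encoding="utf-8").rstrip()
    return result


def make_sample(sections: int = 20, keys: int = 10) -> dict:
    """Hypothetical data shape: `sections` groups of `keys` string values."""
    return {
        f"section{s}": {f"key{k}": f"value-{s}-{k}" for k in range(keys)}
        for s in range(sections)
    }


def timed(fn):
    """Return (result, elapsed seconds) for a zero-argument callable."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def run_benchmark(sections: int = 20, keys: int = 10) -> dict:
    sample = make_sample(sections, keys)
    with tempfile.TemporaryDirectory() as tmp:
        tree = Path(tmp) / "config"
        json_file = Path(tmp) / "config.json"
        write_famf(tree, sample)
        json_file.write_text(json.dumps(sample), encoding="utf-8")

        # Both passes probably hit the OS file cache because the files were
        # just written; a true cold run needs a reboot or a cache flush.
        first, t_first = timed(lambda: read_famf(tree))
        second, t_second = timed(lambda: read_famf(tree))
        parsed, t_json = timed(
            lambda: json.loads(json_file.read_text(encoding="utf-8"))
        )
    return {
        "roundtrip_ok": first == sample and second == sample and parsed == sample,
        "famf_first_s": t_first,
        "famf_second_s": t_second,
        "json_s": t_json,
    }


if __name__ == "__main__":
    for name, value in run_benchmark().items():
        print(name, value)
```

The timings only mean something on the file system you actually care about (local disk, NFS, SSHFS), so point the temporary directory there before drawing conclusions.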

Dark Arc@social.packetloss.gg · 5 months ago

> You should definitely benchmark it using a simulation of what your data might look like. It should not be that hard: just write a script that creates a bunch of files similar to your data.

Right, it’s just kind of a thing to think about. If your program is something that might conceivably be used over sshfs (as an example) … this is probably not a great option for your program’s configuration.

> The data structure and parser are not built the same way as with JSON, where you have to parse the whole thing. Here you don’t have to: you just open the files you need and read their content. It is a bit more difficult at first since you can’t just translate a whole struct directly, but it pays for itself when you want to migrate the data to a new format. So if your structure never changes, those formats are probably easier.

Well a very common thing is to create a “config” object that lives in the long running process (and in some cases can be reloaded without restarting the program).

That model also saves you from unnecessary repeated IO operations (without one-off caching and reloading mechanisms) and allows you to centralize any validation (which also means you can report configuration errors at startup).
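As a rough illustration of that model, here is a Python sketch that loads a directory of one-value files into a single immutable object at startup and reports every validation problem at once. The keys `host` and `port`, the `Config` and `ConfigError` names, and the validation rules are all hypothetical, chosen only for the example.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


class ConfigError(ValueError):
    """Raised once, listing every problem found while loading."""


@dataclass(frozen=True)
class Config:
    host: str
    port: int


def load_config(root: Path) -> Config:
    """Read every key file once and validate everything in one place."""
    errors = []
    raw = {}
    for name in ("host", "port"):
        path = root / name
        if path.is_file():
            raw[name] = path.read_text(encoding="utf-8").strip()
        else:
            errors.append(f"missing key file: {name}")

    port = 0
    if "port" in raw:
        try:
            port = int(raw["port"])
        except ValueError:
            errors.append(f"port is not an integer: {raw['port']!r}")
        else:
            if not 0 < port < 65536:
                errors.append(f"port out of range: {port}")
    if "host" in raw and not raw["host"]:
        errors.append("host is empty")

    if errors:
        raise ConfigError("; ".join(errors))
    return Config(host=raw["host"], port=port)


def demo() -> Config:
    """Build a throwaway config directory and load it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "host").write_text("example.org\n", encoding="utf-8")
        (root / "port").write_text("8080\n", encoding="utf-8")
        return load_config(root)


if __name__ == "__main__":
    print(demo())
```

After `load_config` returns, the rest of the program only touches the in-memory `Config`, so the file system is read once per load regardless of how the values are stored on disk.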

I do wish various formats were more “streaming” friendly, but configuration isn’t really one of them.

In a lot of languages moving between formats is also fairly trivial because the XYZ markup parser parses things into an object map and the ZYK markup writer can write an object map into ZYK format.

Maybe I’m not understanding what you mean by migrating the data to a new format though.

Perma@programming.dev · 5 months ago

OK, so for example, if you have to change the structure of the configuration file in a statically typed language, you need two representations of the data: the old one and the new one. You first have to deserialize the data in the old format, then convert it to the new format, then replace the old files. The FAMF alternative lets you achieve the same goal with just copy, paste, and delete. Please keep in mind that you can still build a configuration data structure that you keep in memory. It is just that the persisted representation is spread across different files rather than stored in one file.
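To make the copy, paste, and delete idea concrete, here is a small Python sketch of a structure change done purely with file moves. The old and new key names (`db_host` becoming `database/host`, and so on) and the `migrate` helper are invented for illustration; this is one possible way to script it, not something the FAMF page prescribes.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Mapping


def migrate(root: Path, moves: Mapping[str, str]) -> None:
    """Move each old key file to its new location inside `root`."""
    for old, new in moves.items():
        src = root / old
        if not src.exists():
            continue  # already migrated, or the key was never set
        dst = root / new
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))


def demo() -> List[str]:
    """Flat layout in, nested layout out; returns the resulting file paths."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "db_host").write_text("localhost\n", encoding="utf-8")
        (root / "db_port").write_text("5432\n", encoding="utf-8")

        migrate(root, {"db_host": "database/host", "db_port": "database/port"})

        return sorted(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )


if __name__ == "__main__":
    print(demo())
```

No value is ever parsed here, so the migration needs neither the old nor the new type definition. It only covers moves and renames; changing what a value means would still need code that reads it.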

PRMA::Files As Metadata Format