Notes tagged topic/databases (1)

I like how dvc lets you manage the reproducibility of pipelines, but during my PhD I repeatedly needed to run a single pipeline many times with different parameters and to collect data in a homogeneous way across those runs, so I ended up abusing dvc a bit for my goals.

When I raised the issue with a friend, he told me that he uses an SQLite database stored under git and an ORM to write records to it. I liked the solution, except for the part where you store binary data inside git.

So I searched for a way to make the databases behave like plain text files under git while remaining fully functional on checkout. It turns out that this is both possible and easy to do, and you also get line-wise diffs of your databases!
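
To make the idea concrete, here is a minimal sketch of the round trip such a setup relies on: turning a binary SQLite file into a plain-text SQL script and rebuilding an equivalent database from that text. It uses only Python's standard sqlite3 module. The function names dump_to_sql and restore_from_sql are hypothetical, and this is an assumption about one way to do it, not necessarily the exact mechanism used here.

```python
import sqlite3


def dump_to_sql(db_path: str) -> str:
    """Serialize a SQLite database file into a plain-text SQL script."""
    conn = sqlite3.connect(db_path)
    try:
        # iterdump() yields the SQL statements that recreate the schema and rows.
        return "\n".join(conn.iterdump()) + "\n"
    finally:
        conn.close()


def restore_from_sql(sql_text: str, db_path: str) -> None:
    """Rebuild a SQLite database file from a SQL script.

    Assumes db_path does not exist yet (or is an empty database),
    otherwise the CREATE TABLE statements in the script would fail.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript(sql_text)
        conn.commit()
    finally:
        conn.close()
```

Because the dump emits one INSERT statement per row, adding a record to the database shows up as an added line in a text diff. A pair of functions like this could plausibly be hooked into git's clean and smudge filters, so that git stores the text form while the working tree holds the usable binary file.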