Emi Matchu
8dc11f9940
I'm starting to port over the functionality that was previously just me running `yarn db:export:public-data` in `impress-2020` and committing the result to Git LFS every time. My immediate motivation is that the `impress-2020` git repository is getting weirdly large?? Idk how these 40MB files have blown up to a solid 16GB of Git LFS data (we don't have THAT many!!!), but I guess there's something about Git LFS's architecture and disk usage that I'm not understanding. So, let's move to a simpler system in which we don't bind the public data to the codebase, but instead just regularly dump it in production and make it available for download. This change adds the `rails public_data:commit` task, which when run in production will make the latest dump available at `https://impress.openneo.net/public-data/latest.sql.gz`, and will also store a running log of previous dumps, viewable at `https://impress.openneo.net/public-data/`. Things left to do: 1. Create a `rails public_data:pull` task, to download `latest.sql.gz` and import it into the local development database. 2. Set up a cron job to dump this out regularly, idk maybe weekly? The archive of dumps will grow, but not very fast (about 2GB per year), and we can add logic to rotate out old ones if it starts to grow too large. (If we wanted to get really intricate, we could do like, daily for the past week, then weekly for the past 3 months, then monthly for the past year, idk. There must be tools that do this!) |
||
---|---|---|
.gitkeep | ||
db.rake | ||
pets.rake | ||
public_data.rake | ||
swf_assets.rake |
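
The tiered rotation idea floated in the commit message (daily for the past week, weekly for the past 3 months, monthly for the past year) could look roughly like the sketch below. This is not code from `public_data.rake`: `dumps_to_keep` is a hypothetical helper, and the 7/90/365-day cutoffs and the "newest dump per ISO week / per calendar month" rule are assumptions about how that policy might be pinned down.

```ruby
require "date"

# Hypothetical helper, not part of the actual rake tasks.
# Given the dates of existing dumps, returns the dates a tiered policy keeps:
#   - every dump from the past 7 days
#   - the newest dump in each ISO week, for the past 90 days
#   - the newest dump in each calendar month, for the past 365 days
# Anything older than 365 days is dropped.
def dumps_to_keep(dump_dates, today: Date.today)
  dates = dump_dates.uniq.sort
  age_in_days = ->(date) { (today - date).to_i }

  daily = dates.select { |d| age_in_days.call(d) <= 7 }

  weekly = dates.select { |d| age_in_days.call(d) <= 90 }
                .group_by { |d| [d.cwyear, d.cweek] }
                .map { |_week, ds| ds.max }

  monthly = dates.select { |d| age_in_days.call(d) <= 365 }
                 .group_by { |d| [d.year, d.month] }
                 .map { |_month, ds| ds.max }

  (daily + weekly + monthly).uniq.sort
end
```

A rotation step could then delete every dump whose date is not in the returned list. With weekly dumps only, the daily and weekly tiers keep everything from the last 90 days, so the pruning only kicks in for older dumps.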