Hacker Remix

S3 as a Git remote and LFS server

196 points by kbumsik 4 days ago | 50 comments

CGamesPlay 3 days ago

If you are interested in using S3 as a git remote but are concerned with privacy, I built a tool a while ago to use S3 as an untrusted git remote using Restic. https://github.com/CGamesPlay/git-remote-restic

mdaniel 4 days ago

All this mocking when moto exists is just :-( https://github.com/awslabs/git-remote-s3/blob/v0.1.19/test/r...

Actually, moto is just one bandaid for that problem - there are SO MANY s3 storage implementations, including the pre-license-switch Apache 2 version of minio (one need not use a bleeding edge for something as relatively stable as the S3 Api)

notpushkin 4 days ago

> there are SO MANY s3 storage implementations

I suppose given this is under the AWS Labs org, they don’t really care about non-AWS S3 implementations.

mdaniel 4 days ago

Well, I look forward to their `docker run awslabs/the-real-s3:latest` implementation then. Until such time, monkeypatching api calls to always give the exact answer the consumer is looking for is damn cheating

notpushkin 4 days ago

Agreed, haha. Well, I think it should work with Minio & co. just as well, but be prepared to have your issues closed as unsupported. (Personally, I might give it a go with Backblaze B2 just to play around, yeah)

chrsig 4 days ago

it wouldn't be unprecedented. dynamodb-local exists.

SahAssar 4 days ago

Do you mean boto (the python SDK for AWS)?

EDIT: They probably do not, I'm guessing they mean https://docs.getmoto.org/en/latest/index.html ?

flakes 4 days ago

moto server for testing S3 is pretty great. It’s about the same experience as using a minio container to run integration tests against.

I use this, and testing.postgresql for unit testing my api servers with barely any mocks used at all.

neeleshs 4 days ago

There is also testcontainers. Supports multiple languages. Uses containers though.

https://testcontainers-python.readthedocs.io/en/latest/

mdaniel 4 days ago

Happy 10,000th Day to you :-D Yes, moto and its friend localstack are just fantastic for being able to play with AWS without spending money, or to reproduce kabooms that only happen once a month with the real API

I believe moto has an "embedded" version such that one need not even have it listen on a network port, but I find it much, much less mental gymnastics to just supersede the "endpoint" address in the actual AWS SDKs to point to 127.0.0.1:4566 and off to the races. The AWS SDKs are even so friendly as to not mandate TLS or have allowlists of endpoint addresses, unlike their misguided Azure colleagues
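A minimal sketch of the endpoint override described above, assuming Python with boto3 and an S3-compatible emulator already listening on 127.0.0.1:4566. The helper names and the `S3_ENDPOINT_URL` environment variable are hypothetical, and the placeholder credentials are an assumption that holds for emulators such as moto; other servers may require real ones.

```python
import os

# Assumption: a local S3-compatible server (moto server, localstack, ...)
# is listening on the address mentioned in the comment above.
LOCAL_ENDPOINT = "http://127.0.0.1:4566"


def s3_client_kwargs(endpoint_url=None):
    """Build keyword arguments for boto3.client("s3", ...).

    With no endpoint given (and no S3_ENDPOINT_URL set), the SDK
    keeps its default behaviour and talks to real AWS.
    """
    # S3_ENDPOINT_URL is a hypothetical variable name used for this sketch.
    endpoint = endpoint_url or os.environ.get("S3_ENDPOINT_URL")
    kwargs = {"region_name": "us-east-1"}
    if endpoint:
        kwargs["endpoint_url"] = endpoint
        # Placeholder credentials, only meaningful for a local emulator.
        kwargs["aws_access_key_id"] = "testing"
        kwargs["aws_secret_access_key"] = "testing"
    return kwargs


def make_s3_client(endpoint_url=None):
    """Create the client; boto3 is imported lazily so the helper
    above can be used and tested without boto3 installed."""
    import boto3

    return boto3.client("s3", **s3_client_kwargs(endpoint_url))
```

With an emulator running, `make_s3_client(LOCAL_ENDPOINT)` returns a regular boto3 client whose requests go to the local server instead of AWS, so the code under test does not need to change.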

SahAssar 4 days ago

> Happy 10,000th Day to you :-D

Sorry, not sure what you mean?

mdaniel 4 days ago

misnome 3 days ago

How do you know they are in the US?

remram 3 days ago

Unfortunately there have been a few vulnerabilities since that old Minio release. For something you expose to users, it's a problem.

mdaniel 3 days ago

I would hope my mentioning moto made it clear my comment was about having an S3 implementation for testing. Presumably one should not expose moto to users, either

Scribbd 4 days ago

This is something I was trying to implement myself. I am surprised it can be done with just an s3 bucket. I was messing with API Gateways, Lambda functions and DynamoDB tables to support the s3 bucket. It didn't occur to me to implement it client side. I might have stuck a bit too much to the lfs test server implementation. https://github.com/git-lfs/lfs-test-server

chx 4 days ago

Client side is, while interesting, of limited use as every CI and similar tool won't work with this. This seems like a sort of automation of wormhole which I guess is neat https://github.com/cxw42/git-tools/blob/master/wormhole

zmmmmm 3 days ago

Just remember, the minimum billing increment for file size is 128KB in real AWS S3. So your Git repo may be a lot more expensive than you would think if you have a giant source tree full of small files.

justin_oaks 3 days ago

That 128KB only applies to non-standard S3 storage tiers (glacier, infrequent access, one zone, etc)

S3 standard, which is likely what people would use for git storage, doesn't have that minimum file size charge.

See the asterisk sections in https://aws.amazon.com/s3/pricing/
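To make the difference concrete, here is a rough arithmetic sketch in Python. It assumes the 128KB per-object minimum quoted in this thread applies to the storage class in question and ignores every other pricing rule; the object sizes are made up for illustration, and no actual prices are used.

```python
# Per-object minimum quoted in the thread; per the correction above it
# is assumed NOT to apply to S3 Standard.
MIN_BILLABLE_BYTES = 128 * 1024


def billable_bytes(object_sizes, minimum=MIN_BILLABLE_BYTES):
    """Total bytes billed when each object is rounded up to `minimum`."""
    return sum(max(size, minimum) for size in object_sizes)


# Hypothetical source tree: 10,000 files of about 2 kB each.
sizes = [2_000] * 10_000

stored = sum(sizes)                                # bytes actually stored
with_minimum = billable_bytes(sizes)               # tier with a 128KB minimum
without_minimum = billable_bytes(sizes, minimum=0) # tier with no minimum

print(stored, with_minimum, without_minimum)
print(f"inflation factor: {with_minimum / stored:.1f}x")
```

In this made-up case the minimum inflates the billed size by roughly 65x, while a class without the minimum bills only the bytes stored.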

zmmmmm 3 days ago

Thank you for highlighting that, I had remembered it wrongly.

afro88 3 days ago

Looks like it uses bundles rather than raw files: https://github.com/awslabs/git-remote-s3?tab=readme-ov-file#...

chrsig 3 days ago

also the puts are 5x as expensive as the get operations