
> And listing files is slow. While the joy of Amazon S3 is that you can read and write at extremely, extremely, high bandwidths, listing out what is there is much much slower. Slower than a slow local filesystem.

I was taken aback by this recently. At my coworker’s request, I was putting some work into a script we have to manage assets in S3. It has a cache for the file listing, and the coworker who wrote it sent me his pre-populated cache. My initial thought was “this can’t really be necessary”, so I started poking.

We have ~100,000 root level directories for our individual assets. Each of those has five or six directories with a handful of files. Probably fewer than a million files total, maybe 3 levels deep at its deepest.

Recursively listing these files takes literally fifteen minutes. I poked and prodded at suggestions from Stack Overflow and ChatGPT for potential ways to speed up the process and got nothing notable. That’s absurdly slow. Why on earth is it so slow?

Why is this something Amazon has not fixed? From the outside really seems like they could slap some B-trees on the individual buckets and call it a day.

If it is a difficult problem, I’m sure it would be for fascinating reasons I’d love to hear about.



S3 is fundamentally a key value store. The fact that you can view objects in “directories” is nothing more than a prefix filter. It is not a file system and has no concept of directories.


Directories make up a hierarchical filesystem, but they’re not a necessary condition. A filesystem at its core is just a way of organizing files. If you’re storing and organizing files in S3 then it’s a filesystem for you. Saying it’s “fundamentally a key value store” as if that were something different is confusing, because a filesystem is just a key value store of path to contents of file.

Indeed there’s every reason to believe that a modern file system would perform significantly faster if the hierarchy were implemented as a prefix filter rather than by actually maintaining hierarchical data structures (at least for most operations). You can guess this might be the case from the fact that file creation is extremely slow on modern file systems (on the order of hundreds or maybe thousands per second on a modern NVMe disk that can otherwise do millions of IOPS), and that listing the contents of an extremely large directory is exceedingly slow.


In the context of the comment I was addressing, it’s clear that filesystem means more than just a key value store. I’d argue that this is generally true in common vernacular.


This is a technical website discussing the nuances of filesystems. Common vernacular is how you choose to define it but even the Wikipedia definition says that directories and hierarchy are just one property of some filesystems. That they became the dominant model on local machines doesn’t take away from the more general definition that can describe distributed filesystems.


I'm kind of chuckling at this thread because you're working so hard to not understand.

I think the previous poster could/should have said, "It is not a hierarchical file system and has no concept of directories." where I added the word "hierarchical".

But it's also pretty obvious that was the point.


I disagree with that characterization because the contrast drawn by OP was that S3 is “just a KV store”, implying it doesn’t meet the criteria for being considered a filesystem.

For example, you could implement POSIX directory semantics on top of S3. About the only POSIX filesystem API you couldn’t implement is append / overwrite (well, you could, but it might be prohibitively expensive).


A real hierarchy makes global constraints easier to scale, e.g. globally unique names or hierarchical access controls. These policies only need to scale to a single node rather than to the whole namespace (via some sort of global index).


No - a filesystem implementation on an ordinary OS has more than what you mention, including interfaces to disk device drivers.


If I wanted to use S3 as a filesystem in the manner people are describing, I would probably start by storing filesystem metadata in a sidecar database, so you get directory listings, permission bits, and xattrs, and only have to round-trip to S3 when you need the content.


Isn't this essentially what systems like Minio and SeaweedFS do with their S3 integrations/mirroring/caching? What you describe sounds a lot like SeaweedFS Filer when backed by S3


The way that you said "recursively" and spent a lot of time describing "directories" and "levels" worries me. The fastest way to list objects in S3 wouldn't involve recursion at all; you just list all objects under a prefix. If you're using the path delimiter to pretend that S3 keys are a folder structure (they're not) and go "folder by folder", it's going to be way slower. When calling ListObjectsV2, make sure you are NOT passing "delimiter". The "directories" and "levels" have no impact on performance when you're not using the delimiter functionality. Split the one list operation into multiple parallel lists on separate prefixes to attain any total time goal you'd like.
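In boto3 terms, the flat listing that comment describes looks something like this (a sketch, not the OP’s actual script; the bucket name is hypothetical, and the client is passed in so it can be stubbed). The whole point is the absence of the Delimiter argument:

```python
def list_all_keys(s3, bucket, prefix=""):
    """Flat listing of every key under `prefix`.

    `s3` is a boto3 S3 client. Crucially, we do NOT pass Delimiter:
    with Delimiter="/" S3 groups keys into CommonPrefixes ("folders")
    and you end up paginating folder by folder; without it, S3 streams
    all keys under the prefix in lexicographic order, 1,000 per page.
    """
    keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

Usage would be `list_all_keys(boto3.client("s3"), "my-asset-bucket")` with a prefix of "" to sweep the whole bucket in one paginated pass.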


All these comments saying merely "S3 has no concept of directories" without an explanation (or at least a link to an explanation) are pretty unhelpful, IMO. I dismissed your comment, but then I came upon this later one explaining why: https://news.ycombinator.com/item?id=39660445

After reading that, I now understand your comment.


I appreciate you sharing that point of view. There's a "curse of knowledge" effect with AWS where its card-carrying proponents (myself included) lose perspective on how complex it actually is.


Yes, this is very good advice and will likely solve their problem


A fun corollary of this issue:

Deleting an S3 bucket is nontrivial!

You can't delete a bucket with objects in it. And you can't just tell S3 to delete all the objects. You need to send individual API requests to S3 to delete each object. Which means sending requests to S3 to list out the objects, 1000 at a time. Which takes time. And those list calls cost money to execute.
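The list-then-batch-delete loop sketched below shows the shape of that work (my own illustration, with an injected boto3 client so it can be stubbed; each DeleteObjects call accepts at most 1,000 keys, which conveniently matches a list page):

```python
def empty_bucket(s3, bucket):
    """Delete every object in `bucket`, 1,000 at a time.

    `s3` is a boto3 S3 client. Each page from list_objects_v2 holds at
    most 1,000 keys, which is also the per-call limit of DeleteObjects,
    so one delete request pairs naturally with one list page.
    """
    deleted = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        contents = page.get("Contents", [])
        if not contents:
            continue
        s3.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": o["Key"]} for o in contents]},
        )
        deleted += len(contents)
    return deleted
```

Only once this returns can `s3.delete_bucket(Bucket=bucket)` succeed, and you pay for every LIST along the way.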

This is a good summary of the situation: https://cloudcasts.io/article/deleting-an-s3-bucket-costs-mo...

The fastest way to dispose of an S3 bucket turns out to be to delete the AWS account it belongs to.


No, don't do that. Set up a lifecycle rule that expires all of the objects and wait 24 hours. You won't pay for API calls and even the cost of storing the objects themselves is waived once they are marked for expiration.

The article has a mistake about this too: expirations do NOT count as lifecycle transitions and you don't get charged as such. You will, of course, get charged if you prematurely delete objects that are in a storage class with a minimum storage duration that they haven't reached yet. This is what they're actually talking about when they mention Infrequent Access and other lower tiers.
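For reference, the lifecycle rule in question is a one-shot API call, roughly like this (a sketch with a hypothetical rule ID and an injected client; the empty Filter makes the rule apply to every object):

```python
def expire_everything(s3, bucket):
    """Install a lifecycle rule that expires all current objects after
    1 day and cleans up noncurrent versions and incomplete multipart
    uploads. `s3` is a boto3 S3 client."""
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "expire-everything",  # hypothetical rule name
                    "Status": "Enabled",
                    "Filter": {},  # empty filter = applies to all objects
                    "Expiration": {"Days": 1},
                    "NoncurrentVersionExpiration": {"NoncurrentDays": 1},
                    "AbortIncompleteMultipartUpload": {
                        "DaysAfterInitiation": 1
                    },
                }
            ]
        },
    )
```

After the rule has run (roughly a day later), the bucket is empty and can be deleted with a single DeleteBucket call.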


Still counts as nontrivial.


This is really easy; much easier than trying to delete them by hand. AWS does all the work for you. It takes longer to log into the AWS Management Console than it does to set up this lifecycle rule.


Literally 1 API call.


Two. The one to set up the lifecycle rule. Then the one to delete the bucket, some number of hours later.


Incorrect. One call to trigger a step function that sets up the lifecycle rule, sleeps for 24 hours and then deletes the bucket.

Stop being silly, as if 1 vs 2 API calls matters. You should empty large buckets with lifecycle policies. It's trivial.


Imagine for a second you’re a Unix user, familiar with the rm command.

Imagine you are using windows for the first time and you want to delete a directory, so you find an answer on Serverfault that explains that to do so you need to spin up a COM object that marks the directory for deletion, then the next day comes back and deletes it.

You might be inclined to say ‘that seems overly complicated’.

The original answerer is confused though. ‘It’s trivial, stop being silly. Can you think of a simpler way to delete a directory?’

Do you see now why I thought the ‘non triviality’ of deleting an S3 bucket was perhaps relevant in a discussion on an article about why S3 is both simpler and more complex than a file system?

And why your approach might not actually be making the case for it being as simple as you think?


Right click, move to recycle bin, wait for the progress bar to finish. Except the progress bar takes a day or so.

This is only needed if you have a huge (100 million+) bucket, at which point you should be experienced with s3, otherwise you can just click the big, clear and obvious “empty bucket” button on the console.


I think it’s far more mundane a reason. You can list 1,000 objects per request, and getting the next 1,000 requires the continuation token from the previous request, so it’s all serial. That means to list 1M files, you’re looking at 1,000 back-to-back requests. Assuming a ping time of 50ms, that’s easily 50s of just going back and forth, not including the cost of doing the listing itself on a flat iteration. The cost of each list is about the cost of a write, which is kinda slow. Additionally, I suspect each listing is a strongly consistent snapshot, which adds to the cost of the operation (it can be hard to provide a consistent view).

I don’t think btrees would help unless you’re doing directory traversals, but even then I suspect that’s not that beneficial as your bottleneck is going to be the network operations and exposed operations. Ultimately, file listing isn’t that critical a use case and typically most use cases are accomplished through things like object lifecycles where you tell S3 what you want done and it does it efficiently at the FS layer for you.


That's under a minute of a 15m duration. I don't think it matters in the least.


Depends on how you’re iterating. If you’re iterating by hierarchy level, then you could easily see this being several orders of magnitude more requests.


It's not a good model to think of S3 as having directories in a bucket. It's all objects. The web interface has a visual way of representing prefixes separated by slashes. But that's just a nice way to present the objects. Each object has a key, and that key can contain slashes, and you can think of each segment as a directory for your ease of mind.

But that illusion breaks when you try to do operations you usually do with/on directories.


Are you performing list calls sequentially? If you have O(100k) directories and are doing O(100k) requests sequentially, 15 minutes works out at O(10ms) per request which doesn’t seem that bad? (assuming my math is correct…)


At risk of being pedantic, you seem to be using big O to mean “approximately” or “in the order of”, but that’s not what it means at all. Big O is an expression of the growth rate of a function. Any constant value has a growth rate of 0, so O(100k) isn’t meaningful: It’s exactly the same as O(1).


You're right technically, it's an abuse of notation that isn't uncommon. My physics profs would do it in college.


Fair point, I guess the notation ~100k, ~10ms would be better.


I implemented a solution by threading the listing. Get the files in the root, then spin up a separate process to do the recursion for each directory.
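A sketch of that fan-out with a thread pool (names are hypothetical; a client factory is injected so each worker gets its own boto3 client and so the code can be stubbed for testing). Each worker does one flat, delimiter-free listing under its top-level prefix:

```python
from concurrent.futures import ThreadPoolExecutor


def list_keys_parallel(make_client, bucket, top_prefixes, workers=32):
    """List a bucket by fanning out one flat listing per top-level
    prefix. `make_client` is a zero-arg factory returning a boto3 S3
    client (one per worker); `top_prefixes` is e.g. the ~100k asset
    IDs you already know about."""
    def list_one(prefix):
        s3 = make_client()
        keys = []
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            keys.extend(o["Key"] for o in page.get("Contents", []))
        return keys

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(list_one, top_prefixes)
    # Flatten the per-prefix chunks into one key list.
    return [k for chunk in results for k in chunk]
```

With real clients this would be called as `list_keys_parallel(lambda: boto3.client("s3"), "my-asset-bucket", asset_ids)`.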


> Why is this something Amazon has not fixed?

It's common to store the metadata in DynamoDB, where it can be queried, and just keep whatever arbitrary links to the values in the buckets.


> Why is this something Amazon has not fixed? From the outside really seems like they could slap some B-trees on the individual buckets and call it a day.

They fixed it already, it's called DynamoDB. With some SQS and Lambda glue you can index your S3 content in any way you want for later retrieval.
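A minimal sketch of that glue, assuming S3 event notifications invoke a Lambda directly (the table name is hypothetical, the event shape follows the standard S3 notification format, and the table handle is injectable so the handler can be tested offline):

```python
def handler(event, context, table=None):
    """Lambda entry point for S3 ObjectCreated/ObjectRemoved events.

    Mirrors each event into a DynamoDB table keyed by (bucket, key),
    so "listings" become DynamoDB queries instead of S3 LIST calls.
    `table` is a boto3 DynamoDB Table resource.
    """
    if table is None:
        import boto3  # only needed when running inside Lambda
        table = boto3.resource("dynamodb").Table("s3-index")  # hypothetical

    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        if record["eventName"].startswith("ObjectCreated"):
            table.put_item(Item={
                "bucket": bucket,
                "key": key,
                "size": record["s3"]["object"].get("size", 0),
            })
        elif record["eventName"].startswith("ObjectRemoved"):
            table.delete_item(Key={"bucket": bucket, "key": key})
    return {"processed": len(event["Records"])}
```

An SQS queue between the notification and the Lambda (as the comment suggests) adds buffering and retries, but the handler body stays the same.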


Take this opportunity to read the docs and discard assumptions. Enumerating buckets as though they’re directories will seem peculiar when you understand it is designed for billions of items and up. Index your objects separately, in whatever form makes sense to your application.


It's not "fixed" because it's not a problem. You're just using it wrong.


> Recursively listing these files

There's no "recursive" nature to S3 buckets. "Listing a directory" is simply listing keys by a prefix.

So list by the upper-most prefix that you want. If you have 1,000,000 files, it will take 1,000 API calls to list everything.

If each call takes 1s (I have no idea what your latency to the S3 bucket region is), then it will indeed take 15 min.

https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObje...



