Python boto for AWS S3: how to get a sorted and limited list of files in a bucket? Python, files, list, AWS

Shared by user (好好活下去).

If there are too many files in a bucket and I want only the 100 newest files, how can I get just that list?

s3.bucket.list does not seem to have that feature. Does anybody know how to do this?



There is no way to do this type of filtering on the service side. The S3 API does not support it. You might be able to accomplish something like this by using prefixes in your object names. For example, if you named all of your objects using a pattern like this:

20140618/foobar (as an example)

you could use the prefix parameter of the ListBucket request in S3 to return only the objects that were stored today. In boto, this would look like:

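import boto

# Connect using credentials from the environment or the boto config file
s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket')
# List only the keys whose names start with today's date prefix
for key in bucket.list(prefix='20140618'):
    # do something with the key object
    print(key.name, key.last_modified)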
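For the prefix listing to work, the objects have to be named with the date prefix when they are uploaded. A minimal sketch of building such names follows; the helper name `dated_key_name` and the choice of UTC dates are assumptions for illustration, not part of boto or S3.

```python
from datetime import datetime, timezone


def dated_key_name(name, when=None):
    """Return `name` prefixed with a YYYYMMDD date, e.g. '20140618/foobar'.

    `when` defaults to the current UTC time (an assumption; any consistent
    time zone would work as long as uploads and listings agree on it).
    """
    when = when or datetime.now(timezone.utc)
    return "{}/{}".format(when.strftime("%Y%m%d"), name)


# Fixed date so the result matches the example pattern used above
print(dated_key_name("foobar", datetime(2014, 6, 18, tzinfo=timezone.utc)))
```

The returned string would then be used as the key name when storing the object, so that a later listing with the same date prefix finds it.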

You would still have to retrieve all of the objects with that prefix and then sort them locally by their last-modified date (the last_modified attribute of each key in boto), but that would be much easier than listing all of the objects in the bucket and then sorting.
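The local sort can be sketched roughly as below. This is only an illustration: `newest_keys` is a hypothetical helper, and it assumes each key exposes a `last_modified` string in a uniform ISO 8601 format, so that comparing the strings orders them chronologically. `FakeKey` is a stand-in for boto key objects so the example runs without AWS access.

```python
import heapq
from collections import namedtuple


def newest_keys(keys, limit=100):
    """Return up to `limit` keys from `keys`, newest first.

    heapq.nlargest keeps only `limit` items in memory at a time, which
    avoids sorting the full listing when the prefix matches many objects.
    """
    return heapq.nlargest(limit, keys, key=lambda k: k.last_modified)


# Stand-in for boto key objects (hypothetical data for illustration)
FakeKey = namedtuple("FakeKey", ["name", "last_modified"])
sample = [
    FakeKey("20140618/a", "2014-06-18T08:00:00.000Z"),
    FakeKey("20140618/b", "2014-06-18T17:30:00.000Z"),
    FakeKey("20140618/c", "2014-06-18T12:15:00.000Z"),
]

# Newest first: b, then c
for key in newest_keys(sample, limit=2):
    print(key.name, key.last_modified)
```

With boto, the same helper could presumably be called as newest_keys(bucket.list(prefix='20140618')), since bucket.list returns an iterable of key objects. Every key under the prefix is still fetched from S3; only the sorting cost is reduced.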


The other option would be to store metadata about the S3 objects in a database like DynamoDB and then query that database to find the objects to retrieve from S3.


You can find out more about hierarchical listing in S3 here


