Google Cloud Storage + Python : Any Way To List Obj In Certain Folder In Gcs?
Solution 1:
Update: the below is true for the older "Google API Client Libraries" for Python, but if you're not using that client, prefer the newer "Google Cloud Client Library" for Python ( https://googleapis.dev/python/storage/latest/index.html ). For the newer library, the equivalent to the below code is:
from google.cloud import storage
client = storage.Client()
for blob in client.list_blobs('bucketname', prefix='abc/myfolder'):
print(str(blob))
Answer for older client follows.
You may find it easier to work with the JSON API, which has a full-featured Python client. It has a function for listing objects that takes a prefix parameter, which you could use to check for a certain directory and its children in this manner:
from apiclient import discovery
# Auth goes here if necessary. Create authorized http object...
client = discovery.build('storage', 'v1') # add http=whatever param if auth
request = client.objects().list(
bucket="mybucket",
prefix="abc/myfolder")
while request isnotNone:
response = request.execute()
print json.dumps(response, indent=2)
request = request.list_next(request, response)
Fuller documentation of the list call is here: https://developers.google.com/storage/docs/json_api/v1/objects/list
And the Google Python API client is documented here: https://code.google.com/p/google-api-python-client/
Solution 2:
This worked for me:
client = storage.Client()
BUCKET_NAME = 'DEMO_BUCKET'
bucket = client.get_bucket(BUCKET_NAME)
blobs = bucket.list_blobs()
for blob in blobs:
print(blob.name)
The list_blobs() method will return an iterator used to find blobs in the bucket. Now you can iterate over blobs and access every object in the bucket. In this example I just print out the name of the object.
This documentation helped me alot:
I hope I could help!
Solution 3:
You might also want to look at gcloud-python and documentation.
from gcloud import storage
connection = storage.get_connection(project_name, email, private_key_path)
bucket = connection.get_bucket('my-bucket')
for key in bucket:
if key.name == 'abc.txt':
print'Found it!'break
However, you might be better off just checking if the file exists:
if'abc.txt'in bucket:
print'Found it!'
Solution 4:
Install python package google-cloud-storage by pip or pycharm and use below code
from google.cloud import storage
client = storage.Client()
for blob in client.list_blobs(BUCKET_NAME, prefix=FOLDER_NAME):
print(str(blob))
Solution 5:
I know this is an old question, but I stumbled over this because I was looking for the exact same answer. Answers from Brandon Yarbrough and Abhijit worked for me, but I wanted to get into more detail.
When you run this:
from google.cloud importstoragestorage_client= storage.Client()
blobs = list(storage_client.list_blobs(bucket_name, prefix=PREFIX, fields="items(name)"))
You will get Blob objects, with just the name field of all files in the given bucket, like this:
[<Blob: BUCKET_NAME, PREFIX, None>,
<Blob: xml-BUCKET_NAME, [PREFIX]claim_757325.json, None>,
<Blob: xml-BUCKET_NAME, [PREFIX]claim_757390.json, None>,
...]
If you are like me and you want to 1) filter out the first item in the list because it does NOT represent a file - its just the prefix, 2) just get the name string value, and 3) remove the PREFIX from the file name, you can do something like this:
blob_names = [blob_name.name[len(PREFIX):] for blob_name in blobs if blob_name.name != folder_name]
Complete code to get just the string files names from a storage bucket:
from google.cloud import storage
storage_client = storage.Client()
blobs = list(storage_client.list_blobs(bucket_name, prefix=PREFIX, fields="items(name)"))
blob_names = [blob_name.name[len(PREFIX):] for blob_name in blobs if blob_name.name != folder_name]
print(f"blob_names = {blob_names}")
Post a Comment for "Google Cloud Storage + Python : Any Way To List Obj In Certain Folder In Gcs?"