What Is The Difference Between The Two Ways Of Accessing The HDF5 Group In The SVHN Dataset?

I need to read the SVHN dataset and was trying to read the filename of the first image. I am struggling a bit to understand the structure of HDF5 and especially in understanding t

Solution 1:

First, there is a minor difference in output from your 2 methods.
Method 1: returns the full array (of the encoded file name)
Method 2: only returns the first element (character) of the array

Let's deconstruct your code to understand what you have. The first part deals with h5py data objects.

f['digitStruct'] -> returns a h5py group object
f['digitStruct']['name'] -> returns a h5py dataset object
f['digitStruct']['name'].name -> returns the name (path) of the dataset object

Note: The /digitStruct/name dataset contains "Object References". Each array entry is a pointer to another h5py object (in this case another dataset). For example (spaces added to set the inner object reference apart from the outer lookup):
f[ f['digitStruct']['name'][0][0] ] -> returns the object referenced at [0][0]
So the outer f[ obj_ref ] dereferences the reference and returns the object it points to, just as it would for any other object reference.

In the case of f['digitStruct']['name'][0][0], this is an object reference pointing to the dataset /#refs#/b. In other words, f[ f['digitStruct']['name'][0][0] ] returns the same object as:
f['#refs#']['b'] or f['/#refs#/b']
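To make the dereferencing step concrete, here is a minimal sketch that builds a tiny stand-in file and follows one object reference. The file name toy_digitStruct.h5 and its contents are assumptions made for illustration; only the layout (a /digitStruct/name dataset of references pointing into /#refs#) mirrors what is described above.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical stand-in file; the real SVHN file is not needed for this sketch.
path = os.path.join(tempfile.mkdtemp(), "toy_digitStruct.h5")

with h5py.File(path, "w") as f:
    # Target dataset: the character codes of "1.png", one code per row.
    codes = np.array([[49], [46], [112], [110], [103]], dtype=np.uint16)
    target = f.create_dataset("#refs#/b", data=codes)
    # Dataset of object references, one reference per row.
    names = f.create_dataset("digitStruct/name", shape=(1, 1), dtype=h5py.ref_dtype)
    names[0, 0] = target.ref

with h5py.File(path, "r") as f:
    ref = f["digitStruct"]["name"][0][0]   # an object reference, not the data
    via_ref = f[ref]                       # dereference: returns the dataset
    ref_target_name = via_ref.name         # path of the referenced dataset
    # Reading through the reference and through the path gives the same data.
    same_data = np.array_equal(via_ref[:], f["/#refs#/b"][:])

print(ref_target_name, same_data)
```

The point of the sketch is that f[ref] behaves like f['/some/path']: both return an h5py object, and the data is only read once that object is indexed.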

So much for h5py object references. Let's continue to get the data from this object reference using Method 1.

f[f['digitStruct']['name'][0][0]].value -> returns the entire /#refs#/b dataset as a NumPy array.

However, dataset.value is deprecated (and was removed entirely in h5py 3.0), so NumPy-style indexing is preferred, like this:
f[f['digitStruct']['name'][0][0]][:] (to get the entire array)

Note: both of these return the entire array of encoded characters. At this point, getting the name is plain Python and NumPy functionality. Use this to return the filename as a string (.tobytes() replaces .tostring(), which is deprecated in recent NumPy releases):
f[f['digitStruct']['name'][0][0]][:].tobytes().decode('ascii')
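The decoding step can be tried without the HDF5 file at all. The sketch below uses a hand-built array as a stand-in for the data read from /#refs#/b, and it assumes the codes are stored as 16-bit integers; the helper name codes_to_str is hypothetical. If that assumption holds for your file, note that decoding the raw bytes as ASCII leaves a NUL character after each letter, which is easy to miss because terminals usually do not show it.

```python
import numpy as np

# Assumed stand-in for the array read from /#refs#/b: one character code per
# row, stored as 16-bit integers (an assumption about how the file was written).
codes = np.array([[49], [46], [112], [110], [103]], dtype=np.uint16)


def codes_to_str(arr):
    """Join an array of character codes into a Python string (hypothetical helper)."""
    return "".join(chr(int(c)) for c in np.asarray(arr).flatten())


# Route 1: convert each code point to a character.
filename = codes_to_str(codes)

# Route 2: fix the byte order, then decode the raw bytes as UTF-16.
filename_from_bytes = codes.astype("<u2").tobytes().decode("utf-16-le")

# Decoding 16-bit data as ASCII keeps a NUL character after every letter.
raw_ascii = codes.astype("<u2").tobytes().decode("ascii")

print(filename, filename_from_bytes, repr(raw_ascii))
```

Either of the first two routes gives a clean string that compares equal to '1.png'; the per-character route does not depend on the integer width of the stored codes.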

Now let's deconstruct the object reference you used for Method 2.

f['digitStruct']['name'].value -> returns the entire /digitStruct/name dataset as a NumPy array. It has 13,068 rows with object references

f['digitStruct']['name'].value[0] -> is the first row

f['digitStruct']['name'].value[0].item() -> copies that array element to a python scalar

So all of these point to the same object:
Method 1: f['digitStruct']['name'][0][0]
Method 2: f['digitStruct']['name'].value[0].item()
And both are the same as f['#refs#']['b'] or f['/#refs#/b'] for this example.
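The equivalence of the two indexing routes can be checked on a small stand-in file. This is a sketch under assumptions: the file name and the dataset names under /#refs# are made up, and [()] is used in place of .value so that it also runs on h5py 3.x.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical stand-in file with two referenced name datasets.
path = os.path.join(tempfile.mkdtemp(), "toy_refs.h5")

with h5py.File(path, "w") as f:
    names = f.create_dataset("digitStruct/name", shape=(2, 1), dtype=h5py.ref_dtype)
    for row, (key, text) in enumerate([("b", "1.png"), ("c", "2.png")]):
        codes = np.array([[ord(ch)] for ch in text], dtype=np.uint16)
        names[row, 0] = f.create_dataset("#refs#/" + key, data=codes).ref

with h5py.File(path, "r") as f:
    ds = f["digitStruct"]["name"]
    ref_a = ds[0][0]           # Method 1: index the dataset directly
    ref_b = ds[()][0].item()   # Method 2, with [()] standing in for .value
    name_a = f[ref_a].name
    name_b = f[ref_b].name

print(name_a, name_b)
```

The practical difference is how much gets read: ds[0] reads one row from the file, while ds[()] loads the whole reference array into memory before the row is picked out.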

Like Method 1, getting the string is plain Python and NumPy functionality:

f[f['digitStruct']['name'].value[0].item()][:].tobytes().decode('ascii')

Yes, object references are complicated. My recommendation: extract NumPy arrays from objects using NumPy indexing such as [:] instead of .value (as shown in the modified Method 1 above).

Example code for completeness. Intermediate print statements used to show what's going on.

import h5py

# Both of these methods read the name of the 1st image in the SVHN dataset.
# Note: Dataset.value was removed in h5py 3.0; on newer versions use [()] instead.
# .tobytes() replaces the deprecated .tostring().
f = h5py.File('test_digitStruct.mat', 'r')
print(f['digitStruct'])
print(f['digitStruct']['name'])
print(f['digitStruct']['name'].name)

# method 1
print('\ntest method 1')
print(f[f['digitStruct']['name'][0][0]])
print(f[f['digitStruct']['name'][0][0]].name)
# both of these get the entire array / filename:
print(f[f['digitStruct']['name'][0][0]].value)
print(f[f['digitStruct']['name'][0][0]][:])  # same as .value above
print(f[f['digitStruct']['name'][0][0]][:].tobytes().decode('ascii'))

# method 2
print('\ntest method 2')
print(f[f['digitStruct']['name'].value[0].item()])
print(f[f['digitStruct']['name'].value[0].item()].name)

# this only gets the first array member / character:
print(f[f['digitStruct']['name'].value[0].item()].value[0][0])
print(f[f['digitStruct']['name'].value[0].item()].value[0][0].tobytes().decode('ascii'))
# this gets the entire array / filename:
print(f[f['digitStruct']['name'].value[0].item()][:])
print(f[f['digitStruct']['name'].value[0].item()][:].tobytes().decode('ascii'))

f.close()

Output from last 2 print statements for each method is identical:

[[ 49]
 [ 46]
 [112]
 [110]
 [103]]
1.png
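If you need every filename rather than just the first, the same steps can be wrapped in a small helper. The following is only a sketch: read_name is a hypothetical function, and the toy file built here stands in for test_digitStruct.mat, so the group layout is assumed to match the one described in this answer.

```python
import os
import tempfile

import h5py
import numpy as np


def read_name(f, index):
    """Return the filename stored for row `index` (hypothetical helper)."""
    ref = f["digitStruct"]["name"][index][0]   # object reference for this row
    codes = f[ref][:]                          # array of character codes
    return "".join(chr(int(c)) for c in codes.flatten())


# Build a small stand-in file so the helper can be exercised.
path = os.path.join(tempfile.mkdtemp(), "toy_names.h5")
with h5py.File(path, "w") as f:
    names = f.create_dataset("digitStruct/name", shape=(2, 1), dtype=h5py.ref_dtype)
    for row, (key, text) in enumerate([("b", "1.png"), ("c", "2.png")]):
        codes = np.array([[ord(ch)] for ch in text], dtype=np.uint16)
        names[row, 0] = f.create_dataset("#refs#/" + key, data=codes).ref

# The context manager closes the file even if reading fails.
with h5py.File(path, "r") as f:
    n_rows = f["digitStruct"]["name"].shape[0]
    all_names = [read_name(f, i) for i in range(n_rows)]

print(all_names)
```

With the real file, you would open 'test_digitStruct.mat' in place of the toy path; the helper itself should not need to change as long as the layout matches.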
