How Do I Load Heterogeneous Data (np.genfromtxt) As A 2d Array?

December 06, 2023

I learned from the question "numpy.genfromtxt produces array of what looks like tuples, not a 2D array—why?" that numpy.genfromtxt returns a structured ndarray if the data is not homogeneous. How

Solution 1:

In this case it is easier to read the file with pandas and then convert the DataFrame to a NumPy array.

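import pandas as pd

# Read the tab-separated file; the first line becomes the column names.
foo = pd.read_csv('table.dat', sep='\t')
type(foo)
# <class 'pandas.core.frame.DataFrame'>

# as_matrix() was removed in pandas 1.0; to_numpy() replaces it.
bar = foo.to_numpy()
bar
# array([[10,  7,  6,  7, 10],
#        [ 5, 10,  2,  1,  3],
#        [ 7,  6,  5,  3,  6],
#        [ 5,  8,  5,  2,  7],
#        [ 1,  2,  2, 10,  8],
#        [10,  5,  9,  3,  8],
#        [ 5,  2,  4,  4,  2]])
bar.shape
# (7, 5)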
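
One caveat if the columns really are heterogeneous: converting the DataFrame to a 2D array then has to fall back to a common dtype, usually object. The following is only a sketch with made-up inline data; the column names and values are assumptions, not the contents of table.dat.

```python
import io
import pandas as pd

# Hypothetical tab-separated data: a string, an integer and a float column.
text = "name\tcount\tscore\nfoo\t1\t0.5\nbar\t2\t1.5\n"
df = pd.read_csv(io.StringIO(text), sep="\t")

# Mixed column types: the resulting 2D array has dtype=object.
mixed = df.to_numpy()

# Keeping only the numeric columns gives a homogeneous float array.
numeric = df.select_dtypes(include="number").to_numpy()

print(mixed.dtype, mixed.shape)
print(numeric.dtype, numeric.shape)
```

Whether an object array or a numeric subset is the better choice depends on what is done with the data afterwards.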

Solution 2:

I got this to work with:

import numpy as np

table = np.genfromtxt('table.dat',
                      dtype=None,
                      skip_header=1)

Here's why it works:

You should use consecutive whitespace as the delimiter (the default), not tabs (unless the snippet you posted has lost formatting).
You should let NumPy infer the dtype (dtype=None), rather than using the default float.
To get the desired output in your question, simply skip the header row (skip_header=1) rather than have the function build a structured dtype from it.
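
The three points above can be tried without the original file. The sketch below uses an in-memory stand-in for table.dat; the header names A to E are an assumption, and only the first three data rows from Solution 1 are reproduced.

```python
import io
import numpy as np

# Hypothetical stand-in for table.dat: one header line, then tab-separated integers.
text = (
    "A\tB\tC\tD\tE\n"
    "10\t7\t6\t7\t10\n"
    "5\t10\t2\t1\t3\n"
    "7\t6\t5\t3\t6\n"
)

# Default delimiter (any run of whitespace, tabs included), inferred dtype,
# and the header line skipped instead of being turned into field names.
table = np.genfromtxt(io.StringIO(text), dtype=None, skip_header=1)

print(table.shape)       # (3, 5)
print(table.dtype.kind)  # i (an integer dtype; the exact width is platform dependent)
```

Because every column is inferred as the same integer type, the result is an ordinary 2D array rather than a structured one.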

Check out the docs: http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.genfromtxt.html for more details.

I agree a pandas DataFrame may be more appropriate if you are essentially reading in a CSV file.

Solution 3:

Your data looks homogeneous: all int except for the header. But by passing names=True you force genfromtxt to load it as a structured array. Look at the dtype.

Try skip_header=1 (check the syntax). Omit names (or leave it at its default, None).

In other words you want to load integers, ignoring the header line.

The tab delimiter appears to be working ok.

I see from a comment that you have discovered the view method of converting a structured array. That gives you both header names and a 2d view.
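
A rough sketch of that view approach follows, again with made-up inline data (the header names A to E are an assumption). The view trick only applies when every field has the same dtype; numpy.lib.recfunctions.structured_to_unstructured is shown as a more general alternative.

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical stand-in for the file: header line plus two rows of integers.
text = "A\tB\tC\tD\tE\n10\t7\t6\t7\t10\n5\t10\t2\t1\t3\n"

# names=True takes the field names from the header, giving a 1D structured array.
rec = np.genfromtxt(io.StringIO(text), dtype=None, names=True)

# View trick: reinterpret the record bytes as the (shared) field dtype,
# then reshape to one row per record.
as_view = rec.view(rec.dtype[0]).reshape(len(rec), -1)

# More general alternative provided by NumPy itself.
plain = rfn.structured_to_unstructured(rec)

print(rec.dtype.names)  # ('A', 'B', 'C', 'D', 'E')
print(as_view.shape)    # (2, 5)
```

The header names stay available through rec.dtype.names, while as_view and plain give the 2D layout.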
