
Merging List Of DFs With Alternating Columns Output Using Pandas

I have the following code: import pandas as pd rep1 = pd.DataFrame.from_items([('Probe', ['x', 'y', 'z']), ('Gene', ['foo', 'bar', 'qux']), ('RP1',[1.00,23.22,11.12]),('RP1',['A'

Solution 1:

Each DataFrame has two columns with the same name (RP1 appears twice in rep1), so you could dedupe the column names before merging. Here's a kind of hacky way:

In [11]: list(rep1.columns[0:2]) + [rep1.columns[2] + "_value"] + [rep1.columns[2] + "_letter"]
Out[11]: ['Probe', 'Gene', 'RP1_value', 'RP1_letter']

In [12]: for rep in tmp:
   .....:     rep.columns = list(rep.columns[0:2]) + [rep.columns[2] + "_value"] + [rep.columns[2] + "_letter"]
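
A less position-dependent variant of the renaming above, as a minimal sketch. The helper name dedupe_columns and its suffixes parameter are hypothetical (they are not part of pandas), and rep1 is rebuilt here from the values shown in this post. It assumes each repeated name occurs no more often than there are suffixes.

```python
from collections import Counter

import pandas as pd


def dedupe_columns(df, suffixes=("_value", "_letter")):
    """Return a copy of df where repeated column names get a suffix.

    The n-th occurrence of a repeated name receives suffixes[n];
    names that occur only once are left unchanged.
    (Hypothetical helper, not a pandas function.)
    """
    counts = Counter(df.columns)
    seen = Counter()
    new_names = []
    for name in df.columns:
        if counts[name] > 1:
            new_names.append(f"{name}{suffixes[seen[name]]}")
            seen[name] += 1
        else:
            new_names.append(name)
    out = df.copy()
    out.columns = new_names
    return out


# rep1 rebuilt row by row so that the duplicate column name RP1 is preserved.
rep1 = pd.DataFrame(
    [["x", "foo", 1.00, "A"], ["y", "bar", 23.22, "B"], ["z", "qux", 11.12, "C"]],
    columns=["Probe", "Gene", "RP1", "RP1"],
)
renamed = dedupe_columns(rep1)
print(list(renamed.columns))
```

Unlike the in-place loop, this returns a renamed copy, so it would be applied as tmp = [dedupe_columns(rep) for rep in tmp].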

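With the column names unique, merge all the DataFrames pairwise with reduce (on Python 3, run from functools import reduce first):

In [13]: reduce(pd.merge,tmp)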
Out[13]:
  Probe Gene  RP1_value RP1_letter  RP2_value RP2_letter  RP3_value RP3_letter
0     x  foo       1.00          A       3.33          G      99.99          M
1     y  bar      23.22          B      77.22          I      98.29          P

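By default pd.merge does an inner join, so probes that are missing from any of the DataFrames (z and k here) are dropped. To keep them, with NaN for the missing values, specify an outer merge: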

In [21]: reduce(lambda x, y: pd.merge(x, y, how='outer'),tmp)
Out[21]:
  Probe Gene  RP1_value RP1_letter  RP2_value RP2_letter  RP3_value RP3_letter
0     x  foo       1.00          A       3.33          G      99.99          M
1     y  bar      23.22          B      77.22          I      98.29          P
2     z  qux      11.12          C      18.12          K        NaN        NaN
3     k  kux        NaN        NaN        NaN        NaN       8.10          J
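
For reference, the outer merge can be sketched as a self-contained Python 3 script. The three input frames are reconstructed from the merged output shown in this answer, with the column names already deduplicated. The variable names keys and merged, and passing the key columns explicitly through on=, are choices made for this sketch rather than part of the original answer.

```python
from functools import reduce  # reduce is no longer a builtin on Python 3

import pandas as pd

# Inputs reconstructed from the merged output; columns are already unique.
rep1 = pd.DataFrame({
    "Probe": ["x", "y", "z"],
    "Gene": ["foo", "bar", "qux"],
    "RP1_value": [1.00, 23.22, 11.12],
    "RP1_letter": ["A", "B", "C"],
})
rep2 = pd.DataFrame({
    "Probe": ["x", "y", "z"],
    "Gene": ["foo", "bar", "qux"],
    "RP2_value": [3.33, 77.22, 18.12],
    "RP2_letter": ["G", "I", "K"],
})
rep3 = pd.DataFrame({
    "Probe": ["x", "y", "k"],
    "Gene": ["foo", "bar", "kux"],
    "RP3_value": [99.99, 98.29, 8.10],
    "RP3_letter": ["M", "P", "J"],
})
tmp = [rep1, rep2, rep3]

# Naming the key columns makes the join condition explicit instead of
# relying on pd.merge picking up every shared column name.
keys = ["Probe", "Gene"]
merged = reduce(
    lambda left, right: pd.merge(left, right, on=keys, how="outer"),
    tmp,
)
print(merged)
```

The row order of an outer merge may differ between pandas versions, so sort explicitly (for example with merged.sort_values("Probe")) if the order matters.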
