Using Merge On A Column And Index In Pandas
I have two separate dataframes that share a project number. In type_df, the project number is the index. In time_df, the project number is a column. I would like to count the numbe
Solution 1:
If you want to use an index in your merge, you have to specify left_index=True or right_index=True for the dataframe whose index holds the key, and then use right_on or left_on to name the key column in the other dataframe. For you it should look something like this:
merged = pd.merge(type_df, time_df, left_index=True, right_on='Project')
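A minimal sketch of that call, followed by the count per project type. The sample frames here are an assumption: they are rebuilt from the small output shown in Solution 3, not from the asker's real data.

```python
import pandas as pd

# Assumed sample data: project number is the index of type_df ...
type_df = pd.DataFrame(
    {"Project Type": ["Type 2", "Type 1"]},
    index=["Project1", "Project2"],
)
# ... and a regular column of time_df.
time_df = pd.DataFrame(
    {"Project": ["Project1", "Project1", "Project2"], "Time": [13, 12, 41]}
)

# Match type_df's index against time_df's 'Project' column.
merged = pd.merge(type_df, time_df, left_index=True, right_on="Project")

# Count how many time_df rows fall under each project type.
counts = merged["Project Type"].value_counts()
print(merged)
print(counts)
```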
Solution 2:
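Another solution is to use DataFrame.join. Its on parameter names a column in the calling dataframe, which is matched against the index of the other one, so time_df has to be the caller:
df3 = time_df.join(type_df, on='Project')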
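A short sketch of the join approach, assuming the same kind of sample frames as the small output in Solution 3 (the data is illustrative). DataFrame.join matches the caller's on column against the other frame's index, so the frame that holds 'Project' as a column is the caller here.

```python
import pandas as pd

# Assumed sample data: 'Project' is the index of type_df and a column of time_df.
type_df = pd.DataFrame(
    {"Project Type": ["Type 2", "Type 1"]},
    index=["Project1", "Project2"],
)
time_df = pd.DataFrame(
    {"Project": ["Project1", "Project1", "Project2"], "Time": [13, 12, 41]}
)

# Left join by default: every row of time_df is kept and gets its project type.
df3 = time_df.join(type_df, on="Project")
print(df3)
```

Because join defaults to how='left', projects missing from type_df would show NaN in 'Project Type' rather than being dropped.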
For pandas 0.23.0+, the on, left_on, and right_on parameters may refer to either column names or index level names:
left_index = pd.Index(['K0', 'K0', 'K1', 'K2'], name='key1')
left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3'],
                     'key2': ['K0', 'K1', 'K0', 'K1']},
                    index=left_index)
right_index = pd.Index(['K0', 'K1', 'K2', 'K2'], name='key1')
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key2': ['K0', 'K0', 'K0', 'K1']},
                     index=right_index)
print (left)
       A   B key2
key1
K0    A0  B0   K0
K0    A1  B1   K1
K1    A2  B2   K0
K2    A3  B3   K1
print (right)
       C   D key2
key1
K0    C0  D0   K0
K1    C1  D1   K0
K2    C2  D2   K0
K2    C3  D3   K1
df = left.merge(right, on=['key1', 'key2'])
print (df)
       A   B key2   C   D
key1
K0    A0  B0   K0  C0  D0
K1    A2  B2   K0  C1  D1
K2    A3  B3   K1  C3  D3
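Applied to the frames from the question, the same feature might look like the sketch below. It assumes pandas 0.23.0 or later and that the index of type_df is named 'Project'; the sample data is illustrative only.

```python
import pandas as pd

# Assumed sample data; note that the index of type_df carries the name 'Project'.
type_df = pd.DataFrame(
    {"Project Type": ["Type 2", "Type 1"]},
    index=pd.Index(["Project1", "Project2"], name="Project"),
)
time_df = pd.DataFrame(
    {"Project": ["Project1", "Project1", "Project2"], "Time": [13, 12, 41]}
)

# 'Project' resolves to a column in time_df and to an index level in type_df.
merged = time_df.merge(type_df, on="Project")

# Number of time_df rows belonging to 'Type 2' projects.
n_type2 = (merged["Project Type"] == "Type 2").sum()
print(n_type2)
```

If the index of type_df has no name, it could be set first with type_df.index.name = 'Project'.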
Solution 3:
A simple alternative is to give both dataframes the same column to merge on.
In this case, just make a 'Project' column for type_df from its index, then merge on that:
type_df['Project'] = type_df.index.values
merged = pd.merge(time_df,type_df, on='Project', how='inner')
merged
#     Project  Time Project Type
# 0  Project1    13       Type 2
# 1  Project1    12       Type 2
# 2  Project2    41       Type 1
print(merged[merged['Project Type'] == 'Type 2']['Project Type'].count())
2