Python Parse Dataframe Element
I have a pandas dataframe column (Data Type) which I want to split into three columns.
target_table_df = LoadS_A [['Attribute Name', 'Data Type',
Solution 1:
Use target_table_df['Data Type'].str.extract(pattern).
You'll need to assign pattern to be a regular expression that captures each of the components you're looking for.
pattern = r'([^\(]+)(\(([^,]*),(.*)\))?'
([^\(]+)
says to grab as many characters that are not an open parenthesis as you can, up to the first open parenthesis.
\(([^,]*),
says to grab the first run of non-comma characters after an open parenthesis and stop at the comma.
,(.*)\)
says to grab the rest of the characters between the comma and the close parenthesis.
(\(([^,]*),(.*)\))?
says the whole parenthesized part is optional: grab it if it is there.
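To see what each group captures, here is a minimal sketch using the standard re module on two made-up values. The strings 'decimal(18,4)' and 'date' and the helper split_type are assumptions for illustration, not taken from the question's data.

```python
import re

pattern = r'([^\(]+)(\(([^,]*),(.*)\))?'

def split_type(value):
    """Return the four capture groups for one 'Data Type' string (hypothetical helper)."""
    return re.match(pattern, value).groups()

# Assumed sample values, one with a parenthesized part and one without.
with_parens = split_type('decimal(18,4)')
without_parens = split_type('date')

print(with_parens)     # ('decimal', '(18,4)', '18', '4')
print(without_parens)  # ('date', None, None, None)
```

Group 2 is the whole parenthesized part, which is why it is usually not wanted in the final result.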
Solution
Everything together looks like this:
pattern = r'([^\(]+)(\(([^,]*),(.*)\))?'
# s is the column to split
s = target_table_df['Data Type']
df = s.str.extract(pattern, expand=True).iloc[:, [0, 2, 3]]
# Formatting to get it how you wanted
df.columns = ['Data Type', 'Precision', 'Scale']
df.index.name = None
print(df)
I put a .iloc[:, [0, 2, 3]] at the end because the pattern I used grabs the whole parenthesis in column 1 and I wanted to skip it. Leave it off and see.
  Data Type Precision Scale
0   decimal        18     4
1    number        11     0
2      date       NaN   NaN
3   decimal        18     4
4   decimal        18     4
5    number        11     0
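A possible variation, not part of the answer above: if you make the outer parenthesis a non-capturing group with (?: ... ), str.extract returns only the three wanted columns and the .iloc step is unnecessary. The sample series below is an assumption standing in for the real 'Data Type' column.

```python
import pandas as pd

# Hypothetical sample values standing in for target_table_df['Data Type'].
s = pd.Series(['decimal(18,4)', 'number(11,0)', 'date'])

# (?: ... ) groups without capturing, so only three columns come back.
pattern = r'([^\(]+)(?:\(([^,]*),(.*)\))?'
df = s.str.extract(pattern, expand=True)
df.columns = ['Data Type', 'Precision', 'Scale']
print(df)
```

Precision and Scale come back as strings (or NaN where there is no parenthesis), the same as with the original pattern.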