
TypeError: 'GroupedData' Object Is Not Iterable In Pyspark

I'm using Spark version 2.0.1 and Python 2.7, and I'm running the following code:

    from pyspark.sql.functions import monotonically_increasing_id

    # This will return a new DF with all the columns + id
    data1 = data.withColumn('id', monotonically_increasing_id())
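The question excerpt is cut off here, but the error in the title comes from looping over a GroupedData object directly. A minimal sketch of how that typically happens (grouping on the new 'id' column is an assumption, not shown in the original post):

    # Hypothetical continuation: groupby() returns a GroupedData object, not rows.
    grouped = data1.groupby('id')

    # Iterating over GroupedData raises:
    # TypeError: 'GroupedData' object is not iterable
    for row in grouped:
        print(row)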

Solution 1:

You have to perform an aggregation on the GroupedData and collect the results before you can iterate over them, e.g. count the items per group:

    res = df.groupby(field).count().collect()
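Put together, here is a minimal end-to-end sketch of that pattern (the column names and sample data are placeholders, not taken from the original post):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("groupeddata-example").getOrCreate()

    df = spark.createDataFrame(
        [("a", 1), ("a", 2), ("b", 3)],
        ["field", "value"],
    )

    # groupby() alone returns a GroupedData object; aggregate first, then collect.
    res = df.groupby("field").count().collect()

    # collect() returns a list of Row objects, which is iterable.
    for row in res:
        print("%s: %d" % (row["field"], row["count"]))

The same applies to any other aggregation (agg, sum, avg, etc.): it is the aggregated DataFrame, or the list returned by collect(), that you iterate over, never the GroupedData itself.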

