
TypeError: 'GroupedData' Object Is Not Iterable In Pyspark

I'm using Spark version 2.0.1 and Python 2.7, and I'm running the following code:

    from pyspark.sql.functions import monotonically_increasing_id

    # This will return a new DF with all the columns + id
    data1 = data.withColumn('id', monotonically_increasing_id())
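The question excerpt is cut off here, but the error in the title comes from looping over a GroupedData object directly. A minimal sketch of how that typically happens (grouping on the new 'id' column is an assumption, not shown in the original post):

    # Hypothetical continuation: groupby() returns a GroupedData object, not rows.
    grouped = data1.groupby('id')

    # Iterating over GroupedData raises:
    # TypeError: 'GroupedData' object is not iterable
    for row in grouped:
        print(row)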

Solution 1:

You have to perform an aggregation on the GroupedData and collect the results before you can iterate over them, e.g. count the items per group:

    res = df.groupby(field).count().collect()
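Put together, here is a minimal end-to-end sketch of that pattern (the column names and sample data are placeholders, not taken from the original post):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("groupeddata-example").getOrCreate()

    df = spark.createDataFrame(
        [("a", 1), ("a", 2), ("b", 3)],
        ["field", "value"],
    )

    # groupby() alone returns a GroupedData object; aggregate first, then collect.
    res = df.groupby("field").count().collect()

    # collect() returns a list of Row objects, which is iterable.
    for row in res:
        print("%s: %d" % (row["field"], row["count"]))

The same applies to any other aggregation (agg, sum, avg, etc.): it is the aggregated DataFrame, or the list returned by collect(), that you iterate over, never the GroupedData itself.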

