The Spark code does not return the unique values. Not sure why you don't do something like this:
```
df.select('col_name').distinct().show()
```
The variable `calcN` isn't explained either. Overall, this seems like a flawed comparison.
CTO, Data Scientist & Chartered Engineer (MEng CEng EUR ING MRAeS) with over 20 years experience in the Aerospace, Rail & Energy Industry.