Install Anaconda in Cloudera
Oct 28, 2020
In Cloudera Manager, go to the Parcels tab under Hosts.
Open Configuration and add another remote parcel repository URL for Anaconda:
https://repo.continuum.io/pkgs/misc/parcels/
Click Save Changes.
The Anaconda parcel will now appear in the parcel list, and you can download and distribute it on your Hadoop cluster.
Once distribution has finished, you can run PySpark jobs with the Anaconda interpreter by setting PYSPARK_PYTHON in front of the spark-submit command:
PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda-2019.10/bin/python spark-submit count.py
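If you launch jobs from a Python wrapper rather than typing the command by hand, the same idea can be sketched as below. This is only a rough sketch: the helper names (anaconda_python, submit_env, submit_command) are made up for illustration, and the parcel root and version are assumed to match the path in the command above, so adjust them to whatever your cluster actually has.

```python
import os
import posixpath
import shlex

# Assumption: these match the parcel path used in the command above.
PARCEL_ROOT = "/opt/cloudera/parcels"
ANACONDA_PARCEL = "Anaconda-2019.10"


def anaconda_python(parcel_root=PARCEL_ROOT, parcel=ANACONDA_PARCEL):
    """Return the path of the Python interpreter inside the Anaconda parcel."""
    return posixpath.join(parcel_root, parcel, "bin", "python")


def submit_env(base_env=None):
    """Copy an environment and point PYSPARK_PYTHON at the parcel interpreter."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_PYTHON"] = anaconda_python()
    return env


def submit_command(script):
    """Build the spark-submit argument list for a PySpark script."""
    return ["spark-submit", script]


env = submit_env()
cmd = submit_command("count.py")

# Show the equivalent shell command without running anything.
print("PYSPARK_PYTHON=" + env["PYSPARK_PYTHON"],
      " ".join(shlex.quote(arg) for arg in cmd))

# To actually launch the job on a cluster node, something like:
#   import subprocess
#   subprocess.run(cmd, env=env, check=True)
```

The script only prints the command it would run; the subprocess call is left commented out so nothing is submitted by accident.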
Hope it helps.