Dear all,
I am a beginner in Python programming with Spark, using PySpark. I am facing the following two problems:
1) How to make each partition of an RDD (obtained from spark_context.wholeTextFiles()) hold the content of a single file.
2) How to execute an asynchronous RDD function on all partitions in parallel.
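To make the two problems concrete, here is a rough sketch of the kind of thing I am after. It is only a guess at an approach, not something I know to be the right way. It assumes that wholeTextFiles() gives (path, content) pairs, that partitionBy() with a custom partition function can send each path to its own partition, and that submitting the action from a driver-side thread is an acceptable stand-in for an async action (as far as I can tell, PySpark does not expose the Scala foreachPartitionAsync). The helper names, the per-partition work, and the app name are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def build_partition_index(paths):
    """Map each distinct file path to its own partition id (0..n-1)."""
    return {path: i for i, path in enumerate(sorted(set(paths)))}


def one_file_per_partition(files_rdd):
    """Repartition a (path, content) pair RDD so each partition holds one file."""
    # Only the keys (file paths) are collected to the driver, not the contents.
    index = build_partition_index(files_rdd.keys().collect())
    # partitionBy uses partitionFunc(key) % numPartitions, so with
    # numPartitions == len(index) every path lands in its own partition.
    return files_rdd.partitionBy(len(index), lambda path: index[path])


def process_partition(records):
    """Hypothetical per-partition work: yield (path, character count)."""
    for path, content in records:
        yield path, len(content)


def main(input_glob):
    """Sketch of the whole flow; input_glob is a placeholder for the real path."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="one-file-per-partition")  # hypothetical app name
    try:
        files = sc.wholeTextFiles(input_glob)
        per_file = one_file_per_partition(files)
        # Spark itself runs the partitions in parallel on the executors.
        # The thread only keeps the driver from blocking on the action.
        with ThreadPoolExecutor(max_workers=1) as pool:
            future = pool.submit(per_file.mapPartitions(process_partition).collect)
            # ... the driver could do other work here ...
            return future.result()
    finally:
        sc.stop()
```

Calling main() with the real input location would return a list of (path, character count) pairs. Whether this is a sensible design for many or very large files is exactly what I am unsure about.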
I hope to hear from you soon.
Thanks,
MinhDQ