Spark Memory Model

Spark is a distributed data processing framework. It works best with huge volumes of data (what we commonly call Big Data). Because we deal with enormous amounts of data, it is very important to understand the memory model of the framework, which gives you much better control over how data is processed. Here we will assume Spark is running on top of the YARN resource manager. Spark has two main components where memory is a concern:

Driver
Executor

Driver

The driver is where all the local computations happen. Sometimes too much data is collected to the driver from the executors, or a heavy local computation is performed there. These kinds of memory-intensive activities can slow down or break your application. The driver has two memory divisions, i.e. driver overhead and driver memory. Driver overhead is an amount of off-heap memory (in megabytes). This memory accounts for JVM overheads, interned s...
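To make these two driver settings concrete, here is a minimal sketch in Scala of how the memory knobs are typically supplied. The configuration keys (spark.driver.memory, spark.driver.memoryOverhead, and their executor counterparts) are standard Spark-on-YARN settings, but the sizes shown are illustrative assumptions only; in practice the driver values are usually passed via spark-submit --conf so they take effect before the driver JVM starts.

// Minimal sketch: sizing driver and executor memory for Spark on YARN.
// The values below are assumptions for illustration; tune them to your workload.
import org.apache.spark.sql.SparkSession

object MemoryModelDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("memory-model-demo")
      // Heap for the driver JVM (where collected results and local computation live).
      .config("spark.driver.memory", "4g")
      // Off-heap overhead for the driver container (JVM overheads, interned strings, native buffers).
      .config("spark.driver.memoryOverhead", "512m")
      // Heap for each executor JVM.
      .config("spark.executor.memory", "8g")
      // Off-heap overhead per executor container.
      .config("spark.executor.memoryOverhead", "1g")
      .getOrCreate()

    // Collecting or aggregating a large dataset on the driver is exactly the kind
    // of operation that puts pressure on spark.driver.memory.
    val count = spark.range(0, 1000000).count()
    println(s"count = $count")

    spark.stop()
  }
}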