#big-data
Articles with this tag
When we talk about Spark on top of Hadoop, it's generally Hadoop core with Spark as the compute engine instead of MapReduce, i.e. (HDFS, Spark, YARN). Spark...
Sometimes in a Spark application, we need to share small data across all the machines for processing. For example, if you want to filter some set of...
In simple terms, Apache Spark is an in-memory unified parallel compute engine. In memory: most of the operations in Apache Spark happen in memory and...
We cannot use an analytical storage system for transactional requirements, and vice versa. But have you ever wondered why that is so? Transactional vs...
Data ingestion is one of the crucial steps in the data lifecycle, and when the source is a relational database, Sqoop can be a very easy and simple...
Although MapReduce is not used much for solving Big Data problems nowadays because of its poor performance compared to Spark, it is still a very...