#big-data
Articles with this tag
When we talk about Spark on top of Hadoop, it's generally Hadoop core with Spark as the compute engine instead of MapReduce, i.e. (HDFS, Spark, YARN). Spark...
Sometimes in a Spark application, we need to share small data across all the machines for processing. For example, if you want to filter some set of...
In simple terms, Apache Spark is an in-memory unified parallel compute engine. In memory: most of the operations in Apache Spark happen in memory and...
We cannot use an analytical storage system for transactional requirements, and vice versa. But have you ever wondered why that is so? Transactional vs...
Data ingestion is one of the crucial steps in the data lifecycle, and when the source is a relational database, Sqoop can be a very easy and simple...
Although MapReduce is not used much for solving Big Data problems nowadays because of its poor performance compared to Spark, it is still a very...