Yash Srivastava's Blog

Tag

hadoop

Articles with this tag

Shared variables in Spark

Jan 9, 2023 · 2 min read

Sometimes in a Spark application, we need to share a small amount of data across all the machines for processing. For example, if you want to filter some set of...

Introduction to Hive

Dec 30, 2022 · 4 min read

We cannot use an analytical storage system for transactional requirements, and vice versa. But have you ever wondered why that is so? Transactional vs...

Introduction to SQOOP in Hadoop

Dec 27, 2022 · 4 min read

Data ingestion is one of the crucial steps in the data lifecycle, and when the source is a relational database, Sqoop can be a very easy and simple...

MapReduce in Hadoop

Dec 25, 2022 · 4 min read

MapReduce is rarely used to solve Big Data problems nowadays because of its poor performance compared to Spark, but it's still a very...

HDFS Recovery mechanisms

Dec 22, 2022 · 2 min read

Data nodes in HDFS generally run on commodity hardware, which is low-priced but fails frequently. What happens to the data when a Data Node...

Basic HDFS Architecture

Dec 21, 2022 · 2 min read

What is HDFS? HDFS (Hadoop Distributed File System) is one of the three core Hadoop components. As its name suggests, it's a distributed storage file system...
