#hadoop
Articles with this tag
Sometimes in a Spark application, we need to share small data across all the machines for processing. For example, if you want to filter some set of...
We cannot use an analytical storage system for transactional requirements and vice versa. But have you ever wondered why that is so? Transactional vs...
Data ingestion is one of the crucial steps in the data lifecycle, and when the source is a relational database, Sqoop can be a very easy and simple...
MapReduce is not much used for solving Big Data problems nowadays because of its poor performance compared to Spark, but it's still a very...
Data nodes in HDFS are generally commodity hardware, which is low-priced but goes down very frequently. What happens to the data when a Data Node...
What is HDFS? HDFS (Hadoop Distributed File System) is one of the three core Hadoop components. As its name suggests, it's a distributed storage file system...