The hadoop-env.sh file is located in the conf directory.
32. In HADOOP_PID_DIR, what does PID stand for?
PID stands for ‘Process ID’.
33. What does /var/hadoop/pids do?
It stores the PID files of the Hadoop daemons, that is, the files that record each daemon's process ID.
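As a rough sketch of how this directory is used, assume HADOOP_PID_DIR is set to /var/hadoop/pids in hadoop-env.sh. The helper name daemon_is_running is hypothetical, and the pid file name in the usage comment follows the hadoop-<user>-<daemon>.pid pattern used by the classic hadoop-daemon.sh script, so check it against your own version.

```shell
# In conf/hadoop-env.sh (assumed value, matching the path in the question):
export HADOOP_PID_DIR=/var/hadoop/pids

# Hypothetical helper: report whether the daemon recorded in a pid file is alive.
# Usage: daemon_is_running /var/hadoop/pids/hadoop-hadoop-namenode.pid
daemon_is_running() {
  pid_file="$1"
  # No pid file means the daemon was never started (or was stopped cleanly).
  [ -f "$pid_file" ] || return 1
  pid="$(cat "$pid_file")"
  # kill -0 sends no signal; it only checks that the process exists
  # and can be signalled by the current user.
  kill -0 "$pid" 2>/dev/null
}
```

The function returns 0 when the recorded process is alive and non-zero otherwise, so it can be used directly in an if statement.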
34. What does hadoop-metrics.properties file do?
hadoop-metrics.properties is used for reporting purposes. It controls how Hadoop reports its metrics. By default, nothing is reported.
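To see which metrics contexts actually report on a given node, the file can be inspected from the shell. This is a minimal sketch, assuming the classic metrics configuration in which each context has a "<context>.class" entry and the no-op class org.apache.hadoop.metrics.spi.NullContext means reporting is off. The helper name reporting_contexts is hypothetical.

```shell
# Hypothetical helper: list contexts in a hadoop-metrics.properties file that
# are set to something other than a NullContext class, i.e. contexts that report.
# Usage: reporting_contexts "$HADOOP_CONF_DIR/hadoop-metrics.properties"
reporting_contexts() {
  # Keep uncommented "<context>.class=<class>" lines, drop the NullContext ones,
  # and print only the context name (the part before ".class").
  grep -E '^[A-Za-z]+\.class=' "$1" | grep -v 'NullContext' | sed 's/\.class=.*//'
}
```

An empty result means every context is still on the non-reporting default.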
35. What are the network requirements for Hadoop?
The Hadoop core uses Secure Shell (SSH) to launch the server processes on the slave nodes. It requires a password-less SSH connection between the master and all the slaves and the secondary machines.
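A password-less connection is normally set up with an SSH key pair that has an empty passphrase. The following is a minimal sketch rather than a definitive procedure: the helper name setup_passwordless_ssh is hypothetical, and the user and host names in the comments are placeholders.

```shell
# Hypothetical helper: create a passphrase-less key pair and authorize it.
# The optional argument is the key directory (defaults to ~/.ssh).
setup_passwordless_ssh() {
  key_dir="${1:-$HOME/.ssh}"
  mkdir -p "$key_dir" && chmod 700 "$key_dir"
  # Generate the key only once; -N "" sets an empty passphrase.
  [ -f "$key_dir/id_rsa" ] || ssh-keygen -q -t rsa -N "" -f "$key_dir/id_rsa"
  # Authorize the public key (once) so this account can log in without a password.
  pub="$(cat "$key_dir/id_rsa.pub")"
  grep -qxF "$pub" "$key_dir/authorized_keys" 2>/dev/null \
    || echo "$pub" >> "$key_dir/authorized_keys"
  chmod 600 "$key_dir/authorized_keys"
}

# On the master, the same public key is then copied to each slave, for example:
#   ssh-copy-id hadoop@slave1
```

Calling setup_passwordless_ssh with no argument covers the single-node case, where the master logs in to localhost. The function is safe to run more than once.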
36. Why do we need a password-less SSH in Fully Distributed environment?
We need password-less SSH in a fully distributed environment because, when the cluster is live, the master connects to the other nodes very frequently, and entering a password for every connection is not practical. For example, the scripts that start and stop the daemons log in to each slave over SSH, so they must be able to connect without a prompt.
37. Does this lead to security issues?
No, not at all. A Hadoop cluster is usually an isolated cluster that is not exposed to the internet, and it is configured differently from a public-facing system. We need not worry about that kind of security breach, for instance, someone hacking in through the internet. Hadoop also has a secure way of connecting to other machines to fetch and process data.
38. On which port does SSH work?
SSH works on port 22 by default, though the port number can be configured.
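If sshd on the cluster nodes listens on a non-default port, the Hadoop control scripts have to be told about it. A minimal sketch, assuming port 2222 (an arbitrary example value) and the HADOOP_SSH_OPTS variable from hadoop-env.sh, whose contents the classic control scripts pass to ssh:

```shell
# In conf/hadoop-env.sh: extra options the control scripts pass to ssh.
# 2222 is an example value; use whatever port sshd actually listens on.
export HADOOP_SSH_OPTS="-p 2222"

# Equivalent manual check from the master (user and hostname are placeholders):
#   ssh -p 2222 hadoop@slave1 hostname
```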
39. Can you tell us more about SSH?
SSH stands for Secure Shell. It is a protocol for secure communication that works on port 22 by default, and when you open an SSH connection you are normally asked for a password.
40. Why is a password needed for SSH to localhost?
A password is required in SSH for security, and it is requested whenever password-less communication has not been set up.
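Whether a host still asks for a password can be checked without getting stuck at the prompt. This is a minimal sketch using the standard OpenSSH BatchMode option, which disables password prompts so the login fails instead of waiting for input. The helper name can_ssh_without_password and the 5-second timeout are assumptions chosen for illustration.

```shell
# Hypothetical helper: succeed only if the host accepts a login without a prompt.
# Usage: can_ssh_without_password localhost && echo "password-less SSH is set up"
can_ssh_without_password() {
  # BatchMode=yes makes ssh fail instead of prompting for a password;
  # "true" is a no-op remote command, so only the login itself is tested.
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}
```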