Step 1 Reading
1. Read Chapter 5: MapReduce Details for Multimachine Clusters (in Pro Hadoop; Books24x7).
2. Read about Hive and Pig.
Step 2 Using a reporting and visualization tool such as Qlikview
In this module, use Qlikview or a similar reporting tool such as Tableau to download data from your Hive server to a Windows machine over an ODBC connection.
If your cluster becomes non-functional for any reason, recreate it as you did in Task 1.
** Before beginning this task, ensure that all of your services are running so that the configuration works properly. **
1. In VirtualBox, press CTRL+S to open the settings. Navigate to the network settings and, under port forwarding, ensure that the following rule is set:
Name   Protocol  Host IP    Host Port  Guest IP   Guest Port
10000  TCP       127.0.0.1  10000      10.0.2.15  10000
2. Install the Cloudera Hive 64-bit ODBC Driver. Click Next until the installation is completed.
3. Click Start and type ODBC. When the ODBC configuration manager appears, click to open it.
4. Click Add, select the Cloudera Hive ODBC connector, and then configure it using the following information:
Data Source Name: hive
Hive Server Type: Hive Server 2
Authentication Mechanism: User name and password
User Name: cloudera
Test the connection. If it reports Tests Completed Successfully!, you are good to go. Click OK, and then OK again, until ODBC administration is closed.
5. Install Qlikview. Click Next until the installation is completed.
6. Open Qlikview, click File -> New, and then close the wizard using the X.
7. Click on File -> Edit Script. When the window pops up, click Connect and enter the Cloudera credentials as needed. Click Test Connection if you would like to try it again.
8. Click Select and identify the table and columns as needed from the menu. Click OK to add the lines to the script.
9. Click RELOAD to execute the script and connect to the server.
10. Once Qlikview has connected, select the columns and click ADD for those fields you wish to add. Click OK.
11. Now make a chart of your choosing using the quick chart wizard.
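If the ODBC test in step 4 or the Qlikview connection in step 7 fails, it can help to check the pieces outside the GUI. The following is a minimal Python sketch, not part of the assignment itself. It assumes the port forwarding rule from step 1 (127.0.0.1:10000) and the DSN named hive with user cloudera from step 4. The function names are hypothetical, the password is a placeholder you must supply, and list_hive_tables assumes the third-party pyodbc package is installed.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def build_connection_string(dsn: str, user: str, password: str) -> str:
    """Build an ODBC connection string that refers to an existing DSN."""
    return f"DSN={dsn};UID={user};PWD={password}"


def list_hive_tables(conn_str: str) -> list:
    """List table names through the DSN (assumes pyodbc is installed)."""
    import pyodbc  # imported lazily so the helpers above work without it

    # Hive does not support transactions, so open the connection in autocommit mode.
    conn = pyodbc.connect(conn_str, autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute("SHOW TABLES")
        return [row[0] for row in cursor.fetchall()]
    finally:
        conn.close()


# Example usage (values taken from steps 1 and 4; the password is a placeholder):
#   if port_is_open("127.0.0.1", 10000):
#       conn_str = build_connection_string("hive", "cloudera", "<your password>")
#       print(list_hive_tables(conn_str))
```

If port_is_open returns False, the likely causes are that the virtual machine is not running, the Hive service is stopped, or the port forwarding rule is missing. If the port is open but the table listing fails, recheck the DSN settings from step 4.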
Please submit a document describing your understanding of the process and its purpose, and include supporting screenshots as necessary.
Step 3 Report
Write a report (4-6 pages) that includes:
A cover page and table of contents following APA standards.
A short research report on other components of the Hadoop platform: reporting and visualization tools such as Qlikview.
Create a file and load data into the file; include a document on your understanding of the process and purpose, along with supporting screenshots.
Use QlikView to generate the result, and include supporting screenshots.
Describe your understanding of the process and purpose of such tools and processes in a corporate environment and how they relate to data analysis and business activities.