Latest Updated CCA175 Exam dumps
Questions from Exact2pass CCA175 PDF dumps! Welcome to download the newest Exact2pass
CCA175 VCE dumps: https://www.exact2pass.com/CCA175-pass.html
QUESTION NO: 71
Problem Scenario 71 :
Write a Spark script in Python that reads the file "Content.txt" (on HDFS) with the content shown below.
Split each row into (key, value), where the key is the first word in the line and the value is the entire line.
Filter out the empty lines.
Save these key-value pairs in "problem86" as a sequence file (on HDFS).
Part 2 : Save as a sequence file where the key is null and the value is the entire line. Read back the stored sequence files.
Content.txt
Hello this is ABCTECH.com
This is XYZTECH.com
Apache Spark Training
This is Spark Learning Session
Spark is faster than MapReduce
Answer: See the explanation for Step by Step Solution and configuration.
Explanation:
Solution :
Step 1 :
# Import SparkContext and SparkConf
from pyspark import SparkContext, SparkConf
Step 2:
# Load data from HDFS
contentRDD = sc.textFile("Content.txt")
Step 3:
# Keep only non-empty lines
nonempty_lines = contentRDD.filter(lambda x: len(x) > 0)
Step 4:
# Split each line on the first space (remember: the result must be converted to a tuple)
words = nonempty_lines.map(lambda x: tuple(x.split(' ', 1)))
words.saveAsSequenceFile("problem86")
Step 5 : Check the contents of directory problem86
hdfs dfs -cat problem86/part*
Step 6 : Create key, value pair (where key is null)
nonempty_lines.map(lambda line: (None, line)).saveAsSequenceFile("problem86_1")
Step 7 : Read back the sequence file data using Spark.
seqRDD = sc.sequenceFile("problem86_1")
Step 8 : Print the content to validate the same.
for line in seqRDD.collect():
    print(line)
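The per-line logic in Steps 3, 4, and 6 can be checked locally without a cluster. The sketch below is plain Python, not Spark; the helper names `is_nonempty`, `to_first_word_pair`, and `to_null_key_pair` are hypothetical, and the blank lines in the sample are an assumption about what Content.txt contains.

```python
# Plain-Python sketch of the per-line transformations (no Spark required).
# Helper names are hypothetical; they mirror the lambdas passed to filter/map.

# Sample text; the blank lines are assumed, to exercise the filter.
SAMPLE = """Hello this is ABCTECH.com

This is XYZTECH.com
Apache Spark Training

This is Spark Learning Session
Spark is faster than MapReduce"""


def is_nonempty(line):
    # Same predicate as the filter in Step 3
    return len(line) > 0


def to_first_word_pair(line):
    # Same split as Step 4: first word as key, remainder of the line as value
    return tuple(line.split(' ', 1))


def to_null_key_pair(line):
    # Same shape as Step 6: no key, entire line as value
    return (None, line)


lines = [l for l in SAMPLE.split("\n") if is_nonempty(l)]
pairs = [to_first_word_pair(l) for l in lines]
null_pairs = [to_null_key_pair(l) for l in lines]

for pair in pairs:
    print(pair)
```

Note that `split(' ', 1)` yields a one-element tuple for a line that contains a single word, so such a line would not form a key-value pair. The sample data has no such lines.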
QUESTION NO: 72
Problem Scenario 72 : You have been given a table named "employee2" with following detail.
first_name string
last_name string
Write a Spark script in Python that reads this table and prints all the rows and the individual column values.
Answer: See the explanation for Step by Step Solution and configuration.
Explanation:
Solution :
Step 1 : Import statement for HiveContext
from pyspark.sql import HiveContext
Step 2 : Create sqlContext
sqlContext = HiveContext(sc)
Step 3 : Query Hive
employee2 = sqlContext.sql("select * from employee2")
Step 4 : Print the data
for row in employee2.collect():
    print(row)
Step 5 : Print a specific column
for row in employee2.collect():
    print(row.first_name)
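The two printing steps rely on each collected row exposing its columns as attributes. The sketch below imitates that behaviour with `collections.namedtuple` as a stand-in for a Spark row; it does not touch Hive, the name `EmployeeRow` is hypothetical, and the two sample rows are illustrative values rather than the actual table contents.

```python
from collections import namedtuple

# Stand-in for the rows returned by employee2.collect(); NOT a real Spark Row.
EmployeeRow = namedtuple("EmployeeRow", ["first_name", "last_name"])

# Illustrative data only; the real rows would come from the Hive table.
rows = [
    EmployeeRow("Ankit", "Jain"),
    EmployeeRow("Lokesh", "Yadav"),
]

# Print whole rows, as in Step 4
for row in rows:
    print(row)

# Collect and print individual column values, as in Step 5
first_names = [row.first_name for row in rows]
last_names = [row.last_name for row in rows]
for first, last in zip(first_names, last_names):
    print(first, last)
```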
QUESTION NO: 73
Problem Scenario 73 : You have been given data in json format as below.
{"first_name":"Ankit", "last_name":"Jain"}
{"first_name":"Amir", "last_name":"Khan"}
{"first_name":"Rajesh", "last_name":"Khanna"}
{"first_name":"Priynka", "last_name":"Chopra"}
{"first_name":"Kareena", "last_name":"Kapoor"}
{"first_name":"Lokesh", "last_name":"Yadav"}
Do the following activity
1. create employee.json file locally.
2. Load this file on hdfs
3. Register this data as a temp table in Spark using Python.
4. Write select query and print this data.
5. Now save back this selected data in json format.
Answer: See the explanation for Step by Step Solution and configuration.
Explanation:
Solution :
Step 1 : Create the employee.json file locally.
vi employee.json (press insert), then paste the content.
Step 2 : Upload this file to HDFS, default location
hadoop fs -put employee.json
Step 3 : Write spark script
# Import SQLContext
from pyspark.sql import SQLContext
# Create an instance of SQLContext
sqlContext = SQLContext(sc)
# Load the JSON file
employee = sqlContext.jsonFile("employee.json")
# Register the data as a temp table
employee.registerTempTable("EmployeeTab")
# Select data from the EmployeeTab table
employeeInfo = sqlContext.sql("select * from EmployeeTab")
# Iterate over the data and print it
for row in employeeInfo.collect():
    print(row)
Step 4 : Write the data as a text file
employeeInfo.toJSON().saveAsTextFile("employeeJson1")
Step 5 : Check whether the data has been created
hadoop fs -cat employeeJson1/part*
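The input file in Step 1 holds one JSON object per line, which is the layout Spark's JSON loader reads by default. As a possible alternative to typing the file in vi, the sketch below writes and re-reads that layout with Python's standard `json` module. It writes to a temporary directory (an assumption made to keep the example side-effect free) and does not involve Spark or HDFS.

```python
import json
import os
import tempfile

# Records from the problem statement
employees = [
    {"first_name": "Ankit", "last_name": "Jain"},
    {"first_name": "Amir", "last_name": "Khan"},
    {"first_name": "Rajesh", "last_name": "Khanna"},
    {"first_name": "Priynka", "last_name": "Chopra"},
    {"first_name": "Kareena", "last_name": "Kapoor"},
    {"first_name": "Lokesh", "last_name": "Yadav"},
]

# Write one JSON object per line into a temporary directory
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "employee.json")
with open(path, "w") as f:
    for record in employees:
        f.write(json.dumps(record) + "\n")

# Read the file back line by line to confirm every line parses on its own
with open(path) as f:
    loaded = [json.loads(line) for line in f if line.strip()]

for record in loaded:
    print(record["first_name"], record["last_name"])
```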