Free New Certifications Exam Dumps with Cracked: Download CCA175 Dumps Practice Exam Questions 2018

The latest updated CCA175 exam dump questions are available from Exact2pass. Download the newest Exact2pass CCA175 VCE dumps: https://www.exact2pass.com/CCA175-pass.html


QUESTION NO: 71

Problem Scenario 71 :

Write a Spark script in Python that does the following:

Reads the file "Content.txt" (on HDFS), whose content is shown below.

Splits each row into a (key, value) pair, where the key is the first word of the line and the value is the entire line.

Filters out the empty lines.

Saves the key-value pairs in "problem86" as a sequence file (on HDFS).

Part 2 : Save the data as a sequence file where the key is null and the entire line is the value. Then read back the stored sequence file.

Content.txt

Hello this is ABCTECH.com

This is XYZTECH.com

Apache Spark Training

This is Spark Learning Session

Spark is faster than MapReduce

Answer: See the explanation for Step by Step Solution and configuration.

Explanation:

Solution :

Step 1 :

# Import SparkContext and SparkConf

from pyspark import SparkContext, SparkConf

Step 2:

# Load data from HDFS

contentRDD = sc.textFile("Content.txt")

Step 3:

# Keep only the non-empty lines

nonempty_lines = contentRDD.filter(lambda x: len(x) > 0)

Step 4:

# Split each line on the first space (remember: it is mandatory to convert the result to a tuple)

words = nonempty_lines.map(lambda x: tuple(x.split(' ', 1)))

words.saveAsSequenceFile("problem86")
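The tuple conversion in Step 4 can be tried without a cluster. The following is a minimal plain-Python sketch, not part of the original solution. The helper name `to_key_value` is hypothetical, and it keeps the entire line as the value, as the problem statement asks, rather than only the text after the first word. It also returns a 2-tuple for one-word lines, where a bare `split` on the first space would yield a single element.

```python
# Sample lines modelled on Content.txt, plus an empty line to filter out.
lines = [
    "Hello this is ABCTECH.com",
    "",
    "Apache Spark Training",
    "Spark",
]


def to_key_value(line):
    """Return (first word, entire line) for a non-empty line (hypothetical helper)."""
    first_word = line.split(" ", 1)[0]
    return (first_word, line)


# Same order as the Spark steps: drop empty lines, then build the pairs.
nonempty = [line for line in lines if len(line) > 0]
pairs = [to_key_value(line) for line in nonempty]

for pair in pairs:
    print(pair)
```

Assuming the same helper is defined in the PySpark shell, it could be passed to `nonempty_lines.map(to_key_value)` before calling `saveAsSequenceFile`.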

Step 5 : Check the contents of the directory problem86

hdfs dfs -cat problem86/part*

Step 6 : Create key, value pair (where key is null)

nonempty_lines.map(lambda line: (None, line)).saveAsSequenceFile("problem86_1")

Step 7 : Read back the sequence file data using Spark.

seqRDD = sc.sequenceFile("problem86_1")

Step 8 : Print the content to validate the same.

for line in seqRDD.collect():
    print(line)

QUESTION NO: 72

Problem Scenario 72 : You have been given a table named "employee2" with the following columns.

first_name string

last_name string

Write a Spark script in Python that reads this table and prints all the rows as well as individual column values.

Answer: See the explanation for Step by Step Solution and configuration.

Explanation:

Solution :

Step 1 : Import statement for HiveContext

from pyspark.sql import HiveContext

Step 2 : Create sqlContext

sqlContext = HiveContext(sc)

Step 3 : Query Hive

employee2 = sqlContext.sql("select * from employee2")

Step 4 : Now print the data

for row in employee2.collect():
    print(row)

Step 5 : Print a specific column

for row in employee2.collect():
    print(row.first_name)
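Each element returned by `collect()` is a `Row`, which behaves like a tuple whose fields can also be read by name. As a rough illustration that runs without Hive, the sketch below uses `collections.namedtuple` as a stand-in for `Row`. The `Employee` type and the sample values are placeholders, and this is not the Spark API itself.

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row (an assumption, for illustration only).
Employee = namedtuple("Employee", ["first_name", "last_name"])

# Placeholder rows playing the role of employee2.collect().
rows = [Employee("Ankit", "Jain"), Employee("Amir", "Khan")]

for row in rows:
    print(row)             # the whole row
    print(row.first_name)  # one column, read by name
    print(row[1])          # one column, read by position

# Collect a single column into a list.
first_names = [row.first_name for row in rows]
```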

QUESTION NO: 73

Problem Scenario 73 : You have been given data in json format as below.

{"first_name":"Ankit", "last_name":"Jain"}

{"first_name":"Amir", "last_name":"Khan"}

{"first_name":"Rajesh", "last_name":"Khanna"}

{"first_name":"Priynka", "last_name":"Chopra"}

{"first_name":"Kareena", "last_name":"Kapoor"}

{"first_name":"Lokesh", "last_name":"Yadav"}

Do the following:

1. Create the employee.json file locally.

2. Load this file onto HDFS.

3. Register this data as a temp table in Spark using Python.

4. Write a select query and print this data.

5. Save the selected data back in JSON format.

Answer: See the explanation for Step by Step Solution and configuration.

Explanation:

Solution :

Step 1 : Create the employee.json file locally.

vi employee.json (press insert and paste the content)

Step 2 : Upload this file to HDFS, default location

hadoop fs -put employee.json

Step 3 : Write the Spark script

# Import SQLContext

from pyspark.sql import SQLContext

# Create an instance of SQLContext

sqlContext = SQLContext(sc)

# Load the JSON file (jsonFile is the Spark 1.x call; later releases use sqlContext.read.json)

employee = sqlContext.jsonFile("employee.json")

# Register the data as a temp table

employee.registerTempTable("EmployeeTab")

# Select data from the EmployeeTab table

employeeInfo = sqlContext.sql("select * from EmployeeTab")

# Iterate over the data and print it

for row in employeeInfo.collect():
    print(row)

Step 4 : Write the data as a text file

employeeInfo.toJSON().saveAsTextFile("employeeJson1")

Step 5 : Check whether the data has been created

hadoop fs -cat employeeJson1/part*
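The file in this scenario holds one JSON object per line, which is the layout Spark's JSON reader expects by default. The plain-Python sketch below, using only the standard `json` module, walks through the same round trip on two of the records: parse each line, select the fields, and serialize each record back to one JSON string per line, roughly what `toJSON()` produces before `saveAsTextFile`. It illustrates the data flow only and is not how Spark implements it.

```python
import json

# Two records from employee.json, one JSON object per line.
raw_lines = [
    '{"first_name":"Ankit", "last_name":"Jain"}',
    '{"first_name":"Amir", "last_name":"Khan"}',
]

# Parse each line into a dict (similar in spirit to loading the file).
records = [json.loads(line) for line in raw_lines]

# Equivalent of "select *": keep every field and print each record.
for record in records:
    print(record["first_name"], record["last_name"])

# Serialize each record back to one JSON string per line.
json_lines = [json.dumps(record) for record in records]
for line in json_lines:
    print(line)
```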


Friday, May 18, 2018
