Assignment 1: Data Analysis with PySpark


Assignment name
Course Code
Your name/ID
Date of submission

Convert all the collected dates/times from GMT to China time

    import datetime

    from pyspark.sql.functions import col, struct, udf
    from pyspark.sql.types import StringType

    # China time is GMT+8, expressed in seconds.
    time_diff = 8 * 3600
    # Seconds between the spreadsheet epoch (1899-12-30) and the Unix epoch (1970-01-01).
    interval = (datetime.datetime(1970, 1, 1) - datetime.datetime(1899, 12, 30)).total_seconds()

    # 1. Convert all the collected dates/times from GMT to China time.
    # Timestamp holds serial days; convert to Unix seconds and shift by 8 hours.
    df = df.withColumn('Timestamp', col('Timestamp') * 24 * 3600 + time_diff - interval)

    # Format the already-shifted value as UTC so no second offset is applied.
    to_date = udf(lambda x: datetime.datetime.fromtimestamp(x.Timestamp, datetime.timezone.utc).strftime('%Y-%m-%d'), StringType())
    to_time = udf(lambda x: datetime.datetime.fromtimestamp(x.Timestamp, datetime.timezone.utc).strftime('%H:%M:%S'), StringType())

    df = df.withColumn("Date", to_date(struct([df[x] for x in df.columns])))
    df = df.withColumn("Time", to_time(struct([df[x] for x in df.columns])))
    print('Step 1: Shift the time to Beijing time')
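The arithmetic in this step can be checked on a single value without a Spark session. The following is a minimal sketch in plain Python. It assumes, as the 1899-12-30 offset in the snippet suggests, that `Timestamp` is a spreadsheet-style serial day number recorded in GMT. The names `EXCEL_EPOCH`, `CHINA_TZ`, and `serial_to_china` are hypothetical helpers introduced for illustration and are not part of the assignment code.

```python
import datetime

# Assumption: serial day 0 corresponds to 1899-12-30 00:00 GMT.
EXCEL_EPOCH = datetime.datetime(1899, 12, 30, tzinfo=datetime.timezone.utc)
# China time is a fixed GMT+8 offset (no daylight saving).
CHINA_TZ = datetime.timezone(datetime.timedelta(hours=8))


def serial_to_china(serial_days):
    """Convert a GMT serial day number to (date, time) strings in China time."""
    utc_dt = EXCEL_EPOCH + datetime.timedelta(days=serial_days)
    local_dt = utc_dt.astimezone(CHINA_TZ)
    return local_dt.strftime("%Y-%m-%d"), local_dt.strftime("%H:%M:%S")


# Serial 25569 is the Unix epoch (1970-01-01 00:00 GMT).
print(serial_to_china(25569.0))
# 18:00 GMT rolls over to the next calendar day in China.
print(serial_to_china(25569.75))
```

Using an explicit `timezone` object instead of adding 8 * 3600 seconds by hand keeps the offset in one place, which may make the intent easier to audit. The result should match the `Date` and `Time` columns produced by the Spark version for the same input.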
