hive Archives - JassWeb

[Solved] Spark in Business Intelligence

March 6, 2023 by Kirat

I think you should build either a Hive data warehouse using Hive or a MongoDB data warehouse using MongoDB. I don't understand how you plan to mix them, but I will try to answer the question anyway. Usually, you configure the BI tool with a JDBC driver for the database of your choice (e.g. Hive) and the BI tool … Read more

[Solved] Spark in Business Intelligence

March 5, 2023 by Kirat

Introduction

Business intelligence (BI) is a critical tool for organizations to gain insights into their data and make informed decisions. Spark is an open-source distributed computing platform that has become increasingly popular for its ability to process large amounts of data quickly and efficiently. Spark is a powerful tool for business intelligence, as it can … Read more

[Solved] How to perform a self join with the same row of the previous group (month) to bring in additional columns with different expressions in PySpark

October 14, 2022 by Kirat

[Solved] Tell me how to replace python data, How can I read the a.txt file and make it in the following format? [closed]

September 28, 2022 by Kirat

    l = []
    with open('a.txt', 'r') as file:
        for line in file:
            l.append(line.split())

Then, if you want the second part to be an integer, you can use a list comprehension:

    l = [[i[0], int(float(i[1]))] for i in l]

Output:

    [['abcd.com', 0], ['*', 66999306], ['asdf.com', 150744025], ['asfd.df.com', 193139033], ['fdsa.com', 907938122], ['bank.com', 2638989462], ['fire.com', 4151822166], ['ms.com', 7026079907]]

… Read more
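The parsing approach in this excerpt can be sketched as a small self-contained function. This is a minimal sketch under assumptions: the original a.txt is not shown, so the sample input, the `parse_pairs` name, and the rule of skipping lines with fewer than two fields are all hypothetical.

```python
import os
import tempfile


def parse_pairs(path):
    """Read whitespace-separated 'name value' lines into [name, int] pairs."""
    pairs = []
    with open(path, "r") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue  # assumption: skip blank or malformed lines
            # int(float(...)) accepts values written as "66999306.0" or "1.5e8"
            pairs.append([fields[0], int(float(fields[1]))])
    return pairs


# Hypothetical sample data standing in for a.txt
sample = "abcd.com 0\n* 66999306.0\nasdf.com 1.50744025e8\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    tmp_path = tmp.name

result = parse_pairs(tmp_path)
os.remove(tmp_path)
print(result)
```

Converting through `float` first is what lets values with a decimal point or an exponent be read as integers; if the file only ever holds plain integers, `int(fields[1])` alone would likely be enough.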

[Solved] group by sum and join giving duplicate rows

September 13, 2022 by Kirat

You are getting duplicates because of the join; you can get the expected results without a join. Try the query below:

    select materialno, sum(billedqty), min(materialdesc), min(startdate)
    from billing
    group by materialno;

When you join test1 with the billing table, you get duplicates because materialno = "1001" has two corresponding "materialdesc" values and two "startdate" values. But the query I … Read more
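To see why aggregating without the join yields one row per materialno, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database. SQLite stands in for Hive here, and the column types and sample rows are assumptions made for illustration; only the table and column names come from the excerpt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE billing ("
    "materialno TEXT, billedqty INTEGER, materialdesc TEXT, startdate TEXT)"
)

# Hypothetical rows: materialno '1001' has two descriptions and two start dates
rows = [
    ("1001", 10, "desc A", "2022-01-01"),
    ("1001", 5, "desc B", "2022-02-01"),
    ("1002", 7, "desc C", "2022-03-01"),
]
conn.executemany("INSERT INTO billing VALUES (?, ?, ?, ?)", rows)

# Aggregate every non-grouped column so each materialno collapses to one row
query = """
SELECT materialno, SUM(billedqty), MIN(materialdesc), MIN(startdate)
FROM billing
GROUP BY materialno
ORDER BY materialno
"""
result = conn.execute(query).fetchall()
conn.close()

for row in result:
    print(row)
```

Because `materialdesc` and `startdate` are wrapped in `MIN`, the two differing values for '1001' no longer produce two output rows, which is the effect the excerpt describes.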