[Solved] Big data using Microsoft SQL Server [closed]

SQL Server benchmarks and performance. TPC-E Top Ten Performance Results: Lenovo System x3950 X6 / Microsoft SQL Server 2014 Enterprise Edition: 9,145.01 tpsE. (The performance metric reported by TPC-E is a "business throughput" measure: the number of completed Trade-Result transactions processed per second.)

[Solved] What should I use to perform similarity functions on 200 column 12 million row dataset? [closed]

After getting suggestions from a couple of friends, I looked up the documentation on ElasticSearch. It seems like the perfect tool for my use case: it is built for search and retrieval needs such as this, shards readily, and can handle huge amounts of data. Here's what should be done: store each row in a document, with the key elements being …

[Solved] What is the scope and benefits of big data? [closed]

Not really a software-related question, but it is very relevant to current technology and to why some software exists. So here is an opinion. We now live in a world where it is possible to monitor and digitally record information on an epic scale, one that continues to expand with concepts like the Internet of Things. …

[Solved] What are the approaches to the Big-Data problems? [closed]

I will approach your question like this: I assume you are already seriously interested in using a big data database and have a real need for one, so instead of repeating textbooks upon textbooks of information about them, I will highlight some that meet your 5 requirements, mainly Cassandra and Hadoop. 1) The first requirement …

[Solved] How to get the specified output without combineByKey and aggregateByKey in spark RDD

Here is a standard approach. Note that you need to be working with an RDD; I think that is the bottleneck. Here you go:

    val keysWithValuesList = Array("foo=A", "foo=A", "foo=A", "foo=A", "foo=B", "bar=C", "bar=D", "bar=D")
    val sample = keysWithValuesList.map(_.split("=")).map(p => (p(0), p(1)))
    val sample2 = sc.parallelize(sample.map(x => (x._1, 1)))
    val sample3 = sample2.reduceByKey(_ + _)
    sample3.collect()
    val sample4 = sc.parallelize(sample.map(x => (x._1, …
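The per-key count in the excerpt (map each key to 1, then sum with reduceByKey) can be tried without a cluster. The following is only a minimal local sketch using plain Scala collections in place of an RDD. The object name `KeyCounts` and the method `countsPerKey` are made up for illustration and are not part of the original answer. The fold stands in for what `reduceByKey(_ + _)` does per key.

```scala
object KeyCounts {
  // Hypothetical helper: counts how often each key appears in "key=value" strings.
  // A local stand-in for sc.parallelize(...).map(x => (x._1, 1)).reduceByKey(_ + _).
  def countsPerKey(entries: Seq[String]): Map[String, Int] = {
    // Split each "key=value" string into a (key, value) pair.
    val pairs = entries.map(_.split("=")).map(p => (p(0), p(1)))
    // Add 1 to the running total of each key, like summing (key, 1) pairs.
    pairs.foldLeft(Map.empty[String, Int]) { case (acc, (key, _)) =>
      acc.updated(key, acc.getOrElse(key, 0) + 1)
    }
  }

  def main(args: Array[String]): Unit = {
    val keysWithValuesList =
      Seq("foo=A", "foo=A", "foo=A", "foo=A", "foo=B", "bar=C", "bar=D", "bar=D")
    // Print one "key -> count" line per key.
    countsPerKey(keysWithValuesList).foreach { case (k, n) => println(s"$k -> $n") }
  }
}
```

With the sample data from the excerpt, the fold yields 5 for `foo` and 3 for `bar`. On a real RDD the same result would come from the `reduceByKey` call shown in the excerpt, followed by `collect()`.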