[Solved] What should I use to perform similarity functions on 200 column 12 million row dataset? [closed]
After getting suggestions from a couple of friends, I looked up the documentation on ElasticSearch. Seems like that’s the perfect tool for my use-case. It’s built for search/retrieval needs such as this, shards like anything, can handle huge data. Here’s what should be done: Store each row in a document, with the key elements being … Read more