[Solved] How to do data exploration before choosing any Machine Learning algorithms [closed]


First, you have to understand Machine Learning as a field, and have some understanding of its subfields. If you don’t intuitively understand your tools, you won’t be able to identify when to use them.

The idea you’re talking about is called exploratory data analysis, and it can be very approachable if you think about it the right way. Think about it in terms of the scientific method:

First, look over the data, and any documentation about it.

Then, come to some hypotheses about the patterns that might exist.

Based on your understanding of ML, brainstorm some approaches that might give some insight into your hypotheses. For example, if you see that your proposed dependent variable takes only a handful of distinct values, you probably have a classification problem; if it is continuous, it is more likely a regression problem. Either way, choose an approach that suits your input data.
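To make those three steps a bit more concrete, here is a minimal sketch, assuming Python with the pandas library. The toy columns (`age`, `income`, `churned`) and the `guess_task` helper are made up for illustration, and the helper is only a rough heuristic, not a rule:

```python
import pandas as pd

# Hypothetical toy data; in practice you would load your own file,
# e.g. with pd.read_csv(...).
df = pd.DataFrame({
    "age": [22, 35, 58, 41, 29, 63],
    "income": [28000.0, 52000.0, None, 61000.0, 39000.0, 75000.0],
    "churned": ["no", "no", "yes", "no", "yes", "yes"],
})

# Step 1: look over the data (types, summary statistics, missing values).
print(df.dtypes)
print(df.describe(include="all"))
missing_per_column = df.isna().sum()
print(missing_per_column)


def guess_task(target: pd.Series, max_classes: int = 10) -> str:
    """Rough heuristic: a non-numeric target, or a numeric one with few
    distinct values, suggests classification; otherwise regression."""
    if not pd.api.types.is_numeric_dtype(target):
        return "classification"
    if target.nunique() <= max_classes:
        return "classification"
    return "regression"


# Steps 2 and 3: form a hypothesis about the target, then check it.
task = guess_task(df["churned"])
print(task)
```

Treat the result as a starting point for your hypotheses. A numeric column with few distinct values (say, a rating from 1 to 5) could reasonably be handled either way, so the `max_classes` cutoff is a judgment call.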

The tools that you might find useful are plentiful, but a good start could be the programming language R, or Python. Both are very strong data science tools. R has a steeper learning curve, but is built with data science in mind. Python, on the other hand, is very easy to pick up, but you have more choices to make with regard to ML and data science libraries. With Python, look into pandas for CSV and data manipulation, and TensorFlow, Theano or scikit-learn for data analysis and ML.
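As a rough idea of how pandas and scikit-learn fit together once exploration points to a classification problem, here is a small sketch. It uses the Iris dataset that ships with scikit-learn purely as a stand-in for your own data, and the decision tree is just one arbitrary choice of model, not a recommendation:

```python
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a bundled example dataset as pandas objects (stand-in for your data).
iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# Quick look before modelling: feature summary and class balance.
print(X.describe())
print(y.value_counts())

# Hold out part of the data so the score reflects unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A trivial baseline that always predicts the most frequent class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_accuracy = baseline.score(X_test, y_test)

# A simple model to compare against the baseline.
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

print(f"baseline accuracy: {baseline_accuracy:.2f}")
print(f"decision tree accuracy: {accuracy:.2f}")
```

Comparing against a trivial baseline is a cheap sanity check: if a real model cannot beat it, that usually says more about the data or the problem framing than about the algorithm.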

Hope this helps!
