-
An inability to overfit may be caused by a lack of network capacity or a bug in your code itself. Are you training the model from scratch? Did you write the model yourself? Are you working on a task at a similar scale to Imagenet classification?
-
Plateauing of a loss is to be expecting during any kind of model training. But without any learning curves showing the loss across epochs, a loss of “0.2” is meaningless.
Without knowing the exact task your’re working on, the framework you’re using, source code, learning curves, and/or any debugging steps you’ve taken, nobody will be able to really answer your questions. I suggest you go back to the machine learning basics in the cs231 lectures (particularly Lectures 4,5) or Andrew Ng’s Coursera course for a more rudimentary knowledge.
2
solved how to train deep learning network [closed]