[Solved] I need an algorithm to measure content quality


I had a quick look at the site you linked to. Their algorithm appears to boil down to “longer comment == higher quality”, which is not exactly sophisticated. For example, this

asklfklasf kajslkjf akjs flkajsfklajs fkjaskfj aklsjf kajsfk ajskfj alksjf aklsjfkl asfjaklsjf

was given their top quality rating…

Some ideas to make this better:

  • Check spelling (misspelled words reduce quality)
  • Check for swear words and other profanity.
  • Length is probably important, but I wouldn’t put much weight on it.
  • Grammar would be good to check, although difficult.
  • Running a spam filter over it would be a good first step.

Those are just some ideas. For the spelling and profanity, just check each word against a dictionary: a list of known words for spelling, and a list of banned words for profanity. Grammar would be more difficult, as you start to move into natural language processing, which is a very deep area of research.
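To make the dictionary idea concrete, here is a rough sketch in Python. The word lists, the weights, and the name `quality_score` are all made up for illustration; a real version would load a full dictionary file and a maintained profanity list, and the weights would need tuning.

```python
import re

# Hypothetical, tiny word lists for illustration only. A real checker would
# load a full dictionary and a maintained profanity list.
KNOWN_WORDS = {"this", "is", "a", "good", "comment", "about", "the", "article", "damn"}
PROFANITY = {"damn"}


def quality_score(text, known_words=KNOWN_WORDS, profanity=PROFANITY):
    """Return a score in [0, 1]; higher means the comment looks better."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0

    # Fraction of words found in the dictionary (a crude spelling check).
    known_ratio = sum(w in known_words for w in words) / len(words)
    # Fraction of words that appear on the profanity list.
    profane_ratio = sum(w in profanity for w in words) / len(words)
    # Length gets a small weight and is capped, so long gibberish cannot win.
    length_bonus = min(len(words), 50) / 50

    # Arbitrary example weights, chosen only to show the shape of the idea.
    score = 0.8 * known_ratio + 0.2 * length_bonus - 0.5 * profane_ratio
    return max(0.0, min(1.0, score))


gibberish = "asklfklasf kajslkjf akjs flkajsfklajs fkjaskfj aklsjf kajsfk"
print(quality_score(gibberish))
print(quality_score("This is a good comment about the article"))
```

With this kind of scoring, the keyboard-mash example scores near the bottom because none of its words are in the dictionary, no matter how long it is.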
