[Solved] What approaches are available to reduce the time needed for a large site scrape? [closed]


To answer my own question:

I implemented the same logic with Jsoup, and a timing benchmark on a fixed amount of data yielded the following results:

  • Selenium: 2 minutes 46 seconds
  • Jsoup: 16 seconds

Thus Selenium appears to be much slower, roughly 10 times slower in this test. I cannot give a definitive technical reason for this. My guess is that it comes from rendering overhead: Selenium drives a full browser, whereas Jsoup only fetches and parses the HTML.
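
For anyone who wants to repeat the comparison, here is a minimal sketch of a timing harness. It is not the code I used for the numbers above; the class name `ScrapeTimer` and the helpers `time` and `slowdown` are hypothetical, and the workload in `main` is only a placeholder for the real Selenium and Jsoup scraping routines.

```java
import java.time.Duration;

// Hypothetical helper for comparing two scraping implementations.
class ScrapeTimer {

    // Runs the task once and returns the elapsed wall-clock time.
    public static Duration time(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return Duration.ofNanos(System.nanoTime() - start);
    }

    // Returns how many times longer `slow` took than `fast`.
    public static double slowdown(Duration slow, Duration fast) {
        return (double) slow.toNanos() / fast.toNanos();
    }

    public static void main(String[] args) {
        // Placeholder workload; swap in the real scraper call here.
        Duration elapsed = time(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
        });
        System.out.println("Placeholder task took " + elapsed.toMillis() + " ms");

        // Ratio for the timings reported above: 2 min 46 s versus 16 s.
        Duration selenium = Duration.ofMinutes(2).plusSeconds(46);
        Duration jsoup = Duration.ofSeconds(16);
        System.out.println("Slowdown factor: " + slowdown(selenium, jsoup));
    }
}
```

To compare the two libraries, wrap each scraping routine in a `Runnable`, feed both the same fixed set of pages, and pass them to `time`. A single run is a rough measurement, so repeating it a few times gives a more reliable picture.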
