To answer my own question:
I implemented the same logic with Jsoup and benchmarked both versions on the same fixed amount of data. The results:
- Selenium: 2 minutes 46 seconds
- Jsoup: 16 seconds
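So Selenium appears to be much slower, roughly ten times slower in this test. I cannot give a definitive technical reason for this. My guess is that it comes from the overhead of driving a real browser and rendering each page, which Jsoup avoids because it only fetches and parses the HTML.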
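For anyone who wants to repeat the comparison, here is a minimal sketch of how such a timing could be taken. It is not the code I actually ran: the class `ScrapeBenchmark` and its method names are made up for illustration, and it only uses the standard library so that either scraper can be passed in as a `Runnable`.

```java
import java.time.Duration;

// Hypothetical helper for timing two scraper implementations on the same input.
final class ScrapeBenchmark {

    /** Runs the task once and returns the elapsed wall-clock time. */
    static Duration time(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return Duration.ofNanos(System.nanoTime() - start);
    }

    /** Formats a duration as "M minutes S seconds", or "S seconds" if under a minute. */
    static String format(Duration d) {
        long total = d.getSeconds();
        long minutes = total / 60;
        long seconds = total % 60;
        return minutes > 0
                ? minutes + " minutes " + seconds + " seconds"
                : seconds + " seconds";
    }

    /** Returns how many times longer the slow run took than the fast run. */
    static double slowdown(Duration slow, Duration fast) {
        return (double) slow.toNanos() / fast.toNanos();
    }
}
```

With both libraries on the classpath, the tasks would be something like a lambda around `Jsoup.connect(url).get()` (wrapped to handle the checked `IOException`) and a lambda around `driver.get(url)` for Selenium. A single run like this is only a rough measurement, since network latency and JVM warm-up can shift the numbers, so repeating it a few times is probably wise.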