[Solved] How to scrape a web page that is not written directly in HTML, but is auto-generated using JavaScript? [closed]


Try this script. It should collect everything the table contains, print each row, and also write the rows to a CSV file.

import csv
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

URL = "http://washingtonmonthly.com/college_guide?ranking=2016-rankings-national-universities"

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
try:
    driver.get(URL)

    # The table is rendered inside an iframe, so switch into it first.
    wait.until(EC.frame_to_be_available_and_switch_to_it("iFrameResizer0"))
    # Wait until JavaScript has made the table visible; the wait returns the element.
    tab_data = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, 'table.tablesaw')))

    # The find_element_by_* helpers were removed in Selenium 4; use find_elements(By..., ...).
    # Rows without <td> cells (e.g. header rows built from <th>) come out as empty lists.
    list_rows = [[cell.text for cell in row.find_elements(By.CSS_SELECTOR, 'td')]
                 for row in tab_data.find_elements(By.CSS_SELECTOR, 'tr')]

    # The with-block closes the file so every row is flushed to disk.
    with open('table_data.csv', 'w', newline="") as outfile:
        writer = csv.writer(outfile)
        for data in list_rows:
            writer.writerow(data)
            print(data)
finally:
    # Close the browser even if a wait times out.
    driver.quit()

Btw, I’m assuming that you have the selenium package installed, along with Chrome and a matching chromedriver. The script does not use lxml.
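
If you would rather parse the rendered markup yourself than read each cell through the driver, something like the following might work. It is only a sketch using the standard library's html.parser: TableRowParser, extract_rows and the sample markup are made up for illustration, and I'm assuming you would feed it driver.page_source after switching into the iframe, which I haven't verified against this particular page.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished, non-empty rows
        self._row = None    # cells of the row being read, None outside <tr>
        self._cell = None   # text fragments of the cell being read, None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:               # skip rows that had no cells
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical markup standing in for driver.page_source.
sample = """
<table class="tablesaw">
  <tr><th>Rank</th><th>School</th></tr>
  <tr><td>1</td><td> Example University </td></tr>
  <tr></tr>
</table>
"""
print(extract_rows(sample))
```

Unlike the Selenium loop above, this keeps header cells (th) and drops empty rows, so the CSV would get a header line. The list it returns can be passed straight to csv.writer(...).writerows(...).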
