[Solved] Scrape Multiple URLs from CSV using Beautiful Soup & Python


Assuming that your urls.csv file looks like:

https://stackoverflow.com;code site;
https://steemit.com;block chain social site;

The following code will work:

#!/usr/bin/python
# -*- coding: utf-8 -*-

from bs4 import BeautifulSoup  # required to parse HTML
import requests                # required to make HTTP requests

# read the file
with open('urls.csv', 'r') as f:
    csv_raw_cont = f.read()

# split into lines
split_csv = csv_raw_cont.split('\n')

# remove empty lines
split_csv = [line for line in split_csv if line.strip()]

# specify the separator
separator = ";"

# iterate over each line
for each in split_csv:

    # specify the column index
    url_column_index = 0  # in our example file the URL is the first column, so we use 0

    # get the url
    url = each.split(separator)[url_column_index]

    # fetch content from the server
    html = requests.get(url).content

    # parse the fetched content
    soup = BeautifulSoup(html, 'html.parser')

    # show the title of the page
    print(soup.title.string)

Result:

Stack Overflow - Where Developers Learn, Share, & Build Careers
Steemit

More information: beautifulsoup and requests
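
If you would rather not split the lines by hand, Python's built-in csv module can do the parsing for you. This is a minimal sketch of that alternative, assuming the same urls.csv file with a ";" delimiter and the URL in the first column:

#!/usr/bin/python
# -*- coding: utf-8 -*-

import csv                     # built-in CSV reader
import requests                # required to make HTTP requests
from bs4 import BeautifulSoup  # required to parse HTML

with open('urls.csv', newline='') as f:
    reader = csv.reader(f, delimiter=';')
    for row in reader:
        if not row:        # skip empty lines
            continue
        url = row[0]       # the URL is the first column in our example file

        # fetch the page, parse it, and print its title
        html = requests.get(url).content
        soup = BeautifulSoup(html, 'html.parser')
        print(soup.title.string)

The output is the same as above; the csv module simply handles the splitting and empty-line edge cases for you.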
