[Solved] How to scrape HTML using Python for NOWTV available movies

You can mimic what the page does for paginated results (https://www.nowtv.com/stream/all-movies/page/1) and extract the movies from the script tag of each page. Although the code below could use some refactoring, it shows how to obtain the total number of films, calculate the number of films per page, and issue requests for all films using a requests Session … Read more
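As a rough sketch of that pagination approach, with a hypothetical selector and a placeholder film count (the original answer reads both from each page's script tag):

    import math
    import requests
    from bs4 import BeautifulSoup

    BASE_URL = 'https://www.nowtv.com/stream/all-movies/page/{}'

    def titles_on_page(session, page_number):
        # 'h3' is a placeholder selector; the answer actually parses the data
        # embedded in each page's script tag, so adjust to the real markup
        response = session.get(BASE_URL.format(page_number))
        response.raise_for_status()
        soup = BeautifulSoup(response.text, 'html.parser')
        return [tag.get_text(strip=True) for tag in soup.find_all('h3')]

    with requests.Session() as session:  # one Session reuses the connection across requests
        first_page = titles_on_page(session, 1)
        films_per_page = max(len(first_page), 1)  # guard against an empty or changed page
        total_films = 1000  # placeholder: the answer reads the real total from page 1
        all_films = list(first_page)
        for page_number in range(2, math.ceil(total_films / films_per_page) + 1):
            all_films.extend(titles_on_page(session, page_number))
        print(f'collected {len(all_films)} films')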

[Solved] Skip the error while scraping a list of URLs from a CSV

Here is a working version:

    from bs4 import BeautifulSoup
    import requests
    import csv

    with open('urls.csv', 'r') as csvFile, open('results.csv', 'w', newline='') as results:
        reader = csv.reader(csvFile, delimiter=';')
        writer = csv.writer(results)
        for row in reader:
            # get the url
            url = row[0]
            # fetch content from server
            html = requests.get(url).content
            # soup fetched content
            soup = BeautifulSoup(html, … Read more
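Since the question is about skipping errors, the minimal variant below wraps the request in try/except so one bad URL does not abort the run. What gets written per row (here, the page title) is an assumption, as the answer's extraction code is truncated:

    import csv
    import requests
    from bs4 import BeautifulSoup

    with open('urls.csv', 'r') as csv_file, open('results.csv', 'w', newline='') as results:
        reader = csv.reader(csv_file, delimiter=';')
        writer = csv.writer(results)
        for row in reader:
            url = row[0]
            try:
                html = requests.get(url, timeout=10).content
            except requests.RequestException as error:
                # log the failure and move on instead of crashing the whole run
                writer.writerow([url, f'error: {error}'])
                continue
            soup = BeautifulSoup(html, 'html.parser')
            title = soup.title.string if soup.title else ''  # hypothetical output column
            writer.writerow([url, title])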

[Solved] Organizing data that I am pulling and saving to CSV

You can use pandas to do that. Collect all the data into a DataFrame, then just write the DataFrame to file.

    import pandas as pd
    import requests
    import bs4

    root_url = 'https://www.estatesales.net'
    url_list = ['https://www.estatesales.net/companies/NJ/Northern-New-Jersey']
    results = pd.DataFrame()
    for url in url_list:
        response = requests.get(url)
        soup = bs4.BeautifulSoup(response.text, 'html.parser')
        companies = soup.find_all('app-company-city-view-row')
        for company in companies:
            try:
                link = … Read more
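One caveat: appending to a DataFrame inside a loop is slow, and DataFrame.append was removed in pandas 2.0. A common alternative, sketched here with hypothetical field names standing in for the scraped values, is to collect plain dicts and build the DataFrame once at the end:

    import pandas as pd

    rows = []
    # inside the scraping loop you would append one dict per company;
    # the names and values below are hypothetical stand-ins
    rows.append({'name': 'Example Estate Sales',
                 'city': 'Paramus',
                 'link': 'https://www.estatesales.net/example'})
    results = pd.DataFrame(rows)
    results.to_csv('companies.csv', index=False)  # one call writes the whole table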

[Solved] Check if Python has written the targeted text

This is an alternative way to check whether Python found the text you are looking for:

    import requests
    from bs4 import BeautifulSoup

    urls = ['https://www.google.com']
    for url in urls:
        r = requests.get(url)
        soup = BeautifulSoup(r.content, 'lxml')
        items = soup.find_all('p')
        for item in items:
            if '2016 – Privacidad – Condiciones' in item.text:
                print('Python has found the … Read more
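A compact variant of the same check, reusing the target string from the answer above; any() stops at the first matching paragraph:

    import requests
    from bs4 import BeautifulSoup

    TARGET = '2016 – Privacidad – Condiciones'  # the footer text from the answer above
    urls = ['https://www.google.com']

    for url in urls:
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'lxml')
        # any() short-circuits at the first <p> containing the target text
        found = any(TARGET in p.get_text() for p in soup.find_all('p'))
        print(f"{url}: target text {'found' if found else 'not found'}")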