[Solved] Beautifulsoup: Is it possible to get tag name and attribute name by its value? [closed]

You could define a filter function that checks whether an HTML tag has an attribute whose value is equal to value:

    def your_filter(tag, value):
        for key in tag.attrs.keys():
            if tag[key] == value:
                return True
        return False

    # alternatively, as a one-liner:
    def your_filter(tag, value):
        return any(tag[key] == value for key in tag.attrs.keys())

Then, you could … Read more
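As a rough sketch of how such a filter might be plugged into find_all (which accepts a function that takes a single tag), the example below binds the value with a lambda. The sample HTML and the helper name find_by_attr_value are made up for illustration. The extra list check is an assumption added here, because Beautiful Soup returns multi-valued attributes such as class as lists.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for this illustration.
HTML = '<div id="main"><a href="/home" title="main">Home</a><p class="main note">Hi</p></div>'


def value_matches(attr_value, value):
    """Compare one attribute value, allowing for list-valued attributes like class."""
    if isinstance(attr_value, list):
        return value in attr_value
    return attr_value == value


def your_filter(tag, value):
    """Return True if any attribute of `tag` matches `value`."""
    return any(value_matches(v, value) for v in tag.attrs.values())


def find_by_attr_value(soup, value):
    """Hypothetical helper: collect (tag name, attribute name) pairs for `value`."""
    pairs = []
    # find_all calls the function with one argument, so bind `value` in a lambda.
    for tag in soup.find_all(lambda t: your_filter(t, value)):
        for key, attr_value in tag.attrs.items():
            if value_matches(attr_value, value):
                pairs.append((tag.name, key))
    return pairs


soup = BeautifulSoup(HTML, "html.parser")
matches = find_by_attr_value(soup, "main")
print(matches)
```

Each pair gives the tag name and the attribute name that carried the value, which is what the question asks for.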

[Solved] AttributeError: ‘NoneType’ object has no attribute ‘find_all’ (Many of the other questions asked weren’t applicable)

It means you are trying to call find_all on the value None. That could be row.tbody, for example, perhaps because there is no <tbody> in the actual HTML. Keep in mind that the <tbody> element is implied: it will be visible in your browser’s DOM inspector, but that doesn’t mean it is actually present in the HTML … Read more
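One defensive pattern, sketched here under the assumption that the raw HTML has no <tbody>, is to fall back to the table itself when table.tbody is None. The markup and the helper name table_rows are invented for this example, and html.parser is chosen because it leaves this markup as written.

```python
from bs4 import BeautifulSoup

# Hypothetical table with no <tbody> in the source markup.
HTML = "<table><tr><td>1</td></tr><tr><td>2</td></tr></table>"


def table_rows(table):
    """Hypothetical helper: return the <tr> rows whether or not <tbody> exists."""
    # table.tbody is None when the element is absent, so guard before find_all.
    body = table.tbody if table.tbody is not None else table
    return body.find_all("tr")


soup = BeautifulSoup(HTML, "html.parser")
table = soup.find("table")
print(table.tbody)  # the attribute lookup yields None rather than raising

rows = table_rows(table)
row_texts = [row.get_text() for row in rows]
print(row_texts)
```

Calling table.tbody.find_all("tr") directly on this markup would raise the AttributeError from the question title.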

[Solved] This code for Web Scraping using python returning None. Why? Any help would be appreciated

Your code works fine, but there is a robot check before the product page, so your request looks for the span tag in the robot-check page, fails to find it, and returns None. Here is a link which may help you: python requests & beautifulsoup bot detection … Read more

[Solved] Can’t deal with some complicated laid-out content from a webpage

You can take advantage of the CSS selector span[id$=lblResultsRaceName], which finds all spans whose id ends with lblResultsRaceName, and 'td > span', which finds all spans whose direct parent is a <td>. This code snippet goes through all the racing results and prints every race:

    import requests
    from bs4 import BeautifulSoup

    url = "https://www.thedogs.com.au/Racing/Results.aspx?SearchDate=3-Jun-2018"

    def get_info(session,link):
        session.headers['User-Agent'] … Read more
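To make the two selectors concrete, here is a small offline sketch that applies them with select to made-up markup. The ids and text below are hypothetical stand-ins, not the real page's content.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating ids that end with lblResultsRaceName.
HTML = """
<table>
  <tr><td><span id="ctl00_lblResultsRaceName">Race 1</span></td></tr>
  <tr><td><span id="ctl01_lblResultsRaceName">Race 2</span></td></tr>
  <tr><td><div><span id="other">nested</span></div></td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")

# [id$=...] matches elements whose id attribute ends with the given suffix.
race_names = [s.get_text() for s in soup.select("span[id$=lblResultsRaceName]")]

# "td > span" matches only spans that are direct children of a <td>.
direct_spans = [s.get_text() for s in soup.select("td > span")]

print(race_names)
print(direct_spans)
```

The span inside the <div> is skipped by both selectors: its id has a different suffix, and its direct parent is not a <td>.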

[Solved] when I write findAll it says: findAll is not defined [closed]

If I am assuming correctly that you are using from bs4 import BeautifulSoup, you need to understand that find_all is a method of the bs4.element.Tag object, so it has to be called on a parsed object. findAll called on its own might not work:

    obj = BeautifulSoup(html_text, 'html.parser')
    obj.find_all("tr", {"class": "match"})

This should solve your problem.
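A minimal sketch of the difference, assuming a made-up html_text: calling find_all as a method on the parsed object works, while calling it as a bare function raises a NameError like the one in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical input; the real page's markup is not shown in the question.
html_text = '<table><tr class="match"><td>A</td></tr><tr><td>B</td></tr></table>'

obj = BeautifulSoup(html_text, "html.parser")

# find_all is called on the parsed object, not as a standalone function.
rows = obj.find_all("tr", {"class": "match"})
texts = [row.get_text() for row in rows]
print(texts)

# A bare call fails because no global function with that name exists.
try:
    find_all("tr")
except NameError as exc:
    bare_call_error = type(exc).__name__
    print(bare_call_error)
```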

[Solved] How to scrape HTML using Python for NOWTV available movies

You can mimic what the page is doing in terms of paginated results (https://www.nowtv.com/stream/all-movies/page/1) and extract movies from the script tag of each page. Although the code below could use some refactoring, it shows how to obtain the total number of films, calculate the films per page, and issue requests to get all films using Session … Read more

[Solved] How to get product price from json [closed]

What happens?

You try to find a price in the JSON, but there is no price information available.

How to get the price?

You have to call another API with the productId of each item:

    requests.get('https://www.adidas.com/api/search/product/'+item['productId'],headers=headers)

Example

    import requests

    url = "https://www.adidas.com/api/plp/content-engine?"
    params = {
        'sitePath': 'us',
        'query': 'women-athletic_sneakers'
    }
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows … Read more

[Solved] Python’s strip() function not working

Your code as posted doesn’t run. And, even after I guess at how to fix it to run, it does not actually do what you claim. But I’m pretty sure I know where the error is anyway. This code does not return an empty string, but a ":

    text = div.get_text().strip().split(" ", 1)[0].strip()

… and … Read more

[Solved] Extract all pages from a Table

To scrape all the pages, observe that the trailing parameter in the URL increments by 2, rather than 1. Thus, the code below finds the maximum page in the listing, multiplies that number by 2, and uses the result as a range:

    import requests, re, contextlib
    from bs4 import BeautifulSoup as soup
    import csv

    @contextlib.contextmanager … Read more
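The arithmetic behind that range can be sketched without any network access. The starting offset of 0, the base URL, and the function names below are assumptions for illustration only, since the excerpt does not show the real URL pattern.

```python
def page_offsets(max_page, step=2):
    """Return the trailing URL parameter for each page, assuming it starts at 0."""
    # max_page pages, each advancing the parameter by `step`.
    return list(range(0, max_page * step, step))


def page_urls(base_url, max_page):
    """Hypothetical helper: append each offset to a base URL."""
    return [f"{base_url}{offset}" for offset in page_offsets(max_page)]


# Placeholder URL, not the site from the original question.
urls = page_urls("https://example.com/listing?start=", 3)
print(urls)
```

If the real listing starts at a different offset, only the first argument to range would need to change.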

[Solved] Search for a local html file by name

In Python 2.x, this could be done as follows:

    from bs4 import BeautifulSoup

    filename = raw_input('Please enter filename: ')

    with open(filename) as f_input:
        html = f_input.read()

    soup = BeautifulSoup(html, "html.parser")
    print soup
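For readers on Python 3, a rough equivalent might look like the sketch below, where input replaces raw_input and print is a function. The helper names load_soup and main, and the UTF-8 encoding, are assumptions for illustration and not part of the original answer.

```python
from bs4 import BeautifulSoup


def load_soup(filename):
    """Hypothetical helper: parse a local HTML file, assuming UTF-8 encoding."""
    with open(filename, encoding="utf-8") as f_input:
        html = f_input.read()
    return BeautifulSoup(html, "html.parser")


def main():
    """Prompt for a filename and print the parsed document."""
    filename = input("Please enter filename: ")
    print(load_soup(filename))
```

Calling main() from a script reproduces the interactive prompt of the Python 2 version.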

[Solved] How can I extract the text between ? [closed]

    # Python 2 syntax (print statement, urllib.urlopen)
    import urllib
    from bs4 import BeautifulSoup

    html = urllib.urlopen('http://www.last.fm/user/Jehl/charts?rangetype=overall&subtype=artists').read()
    soup = BeautifulSoup(html)
    print soup('a')  # prints [<a href="https://stackoverflow.com/" id="lastfmLogo">Last.fm</a>, <a class="nav-link" href="http://stackoverflow.com/music">Music</a>….

To get the text of each one of them:

    for link in soup('a'):
        print link.get_text()

[Solved] Scrape Multiple URLs from CSV using Beautiful Soup & Python

Assuming that your urls.csv file looks like this:

    https://stackoverflow.com;code site;
    https://steemit.com;block chain social site;

the following code will work:

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    from bs4 import BeautifulSoup #required to parse html
    import requests #required to make request

    #read file
    with open('urls.csv','r') as f:
        csv_raw_cont=f.read()

    #split by line
    split_csv=csv_raw_cont.split('\n')

    #remove empty line
    split_csv.remove('')

    #specify separator … Read more
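As an alternative sketch to splitting the file by hand, the standard csv module can read the same semicolon-separated layout. This swaps in csv.reader instead of the manual split used in the answer. The function name read_url_rows is hypothetical, and the sample text stands in for urls.csv so the example runs without a file.

```python
import csv
import io

# Stand-in for the contents of urls.csv described above.
SAMPLE = (
    "https://stackoverflow.com;code site;\n"
    "https://steemit.com;block chain social site;\n"
    "\n"
)


def read_url_rows(handle):
    """Hypothetical helper: return (url, description) pairs from a ';'-separated file."""
    rows = []
    for fields in csv.reader(handle, delimiter=";"):
        # Skip blank lines, which csv.reader yields as empty lists.
        if not fields or not fields[0].strip():
            continue
        description = fields[1].strip() if len(fields) > 1 else ""
        rows.append((fields[0].strip(), description))
    return rows


rows = read_url_rows(io.StringIO(SAMPLE))
for url, description in rows:
    print(url, "->", description)
```

With a real file, something like open('urls.csv', newline='') could be passed in place of the StringIO object.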