[Solved] Twitter Python scraping

This is because of how you scrape it: the page shows only the first 20 members and loads more dynamically (through an AJAX call) when you scroll down. This behaviour does not happen when you perform an HTTP request in Python. As Arkanosis and Odi already suggested, use the Twitter API to make … Read more

[Solved] scrapy/Python crawls but does not scrape data

Your imports didn’t work that well over here, but that might be a configuration issue on my side. I think the scraper below does what you’re looking for: import scrapy class YelpSpider(scrapy.Spider): name="yelp_spider" allowed_domains=["yelp.com"] headers=['venuename','services','address','phone','location'] def __init__(self): self.start_urls = ['https://www.yelp.com/search?find_desc=&find_loc=Springfield%2C+IL&ns=1'] def start_requests(self): requests = [] for item in self.start_urls: requests.append(scrapy.Request(url=item, headers={'Referer':'http://www.google.com/'})) return requests def parse(self, … Read more

[Solved] Parse HTML nodes using xpath to Ruby/Nokogiri

I have just tried it using Capybara with Poltergeist, and it worked fine. I tried your code as well, but div[@id="NavFrame1"] does not exist. So there might be a parsing problem… require 'capybara' require 'capybara/dsl' require 'capybara/poltergeist' Capybara.register_driver :poltergeist_debug do |app| Capybara::Poltergeist::Driver.new(app, inspector: true) end Capybara.javascript_driver = :poltergeist_debug Capybara.current_driver = :poltergeist_debug visit("https://pt.wiktionary.org/wiki/fazer") doc = Nokogiri::HTML.parse(page.html) p … Read more

[Solved] Regex for HTML manipulation [closed]

It is bad practice to use regular expressions to parse HTML. Instead, use the tools provided in PHP that are specifically geared toward parsing HTML, namely DOMDocument. // create a new DOMDocument object $doc = new DOMDocument(); // load the HTML into the DOMDocument object (this would be your source HTML) $doc->loadHTML(' <table> <tr> <td … Read more
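
The same principle applies outside PHP. As a minimal sketch in Python (a swapped-in language, not the answer's DOMDocument code), the standard library's html.parser module can collect table cell text without any regular expression. The class name CellCollector and the sample markup are hypothetical, made up for this illustration.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of every <td> cell (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.cells = []       # finished cell texts, in document order
        self._buffer = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        # Start buffering when a table cell opens.
        if tag == "td":
            self._buffer = []

    def handle_data(self, data):
        # Keep text only while inside a cell.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Close the cell and store its trimmed text.
        if tag == "td" and self._buffer is not None:
            self.cells.append("".join(self._buffer).strip())
            self._buffer = None


# Sample markup is an assumption made up for this sketch.
sample_html = "<table><tr><td>first</td><td> second <b>cell</b></td></tr></table>"

collector = CellCollector()
collector.feed(sample_html)
collector.close()
cells = collector.cells
```

Because the parser tracks tag boundaries itself, nested markup inside a cell (the <b> element here) does not need special handling.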

[Solved] XML DOMDocument PHP – Get node where attribute value [closed]

Try this: <?php $slideids = array(); $get_id = 2; $xml = new DOMDocument(); $xml->load('test.xml'); // path of your XML file; make sure the path is correct $xpd = new DOMXPath($xml); false&&$result_data = new DOMElement(); // this is for my IDE to have IntelliSense $result = $xpd->query("//row[@id=".$get_id."]/*"); // change the table name here foreach($result as $result_data){ $key = … Read more
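
For comparison, here is a rough Python sketch of the same kind of query (child elements of the row whose id attribute matches) using the standard library's xml.etree.ElementTree. The XML content and the element names title and image are assumptions, since test.xml is not shown. Collecting the children into a dictionary is also only an illustration, not a reconstruction of the truncated PHP loop.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML standing in for test.xml; the real file is not shown.
xml_text = """
<rows>
  <row id="1"><title>First</title><image>a.png</image></row>
  <row id="2"><title>Second</title><image>b.png</image></row>
</rows>
"""

get_id = 2
root = ET.fromstring(xml_text)

# ElementTree supports a limited XPath subset; [@id='2'] compares the
# attribute value as a string, so the id is quoted in the expression.
children = root.findall(f".//row[@id='{get_id}']/*")

# Map each child's tag name to its text (illustrative choice).
result = {child.tag: child.text for child in children}
```

Note that the PHP query compares @id against an unquoted number, while this sketch quotes the value because ElementTree matches attribute values as strings.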

[Solved] What is the significance of the attribute xpath="1" while constructing locators for Selenium tests

That attribute (xpath="1") is placed there by a browser extension named ChroPath, through a feature it calls Dynamic Attribute Support. Scrolling down the page, one will find a text description of how to use the tool. Scroll to "Note:" at the bottom of the page, or search for "Note:" within the page … Read more

[Solved] How to locate an element and extract required text with Selenium and Python

You can use driver.find_element_by_css_selector('.form-control + [for=address]').text. Use replace() to remove the "enter this code:" string if required. The selector is a class selector (".") joined by an adjacent sibling combinator ("+") to an attribute = value selector, so it matches an element whose for attribute has the value address and that immediately follows an element with class form-control. … Read more
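
The replace() step can be sketched without a browser. The label text below is a hypothetical stand-in for whatever the .text call returns on the real page.

```python
# Hypothetical text, standing in for what the .text call might return.
label_text = "enter this code: pgcode85"

# Drop the fixed prefix, then trim the leftover whitespace.
code = label_text.replace("enter this code:", "").strip()
```

If the prefix is absent, replace() returns the string unchanged, so the call is safe to apply unconditionally.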

[Solved] get elements by attribute value

You can use XPath with an expression like //Book[ListOfBookUser/BookUser]: var xmlMarkup = `<ListOfBook> <Book> <Id>ACIA-11QWTKX</Id> <ListOfBookUser recordcount="0" lastpage="true"> </ListOfBookUser> </Book> <Book> <Id>ACIA-ANC0CC</Id> <ListOfBookUser recordcount="1" lastpage="true"> <BookUser> <BookId>ACIA-ANC0CC</BookId> <BookName>TKSP_GLOBAL</BookName> </BookUser> </ListOfBookUser> </Book> <Book> <Id>ACIA-ANC0CF</Id> <ListOfBookUser recordcount="0" lastpage="true"> </ListOfBookUser> </Book> <Book> <Id>ACIA-EUMCH5</Id> <ListOfBookUser recordcount="1" lastpage="true"> <BookUser> <BookId>ACIA-EUMCH5</BookId> <BookName>TKSP_MADRID_CENTRO_SUR</BookName> </BookUser> </ListOfBookUser> </Book> </ListOfBook>`; var xmlDoc = new DOMParser().parseFromString(xmlMarkup, … Read more
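
As a hedged illustration of what //Book[ListOfBookUser/BookUser] selects, the following Python sketch applies the same filter to a shortened copy of the markup. Python's xml.etree.ElementTree implements only a subset of XPath, so the sketch does not pass the full expression. It checks each Book with find() instead, which is a swapped-in technique and not the answer's JavaScript code.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the markup from the excerpt (two of the four books).
xml_markup = """<ListOfBook>
  <Book>
    <Id>ACIA-11QWTKX</Id>
    <ListOfBookUser recordcount="0" lastpage="true"></ListOfBookUser>
  </Book>
  <Book>
    <Id>ACIA-ANC0CC</Id>
    <ListOfBookUser recordcount="1" lastpage="true">
      <BookUser><BookId>ACIA-ANC0CC</BookId><BookName>TKSP_GLOBAL</BookName></BookUser>
    </ListOfBookUser>
  </Book>
</ListOfBook>"""

root = ET.fromstring(xml_markup)

# Keep only the Book elements that contain a ListOfBookUser/BookUser path.
books_with_users = [
    book for book in root.iter("Book")
    if book.find("ListOfBookUser/BookUser") is not None
]
book_ids = [book.findtext("Id") for book in books_with_users]
```

The first Book has an empty ListOfBookUser, so only the second one passes the filter.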

[Solved] scraping with selenium web driver

You should fix your XPath expressions. Use findElement for the first three and findElements for the last. To get the home odds: //td[a[.="bet365"]]/following-sibling::td[span][1]/span To get the draw odds: //td[a[.="bet365"]]/following-sibling::td[span][2]/span To get the away odds: //td[a[.="bet365"]]/following-sibling::td[span][3]/span To get them all: //td[a[.="bet365"]]/following-sibling::td[span]/span Getting them all is probably better, since you call driver.find_elements_by_xpath only once. … Read more

[Solved] Syntax Error: XPath Is Not a Legal Expression

That’s a lousy diagnostic message. Your particular XPath syntax problem: rather than ||, which is logical OR in some languages, you’re probably looking for |, which is nodeset union in XPath. (That is, assuming you’re not aiming for XPath 3.0’s string concatenation operator.) How to find and fix XPath syntax problems in general: Use a … Read more