{"id":23409,"date":"2022-11-25T20:58:26","date_gmt":"2022-11-25T15:28:26","guid":{"rendered":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/"},"modified":"2022-11-25T20:58:26","modified_gmt":"2022-11-25T15:28:26","slug":"solved-how-to-scrape-html-using-python-for-nowtv-available-movies","status":"publish","type":"post","link":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/","title":{"rendered":"[Solved] How to scrape HTML using Python for NOWTV available movies"},"content":{"rendered":"<p> [ad_1]<br \/>\n<\/p>\n<div id=\"answer-55879443\" class=\"answer js-answer accepted-answer js-accepted-answer\" data-answerid=\"55879443\" data-parentid=\"55878982\" data-score=\"0\" data-position-on-page=\"1\" data-highest-scored=\"1\" data-question-has-accepted-highest-score=\"1\" itemprop=\"acceptedAnswer\" itemscope itemtype=\"https:\/\/schema.org\/Answer\">\n<div class=\"post-layout\">\n<div class=\"votecell post-layout--left\"><\/div>\n<div class=\"answercell post-layout--right\">\n<div class=\"s-prose js-post-body\" itemprop=\"text\">\n<p>You can mimic what the page is doing in terms of paginated results (<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.nowtv.com\/stream\/all-movies\/page\/1\">https:\/\/www.nowtv.com\/stream\/all-movies\/page\/1<\/a>) and extract movies from the script tag of each page. Although the below could use some re-factoring it shows how to obtain the total number of films, calculate the films per page, and issue requests to get all films using Session for efficiency. Result is 1425 movies.<\/p>\n<pre><code>import requests\nimport re\nimport json\nimport math\nimport pandas as pd\n\ntitles = []\nlinks = []\nbase=\"https:\/\/www.nowtv.com\"\nheaders = {'User-Agent' : 'Mozilla\/5.0'}\n\nwith requests.Session() as s:\n    res = s.get('https:\/\/www.nowtv.com\/stream\/all-movies\/page\/1') \n    r = re.compile(r\"var propStore = (.*);\")\n    data = json.loads(r.findall(res.text)[0])\n    first_section = data[next(iter(data))]\n    movies_section = first_section['props']['data']['list']\n    movies_per_page = len(movies_section)\n    total_movies = int(first_section['props']['data']['count'])\n    pages = math.ceil(total_movies \/ movies_per_page)\n\n    for movie in movies_section:\n        titles.append(movie['title'])\n        links.append(base + movie['slug'])\n\n    if pages &gt; 1:\n        for page in range(2, pages + 1):\n            res = s.get('https:\/\/www.nowtv.com\/stream\/all-movies\/page\/{}'.format(page)) \n            r = re.compile(r\"var propStore = (.*);\")\n            data = json.loads(r.findall(res.text)[0])\n            first_section = data[next(iter(data))]\n            movies_section = first_section['props']['data']['list']\n            for movie in movies_section:\n                titles.append(movie['title'])\n                links.append(base + movie['slug'])\n\ndf = pd.DataFrame(list(zip(titles, links)), columns = ['Title', 'Link'])\n<\/code><\/pre>\n<\/p><\/div>\n<div class=\"mt24\"><\/div>\n<\/div>\n<p>            <span class=\"d-none\" itemprop=\"commentCount\"><\/span> <\/p><\/div>\n<\/div>\n<p>[ad_2]<\/p>\n<p>solved How to scrape HTML using Python for NOWTV available movies <\/p>\n","protected":false},"excerpt":{"rendered":"<p>[ad_1] You can mimic what the page is doing in terms of paginated results (https:\/\/www.nowtv.com\/stream\/all-movies\/page\/1) and extract movies from the script tag of each page. Although the below could use some re-factoring it shows how to obtain the total number of films, calculate the films per page, and issue requests to get all films using &#8230; <a title=\"[Solved] How to scrape HTML using Python for NOWTV available movies\" class=\"read-more\" href=\"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/\" aria-label=\"More on [Solved] How to scrape HTML using Python for NOWTV available movies\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[320],"tags":[622,346,349,2463,760],"class_list":["post-23409","post","type-post","status-publish","format-standard","hentry","category-solved","tag-beautifulsoup","tag-html","tag-python","tag-screen-scraping","tag-web-scraping"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>[Solved] How to scrape HTML using Python for NOWTV available movies - JassWeb<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"[Solved] How to scrape HTML using Python for NOWTV available movies - JassWeb\" \/>\n<meta property=\"og:description\" content=\"[ad_1] You can mimic what the page is doing in terms of paginated results (https:\/\/www.nowtv.com\/stream\/all-movies\/page\/1) and extract movies from the script tag of each page. Although the below could use some re-factoring it shows how to obtain the total number of films, calculate the films per page, and issue requests to get all films using ... Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/\" \/>\n<meta property=\"og:site_name\" content=\"JassWeb\" \/>\n<meta property=\"article:published_time\" content=\"2022-11-25T15:28:26+00:00\" \/>\n<meta name=\"author\" content=\"Kirat\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Kirat\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\\\/\"},\"author\":{\"name\":\"Kirat\",\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/#\\\/schema\\\/person\\\/65c9c7b7958150c0dc8371fa35dd7c31\"},\"headline\":\"[Solved] How to scrape HTML using Python for NOWTV available movies\",\"datePublished\":\"2022-11-25T15:28:26+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\\\/\"},\"wordCount\":90,\"publisher\":{\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/#organization\"},\"keywords\":[\"beautifulsoup\",\"html\",\"python\",\"screen-scraping\",\"web-scraping\"],\"articleSection\":[\"Solved\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\\\/\",\"url\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\\\/\",\"name\":\"[Solved] How to scrape HTML using Python for NOWTV available movies - JassWeb\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/#website\"},\"datePublished\":\"2022-11-25T15:28:26+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/jassweb.com\\\/solved\\\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"[Solved] How to scrape HTML using Python for NOWTV available movies\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/#website\",\"url\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/\",\"name\":\"JassWeb\",\"description\":\"Build High-quality Websites\",\"publisher\":{\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/#organization\",\"name\":\"Jass Web\",\"url\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/jassweb.com\\\/wp-content\\\/uploads\\\/2021\\\/02\\\/jass-website-logo-1.png\",\"contentUrl\":\"https:\\\/\\\/jassweb.com\\\/wp-content\\\/uploads\\\/2021\\\/02\\\/jass-website-logo-1.png\",\"width\":693,\"height\":132,\"caption\":\"Jass Web\"},\"image\":{\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/#\\\/schema\\\/logo\\\/image\\\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/#\\\/schema\\\/person\\\/65c9c7b7958150c0dc8371fa35dd7c31\",\"name\":\"Kirat\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/wp-content\\\/litespeed\\\/avatar\\\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1784720468\",\"url\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/wp-content\\\/litespeed\\\/avatar\\\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1784720468\",\"contentUrl\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/wp-content\\\/litespeed\\\/avatar\\\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1784720468\",\"caption\":\"Kirat\"},\"sameAs\":[\"http:\\\/\\\/jassweb.com\"],\"url\":\"https:\\\/\\\/jassweb.com\\\/solved\\\/author\\\/jaspritsinghghumangmail-com\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"[Solved] How to scrape HTML using Python for NOWTV available movies - JassWeb","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/","og_locale":"en_US","og_type":"article","og_title":"[Solved] How to scrape HTML using Python for NOWTV available movies - JassWeb","og_description":"[ad_1] You can mimic what the page is doing in terms of paginated results (https:\/\/www.nowtv.com\/stream\/all-movies\/page\/1) and extract movies from the script tag of each page. Although the below could use some re-factoring it shows how to obtain the total number of films, calculate the films per page, and issue requests to get all films using ... Read more","og_url":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/","og_site_name":"JassWeb","article_published_time":"2022-11-25T15:28:26+00:00","author":"Kirat","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Kirat","Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/#article","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/"},"author":{"name":"Kirat","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31"},"headline":"[Solved] How to scrape HTML using Python for NOWTV available movies","datePublished":"2022-11-25T15:28:26+00:00","mainEntityOfPage":{"@id":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/"},"wordCount":90,"publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"keywords":["beautifulsoup","html","python","screen-scraping","web-scraping"],"articleSection":["Solved"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/","url":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/","name":"[Solved] How to scrape HTML using Python for NOWTV available movies - JassWeb","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/#website"},"datePublished":"2022-11-25T15:28:26+00:00","breadcrumb":{"@id":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/jassweb.com\/solved\/solved-how-to-scrape-html-using-python-for-nowtv-available-movies\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/jassweb.com\/solved\/"},{"@type":"ListItem","position":2,"name":"[Solved] How to scrape HTML using Python for NOWTV available movies"}]},{"@type":"WebSite","@id":"https:\/\/jassweb.com\/solved\/#website","url":"https:\/\/jassweb.com\/solved\/","name":"JassWeb","description":"Build High-quality Websites","publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/jassweb.com\/solved\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/jassweb.com\/solved\/#organization","name":"Jass Web","url":"https:\/\/jassweb.com\/solved\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/","url":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","contentUrl":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","width":693,"height":132,"caption":"Jass Web"},"image":{"@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31","name":"Kirat","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1784720468","url":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1784720468","contentUrl":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1784720468","caption":"Kirat"},"sameAs":["http:\/\/jassweb.com"],"url":"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/"}]}},"_links":{"self":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/23409","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/comments?post=23409"}],"version-history":[{"count":0,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/23409\/revisions"}],"wp:attachment":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/media?parent=23409"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/categories?post=23409"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/tags?post=23409"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}