[Solved] Extracting variables from Javascript inside HTML


You could use BeautifulSoup to locate the <script> tag, but it does not parse JavaScript, so you would still need another approach to extract the values inside it.
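As a rough sketch of that first step, the snippet below pulls out the text of each <script> element. It uses the standard-library html.parser as a stand-in for BeautifulSoup so that it runs without extra packages. The ScriptCollector class and the sample_html page are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the text content of every <script> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script text may arrive in several chunks, so append to the current entry.
        if self.in_script:
            self.scripts[-1] += data


# Hypothetical page, not the one from the question.
sample_html = """<html><body>
<script type="text/javascript">var flashvars = {video_id: '1'};</script>
<script>var other = 1;</script>
</body></html>"""

collector = ScriptCollector()
collector.feed(sample_html)
collector.close()

# Keep only the scripts that mention the variable of interest.
flashvars_scripts = [s for s in collector.scripts if "flashvars" in s]
print(flashvars_scripts[0])
```

Either way, the result is only the raw JavaScript source as a string; the values still have to be extracted from it.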

Plain string slicing can be used to first isolate the flashvars object, which is then passed to demjson to convert the JavaScript object literal into a Python dictionary. For example:

import demjson

content = """<script type="text/javascript">/* <![CDATA[ */ 
... 
...
</script>"""

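# Keep everything after the assignment; the marker must match the page source
# exactly, including any spaces around "=".
script_var = content.split('var flashvars=')[1]
# Cut at the first "};" so that only the object literal remains.
script_var = script_var[:script_var.find('};') + 1]
# demjson accepts JavaScript object literals that the standard json module rejects.
data = demjson.decode(script_var)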

print(data['video_url'])
print(data['video_alt_url'])

This would then display:

https://www.ptrex.com/get_file/4/996a9088fdf801992d24457cd51469f3f7aaaee6a0/33000/33247/33247.mp4/
https://www.ptrex.com/get_file/4/774833c428771edee2cf401ef2264e746a06f9f370/33000/33247/33247_720p.mp4/

demjson is an alternative, more lenient JSON decoder, which can be installed via pip:

pip install demjson
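The slicing step can also be wrapped in a small helper that fails with a clear error when the marker is missing, rather than raising an IndexError from split. This is only a minimal sketch under assumptions: extract_js_object and sample_script are hypothetical names, the sample data is made up, and the helper tolerates optional spaces around "=" by searching for the opening brace after the marker.

```python
def extract_js_object(source, marker="var flashvars"):
    """Return the text of the object literal assigned right after `marker`."""
    start = source.find(marker)
    if start == -1:
        raise ValueError(f"marker {marker!r} not found")
    brace = source.find("{", start)
    if brace == -1:
        raise ValueError("no opening brace after marker")
    # Like the slicing above, this stops at the first "};" after the brace.
    end = source.find("};", brace)
    if end == -1:
        raise ValueError("no closing '};' after opening brace")
    return source[brace:end + 1]


# Hypothetical script body, not the one from the question.
sample_script = (
    "/* <![CDATA[ */ var flashvars = {video_id: '1', "
    "video_url: 'https://example.com/a.mp4/'}; /* ]]> */"
)

literal = extract_js_object(sample_script)
print(literal)
```

The returned string is what would be handed to demjson.decode. As with the original slicing, a "};" that appears inside a string value would cut the literal short, so this suits simple, flat objects.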
