[Solved] Regular expression findall python


This looks like JSON, but without the enclosing [] that would make it an actual list. If you add the brackets, you should be able to parse it into a native Python list of dictionaries and navigate it:

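import json

# Wrap the text in [] so it parses as a JSON array of objects.
recipe = json.loads('[' + your_text + ']')
# Keep only the text of the entries marked as steps.
steps = [obj["text"] for obj in recipe if obj.get("@type") == "HowToStep"]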
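
To see the idea end to end, here is a minimal sketch on made-up data. The `your_text` value below is a hypothetical stand-in for the text in the question (comma-separated JSON objects with no enclosing brackets); the real input may well look different.

```python
import json

# Hypothetical sample: comma-separated JSON objects with no enclosing [].
your_text = (
    '{"@type": "HowToStep", "text": "Preheat the oven."},'
    '{"@type": "HowToStep", "text": "Mix the batter."},'
    '{"@type": "HowToTip", "text": "Use room-temperature eggs."}'
)

# Adding the brackets turns the text into a valid JSON array.
recipe = json.loads('[' + your_text + ']')

# Entries whose "@type" is not "HowToStep" are skipped.
steps = [obj["text"] for obj in recipe if obj.get("@type") == "HowToStep"]
print(steps)
```

Using `obj.get("@type")` rather than `obj["@type"]` means an entry without that key is simply skipped instead of raising a `KeyError`.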

What worries me, though, is that unless you truncated your text to post it here, it may not be well-formed JSON. In that case the code above will not work, and you can use a regular expression instead:

import re

# Capture whatever sits between the quotes after each "text": key.
regex = r'"text":"([^"]*)"'
matches = re.findall(regex, your_text)

‘matches’ should now be a list of all the text elements.

Curious about how this regex works? It matches the literal "text":" and then captures every character up to the next double quote.
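
The pattern above stops at the first double quote it meets and expects no whitespace around the colon. If the real text contains escaped quotes or is spaced differently, a more tolerant variant may help. The following is only a sketch under those assumptions: `extract_texts`, `TEXT_RE`, and the `sample` string are hypothetical names and data, not part of the question.

```python
import json
import re

# Assumption: "text" values may contain backslash escapes such as \" and
# there may be whitespace around the colon.
TEXT_RE = re.compile(r'"text"\s*:\s*"((?:[^"\\]|\\.)*)"')


def extract_texts(raw):
    """Return every "text" value in raw, decoding JSON escapes where possible."""
    values = []
    for match in TEXT_RE.findall(raw):
        try:
            # Re-wrap in quotes so json.loads decodes escapes like \" and \n.
            values.append(json.loads('"' + match + '"'))
        except json.JSONDecodeError:
            # Keep the raw capture if it is not a valid JSON string body.
            values.append(match)
    return values


# Hypothetical malformed input: the last object is cut off mid-way.
sample = (
    '{"@type":"HowToStep","text":"Say \\"hi\\"."},'
    '{"@type":"HowToStep", "text" : "Stir."},'
    '{"@type":'
)
print(extract_texts(sample))
```

Like the original pattern, this collects every "text" value regardless of its "@type", so filter the results afterwards if only the steps are wanted.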
