Your code as posted doesn’t run. And, even after I guess at how to fix it to run, it does not actually do what you claim. But I’m pretty sure I know where the error is anyway.
This code does not return an empty string, but a ":
text = div.get_text().strip().split(" ", 1)[0].strip()
… and it’s not because of strip. Contrary to what you claim, this code does not include the text you want in the first place:
text = div.get_text().strip().split(" ", 1)[0]
… but rather '"\n'. So of course stripping that gives you just a lone ", not the text you want.
If you print out the intermediate pieces, you can see why:
>>> div.get_text()
'\n "\n Text I want \n "\n \nEdit\n\n'
>>> div.get_text().strip()
'"\n Text I want \n "\n \nEdit'
>>> div.get_text().strip().split(" ", 1)
['"\n', ' Text I want \n "\n \nEdit']
>>> div.get_text().strip().split(" ", 1)[0]
'"\n'
>>> div.get_text().strip().split(" ", 1)[0].strip()
'"'
It looks like what you actually want to do is find the text between the first two " characters, and then strip that:
>>> div.get_text().strip().split('"', 2)[1].strip()
'Text I want'
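If you want to reuse that, here is a rough sketch of the same idea as a helper. The name text_between_quotes is made up for illustration, and the sample string is just the get_text() output shown above pasted in as a literal (so no BeautifulSoup is needed to try it); your real whitespace may differ.

```python
def text_between_quotes(raw):
    """Return the stripped text between the first two '"' characters.

    Returns None when the input has fewer than two double quotes.
    (Hypothetical helper, not part of BeautifulSoup.)
    """
    # maxsplit=2 yields at most three pieces: before, between, after.
    parts = raw.split('"', 2)
    if len(parts) < 3:
        return None
    return parts[1].strip()


# Stand-in for div.get_text(), copied from the transcript above.
sample = '\n "\n Text I want \n "\n \nEdit\n\n'
result = text_between_quotes(sample)
print(repr(result))
```

Returning None for the "fewer than two quotes" case is only one possible choice; raising an exception would be just as reasonable if that situation means the page layout changed.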
But also, I think you’re making things more complicated than they need to be by including all descendant text instead of just the immediate child text. If we don’t have the Edit part to deal with, the whole thing is just the text you want surrounded by a complicated mix of spaces, newlines, and quotes… which we can strip out all in one go:
>>> div.contents[0]
'\n "\n Text I want \n "\n '
>>> div.contents[0].strip(' \n"')
'Text I want'
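In case the argument to strip looks odd: it is treated as a set of characters to remove from both ends, not as a literal prefix or suffix, and nothing in the middle of the string is touched. A small sketch, again using a pasted string literal as a stand-in for div.contents[0] (the second string is invented purely to show the "ends only" behavior):

```python
# Stand-in for div.contents[0]; the exact whitespace is an assumption.
first_child = '\n "\n Text I want \n "\n '

# Any mix of spaces, newlines, and double quotes is removed from both ends.
cleaned = first_child.strip(' \n"')
print(repr(cleaned))

# Quotes in the middle survive, because strip only works on the ends.
inner = ' "say "hi" now" \n'.strip(' \n"')
print(repr(inner))
```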