Why couldn’t this regex:
<post\\s*author=\"([^\"]+)\"[^>]+>[^</post>]*</post>
extract the author in the following text?
Because
[^</post>]*
is a character class: it matches any single character other than <, /, p, o, s, t, and >, repeated zero or more times. It does not mean "anything up to </post>".
The text inside your post element contains some of those characters, so the match fails. To fix it, consider using the following regex:
<post\s*author=\"([^\"]+?)\"[^>]+>(.|\s)*?<\/post>
// obviously, escape appropriate characters in Java String literals
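No extra flag is needed for the match to span line breaks: \s inside (.|\s) already matches them. (The multiline flag only changes where ^ and $ match.)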
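As a rough illustration of the difference, here is a minimal Java sketch that runs both patterns over the same input. The class name, the helper method authors, and the sample XML are all hypothetical (the text from the question is not shown here), so treat this as a demonstration under those assumptions rather than a drop-in solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PostAuthorDemo {
    // Pattern from the question. [^</post>]* is a character class, so the body
    // may not contain any of the characters <, /, p, o, s, t, or >.
    static final Pattern BROKEN = Pattern.compile(
            "<post\\s*author=\"([^\"]+)\"[^>]+>[^</post>]*</post>");

    // Pattern from the answer, escaped for a Java string literal. (.|\\s)*?
    // lazily consumes any characters, line breaks included, up to the first </post>.
    static final Pattern FIXED = Pattern.compile(
            "<post\\s*author=\"([^\"]+?)\"[^>]+>(.|\\s)*?<\\/post>");

    // Hypothetical helper: collects capture group 1 (the author) of every match.
    static List<String> authors(Pattern pattern, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample input, not the text from the question.
        String xml = "<post author=\"alice\" id=\"1\">Hello\nworld</post>\n"
                   + "<post author=\"bob\" id=\"2\">12345</post>";

        // Only the second post matches: its body has none of the excluded characters.
        System.out.println(authors(BROKEN, xml)); // prints [bob]

        // Both posts match, including the one whose body spans two lines.
        System.out.println(authors(FIXED, xml));  // prints [alice, bob]
    }
}
```

Note that both patterns keep [^>]+ after the author attribute, which requires at least one more character before the closing >. A tag such as <post author="x"> with nothing after the attribute would not match; changing [^>]+ to [^>]* would allow it.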
Solved: extract information from XML using regular expression