Your current use of preg_replace_callback() assumes that every regex match will be replaced with a link. Since the emojis will not be part of a link, the simplest solution is to leave the preg_replace_callback() as-is and add an extra step after it that does the Unicode replacement.
    function convertAll($str) {
        // Matches "@name" or "#tag"; the name or tag is captured in group 1.
        $regex = "/[@#](\w+)/";
        // type and links
        $hrefs = [
            '#' => 'hashtag?tag',
            '@' => 'profile?username'
        ];
        $result = preg_replace_callback($regex, function($matches) use ($hrefs) {
            return sprintf(
                '<a href="https://stackoverflow.com/questions/57415065/%s=%s">%s</a>',
                $hrefs[$matches[0][0]], // first character of the match: '@' or '#'
                $matches[1],
                $matches[0]
            );
        }, $str);
        // Extra step: rewrite "U+XXXXX" as the literal text "\u{XXXXX}".
        $result = preg_replace("/U\+([A-F0-9]{5})/", '\u{${1}}', $result);
        return $result;
    }
The regex part of the preg_replace() call matches a literal “U” followed by a literal “+” followed by exactly 5 characters, each of which is A-F or 0-9 (an uppercase hex digit). We capture those 5 characters, put them after a literal “\u{”, and follow them with a literal “}”.
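To make the matching rule concrete, here is a small sketch that runs the same pattern and replacement over a few sample tokens. The tokens and the `$results` array are made up for illustration. Note that the output is the literal text `\u{...}`, not a rendered character, and that tokens with fewer than 5 hex digits or with lowercase digits are left unchanged.

```php
<?php
// Sample tokens are illustrative assumptions, not taken from the question.
$samples = ['U+1F600', 'U+1F44D', 'U+2764', 'U+1f600'];

$results = [];
foreach ($samples as $token) {
    // Same pattern and replacement as in convertAll().
    $results[$token] = preg_replace("/U\+([A-F0-9]{5})/", '\u{${1}}', $token);
    echo $token, ' => ', $results[$token], PHP_EOL;
}
// U+1F600 and U+1F44D are rewritten; U+2764 (4 digits) and
// U+1f600 (lowercase "f") do not match and pass through as-is.
```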
There may be a way to do this within preg_replace_callback(), but that seemed a bit more effort than I was willing to put in right now. If someone comes up with an answer that does that, I’d love to see it.
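For what it's worth, one way a single-pass version might look is sketched below. This is only a rough sketch under some assumptions: the function name `convertAllSinglePass` is hypothetical, the href is simplified to a relative `type=value` form, and the code point is emitted as an HTML entity rather than as `\u{...}`. It relies on an alternation in the pattern and on the callback checking which branch matched.

```php
<?php
// Hypothetical single-pass variant; the function name is illustrative only.
function convertAllSinglePass(string $str): string
{
    // Branch 1: "@name" or "#tag" (groups 1 and 2).
    // Branch 2: "U+" followed by 5 uppercase hex digits (group 3).
    $regex = '/([@#])(\w+)|U\+([A-F0-9]{5})/';

    $hrefs = [
        '#' => 'hashtag?tag',
        '@' => 'profile?username',
    ];

    return preg_replace_callback($regex, function (array $m) use ($hrefs) {
        // Group 3 is only set when the code point branch matched.
        if (isset($m[3])) {
            return '&#x' . $m[3] . ';';
        }
        // Otherwise build the link (simplified relative href, an assumption).
        return sprintf('<a href="%s=%s">%s</a>', $hrefs[$m[1]], $m[2], $m[0]);
    }, $str);
}

echo convertAllSinglePass('hi @bob U+1F600'), PHP_EOL;
```

The two-step version is arguably easier to read; the main benefit of a single pass is that the code point replacement never runs over the generated link markup.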
To replace with HTML entities, use this preg_replace() call instead:
    $result = preg_replace("/U\+([A-F0-9]{5})/", "&#x\\1;", $result);
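As a quick illustration of the entity variant, the sketch below applies it to a made-up sample string (the input and variable names are assumptions). It also shows an optional extra step, not part of the answer above, that decodes the entity into the actual UTF-8 character with html_entity_decode().

```php
<?php
// Sample input is illustrative only.
$input = "Great U+1F44D";

// "\\1" inside a double-quoted string is the backreference \1.
$entities = preg_replace("/U\+([A-F0-9]{5})/", "&#x\\1;", $input);
echo $entities, PHP_EOL; // Great &#x1F44D;

// Optional: turn the numeric entity into the UTF-8 character itself.
$utf8 = html_entity_decode($entities, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo $utf8, PHP_EOL;
```

The entity form is usually enough when the result is sent to a browser, since the browser renders the entity as the emoji.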