[Solved] Improve function to recognize and convert unicode emojis [closed]


Your current use of preg_replace_callback() assumes that every regex match will be replaced with a link. Since the emojis will not be part of a link, the simplest solution is to leave the preg_replace_callback() as-is and add an extra step after it that does the unicode replacement.

function convertAll($str) {
    $regex = "/[@#](\w+)/";
    //type and links
    $hrefs = [
        '#' => 'hashtag?tag',
        '@' => 'profile?username'
    ];

    $result = preg_replace_callback($regex, function($matches) use ($hrefs) {
         return sprintf(
             '<a href="https://stackoverflow.com/questions/57415065/%s=%s">%s</a>',
             $hrefs[$matches[0][0]],
             $matches[1], 
             $matches[0]
         );
    }, $str);

    // Rewrite each "U+XXXXX" code point as the literal text "\u{XXXXX}".
    $result = preg_replace("/U\+([A-F0-9]{5})/", '\u{${1}}', $result);

    return $result;
}

The regex part of the preg_replace() matches a literal “U”, followed by a literal “+”, followed by 5 characters that are each A-F or 0-9 (uppercase hexadecimal digits). Those 5 characters are captured and placed after a literal “\u{”, followed by a literal “}”.
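To see that step in isolation, here is a minimal sketch that runs the same preg_replace() on a made-up string. The sample input and variable names are assumptions for illustration only.

```php
<?php
// Hypothetical sample input; only the preg_replace() call comes from the answer.
$input = 'Hello U+1F600 and U+1F44D';

// Single quotes keep "\u{" literal; ${1} refers to the first capture group.
$output = preg_replace('/U\+([A-F0-9]{5})/', '\u{${1}}', $input);

echo $output, PHP_EOL; // Hello \u{1F600} and \u{1F44D}
```

Note that the result is the literal text \u{1F600}, not the emoji character itself.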

DEMO

There may be a way to do this within preg_replace_callback(), but that seemed a bit more effort than I was willing to put in right now. If someone comes up with an answer that does that, I’d love to see it.
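For what it's worth, one possible way to fold both replacements into a single preg_replace_callback() is to use an alternation and branch on which group matched. This is only a sketch under assumptions: the function name convertAllSinglePass() is hypothetical, and the link format string is copied from the function above.

```php
<?php
// Hypothetical single-pass variant of convertAll(); the name is an assumption.
function convertAllSinglePass(string $str): string
{
    // Type and links, as in convertAll().
    $hrefs = [
        '#' => 'hashtag?tag',
        '@' => 'profile?username',
    ];

    // Alternative 1 captures the code point in group 1.
    // Alternative 2 captures the sigil in group 2 and the name in group 3.
    $regex = '/U\+([A-F0-9]{5})|([@#])(\w+)/';

    return preg_replace_callback($regex, function (array $m) use ($hrefs) {
        if ($m[1] !== '') {
            // Code point branch: same output as the separate preg_replace() step.
            return '\u{' . $m[1] . '}';
        }
        // Link branch: group 1 is an empty string when alternative 2 matched.
        return sprintf(
            '<a href="https://stackoverflow.com/questions/57415065/%s=%s">%s</a>',
            $hrefs[$m[2]],
            $m[3],
            $m[0]
        );
    }, $str);
}

echo convertAllSinglePass('Hi @bob U+1F600 #php'), PHP_EOL;
```

Because the input is scanned once, the code point replacement never runs over the markup generated for the links, which is a small behavioral difference from the two-step version.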

To replace with HTML entities use this preg_replace instead:

$result = preg_replace("/U\+([A-F0-9]{5})/", "&#x\\1;", $result);
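A quick way to check the entity version is to run it on a made-up string and decode the result. This is a sketch only: the sample input is an assumption, and the html_entity_decode() step is there purely to confirm that the entity resolves to the intended character.

```php
<?php
// Hypothetical sample input for illustration.
$input = 'Smile U+1F600';

// "\\1" in a double-quoted string reaches preg_replace() as \1 (capture group 1).
$entities = preg_replace("/U\+([A-F0-9]{5})/", "&#x\\1;", $input);

echo $entities, PHP_EOL; // Smile &#x1F600;

// Decoding the entity yields the actual UTF-8 character (a browser does this when rendering).
$decoded = html_entity_decode($entities, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo $decoded, PHP_EOL;
```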
