{"id":10783,"date":"2022-09-24T23:39:02","date_gmt":"2022-09-24T18:09:02","guid":{"rendered":"http:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/"},"modified":"2022-09-24T23:39:02","modified_gmt":"2022-09-24T18:09:02","slug":"solved-converting-regex-to-account-for-international-characters","status":"publish","type":"post","link":"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/","title":{"rendered":"[Solved] Converting regex to account for international characters"},"content":{"rendered":"<p> [ad_1]<br \/>\n<\/p>\n<div id=\"answer-30113100\" class=\"answer js-answer accepted-answer js-accepted-answer\" data-answerid=\"30113100\" data-parentid=\"30112917\" data-score=\"3\" data-position-on-page=\"1\" data-highest-scored=\"1\" data-question-has-accepted-highest-score=\"1\" itemprop=\"acceptedAnswer\" itemscope itemtype=\"https:\/\/schema.org\/Answer\">\n<div class=\"post-layout\">\n<div class=\"votecell post-layout--left\"><\/div>\n<div class=\"answercell post-layout--right\">\n<div class=\"s-prose js-post-body\" itemprop=\"text\">\n<p>Here are the required steps:<\/p>\n<ul>\n<li>\n<p>Use the <code>u<\/code> pattern option. This turns on <code>PCRE_UTF8<\/code> <em>and<\/em> <code>PCRE_UCP<\/code> (the PHP docs forget to mention that one):<\/p>\n<blockquote>\n<p><code>PCRE_UTF8<\/code><\/p>\n<p>This option causes PCRE to regard both the pattern and the subject as strings of UTF-8 characters instead of single-byte strings. However, it is available only when PCRE is built to include UTF support. If not, the use of this option provokes an error. Details of how this option changes the behaviour of PCRE are given in the pcreunicode page.<\/p>\n<p><code>PCRE_UCP<\/code><\/p>\n<p>This option changes the way PCRE processes <code>\\B<\/code>, <code>\\b<\/code>, <code>\\D<\/code>, <code>\\d<\/code>, <code>\\S<\/code>, <code>\\s<\/code>, <code>\\W<\/code>, <code>\\w<\/code>, and some of the POSIX character classes. By default, only ASCII characters are recognized, but if PCRE_UCP is set, Unicode properties are used instead to classify characters. More details are given in the section on generic character types in the pcrepattern page. If you set PCRE_UCP, matching one of the items it affects takes much longer. The option is available only if PCRE has been compiled with Unicode property support.<\/p>\n<\/blockquote>\n<\/li>\n<li>\n<p><code>\\d<\/code> will do just fine with <code>PCRE_UCP<\/code> (it&#8217;s equivalent to <code>\\p{N}<\/code> already), but you have to replace these <code>[a-z]<\/code> ranges to account for accented characters:<\/p>\n<ul>\n<li>Replace <code>[a-zA-Z]<\/code> with <code>\\p{L}<\/code><\/li>\n<li>Replace <code>[a-z]<\/code> with <code>\\p{Ll}<\/code><\/li>\n<li>Replace <code>[A-Z]<\/code> with <code>\\p{Lu}<\/code><\/li>\n<\/ul>\n<p><code>\\p{X}<\/code> means: <em>a character from <a rel=\"nofollow noopener\" target=\"_blank\" href=\"http:\/\/www.fileformat.info\/info\/unicode\/category\/index.htm\">Unicode category<\/a> X<\/em>, where <code>L<\/code> means <em>letter<\/em>, <code>Ll<\/code> means <em>lowercase letter<\/em> and <code>Lu<\/code> means <em>uppercase letter<\/em>. You can get a list from <a rel=\"nofollow noopener\" target=\"_blank\" href=\"http:\/\/www.pcre.org\/original\/doc\/html\/pcresyntax.html\">the docs<\/a>.<\/p>\n<p>Note that you can use <code>\\p{X}<\/code> inside a character class: <code>[\\p{L}\\d\\s]<\/code> for instance.<\/p>\n<\/li>\n<li>\n<p>And make sure you use UTF8 encoding for your strings in PHP. Also, make sure you use Unicode-aware functions to handle these strings.<\/p>\n<\/li>\n<\/ul>\n<\/div>\n<div class=\"mt24\"><\/div>\n<\/div>\n<p>            <span class=\"d-none\" itemprop=\"commentCount\"><\/span> <\/p><\/div>\n<\/div>\n<p>[ad_2]<\/p>\n<p>solved Converting regex to account for international characters <\/p>\n","protected":false},"excerpt":{"rendered":"<p>[ad_1] Here are the required steps: Use the u pattern option. This turns on PCRE_UTF8 and PCRE_UCP (the PHP docs forget to mention that one): PCRE_UTF8 This option causes PCRE to regard both the pattern and the subject as strings of UTF-8 characters instead of single-byte strings. However, it is available only when PCRE is &#8230; <a title=\"[Solved] Converting regex to account for international characters\" class=\"read-more\" href=\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/\" aria-label=\"More on [Solved] Converting regex to account for international characters\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[320],"tags":[2956,339,347],"class_list":["post-10783","post","type-post","status-publish","format-standard","hentry","category-solved","tag-internationalization","tag-php","tag-regex"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>[Solved] Converting regex to account for international characters - JassWeb<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"[Solved] Converting regex to account for international characters - JassWeb\" \/>\n<meta property=\"og:description\" content=\"[ad_1] Here are the required steps: Use the u pattern option. This turns on PCRE_UTF8 and PCRE_UCP (the PHP docs forget to mention that one): PCRE_UTF8 This option causes PCRE to regard both the pattern and the subject as strings of UTF-8 characters instead of single-byte strings. However, it is available only when PCRE is ... Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/\" \/>\n<meta property=\"og:site_name\" content=\"JassWeb\" \/>\n<meta property=\"article:published_time\" content=\"2022-09-24T18:09:02+00:00\" \/>\n<meta name=\"author\" content=\"Kirat\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Kirat\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/\"},\"author\":{\"name\":\"Kirat\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\"},\"headline\":\"[Solved] Converting regex to account for international characters\",\"datePublished\":\"2022-09-24T18:09:02+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/\"},\"wordCount\":265,\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"keywords\":[\"internationalization\",\"php\",\"regex\"],\"articleSection\":[\"Solved\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/\",\"url\":\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/\",\"name\":\"[Solved] Converting regex to account for international characters - JassWeb\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#website\"},\"datePublished\":\"2022-09-24T18:09:02+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/jassweb.com\/solved\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"[Solved] Converting regex to account for international characters\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/jassweb.com\/solved\/#website\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"name\":\"JassWeb\",\"description\":\"Build High-quality Websites\",\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/jassweb.com\/solved\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\",\"name\":\"Jass Web\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"contentUrl\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"width\":693,\"height\":132,\"caption\":\"Jass Web\"},\"image\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\",\"name\":\"Kirat\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586\",\"contentUrl\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586\",\"caption\":\"Kirat\"},\"sameAs\":[\"http:\/\/jassweb.com\"],\"url\":\"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"[Solved] Converting regex to account for international characters - JassWeb","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/","og_locale":"en_US","og_type":"article","og_title":"[Solved] Converting regex to account for international characters - JassWeb","og_description":"[ad_1] Here are the required steps: Use the u pattern option. This turns on PCRE_UTF8 and PCRE_UCP (the PHP docs forget to mention that one): PCRE_UTF8 This option causes PCRE to regard both the pattern and the subject as strings of UTF-8 characters instead of single-byte strings. However, it is available only when PCRE is ... Read more","og_url":"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/","og_site_name":"JassWeb","article_published_time":"2022-09-24T18:09:02+00:00","author":"Kirat","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Kirat","Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/#article","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/"},"author":{"name":"Kirat","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31"},"headline":"[Solved] Converting regex to account for international characters","datePublished":"2022-09-24T18:09:02+00:00","mainEntityOfPage":{"@id":"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/"},"wordCount":265,"publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"keywords":["internationalization","php","regex"],"articleSection":["Solved"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/","url":"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/","name":"[Solved] Converting regex to account for international characters - JassWeb","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/#website"},"datePublished":"2022-09-24T18:09:02+00:00","breadcrumb":{"@id":"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/jassweb.com\/solved\/solved-converting-regex-to-account-for-international-characters\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/jassweb.com\/solved\/"},{"@type":"ListItem","position":2,"name":"[Solved] Converting regex to account for international characters"}]},{"@type":"WebSite","@id":"https:\/\/jassweb.com\/solved\/#website","url":"https:\/\/jassweb.com\/solved\/","name":"JassWeb","description":"Build High-quality Websites","publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/jassweb.com\/solved\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/jassweb.com\/solved\/#organization","name":"Jass Web","url":"https:\/\/jassweb.com\/solved\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/","url":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","contentUrl":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","width":693,"height":132,"caption":"Jass Web"},"image":{"@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31","name":"Kirat","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/","url":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586","contentUrl":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586","caption":"Kirat"},"sameAs":["http:\/\/jassweb.com"],"url":"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/"}]}},"_links":{"self":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/10783","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/comments?post=10783"}],"version-history":[{"count":0,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/10783\/revisions"}],"wp:attachment":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/media?parent=10783"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/categories?post=10783"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/tags?post=10783"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}