{"id":32079,"date":"2023-01-26T15:08:11","date_gmt":"2023-01-26T09:38:11","guid":{"rendered":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/"},"modified":"2023-01-26T15:08:11","modified_gmt":"2023-01-26T09:38:11","slug":"solved-what-are-unicode-codepoint-types-for","status":"publish","type":"post","link":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/","title":{"rendered":"[Solved] What are Unicode codepoint types for?"},"content":{"rendered":"<p> [ad_1]<br \/>\n<\/p>\n<div id=\"answer-73688043\" class=\"answer js-answer accepted-answer js-accepted-answer\" data-answerid=\"73688043\" data-parentid=\"73663349\" data-score=\"1\" data-position-on-page=\"1\" data-highest-scored=\"1\" data-question-has-accepted-highest-score=\"1\" itemprop=\"acceptedAnswer\" itemscope itemtype=\"https:\/\/schema.org\/Answer\">\n<div class=\"post-layout\">\n<div class=\"votecell post-layout--left\"><\/div>\n<div class=\"answercell post-layout--right\">\n<div class=\"s-prose js-post-body\" itemprop=\"text\">\n<p>Texts have many different meaning and usages, so the question is difficult to answer.<\/p>\n<p>First: about <strong>codepoint<\/strong>. We uses the term <em>codepoint<\/em> because it is easy, it implies a number (<em>code<\/em>), and not really confuseable with other terms. Unicode tell us that it doesn&#8217;t use the term <em>codepoint<\/em> and <em>character<\/em> in a consistent way, but also that it is not a problem: context is clear, and they are often interchangeable (but for few codepoints which are not characters, like surrogates, and few reserved codepoints). Note: Unicode is mostly about characters, and ISO 10646 <em>was<\/em> most about codepoints. So original ISO was about a table with numbers (codepoint) and names, and Unicode about properties of characters. So we may use <em>codepoints<\/em> where Unicode <em>character<\/em> should be better, but <em>character<\/em> is easy confuseable with C <code>char<\/code>, and with font glyphs\/graphemes.<\/p>\n<p>Codepoints are one basic unit, so useful for most of programs, e.g. to store in databases, to exchange to other programs, to save files, for sorting, etc. For this exact reasons program languages uses the codepoint as type. UTF-8 code units may be an alternative, but it would be more difficult to navigate (see a UTF-8 as a tape disk where you should read sequentially, and codepoint text as an hard disk where you can just in middle of a text). Not a 100% appropriate, because you may need some context bytes. If you are getting user text, your program probably do not need to split in graphemes, to do liguatures, etc. if it will just store the data in a database. Codepoint is really low level and so fast for most operations.<\/p>\n<p>The other part of text: displaying (or speech). This part is very complex, because we have many different scripts with very different rules, and then different languages with own special cases. So we needs a series of libraries, e.g. text layout (so word separation, etc. like pango), sharper engine (to find which glyph to use, combining characters, where to put next characters, e.g. HarfBuzz), and a font library which display the font (cairo plus freetype). it is complex, but most programmers do not need special handling: just reading text from database and sent to screen, so we just uses the relevant library (and it depends on operating system), and just going on. It is too complex for a language specification (and also a moving target, maybe in 30 years things are more standardized). So it is complex, and with many operation, so we may use complex structures (array of array of codepoint: so array of graphemes): not much a slow down. Note: fonts have codepoint tables to perform various operation before to find the glyph index. Various API uses Unicode strings (as codepoint array, UTF-16, UTF-8, etc.).<\/p>\n<p>Naturally things are more complex, and it requires a lot of knowledge of different part of Unicode, if you are trying to program an editor (WYSIWYG, but also with terminals): you mix both worlds, and you need much more information (e.g. for selection of text). But in this case you must create your own structures.<\/p>\n<p>And really: things are complex: do you want to just show first x characters on your blog? (maybe about assessment), or split at words (some language are not so linear, so the interpretation may be very wrong). For now just humans can do a good job for all languages, so also not yet need to a supporting type in different languages.<\/p>\n<\/p><\/div>\n<div class=\"mt24\"><\/div>\n<\/div>\n<p>            <span class=\"d-none\" itemprop=\"commentCount\"><\/span> <\/p><\/div>\n<\/div>\n<p>[ad_2]<\/p>\n<p>solved What are Unicode codepoint types for? <\/p>\n","protected":false},"excerpt":{"rendered":"<p>[ad_1] Texts have many different meaning and usages, so the question is difficult to answer. First: about codepoint. We uses the term codepoint because it is easy, it implies a number (code), and not really confuseable with other terms. Unicode tell us that it doesn&#8217;t use the term codepoint and character in a consistent way, &#8230; <a title=\"[Solved] What are Unicode codepoint types for?\" class=\"read-more\" href=\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/\" aria-label=\"More on [Solved] What are Unicode codepoint types for?\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[320],"tags":[596,942,938,426],"class_list":["post-32079","post","type-post","status-publish","format-standard","hentry","category-solved","tag-go","tag-rust","tag-unicode","tag-utf-8"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>[Solved] What are Unicode codepoint types for? - JassWeb<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"[Solved] What are Unicode codepoint types for? - JassWeb\" \/>\n<meta property=\"og:description\" content=\"[ad_1] Texts have many different meaning and usages, so the question is difficult to answer. First: about codepoint. We uses the term codepoint because it is easy, it implies a number (code), and not really confuseable with other terms. Unicode tell us that it doesn&#8217;t use the term codepoint and character in a consistent way, ... Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/\" \/>\n<meta property=\"og:site_name\" content=\"JassWeb\" \/>\n<meta property=\"article:published_time\" content=\"2023-01-26T09:38:11+00:00\" \/>\n<meta name=\"author\" content=\"Kirat\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Kirat\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/\"},\"author\":{\"name\":\"Kirat\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\"},\"headline\":\"[Solved] What are Unicode codepoint types for?\",\"datePublished\":\"2023-01-26T09:38:11+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/\"},\"wordCount\":592,\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"keywords\":[\"go\",\"rust\",\"unicode\",\"utf-8\"],\"articleSection\":[\"Solved\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/\",\"url\":\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/\",\"name\":\"[Solved] What are Unicode codepoint types for? - JassWeb\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#website\"},\"datePublished\":\"2023-01-26T09:38:11+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/jassweb.com\/solved\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"[Solved] What are Unicode codepoint types for?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/jassweb.com\/solved\/#website\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"name\":\"JassWeb\",\"description\":\"Build High-quality Websites\",\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/jassweb.com\/solved\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\",\"name\":\"Jass Web\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"contentUrl\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"width\":693,\"height\":132,\"caption\":\"Jass Web\"},\"image\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\",\"name\":\"Kirat\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1775193939\",\"contentUrl\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1775193939\",\"caption\":\"Kirat\"},\"sameAs\":[\"http:\/\/jassweb.com\"],\"url\":\"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"[Solved] What are Unicode codepoint types for? - JassWeb","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/","og_locale":"en_US","og_type":"article","og_title":"[Solved] What are Unicode codepoint types for? - JassWeb","og_description":"[ad_1] Texts have many different meaning and usages, so the question is difficult to answer. First: about codepoint. We uses the term codepoint because it is easy, it implies a number (code), and not really confuseable with other terms. Unicode tell us that it doesn&#8217;t use the term codepoint and character in a consistent way, ... Read more","og_url":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/","og_site_name":"JassWeb","article_published_time":"2023-01-26T09:38:11+00:00","author":"Kirat","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Kirat","Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/#article","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/"},"author":{"name":"Kirat","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31"},"headline":"[Solved] What are Unicode codepoint types for?","datePublished":"2023-01-26T09:38:11+00:00","mainEntityOfPage":{"@id":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/"},"wordCount":592,"publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"keywords":["go","rust","unicode","utf-8"],"articleSection":["Solved"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/","url":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/","name":"[Solved] What are Unicode codepoint types for? - JassWeb","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/#website"},"datePublished":"2023-01-26T09:38:11+00:00","breadcrumb":{"@id":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/jassweb.com\/solved\/solved-what-are-unicode-codepoint-types-for\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/jassweb.com\/solved\/"},{"@type":"ListItem","position":2,"name":"[Solved] What are Unicode codepoint types for?"}]},{"@type":"WebSite","@id":"https:\/\/jassweb.com\/solved\/#website","url":"https:\/\/jassweb.com\/solved\/","name":"JassWeb","description":"Build High-quality Websites","publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/jassweb.com\/solved\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/jassweb.com\/solved\/#organization","name":"Jass Web","url":"https:\/\/jassweb.com\/solved\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/","url":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","contentUrl":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","width":693,"height":132,"caption":"Jass Web"},"image":{"@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31","name":"Kirat","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/","url":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1775193939","contentUrl":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1775193939","caption":"Kirat"},"sameAs":["http:\/\/jassweb.com"],"url":"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/"}]}},"_links":{"self":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/32079","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/comments?post=32079"}],"version-history":[{"count":0,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/32079\/revisions"}],"wp:attachment":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/media?parent=32079"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/categories?post=32079"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/tags?post=32079"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}