{"id":30228,"date":"2023-01-14T06:32:06","date_gmt":"2023-01-14T01:02:06","guid":{"rendered":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/"},"modified":"2023-01-14T06:32:06","modified_gmt":"2023-01-14T01:02:06","slug":"solved-reformat-csv-file-using-python","status":"publish","type":"post","link":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/","title":{"rendered":"[Solved] Reformat csv file using python?"},"content":{"rendered":"<div id=\"answer-37628996\" class=\"answer js-answer accepted-answer js-accepted-answer\" data-answerid=\"37628996\" data-parentid=\"37628184\" data-score=\"1\" data-position-on-page=\"1\" data-highest-scored=\"1\" data-question-has-accepted-highest-score=\"1\" itemprop=\"acceptedAnswer\" itemscope itemtype=\"https:\/\/schema.org\/Answer\">\n<div class=\"post-layout\">\n<div class=\"votecell post-layout--left\"><\/div>\n<div class=\"answercell post-layout--right\">\n<div class=\"s-prose js-post-body\" itemprop=\"text\">\n<p>Think of it in terms of two separate tasks:<\/p>\n<ul>\n<li>Collect some data items from a \u2018dirty\u2019 source (this CSV file)<\/li>\n<li>Store that data somewhere so that it\u2019s easy to access and manipulate programmatically (according to what you want to do with it)<\/li>\n<\/ul>\n<h2>Processing dirty CSV<\/h2>\n<p>One way to do this is to have a function <code>deserialize_business()<\/code> that distills structured business information from each incoming line in your CSV. This function can get complex, because that\u2019s the nature of the task, but it\u2019s still advisable to split it into smaller, self-contained functions (such as <code>get_outlets()<\/code>, <code>get_headings()<\/code>, and so on). 
This function can return a dictionary, but depending on what you need it could just as well be a named tuple or a custom object.<\/p>\n<p>This function would be an \u2018adapter\u2019 for this particular CSV data source.<\/p>\n<p>Example of a deserialization function:<\/p>\n<pre><code>def deserialize_business(csv_line):\n    \"\"\"\n    Distills structured business information from given raw CSV line.\n    Returns a dictionary like {name, phone, owner,\n    btype, yoe, headings[], outlets[], category}.\n    \"\"\"\n\n    # split on commas, then strip brackets, quotes and spaces from each piece\n    pieces = [piece.strip(\"[[\\\"\\']] \") for piece in csv_line.strip().split(',')]\n\n    name = pieces[0]\n    phone = pieces[1]\n    owner = pieces[2]\n    btype = pieces[3]\n    yoe = pieces[4]\n\n    # after yoe headings begin, until substring Outlets Address\n    headings = pieces[5:pieces.index(\"Outlets Address\")]\n\n    # outlets go from substring Outlets Address until category\n    outlet_pieces = pieces[pieces.index(\"Outlets Address\"):-1]\n\n    # combine each individual outlet information into a string\n    # and let ``deserialize_outlet()`` (a helper you write yourself) deal with that;\n    # the chunk before the first marker is empty, so skip it\n    raw_outlets = \", \".join(outlet_pieces).split(\"Outlets Address\")\n    outlets = [deserialize_outlet(outlet) for outlet in raw_outlets if outlet.strip(\", \")]\n\n    # category is the last piece\n    category = pieces[-1]\n\n    return {\n        'name': name,\n        'phone': phone,\n        'owner': owner,\n        'btype': btype,\n        'yoe': yoe,\n        'headings': headings,\n        'outlets': outlets,\n        'category': category,\n    }\n<\/code><\/pre>\n<p>Example of calling it:<\/p>\n<pre><code>import logging\n\nlog = logging.getLogger(__name__)\n\nwith open(\"phonebookCOMPK-Directory.csv\") as f:\n    # number lines from 1 so log messages match the file\n    for lineno, line in enumerate(f, start=1):\n        try:\n            business = deserialize_business(line)\n\n        except Exception:\n            # Bad line formatting?\n            log.exception(\"Failed to deserialize line #%s!\", lineno)\n\n        else:\n            # All is well\n            store_business(business)\n<\/code><\/pre>\n<h2>Storing 
the data<\/h2>\n<p>You\u2019ll have the <code>store_business()<\/code> function take your data structure and write it somewhere. That could be another, better-structured CSV, several CSVs, a JSON file, or an SQLite database, since Python ships with the <code>sqlite3<\/code> module.<\/p>\n<p>It all depends on what you want to do later.<\/p>\n<h3>Relational example<\/h3>\n<p>In this case your data would be split across multiple tables. (I\u2019m using the word \u201ctable\u201d loosely: each one can be a CSV file or a table in an SQLite database.)<\/p>\n<p>Table identifying all possible business headings:<\/p>\n<pre><code>business heading ID, name\n1, Abattoirs\n2, Exporters\n3, Food Delivery\n4, Butchers Retail\n5, Meat Dealers-Retail\n6, Meat Freezer\n7, Meat Packers\n<\/code><\/pre>\n<p>Table identifying all possible categories:<\/p>\n<pre><code>category ID, parent category, name\n1, NULL, \"Agriculture, fishing &amp; Forestry\"\n2, 1, \"Farming equipment &amp; services\"\n3, 2, \"Abattoirs in Pakistan\"\n<\/code><\/pre>\n<p>Table identifying businesses:<\/p>\n<pre><code>business ID, name, phone, owner, type, yoe, category\n1, Meat One, +92-21-111163281, Al Shaheer Corporation, Retailers, 2008, 3\n<\/code><\/pre>\n<p>Table describing their outlets:<\/p>\n<pre><code>business ID, city, address, landmarks, phone\n1, Karachi UAN, \"Shop 13, Ground Floor, Plot 14-D, Sky Garden, Main Tipu Sultan Road, KDA Scheme No.1, Karachi\", \"Nadra Chowrangi, Sky Garden, Tipu Sultan Road\", +92-21-111163281\n1, Karachi UAN, \"Near Jan's Broast, Boat Basin, Khayaban-e-Roomi, Block 5, Clifton, Karachi\", \"Boat Basin, Jans Broast, Khayaban-e-Roomi\", +92-21-111163281\n<\/code><\/pre>\n<p>Table describing their headings:<\/p>\n<pre><code>business ID, business heading ID\n1, 1\n1, 2\n1, 3\n\u2026\n<\/code><\/pre>\n<p>Handling all this would require a complex <code>store_business()<\/code> function. 
It may be worth looking into SQLite and some ORM framework, if going with relational way of keeping the data.<\/p>\n<\/p><\/div>\n<div class=\"mt24\"><\/div>\n<\/div>\n<p>            <span class=\"d-none\" itemprop=\"commentCount\">4<\/span> <\/p><\/div>\n<\/div>\n<p>[ad_2]<\/p>\n<p>solved Reformat csv file using python? <\/p>\n","protected":false},"excerpt":{"rendered":"<p>[ad_1] Think of it in terms of two separate tasks: Collect some data items from a \u2018dirty\u2019 source (this CSV file) Store that data somewhere so that it\u2019s easy to access and manipulate programmatically (according to what you want to do with it) Processing dirty CSV One way to do this is to have a &#8230; <a title=\"[Solved] Reformat csv file using python?\" class=\"read-more\" href=\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/\" aria-label=\"More on [Solved] Reformat csv file using python?\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[320],"tags":[483,482],"class_list":["post-30228","post","type-post","status-publish","format-standard","hentry","category-solved","tag-csv","tag-python-3-x"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>[Solved] Reformat csv file using python? - JassWeb<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"[Solved] Reformat csv file using python? 
- JassWeb\" \/>\n<meta property=\"og:description\" content=\"[ad_1] Think of it in terms of two separate tasks: Collect some data items from a \u2018dirty\u2019 source (this CSV file) Store that data somewhere so that it\u2019s easy to access and manipulate programmatically (according to what you want to do with it) Processing dirty CSV One way to do this is to have a ... Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/\" \/>\n<meta property=\"og:site_name\" content=\"JassWeb\" \/>\n<meta property=\"article:published_time\" content=\"2023-01-14T01:02:06+00:00\" \/>\n<meta name=\"author\" content=\"Kirat\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Kirat\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/\"},\"author\":{\"name\":\"Kirat\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\"},\"headline\":\"[Solved] Reformat csv file using 
python?\",\"datePublished\":\"2023-01-14T01:02:06+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/\"},\"wordCount\":303,\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"keywords\":[\"csv\",\"python-3.x\"],\"articleSection\":[\"Solved\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/\",\"url\":\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/\",\"name\":\"[Solved] Reformat csv file using python? - JassWeb\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#website\"},\"datePublished\":\"2023-01-14T01:02:06+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/jassweb.com\/solved\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"[Solved] Reformat csv file using python?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/jassweb.com\/solved\/#website\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"name\":\"JassWeb\",\"description\":\"Build High-quality 
Websites\",\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/jassweb.com\/solved\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\",\"name\":\"Jass Web\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"contentUrl\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"width\":693,\"height\":132,\"caption\":\"Jass Web\"},\"image\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\",\"name\":\"Kirat\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1775798750\",\"contentUrl\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1775798750\",\"caption\":\"Kirat\"},\"sameAs\":[\"http:\/\/jassweb.com\"],\"url\":\"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"[Solved] Reformat csv file using python? 
- JassWeb","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/","og_locale":"en_US","og_type":"article","og_title":"[Solved] Reformat csv file using python? - JassWeb","og_description":"[ad_1] Think of it in terms of two separate tasks: Collect some data items from a \u2018dirty\u2019 source (this CSV file) Store that data somewhere so that it\u2019s easy to access and manipulate programmatically (according to what you want to do with it) Processing dirty CSV One way to do this is to have a ... Read more","og_url":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/","og_site_name":"JassWeb","article_published_time":"2023-01-14T01:02:06+00:00","author":"Kirat","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Kirat","Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/#article","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/"},"author":{"name":"Kirat","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31"},"headline":"[Solved] Reformat csv file using python?","datePublished":"2023-01-14T01:02:06+00:00","mainEntityOfPage":{"@id":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/"},"wordCount":303,"publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"keywords":["csv","python-3.x"],"articleSection":["Solved"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/","url":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/","name":"[Solved] Reformat csv file using python? 
- JassWeb","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/#website"},"datePublished":"2023-01-14T01:02:06+00:00","breadcrumb":{"@id":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/jassweb.com\/solved\/solved-reformat-csv-file-using-python\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/jassweb.com\/solved\/"},{"@type":"ListItem","position":2,"name":"[Solved] Reformat csv file using python?"}]},{"@type":"WebSite","@id":"https:\/\/jassweb.com\/solved\/#website","url":"https:\/\/jassweb.com\/solved\/","name":"JassWeb","description":"Build High-quality Websites","publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/jassweb.com\/solved\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/jassweb.com\/solved\/#organization","name":"Jass Web","url":"https:\/\/jassweb.com\/solved\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/","url":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","contentUrl":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","width":693,"height":132,"caption":"Jass 
Web"},"image":{"@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31","name":"Kirat","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/","url":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1775798750","contentUrl":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1775798750","caption":"Kirat"},"sameAs":["http:\/\/jassweb.com"],"url":"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/"}]}},"_links":{"self":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/30228","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/comments?post=30228"}],"version-history":[{"count":0,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/30228\/revisions"}],"wp:attachment":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/media?parent=30228"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/categories?post=30228"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/tags?post=30228"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}