{"id":4393,"date":"2022-08-22T15:47:44","date_gmt":"2022-08-22T10:17:44","guid":{"rendered":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/"},"modified":"2022-08-22T15:47:44","modified_gmt":"2022-08-22T10:17:44","slug":"solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude","status":"publish","type":"post","link":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/","title":{"rendered":"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude"},"content":{"rendered":"<div id=\"answer-35256979\" class=\"answer js-answer accepted-answer js-accepted-answer\" data-answerid=\"35256979\" data-parentid=\"35255712\" data-score=\"1\" data-position-on-page=\"1\" data-highest-scored=\"1\" data-question-has-accepted-highest-score=\"1\" itemprop=\"acceptedAnswer\" itemscope itemtype=\"https:\/\/schema.org\/Answer\">\n<div class=\"post-layout\">\n<div class=\"votecell post-layout--left\"><\/div>\n<div class=\"answercell post-layout--right\">\n<div class=\"s-prose js-post-body\" itemprop=\"text\">\n<p>When you merge your dataframes, you are joining on timestamp values that are not unique, so each row matches every row in the other dataframe that shares its timestamp. As you add more and more currencies, those matches multiply and you get something similar to a Cartesian product rather than a one-to-one join. 
In the snippet below, I sort each dataframe and then drop duplicate timestamps, so every date appears only once before merging.<\/p>\n<pre><code>import pandas as pd\n\ncoins = '''\nBitcoin\nRipple\nEthereum\nLitecoin\nDogecoin\nDash\nPeercoin\nMaidSafeCoin\nStellar\nFactom\nNxt\nBitShares\n'''\n# Drop the empty strings produced by the leading and trailing newlines\ncoins = [c for c in coins.split('\\n') if c]\nAPI = 'https:\/\/api.coinmarketcap.com\/v1\/datapoints\/'\ndata = {}\n\nfor coin in coins:\n    print(coin)\n    try:\n        data[coin] = pd.read_json(API + coin)\n    except Exception:\n        # Skip coins whose data could not be downloaded or parsed\n        pass\n\ndata2 = {}\nfor coin in data:\n    TS = data[coin].market_cap_by_available_supply.map(lambda r: r[0])\n    TS = pd.to_datetime(TS, unit=\"ms\").dt.date\n    cap = data[coin].market_cap_by_available_supply.map(lambda r: r[1])\n    df = pd.DataFrame({'timestamp': TS, coin + '_cap': cap})\n    # sort_values returns a new DataFrame, so the result must be assigned\n    df = df.sort_values(by=['timestamp', coin + '_cap'])\n    # Keep a single row per date so the join key is unique\n    df = df.drop_duplicates(subset=\"timestamp\", keep='last')\n    data2[coin] = df\n\ndf = data2['Bitcoin']\n# dict.keys() is a view without remove() in Python 3, so build a list instead\nkeys = [coin for coin in data2 if coin != 'Bitcoin']\nfor coin in keys:\n    df = pd.merge(left=df, right=data2[coin], on='timestamp', how='left')\n    print(len(df), len(df.columns))\ndf.to_csv('caps.csv')\n<\/code><\/pre>\n<p>EDIT: I have added a table below showing how the size of the result grows as you perform the join operations.<\/p>\n<p>This table shows the number of rows after joining 5, 10, 15, 20, 25, and 30 currencies. 
<\/p>\n<pre><code>Rows Columns\n1015 5\n1255 10\n5095 15\n132071 20\n4195303 25\n16778215 30\n<\/code><\/pre>\n<p>This table shows how removing duplicates makes each join match only a single row, so the row count stays constant.<\/p>\n<pre><code>Rows Columns\n1000 5\n1000 10\n1000 15\n1000 20\n1000 25\n1000 30\n<\/code><\/pre>\n<\/p><\/div>\n<div class=\"mt24\"><\/div>\n<\/div>\n<p>            <span class=\"d-none\" itemprop=\"commentCount\">7<\/span> <\/p><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>When you merge your dataframes, you are doing a join on values that are not unique. When you are joining all these dataframes together, you are getting many matches. As you add more and more currencies you are getting something similar to a Cartesian product rather than a join. In the snippet below, I &#8230; <a title=\"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude\" class=\"read-more\" href=\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/\" aria-label=\"More on [Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[320],"tags":[461,535,415,349],"class_list":["post-4393","post","type-post","status-publish","format-standard","hentry","category-solved","tag-dataframe","tag-memory","tag-pandas","tag-python"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude - 
JassWeb<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude - JassWeb\" \/>\n<meta property=\"og:description\" content=\"[ad_1] When you merge your dataframes, you are doing a join on values that are not unique. When you are joining all these dataframes together, you are getting many matches. As you add more and more currencies you are getting something similar to a Cartesian product rather than a join. In the snippet below, I ... Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/\" \/>\n<meta property=\"og:site_name\" content=\"JassWeb\" \/>\n<meta property=\"article:published_time\" content=\"2022-08-22T10:17:44+00:00\" \/>\n<meta name=\"author\" content=\"Kirat\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Kirat\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/\"},\"author\":{\"name\":\"Kirat\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\"},\"headline\":\"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude\",\"datePublished\":\"2022-08-22T10:17:44+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/\"},\"wordCount\":148,\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"keywords\":[\"dataframe\",\"memory\",\"pandas\",\"python\"],\"articleSection\":[\"Solved\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/\",\"url\":\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/\",\"name\":\"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude - 
JassWeb\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#website\"},\"datePublished\":\"2022-08-22T10:17:44+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/jassweb.com\/solved\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/jassweb.com\/solved\/#website\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"name\":\"JassWeb\",\"description\":\"Build High-quality Websites\",\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/jassweb.com\/solved\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\",\"name\":\"Jass 
Web\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"contentUrl\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"width\":693,\"height\":132,\"caption\":\"Jass Web\"},\"image\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\",\"name\":\"Kirat\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586\",\"contentUrl\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586\",\"caption\":\"Kirat\"},\"sameAs\":[\"http:\/\/jassweb.com\"],\"url\":\"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude - JassWeb","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/","og_locale":"en_US","og_type":"article","og_title":"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude - JassWeb","og_description":"[ad_1] When you merge your dataframes, you are doing a join on values that are not unique. 
When you are joining all these dataframes together, you are getting many matches. As you add more and more currencies you are getting something similar to a Cartesian product rather than a join. In the snippet below, I ... Read more","og_url":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/","og_site_name":"JassWeb","article_published_time":"2022-08-22T10:17:44+00:00","author":"Kirat","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Kirat","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/#article","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/"},"author":{"name":"Kirat","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31"},"headline":"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude","datePublished":"2022-08-22T10:17:44+00:00","mainEntityOfPage":{"@id":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/"},"wordCount":148,"publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"keywords":["dataframe","memory","pandas","python"],"articleSection":["Solved"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/","url":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/","name":"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude - 
JassWeb","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/#website"},"datePublished":"2022-08-22T10:17:44+00:00","breadcrumb":{"@id":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/jassweb.com\/solved\/solved-function-should-clean-data-to-half-the-size-instead-it-enlarges-it-by-an-order-of-magnitude\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/jassweb.com\/solved\/"},{"@type":"ListItem","position":2,"name":"[Solved] Function should clean data to half the size, instead it enlarges it by an order of magnitude"}]},{"@type":"WebSite","@id":"https:\/\/jassweb.com\/solved\/#website","url":"https:\/\/jassweb.com\/solved\/","name":"JassWeb","description":"Build High-quality Websites","publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/jassweb.com\/solved\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/jassweb.com\/solved\/#organization","name":"Jass Web","url":"https:\/\/jassweb.com\/solved\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/","url":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","contentUrl":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","width":693,"height":132,"caption":"Jass 
Web"},"image":{"@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31","name":"Kirat","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/","url":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586","contentUrl":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586","caption":"Kirat"},"sameAs":["http:\/\/jassweb.com"],"url":"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/"}]}},"_links":{"self":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/4393","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/comments?post=4393"}],"version-history":[{"count":0,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/4393\/revisions"}],"wp:attachment":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/media?parent=4393"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/categories?post=4393"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/tags?post=4393"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}