{"id":18278,"date":"2022-10-30T10:32:28","date_gmt":"2022-10-30T05:02:28","guid":{"rendered":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/"},"modified":"2022-10-30T10:32:28","modified_gmt":"2022-10-30T05:02:28","slug":"solved-search-and-output-with-python-closed","status":"publish","type":"post","link":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/","title":{"rendered":"[Solved] Search and output with Python [closed]"},"content":{"rendered":"<div id=\"answer-34146977\" class=\"answer js-answer accepted-answer js-accepted-answer\" data-answerid=\"34146977\" data-parentid=\"33665160\" data-score=\"0\" data-position-on-page=\"1\" data-highest-scored=\"1\" data-question-has-accepted-highest-score=\"1\" itemprop=\"acceptedAnswer\" itemscope itemtype=\"https:\/\/schema.org\/Answer\">\n<div class=\"post-layout\">\n<div class=\"votecell post-layout--left\"><\/div>\n<div class=\"answercell post-layout--right\">\n<div class=\"s-prose js-post-body\" itemprop=\"text\">\n<p>I don&#8217;t think I fully understand your question. Posting your code and an example file would have been very helpful.<\/p>\n<p>This code counts all entries across all files, then identifies the unique entries in each file. After that, it counts how many files each entry appears in. Finally, it selects only the entries that appear in at least 90% of the files.<\/p>\n<p>Also, this code could have been shorter, but for readability&#8217;s sake I created many variables with long, meaningful names. 
<\/p>\n<p>Please read the comments \ud83d\ude09<\/p>\n<pre class=\"lang-python prettyprint-override\"><code>import os\nfrom collections import Counter\nfrom sys import argv\n\n# adjust your cut point\nPERCENT_CUT = 0.9\n\n# here we are going to save each file's entries, so we can sum them later\nfiles_dict = {}\n\n# total files seems to be the number you'll need to check against count\ntotal_files = 0\n\n# raw total entries, even duplicates\ntotal_entries = 0\n\nunique_entries = 0\n\n# first argument is script name, so have the second one be the folder to search\nsearch_dir = argv[1]\n\n# list everything under search dir - ideally only your input files\n# CHECK HOW TO READ ONLY SPECIFIC FILE types if you have something inside the same folder\nfiles_list = os.listdir(search_dir)\n\ntotal_files = len(files_list)\n\nprint('Files READ:')\n\n# iterate over each file found at given folder\nfor file_name in files_list:\n    print(\"    \"+file_name)\n\n    # os.path.join adds the path separator, so search_dir works with or without a trailing slash\n    file_object = open(os.path.join(search_dir, file_name), 'r')\n\n    # builds a list of entries with 'newline' stripped (a list, so len() works in Python 3)\n    file_entries = [line.strip(\"\\r\\n\") for line in file_object.readlines()]\n\n    # gotta count'em all\n    total_entries += len(file_entries)\n\n    # set doesn't allow duplicate entries\n    entries_set = set(file_entries)\n\n    # creates a dict from the set, setting each key's value to 1\n    file_entries_dict = dict.fromkeys(entries_set, 1)\n\n    # entries dict is now used differently, each key will hold a COUNTER\n    files_dict[file_name] = Counter(file_entries_dict)\n\n    file_object.close()\n\n\nprint(\"\\n\\nALL ENTRIES COUNT: \"+str(total_entries))\n\n# now we create a dict that will hold each unique key's count so we can sum all dicts read from files\nentries_dict = Counter({})\n\nfor file_dict_key, file_dict_value in files_dict.items():\n    print(str(file_dict_key)+\" - \"+str(file_dict_value))\n    entries_dict += file_dict_value\n\nprint(\"\\nUNIQUE ENTRIES COUNT: 
\"+str(len(entries_dict.keys())))\n\n# print(entries_dict)\n\n# 90% from your question\ncut_line = total_files * PERCENT_CUT\nprint(\"\\nNeeds at least \"+str(int(cut_line))+\" entries to be listed below\")\n# output dict is the final dict, where we put entries that were present in at least 90% of the files.\noutput_dict = {}\n# this is PYTHON 3 - CHECK YOUR VERSION as older versions might use iteritems() instead of items() in the line below\nfor entry, count in entries_dict.items():\n    if count &gt;= cut_line:\n        output_dict[entry] = count\n\nprint(output_dict)\n<\/code><\/pre>\n<\/p><\/div>\n<div class=\"mt24\"><\/div>\n<\/div>\n<p>            <span class=\"d-none\" itemprop=\"commentCount\">2<\/span> <\/p><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>I don&#8217;t think I fully understand your question. Posting your code and an example file would have been very helpful. This code will count all entries in all files, then it will identify unique entries per file. After that, it will count each entry&#8217;s occurrence in each file. 
Then, it will select only entries &#8230; <a title=\"[Solved] Search and output with Python [closed]\" class=\"read-more\" href=\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/\" aria-label=\"More on [Solved] Search and output with Python [closed]\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[320],"tags":[933,349],"class_list":["post-18278","post","type-post","status-publish","format-standard","hentry","category-solved","tag-io","tag-python"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>[Solved] Search and output with Python [closed] - JassWeb<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"[Solved] Search and output with Python [closed] - JassWeb\" \/>\n<meta property=\"og:description\" content=\"[ad_1] I don&#8217;t think I fully understand your question. Posting your code and an example file would have been very helpful. This code will count all entries in all files, then it will identify unique entries per file. After that, it will count each entry&#8217;s occurrence in each file. Then, it will select only entries ... 
Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/\" \/>\n<meta property=\"og:site_name\" content=\"JassWeb\" \/>\n<meta property=\"article:published_time\" content=\"2022-10-30T05:02:28+00:00\" \/>\n<meta name=\"author\" content=\"Kirat\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Kirat\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/\"},\"author\":{\"name\":\"Kirat\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\"},\"headline\":\"[Solved] Search and output with Python [closed]\",\"datePublished\":\"2022-10-30T05:02:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/\"},\"wordCount\":104,\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"keywords\":[\"io\",\"python\"],\"articleSection\":[\"Solved\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/\",\"url\":\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/\",\"name\":\"[Solved] Search and output with Python [closed] - 
JassWeb\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#website\"},\"datePublished\":\"2022-10-30T05:02:28+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/jassweb.com\/solved\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"[Solved] Search and output with Python [closed]\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/jassweb.com\/solved\/#website\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"name\":\"JassWeb\",\"description\":\"Build High-quality Websites\",\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/jassweb.com\/solved\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\",\"name\":\"Jass Web\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"contentUrl\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"width\":693,\"height\":132,\"caption\":\"Jass 
Web\"},\"image\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\",\"name\":\"Kirat\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586\",\"contentUrl\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586\",\"caption\":\"Kirat\"},\"sameAs\":[\"http:\/\/jassweb.com\"],\"url\":\"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"[Solved] Search and output with Python [closed] - JassWeb","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/","og_locale":"en_US","og_type":"article","og_title":"[Solved] Search and output with Python [closed] - JassWeb","og_description":"[ad_1] I don&#8217;t think I fully understand your question. Posting your code and an example file would have been very helpful. This code will count all entries in all files, then it will identify unique entries per file. After that, it will count each entry&#8217;s occurrence in each file. Then, it will select only entries ... Read more","og_url":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/","og_site_name":"JassWeb","article_published_time":"2022-10-30T05:02:28+00:00","author":"Kirat","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Kirat","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/#article","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/"},"author":{"name":"Kirat","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31"},"headline":"[Solved] Search and output with Python [closed]","datePublished":"2022-10-30T05:02:28+00:00","mainEntityOfPage":{"@id":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/"},"wordCount":104,"publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"keywords":["io","python"],"articleSection":["Solved"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/","url":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/","name":"[Solved] Search and output with Python [closed] - JassWeb","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/#website"},"datePublished":"2022-10-30T05:02:28+00:00","breadcrumb":{"@id":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/jassweb.com\/solved\/solved-search-and-output-with-python-closed\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/jassweb.com\/solved\/"},{"@type":"ListItem","position":2,"name":"[Solved] Search and output with Python [closed]"}]},{"@type":"WebSite","@id":"https:\/\/jassweb.com\/solved\/#website","url":"https:\/\/jassweb.com\/solved\/","name":"JassWeb","description":"Build High-quality 
Websites","publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/jassweb.com\/solved\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/jassweb.com\/solved\/#organization","name":"Jass Web","url":"https:\/\/jassweb.com\/solved\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/","url":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","contentUrl":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","width":693,"height":132,"caption":"Jass Web"},"image":{"@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31","name":"Kirat","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/","url":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586","contentUrl":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586","caption":"Kirat"},"sameAs":["http:\/\/jassweb.com"],"url":"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/"}]}},"_links":{"self":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/18278","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/c
omments?post=18278"}],"version-history":[{"count":0,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/18278\/revisions"}],"wp:attachment":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/media?parent=18278"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/categories?post=18278"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/tags?post=18278"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}