{"id":4346,"date":"2022-08-22T10:33:13","date_gmt":"2022-08-22T05:03:13","guid":{"rendered":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/"},"modified":"2022-08-22T10:33:13","modified_gmt":"2022-08-22T05:03:13","slug":"solved-predicting-numerical-features-based-on-string-features-using-sk-learn","status":"publish","type":"post","link":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/","title":{"rendered":"[Solved] Predicting numerical features based on string features using sk-learn"},"content":{"rendered":"<div id=\"answer-64079663\" class=\"answer js-answer accepted-answer js-accepted-answer\" data-answerid=\"64079663\" data-parentid=\"64078945\" data-score=\"0\" data-position-on-page=\"1\" data-highest-scored=\"1\" data-question-has-accepted-highest-score=\"1\" itemprop=\"acceptedAnswer\" itemscope itemtype=\"https:\/\/schema.org\/Answer\">\n<div class=\"post-layout\">\n<div class=\"votecell post-layout--left\"><\/div>\n<div class=\"answercell post-layout--right\">\n<div class=\"s-prose js-post-body\" itemprop=\"text\">\n<p>Below is a tested, working version of your code:<\/p>\n<pre><code>import pandas as pd\n\ndata_train = pd.read_csv(r\"train.csv\")\ndata_test = pd.read_csv(r\"test.csv\")\n\n\ncolumns = ['Id', 'HomeTeam', 'AwayTeam', 'Full_Time_Home_Goals']\ncol = ['Id', 'HomeTeam', 'AwayTeam']\ndata_test = data_test[col]\ndata_train = data_train[columns]\n\ndata_train = data_train.dropna()\ndata_test = data_test.dropna()\n\ndata_train['Full_Time_Home_Goals'] = data_train['Full_Time_Home_Goals'].astype(int)\n\nfrom sklearn import preprocessing\n\n\ndef encode_features(df_train, df_test):\n    features = ['HomeTeam', 'AwayTeam']\n    # Fit each encoder on train + test so every team name in either set gets a label\n    df_combined = pd.concat([df_train[features], df_test[features]])\n\n    for feature in features:\n        le = preprocessing.LabelEncoder()\n        le = le.fit(df_combined[feature])\n        
df_train[feature] = le.transform(df_train[feature])\n        df_test[feature] = le.transform(df_test[feature])\n    return df_train, df_test\n\n\ndata_train, data_test = encode_features(data_train, data_test)\nprint(data_train.head())\nprint(data_test.head())\n\n# X_all holds the feature columns used for prediction; y_all holds the one column we want to predict\n\ny_all = data_train['Full_Time_Home_Goals']\nX_all = data_train.drop(['Full_Time_Home_Goals'], axis=1)\n\nfrom sklearn.model_selection import train_test_split\n\nnum_test = 0.20  # 80-20 split\nX_train, X_test, y_train, y_test = train_test_split(X_all, y_all, test_size=num_test, random_state=23)\n\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.metrics import make_scorer, accuracy_score\nfrom sklearn.model_selection import GridSearchCV\n\n# Random Forest, tuned by grid search over the parameters defined below\n\nclf = RandomForestClassifier()\n\n# 'auto' (equivalent to 'sqrt' for classifiers) was removed in scikit-learn 1.3, so it is not listed\nparameters = {'n_estimators': [4, 6, 9],\n              'max_features': ['log2', 'sqrt'],\n              'criterion': ['entropy', 'gini'],\n              'max_depth': [2, 3, 5, 10],\n              'min_samples_split': [2, 3, 5],\n              'min_samples_leaf': [1, 5, 8]\n              }\n\nacc_scorer = make_scorer(accuracy_score)\n\ngrid_obj = GridSearchCV(clf, parameters, scoring=acc_scorer)\ngrid_obj = grid_obj.fit(X_train, y_train)\n\nclf = grid_obj.best_estimator_\n\nclf.fit(X_train, y_train)\n\npredictions = clf.predict(X_test)\n\nprint(accuracy_score(y_test, predictions))\n\nids = data_test['Id']\npredictions = clf.predict(data_test)\n\ndf_preds = pd.DataFrame({\"id\":ids, \"predictions\":predictions})\ndf_preds\n\n# Output: the two head() printouts, the accuracy score, then df_preds\n   Id  HomeTeam  AwayTeam  Full_Time_Home_Goals\n0   1        55       440                     3\n1   2       158       493                     2\n2   3       178       745                     1\n3   4       185       410                     1\n4   5       249        57                     2\n       Id  HomeTeam  
AwayTeam\n0  190748       284        54\n1  190749       124       441\n2  190750       446        57\n3  190751       185       637\n4  190752       749       482\n0.33213786556261704\nid  predictions\n0   190748  1\n1   190749  1\n2   190750  1\n3   190751  1\n4   190752  1\n... ... ...\n375 191123  1\n376 191124  1\n377 191125  1\n378 191126  1\n379 191127  1\n380 rows \u00d7 2 columns\n<\/code><\/pre>\n<\/p><\/div>\n<div class=\"mt24\"><\/div>\n<\/div>\n<p>            <span class=\"d-none\" itemprop=\"commentCount\">6<\/span> <\/p><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>[ad_1] Below is tested and fully working code of yours: data_train = pd.read_csv(r&#8221;train.csv&#8221;) data_test = pd.read_csv(r&#8221;test.csv&#8221;) columns = [&#8216;Id&#8217;, &#8216;HomeTeam&#8217;, &#8216;AwayTeam&#8217;, &#8216;Full_Time_Home_Goals&#8217;] col = [&#8216;Id&#8217;, &#8216;HomeTeam&#8217;, &#8216;AwayTeam&#8217;] data_test = data_test[col] data_train = data_train[columns] data_train = data_train.dropna() data_test = data_test.dropna() data_train[&#8216;Full_Time_Home_Goals&#8217;] = data_train[&#8216;Full_Time_Home_Goals&#8217;].astype(int) from sklearn import preprocessing def encode_features(df_train, df_test): features = [&#8216;HomeTeam&#8217;, &#8216;AwayTeam&#8217;] df_combined = &#8230; <a title=\"[Solved] Predicting numerical features based on string features using sk-learn\" class=\"read-more\" href=\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/\" aria-label=\"More on [Solved] Predicting numerical features based on string features using sk-learn\">Read 
more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[320],"tags":[793,349,792],"class_list":["post-4346","post","type-post","status-publish","format-standard","hentry","category-solved","tag-encoder","tag-python","tag-scikit-learn"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>[Solved] Predicting numerical features based on string features using sk-learn - JassWeb<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"[Solved] Predicting numerical features based on string features using sk-learn - JassWeb\" \/>\n<meta property=\"og:description\" content=\"[ad_1] Below is tested and fully working code of yours: data_train = pd.read_csv(r&quot;train.csv&quot;) data_test = pd.read_csv(r&quot;test.csv&quot;) columns = [&#039;Id&#039;, &#039;HomeTeam&#039;, &#039;AwayTeam&#039;, &#039;Full_Time_Home_Goals&#039;] col = [&#039;Id&#039;, &#039;HomeTeam&#039;, &#039;AwayTeam&#039;] data_test = data_test[col] data_train = data_train[columns] data_train = data_train.dropna() data_test = data_test.dropna() data_train[&#039;Full_Time_Home_Goals&#039;] = data_train[&#039;Full_Time_Home_Goals&#039;].astype(int) from sklearn import preprocessing def encode_features(df_train, df_test): features = [&#039;HomeTeam&#039;, &#039;AwayTeam&#039;] df_combined = ... 
Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/\" \/>\n<meta property=\"og:site_name\" content=\"JassWeb\" \/>\n<meta property=\"article:published_time\" content=\"2022-08-22T05:03:13+00:00\" \/>\n<meta name=\"author\" content=\"Kirat\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Kirat\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/\"},\"author\":{\"name\":\"Kirat\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\"},\"headline\":\"[Solved] Predicting numerical features based on string features using sk-learn\",\"datePublished\":\"2022-08-22T05:03:13+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/\"},\"wordCount\":31,\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"keywords\":[\"encoder\",\"python\",\"scikit-learn\"],\"articleSection\":[\"Solved\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/\",\"url\":\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/\",\"name\":\"[Solved] Predicting numerical 
features based on string features using sk-learn - JassWeb\",\"isPartOf\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#website\"},\"datePublished\":\"2022-08-22T05:03:13+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/jassweb.com\/solved\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"[Solved] Predicting numerical features based on string features using sk-learn\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/jassweb.com\/solved\/#website\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"name\":\"JassWeb\",\"description\":\"Build High-quality Websites\",\"publisher\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/jassweb.com\/solved\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/jassweb.com\/solved\/#organization\",\"name\":\"Jass 
Web\",\"url\":\"https:\/\/jassweb.com\/solved\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"contentUrl\":\"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png\",\"width\":693,\"height\":132,\"caption\":\"Jass Web\"},\"image\":{\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31\",\"name\":\"Kirat\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586\",\"contentUrl\":\"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586\",\"caption\":\"Kirat\"},\"sameAs\":[\"http:\/\/jassweb.com\"],\"url\":\"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"[Solved] Predicting numerical features based on string features using sk-learn - JassWeb","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/","og_locale":"en_US","og_type":"article","og_title":"[Solved] Predicting numerical features based on string features using sk-learn - JassWeb","og_description":"[ad_1] Below is tested and fully working code of yours: data_train = pd.read_csv(r\"train.csv\") data_test = pd.read_csv(r\"test.csv\") columns = ['Id', 'HomeTeam', 'AwayTeam', 'Full_Time_Home_Goals'] col = ['Id', 'HomeTeam', 'AwayTeam'] data_test = data_test[col] data_train = data_train[columns] data_train = data_train.dropna() data_test = data_test.dropna() data_train['Full_Time_Home_Goals'] = data_train['Full_Time_Home_Goals'].astype(int) from sklearn import preprocessing def encode_features(df_train, df_test): features = ['HomeTeam', 'AwayTeam'] df_combined = ... Read more","og_url":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/","og_site_name":"JassWeb","article_published_time":"2022-08-22T05:03:13+00:00","author":"Kirat","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Kirat","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/#article","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/"},"author":{"name":"Kirat","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31"},"headline":"[Solved] Predicting numerical features based on string features using sk-learn","datePublished":"2022-08-22T05:03:13+00:00","mainEntityOfPage":{"@id":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/"},"wordCount":31,"publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"keywords":["encoder","python","scikit-learn"],"articleSection":["Solved"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/","url":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/","name":"[Solved] Predicting numerical features based on string features using sk-learn - 
JassWeb","isPartOf":{"@id":"https:\/\/jassweb.com\/solved\/#website"},"datePublished":"2022-08-22T05:03:13+00:00","breadcrumb":{"@id":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/jassweb.com\/solved\/solved-predicting-numerical-features-based-on-string-features-using-sk-learn\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/jassweb.com\/solved\/"},{"@type":"ListItem","position":2,"name":"[Solved] Predicting numerical features based on string features using sk-learn"}]},{"@type":"WebSite","@id":"https:\/\/jassweb.com\/solved\/#website","url":"https:\/\/jassweb.com\/solved\/","name":"JassWeb","description":"Build High-quality Websites","publisher":{"@id":"https:\/\/jassweb.com\/solved\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/jassweb.com\/solved\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/jassweb.com\/solved\/#organization","name":"Jass Web","url":"https:\/\/jassweb.com\/solved\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/","url":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","contentUrl":"https:\/\/jassweb.com\/wp-content\/uploads\/2021\/02\/jass-website-logo-1.png","width":693,"height":132,"caption":"Jass 
Web"},"image":{"@id":"https:\/\/jassweb.com\/solved\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/65c9c7b7958150c0dc8371fa35dd7c31","name":"Kirat","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/jassweb.com\/solved\/#\/schema\/person\/image\/","url":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586","contentUrl":"https:\/\/jassweb.com\/solved\/wp-content\/litespeed\/avatar\/1261af3c9451399fa1336d28b98ea3bb.jpg?ver=1776403586","caption":"Kirat"},"sameAs":["http:\/\/jassweb.com"],"url":"https:\/\/jassweb.com\/solved\/author\/jaspritsinghghumangmail-com\/"}]}},"_links":{"self":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/4346","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/comments?post=4346"}],"version-history":[{"count":0,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/posts\/4346\/revisions"}],"wp:attachment":[{"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/media?parent=4346"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/categories?post=4346"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jassweb.com\/solved\/wp-json\/wp\/v2\/tags?post=4346"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}