{"id":25528945,"date":"2022-06-17T16:26:03","date_gmt":"2022-06-17T10:56:03","guid":{"rendered":"https:\/\/entri.app\/blog\/?p=25528945"},"modified":"2023-05-23T12:34:41","modified_gmt":"2023-05-23T07:04:41","slug":"automated-feature-engineering-in-python-a-study","status":"publish","type":"post","link":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/","title":{"rendered":"Automated Feature Engineering in Python: A Study"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_79_2 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<label for=\"ez-toc-cssicon-toggle-item-69d73764da8a8\" class=\"ez-toc-cssicon-toggle-label\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/label><input type=\"checkbox\"  id=\"ez-toc-cssicon-toggle-item-69d73764da8a8\"  aria-label=\"Toggle\" \/><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" 
href=\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#What_is_Feature_Engineering\" >What is Feature Engineering?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#Why_is_Feature_Engineering_required\" >Why is Feature Engineering required?<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#Automating_Feature_Engineering\" >Automating Feature Engineering<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#Introduction_to_Featuretools\" >Introduction to Featuretools<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#Implementation_of_Featuretools\" >Implementation of Featuretools<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#Featuretools_Interpretability\" >Featuretools Interpretability<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#End_Notes\" >End Notes<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#Automated_Feature_Engineering_Tools\" >Automated Feature Engineering Tools<\/a><\/li><\/ul><\/nav><\/div>\n<p>In the context of machine learning, a feature can be described as a set of characteristics 
that explains the occurrence of a phenomenon. When these characteristics are converted into some measurable form, they are called features.<\/p>\n<p>For example, assume you have a list of students. This list contains the name of each student, number of hours they studied, their IQ, and their total marks in the previous examinations. 
Now you are given information about a new student: the number of hours they studied and their IQ, but their marks are missing. You have to estimate their probable marks.<\/p>\n<p>Here, you\u2019d use IQ and study_hours to build a predictive model to estimate these missing marks. So, IQ and study_hours are called the features for this model.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-46422\" src=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/features.png\" srcset=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/features.png 624w, https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/features-300x123.png 300w\" alt=\"\" width=\"624\" height=\"256\" \/><\/p>\n<h2 id=\"h2_6\" class=\"target\" data-tocid=\"h2_toc_6\"><span class=\"ez-toc-section\" id=\"What_is_Feature_Engineering\"><\/span><strong>What is Feature Engineering?<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Feature Engineering can simply be defined as the process of creating new features from the existing features in a dataset. 
Let\u2019s consider a sample dataset that has details about a few items, such as their weight and price.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-46424\" src=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/feat_engg_example.png\" srcset=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/feat_engg_example.png 311w, https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/feat_engg_example-300x127.png 300w\" alt=\"\" width=\"311\" height=\"132\" \/><\/p>\n<p>Now, to create a new feature we can use Item_Weight and Item_Price. So, let\u2019s create a feature called Price_per_Weight. It is simply the price of the item divided by its weight. This process is called feature engineering.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-46425\" src=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/feat_engg_example_2.png\" srcset=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/feat_engg_example_2.png 457w, https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/feat_engg_example_2-300x85.png 300w\" alt=\"\" width=\"457\" height=\"130\" \/><\/p>\n<p>This was just a simple example of creating a new feature from existing ones, but in practice, when we have quite a lot of features, feature engineering can become quite complex and cumbersome.<\/p>\n<p>Let\u2019s take another example. 
In the popular Titanic dataset, there is a passenger name feature and below are some of the names in the dataset:<\/p>\n<ul>\n<li>Montvila, Rev. Juozas<\/li>\n<li>Graham, Miss. Margaret Edith<\/li>\n<li>Johnston, Miss. Catherine Helen \u201cCarrie\u201d<\/li>\n<li>Behr, Mr. Karl Howell<\/li>\n<li>Dooley, Mr. Patrick<\/li>\n<\/ul>\n<p>These names can actually be broken down\u00a0into additional meaningful features. For example, we can extract and group similar titles into single categories. Let\u2019s have a look at the unique titles in the passenger names.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-46518 alignleft\" src=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/titanic_titles.png\" srcset=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/titanic_titles.png 160w, https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/titanic_titles-124x300.png 124w\" alt=\"\" width=\"160\" height=\"388\" \/><\/p>\n<p>It turns out that titles like\u00a0\u2018Dona\u2019, \u2018Lady\u2019, \u2018the Countess\u2019, \u2018Capt\u2019, \u2018Col\u2019, \u2018Don\u2019, \u2018Dr\u2019, \u2018Major\u2019, \u2018Rev\u2019, \u2018Sir\u2019, and \u2018Jonkheer\u2019 are quite rare and can be put under a single label. Let\u2019s call it\u00a0<em>rare_title<\/em>. 
Apart from this, the titles \u2018Mlle\u2019 and \u2018Ms\u2019 can be placed under \u2018Miss\u2019, and \u2018Mme\u2019 can be replaced with \u2018Mrs\u2019.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-46528\" src=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/rare_title1.png\" srcset=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/rare_title1.png 430w, https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/rare_title1-300x170.png 300w, https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/rare_title1-257x144.png 257w\" alt=\"\" width=\"430\" height=\"243\" \/><\/p>\n<p>Hence, the new title feature would have only 5 unique values as shown below:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-46531\" src=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/titanic_titles_new.png\" srcset=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/titanic_titles_new.png 360w, https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/titanic_titles_new-300x54.png 300w\" alt=\"\" width=\"360\" height=\"65\" \/><\/p>\n<p>So, this is how we can extract useful information with the help of feature engineering, even from features like passenger names which initially seemed fairly pointless.<\/p>\n<h2 id=\"h2_11\" class=\"target\" data-tocid=\"h2_toc_11\"><span class=\"ez-toc-section\" id=\"Why_is_Feature_Engineering_required\"><\/span><strong>Why is Feature Engineering required?<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The performance of a predictive model is heavily dependent on the 
quality of the features in the dataset used to train that model. If you are able to create new features which help in providing more information to the model about the target variable, its performance will go up. Hence, when we don\u2019t have enough quality features in our dataset, we have to lean on feature engineering.<\/p>\n<p>As explained in this\u00a0article, smart feature engineering was instrumental in securing a place in the top 5 percent of the leaderboard. Some of the features created are given below:<\/p>\n<ol>\n<li><b>Hour Bins<\/b>: A new feature was created by binning the\u00a0<i>hour<\/i>\u00a0feature with the help of a decision tree<\/li>\n<li><b>Temp Bins<\/b>: Similarly, a binned feature for the temperature variable<\/li>\n<li><b>Year Bins<\/b>: 8 quarterly bins were created for a period of 2 years<\/li>\n<li><b>Day Type<\/b>: Days were categorized as \u201cweekday\u201d, \u201cweekend\u201d or \u201choliday\u201d<\/li>\n<\/ol>\n<p>Creating such features is no child&#8217;s play \u2013 it takes a great deal of brainstorming and extensive data exploration. Not everyone is good at feature engineering because it is not something that you can learn just by reading books or watching videos. This is why feature engineering is also called an art. 
If you are good at it, then you have a major edge over the competition.<\/p>\n<h2 id=\"h2_12\" class=\"target\" data-tocid=\"h2_toc_12\"><span class=\"ez-toc-section\" id=\"Automating_Feature_Engineering\"><\/span><strong>Automating Feature Engineering<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-46427\" src=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/automation_car.png\" srcset=\"https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/automation_car.png 576w, https:\/\/cdn.analyticsvidhya.com\/wp-content\/uploads\/2018\/08\/automation_car-300x104.png 300w\" alt=\"\" width=\"576\" height=\"199\" \/><\/p>\n<p>Analyze the two images shown above. The left one shows a car being assembled by a group of men during the early 20th century, and the right picture shows robots doing the same job in today\u2019s world. Automating any process has the potential to make it much more efficient and cost-effective.<\/p>\n<p>Building machine learning models can often be a painstaking process. It involves many steps, so if we are able to automate a certain percentage of feature engineering tasks, then the data scientists or the domain experts can focus on other aspects of the model.<\/p>\n<p>Now that we have understood that automating feature engineering is the need of the hour, the next question to ask is \u2013 how is it going to happen? Well, we have a great tool to address this issue and it\u2019s called Featuretools.<\/p>\n<h2 id=\"h2_13\" class=\"target\" data-tocid=\"h2_toc_13\"><span class=\"ez-toc-section\" id=\"Introduction_to_Featuretools\"><\/span><strong>Introduction to Featuretools<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Featuretools is an open source library for performing automated feature engineering. 
It is designed to fast-forward the feature generation process, thereby giving more time to focus on other aspects of machine learning model building. In other words, it makes your data \u201cmachine learning ready\u201d.<\/p>\n<p>Before taking Featuretools for a spin, there are three major components of the package that we should be aware of:<\/p>\n<ul>\n<li>Entities<\/li>\n<li>Deep Feature Synthesis (DFS)<\/li>\n<li>Feature primitives<\/li>\n<\/ul>\n<p>a) An\u00a0<strong>Entity<\/strong>\u00a0can be considered as a representation of a Pandas DataFrame. A collection of multiple entities is called an\u00a0<strong>EntitySet<\/strong>.<\/p>\n<p>b)\u00a0<strong>Deep Feature Synthesis<\/strong> (DFS) is a feature engineering method and is the backbone of Featuretools. It enables the creation of new features from single, as well as multiple dataframes.<\/p>\n<p>c) DFS creates features by applying\u00a0<strong>Feature primitives<\/strong>\u00a0to the entity relationships in an EntitySet. These primitives are the operations commonly used to generate features manually. For example, the primitive \u201cmean\u201d would find the mean of a variable at an aggregated level.<\/p>\n<p>The best way to understand and become comfortable with Featuretools is by applying it to a dataset.<\/p>\n<h2 id=\"h2_14\" class=\"target\" data-tocid=\"h2_toc_14\"><span class=\"ez-toc-section\" id=\"Implementation_of_Featuretools\"><\/span><strong>Implementation of Featuretools<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The objective of the BigMart Sales challenge is to build a predictive model to estimate the sales of each product at a particular store. 
This would help the decision makers at BigMart to find out which properties of a product or store play a key role in increasing overall sales. Note that there are 1559 products across 10 stores in the given dataset.<\/p>\n<p>The table below shows the features provided in our data:<\/p>\n<table>\n<tbody>\n<tr>\n<th>Variable<\/th>\n<th>Description<\/th>\n<\/tr>\n<tr>\n<td>Item_Identifier<\/td>\n<td>Unique product ID<\/td>\n<\/tr>\n<tr>\n<td>Item_Weight<\/td>\n<td>Weight of product<\/td>\n<\/tr>\n<tr>\n<td>Item_Fat_Content<\/td>\n<td>Whether the product is low fat or not<\/td>\n<\/tr>\n<tr>\n<td>Item_Visibility<\/td>\n<td>The % of total display area of all products in a store allocated to the particular product<\/td>\n<\/tr>\n<tr>\n<td>Item_Type<\/td>\n<td>The category to which the product belongs<\/td>\n<\/tr>\n<tr>\n<td>Item_MRP<\/td>\n<td>Maximum Retail Price (list price) of the product<\/td>\n<\/tr>\n<tr>\n<td>Outlet_Identifier<\/td>\n<td>Unique store ID<\/td>\n<\/tr>\n<tr>\n<td>Outlet_Establishment_Year<\/td>\n<td>The year in which the store was established<\/td>\n<\/tr>\n<tr>\n<td>Outlet_Size<\/td>\n<td>The size of the store in terms of ground area covered<\/td>\n<\/tr>\n<tr>\n<td>Outlet_Location_Type<\/td>\n<td>The type of city in which the store is located<\/td>\n<\/tr>\n<tr>\n<td>Outlet_Type<\/td>\n<td>Whether the outlet is just a grocery store or some sort of supermarket<\/td>\n<\/tr>\n<tr>\n<td>Item_Outlet_Sales<\/td>\n<td>Sales of the product in the particular store. This is the outcome variable to be predicted.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2 id=\"h2_15\" class=\"target\" data-tocid=\"h2_toc_15\"><span class=\"ez-toc-section\" id=\"Featuretools_Interpretability\"><\/span><strong>Featuretools Interpretability<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Making our data science solutions interpretable is a very important aspect of performing machine learning. Features generated by Featuretools can be easily explained even to a non-technical person because they are based on the primitives, which are easy to understand.<\/p>\n<p>For example, the features\u00a0<em>outlet.SUM(bigmart.Item_Weight)<\/em>\u00a0and\u00a0<em>outlet.STD(bigmart.Item_MRP)<\/em>\u00a0represent the outlet-level sum of the weights of the items and the outlet-level standard deviation of the price (MRP) of the items, respectively.<\/p>\n<p>This makes it possible for people who are not machine learning experts to contribute their domain expertise as well.<\/p>\n<h2 id=\"h2_16\" class=\"target\" data-tocid=\"h2_toc_16\"><span class=\"ez-toc-section\" id=\"End_Notes\"><\/span><strong>End Notes<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The Featuretools package is truly a game-changer in machine learning. While its applications are understandably still limited in industry use cases, it has quickly become very popular in machine learning competitions. 
The amount of time it saves, and the usefulness of the features it generates, have truly won me over.<\/p>\n<article>\n<div class=\"l\">\n<div class=\"l\">\n<section>\n<div class=\"ip iq ir is it\">\n<div class=\"\">\n<h2 id=\"39c6\" class=\"pw-post-title iu iv iw bn ix iy iz ja jb jc jd je jf jg jh ji jj jk jl jm jn jo jp jq jr js fy\"><span class=\"ez-toc-section\" id=\"Automated_Feature_Engineering_Tools\"><\/span><strong>Automated Feature Engineering Tools<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<\/div>\n<\/div>\n<div class=\"o dz jt ju ib jv\" role=\"separator\"><\/div>\n<div class=\"ip iq ir is it\">\n<p id=\"a4fa\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">Feature engineering is a technique for converting raw data columns into something meaningful that can help predict the outcome in a machine learning task. It can be very tedious and is often the most time-consuming step in the machine learning life cycle.<\/p>\n<\/div>\n<div class=\"o dz jt ju ib jv\" role=\"separator\">Fortunately, there are tools that automate the feature engineering process and create a large pool of features in a very short time for both classification and regression tasks.<\/div>\n<div class=\"ip iq ir is it\">\n<p id=\"de1e\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">The following tools automate the feature engineering process and create a large number of features for both relational and non-relational data. 
While some of them only perform feature engineering, others also perform feature selection.<\/p>\n<p id=\"f9fa\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">Here is a list of some of the best tools available:<\/p>\n<p id=\"3bca\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\"><strong class=\"kc ix\">1.<\/strong>\u00a0<strong class=\"kc ix\">FeatureTools<\/strong><\/p>\n<p id=\"7d04\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\"><strong class=\"kc ix\">2.<\/strong>\u00a0<strong class=\"kc ix\">AutoFeat<\/strong><\/p>\n<p id=\"2ead\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\"><strong class=\"kc ix\">3.<\/strong>\u00a0<strong class=\"kc ix\">TsFresh<\/strong><\/p>\n<p id=\"7a73\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\"><strong class=\"kc ix\">4.<\/strong>\u00a0<strong class=\"kc ix\">Cognito<\/strong><\/p>\n<p id=\"d14c\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\"><strong class=\"kc ix\">5.<\/strong>\u00a0<strong class=\"kc ix\">OneBM<\/strong><\/p>\n<p id=\"a748\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\"><strong class=\"kc ix\">6.<\/strong>\u00a0<strong class=\"kc ix\">ExploreKit<\/strong><\/p>\n<p id=\"495f\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" 
data-selectable-paragraph=\"\"><strong class=\"kc ix\">7.<\/strong>\u00a0<strong class=\"kc ix\">PyFeat<\/strong><\/p>\n<h3 id=\"5cfe\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong class=\"kc ix\">FeatureTools\u00a0<\/strong><\/h3>\n<p id=\"f7d1\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">One of the most popular Python libraries for automated feature engineering is\u00a0FeatureTools, which generates a large feature set using\u00a0\u201cdeep feature synthesis\u201d. This library is targeted towards relational data, where features can be created through aggregations or transformations. DFS requires structured and relational data for creating new features.<\/p>\n<p id=\"7aa7\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">There are 2 main components of FeatureTools:<\/p>\n<p id=\"ce66\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7\u00a0<strong class=\"kc ix\">Entity and Entity-Set<\/strong>\u00a0: An Entity can be thought of as a single dataframe, while an Entity-Set is a collection of more than one dataframe.<\/p>\n<p id=\"0756\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7\u00a0<strong class=\"kc ix\">Primitives<\/strong>\u00a0: These are basic operations like mean, mode, max etc. that can be applied to the data. 
A primitive can be either a\u00a0transformation or an aggregation.<\/p>\n<h4 id=\"b788\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong>Advantages\u00a0<\/strong><\/h4>\n<p id=\"566f\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">1) It is the most popular tool, so a lot of resources are available.<\/p>\n<p id=\"ac48\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">2) We can specify the variable types.<\/p>\n<p id=\"eadb\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">3) Custom primitives can be created.<\/p>\n<p id=\"5b5d\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">4) Expanding features with respect to time can be created.<\/p>\n<p id=\"e9ba\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">5) Best at handling relational databases.<\/p>\n<h4 id=\"8bf3\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong>Limitations\u00a0<\/strong><\/h4>\n<p id=\"fd45\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">1) Creates a large number of features, which can lead to the curse of dimensionality.<\/p>\n<p id=\"f8cb\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">2) For databases which are not relational, we will have to use normalization.<\/p>\n<p id=\"636a\" 
class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">3) Does not have support for unstructured data.<\/p>\n<p id=\"2bc9\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">4) The extracted features are basic statistical features, aggregated independently of other columns or the target variable.<\/p>\n<h3 id=\"a8d4\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong>AutoFeat\u00a0<\/strong><\/h3>\n<p id=\"9399\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">AutoFeat is a Python library that automates feature engineering and feature selection along with fitting a linear regression model. A linear regression model is generally used to keep the process more explainable.<\/p>\n<p id=\"1f68\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">AutoFeat is not meant for relational data, found in many business application areas, but was rather built with scientific use cases in mind, where experimental measurements would instead be stored in a single table. 
For this reason, AutoFeat\u00a0also makes it possible to specify the units of the input variables to prevent the creation of physically nonsensical features.<\/p>\n<h4 id=\"0a83\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong class=\"kc ix\">Advantages\u00a0<\/strong><\/h4>\n<p id=\"6b6d\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Only open source framework for general-purpose automated feature engineering that does not require relational data.<\/p>\n<p id=\"89eb\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Also does feature selection to reduce the dimensionality problem.<\/p>\n<p id=\"8028\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Does not create physically nonsensical features and is hence useful for logistic data.<\/p>\n<h4 id=\"430e\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong class=\"kc ix\">Limitations<\/strong><\/h4>\n<p id=\"e328\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Not good at handling relational data.<\/p>\n<p id=\"e329\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" 
data-selectable-paragraph=\"\">\u00b7 Only makes simple features such as ratios, products and other basic transformations.<\/p>\n<p id=\"b9d4\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Does not consider feature interactions when making new features.<\/p>\n<p id=\"9a7c\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Originally fit an automated model only for regression problems, not for classification; later releases also provide a classifier.<\/p>\n<h3 id=\"78d2\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong>TsFresh\u00a0<\/strong><\/h3>\n<p id=\"37d7\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">TsFresh, which stands for \u201cTime Series Feature extraction based on scalable hypothesis tests\u201d, is a Python package for time series analysis that contains feature extraction methods and a feature selection algorithm. 
Currently, it automatically extracts\u00a064 features\u00a0from time series data that describe both basic and complex characteristics of a time series (such as the number of peaks, average value, maximum value,\u00a0time reversal symmetry statistic, etc.), and those features can be used to build regression- or classification-based machine learning models.<\/p>\n<h4 id=\"54da\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong class=\"kc ix\">Advantages<\/strong><\/h4>\n<p id=\"9d51\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 A leading open source Python tool for time series classification and regression.<\/p>\n<p id=\"e64b\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Can be easily integrated with Featuretools.<\/p>\n<h4 id=\"8cda\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong class=\"kc ix\">Limitations<\/strong><\/h4>\n<p id=\"0ed7\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Can only be used for time-series data, and even then is suited mainly to supervised learning.<\/p>\n<h3 id=\"3298\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong>PyFeat\u00a0<\/strong><\/h3>\n<p id=\"14c9\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\"><strong class=\"kc ix\">PyFeat<\/strong>\u00a0is a practical and easy-to-use toolkit implemented in Python for extracting various features from proteins, DNAs and RNAs.<\/p>\n<p id=\"678b\" 
class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">It can only be used with genomic data, so it is of limited use for everyday classification and regression tasks, but it is useful in the pharmaceutical industry.<\/p>\n<h3 id=\"0c9b\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong>ExploreKit\u00a0<\/strong><\/h3>\n<p id=\"9a2e\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">ExploreKit is based on the intuition that highly informative features often result from manipulations of elementary ones. It identifies common operators that transform each feature individually or combine several features together. It uses these operators to generate many candidate features, and chooses the subset to add based on the empirical performance of models trained with the candidate features added.<\/p>\n<p id=\"a526\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">The generation and selection process is as follows:<\/p>\n<figure class=\"kz la lb lc gv ld gj gk paragraph-image\">\n<div class=\"gj gk lh\"><img loading=\"lazy\" decoding=\"async\" class=\"cf le lf\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/1318\/1*cF-MjVs5VDgwPD2FTRXSUg.png\" alt=\"\" width=\"659\" height=\"519\" \/><\/div>\n<\/figure>\n<h4 id=\"b68a\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong class=\"kc ix\">Advantages\u00a0<\/strong><\/h4>\n<p id=\"9718\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Uses meta learning to rank 
candidate features rather than running feature selection on all created features, of which there can sometimes be a very large number.<\/p>\n<h4 id=\"ba95\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong class=\"kc ix\">Limitations\u00a0<\/strong><\/h4>\n<p id=\"4234\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 No open source implementation in either Python or R.<\/p>\n<h3 id=\"a7c7\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong>OneBM\u00a0<\/strong><\/h3>\n<p id=\"3d62\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">OneBM works directly with multiple raw tables in a database. It joins the tables incrementally, following different paths on the relational graph. It automatically identifies the data types of the joined results, including simple data types (numerical or categorical) and complex data types (sets of numbers, sets of categories, sequences, time series and texts), and applies the corresponding predefined feature engineering techniques to the given types. Because of this design, new feature engineering techniques can be plugged in through an interface with its feature extractor modules to extract the desired types of features in a specific domain.<\/p>\n<p id=\"03c9\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">Feature selection is used to remove irrelevant features extracted in the prior steps. First, duplicated features are removed. Second, if the training and test data have an implicit order defined by a column, e.g. 
timestamp, then drift features are detected by comparing the distributions of the feature values in the training set and a validation set. If the two distributions differ, the feature is identified as a drift feature, which may cause overfitting. All drift features are removed from the feature set.<\/p>\n<h4 id=\"409b\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong>Advantages\u00a0<\/strong><\/h4>\n<p id=\"6bda\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Works well with both relational and non-relational data.<\/p>\n<p id=\"0e84\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Generates simple as well as complex features.<\/p>\n<p id=\"8341\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Can also be used to create features for big data.<\/p>\n<h4 id=\"7005\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong class=\"kc ix\">Limitations\u00a0<\/strong><\/h4>\n<p id=\"d452\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 No open source implementation.<\/p>\n<h3 id=\"1d41\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip 
fy\"><strong class=\"kc ix\">Cognito\u00a0<\/strong><\/h3>\n<p id=\"1682\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">Cognito is a system that automates feature engineering from a single database table. In each step, it recursively applies a set of predefined mathematical transformations to the table\u2019s columns to obtain new features from the original table. As a result, the number of features grows exponentially with the number of steps. Therefore, a feature selection strategy was proposed to remove redundant features. It was reported to improve prediction accuracy on UCI datasets.<\/p>\n<h4 id=\"5ac5\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\"><strong class=\"kc ix\">Limitations\u00a0<\/strong><\/h4>\n<p id=\"9415\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 No open source implementation.<\/p>\n<p id=\"535b\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">\u00b7 Extra effort is needed with relational data.<\/p>\n<p id=\"6cea\" class=\"pw-post-body-paragraph ka kb iw kc b kd ke kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ip fy\" data-selectable-paragraph=\"\">These techniques let us generate a large number of features without investing much time, so we can focus more on other aspects of machine learning, such as modelling and deployment at scale.<\/p>\n<h4><strong>Related Articles<\/strong><\/h4>\n<div class=\"table-responsive wprt_style_display\">\n<div class=\"table-responsive wprt_style_display\">\n<div class=\"table-responsive wprt_style_display\">\n<div class=\"table-responsive wprt_style_display\">\n<table class=\"table\" dir=\"ltr\" border=\"1\" cellspacing=\"0\" cellpadding=\"0\">\n<colgroup>\n<col width=\"329\" \/>\n<col width=\"309\" \/><\/colgroup>\n<tbody>\n<tr>\n<td data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Kerala PSC VFA Syllabus&quot;}\" data-sheets-hyperlink=\"https:\/\/entri.app\/blog\/kerala-psc-village-field-assistant-vfa-syllabus-exam-pattern\/\"><strong><a class=\"in-cell-link\" href=\"https:\/\/entri.app\/blog\/step-by-step-guide-for-getting-a-job-as-a-python-developer\/42\" target=\"_blank\" rel=\"noopener\">A Step-by-Step Guide for Getting a Job as a Python Developer<\/a><\/strong><\/td>\n<td data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Kerala PSC VFA Mock Test&quot;}\" data-sheets-hyperlink=\"https:\/\/entri.app\/blog\/kerala-psc-vfa-free-mock-test\/\"><strong><a class=\"in-cell-link\" href=\"https:\/\/entri.app\/blog\/why-python-is-used-for-data-science\/\" target=\"_blank\" rel=\"noopener\">Why Python Is Used For Data Science?<\/a><\/strong><\/td>\n<\/tr>\n<tr>\n<td 
data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Kerala PSC VFA Exam Date&quot;}\" data-sheets-hyperlink=\"https:\/\/entri.app\/blog\/kerala-psc-vfa-exam-date\/\"><strong><a class=\"in-cell-link\" href=\"https:\/\/entri.app\/blog\/step-by-step-guide-for-getting-a-job-as-a-python-developer\/\" target=\"_blank\" rel=\"noopener\">Guide for getting a job as a Python Developer<\/a><\/strong><\/td>\n<td data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Kerala PSC VFA Video Course&quot;}\"><strong><a class=\"in-cell-link\" href=\"https:\/\/entri.app\/blog\/top-python-interview-questions-and-answers\/\" target=\"_blank\" rel=\"noopener\">Python Advanced Interview Questions and Answers<\/a><\/strong><\/td>\n<\/tr>\n<tr>\n<td data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Kerala PSC VFA Application Form&quot;}\" data-sheets-hyperlink=\"https:\/\/entri.app\/blog\/kerala-psc-vfa-apply-online\/\"><strong><a class=\"in-cell-link\" href=\"https:\/\/entri.app\/blog\/python-online-course\/\" target=\"_blank\" rel=\"noopener\">Best Online Python Course with Certificate<\/a><\/strong><\/td>\n<td data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Kerala PSC VFA Study Materials&quot;}\" data-sheets-hyperlink=\"https:\/\/entri.app\/blog\/kerala-psc-vfa-study-material\/\"><strong><a class=\"in-cell-link\" href=\"https:\/\/entri.app\/blog\/type-conversion-in-python\/\" target=\"_blank\" rel=\"noopener\">What is Type Conversion in Python?<\/a><\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/section>\n<\/div>\n<\/div>\n<\/article>\n","protected":false},"excerpt":{"rendered":"<p>In the context of machine learning, a feature can be described as\u00a0 a set of characteristics, that explains the occurrence of a phenomenon. When these characteristics are converted into some measurable form, they are called features. For example, assume you have a list of students. 
This list contains the name of each student, number of [&hellip;]<\/p>\n","protected":false},"author":111,"featured_media":25528992,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[802,1888],"tags":[],"class_list":["post-25528945","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-articles","category-python-programming"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Automated Feature Engineering in Python: A Study - Entri Blog<\/title>\n<meta name=\"description\" content=\"Automated Feature Engineering is a technique that pulls out useful and meaningful features using a framework that can be applied to any problem\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Automated Feature Engineering in Python: A Study - Entri Blog\" \/>\n<meta property=\"og:description\" content=\"Automated Feature Engineering is a technique that pulls out useful and meaningful features using a framework that can be applied to any problem\" \/>\n<meta property=\"og:url\" content=\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/\" \/>\n<meta property=\"og:site_name\" content=\"Entri Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/entri.me\/\" \/>\n<meta property=\"article:published_time\" content=\"2022-06-17T10:56:03+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-05-23T07:04:41+00:00\" \/>\n<meta property=\"og:image\" 
content=\"https:\/\/entri.app\/blog\/wp-content\/uploads\/2022\/06\/Automated-Feature-Engineering-in-Python-A-Study..png\" \/>\n\t<meta property=\"og:image:width\" content=\"820\" \/>\n\t<meta property=\"og:image:height\" content=\"615\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Feeba Mahin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@entri_app\" \/>\n<meta name=\"twitter:site\" content=\"@entri_app\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Feeba Mahin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"13 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/\"},\"author\":{\"name\":\"Feeba Mahin\",\"@id\":\"https:\/\/entri.app\/blog\/#\/schema\/person\/f036dab84abae3dcc9390a1110d95d36\"},\"headline\":\"Automated Feature Engineering in Python: A Study\",\"datePublished\":\"2022-06-17T10:56:03+00:00\",\"dateModified\":\"2023-05-23T07:04:41+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/\"},\"wordCount\":2618,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/entri.app\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/entri.app\/blog\/wp-content\/uploads\/2022\/06\/Automated-Feature-Engineering-in-Python-A-Study..png\",\"articleSection\":[\"Articles\",\"Python 
Programming\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/\",\"url\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/\",\"name\":\"Automated Feature Engineering in Python: A Study - Entri Blog\",\"isPartOf\":{\"@id\":\"https:\/\/entri.app\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/entri.app\/blog\/wp-content\/uploads\/2022\/06\/Automated-Feature-Engineering-in-Python-A-Study..png\",\"datePublished\":\"2022-06-17T10:56:03+00:00\",\"dateModified\":\"2023-05-23T07:04:41+00:00\",\"description\":\"Automated Feature Engineering is a technique that pulls out useful and meaningful features using a framework that can be applied to any 
problem\",\"breadcrumb\":{\"@id\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#primaryimage\",\"url\":\"https:\/\/entri.app\/blog\/wp-content\/uploads\/2022\/06\/Automated-Feature-Engineering-in-Python-A-Study..png\",\"contentUrl\":\"https:\/\/entri.app\/blog\/wp-content\/uploads\/2022\/06\/Automated-Feature-Engineering-in-Python-A-Study..png\",\"width\":820,\"height\":615},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/entri.app\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Python Programming\",\"item\":\"https:\/\/entri.app\/blog\/category\/python-programming\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Automated Feature Engineering in Python: A Study\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/entri.app\/blog\/#website\",\"url\":\"https:\/\/entri.app\/blog\/\",\"name\":\"Entri Blog\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/entri.app\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/entri.app\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/entri.app\/blog\/#organization\",\"name\":\"Entri 
App\",\"url\":\"https:\/\/entri.app\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/entri.app\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/entri.app\/blog\/wp-content\/uploads\/2019\/10\/Entri-Logo-1.png\",\"contentUrl\":\"https:\/\/entri.app\/blog\/wp-content\/uploads\/2019\/10\/Entri-Logo-1.png\",\"width\":989,\"height\":446,\"caption\":\"Entri App\"},\"image\":{\"@id\":\"https:\/\/entri.app\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/entri.me\/\",\"https:\/\/x.com\/entri_app\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/entri.app\/blog\/#\/schema\/person\/f036dab84abae3dcc9390a1110d95d36\",\"name\":\"Feeba Mahin\",\"url\":\"https:\/\/entri.app\/blog\/author\/feeba123\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Automated Feature Engineering in Python: A Study - Entri Blog","description":"Automated Feature Engineering is a technique that pulls out useful and meaningful features using a framework that can be applied to any problem","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/","og_locale":"en_US","og_type":"article","og_title":"Automated Feature Engineering in Python: A Study - Entri Blog","og_description":"Automated Feature Engineering is a technique that pulls out useful and meaningful features using a framework that can be applied to any problem","og_url":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/","og_site_name":"Entri 
Blog","article_publisher":"https:\/\/www.facebook.com\/entri.me\/","article_published_time":"2022-06-17T10:56:03+00:00","article_modified_time":"2023-05-23T07:04:41+00:00","og_image":[{"width":820,"height":615,"url":"https:\/\/entri.app\/blog\/wp-content\/uploads\/2022\/06\/Automated-Feature-Engineering-in-Python-A-Study..png","type":"image\/png"}],"author":"Feeba Mahin","twitter_card":"summary_large_image","twitter_creator":"@entri_app","twitter_site":"@entri_app","twitter_misc":{"Written by":"Feeba Mahin","Est. reading time":"13 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#article","isPartOf":{"@id":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/"},"author":{"name":"Feeba Mahin","@id":"https:\/\/entri.app\/blog\/#\/schema\/person\/f036dab84abae3dcc9390a1110d95d36"},"headline":"Automated Feature Engineering in Python: A Study","datePublished":"2022-06-17T10:56:03+00:00","dateModified":"2023-05-23T07:04:41+00:00","mainEntityOfPage":{"@id":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/"},"wordCount":2618,"commentCount":0,"publisher":{"@id":"https:\/\/entri.app\/blog\/#organization"},"image":{"@id":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#primaryimage"},"thumbnailUrl":"https:\/\/entri.app\/blog\/wp-content\/uploads\/2022\/06\/Automated-Feature-Engineering-in-Python-A-Study..png","articleSection":["Articles","Python Programming"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/","url":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/","name":"Automated Feature Engineering in Python: A Study - 
Entri Blog","isPartOf":{"@id":"https:\/\/entri.app\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#primaryimage"},"image":{"@id":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#primaryimage"},"thumbnailUrl":"https:\/\/entri.app\/blog\/wp-content\/uploads\/2022\/06\/Automated-Feature-Engineering-in-Python-A-Study..png","datePublished":"2022-06-17T10:56:03+00:00","dateModified":"2023-05-23T07:04:41+00:00","description":"Automated Feature Engineering is a technique that pulls out useful and meaningful features using a framework that can be applied to any problem","breadcrumb":{"@id":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#primaryimage","url":"https:\/\/entri.app\/blog\/wp-content\/uploads\/2022\/06\/Automated-Feature-Engineering-in-Python-A-Study..png","contentUrl":"https:\/\/entri.app\/blog\/wp-content\/uploads\/2022\/06\/Automated-Feature-Engineering-in-Python-A-Study..png","width":820,"height":615},{"@type":"BreadcrumbList","@id":"https:\/\/entri.app\/blog\/automated-feature-engineering-in-python-a-study\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/entri.app\/blog\/"},{"@type":"ListItem","position":2,"name":"Python Programming","item":"https:\/\/entri.app\/blog\/category\/python-programming\/"},{"@type":"ListItem","position":3,"name":"Automated Feature Engineering in Python: A Study"}]},{"@type":"WebSite","@id":"https:\/\/entri.app\/blog\/#website","url":"https:\/\/entri.app\/blog\/","name":"Entri 
Blog","description":"","publisher":{"@id":"https:\/\/entri.app\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/entri.app\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/entri.app\/blog\/#organization","name":"Entri App","url":"https:\/\/entri.app\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/entri.app\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/entri.app\/blog\/wp-content\/uploads\/2019\/10\/Entri-Logo-1.png","contentUrl":"https:\/\/entri.app\/blog\/wp-content\/uploads\/2019\/10\/Entri-Logo-1.png","width":989,"height":446,"caption":"Entri App"},"image":{"@id":"https:\/\/entri.app\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/entri.me\/","https:\/\/x.com\/entri_app"]},{"@type":"Person","@id":"https:\/\/entri.app\/blog\/#\/schema\/person\/f036dab84abae3dcc9390a1110d95d36","name":"Feeba 
Mahin","url":"https:\/\/entri.app\/blog\/author\/feeba123\/"}]}},"_links":{"self":[{"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/posts\/25528945","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/users\/111"}],"replies":[{"embeddable":true,"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/comments?post=25528945"}],"version-history":[{"count":13,"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/posts\/25528945\/revisions"}],"predecessor-version":[{"id":25560530,"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/posts\/25528945\/revisions\/25560530"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/media\/25528992"}],"wp:attachment":[{"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/media?parent=25528945"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/categories?post=25528945"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/entri.app\/blog\/wp-json\/wp\/v2\/tags?post=25528945"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}