Creating New Features: Mathematical Transforms, Binning, and Date/Time Features

Feature engineering is the process of creating useful input variables from raw data so that predictive models can learn better patterns. A strong feature can make a simple model perform well, while weak features can limit even advanced algorithms.

In this chapter, we will learn how to create new features using mathematical transformations, ratios, interaction features, binning, and date/time extraction.

What is Feature Engineering?

Feature engineering means transforming raw data into meaningful model inputs. These new or modified variables help machine learning algorithms understand hidden patterns more clearly.

For example, a raw date column like order_date may not be directly useful. But from it, we can create features such as order month, weekday, weekend flag, festival season, and days since last purchase.

Core Idea: Predictive models do not understand business context automatically. Feature engineering helps convert business understanding into numerical or categorical signals that models can learn from.

Why Creating New Features Matters

  • Reveals Hidden Patterns: New features can expose relationships that are not obvious in raw columns.
  • Improves Model Accuracy: Better features often improve prediction performance more than switching algorithms does.
  • Supports Simpler Models: Well-designed features can help linear models capture useful patterns more effectively.
  • Connects Data to Business Logic: Features such as customer tenure, average order value, and risk ratio reflect real business behaviour.

Feature Creation Workflow

Practical Feature Engineering Pipeline: Understand Business Problem → Explore Raw Variables → Create Candidate Features → Validate with EDA → Test Model Impact

Three Major Feature Creation Techniques

Feature Creation at a Glance

  • Mathematical Transform
  • Binning (for example, Low, Medium, High)
  • Date/Time Features

1. Mathematical Transformations

Mathematical transformations change the scale, shape, or meaning of numerical variables. They are useful when data is skewed, has large differences in magnitude, or contains relationships that become clearer after transformation.

Transformation | What It Does | Best Used When | Example
Log Transform | Compresses large values and reduces right skew. | Data has long right tail or extreme high values. | Transform income, sales, transaction amount, or house price.
Square Root Transform | Moderately compresses large values. | Count data or moderately skewed numerical variables. | Transform number of visits or number of complaints.
Power Transform | Changes the relationship between variable and target. | Non-linear patterns are present. | Create area squared for property price modelling.
Ratio Feature | Compares two quantities meaningfully. | Relative value is more useful than raw value. | Debt-to-income ratio, profit margin, conversion rate.
Difference Feature | Measures gap between two variables. | The difference itself carries business meaning. | Delivery delay = actual delivery date − promised delivery date.
Interaction Feature | Combines two variables to capture joint effect. | Impact of one feature depends on another. | Discount × customer segment, price × quantity.
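A few of the transformations in the table can be sketched with pandas and NumPy. This is a minimal illustration, not a prescribed recipe: the DataFrame and its column names (`visits`, `area_sqft`, `promised_date`, `actual_date`) are hypothetical and exist only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names are assumptions made for this sketch.
df = pd.DataFrame({
    "visits": [0, 1, 4, 9, 100],
    "area_sqft": [500.0, 800.0, 1200.0, 2000.0, 3500.0],
    "promised_date": pd.to_datetime(["2024-01-05"] * 5),
    "actual_date": pd.to_datetime(
        ["2024-01-04", "2024-01-05", "2024-01-07", "2024-01-05", "2024-01-10"]
    ),
})

# Square root transform: moderately compresses large counts.
df["visits_sqrt"] = np.sqrt(df["visits"])

# Power transform: a squared term lets a linear model fit a curved relationship.
df["area_sq"] = df["area_sqft"] ** 2

# Difference feature: delivery delay in days (negative means early delivery).
df["delivery_delay_days"] = (df["actual_date"] - df["promised_date"]).dt.days
```

Whether a squared term or a square root is the better choice depends on the shape seen during EDA, so each candidate is worth plotting against the target before keeping it.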

Log Transformation

Log transformation is useful when a variable has many small values and a few extremely large values. It reduces skewness and makes the distribution more manageable for many models.

New Feature = log(Original Value + 1)
Adding 1 keeps the result defined when the original value is zero, because log(0) is undefined. The original values must still be non-negative.

For example, customer spending may range from ₹100 to ₹10,00,000. A log transform can reduce the dominance of extremely high spenders while preserving order.
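The spending example can be sketched with NumPy's `log1p`, which computes log(1 + x). The spend values below are made up for illustration and simply span the range mentioned above.

```python
import numpy as np
import pandas as pd

# Hypothetical customer spend values spanning 100 to 1,000,000.
spend = pd.Series([100, 2_500, 40_000, 1_000_000], name="spend")

# log1p(x) = log(1 + x); it is defined at x = 0 and preserves ordering.
log_spend = np.log1p(spend)

# Compare the spread before and after the transform.
raw_spread = spend.max() / spend.min()
log_spread = log_spend.max() / log_spend.min()
```

In this sketch the largest raw value is 10,000 times the smallest, while after the transform the largest value is only about three times the smallest, and the ranking of customers is unchanged.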

Ratio Features

Ratio features are often powerful because they express relationships between two quantities. In many business problems, relative values are more meaningful than absolute values.

Debt-to-Income Ratio = Monthly Debt Payment / Monthly Income
This feature is often more useful for credit risk than income or debt alone.
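A ratio feature needs a guard for zero denominators. The sketch below is one possible approach, assuming hypothetical columns `monthly_debt` and `monthly_income`; it turns a zero income into a missing value instead of producing infinity.

```python
import numpy as np
import pandas as pd

# Hypothetical applicant data; the column names are assumptions for this sketch.
df = pd.DataFrame({
    "monthly_debt": [5000.0, 12000.0, 0.0, 8000.0],
    "monthly_income": [50000.0, 30000.0, 40000.0, 0.0],
})

# Replace zero income with NaN so the division yields NaN rather than inf.
safe_income = df["monthly_income"].replace(0, np.nan)
df["dti_ratio"] = df["monthly_debt"] / safe_income
```

How the resulting missing values are handled (imputation, a separate flag, or row exclusion) is a modelling decision that depends on the business context.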

2. Binning

Binning converts a continuous numerical variable into groups or intervals. Instead of using exact values, the model uses ranges such as low, medium, and high.

For example, instead of using exact age, we can create age groups such as 18–25, 26–35, 36–50, and 51+. This may make patterns easier to interpret and more stable.

Simple Explanation: Binning turns a continuous variable into categories so that the model can learn group-level behaviour.

Binning Method | Meaning | Example | Best Used When
Equal Width Binning | Divides the value range into intervals of equal size. | Age: 0–20, 21–40, 41–60, 61–80. | Range-based interpretation is simple and meaningful.
Equal Frequency Binning | Each bin contains approximately the same number of observations. | Income divided into quartiles. | Data is skewed and balanced bin sizes are desired.
Business Rule Binning | Bins are created using domain knowledge. | Credit score: Poor, Fair, Good, Excellent. | Business interpretation matters.
Target-Based Binning | Bins are chosen based on target behaviour. | Age groups where churn rate changes significantly. | Predictive separation is important, but leakage must be avoided.
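The first three binning methods map naturally onto `pandas.cut` and `pandas.qcut`. The following is a rough sketch with made-up values. The credit score cut-offs in particular are hypothetical and are not taken from any scoring standard.

```python
import pandas as pd

age = pd.Series([5, 18, 22, 35, 47, 58, 63, 79], name="age")

# Equal width binning: four intervals of 20 years each.
equal_width = pd.cut(
    age,
    bins=[0, 20, 40, 60, 80],
    labels=["0-20", "21-40", "41-60", "61-80"],
    include_lowest=True,
)

# Equal frequency binning: quartiles, so each bin holds about the same count.
equal_freq = pd.qcut(age, q=4, labels=["Q1", "Q2", "Q3", "Q4"])

# Business rule binning: the cut-offs below are assumptions for illustration.
credit_score = pd.Series([520, 610, 700, 790], name="credit_score")
credit_band = pd.cut(
    credit_score,
    bins=[300, 580, 670, 740, 850],
    labels=["Poor", "Fair", "Good", "Excellent"],
)
```

`pd.cut` uses right-closed intervals by default, so a value of exactly 20 falls in the first bin here; `include_lowest=True` makes the first interval include its left edge as well.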

When Binning is Useful

Improves Interpretability
  • Age groups are easier to understand than exact ages.
  • Risk bands are easier for business teams to use.
  • Segments can support dashboards and decision rules.
Handles Non-Linear Patterns
  • Risk may increase sharply after a threshold.
  • Customer behaviour may differ by income band.
  • Churn may be high only for very new customers.
Reduces Noise
  • Small fluctuations in exact values may not matter.
  • Grouping can make patterns more stable.
  • Useful when exact values are unreliable.
Potential Risk
  • Binning can lose detailed information.
  • Poorly chosen bins may hide useful patterns.
  • Target-based bins can cause leakage if created incorrectly.
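One way to reduce the leakage risk noted above is to learn bin edges from the training data only and then reuse those edges on new data. The sketch below shows this discipline for quantile edges; target-based binning would need the same train-only fitting. The values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical training and test values of one numeric feature.
train = pd.Series([10, 20, 30, 40, 50, 60, 70, 80])
test = pd.Series([5, 45, 95])

# Learn quartile edges from the training data only.
_, edges = pd.qcut(train, q=4, retbins=True)

# Open the outer edges so unseen values outside the training range still get a bin.
edges = edges.copy()
edges[0], edges[-1] = -np.inf, np.inf

# Apply the training edges to the test data; labels=False returns integer bin codes.
test_bins = pd.cut(test, bins=edges, labels=False)
```

Recomputing the edges on the full dataset, or on data that includes the target from the evaluation period, would let information from the test set shape the feature.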

3. Date and Time Features

Date and time columns are extremely valuable in predictive modelling. Raw dates are rarely useful by themselves, but they can be converted into powerful features that capture seasonality, recency, frequency, customer lifecycle, and time-based behaviour.

Date/Time Feature | Meaning | Example Use Case | Why It Helps
Year | Extract year from date. | Long-term sales or price trends. | Captures annual growth or decline.
Month | Extract month number or month name. | Retail demand forecasting. | Captures seasonality and monthly demand cycles.
Day of Week | Extract weekday from date. | Restaurant orders, website traffic, delivery demand. | Captures weekday vs weekend behaviour.
Hour of Day | Extract hour from timestamp. | Ride booking, call centre volume, app usage. | Captures daily activity patterns.
Weekend Flag | Marks whether date is Saturday or Sunday. | Retail, tourism, entertainment, food delivery. | Weekend behaviour often differs from weekdays.
Holiday or Festival Flag | Marks special days or periods. | Sales forecasting, demand planning. | Captures demand spikes around events.
Days Since Last Event | Measures recency. | Customer churn, repeat purchase, engagement prediction. | Recent behaviour is often highly predictive.
Tenure | Time since customer joined or account opened. | Churn prediction, loyalty analysis. | Longer-tenure customers often behave differently from new customers.
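Most of the features in the table can be extracted with the pandas `.dt` accessor. The sketch below assumes hypothetical columns `order_ts` and `signup_date`, and a hypothetical holiday list; a real project would supply its own calendar.

```python
import pandas as pd

# Hypothetical orders; the column names and dates are assumptions for this sketch.
df = pd.DataFrame({
    "order_ts": pd.to_datetime([
        "2024-01-06 09:15",  # a Saturday
        "2024-01-08 18:40",  # a Monday
        "2024-03-31 23:05",  # a Sunday
    ]),
    "signup_date": pd.to_datetime(["2023-01-06", "2023-12-08", "2024-03-01"]),
})

df["year"] = df["order_ts"].dt.year
df["month"] = df["order_ts"].dt.month
df["day_of_week"] = df["order_ts"].dt.dayofweek  # Monday=0 ... Sunday=6
df["hour"] = df["order_ts"].dt.hour
df["is_weekend"] = (df["day_of_week"] >= 5).astype(int)

# Holiday flag from a hypothetical calendar of special dates.
holidays = pd.to_datetime(["2024-03-31"])
df["is_holiday"] = df["order_ts"].dt.normalize().isin(holidays).astype(int)

# Tenure: whole days between signup and the order date.
df["tenure_days"] = (df["order_ts"].dt.normalize() - df["signup_date"]).dt.days
```

The weekend definition used here (Saturday and Sunday) follows the table; markets with a different working week would need a different rule.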

Recency, Frequency, and Monetary Features

In customer analytics, one of the most useful feature engineering approaches is creating RFM features: Recency, Frequency, and Monetary value.

Recency
  • How recently did the customer act?
  • Example: Days since last purchase.
  • Useful for churn and repeat purchase prediction.
Frequency
  • How often does the customer act?
  • Example: Number of purchases in last 90 days.
  • Useful for engagement and loyalty prediction.
Monetary
  • How much value does the customer generate?
  • Example: Total spend or average order value.
  • Useful for customer value and targeting models.
Combined RFM Signal
  • Customers who bought recently, frequently, and with high value are often more valuable.
  • RFM features support segmentation and prediction.
  • They are widely used in marketing analytics.
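RFM features can be sketched as a single group-by over a transaction table. The table, its column names, and the snapshot date below are all hypothetical. Frequency is counted over the full history here for brevity; restricting it to a window such as the last 90 days is a straightforward variant.

```python
import pandas as pd

# Hypothetical transaction log; names and values are assumptions for this sketch.
tx = pd.DataFrame({
    "customer_id": ["A", "A", "B", "C", "C", "C"],
    "order_date": pd.to_datetime([
        "2024-01-10", "2024-03-01", "2023-11-20",
        "2024-02-15", "2024-02-28", "2024-03-10",
    ]),
    "amount": [1200.0, 800.0, 500.0, 300.0, 450.0, 250.0],
})

# Features are computed as of a snapshot date; later transactions are excluded.
snapshot = pd.Timestamp("2024-03-15")
tx = tx[tx["order_date"] <= snapshot]

rfm = tx.groupby("customer_id").agg(
    last_order=("order_date", "max"),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)
rfm["recency_days"] = (snapshot - rfm["last_order"]).dt.days
rfm = rfm.drop(columns="last_order")
```

Fixing an explicit snapshot date, instead of using the current date, keeps the features reproducible and makes it easier to avoid using information from after the prediction moment.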

Interaction Features

Interaction features are created when the effect of one variable depends on another variable. These features help models capture combined effects that may not be visible from individual variables alone.

Interaction Feature | Original Variables | Business Meaning
Price × Quantity | Price and quantity sold. | Total sales value.
Discount × Customer Segment | Discount rate and customer type. | Different customer groups may respond differently to discounts.
Income × Credit Score | Income and credit score. | Financial strength may depend on both income and repayment history.
Tenure × Complaint Count | Customer tenure and complaints. | Complaints may affect new and old customers differently.
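Two of these interactions can be sketched as follows. A numeric-by-numeric interaction is a simple product, while a numeric-by-categorical interaction can be built by multiplying the numeric column with one-hot indicators. The data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical order lines; the column names are assumptions for this sketch.
df = pd.DataFrame({
    "price": [100.0, 250.0, 40.0],
    "quantity": [3, 1, 10],
    "discount": [0.10, 0.00, 0.25],
    "segment": ["new", "loyal", "new"],
})

# Numeric x numeric: price times quantity gives total sales value.
df["sales_value"] = df["price"] * df["quantity"]

# Numeric x categorical: one discount column per customer segment.
seg = pd.get_dummies(df["segment"], prefix="seg", dtype=float)
interactions = seg.mul(df["discount"], axis=0).add_prefix("discount_x_")
df = pd.concat([df, interactions], axis=1)
```

With the segment-specific discount columns, a linear model can learn a separate discount effect for each segment. Tree-based models can often find such interactions on their own, so the benefit is worth testing.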

Example: Feature Engineering for Customer Churn

Business Problem

A telecom company wants to predict whether a customer will churn. The raw dataset contains customer join date, monthly charges, support tickets, payment history, data usage, and churn status.

Raw Data | New Feature | Feature Type | Why It Helps
Join Date | Customer tenure in months. | Date/Time | New customers may churn more frequently than long-term customers.
Support Tickets | Tickets per month. | Ratio | Normalizes complaints by customer tenure.
Monthly Charges | Charge band: Low, Medium, High. | Binning | Helps detect price sensitivity groups.
Last Payment Date | Days since last payment. | Recency | Recent payment behaviour may signal engagement or risk.
Data Usage | Log of data usage. | Transform | Reduces the effect of extremely high usage values.
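The churn features in the table could be derived roughly as shown below. Everything specific in this sketch is an assumption: the column names, the snapshot date, the charge band thresholds, and the choice to count tenure in whole calendar months.

```python
import numpy as np
import pandas as pd

# Hypothetical telecom customers as of a snapshot date.
snapshot = pd.Timestamp("2024-06-30")
df = pd.DataFrame({
    "join_date": pd.to_datetime(["2024-04-30", "2022-06-30", "2023-12-31"]),
    "support_tickets": [3, 4, 0],
    "monthly_charges": [299.0, 799.0, 1499.0],
    "last_payment_date": pd.to_datetime(["2024-06-25", "2024-05-01", "2024-06-30"]),
    "data_usage_gb": [2.0, 40.0, 900.0],
})

# Date/Time: tenure in whole calendar months (day of month is ignored).
df["tenure_months"] = (
    (snapshot.year - df["join_date"].dt.year) * 12
    + (snapshot.month - df["join_date"].dt.month)
)

# Ratio: tickets per month; tenure is floored at 1 to avoid division by zero.
df["tickets_per_month"] = df["support_tickets"] / df["tenure_months"].clip(lower=1)

# Binning: charge bands with hypothetical thresholds.
df["charge_band"] = pd.cut(
    df["monthly_charges"],
    bins=[0, 500, 1000, np.inf],
    labels=["Low", "Medium", "High"],
)

# Recency: days since the last payment.
df["days_since_last_payment"] = (snapshot - df["last_payment_date"]).dt.days

# Transform: log of data usage to dampen extreme values.
df["log_data_usage"] = np.log1p(df["data_usage_gb"])
```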

Example: Feature Engineering for Sales Forecasting

Business Problem

A retail company wants to forecast product demand. Raw sales data includes date, product ID, store location, price, discount, units sold, and inventory level.

  • Month: Captures seasonal buying behaviour.
  • Weekend flag: Captures differences between weekend and weekday demand.
  • Festival flag: Captures demand spikes during holidays.
  • Discount percentage: Captures promotion impact.
  • Previous week sales: Captures recent demand momentum.
  • Stockout flag: Helps explain zero or unusually low sales.

These features convert raw transaction data into signals that reflect real buying behaviour.
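Several of these forecasting features can be sketched on a weekly sales table. The table layout and column names below are hypothetical. The previous-week feature uses a per-product `shift`, so each row sees only that product's earlier sales.

```python
import pandas as pd

# Hypothetical weekly sales for two products; names are assumptions for this sketch.
sales = pd.DataFrame({
    "week_start": pd.to_datetime(
        ["2024-01-01", "2024-01-08", "2024-01-15"] * 2
    ),
    "product_id": ["P1"] * 3 + ["P2"] * 3,
    "price": [100.0, 100.0, 100.0, 50.0, 50.0, 50.0],
    "discount": [0.0, 20.0, 0.0, 0.0, 0.0, 5.0],
    "units_sold": [20, 35, 0, 10, 12, 15],
    "inventory": [50, 15, 0, 40, 30, 20],
})

# Lag features require rows to be ordered in time within each product.
sales = sales.sort_values(["product_id", "week_start"])

sales["month"] = sales["week_start"].dt.month
sales["discount_pct"] = 100 * sales["discount"] / sales["price"]

# Previous week sales: shift within each product so products do not mix.
sales["prev_week_units"] = sales.groupby("product_id")["units_sold"].shift(1)

# Stockout flag: zero inventory helps explain zero or unusually low sales.
sales["stockout_flag"] = (sales["inventory"] == 0).astype(int)
```

The first week of each product has no previous value, so `prev_week_units` is missing there; those rows are typically dropped or imputed before modelling.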

Feature Engineering and Data Leakage

Feature engineering must be done carefully to avoid data leakage. Leakage happens when a feature uses information that would not be available at the time of prediction.

High-Risk Example: If you are predicting whether a customer will churn next month, you cannot use “cancellation date” or “reason for cancellation” as features because these values are known only after churn happens.

Feature | Safe or Leakage? | Reason
Number of complaints before prediction date | Safe | Available before prediction.
Cancellation reason | Leakage | Known only after customer has churned.
Sales from previous month | Safe | Past information used to predict future.
Sales from next month | Leakage | Future information used incorrectly.
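A simple guard against this kind of leakage is to filter the raw events by the prediction date before computing any feature. The event log below, its column names, and the prediction date are hypothetical.

```python
import pandas as pd

# Hypothetical customer event log; names and values are assumptions for this sketch.
events = pd.DataFrame({
    "customer_id": ["A", "A", "A", "B", "B"],
    "event_date": pd.to_datetime(
        ["2024-01-05", "2024-02-10", "2024-03-20", "2024-02-01", "2024-03-05"]
    ),
    "event_type": ["complaint", "complaint", "cancellation", "complaint", "complaint"],
})
prediction_date = pd.Timestamp("2024-03-01")

# Keep only events that were already known at prediction time.
history = events[events["event_date"] < prediction_date]

# Safe feature: number of complaints before the prediction date.
complaints_before = (
    history[history["event_type"] == "complaint"]
    .groupby("customer_id")
    .size()
    .rename("complaints_before_prediction")
)
```

Because the cancellation event and the later complaint fall after the prediction date, they never reach the feature computation.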

How to Evaluate New Features

Not every new feature improves a model. Some features add noise, duplicate existing information, or cause leakage. Every engineered feature should be validated using business logic, EDA, and model performance.

Feature Validation Process

Create Feature → Check Business Meaning → Inspect Distribution → Check Leakage Risk → Test Model Performance
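The final step, testing model performance, can be sketched as a comparison of the same model with and without the candidate feature. The example below uses synthetic data and a plain least-squares fit in NumPy with a single holdout split. The data-generating process and the helper `holdout_r2` are assumptions made for this illustration; a real project would more likely use cross-validation in a modelling library.

```python
import numpy as np

# Synthetic data: the target is driven by the debt-to-income ratio plus noise.
rng = np.random.default_rng(0)
n = 400
debt = rng.uniform(1_000, 20_000, n)
income = rng.uniform(20_000, 100_000, n)
ratio = debt / income
y = 5.0 * ratio + rng.normal(0, 0.05, n)


def holdout_r2(X, y, n_train=300):
    """Fit least squares on the first n_train rows; return R^2 on the rest."""
    X = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    coef, *_ = np.linalg.lstsq(X[:n_train], y[:n_train], rcond=None)
    y_test = y[n_train:]
    resid = y_test - X[n_train:] @ coef
    ss_tot = np.sum((y_test - y_test.mean()) ** 2)
    return 1.0 - np.sum(resid ** 2) / ss_tot


# Same model, with and without the engineered ratio feature.
baseline = holdout_r2(np.column_stack([debt, income]), y)
with_ratio = holdout_r2(np.column_stack([debt, income, ratio]), y)
```

Here the ratio improves the holdout score because the synthetic target was built from it. On real data the gain may be small or absent, which is exactly what this comparison is meant to reveal.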

Common Mistakes in Feature Creation

Mistake | Why It Is Harmful | Better Approach
Creating too many random features | Adds noise and increases overfitting risk. | Create features guided by business logic and EDA.
Using future information | Causes data leakage and unrealistic model performance. | Use only information available at prediction time.
Binning without reason | Can lose useful numerical detail. | Use binning when it improves interpretability or captures thresholds.
Ignoring feature distribution | New features may be skewed, sparse, or full of missing values. | Perform EDA on every engineered feature.
Not testing model impact | A feature may look meaningful but not improve prediction. | Compare model performance with and without the feature.

Best Practices for Creating New Features

Feature Engineering Checklist

  • Start with business understanding: Create features that reflect real drivers of the outcome.
  • Use EDA findings: Let distributions, trends, and target relationships guide feature ideas.
  • Transform skewed variables: Use log or square root transformations when appropriate.
  • Create ratios carefully: Ratios often capture stronger business meaning than raw values, but guard against zero or missing denominators.
  • Use binning when useful: Bins can capture thresholds and improve interpretability.
  • Extract date/time signals: Month, weekday, tenure, recency, and seasonality are often powerful.
  • Check leakage: Use only information available before the prediction moment.
  • Validate every feature: Inspect distribution, missingness, target relationship, and model impact.
  • Keep the feature set manageable: More features are not always better.

Why Feature Engineering is a Core Modelling Skill

Feature engineering is where data science meets domain understanding. Algorithms learn from the features we provide. If the features are weak, noisy, or poorly designed, the model may struggle. If the features are meaningful, clean, and predictive, the model can perform much better.

In real-world predictive analytics, thoughtful feature creation often makes the difference between an average model and a useful business solution.

Practical Insight: The best features are not always complicated. Often, simple features like customer tenure, days since last purchase, average order value, and complaint frequency are extremely powerful.

Key Takeaways

  • Feature engineering creates useful model inputs from raw data.
  • Mathematical transformations help handle skewness, scale, ratios, and non-linear patterns.
  • Binning converts continuous variables into meaningful groups or bands.
  • Date/time features capture seasonality, recency, frequency, tenure, and time-based behaviour.
  • Interaction features capture combined effects between variables.
  • Feature engineering should be guided by business logic, EDA, and validation performance.
  • Data leakage must be avoided by using only information available at prediction time.
  • Good features can significantly improve predictive model performance and interpretability.