Technologies
- Python
- Pandas
- Scikit-learn
- Logistic Regression
- ColumnTransformer
- Pipelines
- OneHotEncoder
- OS module
Skills
- Machine Learning
- Data Preprocessing
- Feature Engineering
- Pipeline Design
- Data Manipulation
- Model Training and Evaluation
- Statistical Analysis
AI Loan Approval
This project is a loan approval prediction system that uses machine learning to estimate, from historical data, how likely a loan application is to be approved. It takes applicant details such as income, credit history, marital status, and loan amount and trains a logistic regression model to classify applications. The system follows a structured pipeline of data preprocessing, model training, and evaluation, so that both categorical and numerical features are transformed appropriately before they reach the classifier. The goal is to give financial institutions a data-driven way to streamline and automate the loan approval process, reducing manual effort.
The preprocessing pipeline is a critical component of this system: it cleans, imputes, and encodes the data before training. The features are divided into nominal, ordinal, and numerical groups, which are transformed with One-Hot Encoding, Ordinal Encoding, and Standard Scaling respectively. Missing values are handled using SimpleImputer, which replaces missing nominal and ordinal values with the most frequent entry and missing numerical values with the mean. A ColumnTransformer consolidates these transformations into a single step, so the same preprocessing is applied during training and prediction. The model is then trained using an 80-20 train-test split, with the held-out 20% used to estimate how well it generalizes to unseen loan applications.
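The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the column names, the ordering of the `Education` categories, the `build_pipeline` helper, and the toy rows are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler

# Hypothetical column groups; the real dataset's column names may differ.
NOMINAL = ["Married", "Property_Area"]
ORDINAL = ["Education"]
NUMERICAL = ["ApplicantIncome", "LoanAmount"]


def build_pipeline() -> Pipeline:
    """Combine per-group preprocessing with a logistic regression classifier."""
    nominal = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    ordinal = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        # Assumed category order, lowest to highest.
        ("encode", OrdinalEncoder(categories=[["Not Graduate", "Graduate"]])),
    ])
    numerical = Pipeline([
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", StandardScaler()),
    ])
    preprocessor = ColumnTransformer([
        ("nominal", nominal, NOMINAL),
        ("ordinal", ordinal, ORDINAL),
        ("numerical", numerical, NUMERICAL),
    ])
    return Pipeline([
        ("preprocess", preprocessor),
        ("classify", LogisticRegression(max_iter=1000)),
    ])


# Made-up rows purely to show the call pattern; not real loan data.
toy = pd.DataFrame({
    "Married": ["Yes", "No", np.nan, "Yes", "No", "Yes", "No", "Yes"],
    "Property_Area": ["Urban", "Rural", "Urban", "Semiurban",
                      "Rural", "Urban", "Rural", "Urban"],
    "Education": ["Graduate", "Not Graduate", "Graduate", "Graduate",
                  "Not Graduate", "Graduate", "Not Graduate", "Graduate"],
    "ApplicantIncome": [5000, 1200, 4300, np.nan, 1500, 6100, 1100, 5200],
    "LoanAmount": [120, 200, 110, 100, np.nan, 130, 210, 115],
})
labels = [1, 0, 1, 1, 0, 1, 0, 1]  # 1 = approved, 0 = rejected (assumed coding)

pipeline = build_pipeline().fit(toy, labels)
```

Keeping the imputers and encoders inside the same `Pipeline` as the classifier means they are fitted only on the data passed to `fit`, which helps avoid leaking test-set statistics into training.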
Once trained, the system evaluates the model’s performance on the test data and reports an accuracy score, the fraction of test applications it classifies correctly. The complete machine learning pipeline, which includes both preprocessing and classification, can be retrieved as a single object and used to make predictions on new loan applications. This structured approach supports data-driven decisions and more efficient loan processing. With further improvements, such as more advanced machine learning models or hyperparameter tuning, this system could become a useful tool in modern financial services.
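The split, evaluation, and prediction steps might look something like the sketch below. The `train_and_evaluate` helper, the fixed seed, the `stratify` argument, and the single-feature toy data are assumptions for illustration; in the project, the `model` argument would be the full preprocessing-plus-classifier pipeline.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_and_evaluate(model, X, y, test_size=0.2, seed=42):
    """Fit `model` on 80% of the rows and return it with its test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y,
        test_size=test_size,
        random_state=seed,  # fixed seed so the split is reproducible
        stratify=y,         # assumption: keep class balance in both splits
    )
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


# Made-up, single-feature data purely to show the call pattern.
X = pd.DataFrame({"income_thousands": list(range(1, 21))})
y = pd.Series([0] * 10 + [1] * 10)

model, accuracy = train_and_evaluate(LogisticRegression(max_iter=1000), X, y)
print(f"Test accuracy: {accuracy:.2f}")

# The fitted model can then score applications it has not seen before.
new_applications = pd.DataFrame({"income_thousands": [3, 18]})
print(model.predict(new_applications))
```

Accuracy alone can be misleading when approvals and rejections are imbalanced, so metrics such as precision and recall may be worth reporting alongside it.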