
The main aim of this project is to develop a machine learning system that detects phishing websites by examining their domain names. With cyber threats on the rise, phishing remains a common method malicious actors use to steal personal data. The system is intended to enhance user safety and promote cybersecurity awareness by classifying potentially harmful websites and alerting users before they interact with them. The model will be trained on a labeled dataset of features extracted from legitimate and phishing domains, so that it can predict whether a given domain is likely to be safe. Beyond protecting users from scams, the project educates students on real-world cybersecurity threats and the role of machine learning in combating them.
The project will be carried out over twelve weeks with structured milestones. Initially, students will learn about phishing techniques and how domain features can be used for threat detection. They will collect and preprocess a dataset of phishing and legitimate domains, and configure their development environment using Python and tools like Anaconda Navigator or Google Colab.
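As a rough illustration of how domain features might be derived during preprocessing, the sketch below turns a URL or bare domain into a small set of numeric features. The function names (extract_domain_features, shannon_entropy) and the particular features chosen are assumptions made for this example, not a prescribed feature set; students would normally choose features based on their own dataset.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits; 0.0 for an empty string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_domain_features(url: str) -> dict:
    """Return simple lexical features of the host part of a URL.

    Hypothetical feature set, chosen only for illustration.
    """
    # urlparse only fills in the host when a scheme is present,
    # so add one for bare domains such as "example.com".
    if "://" not in url:
        url = "http://" + url
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".") if host else []
    return {
        "length": len(host),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in host),
        "num_labels": len(labels),
        "entropy": shannon_entropy(host),
    }


if __name__ == "__main__":
    for candidate in ["https://secure-login.example.com/path", "example.com"]:
        print(candidate, extract_domain_features(candidate))
```

Each dictionary can become one row of a feature table, with a separate label column marking the domain as phishing or legitimate.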
In the next phase, students will build classification models using Decision Trees and Support Vector Machines. They will train the models on labeled data and validate their performance using metrics such as accuracy, precision, and recall. To improve the system’s reliability, they will test it on new datasets that were not used in training and refine it accordingly. Toward the final stages, students will finalize the full application, document their process and findings, and deliver a presentation demonstrating their phishing detection system. Ethical development, privacy compliance, and consistent documentation practices must be maintained throughout the project.
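To make the validation metrics concrete, the following minimal sketch computes accuracy, precision, and recall directly from true labels and model predictions. It assumes that label 1 means "phishing" and 0 means "legitimate"; the function name classification_metrics and the sample labels are hypothetical. In practice, a library such as scikit-learn provides equivalent functions (accuracy_score, precision_score, recall_score).

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for a binary classifier.

    Assumes `positive` marks the phishing class (an illustrative convention).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # phishing caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarms
    fn = sum(t == positive and p != positive for t, p in pairs)  # phishing missed
    correct = sum(t == p for t, p in pairs)

    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        # Of the domains flagged as phishing, how many really were?
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of the real phishing domains, how many were flagged?
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Made-up labels for ten domains, purely for demonstration.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
    print(classification_metrics(y_true, y_pred))
```

For phishing detection, recall is often the metric to watch most closely, since a missed phishing domain (a false negative) usually costs the user more than a false alarm.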