Practical_Exam_Work

YASH PATEL
Nov 16, 2021

--

Dataset: https://archive.ics.uci.edu/ml/datasets/Spambase

Task-1:
Dataset description using the Orange tool.
What needs to be done to improve the classification accuracy on the given dataset? Obtain the highest classification accuracy possible by applying the following methods.
→ Pre-processing
o Encoding
o Normalization
o Missing value handling
o Feature Selection
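The four pre-processing steps listed above can be sketched in plain Python. This is a minimal illustration on hand-made toy columns, not the Orange widgets themselves and not the Spambase data. The function names (`impute_mean`, `min_max`, `encode_labels`, `select_by_variance`) and the toy values are assumptions made for the example, and variance filtering is only one simple stand-in for feature selection.

```python
from statistics import mean, pvariance

def impute_mean(column):
    """Missing value handling: replace None with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def min_max(column):
    """Normalization: rescale a column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def encode_labels(labels):
    """Encoding: map each distinct label to an integer code, in sorted order."""
    codes = {label: i for i, label in enumerate(sorted(set(labels)))}
    return [codes[label] for label in labels]

def select_by_variance(features, threshold=0.0):
    """Feature selection (simple filter): keep columns whose variance exceeds threshold."""
    return {name: col for name, col in features.items() if pvariance(col) > threshold}

# Toy columns (hypothetical values, chosen only to exercise each step)
word_freq = impute_mean([0.0, None, 0.5, 1.0])
capital_run = min_max([1, 3, 5, 9])
target = encode_labels(["ham", "spam", "ham", "spam"])
selected = select_by_variance(
    {"word_freq": word_freq, "capital_run": capital_run, "constant": [2, 2, 2, 2]}
)
print(word_freq, capital_run, target, list(selected))
```

The constant column carries no information for a classifier, so the variance filter drops it. In practice a ranking score such as information gain is a common alternative.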

Task-2:
Generate a dashboard of the preprocessed dataset from Task-1.
Extract as many data insights as possible by plotting a Bar chart, Boxplot, Pie Plot, and Stack Plot in a PowerBI dashboard.

1. Provide a screenshot of the data description and explain in brief.

This dataset has 4601 rows and 58 columns: 57 features (X.0 to X.56) plus the class label in X.57. The dataset is multivariate, and its features are a mix of integer and real values.
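A description like this can be cross-checked outside Orange with a few lines of Python. The sketch below counts rows and columns and labels each column as integer or real. `SAMPLE` is a hypothetical three-row, four-column stand-in written in the headerless, comma-separated layout that the Spambase data file is assumed to use. It is not real Spambase data, and `describe` is an illustrative helper name.

```python
import csv
import io

# Hypothetical toy sample (NOT real Spambase rows): three features and a 0/1 label
SAMPLE = "0.0,0.64,278,1\n0.21,0.28,1028,1\n0.0,0.0,15,0\n"

def describe(text):
    """Return (row count, column count, per-column kind) for headerless CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    kinds = []
    for column in zip(*rows):
        try:
            # A column counts as integer only if every value parses as int
            for value in column:
                int(value)
            kinds.append("integer")
        except ValueError:
            kinds.append("real")
    return len(rows), len(rows[0]), kinds

n_rows, n_cols, kinds = describe(SAMPLE)
print(n_rows, n_cols, kinds)
```

To run the same check on the downloaded file, its contents can be read into a string and passed to `describe`.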

2. Provide screenshot(s) of the data pre-processing steps showing their significance.

Work-Flow
Target X.57 Column
Bar Plot
Scatter-Plot between X-Axis: X.0, Y-Axis: X.1
Scatter-Plot between X-Axis: X.0, Y-Axis: X.7
Scatter-Plot between X-Axis: X.0, Y-Axis: X.11

3. Provide a screenshot showing accuracy before and after pre-processing.

Accuracy Before and After
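Why pre-processing can change accuracy is easiest to see on a small example. The sketch below compares a 1-nearest-neighbour classifier before and after min-max normalization. The data is a hand-built toy set, not Spambase: its first feature is informative but small in scale, and its second is noise on a large scale. The 1-NN classifier is an assumption chosen for brevity, not necessarily the learner used in the Orange workflow, and the numbers it produces say nothing about the accuracies in the screenshot.

```python
def nearest_label(point, train_x, train_y):
    """Return the label of the training row closest to point (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, row)) for row in train_x]
    return train_y[dists.index(min(dists))]

def accuracy(train_x, train_y, test_x, test_y):
    """Fraction of test rows whose 1-NN prediction matches the true label."""
    hits = sum(nearest_label(p, train_x, train_y) == y for p, y in zip(test_x, test_y))
    return hits / len(test_y)

def fit_min_max(rows):
    """Learn (min, max) per column from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]

def apply_min_max(rows, ranges):
    """Rescale each column with the ranges learned on the training rows."""
    return [[(v - lo) / (hi - lo) for v, (lo, hi) in zip(row, ranges)] for row in rows]

# Toy data: feature 0 separates the classes, feature 1 is large-scale noise
train_x = [[0.1, 100], [0.2, 900], [0.9, 120], [0.8, 880]]
train_y = [0, 0, 1, 1]
test_x = [[0.15, 870], [0.85, 95], [0.1, 105]]
test_y = [0, 1, 0]

before = accuracy(train_x, train_y, test_x, test_y)

ranges = fit_min_max(train_x)
after = accuracy(apply_min_max(train_x, ranges), train_y,
                 apply_min_max(test_x, ranges), test_y)

print(f"accuracy before scaling: {before:.2f}, after scaling: {after:.2f}")
```

On the raw values the distance is dominated by the noisy second feature, so the neighbour search effectively ignores the informative one. After scaling, both features contribute on the same footing. The ranges are fitted on the training rows only, so the test rows do not leak into the pre-processing.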

4. Provide a screenshot of the PowerBI dashboard with a description.

PowerBI dashboard visualization
Bar-Plot
Box-Plot
Pie-Plot
Stack-Plot
