IT546 Assignment 4 - Data Science
Data Science
Fall 2020_IT546
Assignment 4
1- Consider the data in the following table:
| TID | Home Owner | Marital Status | Annual Income | Defaulted Borrower |
|---|---|---|---|---|
| 1 | Yes | Single | [120 - < 150K] | No |
| 2 | No | Married | [90 - < 120K] | No |
| 3 | No | Single | [60 - < 90K] | No |
| 4 | Yes | Married | [120 - < 150K] | No |
| 5 | No | Divorced | [90 - < 120K] | Yes |
| 6 | No | Married | [60 - < 90K] | No |
| 7 | Yes | Divorced | [120 - < 150K] | No |
| 8 | No | Single | [90 - < 120K] | Yes |
| 9 | No | Married | [60 - < 90K] | No |
| 10 | No | Single | [90 - < 120K] | Yes |
Let Defaulted Borrower be the class label attribute.
a) Given a data tuple X = (Home Owner = No, Marital Status = Married, Income = $120K), what would a naive Bayesian classification of the Defaulted Borrower for the tuple be?
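One way to check a hand calculation for part (a) is the minimal sketch below. It rests on two assumptions that the assignment does not state: an income of $120K is taken to fall in the [120 - < 150K] bin, and no Laplace smoothing is applied, so a zero count gives a zero probability. The names `rows`, `naive_bayes_scores`, `scores`, and `prediction` are illustrative only.

```python
from collections import Counter

# Rows copied from the table:
# (Home Owner, Marital Status, Annual Income bin, Defaulted Borrower)
rows = [
    ("Yes", "Single",   "120-150K", "No"),
    ("No",  "Married",  "90-120K",  "No"),
    ("No",  "Single",   "60-90K",   "No"),
    ("Yes", "Married",  "120-150K", "No"),
    ("No",  "Divorced", "90-120K",  "Yes"),
    ("No",  "Married",  "60-90K",   "No"),
    ("Yes", "Divorced", "120-150K", "No"),
    ("No",  "Single",   "90-120K",  "Yes"),
    ("No",  "Married",  "60-90K",   "No"),
    ("No",  "Single",   "90-120K",  "Yes"),
]


def naive_bayes_scores(rows, x):
    """Return P(class) * product of P(x_i | class) per class (no smoothing)."""
    class_counts = Counter(r[-1] for r in rows)
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / len(rows)  # prior P(class)
        for i, value in enumerate(x):
            # Count rows of this class whose i-th attribute matches x.
            matches = sum(1 for r in rows if r[-1] == label and r[i] == value)
            score *= matches / n_label  # conditional P(x_i | class)
        scores[label] = score
    return scores


# Assumption: $120K is mapped to the [120 - < 150K] bin.
x = ("No", "Married", "120-150K")
scores = naive_bayes_scores(rows, x)
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

Under these assumptions the score for Defaulted Borrower = Yes is zero, because no defaulting tuple in the table is Married or has an income in that bin, so the sketch predicts No.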
2- Consider the training examples in the following table for a binary classification problem.
| Customer ID | Gender | Car Type | Shirt Size | Class |
|---|---|---|---|---|
| 1 | M | Family | S | C0 |
| 2 | M | Sports | M | C0 |
| 3 | M | Sports | M | C0 |
| 4 | M | Sports | L | C0 |
| 5 | M | Sports | XL | C0 |
| 6 | M | Sports | XL | C0 |
| 7 | F | Sports | S | C0 |
| 8 | F | Sports | S | C0 |
| 9 | F | Sports | M | C0 |
| 10 | F | Luxury | L | C0 |
| 11 | M | Family | L | C1 |
| 12 | M | Family | XL | C1 |
| 13 | M | Family | M | C1 |
| 14 | M | Luxury | XL | C1 |
| 15 | F | Luxury | S | C1 |
| 16 | F | Luxury | S | C1 |
| 17 | F | Luxury | M | C1 |
| 18 | F | Luxury | M | C1 |
| 19 | F | Luxury | M | C1 |
| 20 | F | Luxury | L | C1 |
a) Find the gain for Gender, Car Type, and Shirt Size.
b) Which attribute will be selected as the splitting attribute?
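The sketch below is one way to verify hand-computed gains for parts (a) and (b). It assumes "gain" means entropy-based information gain with a multiway split on each attribute. If the course intends a Gini-based gain, the impurity function would need to be swapped. The names `records`, `entropy`, `information_gain`, `gains`, and `best` are illustrative only.

```python
from collections import Counter
from math import log2

# Rows copied from the table: (Gender, Car Type, Shirt Size, Class)
records = [
    ("M", "Family", "S",  "C0"), ("M", "Sports", "M",  "C0"),
    ("M", "Sports", "M",  "C0"), ("M", "Sports", "L",  "C0"),
    ("M", "Sports", "XL", "C0"), ("M", "Sports", "XL", "C0"),
    ("F", "Sports", "S",  "C0"), ("F", "Sports", "S",  "C0"),
    ("F", "Sports", "M",  "C0"), ("F", "Luxury", "L",  "C0"),
    ("M", "Family", "L",  "C1"), ("M", "Family", "XL", "C1"),
    ("M", "Family", "M",  "C1"), ("M", "Luxury", "XL", "C1"),
    ("F", "Luxury", "S",  "C1"), ("F", "Luxury", "S",  "C1"),
    ("F", "Luxury", "M",  "C1"), ("F", "Luxury", "M",  "C1"),
    ("F", "Luxury", "M",  "C1"), ("F", "Luxury", "L",  "C1"),
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(records, attr_index):
    """Parent entropy minus the weighted entropy of a multiway split."""
    parent = entropy([r[-1] for r in records])
    partitions = {}
    for r in records:
        # Group class labels by the value of the chosen attribute.
        partitions.setdefault(r[attr_index], []).append(r[-1])
    weighted = sum(
        len(p) / len(records) * entropy(p) for p in partitions.values()
    )
    return parent - weighted


attributes = ["Gender", "Car Type", "Shirt Size"]
gains = {name: information_gain(records, i) for i, name in enumerate(attributes)}
best = max(gains, key=gains.get)
print(gains, best)
```

Under these assumptions the gains come out to roughly 0.029 for Gender, 0.620 for Car Type, and 0.012 for Shirt Size, so Car Type would be chosen as the splitting attribute.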