CS628 Week 11 Assignment - Data Science

CS628 - Data Science
Project # 5: Visualization Assignment
Monroe College
For the following questions (1 to 3), work with the bank_marketing_training data set. Use Python.
Attached is the file. Start with the following code.
# import required package
import pandas as pd

# read the csv bank_train data using the pandas package
bank_train = pd.read_csv("/Users/edeki/Desktop/Website Data Sets/bank_marketing_training")
1. Create a bar graph of the previous_outcome variable, with response overlay.
2. Create a normalized bar graph of previous_outcome variable with response overlay. Describe the relationship between previous_outcome and response.
3. Examine the non-normalized and normalized histograms of duration, with overlay of response.
Identify cutoff point(s) for duration, which separate low values of response from high values.
Define a new categorical variable, duration_binned, using the cutoff points you identified.
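The counting and binning steps behind questions 1 to 3 can be sketched as follows. This is a minimal illustration, not a worked solution: the small bank_toy frame is made-up data standing in for bank_train, the cutoffs 200 and 500 and the labels "short", "medium", "long" are hypothetical placeholders rather than values identified from the real histograms, and plot_overlay is a helper name introduced here.

```python
import pandas as pd

# Hypothetical toy data standing in for bank_train (values are made up).
bank_toy = pd.DataFrame({
    "previous_outcome": ["nonexistent", "nonexistent", "failure", "success",
                         "success", "failure", "nonexistent", "success"],
    "response": ["no", "no", "no", "yes", "yes", "yes", "no", "no"],
    "duration": [50, 120, 300, 650, 800, 450, 90, 210],
})

# Counts of response within each previous_outcome category (bar graph data).
counts = pd.crosstab(bank_toy["previous_outcome"], bank_toy["response"])

# Normalized version: each row is divided by its total, so rows sum to 1.
normalized = counts.div(counts.sum(axis=1), axis=0)

# Bin duration with placeholder cutoffs (assumed, not derived from real data).
# right=False makes the intervals [0, 200), [200, 500), [500, inf).
bank_toy["duration_binned"] = pd.cut(
    bank_toy["duration"],
    bins=[0, 200, 500, float("inf")],
    labels=["short", "medium", "long"],
    right=False,
)

def plot_overlay(table, title):
    """Draw a stacked bar graph of a crosstab (requires matplotlib)."""
    return table.plot(kind="bar", stacked=True, title=title)
```

Calling plot_overlay(counts, "previous_outcome with response overlay") would draw the non-normalized graph, and passing normalized instead would draw the normalized one. With the real data, bank_train would replace bank_toy and the cutoffs would come from inspecting the duration histograms.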
For the following questions (4 to 5), work with the adult_ch3_training data set.
# read the csv adult_ch3_training data using the pandas package
adult_ch3_train = pd.read_csv("/Users/edeki/Desktop/Website Data Sets/adult_ch3_training")
4. Consider capital-loss. Identify the outliers in capital-loss using the Z-score method. How many outliers are there?
5. Construct a bar graph of Income for these outlier records.
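The Z-score step in questions 4 and 5 can be sketched as below. This is an illustration under stated assumptions: adult_toy is made-up data standing in for adult_ch3_train, the column names "capital-loss" and "income" are assumed from the question wording, the threshold |z| > 3 is the common convention rather than something the assignment states, and plot_income_bar is a helper name introduced here.

```python
import pandas as pd

# Hypothetical toy data standing in for adult_ch3_train (values are made up).
adult_toy = pd.DataFrame({
    "capital-loss": [0] * 28 + [1900, 2100],
    "income": ["<=50K"] * 28 + [">50K", "<=50K"],
})

# Z-score: distance from the mean in units of standard deviation.
# pandas' std() uses the sample standard deviation (ddof=1) by default.
loss = adult_toy["capital-loss"]
adult_toy["capital-loss_z"] = (loss - loss.mean()) / loss.std()

# Assumed convention: a record is an outlier when |z| exceeds 3.
outliers = adult_toy[adult_toy["capital-loss_z"].abs() > 3]
n_outliers = len(outliers)

# Income counts among the outlier records (the data behind the bar graph).
income_counts = outliers["income"].value_counts()

def plot_income_bar(income_table):
    """Draw a bar graph of income counts (requires matplotlib)."""
    return income_table.plot(kind="bar", title="Income for capital-loss outliers")
```

On the toy data, n_outliers is 2. With the real file, adult_ch3_train would replace adult_toy, and the column names should be checked against the actual header before running.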
