Week 15 Assignment – 100 points
Objective: Using SQL query in R data frame. Given the Consumer_Complaints.csv, use sqldf( ) to query the database for the following reports:
1) Query for all the states where “Movement Mortgage” are present – (columns are Company & State)
2) Query for all the states where “Movement Mortgage” OR “Hewitt-Capital” are present – (columns are Company & State). Sort the by Company.
3) Query for the Product of “Consumer Loan” and the various means of Submitting the complain relating to Consumer Loan (Submission via). Count each mean of submission (Submission via). (Columns are Product, ‘Submitted via’, ‘Count Submitted Via’
4) Query for company called “ACS Education Services” and its various Company Response. Count each Company Response. Columns are Company, CompanyResponse & ‘Count Company Response’.
5) Query for all Product and ‘Submitted via’ and count each of the ‘Submitted via’. Sort by Product and ‘Submitted via’. Your columns are Product, Submittedvia, & ‘Count Submitted via’
> sqldf(“se Mortgage'”)
1 360 Mortgage MD
2 360 Mortgage CA
3 360 Mortgage FL
4 360 Mortgage AZ
5 360 Mortgage TX
6 360 Mortgage UT
7 360 Mortgage TX
We start by loading the appropriate libraries in R: sqldf package as follows:
> install.packages(“sqldf”) > library(sqldf)
Next, we load the csv file and check the statistical properties of the csv File as follow:
> setwd(“C:/RData”) # your working directory
> str(myData) # check the properties of the file
. . . continue from here!
You need to rename the column with space to remove the space. For example, you should rename ‘Submitted via’ to SubmittedVia before you begin your query because R programming does not like either space or dot in-between words.
Your paper must have cover-page and reference-page.
There will be penalty for late assignment. No any screenshot is allowed in this assignment.