HIM 210 PROJECT
Step 1: Business Understanding
1. There are wide discrepancies of charges and payments between institutions
a. Larger hospitals charge more and receive higher payments
b. Urban hospitals charge more, but do not receive higher payments
2. Are the variations due to excessive charging or lower payments?
a. Excess Charge = Charge - Payment
b. Cost-to-charge ratio = Payment/Charge
Step 2: Data Understanding
1. IPPS Data
a. Medicare Provider Utilization and Payment Data: Inpatient
i. Total Discharges
ii. Average Covered Charges
iii. Average Total Payments
2. Census Data (because of its size, this file has been limited to NY only; it is a text CSV file, so it must be opened in Excel first)
a. 2010 ZCTA to Metropolitan and Micropolitan Statistical Areas Relationship File
i. Zipcode
ii. CBSA
Step 3: Data Preparation
- Filter the IPPS file to only include NY
- Add the CBSA from the Census Data file to the IPPS Data file
- Use VLOOKUP
- Copy the CBSA column and Paste Special as values only
- Remove #N/A values (use Find/Replace)
- Insert a new column
- In the new column, use the IF function to recategorize the hospital geography
- If the hospital has an identified CBSA, recategorize that hospital as urban
- If the hospital does not have a CBSA, recategorize that hospital as rural
- Copy the Geography column and Paste Special as values only
- Calculate Excess Charge = Charge - Payment
- Calculate Cost-to-charge Ratio (CCR) = Payment/Charge
- Copy the Excess Charge and CCR columns and Paste Special as values only
- Save the file as a .csv
- Also, save a version of the file as a .xlsx
- In the .xlsx version, click in any of the cells, format as a table (HOME – “Format as Table”)
- In the .xlsx version, name the table (DESIGN – “Table Name” – enter “DRG”)
- Save
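The preparation steps above can also be cross-checked outside Excel. The following is a minimal sketch in base R, not part of the assignment itself. The data frames, the column names (State, Zip, Charge, Payment, ZCTA5, CBSA), and every value are hypothetical stand-ins chosen for illustration; the real IPPS and Census files use their own headers.

```r
# Toy stand-ins for the two files; all names and values are hypothetical.
ipps <- data.frame(
  State   = c("NY", "NY", "NJ", "NY"),
  Zip     = c("10001", "12345", "07001", "13326"),
  Charge  = c(40000, 20000, 30000, 15000),
  Payment = c(10000, 8000, 9000, 9000),
  stringsAsFactors = FALSE
)
census <- data.frame(
  ZCTA5 = c("10001", "12345"),
  CBSA  = c("35620", "10580"),
  stringsAsFactors = FALSE
)

# Filter to NY (the Excel filter step).
ny <- ipps[ipps$State == "NY", ]

# Look up the CBSA by ZIP code (the VLOOKUP step). Unmatched ZIP codes
# give NA, which plays the role of #N/A in Excel.
ny$CBSA <- census$CBSA[match(ny$Zip, census$ZCTA5)]

# Recategorize geography (the IF step): a CBSA means urban, none means rural.
ny$Geo <- ifelse(is.na(ny$CBSA), "Rural", "Urban")

# Excess charge and cost-to-charge ratio as defined in Step 3.
ny$ExcessCharge <- ny$Charge - ny$Payment
ny$CCR <- ny$Payment / ny$Charge

print(ny)
```

Using match() rather than merge() keeps the row order of the IPPS data unchanged, which makes it easier to compare the result against the Excel sheet row by row.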
Step 4: Modeling
1. Create a PIVOT TABLE of the count of hospitals for each geographic region (INSERT- PivotTable). REMEMBER: click the checkbox “Add this data to the Data Model”
2. Create a PIVOT TABLE to calculate the following for each geographic region:
a. Average Total discharges
b. Average Covered charges
c. Average Total Payments
d. Average Medicare Payments
e. Average Excess charges
f. Average Cost-to-charge ratio (CCR)
3. Use COUNTIF to count the number of rural and urban hospitals (compare these results to what is provided in a PIVOT TABLE)
=COUNTIF(DRG[Geo],"Urban")
=COUNTIF(DRG[Geo],"Rural")
4. Use SUMPRODUCT to count the number of rural and urban hospitals that have a cost-to-charge ratio greater than or equal to 0.5 and those less than 0.5 (How should we normalize these results? Calculate the proportion!).
=SUMPRODUCT((DRG[Geo]="Urban")*(DRG[CCR]<0.5))
=SUMPRODUCT((DRG[Geo]="Urban")*(DRG[CCR]>=0.5))
=SUMPRODUCT((DRG[Geo]="Rural")*(DRG[CCR]<0.5))
=SUMPRODUCT((DRG[Geo]="Rural")*(DRG[CCR]>=0.5))
5. Create a PIVOT TABLE of the count of each MS-DRG
6. Create graphs to depict the above information (INSERT – CHARTS)
7. Open R
8. Open R commander
a. Type the following into R:
library(Rcmdr)
9. Import the data into R Commander using the following script:
dataset<- read.csv(file.choose())
Locate the IPPS csv data file and click “OK”
10. Activate the dataset in R commander
a. Click <No active dataset> and find “dataset”
b. Confirm the number of rows and columns as compared to the original dataset
11. Obtain a summary of the following numeric data (Statistics – Summaries – Numeric Summaries – Hold down Ctrl and click the variable names shown below – Click OK):
a. Average Covered charges
b. Average Total Payments
c. Average Medicare Payments
d. Excess charges
e. Cost-to-charge ratio (CCR)
12. Create two graphs using “Plot of means” to compare Average Covered Charges, Average Total Payments, Excess Charge, and CCR by geographic location
13. Use a two-sample t-test to determine if there are statistically significant differences in the following data between rural and urban hospitals:
a. Count of hospitals
b. Total discharges
c. Covered charges
d. Total Payments
e. Medicare Payments
f. Excess charges
g. Cost-to-charge ratio (CCR)
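The counts in items 3 and 4 and the test in item 13 can be cross-checked with base R commands. This is a minimal sketch under stated assumptions: the data frame name drg is hypothetical, the columns Geo and CCR are assumed to match the DRG table columns used in the Excel formulas, and the values are made up for illustration. Note that t.test() runs the Welch (unequal variances) version by default.

```r
# Hypothetical prepared data; values are illustrative only.
drg <- data.frame(
  Geo = c("Urban", "Urban", "Urban", "Urban", "Rural", "Rural", "Rural"),
  CCR = c(0.20, 0.35, 0.55, 0.40, 0.60, 0.45, 0.70)
)

# Counterpart of COUNTIF: number of rows per geography.
counts <- table(drg$Geo)

# Counterpart of the four SUMPRODUCT formulas: geography against CCR >= 0.5.
high_ccr <- table(drg$Geo, drg$CCR >= 0.5)

# Normalize by row so each geography sums to 1 (the proportion).
props <- prop.table(high_ccr, margin = 1)

# Two-sample t-test of CCR by geography (Welch test by default).
tt <- t.test(CCR ~ Geo, data = drg)

print(counts)
print(props)
print(tt)
```

Dividing by the row totals answers the normalization question in item 4: raw counts favor whichever geography has more hospitals, while proportions are comparable across groups.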
Step 5: Evaluation
1. Summarize the findings
a. Are there confounding variables that we should have considered in our analysis?
i. Hint: Frequency of MS-DRG codes for each geographic location
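One way to follow the hint is to compare the mix of MS-DRG codes within each geography. The sketch below is only an illustration: the data frame drg, the column names Geo and MSDRG, and the rows themselves are hypothetical.

```r
# Hypothetical prepared data; the DRG labels and rows are illustrative only.
drg <- data.frame(
  Geo   = c("Urban", "Urban", "Urban", "Rural", "Rural", "Rural"),
  MSDRG = c("470", "470", "291", "291", "291", "470")
)

# Count of records per MS-DRG within each geography.
mix <- table(drg$MSDRG, drg$Geo)

# Column proportions: the share of each MS-DRG within Urban and within Rural.
mix_share <- prop.table(mix, margin = 2)

print(round(mix_share, 2))
```

If the shares differ noticeably between the two columns, differences in charges and payments may partly reflect case mix rather than geography alone.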
Step 6: Deployment
1. How would these findings be relevant to your organization and what might your organization do with this sort of information?