Overview

Dataset Statistics

Number of Variables 17
Number of Rows 2283
Missing Cells 20468
Missing Cells (%) 52.7%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 925.9 KB
Average Row Size in Memory 415.3 B
Variable Types
  • Categorical: 7
  • Numerical: 10
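
The two derived figures above (Missing Cells (%) and Average Row Size in Memory) can be reproduced from the other rows. The sketch below is only a cross-check of the reported numbers. It assumes that 1 KB means 1024 bytes, which the report does not state but which is the convention that reproduces 415.3 B.

```python
# Figures copied from the Dataset Statistics block.
n_variables = 17
n_rows = 2283
missing_cells = 20468
total_size_kb = 925.9

total_cells = n_variables * n_rows              # every cell in the table
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KB = 1024 bytes (not stated in the report).
avg_row_bytes = total_size_kb * 1024 / n_rows

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average row size: {avg_row_bytes:.1f} B")
```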

Dataset Insights

AETNA and CIGNA have similar distributions Similar Distribution
UHC MEDICARE and HUMANA MEDICARE have similar distributions Similar Distribution
UHC MEDICARE and MEDCOST have similar distributions Similar Distribution
HUMANA MEDICARE and MEDCOST have similar distributions Similar Distribution
AETNA has 1276 (55.89%) missing values Missing
AETNA MEDICARE has 2283 (100.0%) missing values Missing
UHC has 812 (35.57%) missing values Missing
UHC MEDICARE has 2146 (94.0%) missing values Missing
BCBS has 808 (35.39%) missing values Missing
BCBS MEDICARE has 2283 (100.0%) missing values Missing
HUMANA has 1517 (66.45%) missing values Missing
HUMANA MEDICARE has 2198 (96.28%) missing values Missing
CIGNA has 1466 (64.21%) missing values Missing
CIGNA MEDICARE has 2283 (100.0%) missing values Missing
TRICARE has 1165 (51.03%) missing values Missing
MEDCOST has 2201 (96.41%) missing values Missing
Gross Charge is skewed Skewed
AETNA is skewed Skewed
UHC is skewed Skewed
UHC MEDICARE is skewed Skewed
BCBS is skewed Skewed
HUMANA is skewed Skewed
HUMANA MEDICARE is skewed Skewed
CIGNA is skewed Skewed
TRICARE is skewed Skewed
MEDCOST is skewed Skewed
CPT/MS-DRG has a high cardinality: 2092 distinct values High Cardinality
Procedure Description has a high cardinality: 1986 distinct values High Cardinality
system has constant value "FIRST" Constant
system has constant length 5 Constant Length
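AETNA MEDICARE has all distinct values (trivially, since every value is missing) Unique
BCBS MEDICARE has all distinct values (trivially, since every value is missing) Unique
CIGNA MEDICARE has all distinct values (trivially, since every value is missing) Unique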
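
The per-column missing counts in the insights can be cross-checked against the dataset-level total of 20468 missing cells. The sketch below adds the two small counts that appear only in the variable sections (Procedure Description: 10, Gross Charge: 20). The 90% threshold used to shortlist sparse columns is a hypothetical choice for illustration, not something the report prescribes.

```python
n_rows = 2283

# Missing counts copied from the report.
missing = {
    "Procedure Description": 10,
    "Gross Charge": 20,
    "AETNA": 1276,
    "AETNA MEDICARE": 2283,
    "UHC": 812,
    "UHC MEDICARE": 2146,
    "BCBS": 808,
    "BCBS MEDICARE": 2283,
    "HUMANA": 1517,
    "HUMANA MEDICARE": 2198,
    "CIGNA": 1466,
    "CIGNA MEDICARE": 2283,
    "TRICARE": 1165,
    "MEDCOST": 2201,
}

total_missing = sum(missing.values())
print(total_missing)  # 20468, matching "Missing Cells"

# Hypothetical threshold: flag columns that are more than 90% empty.
MAX_MISSING_SHARE = 0.90
sparse_columns = sorted(
    name for name, count in missing.items()
    if count / n_rows > MAX_MISSING_SHARE
)
print(sparse_columns)
```

With this threshold the shortlist contains the three entirely empty MEDICARE columns plus UHC MEDICARE, HUMANA MEDICARE, and MEDCOST.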

Variables

CPT/MS-DRG

categorical

Approximate Distinct Count 2092
Approximate Unique (%) 91.6%
Missing 0
Missing (%) 0.0%
Memory Size 155.2 KB

Length

Mean 4.5935
Standard Deviation 0.805
Median 5
Minimum 3
Maximum 5

Sample

1st row 0011A
2nd row 0012A
3rd row 0042T
4th row 065
5th row 074

Letter

Count 463
Lowercase Letter 0
Space Separator 0
Uppercase Letter 463
Dash Punctuation 0
Decimal Number 10024
  • CPT/MS-DRG contains many words: 2092 words

Procedure Description

categorical

Approximate Distinct Count 1986
Approximate Unique (%) 87.4%
Missing 10
Missing (%) 0.4%
Memory Size 267.2 KB

Length

Mean 55.3964
Standard Deviation 39.6883
Median 49
Minimum 8
Maximum 875

Sample

1st row Immunization admin...
2nd row Immunization admin...
3rd row Computed tomograph...
4th row INTRACRANIAL HEMOR...
5th row CRANIAL & PERIPHER...

Letter

Count 105735
Lowercase Letter 80888
Space Separator 15393
Uppercase Letter 24847
Dash Punctuation 349
Decimal Number 1220
  • Procedure Description contains many words: 2419 words

Gross Charge

numerical

Approximate Distinct Count 2241
Approximate Unique (%) 99.0%
Missing 20
Missing (%) 0.9%
Infinite 0
Infinite (%) 0.0%
Memory Size 35.4 KB
Mean 14612.2881
Minimum 0.01
Maximum 792984
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Gross Charge is skewed right (γ1 = 8.8824)

Quantile Statistics

Minimum 0.01
5-th Percentile 75.011
Q1 261.435
Median 1751.37
Q3 16599.775
95-th Percentile 65198.293
Maximum 792984
Range 792983.99
IQR 16338.34

Descriptive Statistics

Mean 14612.2881
Standard Deviation 32825.2689
Variance 1.0775e+09
Sum 3.3068e+07
Skewness 8.8824
Kurtosis 159.6316
Coefficient of Variation 2.2464
  • Gross Charge is not normally distributed (p-value 1.6776774050522364e-24)
  • Gross Charge has 264 outliers
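
The Range, IQR, and Coefficient of Variation rows follow directly from the quantiles and moments listed for Gross Charge. The sketch below recomputes them and also derives outlier fences under the common 1.5 × IQR rule. That rule is an assumption: the report does not say how its 264 outliers were identified, so the fences are illustrative only.

```python
# Values copied from the Gross Charge section.
minimum, maximum = 0.01, 792984
q1, q3 = 261.435, 16599.775
mean, std = 14612.2881, 32825.2689

value_range = maximum - minimum
iqr = q3 - q1
cv = std / mean                      # coefficient of variation

# Assumed rule (Tukey fences); the report's own outlier rule is unknown.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"Range: {value_range:.2f}")
print(f"IQR: {iqr:.2f}")
print(f"CV: {cv:.4f}")
print(f"Fences: [{lower_fence:.1f}, {upper_fence:.1f}]")
```

Under this assumed rule the lower fence is negative, so with a minimum of 0.01 any flagged values would lie in the upper tail, which is consistent with the strong right skew.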

AETNA

numerical

Approximate Distinct Count 931
Approximate Unique (%) 92.5%
Missing 1276
Missing (%) 55.9%
Infinite 0
Infinite (%) 0.0%
Memory Size 15.7 KB
Mean 7498.893
Minimum 0.01
Maximum 249048
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • AETNA is skewed right (γ1 = 5.3851)

Quantile Statistics

Minimum 0.01
5-th Percentile 41.936
Q1 171.75
Median 571.61
Q3 3627.375
95-th Percentile 43858
Maximum 249048
Range 249047.99
IQR 3455.625

Descriptive Statistics

Mean 7498.893
Standard Deviation 21525.9003
Variance 4.6336e+08
Sum 7.5514e+06
Skewness 5.3851
Kurtosis 37.7334
Coefficient of Variation 2.8705
  • AETNA is not normally distributed (p-value 6.058808841946309e-25)
  • AETNA has 154 outliers
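
The γ1 values quoted for each numerical column are skewness coefficients. A minimal sketch of the moment-based estimator follows. It assumes the population (biased) form, since the report does not say which variant was used, and the two sample lists are made-up toy data, not values from the dataset.

```python
from statistics import fmean


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mu = fmean(values)
    m2 = sum((v - mu) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mu) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5


# Toy data for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 3, 4, 100]   # one large value drags the tail right

print(skewness(symmetric))      # zero for a symmetric sample
print(skewness(right_tailed))   # positive, i.e. skewed right
```

A positive result corresponds to the "skewed right" label used throughout the report: most charges are small, and a few very large ones stretch the upper tail.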

AETNA MEDICARE

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 2283
Missing (%) 100.0%
Memory Size 151.6 KB

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row nan
2nd row nan
3rd row nan
4th row nan
5th row nan

Letter

Count 6849
Lowercase Letter 6849
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • AETNA MEDICARE has words of constant length
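
The length and letter statistics in this section (every value exactly 3 characters long, 6849 lowercase letters) are what one would expect if empty cells had been converted to the literal text "nan" before profiling. That conversion is an assumption about how the report was produced. The sketch below only shows that the numbers are consistent with it.

```python
import math

n_rows = 2283

# Stand-in for an entirely empty column: one NaN per row.
column = [float("nan")] * n_rows

# Assumed preprocessing step: cast every value to text.
as_text = [str(value) for value in column]

lengths = {len(text) for text in as_text}
lowercase_letters = sum(ch.islower() for text in as_text for ch in text)
distinct_texts = len(set(as_text))
truly_missing = sum(math.isnan(value) for value in column)

print(lengths)             # {3}
print(lowercase_letters)   # 6849
print(distinct_texts)      # 1
print(truly_missing)       # 2283
```

The same reasoning applies to the BCBS MEDICARE and CIGNA MEDICARE sections, which report identical figures.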

UHC

numerical

Approximate Distinct Count 1323
Approximate Unique (%) 89.9%
Missing 812
Missing (%) 35.6%
Infinite 0
Infinite (%) 0.0%
Memory Size 23.0 KB
Mean 14516.7092
Minimum 0.03
Maximum 430912
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • UHC is skewed right (γ1 = 4.9414)

Quantile Statistics

Minimum 0.03
5-th Percentile 30.99
Q1 203.59
Median 994.38
Q3 8026.665
95-th Percentile 79687.885
Maximum 430912
Range 430911.97
IQR 7823.075

Descriptive Statistics

Mean 14516.7092
Standard Deviation 36252.5344
Variance 1.3142e+09
Sum 2.1354e+07
Skewness 4.9414
Kurtosis 34.5164
Coefficient of Variation 2.4973
  • UHC is not normally distributed (p-value 7.17745895832377e-25)
  • UHC has 269 outliers

UHC MEDICARE

numerical

Approximate Distinct Count 126
Approximate Unique (%) 92.0%
Missing 2146
Missing (%) 94.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 2.1 KB
Mean 1364.4079
Minimum 0.01
Maximum 30238.77
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • UHC MEDICARE is skewed right (γ1 = 4.9556)

Quantile Statistics

Minimum 0.01
5-th Percentile 8.38
Q1 51.84
Median 187.38
Q3 719
95-th Percentile 7943.122
Maximum 30238.77
Range 30238.76
IQR 667.16

Descriptive Statistics

Mean 1364.4079
Standard Deviation 3720.4975
Variance 1.3842e+07
Sum 186923.88
Skewness 4.9556
Kurtosis 29.492
Coefficient of Variation 2.7268
  • UHC MEDICARE is not normally distributed (p-value 1.3123265065938637e-24)
  • UHC MEDICARE has 20 outliers

BCBS

numerical

Approximate Distinct Count 1282
Approximate Unique (%) 86.9%
Missing 808
Missing (%) 35.4%
Infinite 0
Infinite (%) 0.0%
Memory Size 23.0 KB
Mean 8495.5555
Minimum 0.16
Maximum 575971
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • BCBS is skewed right (γ1 = 13.5055)

Quantile Statistics

Minimum 0.16
5-th Percentile 67.919
Q1 233.06
Median 1013.75
Q3 7467.51
95-th Percentile 34156
Maximum 575971
Range 575970.84
IQR 7234.45

Descriptive Statistics

Mean 8495.5555
Standard Deviation 26502.3734
Variance 7.0238e+08
Sum 1.2531e+07
Skewness 13.5055
Kurtosis 258.3061
Coefficient of Variation 3.1196
  • BCBS is not normally distributed (p-value 1.2409566666375579e-24)
  • BCBS has 192 outliers

BCBS MEDICARE

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 2283
Missing (%) 100.0%
Memory Size 151.6 KB

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row nan
2nd row nan
3rd row nan
4th row nan
5th row nan

Letter

Count 6849
Lowercase Letter 6849
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • BCBS MEDICARE has words of constant length

HUMANA

numerical

Approximate Distinct Count 693
Approximate Unique (%) 90.5%
Missing 1517
Missing (%) 66.5%
Infinite 0
Infinite (%) 0.0%
Memory Size 12.0 KB
Mean 14911.1447
Minimum 0.01
Maximum 792984
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • HUMANA is skewed right (γ1 = 8.3322)

Quantile Statistics

Minimum 0.01
5-th Percentile 16.8675
Q1 125.9175
Median 577.055
Q3 5192.7075
95-th Percentile 79905.4375
Maximum 792984
Range 792983.99
IQR 5066.79

Descriptive Statistics

Mean 14911.1447
Standard Deviation 49559.1957
Variance 2.4561e+09
Sum 1.1422e+07
Skewness 8.3322
Kurtosis 100.0832
Coefficient of Variation 3.3236
  • HUMANA is not normally distributed (p-value 5.763772131575278e-25)
  • HUMANA has 138 outliers

HUMANA MEDICARE

numerical

Approximate Distinct Count 79
Approximate Unique (%) 92.9%
Missing 2198
Missing (%) 96.3%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.3 KB
Mean 1249.5459
Minimum 1.2
Maximum 25665.49
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • HUMANA MEDICARE is skewed right (γ1 = 5.2736)

Quantile Statistics

Minimum 1.2
5-th Percentile 7.744
Q1 52.34
Median 191.41
Q3 764
95-th Percentile 4367.448
Maximum 25665.49
Range 25664.29
IQR 711.66

Descriptive Statistics

Mean 1249.5459
Standard Deviation 3953.9223
Variance 1.5634e+07
Sum 106211.4
Skewness 5.2736
Kurtosis 28.3338
Coefficient of Variation 3.1643
  • HUMANA MEDICARE is not normally distributed (p-value 2.9097932480784182e-24)
  • HUMANA MEDICARE has 10 outliers

CIGNA

numerical

Approximate Distinct Count 750
Approximate Unique (%) 91.8%
Missing 1466
Missing (%) 64.2%
Infinite 0
Infinite (%) 0.0%
Memory Size 12.8 KB
Mean 3033.7042
Minimum 0.02
Maximum 106740.75
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • CIGNA is skewed right (γ1 = 6.1763)

Quantile Statistics

Minimum 0.02
5-th Percentile 52.32
Q1 171.75
Median 451.72
Q3 1834.89
95-th Percentile 15446.852
Maximum 106740.75
Range 106740.73
IQR 1663.14

Descriptive Statistics

Mean 3033.7042
Standard Deviation 7819.8733
Variance 6.115e+07
Sum 2.4785e+06
Skewness 6.1763
Kurtosis 55.4426
Coefficient of Variation 2.5777
  • CIGNA is not normally distributed (p-value 8.674158922574665e-25)
  • CIGNA has 133 outliers

CIGNA MEDICARE

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 2283
Missing (%) 100.0%
Memory Size 151.6 KB

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row nan
2nd row nan
3rd row nan
4th row nan
5th row nan

Letter

Count 6849
Lowercase Letter 6849
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • CIGNA MEDICARE has words of constant length

TRICARE

numerical

Approximate Distinct Count 946
Approximate Unique (%) 84.6%
Missing 1165
Missing (%) 51.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 17.5 KB
Mean 3761.1225
Minimum 0.03
Maximum 162722
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • TRICARE is skewed right (γ1 = 7.546)

Quantile Statistics

Minimum 0.03
5-th Percentile 13.1195
Q1 104.88
Median 430.93
Q3 2621.3975
95-th Percentile 13598.1325
Maximum 162722
Range 162721.97
IQR 2516.5175

Descriptive Statistics

Mean 3761.1225
Standard Deviation 12576.953
Variance 1.5818e+08
Sum 4.2049e+06
Skewness 7.546
Kurtosis 69.8537
Coefficient of Variation 3.3439
  • TRICARE is not normally distributed (p-value 1.3000270621046005e-24)
  • TRICARE has 126 outliers

MEDCOST

numerical

Approximate Distinct Count 75
Approximate Unique (%) 91.5%
Missing 2201
Missing (%) 96.4%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.3 KB
Mean 247.9885
Minimum 5.5
Maximum 4608.75
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • MEDCOST is skewed right (γ1 = 5.244)

Quantile Statistics

Minimum 5.5
5-th Percentile 12.4235
Q1 28.005
Median 66.78
Q3 191.3125
95-th Percentile 1047.0125
Maximum 4608.75
Range 4603.25
IQR 163.3075

Descriptive Statistics

Mean 247.9885
Standard Deviation 638.9421
Variance 408247.0066
Sum 20335.06
Skewness 5.244
Kurtosis 29.6917
Coefficient of Variation 2.5765
  • MEDCOST is not normally distributed (p-value 2.4388924080358592e-23)
  • MEDCOST has 8 outliers
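
For the numerical columns, the Approximate Unique (%) rows appear to be the distinct count divided by the number of non-missing values rather than by the total row count. This is an inference from the figures, not something the report states. The sketch below checks it against every numerical column.

```python
n_rows = 2283

# name: (approximate distinct count, missing count, reported unique %)
columns = {
    "Gross Charge": (2241, 20, 99.0),
    "AETNA": (931, 1276, 92.5),
    "UHC": (1323, 812, 89.9),
    "UHC MEDICARE": (126, 2146, 92.0),
    "BCBS": (1282, 808, 86.9),
    "HUMANA": (693, 1517, 90.5),
    "HUMANA MEDICARE": (79, 2198, 92.9),
    "CIGNA": (750, 1466, 91.8),
    "TRICARE": (946, 1165, 84.6),
    "MEDCOST": (75, 2201, 91.5),
}

unique_pct = {}
for name, (distinct, missing, reported) in columns.items():
    present = n_rows - missing                 # non-missing values
    unique_pct[name] = 100 * distinct / present
    print(f"{name}: computed {unique_pct[name]:.1f}%, reported {reported}%")
```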

Filename

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 186.4 KB
  • The largest value (first-health-moore) is over 7.49 times larger than the second largest value (first-health-montgomery)

Length

Mean 18.5891
Standard Deviation 1.6124
Median 18
Minimum 18
Maximum 23

Sample

1st row first-health-montg...
2nd row first-health-montg...
3rd row first-health-montg...
4th row first-health-montg...
5th row first-health-montg...

Letter

Count 37873
Lowercase Letter 37873
Space Separator 0
Uppercase Letter 0
Dash Punctuation 4566
Decimal Number 0
  • The top 2 categories (first-health-moore, first-health-montgomery) take over 50.0%
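
The split between the two filenames is not printed directly, but it can be inferred from the mean length. The sketch below assumes that only the two values named above occur (consistent with a distinct count of 2) and solves for how many rows carry the longer name. The resulting counts are derived estimates, not figures quoted by the report.

```python
n_rows = 2283
mean_length = 18.5891

short_name = "first-health-moore"        # 18 characters
long_name = "first-health-montgomery"    # 23 characters

# Solve len(short) * (n - k) + len(long) * k = mean_length * n for k.
extra_per_row = len(long_name) - len(short_name)
k_long = round((mean_length - len(short_name)) * n_rows / extra_per_row)
k_short = n_rows - k_long
ratio = k_short / k_long


def letters(text):
    """Count alphabetic characters, ignoring dashes."""
    return sum(ch.isalpha() for ch in text)


total_letters = letters(short_name) * k_short + letters(long_name) * k_long
total_dashes = short_name.count("-") * k_short + long_name.count("-") * k_long

print(k_short, k_long)       # 2014 269
print(f"{ratio:.2f}")        # 7.49
print(total_letters)         # 37873
print(total_dashes)          # 4566
```

The inferred counts reproduce the reported ratio of about 7.49 as well as the lowercase-letter and dash totals in the Letter block.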

system

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 156.1 KB

Length

Mean 5
Standard Deviation 0
Median 5
Minimum 5
Maximum 5

Sample

1st row FIRST
2nd row FIRST
3rd row FIRST
4th row FIRST
5th row FIRST

Letter

Count 11415
Lowercase Letter 0
Space Separator 0
Uppercase Letter 11415
Dash Punctuation 0
Decimal Number 0
  • system has words of constant length
