Overview

Dataset Statistics

Number of Variables 14
Number of Rows 2550
Missing Cells 5022
Missing Cells (%) 14.1%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 848.2 KB
Average Row Size in Memory 340.6 B
Variable Types
  • Categorical: 5
  • Numerical: 9
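
The percentage and per-row figures in this block can be reproduced from the raw counts. A minimal sketch of the arithmetic, assuming the report divides missing cells by rows × variables and uses binary kilobytes (1 KB = 1024 B); both are assumptions about the profiling tool, which the report does not document:

```python
# Figures copied from the Dataset Statistics block.
n_variables = 14
n_rows = 2550
missing_cells = 5022
total_size_kb = 848.2

# Assumption: the denominator is every cell in the table (rows x variables).
total_cells = n_rows * n_variables
missing_pct = 100 * missing_cells / total_cells

# Assumption: the tool reports binary kilobytes (1 KB = 1024 B).
avg_row_bytes = total_size_kb * 1024 / n_rows

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average row size: {avg_row_bytes:.1f} B")
```

Under those assumptions the results round to the 14.1% and 340.6 B reported above.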

Dataset Insights

Similar Distribution
  • AETNA and UHC have similar distributions
  • AETNA and CIGNA have similar distributions
  • BCBS and HUMANA MEDICARE have similar distributions
Missing
  • De-identified Maximum has 296 (11.61%) missing values
  • De-identified Minimum has 296 (11.61%) missing values
  • AETNA has 736 (28.86%) missing values
  • UHC has 722 (28.31%) missing values
  • BCBS has 730 (28.63%) missing values
  • HUMANA has 751 (29.45%) missing values
  • HUMANA MEDICARE has 747 (29.29%) missing values
  • CIGNA has 743 (29.14%) missing values
Skewed
  • Self Pay is skewed
  • Gross Charge is skewed
  • De-identified Maximum is skewed
  • De-identified Minimum is skewed
  • AETNA is skewed
  • UHC is skewed
  • BCBS is skewed
  • HUMANA MEDICARE is skewed
  • CIGNA is skewed
High Cardinality
  • CPT/MS-DRG has a high cardinality: 1833 distinct values
Constant
  • Filename has constant value "wakemed"
  • HUMANA has constant value "0.0"
  • system has constant value "WAKEMED"
Constant Length
  • Filename has constant length 7
  • HUMANA has constant length 3
  • system has constant length 7
Zeros
  • AETNA has 1042 (40.86%) zeros
  • UHC has 1170 (45.88%) zeros
  • BCBS has 1424 (55.84%) zeros
  • HUMANA MEDICARE has 1421 (55.73%) zeros
  • CIGNA has 1008 (39.53%) zeros
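
The per-column missing counts listed in the insights can be reconciled with the dataset-level total of 5022 missing cells. A short sketch, using only counts that appear in this report (the dictionary name is illustrative):

```python
# Per-column missing counts as listed in the Dataset Insights.
missing_by_column = {
    "De-identified Maximum": 296,
    "De-identified Minimum": 296,
    "AETNA": 736,
    "UHC": 722,
    "BCBS": 730,
    "HUMANA": 751,
    "HUMANA MEDICARE": 747,
    "CIGNA": 743,
}
n_rows = 2550

flagged_total = sum(missing_by_column.values())
for name, count in missing_by_column.items():
    print(f"{name}: {100 * count / n_rows:.2f}%")
print("Total in flagged columns:", flagged_total)
```

The flagged columns account for 5021 of the 5022 missing cells. The remaining one is the single missing Gross Charge value recorded in that variable's section, which does not appear in the insights list.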

Variables

CPT/MS-DRG

categorical

Approximate Distinct Count 1833
Approximate Unique (%) 71.9%
Missing 0
Missing (%) 0.0%
Memory Size 183.3 KB

Length

Mean 8.5925
Standard Deviation 8.0306
Median 5
Minimum 3
Maximum 143

Sample

1st row 001, 002
2nd row 001,002
3rd row 0100
4th row 0100
5th row 0100

Letter

Count 68
Lowercase Letter 0
Space Separator 42
Uppercase Letter 68
Dash Punctuation 107
Decimal Number 20183
  • CPT/MS-DRG contains many words: 1850 words
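
The first two sample rows ("001, 002" and "001,002") differ only by a space after the comma, so they count as two distinct values. A hedged sketch of one way to normalize such entries before counting, assuming the field holds a comma-separated list of codes (the report does not document the format) and using a hypothetical helper named `normalize_codes`:

```python
def normalize_codes(raw: str) -> str:
    """Split on commas, trim whitespace, and rejoin with a single comma.

    Assumption: the field holds a comma-separated list of codes.
    """
    parts = [part.strip() for part in raw.split(",")]
    return ",".join(part for part in parts if part)


# Sample values taken from the rows shown above.
samples = ["001, 002", "001,002", "0100"]
normalized = [normalize_codes(s) for s in samples]
print(normalized)
print(len(set(samples)), "distinct before,", len(set(normalized)), "after")
```

Whether this kind of variation explains a meaningful share of the 1833 distinct values cannot be judged from the five sample rows alone.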

Patient Type

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 186.7 KB
  • The largest value (Outpatient) is over 24.5 times larger than the second largest value (Inpatient)

Length

Mean 9.9608
Standard Deviation 0.1941
Median 10
Minimum 9
Maximum 10

Sample

1st row Inpatient
2nd row Inpatient
3rd row Inpatient
4th row Inpatient
5th row Inpatient

Letter

Count 25400
Lowercase Letter 22850
Space Separator 0
Uppercase Letter 2550
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Outpatient, Inpatient) take over 50.0%
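
The category counts are not printed, but they can be recovered from the length statistics. A small sketch, assuming "Outpatient" (10 characters) and "Inpatient" (9 characters) are the only two values, as the distinct count of 2 suggests:

```python
n_rows = 2550
mean_length = 9.9608  # from the Length block

len_out = len("Outpatient")  # 10 characters
len_in = len("Inpatient")    # 9 characters

# mean = (len_out * n_out + len_in * n_in) / n_rows, with n_out + n_in = n_rows
n_out = round((mean_length - len_in) * n_rows / (len_out - len_in))
n_in = n_rows - n_out

# Cross-check against the Letter block: each value has one uppercase letter,
# so lowercase letters number 9 per "Outpatient" and 8 per "Inpatient".
lowercase_letters = 9 * n_out + 8 * n_in

print(n_out, n_in, n_out / n_in, lowercase_letters)
```

The recovered split is consistent with the 24.5 ratio quoted in the insight and with the lowercase letter count of 22850.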

Self Pay

numerical

Approximate Distinct Count 2140
Approximate Unique (%) 83.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 39.8 KB
Mean 4910.1792
Minimum 0
Maximum 202120.4534
Zeros 13
Zeros (%) 0.5%
Negatives 0
Negatives (%) 0.0%
  • Self Pay is skewed right (γ1 = 7.27)

Quantile Statistics

Minimum 0
5-th Percentile 24.769
Q1 93.0675
Median 1334.71
Q3 5399.845
95-th Percentile 20311.462
Maximum 202120.4534
Range 202120.4534
IQR 5306.7775

Descriptive Statistics

Mean 4910.1792
Standard Deviation 11256.036
Variance 1.267e+08
Sum 1.2521e+07
Skewness 7.27
Kurtosis 89.3958
Coefficient of Variation 2.2924
  • Self Pay is not normally distributed (p-value 1.9873049086569957e-23)
  • Self Pay has 195 outliers
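
Several of the figures above are derived from others and can be checked directly. The sketch below recomputes the IQR, variance, and coefficient of variation, then shows the fences that Tukey's 1.5 × IQR rule would give. The report does not say which rule produced the count of 195 outliers, so the fence calculation is an assumption for illustration only:

```python
# Values copied from the Self Pay statistics.
q1, q3 = 93.0675, 5399.845
mean, std = 4910.1792, 11256.036

iqr = q3 - q1            # interquartile range
variance = std ** 2      # variance is the squared standard deviation
coeff_var = std / mean   # coefficient of variation

# Assumption: Tukey's rule with k = 1.5; the report does not state its rule.
k = 1.5
lower_fence = q1 - k * iqr
upper_fence = q3 + k * iqr

print(f"IQR={iqr:.4f}  variance={variance:.4g}  CV={coeff_var:.4f}")
print(f"Tukey fences: [{lower_fence:.2f}, {upper_fence:.2f}]")
```

Because Self Pay has no negative values, only the upper fence would flag anything under this rule.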

Gross Charge

numerical

Approximate Distinct Count 2153
Approximate Unique (%) 84.5%
Missing 1
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 39.8 KB
Mean 12926.5934
Minimum 0
Maximum 531895.93
Zeros 12
Zeros (%) 0.5%
Negatives 0
Negatives (%) 0.0%
  • Gross Charge is skewed right (γ1 = 7.2689)

Quantile Statistics

Minimum 0
5-th Percentile 65.234
Q1 246.07
Median 3518.82
Q3 14212.21
95-th Percentile 53461.142
Maximum 531895.93
Range 531895.93
IQR 13966.14

Descriptive Statistics

Mean 12926.5934
Standard Deviation 29625.8532
Variance 8.7769e+08
Sum 3.295e+07
Skewness 7.2689
Kurtosis 89.3683
Coefficient of Variation 2.2919
  • Gross Charge is not normally distributed (p-value 1.995139566258149e-23)
  • Gross Charge has 195 outliers
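
Gross Charge and Self Pay report nearly identical skewness, kurtosis, and coefficient of variation, which suggests comparing their summary statistics directly. A quick sketch that takes the ratio of each Self Pay statistic to the matching Gross Charge statistic, using only values printed in the two sections:

```python
# (Self Pay, Gross Charge) pairs copied from the two variable sections.
pairs = {
    "mean": (4910.1792, 12926.5934),
    "median": (1334.71, 3518.82),
    "Q3": (5399.845, 14212.21),
    "95th percentile": (20311.462, 53461.142),
    "maximum": (202120.4534, 531895.93),
    "std dev": (11256.036, 29625.8532),
}

ratios = {name: self_pay / gross for name, (self_pay, gross) in pairs.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.4f}")
```

The ratios all sit close to 0.38, so the two columns appear to differ mainly by a scale factor. This is an observation from the summary statistics only; the report does not state how Self Pay is derived.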

De-identified Maximum

numerical

Approximate Distinct Count 2040
Approximate Unique (%) 90.5%
Missing 296
Missing (%) 11.6%
Infinite 0
Infinite (%) 0.0%
Memory Size 35.2 KB
Mean 5595.5045
Minimum 0
Maximum 180018.82
Zeros 34
Zeros (%) 1.3%
Negatives 0
Negatives (%) 0.0%
  • De-identified Maximum is skewed right (γ1 = 6.2154)

Quantile Statistics

Minimum 0
5-th Percentile 14.978
Q1 79.4675
Median 1397.57
Q3 6612.93
95-th Percentile 22718.292
Maximum 180018.82
Range 180018.82
IQR 6533.4625

Descriptive Statistics

Mean 5595.5045
Standard Deviation 12280.3903
Variance 1.5081e+08
Sum 1.2612e+07
Skewness 6.2154
Kurtosis 59.041
Coefficient of Variation 2.1947
  • De-identified Maximum is not normally distributed (p-value 1.0680469543834992e-23)
  • De-identified Maximum has 179 outliers
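
The γ1 values in this report are skewness coefficients. A minimal sketch of the population Fisher-Pearson form, m3 / m2^1.5, run on made-up data; whether the profiling tool applies a bias correction is not stated, so the exact formula is an assumption:

```python
def skewness(values):
    """Population Fisher-Pearson skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Illustrative data only (not from the dataset): many small values, one large.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 50]
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed))  # positive: long right tail
print(skewness(symmetric))     # zero for a symmetric sample
```

A large positive value, such as the 6.2154 reported here, indicates that a small number of very high charges pull the mean well above the median.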

De-identified Minimum

numerical

Approximate Distinct Count 1911
Approximate Unique (%) 84.8%
Missing 296
Missing (%) 11.6%
Infinite 0
Infinite (%) 0.0%
Memory Size 35.2 KB
Mean 2961.4619
Minimum 0
Maximum 153084.67
Zeros 34
Zeros (%) 1.3%
Negatives 0
Negatives (%) 0.0%
  • De-identified Minimum is skewed right (γ1 = 8.4583)

Quantile Statistics

Minimum 0
5-th Percentile 4.5285
Q1 26.41
Median 372.775
Q3 2888.7825
95-th Percentile 12183.502
Maximum 153084.67
Range 153084.67
IQR 2862.3725

Descriptive Statistics

Mean 2961.4619
Standard Deviation 7885.3095
Variance 6.2178e+07
Sum 6.6751e+06
Skewness 8.4583
Kurtosis 107.2686
Coefficient of Variation 2.6626
  • De-identified Minimum is not normally distributed (p-value 1.5034448487585392e-24)
  • De-identified Minimum has 257 outliers

Filename

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 179.3 KB

Length

Mean 7
Standard Deviation 0
Median 7
Minimum 7
Maximum 7

Sample

1st row wakemed
2nd row wakemed
3rd row wakemed
4th row wakemed
5th row wakemed

Letter

Count 17850
Lowercase Letter 17850
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • Filename has words of constant length

AETNA

numerical

Approximate Distinct Count 754
Approximate Unique (%) 41.6%
Missing 736
Missing (%) 28.9%
Infinite 0
Infinite (%) 0.0%
Memory Size 28.3 KB
Mean 1399.1504
Minimum 0
Maximum 49650.265
Zeros 1042
Zeros (%) 40.9%
Negatives 0
Negatives (%) 0.0%
  • AETNA is skewed right (γ1 = 4.9825)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 299.7325
95-th Percentile 9115.4825
Maximum 49650.265
Range 49650.265
IQR 299.7325

Descriptive Statistics

Mean 1399.1504
Standard Deviation 4165.5808
Variance 1.7352e+07
Sum 2.5381e+06
Skewness 4.9825
Kurtosis 32.6702
Coefficient of Variation 2.9772
  • AETNA is not normally distributed (p-value 5.536523627052942e-25)
  • AETNA has 383 outliers

UHC

numerical

Approximate Distinct Count 514
Approximate Unique (%) 28.1%
Missing 722
Missing (%) 28.3%
Infinite 0
Infinite (%) 0.0%
Memory Size 28.6 KB
Mean 1393.2124
Minimum 0
Maximum 91737.01
Zeros 1170
Zeros (%) 45.9%
Negatives 0
Negatives (%) 0.0%
  • UHC is skewed right (γ1 = 7.9046)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 67.96
95-th Percentile 7976
Maximum 91737.01
Range 91737.01
IQR 67.96

Descriptive Statistics

Mean 1393.2124
Standard Deviation 5292.6383
Variance 2.8012e+07
Sum 2.5468e+06
Skewness 7.9046
Kurtosis 88.6282
Coefficient of Variation 3.7989
  • UHC is not normally distributed (p-value 5.048770258423612e-25)
  • UHC has 436 outliers

BCBS

numerical

Approximate Distinct Count 359
Approximate Unique (%) 19.7%
Missing 730
Missing (%) 28.6%
Infinite 0
Infinite (%) 0.0%
Memory Size 28.4 KB
Mean 707.8927
Minimum 0
Maximum 180018.82
Zeros 1424
Zeros (%) 55.8%
Negatives 0
Negatives (%) 0.0%
  • BCBS is skewed right (γ1 = 21.8317)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 0
95-th Percentile 2672.42
Maximum 180018.82
Range 180018.82
IQR 0

Descriptive Statistics

Mean 707.8927
Standard Deviation 5615.78
Variance 3.1537e+07
Sum 1.2884e+06
Skewness 21.8317
Kurtosis 621.0169
Coefficient of Variation 7.9331
  • BCBS is not normally distributed (p-value 4.3022391946372155e-25)
  • BCBS has 396 outliers

HUMANA

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.1%
Missing 751
Missing (%) 29.4%
Memory Size 119.5 KB

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 0.0
3rd row 0.0
4th row 0.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 3598
  • HUMANA has words of constant length
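
HUMANA is reported as categorical even though its only non-missing value, "0.0", reads as a number. A sketch of a check for such columns, assuming missing cells are represented as `None` (an assumption about how the data is loaded) and using a hypothetical helper named `is_constant`:

```python
def is_constant(values):
    """Return True if all non-missing entries share a single value."""
    present = {v for v in values if v is not None}
    return len(present) == 1


# Toy column shaped like HUMANA: the string "0.0" or missing.
humana_like = ["0.0", None, "0.0", "0.0", None]
mixed = ["0.0", "12.5", None]

print(is_constant(humana_like))
print(is_constant(mixed))

# The single value parses cleanly as a float, so the column could be
# treated as numeric (all zeros) rather than categorical.
print(float(humana_like[0]))
```

A column that passes this check carries no information beyond its missingness pattern, which is why the report flags it as Constant.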

HUMANA MEDICARE

numerical

Approximate Distinct Count 338
Approximate Unique (%) 18.8%
Missing 747
Missing (%) 29.3%
Infinite 0
Infinite (%) 0.0%
Memory Size 28.2 KB
Mean 475.1358
Minimum 0
Maximum 86787.12
Zeros 1421
Zeros (%) 55.7%
Negatives 0
Negatives (%) 0.0%
  • HUMANA MEDICARE is skewed right (γ1 = 16.8568)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 0
95-th Percentile 2344.756
Maximum 86787.12
Range 86787.12
IQR 0

Descriptive Statistics

Mean 475.1358
Standard Deviation 3011.2396
Variance 9.0676e+06
Sum 856669.768
Skewness 16.8568
Kurtosis 406.0486
Coefficient of Variation 6.3376
  • HUMANA MEDICARE is not normally distributed (p-value 4.462322127151517e-25)
  • HUMANA MEDICARE has 382 outliers

CIGNA

numerical

Approximate Distinct Count 776
Approximate Unique (%) 42.9%
Missing 743
Missing (%) 29.1%
Infinite 0
Infinite (%) 0.0%
Memory Size 28.2 KB
Mean 1561.9656
Minimum 0
Maximum 91830.84
Zeros 1008
Zeros (%) 39.5%
Negatives 0
Negatives (%) 0.0%
  • CIGNA is skewed right (γ1 = 7.9872)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 221.56
95-th Percentile 8149.876
Maximum 91830.84
Range 91830.84
IQR 221.56

Descriptive Statistics

Mean 1561.9656
Standard Deviation 5666.7477
Variance 3.2112e+07
Sum 2.8225e+06
Skewness 7.9872
Kurtosis 87.6869
Coefficient of Variation 3.628
  • CIGNA is not normally distributed (p-value 5.325363681231574e-25)
  • CIGNA has 395 outliers
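
The Zeros (%) figures for the payer columns are relative to all 2550 rows, missing values included. Recomputing the share of zeros among non-missing values helps explain why the medians, and in two cases Q3, are 0. The sketch below uses only counts from the payer sections and assumes the quantiles are computed over non-missing values, which the report does not state:

```python
n_rows = 2550

# column: (missing, zeros), copied from the payer sections.
payers = {
    "AETNA": (736, 1042),
    "UHC": (722, 1170),
    "BCBS": (730, 1424),
    "HUMANA MEDICARE": (747, 1421),
    "CIGNA": (743, 1008),
}

zero_share = {}
for name, (missing, zeros) in payers.items():
    present = n_rows - missing  # rows with a reported value
    zero_share[name] = zeros / present
    print(f"{name}: {zeros}/{present} = {zero_share[name]:.1%} of reported values are zero")
```

All five columns have more than half of their reported values equal to zero, which matches the medians of 0. BCBS and HUMANA MEDICARE exceed three quarters, which matches their Q3 and IQR of 0.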

system

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 179.3 KB

Length

Mean 7
Standard Deviation 0
Median 7
Minimum 7
Maximum 7

Sample

1st row WAKEMED
2nd row WAKEMED
3rd row WAKEMED
4th row WAKEMED
5th row WAKEMED

Letter

Count 17850
Lowercase Letter 0
Space Separator 0
Uppercase Letter 17850
Dash Punctuation 0
Decimal Number 0
  • system has words of constant length

Interactions

Correlations

Missing Values