Overview

Dataset Statistics

Number of Variables 23
Number of Rows 85839
Missing Cells 7560
Missing Cells (%) 0.4%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 252.0 MB
Average Row Size in Memory 3.0 KB
Variable Types
  • Categorical: 21
  • Numerical: 2
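
The percentage and per-row figures above follow from the raw counts. The short check below recomputes them from the reported values. Treating MB and KB as binary units (1024-based) is an assumption; it is the reading that reproduces the reported 3.0 KB.

```python
# Figures copied from the Dataset Statistics table above.
n_variables = 23
n_rows = 85_839
missing_cells = 7_560
total_size_mb = 252.0

# Missing Cells (%) = missing cells / total cells.
total_cells = n_variables * n_rows
missing_pct = 100 * missing_cells / total_cells

# ASSUMPTION: binary units (1 MB = 1024 KB, 1 KB = 1024 bytes).
avg_row_kb = total_size_mb * 1024 / n_rows

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average row size: {avg_row_kb:.1f} KB")
```

The 7560 missing cells are the 3780 missing values in Internal Proc Code plus the 3780 in Other Code.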

Dataset Insights

Internal Proc Code has 3780 (4.4%) missing values (Missing)
Other Code has 3780 (4.4%) missing values (Missing)
Internal Proc Code is skewed (Skewed)
HCPCS/ CPT/NDC Code has a high cardinality: 7566 distinct values (High Cardinality)
Procedure Name has a high cardinality: 83493 distinct values (High Cardinality)
Charge has a high cardinality: 15130 distinct values (High Cardinality)
Medicare Adv - Aetna has a high cardinality: 14152 distinct values (High Cardinality)
Medicare Adv - BCBS has a high cardinality: 13452 distinct values (High Cardinality)
Medicare Adv - Humana has a high cardinality: 13498 distinct values (High Cardinality)
Medicare Adv - United has a high cardinality: 13452 distinct values (High Cardinality)
Medicare Adv - Other has a high cardinality: 13452 distinct values (High Cardinality)
Aetna has a high cardinality: 15609 distinct values (High Cardinality)
BCBS has a high cardinality: 15272 distinct values (High Cardinality)
Cigna has a high cardinality: 15188 distinct values (High Cardinality)
Exchange/Ambetter has a high cardinality: 14592 distinct values (High Cardinality)
Medcost has a high cardinality: 14875 distinct values (High Cardinality)
United has a high cardinality: 15093 distinct values (High Cardinality)
Other Mgd Care has a high cardinality: 14934 distinct values (High Cardinality)
Tricare has a high cardinality: 13394 distinct values (High Cardinality)
Workers Comp has a high cardinality: 14501 distinct values (High Cardinality)
Self Pay has a high cardinality: 14571 distinct values (High Cardinality)
De-identified Maximum has a high cardinality: 15038 distinct values (High Cardinality)
De-identified Minimum has constant value " - " (Constant)
Filename has constant value "56-2070036_DRH_standardcharges_cdm.csv" (Constant)
De-identified Minimum has constant length 5 (Constant Length)
Filename has constant length 38 (Constant Length)

Variables

HCPCS/ CPT/NDC Code

categorical

Approximate Distinct Count 7566
Approximate Unique (%) 8.8%
Missing 0
Missing (%) 0.0%
Memory Size 5.8 MB
  • The most frequent value (C1713) occurs over 2.17 times as often as the second most frequent value (27800169)

Length

Mean 6.4088
Standard Deviation 2.1069
Median 5
Minimum 5
Maximum 47

Sample

1st row 25000325
2nd row 96360
3rd row 96361
4th row 96365
5th row 96366

Letter

Count 56305
Lowercase Letter 0
Space Separator 2449
Uppercase Letter 56305
Dash Punctuation 7276
Decimal Number 481651
  • HCPCS/ CPT/NDC Code contains many words: 7493 words
  • The most frequent word (c1713) occurs over 2.17 times as often as the second most frequent word (27800169)

Internal Proc Code

numerical

Approximate Distinct Count 7911
Approximate Unique (%) 9.6%
Missing 3780
Missing (%) 4.4%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.3 MB
Mean 2.8332e+07
Minimum 2.5e+07
Maximum 9.99e+07
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Internal Proc Code is skewed right (γ1 = 9.9197)

Quantile Statistics

Minimum 2.5e+07
5-th Percentile 2.78e+07
Q1 2.78e+07
Median 2.78e+07
Q3 2.78e+07
95-th Percentile 2.788e+07
Maximum 9.99e+07
Range 7.49e+07
IQR 185

Descriptive Statistics

Mean 2.8332e+07
Standard Deviation 4.6878e+06
Variance 2.1976e+13
Sum 2.3249e+12
Skewness 9.9197
Kurtosis 102.6919
Coefficient of Variation 0.1655
  • Internal Proc Code is not normally distributed (p-value 4.274919599004023e-25)
  • Internal Proc Code has 14257 outliers
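
The derived figures in this section can be cross-checked against one another. The sketch below recomputes them from the reported mean, standard deviation, and counts. One observation, not something the report states: Approximate Unique (%) only matches when the denominator is the number of non-missing rows.

```python
# Figures copied from the Internal Proc Code section above.
n_rows = 85_839
missing = 3_780
distinct = 7_911
mean = 2.8332e7
std = 4.6878e6

non_missing = n_rows - missing

# Distinct share over non-missing rows reproduces the reported 9.6%;
# dividing by all rows would give roughly 9.2% instead.
unique_pct = 100 * distinct / non_missing
unique_pct_all_rows = 100 * distinct / n_rows

variance = std ** 2          # reported: 2.1976e+13
cv = std / mean              # reported: 0.1655
total = mean * non_missing   # reported sum: 2.3249e+12

print(f"unique: {unique_pct:.1f}% (over all rows: {unique_pct_all_rows:.1f}%)")
print(f"variance: {variance:.4e}, cv: {cv:.4f}, sum: {total:.4e}")
```

Small differences in the last digit are expected because the inputs are already rounded to five significant figures.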

Other Code

numerical

Approximate Distinct Count 78123
Approximate Unique (%) 95.2%
Missing 3780
Missing (%) 4.4%
Infinite 0
Infinite (%) 0.0%
Memory Size 1.3 MB
Mean 179097.5699
Minimum 250
Maximum 357962
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Other Code is skewed right (γ1 = 0.3532)

Quantile Statistics

Minimum 250
5-th Percentile 761
Q1 109485.28
Median 168648.72
Q3 195207.12
95-th Percentile 346567.2
Maximum 357962
Range 357712
IQR 85721.84

Descriptive Statistics

Mean 179097.5699
Standard Deviation 97486.2313
Variance 9.5036e+09
Sum 1.4697e+10
Skewness 0.3532
Kurtosis -0.6052
Coefficient of Variation 0.5443
  • Other Code has 13713 outliers
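
The report does not say which rule produces the outlier count. As an illustration only, the sketch below applies Tukey's 1.5 × IQR fences (an assumption, not confirmed as the report's method) to the quantiles listed above, to show where such fences would fall.

```python
# Quantiles copied from the Other Code section above.
minimum = 250
q1 = 109_485.28
q3 = 195_207.12
p95 = 346_567.2
maximum = 357_962

iqr = q3 - q1

# ASSUMPTION: Tukey's rule with k = 1.5; the report's actual rule is not stated.
k = 1.5
lower_fence = q1 - k * iqr
upper_fence = q3 + k * iqr

print(f"IQR: {iqr:.2f}")
print(f"fences: [{lower_fence:.2f}, {upper_fence:.2f}]")
print(f"minimum below lower fence: {minimum < lower_fence}")
print(f"95th percentile above upper fence: {p95 > upper_fence}")
```

Under this rule the lower fence is negative, so no low-side values would be flagged, while the 95th percentile already lies above the upper fence, so more than 5% of non-missing values would be. That is directionally consistent with a large outlier count but does not by itself reproduce the reported 13713.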

Procedure Name

categorical

Approximate Distinct Count 83493
Approximate Unique (%) 97.3%
Missing 0
Missing (%) 0.0%
Memory Size 8.2 MB

Length

Mean 35.0869
Standard Deviation 7.1396
Median 36
Minimum 6
Maximum 93

Sample

1st row HC NITRIC OXIDE PE...
2nd row HC IV INF HYDRAT I...
3rd row HC IV INF HYDRAT A...
4th row HC IV INF THER INI...
5th row HC IV INF THER ADD...

Letter

Count 2132321
Lowercase Letter 8035
Space Separator 397638
Uppercase Letter 2124286
Dash Punctuation 27769
Decimal Number 313874
  • Procedure Name contains many words: 37793 words
  • The most frequent word (screw) occurs over 2.77 times as often as the second most frequent word (plate)

Charge

categorical

Approximate Distinct Count 15130
Approximate Unique (%) 17.6%
Missing 0
Missing (%) 0.0%
Memory Size 5.9 MB

Length

Mean 7.1262
Standard Deviation 1.6082
Median 8
Minimum 1
Maximum 10

Sample

1st row 626
2nd row 485
3rd row 113
4th row 479
5th row 104

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 475759
  • Charge contains many words: 14814 words
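
Charge is typed as categorical even though the character classes counted above contain only decimal digits, which suggests the column holds numbers stored as strings. A minimal sketch of a converter follows. The helper name `parse_amount` is hypothetical, and the handling of `$`, thousands separators, and a dash placeholder is an assumption about the raw formatting; only "626", "485", and "464.49" are sample values taken from this report.

```python
from typing import Optional


def parse_amount(raw: str) -> Optional[float]:
    """Convert a charge-like string to a float, or None if it holds no number."""
    # ASSUMPTION: amounts may carry a dollar sign and comma thousands separators.
    cleaned = raw.strip().replace("$", "").replace(",", "")
    # ASSUMPTION: a bare dash (as in De-identified Minimum) means "no value".
    if cleaned in ("", "-"):
        return None
    try:
        return float(cleaned)
    except ValueError:
        return None


# "1,234.56" and the padded dash are hypothetical inputs for illustration.
samples = ["626", "485", "464.49", "1,234.56", "  -  "]
parsed = [parse_amount(s) for s in samples]
print(parsed)
```

Returning None rather than raising keeps unparseable cells visible as missing values, so they can be counted before any statistics are computed.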

Medicare Adv - Aetna

categorical

Approximate Distinct Count 14152
Approximate Unique (%) 16.5%
Missing 0
Missing (%) 0.0%
Memory Size 15.9 MB

Length

Mean 128.933
Standard Deviation 2.1616
Median 129
Minimum 125
Maximum 135

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 7639671
Lowercase Letter 7639671
Space Separator 1630941
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 1081397
  • Medicare Adv - Aetna contains many words: 23634 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

Medicare Adv - BCBS

categorical

Approximate Distinct Count 13452
Approximate Unique (%) 15.7%
Missing 0
Missing (%) 0.0%
Memory Size 14.4 MB

Length

Mean 110.9697
Standard Deviation 1.0843
Median 111
Minimum 109
Maximum 114

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 6867120
Lowercase Letter 6867120
Space Separator 1545102
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 669667
  • Medicare Adv - BCBS contains many words: 13458 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

Medicare Adv - Humana

categorical

Approximate Distinct Count 13498
Approximate Unique (%) 15.7%
Missing 0
Missing (%) 0.0%
Memory Size 14.4 MB

Length

Mean 110.9784
Standard Deviation 1.0876
Median 111
Minimum 109
Maximum 114

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 6867120
Lowercase Letter 6867120
Space Separator 1545102
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 670221
  • Medicare Adv - Humana contains many words: 13504 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

Medicare Adv - United

categorical

Approximate Distinct Count 13452
Approximate Unique (%) 15.7%
Missing 0
Missing (%) 0.0%
Memory Size 14.4 MB

Length

Mean 110.9697
Standard Deviation 1.0843
Median 111
Minimum 109
Maximum 114

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 6867120
Lowercase Letter 6867120
Space Separator 1545102
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 669667
  • Medicare Adv - United contains many words: 13458 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

Medicare Adv - Other

categorical

Approximate Distinct Count 13452
Approximate Unique (%) 15.7%
Missing 0
Missing (%) 0.0%
Memory Size 14.4 MB

Length

Mean 110.9697
Standard Deviation 1.0843
Median 111
Minimum 109
Maximum 114

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 6867120
Lowercase Letter 6867120
Space Separator 1545102
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 669667
  • Medicare Adv - Other contains many words: 13458 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

Aetna

categorical

Approximate Distinct Count 15609
Approximate Unique (%) 18.2%
Missing 0
Missing (%) 0.0%
Memory Size 16.0 MB

Length

Mean 130.5295
Standard Deviation 2.3261
Median 131
Minimum 125
Maximum 137

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 7639671
Lowercase Letter 7639671
Space Separator 1630941
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 1172795
  • Aetna contains many words: 27927 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

BCBS

categorical

Approximate Distinct Count 15272
Approximate Unique (%) 17.8%
Missing 0
Missing (%) 0.0%
Memory Size 16.0 MB

Length

Mean 131.0481
Standard Deviation 2.4591
Median 133
Minimum 125
Maximum 137

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 7639671
Lowercase Letter 7639671
Space Separator 1630941
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 1201586
  • BCBS contains many words: 28894 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (circumstances)

Cigna

categorical

Approximate Distinct Count 15188
Approximate Unique (%) 17.7%
Missing 0
Missing (%) 0.0%
Memory Size 16.0 MB

Length

Mean 130.986
Standard Deviation 2.4847
Median 133
Minimum 125
Maximum 137

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 7639671
Lowercase Letter 7639671
Space Separator 1630941
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 1198072
  • Cigna contains many words: 28607 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (circumstances)

Exchange/Ambetter

categorical

Approximate Distinct Count 14592
Approximate Unique (%) 17.0%
Missing 0
Missing (%) 0.0%
Memory Size 14.5 MB

Length

Mean 111.5688
Standard Deviation 1.2403
Median 111
Minimum 109
Maximum 115

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 6867120
Lowercase Letter 6867120
Space Separator 1545102
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 703917
  • Exchange/Ambetter contains many words: 14598 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

Medcost

categorical

Approximate Distinct Count 14875
Approximate Unique (%) 17.3%
Missing 0
Missing (%) 0.0%
Memory Size 14.5 MB

Length

Mean 111.9461
Standard Deviation 1.2496
Median 113
Minimum 109
Maximum 115

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 6867120
Lowercase Letter 6867120
Space Separator 1545102
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 724939
  • Medcost contains many words: 14881 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

United

categorical

Approximate Distinct Count 15093
Approximate Unique (%) 17.6%
Missing 0
Missing (%) 0.0%
Memory Size 16.1 MB

Length

Mean 131.0731
Standard Deviation 2.4141
Median 133
Minimum 125
Maximum 137

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 7639671
Lowercase Letter 7639671
Space Separator 1630941
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 1203230
  • United contains many words: 28802 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (circumstances)

Other Mgd Care

categorical

Approximate Distinct Count 14934
Approximate Unique (%) 17.4%
Missing 0
Missing (%) 0.0%
Memory Size 14.5 MB

Length

Mean 111.9643
Standard Deviation 1.2469
Median 113
Minimum 109
Maximum 115

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 6867120
Lowercase Letter 6867120
Space Separator 1545102
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 726116
  • Other Mgd Care contains many words: 14940 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

Tricare

categorical

Approximate Distinct Count 13394
Approximate Unique (%) 15.6%
Missing 0
Missing (%) 0.0%
Memory Size 14.4 MB

Length

Mean 110.958
Standard Deviation 1.0818
Median 111
Minimum 109
Maximum 114

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 6867120
Lowercase Letter 6867120
Space Separator 1545102
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 668921
  • Tricare contains many words: 13400 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

Workers Comp

categorical

Approximate Distinct Count 14501
Approximate Unique (%) 16.9%
Missing 0
Missing (%) 0.0%
Memory Size 14.4 MB

Length

Mean 111.5129
Standard Deviation 1.2368
Median 111
Minimum 109
Maximum 115

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 6867120
Lowercase Letter 6867120
Space Separator 1545102
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 700773
  • Workers Comp contains many words: 14507 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

Self Pay

categorical

Approximate Distinct Count 14571
Approximate Unique (%) 17.0%
Missing 0
Missing (%) 0.0%
Memory Size 14.5 MB

Length

Mean 111.6055
Standard Deviation 1.2415
Median 111
Minimum 109
Maximum 115

Sample

1st row $0.00 for routine...
2nd row $0.00 for routine...
3rd row $0.00 for routine...
4th row $0.00 for routine...
5th row $0.00 for routine...

Letter

Count 6867120
Lowercase Letter 6867120
Space Separator 1545102
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 705957
  • Self Pay contains many words: 14577 words
  • The most frequent word (services) occurs over 2.0 times as often as the second most frequent word (000)

De-identified Minimum

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 5.7 MB

Length

Mean 5
Standard Deviation 0
Median 5
Minimum 5
Maximum 5

Sample

1st row -
2nd row -
3rd row -
4th row -
5th row -

Letter

Count 0
Lowercase Letter 0
Space Separator 343356
Uppercase Letter 0
Dash Punctuation 85839
Decimal Number 0
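
The constant value is displayed as " - ", yet every value has length 5. The character-class counts above reconcile the two: dividing them by the row count gives the per-value composition. This is a small arithmetic check under the report's own statement that the column is constant; the position of the spaces around the dash cannot be recovered from these counts.

```python
# Figures copied from this section and from Dataset Statistics.
n_rows = 85_839
space_count = 343_356
dash_count = 85_839

# With a constant column, every row contributes the same characters.
spaces_per_value, space_remainder = divmod(space_count, n_rows)
dashes_per_value, dash_remainder = divmod(dash_count, n_rows)
value_length = spaces_per_value + dashes_per_value

print(f"spaces per value: {spaces_per_value}")
print(f"dashes per value: {dashes_per_value}")
print(f"implied length: {value_length}")
```

Each value is therefore one dash plus four spaces, which matches the reported length of 5; the shorter " - " shown in the insight is likely a display effect of collapsed whitespace.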

De-identified Maximum

categorical

Approximate Distinct Count 15038
Approximate Unique (%) 17.5%
Missing 0
Missing (%) 0.0%
Memory Size 5.9 MB

Length

Mean 7.1212
Standard Deviation 1.3071
Median 8
Minimum 2
Maximum 10

Sample

1st row 464.49
2nd row 359.87
3rd row 83.85
4th row 355.42
5th row 77.17

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 476918
  • De-identified Maximum contains many words: 14820 words

Filename

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 8.4 MB

Length

Mean 38
Standard Deviation 0
Median 38
Minimum 38
Maximum 38

Sample

1st row 56-2070036_DRH_sta...
2nd row 56-2070036_DRH_sta...
3rd row 56-2070036_DRH_sta...
4th row 56-2070036_DRH_sta...
5th row 56-2070036_DRH_sta...

Letter

Count 2060136
Lowercase Letter 1802619
Space Separator 0
Uppercase Letter 257517
Dash Punctuation 85839
Decimal Number 772551
  • Filename has words of constant length

Interactions

Correlations

Missing Values