Overview

Dataset Statistics

Number of Variables 15
Number of Rows 2478
Missing Cells 7744
Missing Cells (%) 20.8%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 1.4 MB
Average Row Size in Memory 574.2 B
Variable Types
  • Categorical: 8
  • Numerical: 7
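
The headline missing-cell figures can be cross-checked against the per-column counts listed further down in this report. The sketch below is only a sanity check: the dictionary is transcribed by hand from the report, and `missing_cell_pct` is a hypothetical helper, not part of the profiling tool.

```python
# Per-column missing counts as listed in this report
# (columns with no missing values are omitted).
missing_by_column = {
    "CMS Shoppable": 2407,
    "De-identified Minimum": 46,
    "De-identified Maximum": 46,
    "Aetna": 810,
    "BCBS": 1083,
    "Cigna": 1402,
    "Humana": 1416,
    "UHC": 522,
    "Procedure Description": 12,
}


def missing_cell_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of empty cells in an n_rows x n_cols table, in percent."""
    return 100.0 * missing_cells / (n_rows * n_cols)


total_missing = sum(missing_by_column.values())
pct = missing_cell_pct(total_missing, n_rows=2478, n_cols=15)
print(total_missing, round(pct, 1))  # 7744 20.8
```

The per-column counts add up to the reported 7744 missing cells, which is 20.8% of the 2478 x 15 = 37170 cells.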

Dataset Insights

Cigna and Humana have similar distributions [Similar Distribution]
CMS Shoppable has 2407 (97.13%) missing values [Missing]
De-identified Minimum has 46 (1.86%) missing values [Missing]
De-identified Maximum has 46 (1.86%) missing values [Missing]
Aetna has 810 (32.69%) missing values [Missing]
BCBS has 1083 (43.7%) missing values [Missing]
Cigna has 1402 (56.58%) missing values [Missing]
Humana has 1416 (57.14%) missing values [Missing]
UHC has 522 (21.07%) missing values [Missing]
De-identified Minimum is skewed [Skewed]
De-identified Maximum is skewed [Skewed]
Aetna is skewed [Skewed]
BCBS is skewed [Skewed]
Cigna is skewed [Skewed]
Humana is skewed [Skewed]
UHC is skewed [Skewed]
CPT/MS-DRG has a high cardinality: 1848 distinct values [High Cardinality]
Procedure Description has a high cardinality: 1788 distinct values [High Cardinality]
Rev Code has constant value "2019281" [Constant]
CMS Shoppable has constant value "Y" [Constant]
Filename has constant value "560789196_CatawbaValleyMedicalCenter_StandardCharges.csv" [Constant]
system has constant value "CATAWBA" [Constant]
Rev Code has constant length 7 [Constant Length]
CMS Shoppable has constant length 1 [Constant Length]
Filename has constant length 56 [Constant Length]
system has constant length 7 [Constant Length]
De-identified Minimum has 2165 (87.37%) zeros [Zeros]
Aetna has 195 (7.87%) zeros [Zeros]
Cigna has 284 (11.46%) zeros [Zeros]
Humana has 282 (11.38%) zeros [Zeros]
UHC has 158 (6.38%) zeros [Zeros]

Variables

Rev Code

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 174.2 KB

Length

Mean 7
Standard Deviation 0
Median 7
Minimum 7
Maximum 7

Sample

1st row 2019281
2nd row 2019281
3rd row 2019281
4th row 2019281
5th row 2019281

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 17346
  • Rev Code has words of constant length
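
The character counts in the "Letter" blocks are consistent with a per-character tally by Unicode general category, where "Count" is the number of letters (lowercase plus uppercase). The following is a minimal sketch under that assumption; the report does not show how the profiling tool computes these figures, and `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter

# Unicode general categories matching the labels used in this report.
LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}


def category_counts(values):
    """Count characters per Unicode category across all strings in values."""
    counts = Counter({label: 0 for label in LABELS.values()})
    for value in values:
        for ch in value:
            label = LABELS.get(unicodedata.category(ch))
            if label is not None:
                counts[label] += 1
    # Assumption: "Count" in the report is lowercase + uppercase letters.
    counts["Count"] = counts["Lowercase Letter"] + counts["Uppercase Letter"]
    return counts


# Rev Code holds the same 7-digit string in all 2478 rows.
rev_code = category_counts(["2019281"] * 2478)
print(rev_code["Decimal Number"], rev_code["Count"])  # 17346 0
```

With 2478 rows of a 7-digit constant, the tally gives 2478 x 7 = 17346 decimal digits and no letters, matching the block above.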

CPT/MS-DRG

categorical

Approximate Distinct Count 1848
Approximate Unique (%) 74.6%
Missing 0
Missing (%) 0.0%
Memory Size 168.4 KB

Length

Mean 4.6021
Standard Deviation 0.7986
Median 5
Minimum 3
Maximum 5

Sample

1st row 00100
2nd row 00103
3rd row 00120
4th row 00140
5th row 00142

Letter

Count 298
Lowercase Letter 0
Space Separator 0
Uppercase Letter 298
Dash Punctuation 0
Decimal Number 11106
  • CPT/MS-DRG contains many words: 1848 words

Procedure Description

categorical

Approximate Distinct Count 1788
Approximate Unique (%) 72.5%
Missing 12
Missing (%) 0.5%
Memory Size 289.7 KB

Length

Mean 55.2798
Standard Deviation 34.5165
Median 50
Minimum 9
Maximum 875

Sample

1st row Anesthesia for pro...
2nd row Anesthesia for pro...
3rd row Anesthesia for bio...
4th row Anesthesia for pro...
5th row Anesthesia for len...

Letter

Count 114474
Lowercase Letter 88664
Space Separator 17331
Uppercase Letter 25810
Dash Punctuation 348
Decimal Number 1138
  • Procedure Description contains many words: 2131 words
  • The most frequent word (with) occurs over 1.51 times as often as the second most frequent word (using)

Code_Type

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 176.2 KB
  • The largest value (CPT/HCPCS) is over 4.03 times larger than the second largest value (DRG)

Length

Mean 7.8063
Standard Deviation 2.3957
Median 9
Minimum 3
Maximum 9

Sample

1st row CPT/HCPCS
2nd row CPT/HCPCS
3rd row CPT/HCPCS
4th row CPT/HCPCS
5th row CPT/HCPCS

Letter

Count 17359
Lowercase Letter 0
Space Separator 0
Uppercase Letter 17359
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (CPT/HCPCS, DRG) take over 50.0%

CMS Shoppable

categorical

Approximate Distinct Count 1
Approximate Unique (%) 1.4%
Missing 2407
Missing (%) 97.1%
Memory Size 4.6 KB

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row Y
2nd row Y
3rd row Y
4th row Y
5th row Y

Letter

Count 71
Lowercase Letter 0
Space Separator 0
Uppercase Letter 71
Dash Punctuation 0
Decimal Number 0
  • CMS Shoppable has words of constant length
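
The "Approximate Unique (%)" of 1.4% for a column with a single distinct value looks odd until it is read as distinct values divided by non-missing rows (1 of 71) rather than by all 2478 rows. The sketch below tests that reading against figures from this report; the formula is inferred from the numbers, not taken from the tool's documentation, and `unique_pct` is a hypothetical helper. Because the distinct counts are themselves approximate, small rounding differences are possible for some columns.

```python
def unique_pct(distinct: int, n_rows: int, missing: int) -> float:
    """Distinct values as a percentage of the non-missing rows."""
    present = n_rows - missing
    return 100.0 * distinct / present


# (distinct count, missing count) as reported for each column.
reported = {
    "CMS Shoppable": (1, 2407),
    "Aetna": (1161, 810),
    "Cigna": (521, 1402),
}

for name, (distinct, missing) in reported.items():
    print(name, round(unique_pct(distinct, 2478, missing), 1))
```

Under this reading the three columns come out at 1.4%, 69.6%, and 48.4%, in line with the values reported in their respective sections.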

Package/Line_Level

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 168.8 KB
  • The largest value (Line) is over 2.93 times larger than the second largest value (Package)

Length

Mean 4.7627
Standard Deviation 1.3066
Median 4
Minimum 4
Maximum 7

Sample

1st row Line
2nd row Line
3rd row Line
4th row Line
5th row Line

Letter

Count 11802
Lowercase Letter 9324
Space Separator 0
Uppercase Letter 2478
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Line, Package) take over 50.0%

De-identified Minimum

numerical

Approximate Distinct Count 229
Approximate Unique (%) 9.4%
Missing 46
Missing (%) 1.9%
Infinite 0
Infinite (%) 0.0%
Memory Size 38.0 KB
Mean 714.8077
Minimum 0
Maximum 214556.65
Zeros 2165
Zeros (%) 87.4%
Negatives 0
Negatives (%) 0.0%
  • De-identified Minimum is skewed right (γ1 = 27.9121)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 0
95-th Percentile 2948.0125
Maximum 214556.65
Range 214556.65
IQR 0

Descriptive Statistics

Mean 714.8077
Standard Deviation 5330.9809
Variance 2.8419e+07
Sum 1.7384e+06
Skewness 27.9121
Kurtosis 1069.8028
Coefficient of Variation 7.4579
  • De-identified Minimum is not normally distributed (p-value 4.365690103789856e-25)
  • De-identified Minimum has 267 outliers
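
Several of the descriptive statistics above are derived from one another, which gives a quick consistency check on the block. The sketch below uses only numbers printed in this report and assumes the sum is taken over non-missing rows; the variable names are illustrative.

```python
import math

# Reported values for De-identified Minimum.
mean = 714.8077
std = 5330.9809
n_rows = 2478
missing = 46

n = n_rows - missing   # rows that actually carry a value
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum = mean x number of non-missing values
cv = std / mean        # coefficient of variation

print(f"{variance:.4e} {total:.4e} {cv:.4f}")
```

The results agree with the reported variance (2.8419e+07), sum (1.7384e+06), and coefficient of variation (7.4579) to the precision shown.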

De-identified Maximum

numerical

Approximate Distinct Count 2042
Approximate Unique (%) 84.0%
Missing 46
Missing (%) 1.9%
Infinite 0
Infinite (%) 0.0%
Memory Size 38.0 KB
Mean 15604.9126
Minimum 0
Maximum 318319.89
Zeros 11
Zeros (%) 0.4%
Negatives 0
Negatives (%) 0.0%
  • De-identified Maximum is skewed right (γ1 = 4.1767)

Quantile Statistics

Minimum 0
5-th Percentile 63.58
Q1 583.9275
Median 3753
Q3 15141.0575
95-th Percentile 70826.5745
Maximum 318319.89
Range 318319.89
IQR 14557.13

Descriptive Statistics

Mean 15604.9126
Standard Deviation 31895.1975
Variance 1.0173e+09
Sum 3.7951e+07
Skewness 4.1767
Kurtosis 22.8127
Coefficient of Variation 2.0439
  • De-identified Maximum is not normally distributed (p-value 8.973001561119284e-24)
  • De-identified Maximum has 286 outliers
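
The γ1 values quoted for the numerical columns are skewness coefficients. As a reminder of what a large positive value means, here is a minimal sketch of the moment-based (population) skewness on toy data. Whether the profiling tool uses this estimator or a bias-corrected variant is not stated in the report, and the sample values are made up for illustration.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]        # evenly spread around the mean
right_tailed = [0, 0, 0, 0, 100]   # mostly small values, one large one

print(skewness(symmetric))
print(skewness(right_tailed))
```

A symmetric sample yields a skewness of zero, while a sample dominated by small values with a few large ones, like the charge columns here, yields a positive value.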

Aetna

numerical

Approximate Distinct Count 1161
Approximate Unique (%) 69.6%
Missing 810
Missing (%) 32.7%
Infinite 0
Infinite (%) 0.0%
Memory Size 26.1 KB
Mean 8534.2873
Minimum 0
Maximum 318319.89
Zeros 195
Zeros (%) 7.9%
Negatives 0
Negatives (%) 0.0%
  • Aetna is skewed right (γ1 = 5.58)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 115.765
Median 874.16
Q3 4800.1725
95-th Percentile 44256.9935
Maximum 318319.89
Range 318319.89
IQR 4684.4075

Descriptive Statistics

Mean 8534.2873
Standard Deviation 24030.2969
Variance 5.7746e+08
Sum 1.4235e+07
Skewness 5.58
Kurtosis 40.9653
Coefficient of Variation 2.8157
  • Aetna is not normally distributed (p-value 7.1954585861473795e-25)
  • Aetna has 245 outliers

BCBS

numerical

Approximate Distinct Count 1125
Approximate Unique (%) 80.7%
Missing 1083
Missing (%) 43.7%
Infinite 0
Infinite (%) 0.0%
Memory Size 21.8 KB
Mean 3643.2061
Minimum 0
Maximum 65362.72
Zeros 25
Zeros (%) 1.0%
Negatives 0
Negatives (%) 0.0%
  • BCBS is skewed right (γ1 = 3.6656)

Quantile Statistics

Minimum 0
5-th Percentile 38.362
Q1 267.02
Median 1211.6
Q3 3973.85
95-th Percentile 16161.977
Maximum 65362.72
Range 65362.72
IQR 3706.83

Descriptive Statistics

Mean 3643.2061
Standard Deviation 6223.3968
Variance 3.8731e+07
Sum 5.0823e+06
Skewness 3.6656
Kurtosis 20.0624
Coefficient of Variation 1.7082
  • BCBS is not normally distributed (p-value 9.8046406103213e-23)
  • BCBS has 167 outliers

Cigna

numerical

Approximate Distinct Count 521
Approximate Unique (%) 48.4%
Missing 1402
Missing (%) 56.6%
Infinite 0
Infinite (%) 0.0%
Memory Size 16.8 KB
Mean 3837.2657
Minimum 0
Maximum 274245.31
Zeros 284
Zeros (%) 11.5%
Negatives 0
Negatives (%) 0.0%
  • Cigna is skewed right (γ1 = 8.604)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 102.63
Q3 810.67
95-th Percentile 17968.86
Maximum 274245.31
Range 274245.31
IQR 810.67

Descriptive Statistics

Mean 3837.2657
Standard Deviation 15452.451
Variance 2.3878e+08
Sum 4.1289e+06
Skewness 8.604
Kurtosis 107.1871
Coefficient of Variation 4.0269
  • Cigna is not normally distributed (p-value 4.846311406946973e-25)
  • Cigna has 192 outliers

Humana

numerical

Approximate Distinct Count 414
Approximate Unique (%) 39.0%
Missing 1416
Missing (%) 57.1%
Infinite 0
Infinite (%) 0.0%
Memory Size 16.6 KB
Mean 6426.3242
Minimum 0
Maximum 317509.21
Zeros 282
Zeros (%) 11.4%
Negatives 0
Negatives (%) 0.0%
  • Humana is skewed right (γ1 = 7.1503)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 170.055
Q3 1331.75
95-th Percentile 41514.9275
Maximum 317509.21
Range 317509.21
IQR 1331.75

Descriptive Statistics

Mean 6426.3242
Standard Deviation 24042.8084
Variance 5.7806e+08
Sum 6.8248e+06
Skewness 7.1503
Kurtosis 66.7739
Coefficient of Variation 3.7413
  • Humana is not normally distributed (p-value 4.754356613110384e-25)
  • Humana has 175 outliers

UHC

numerical

Approximate Distinct Count 1362
Approximate Unique (%) 69.6%
Missing 522
Missing (%) 21.1%
Infinite 0
Infinite (%) 0.0%
Memory Size 30.6 KB
Mean 9991.2508
Minimum 0
Maximum 243917.53
Zeros 158
Zeros (%) 6.4%
Negatives 0
Negatives (%) 0.0%
  • UHC is skewed right (γ1 = 4.5842)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 181.925
Median 1230.155
Q3 6239.64
95-th Percentile 52514.0725
Maximum 243917.53
Range 243917.53
IQR 6057.715

Descriptive Statistics

Mean 9991.2508
Standard Deviation 25353.4851
Variance 6.428e+08
Sum 1.9543e+07
Skewness 4.5842
Kurtosis 25.076
Coefficient of Variation 2.5376
  • UHC is not normally distributed (p-value 1.1967110352062828e-24)
  • UHC has 296 outliers

Filename

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 292.8 KB

Length

Mean 56
Standard Deviation 0
Median 56
Minimum 56
Maximum 56

Sample

1st row 560789196_CatawbaV...
2nd row 560789196_CatawbaV...
3rd row 560789196_CatawbaV...
4th row 560789196_CatawbaV...
5th row 560789196_CatawbaV...

Letter

Count 109032
Lowercase Letter 94164
Space Separator 0
Uppercase Letter 14868
Dash Punctuation 0
Decimal Number 22302
  • Filename has words of constant length

system

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 174.2 KB

Length

Mean 7
Standard Deviation 0
Median 7
Minimum 7
Maximum 7

Sample

1st row CATAWBA
2nd row CATAWBA
3rd row CATAWBA
4th row CATAWBA
5th row CATAWBA

Letter

Count 17346
Lowercase Letter 0
Space Separator 0
Uppercase Letter 17346
Dash Punctuation 0
Decimal Number 0
  • system has words of constant length

Interactions

Correlations

Missing Values