Overview

Dataset Statistics

Number of Variables 6
Number of Rows 1406
Missing Cells 145
Missing Cells (%) 1.7%
Duplicate Rows 5
Duplicate Rows (%) 0.4%
Total Size in Memory 547.6 KB
Average Row Size in Memory 398.8 B
Variable Types
  • Categorical: 5
  • Numerical: 1
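
The percentage and per-row figures above follow from the raw counts. A minimal Python sketch of that arithmetic is given here; it assumes that 1 KB means 1024 bytes, which the report does not state but which reproduces the listed average row size.

```python
# Counts copied from the Dataset Statistics above.
n_variables = 6
n_rows = 1406
missing_cells = 145
duplicate_rows = 5
total_size_kb = 547.6

total_cells = n_variables * n_rows
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_rows
# Assumption: 1 KB = 1024 bytes.
avg_row_bytes = total_size_kb * 1024 / n_rows

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average row size: {avg_row_bytes:.1f} B")
```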

Dataset Insights

HCPCS/CPT Code has 47 (3.34%) missing values
Gross Charge has 36 (2.56%) missing values
Discounted Cash Price (Gross Charges) has 56 (3.98%) missing values
Discounted Cash Price (Gross Charges) is skewed
Procedure ID has a high cardinality: 1377 distinct values
HCPCS/CPT Code has a high cardinality: 561 distinct values
Description has a high cardinality: 1341 distinct values
Gross Charge has a high cardinality: 853 distinct values
Filename has constant value "30-1114775_carepartners-outpatient-rehab_standardcharges.csv"
Filename has constant length 60
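
The per-column percentages can be cross-checked against the counts listed in the Variables section. The sketch below assumes that Missing (%) is taken over all 1406 rows and that Approximate Unique (%) is taken over the non-missing rows of each column. That reading is inferred from the numbers, not stated by the report.

```python
n_rows = 1406

# (missing, approximate distinct count) per column, copied from this report.
columns = {
    "Procedure ID": (0, 1377),
    "HCPCS/CPT Code": (47, 561),
    "Description": (6, 1341),
    "Gross Charge": (36, 853),
    "Discounted Cash Price (Gross Charges)": (56, 840),
    "Filename": (0, 1),
}


def missing_pct(missing):
    """Share of all rows that are missing, in percent."""
    return 100 * missing / n_rows


def unique_pct(missing, distinct):
    """Share of distinct values, in percent.

    Assumption: the denominator is the non-missing row count.
    """
    return 100 * distinct / (n_rows - missing)


total_missing = sum(missing for missing, _ in columns.values())

for name, (missing, distinct) in columns.items():
    print(f"{name}: missing {missing_pct(missing):.2f}%, "
          f"unique {unique_pct(missing, distinct):.1f}%")
print("Total missing cells:", total_missing)
```

Under these assumptions the per-column missing counts sum to the 145 missing cells reported in the Dataset Statistics.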

Variables

Procedure ID

categorical

Approximate Distinct Count 1377
Approximate Unique (%) 97.9%
Missing 0
Missing (%) 0.0%
Memory Size 96.8 KB

Length

Mean 5.4829
Standard Deviation 2.387
Median 5
Minimum 2
Maximum 29

Sample

1st row 346
2nd row 494
3rd row 845
4th row 886
5th row 1196

Letter

Count 767
Lowercase Letter 654
Space Separator 51
Uppercase Letter 113
Dash Punctuation 0
Decimal Number 6887
  • Procedure ID contains many words: 1390 words
  • The largest value (other) is over 1.88 times larger than the second largest value (outpatient)

HCPCS/CPT Code

categorical

Approximate Distinct Count 561
Approximate Unique (%) 41.3%
Missing 47
Missing (%) 3.3%
Memory Size 104.8 KB
  • The largest value (whitespace only) is over 93.38 times larger than the second largest value (0J1815 )

Length

Mean 13.9573
Standard Deviation 0.5955
Median 14
Minimum 4
Maximum 14

Sample

1st row
2nd row
3rd row
4th row
5th row

Letter

Count 919
Lowercase Letter 33
Space Separator 14870
Uppercase Letter 886
Dash Punctuation 0
Decimal Number 3177
  • The top 2 categories (whitespace only, 0J1815 ) take over 50.0%
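
The median length of 14 and the large Space Separator count suggest that the codes are padded with whitespace and that many cells hold nothing else. A minimal sketch of one way to normalize such values before counting them follows. The function name `normalize_code` and the sample inputs are hypothetical, and treating whitespace-only cells as missing is an assumption about what the analyst wants.

```python
def normalize_code(raw):
    """Strip padding; return None for missing or whitespace-only values."""
    if raw is None:
        return None
    code = raw.strip()
    return code or None


# Hypothetical raw values: a padded code, a whitespace-only cell, a missing cell.
raw_codes = ["0J1815 ", " " * 14, None]
cleaned = [normalize_code(code) for code in raw_codes]
print(cleaned)
```

After this step, whitespace-only cells would be counted as missing, so the missing count for the column would likely rise above the 47 reported here.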

Description

categorical

Approximate Distinct Count 1341
Approximate Unique (%) 95.8%
Missing 6
Missing (%) 0.4%
Memory Size 120.9 KB

Length

Mean 23.4629
Standard Deviation 2.8428
Median 24
Minimum 3
Maximum 24

Sample

1st row ACETAMIN/BUTALBITL...
2nd row ACETAZOLAMIDE 250M...
3rd row ACYCLOVIR 200MG CA...
4th row ACYCLOVIR 5% TOP 1...
5th row ALENDRONATE 70MG T...

Letter

Count 22277
Lowercase Letter 78
Space Separator 7644
Uppercase Letter 22199
Dash Punctuation 9
Decimal Number 2400
  • Description contains many words: 1632 words
  • The largest value (tab) is over 4.12 times larger than the second largest value (cap)

Gross Charge

categorical

Approximate Distinct Count 853
Approximate Unique (%) 62.3%
Missing 36
Missing (%) 2.6%
Memory Size 93.7 KB
  • The largest value (76.00) is over 1.55 times larger than the second largest value (64.00)

Length

Mean 5.0715
Standard Deviation 1.1527
Median 5
Minimum 3
Maximum 13

Sample

1st row 6.83
2nd row 10.37
3rd row 6.30
4th row 47.86
5th row 6.90

Letter

Count 62
Lowercase Letter 28
Space Separator 26
Uppercase Letter 34
Dash Punctuation 0
Decimal Number 5478
  • The largest value (7600) is over 1.55 times larger than the second largest value (6400)
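
Gross Charge is typed as categorical even though its samples look numeric. The Letter counts show that a small number of values contain letters or spaces, which is probably what prevents numeric inference. The sketch below shows one way to coerce such strings to numbers. The helper `parse_charge`, the comma handling, and the non-numeric sample "N/A" are illustrative assumptions, not values taken from the data.

```python
import math


def parse_charge(raw):
    """Return the charge as a float, or None if it is not a plain finite number."""
    if raw is None:
        return None
    # Assumption: commas, if present, are thousands separators.
    text = raw.strip().replace(",", "")
    try:
        value = float(text)
    except ValueError:
        return None
    return value if math.isfinite(value) else None


# The first three values appear in the Sample above; the rest are hypothetical.
samples = ["6.83", "10.37", " 47.86 ", "1,250.00", "N/A", None]
parsed = [parse_charge(sample) for sample in samples]
print(parsed)
```

Values that come back as None can then be reviewed separately instead of silently forcing the whole column to text.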

Discounted Cash Price (Gross Charges)

numerical

Approximate Distinct Count 840
Approximate Unique (%) 62.2%
Missing 56
Missing (%) 4.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 21.1 KB
Mean 446.965
Minimum 0.42
Maximum 47687
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Discounted Cash Price (Gross Charges) is skewed right (γ1 = 13.7277)

Quantile Statistics

Minimum 0.42
5-th Percentile 6.23
Q1 7.08
Median 20.875
Q3 138
95-th Percentile 1997.8
Maximum 47687
Range 47686.58
IQR 130.92

Descriptive Statistics

Mean 446.965
Standard Deviation 2362.0731
Variance 5.5794e+06
Sum 603402.75
Skewness 13.7277
Kurtosis 222.064
Coefficient of Variation 5.2847
  • Discounted Cash Price (Gross Charges) is not normally distributed (p-value 4.59e-25)
  • Discounted Cash Price (Gross Charges) has 228 outliers
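
Several of the statistics above are derived from others, so they can be checked by hand. The sketch below recomputes them from the reported values. The outlier fences use Tukey's 1.5 × IQR rule purely as an assumption, because the report does not say which rule produced its count of 228 outliers.

```python
# Values copied from the statistics above.
n_valid = 1406 - 56          # rows minus missing values
mean = 446.965
std = 2362.0731
q1, q3 = 7.08, 138.0
minimum, maximum = 0.42, 47687.0

value_range = maximum - minimum   # reported as 47686.58
iqr = q3 - q1                     # reported as 130.92
variance = std ** 2               # reported as 5.5794e+06
total = mean * n_valid            # reported as 603402.75
cv = std / mean                   # reported as 5.2847

# Assumption: Tukey fences at 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"Range {value_range:.2f}, IQR {iqr:.2f}, CV {cv:.4f}")
print(f"Variance {variance:.4e}, Sum {total:.2f}")
print(f"Tukey fences: [{lower_fence:.2f}, {upper_fence:.2f}]")
```

Because the minimum is 0.42, a negative lower fence would flag nothing, so under this rule every outlier would lie above the upper fence, which is consistent with the strong right skew.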

Filename

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 171.6 KB

Length

Mean 60
Standard Deviation 0
Median 60
Minimum 60
Maximum 60

Sample

1st row 30-1114775_carepar...
2nd row 30-1114775_carepar...
3rd row 30-1114775_carepar...
4th row 30-1114775_carepar...
5th row 30-1114775_carepar...

Letter

Count 63270
Lowercase Letter 63270
Space Separator 0
Uppercase Letter 0
Dash Punctuation 4218
Decimal Number 12654
  • Filename has words of constant length
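
The character counts for Filename can be reproduced from the constant value and the row count. The sketch below assumes that the report's labels (Lowercase Letter, Dash Punctuation, Decimal Number) correspond to the Unicode general categories Ll, Pd and Nd. That mapping is inferred from the label names and is not documented in the report itself.

```python
import unicodedata
from collections import Counter

filename = "30-1114775_carepartners-outpatient-rehab_standardcharges.csv"
n_rows = 1406

# Count characters of one filename by Unicode general category.
per_name = Counter(unicodedata.category(ch) for ch in filename)

# Every row holds the same value, so totals scale with the row count.
totals = {category: count * n_rows for category, count in per_name.items()}

print("Length:", len(filename))
print("Lowercase Letter (Ll):", totals.get("Ll", 0))
print("Uppercase Letter (Lu):", totals.get("Lu", 0))
print("Space Separator (Zs):", totals.get("Zs", 0))
print("Dash Punctuation (Pd):", totals.get("Pd", 0))
print("Decimal Number (Nd):", totals.get("Nd", 0))
```

The underscores and the period fall into other categories (Pc and Po), which is why the listed counts do not add up to 60 characters per row.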

Interactions

Correlations

Missing Values