Overview

Dataset Statistics

Number of Variables 16
Number of Rows 261776
Missing Cells 2.5433e+06
Missing Cells (%) 60.7%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 186.9 MB
Average Row Size in Memory 748.5 B
Variable Types
  • Categorical: 16
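
The missing-cell totals above can be cross-checked against the per-column missing counts listed under Dataset Insights. The following is a minimal sketch, assuming the percentage is computed as missing cells divided by rows times variables; the name `missing_by_column` is illustrative and not part of the report.

```python
# Per-column missing counts copied from the Dataset Insights entries.
missing_by_column = {
    "BCBS": 261776,
    "BCBS MEDICARE": 251975,
    "AETNA": 171513,
    "AETNA MEDICARE": 251975,
    "CIGNA": 199603,
    "CIGNA MEDICARE": 251975,
    "HUMANA": 216852,
    "HUMANA MEDICARE": 251975,
    "MEDCOST": 216852,
    "UHC": 216852,
    "UHC MEDICARE": 251975,
}

n_rows = 261776
n_variables = 16

# Assumption: every cell of every variable counts toward the denominator.
total_missing = sum(missing_by_column.values())
total_cells = n_rows * n_variables
missing_pct = round(100 * total_missing / total_cells, 1)

print(total_missing, total_cells, missing_pct)
```

Under that assumption the sum agrees with the reported 2.5433e+06 missing cells and 60.7%.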

Dataset Insights

BCBS has 261776 (100.0%) missing values
BCBS MEDICARE has 251975 (96.26%) missing values
AETNA has 171513 (65.52%) missing values
AETNA MEDICARE has 251975 (96.26%) missing values
CIGNA has 199603 (76.25%) missing values
CIGNA MEDICARE has 251975 (96.26%) missing values
HUMANA has 216852 (82.84%) missing values
HUMANA MEDICARE has 251975 (96.26%) missing values
MEDCOST has 216852 (82.84%) missing values
UHC has 216852 (82.84%) missing values
UHC MEDICARE has 251975 (96.26%) missing values
Procedure Description has a high cardinality: 30488 distinct values
CPT/MS-DRG has a high cardinality: 24644 distinct values
Gross Charge has a high cardinality: 33798 distinct values
BCBS MEDICARE has a high cardinality: 2671 distinct values
AETNA has a high cardinality: 20309 distinct values
AETNA MEDICARE has a high cardinality: 2881 distinct values
CIGNA has a high cardinality: 13584 distinct values
CIGNA MEDICARE has a high cardinality: 2651 distinct values
HUMANA has a high cardinality: 9675 distinct values
HUMANA MEDICARE has a high cardinality: 2654 distinct values
MEDCOST has a high cardinality: 9227 distinct values
UHC has a high cardinality: 17293 distinct values
UHC MEDICARE has a high cardinality: 2652 distinct values
system has constant value "ATRIUM"
system has constant length 6
BCBS has all distinct values
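
The insights flag BCBS as 100% missing while its sample rows hold the literal text nan, and they flag system as constant. A minimal sketch of how such flags could be derived for one column follows. The helper `summarize_column` and the `MISSING_TOKENS` set are hypothetical and are not the profiling tool's actual logic.

```python
from collections import Counter

# Assumption: these literal strings stand in for missing cells.
MISSING_TOKENS = {"", "nan", "NaN", "None"}

def summarize_column(values):
    """Return the missing count, distinct count, and a constant flag for one column."""
    present = [
        v for v in values
        if v is not None and str(v).strip() not in MISSING_TOKENS
    ]
    counts = Counter(present)
    return {
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "constant": len(counts) == 1,
    }

# Tiny illustrative columns shaped like the samples in this report.
system_summary = summarize_column(["ATRIUM"] * 5)
bcbs_summary = summarize_column(["nan"] * 5)
print(system_summary, bcbs_summary)
```

Treating placeholder strings as missing before counting distinct values keeps a fully empty column from being reported as a single-valued one.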

Variables

Procedure Description

categorical

Approximate Distinct Count 30488
Approximate Unique (%) 11.7%
Missing 0
Missing (%) 0.0%
Memory Size 25.3 MB

Length

Mean 36.2027
Standard Deviation 81.1483
Median 31
Minimum 4
Maximum 16297

Sample

1st row 1-stage distal hyp...
2nd row 1-stage distal hyp...
3rd row 1-stage distal hyp...
4th row 1-stage distal hyp...
5th row 1-stage distal hyp...

Letter

Count 7264799
Lowercase Letter 5380614
Space Separator 1293343
Uppercase Letter 1884185
Dash Punctuation 64895
Decimal Number 592200
  • Procedure Description contains many words: 23753 words
  • The most frequent word (hc) occurs over 13.86 times as often as the second most frequent word (chg)

CPT/MS-DRG

categorical

Approximate Distinct Count 24644
Approximate Unique (%) 9.4%
Missing 0
Missing (%) 0.0%
Memory Size 24.6 MB
  • The most frequent value (CPT® C1713) occurs over 2.31 times as often as the second most frequent value (CPT C1713)

Length

Mean 9.7168
Standard Deviation 1.9521
Median 10
Minimum 4
Maximum 21

Sample

1st row 54328
2nd row 54328
3rd row 54322
4th row 54322
5th row 54322(80)

Letter

Count 882134
Lowercase Letter 29301
Space Separator 250129
Uppercase Letter 852833
Dash Punctuation 0
Decimal Number 1227966
  • CPT/MS-DRG contains many words: 11939 words
  • The most frequent word (cpt) occurs over 4.62 times as often as the second most frequent word (c1713)
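
The two most frequent CPT/MS-DRG values, CPT® C1713 and CPT C1713, differ only by the registered-trademark sign. A minimal sketch of one way to fold such spellings together follows. That the two spellings denote the same code is an assumption, and `normalize_code` is a hypothetical helper.

```python
import re

def normalize_code(raw):
    """Drop the registered-trademark sign and collapse repeated whitespace."""
    without_mark = raw.replace("\u00ae", "")
    return re.sub(r"\s+", " ", without_mark).strip()

# Assumption: both spellings are meant to be the same code label.
variants = ["CPT\u00ae C1713", "CPT C1713"]
normalized = {normalize_code(v) for v in variants}
print(normalized)
```

Values without the sign, such as 54322(80), pass through unchanged.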

Patient Type

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 18.6 MB

Length

Mean 9.5029
Standard Deviation 0.7501
Median 9
Minimum 8
Maximum 12

Sample

1st row Facility
2nd row Non-Facility
3rd row Facility
4th row Non-Facility
5th row Facility

Letter

Count 2477411
Lowercase Letter 2205405
Space Separator 0
Uppercase Letter 272006
Dash Punctuation 10230
Decimal Number 0
  • The top 2 categories (Inpatient, Outpatient) account for over 50.0% of values

Gross Charge

categorical

Approximate Distinct Count 33798
Approximate Unique (%) 12.9%
Missing 0
Missing (%) 0.0%
Memory Size 18.1 MB

Length

Mean 7.6294
Standard Deviation 2.0549
Median 8
Minimum 1
Maximum 47

Sample

1st row $2,925.00
2nd row $2,925.00
3rd row $2,431.00
4th row $2,431.00
5th row $730.00

Letter

Count 285
Lowercase Letter 173
Space Separator 346060
Uppercase Letter 112
Dash Punctuation 6
Decimal Number 1301773
  • Gross Charge contains many words: 22109 words
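
Gross Charge is typed as categorical because its values are formatted strings such as $2,925.00, and the payer columns use the same comma-separated style. A minimal sketch of converting such strings to numbers follows. The helper `parse_charge` is hypothetical, and returning `None` for anything non-numeric is an assumption about how free-text entries should be handled.

```python
import re
from decimal import Decimal, InvalidOperation

def parse_charge(text):
    """Convert strings such as '$2,925.00' to Decimal; return None if not numeric."""
    if text is None:
        return None
    # Strip the dollar sign, thousands separators, and any whitespace.
    cleaned = re.sub(r"[$,\s]", "", str(text))
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    # Decimal accepts 'nan' and 'Infinity'; treat those as non-numeric here.
    return value if value.is_finite() else None

samples = ["$2,925.00", "$730.00", " 114.43 ", "nan"]
parsed = [parse_charge(s) for s in samples]
print(parsed)
```

Decimal is used rather than float so that currency amounts keep their exact cents.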

BCBS

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17.0 MB

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row nan
2nd row nan
3rd row nan
4th row nan
5th row nan

Letter

Count 785328
Lowercase Letter 785328
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0
  • BCBS has words of constant length

BCBS MEDICARE

categorical

Approximate Distinct Count 2671
Approximate Unique (%) 27.3%
Missing 251975
Missing (%) 96.3%
Memory Size 699.2 KB
  • The most frequent value (114.43) occurs over 2.48 times as often as the second most frequent value (35.09)

Length

Mean 8.0518
Standard Deviation 1.4763
Median 8
Minimum 6
Maximum 12

Sample

1st row 98.11
2nd row 6,868.24
3rd row 6,603.98
4th row 8,774.98
5th row 2,669.00

Letter

Count 0
Lowercase Letter 0
Space Separator 19602
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 47104
  • BCBS MEDICARE contains many words: 2671 words
  • The most frequent word (11443) occurs over 2.48 times as often as the second most frequent word (3509)

AETNA

categorical

Approximate Distinct Count 20309
Approximate Unique (%) 22.5%
Missing 171513
Missing (%) 65.5%
Memory Size 6.2 MB

Length

Mean 6.982
Standard Deviation 1.8397
Median 7
Minimum 3
Maximum 12

Sample

1st row 9,619.53
2nd row 9,619.53
3rd row 8,877
4th row 8,877
5th row 98.81

Letter

Count 32
Lowercase Letter 0
Space Separator 180526
Uppercase Letter 32
Dash Punctuation 0
Decimal Number 373813
  • AETNA contains many words: 19765 words

AETNA MEDICARE

categorical

Approximate Distinct Count 2881
Approximate Unique (%) 29.4%
Missing 251975
Missing (%) 96.3%
Memory Size 698.8 KB

Length

Mean 8.015
Standard Deviation 1.4906
Median 8
Minimum 6
Maximum 12

Sample

1st row 101.54
2nd row 7,108.62
3rd row 6,835.12
4th row 9,082.10
5th row 2,762.42

Letter

Count 0
Lowercase Letter 0
Space Separator 19602
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 46766
  • AETNA MEDICARE contains many words: 2881 words

CIGNA

categorical

Approximate Distinct Count 13584
Approximate Unique (%) 21.9%
Missing 199603
Missing (%) 76.2%
Memory Size 4.4 MB

Length

Mean 8.5904
Standard Deviation 1.2619
Median 8
Minimum 6
Maximum 12

Sample

1st row 9,896.04
2nd row 9,896.04
3rd row 199.75
4th row 144.00
5th row 199.75

Letter

Count 32
Lowercase Letter 0
Space Separator 124346
Uppercase Letter 32
Dash Punctuation 0
Decimal Number 324969
  • CIGNA contains many words: 13584 words

CIGNA MEDICARE

categorical

Approximate Distinct Count 2651
Approximate Unique (%) 27.1%
Missing 251975
Missing (%) 96.3%
Memory Size 699.4 KB
  • The most frequent value (120.15) occurs over 2.48 times as often as the second most frequent value (36.84)

Length

Mean 8.0756
Standard Deviation 1.4909
Median 8
Minimum 6
Maximum 12

Sample

1st row 103.02
2nd row 7,211.65
3rd row 6,934.18
4th row 9,213.73
5th row 2,802.48

Letter

Count 0
Lowercase Letter 0
Space Separator 19602
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 47317
  • CIGNA MEDICARE contains many words: 2651 words
  • The most frequent word (12015) occurs over 2.48 times as often as the second most frequent word (3684)

HUMANA

categorical

Approximate Distinct Count 9675
Approximate Unique (%) 21.5%
Missing 216852
Missing (%) 82.8%
Memory Size 3.2 MB

Length

Mean 8.7271
Standard Deviation 1.2628
Median 8
Minimum 6
Maximum 12

Sample

1st row 13,679.82
2nd row 13,679.82
3rd row 199.75
4th row 199.75
5th row 343.85

Letter

Count 32
Lowercase Letter 0
Space Separator 89848
Uppercase Letter 32
Dash Punctuation 0
Decimal Number 238987
  • HUMANA contains many words: 9675 words

HUMANA MEDICARE

categorical

Approximate Distinct Count 2654
Approximate Unique (%) 27.1%
Missing 251975
Missing (%) 96.3%
Memory Size 699.3 KB
  • The most frequent value (116.72) occurs over 2.48 times as often as the second most frequent value (35.79)

Length

Mean 8.0595
Standard Deviation 1.4788
Median 8
Minimum 6
Maximum 12

Sample

1st row 100.07
2nd row 7,005.60
3rd row 6,736.06
4th row 8,950.48
5th row 2,722.41

Letter

Count 0
Lowercase Letter 0
Space Separator 19602
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 47174
  • HUMANA MEDICARE contains many words: 2654 words
  • The most frequent word (11672) occurs over 2.48 times as often as the second most frequent word (3579)

MEDCOST

categorical

Approximate Distinct Count 9227
Approximate Unique (%) 20.5%
Missing 216852
Missing (%) 82.8%
Memory Size 3.1 MB

Length

Mean 8.5021
Standard Deviation 1.2595
Median 8
Minimum 6
Maximum 12

Sample

1st row 9,750.51
2nd row 9,750.51
3rd row 204.00
4th row 204.00
5th row 351.17

Letter

Count 32
Lowercase Letter 0
Space Separator 89848
Uppercase Letter 32
Dash Punctuation 0
Decimal Number 231639
  • MEDCOST contains many words: 9227 words

UHC

categorical

Approximate Distinct Count 17293
Approximate Unique (%) 38.5%
Missing 216852
Missing (%) 82.8%
Memory Size 3.1 MB

Length

Mean 8.4954
Standard Deviation 1.2589
Median 8
Minimum 5
Maximum 12

Sample

1st row 9,008.31
2nd row 9,604.98
3rd row 192.95
4th row 195.71
5th row 332.15

Letter

Count 32
Lowercase Letter 0
Space Separator 89894
Uppercase Letter 32
Dash Punctuation 23
Decimal Number 231392
  • UHC contains many words: 17292 words

UHC MEDICARE

categorical

Approximate Distinct Count 2652
Approximate Unique (%) 27.1%
Missing 251975
Missing (%) 96.3%
Memory Size 699.4 KB
  • The most frequent value (120.51) occurs over 2.48 times as often as the second most frequent value (36.95)

Length

Mean 8.0763
Standard Deviation 1.4916
Median 8
Minimum 6
Maximum 12

Sample

1st row 103.32
2nd row 7,232.94
3rd row 6,954.65
4th row 9,240.93
5th row 2,810.73

Letter

Count 0
Lowercase Letter 0
Space Separator 19602
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 47322
  • UHC MEDICARE contains many words: 2652 words
  • The most frequent word (12051) occurs over 2.48 times as often as the second most frequent word (3695)

system

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17.7 MB

Length

Mean 6
Standard Deviation 0
Median 6
Minimum 6
Maximum 6

Sample

1st row ATRIUM
2nd row ATRIUM
3rd row ATRIUM
4th row ATRIUM
5th row ATRIUM

Letter

Count 1570656
Lowercase Letter 0
Space Separator 0
Uppercase Letter 1570656
Dash Punctuation 0
Decimal Number 0
  • system has words of constant length

Interactions

Correlations

Missing Values