Overview

Dataset Statistics

Number of Variables 19
Number of Rows 142760
Missing Cells 637338
Missing Cells (%) 23.5%
Duplicate Rows 501
Duplicate Rows (%) 0.4%
Total Size in Memory 159.1 MB
Average Row Size in Memory 1.1 KB
Variable Types
  • Categorical: 19
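The percentages in the statistics above can be checked from the raw counts. The sketch below assumes the missing-cell percentage is taken over every cell (rows × variables) and the duplicate percentage over rows; the report does not state its denominators.

```python
# Counts copied from the Dataset Statistics above.
n_variables = 19
n_rows = 142760
missing_cells = 637338
duplicate_rows = 501

# Assumption: the denominator for missing cells is every cell in the table.
total_cells = n_variables * n_rows
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_rows

print(f"{missing_pct:.1f}%")    # 23.5%
print(f"{duplicate_pct:.1f}%")  # 0.4%
```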

Dataset Insights

CPT/MS-DRG has 76528 (53.61%) missing values
BCBS Medicare has 104578 (73.25%) missing values
Cigna Medicare has 104578 (73.25%) missing values
Humana Medicare has 104578 (73.25%) missing values
UHC Medicare has 104578 (73.25%) missing values
Tricare has 142200 (99.61%) missing values
Procedure Description has a high cardinality: 40324 distinct values
CPT/MS-DRG has a high cardinality: 7450 distinct values
Gross Charge has a high cardinality: 9591 distinct values
Aetna has a high cardinality: 14378 distinct values
Aetna Medicare has a high cardinality: 13214 distinct values
BCBS Medicare has a high cardinality: 4563 distinct values
BCBS has a high cardinality: 15641 distinct values
Cigna has a high cardinality: 14236 distinct values
Cigna Medicare has a high cardinality: 4563 distinct values
Humana has a high cardinality: 11629 distinct values
Humana Medicare has a high cardinality: 4562 distinct values
UHC has a high cardinality: 3553 distinct values
UHC Medicare has a high cardinality: 4772 distinct values
MedCost has a high cardinality: 11616 distinct values
De-identified Minimum has a high cardinality: 1682 distinct values
De-identified Maximum has a high cardinality: 7398 distinct values
Tricare has constant value " * "
system has constant value "NOVANT"
Tricare has constant length 3
system has constant length 6
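Several of these insights come from placeholder strings (" * ", " ** ", "N/A") and from columns that carry a single value. The following is a minimal sketch of how such cells and columns might be detected before analysis. It assumes the placeholders mean "no value", which the report does not confirm; `normalize_cell`, `constant_columns`, and the sample rows are hypothetical and for illustration only.

```python
# Assumption: these strings are placeholders for "no value";
# the report does not define them, so verify against the source files.
PLACEHOLDERS = {"*", "**", "N/A", ""}

def normalize_cell(value):
    """Return None for missing or placeholder cells, else the stripped text."""
    if value is None:
        return None
    text = str(value).strip()
    return None if text in PLACEHOLDERS else text

def constant_columns(rows):
    """Names of columns with at most one distinct non-placeholder value."""
    seen = {}
    for row in rows:
        for name, value in row.items():
            cell = normalize_cell(value)
            seen.setdefault(name, set())
            if cell is not None:
                seen[name].add(cell)
    return sorted(name for name, values in seen.items() if len(values) <= 1)

# Tiny made-up sample shaped like the profiled table (not real rows).
rows = [
    {"Gross Charge": "1472", "Aetna": " * ", "Tricare": " * ", "system": "NOVANT"},
    {"Gross Charge": "1780", "Aetna": " $117.68 ", "Tricare": None, "system": "NOVANT"},
    {"Gross Charge": "2092", "Aetna": " $95.10 ", "Tricare": None, "system": "NOVANT"},
]
print(constant_columns(rows))  # ['Tricare', 'system']
```

Columns flagged this way, such as Tricare and system here, are candidates for dropping, since they cannot distinguish one row from another.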

Variables

Procedure Description

categorical

Approximate Distinct Count 40324
Approximate Unique (%) 28.3%
Missing 298
Missing (%) 0.2%
Memory Size 14.7 MB

Length

Mean 43.4754
Standard Deviation 17.7046
Median 47
Minimum 5
Maximum 230

Sample

1st row HC IP PRIVATE
2nd row HC IP L&D
3rd row HC BEH MED PRIVATE
4th row HC INPATIENT REHAB
5th row HC BEH MED SEMIPRI...

Letter

Count 3580634
Lowercase Letter 613504
Space Separator 871005
Uppercase Letter 2967130
Dash Punctuation 209389
Decimal Number 1315609
  • Procedure Description contains many words: 73890 words
  • The most frequent word (hc) occurs over 2.62 times as often as the second most frequent word (po)
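The character-class rows under "Letter" correspond to Unicode general categories. The sketch below shows one way such counts could be reproduced with the standard library. It assumes "Count" is the number of lowercase plus uppercase letters, which matches the figures above (613504 + 2967130 = 3580634); `char_class_counts` is a hypothetical helper, not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Unicode general categories matching the report's row labels.
CATEGORY_LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}

def char_class_counts(values):
    """Count characters per Unicode category across all strings in values."""
    counts = Counter({label: 0 for label in CATEGORY_LABELS.values()})
    for text in values:
        for ch in text:
            label = CATEGORY_LABELS.get(unicodedata.category(ch))
            if label:
                counts[label] += 1
    # Assumption: "Count" is lowercase + uppercase letters.
    counts["Count"] = counts["Lowercase Letter"] + counts["Uppercase Letter"]
    return dict(counts)

# First three sample rows of Procedure Description.
sample = ["HC IP PRIVATE", "HC IP L&D", "HC BEH MED PRIVATE"]
print(char_class_counts(sample))
```

Characters outside the five listed categories, such as "&", are ignored in this sketch.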

CPT/MS-DRG

categorical

Approximate Distinct Count 7450
Approximate Unique (%) 11.2%
Missing 76528
Missing (%) 53.6%
Memory Size 4.4 MB
  • The most frequent value (*) occurs over 11.52 times as often as the second most frequent value (G0480)

Length

Mean 4.3709
Standard Deviation 1.354
Median 5
Minimum 1
Maximum 10

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 7882
Lowercase Letter 0
Space Separator 0
Uppercase Letter 7882
Dash Punctuation 0
Decimal Number 273623
  • CPT/MS-DRG contains many words: 7449 words
  • The most frequent word (g0480) occurs over 2.04 times as often as the second most frequent word (80307)
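The frequency-ratio insights compare how often the two most common entries occur, not their magnitudes. A minimal sketch of that comparison follows; `top_two_ratio` is a hypothetical helper and the codes are made up to mirror the shape of this column, not taken from the data.

```python
from collections import Counter

def top_two_ratio(values):
    """Return (top, runner_up, ratio of their counts), skipping missing cells."""
    counts = Counter(v for v in values if v is not None)
    if len(counts) < 2:
        return None
    (top, n_top), (second, n_second) = counts.most_common(2)
    return top, second, n_top / n_second

# Made-up codes for illustration; only the shape mirrors the CPT/MS-DRG column.
codes = ["*", "*", "*", "*", "*", "*", "G0480", "G0480", "80307", None]
print(top_two_ratio(codes))  # ('*', 'G0480', 3.0)
```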

Gross Charge

categorical

Approximate Distinct Count 9591
Approximate Unique (%) 6.7%
Missing 0
Missing (%) 0.0%
Memory Size 9.3 MB
  • The most frequent value (**) occurs over 15.3 times as often as the second most frequent value (0.05)

Length

Mean 3.5089
Standard Deviation 1.1617
Median 3
Minimum 1
Maximum 8

Sample

1st row 1472
2nd row 1472
3rd row 1780
4th row 2092
5th row 1780

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 418180
  • Gross Charge contains many words: 8588 words
  • The most frequent word (005) occurs over 2.4 times as often as the second most frequent word (6560)

Aetna

categorical

Approximate Distinct Count 14378
Approximate Unique (%) 10.1%
Missing 0
Missing (%) 0.0%
Memory Size 9.9 MB
  • The most frequent value ( ** ) occurs over 3.11 times as often as the second most frequent value ( * )

Length

Mean 7.7199
Standard Deviation 2.4744
Median 9
Minimum 3
Maximum 12

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 0
Lowercase Letter 0
Space Separator 285520
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 515276
  • Aetna contains many words: 14376 words
  • The most frequent word (003) occurs over 2.87 times as often as the second most frequent word (006)

Aetna Medicare

categorical

Approximate Distinct Count 13214
Approximate Unique (%) 9.3%
Missing 0
Missing (%) 0.0%
Memory Size 9.5 MB

Length

Mean 4.6104
Standard Deviation 1.9842
Median 5
Minimum 1
Maximum 13

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 0
Lowercase Letter 0
Space Separator 19612
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 513151
  • Aetna Medicare contains many words: 11275 words

BCBS Medicare

categorical

Approximate Distinct Count 4563
Approximate Unique (%) 11.9%
Missing 104578
Missing (%) 73.2%
Memory Size 2.7 MB
  • The most frequent value ( * ) occurs over 5.24 times as often as the second most frequent value ( $117.68 )

Length

Mean 8.5514
Standard Deviation 3.0829
Median 9
Minimum 3
Maximum 13

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 0
Lowercase Letter 0
Space Separator 76364
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 166173
  • BCBS Medicare contains many words: 4562 words

BCBS

categorical

Approximate Distinct Count 15641
Approximate Unique (%) 11.0%
Missing 0
Missing (%) 0.0%
Memory Size 9.9 MB
  • The most frequent value ( ** ) occurs over 2.49 times as often as the second most frequent value ( * )

Length

Mean 8.0216
Standard Deviation 2.5027
Median 9
Minimum 3
Maximum 13

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 18
Lowercase Letter 0
Space Separator 285520
Uppercase Letter 18
Dash Punctuation 0
Decimal Number 552603
  • BCBS contains many words: 15639 words
  • The most frequent word (003) occurs over 3.11 times as often as the second most frequent word (004)

Cigna

categorical

Approximate Distinct Count 14236
Approximate Unique (%) 10.0%
Missing 0
Missing (%) 0.0%
Memory Size 9.9 MB
  • The most frequent value ( N/A ) occurs over 2.27 times as often as the second most frequent value ( * )

Length

Mean 7.9108
Standard Deviation 2.392
Median 9
Minimum 3
Maximum 13

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 36144
Lowercase Letter 0
Space Separator 285520
Uppercase Letter 36144
Dash Punctuation 0
Decimal Number 521952
  • Cigna contains many words: 14234 words
  • The most frequent word (na) occurs over 9.72 times as often as the second most frequent word (003)

Cigna Medicare

categorical

Approximate Distinct Count 4563
Approximate Unique (%) 11.9%
Missing 104578
Missing (%) 73.2%
Memory Size 2.7 MB
  • The most frequent value ( * ) occurs over 5.24 times as often as the second most frequent value ( $115.44 )

Length

Mean 8.5514
Standard Deviation 3.0829
Median 9
Minimum 3
Maximum 13

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 0
Lowercase Letter 0
Space Separator 76364
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 166173
  • Cigna Medicare contains many words: 4562 words

Humana

categorical

Approximate Distinct Count 11629
Approximate Unique (%) 8.1%
Missing 0
Missing (%) 0.0%
Memory Size 9.9 MB
  • The most frequent value ( ** ) occurs over 3.11 times as often as the second most frequent value ( * )

Length

Mean 7.8379
Standard Deviation 2.5487
Median 9
Minimum 3
Maximum 13

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 0
Lowercase Letter 0
Space Separator 285520
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 527891
  • Humana contains many words: 11627 words
  • The most frequent word (004) occurs over 2.99 times as often as the second most frequent word (472320)

Humana Medicare

categorical

Approximate Distinct Count 4562
Approximate Unique (%) 11.9%
Missing 104578
Missing (%) 73.2%
Memory Size 2.7 MB
  • The most frequent value ( * ) occurs over 5.24 times as often as the second most frequent value ( $112.08 )

Length

Mean 8.5425
Standard Deviation 3.0765
Median 9
Minimum 3
Maximum 13

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 0
Lowercase Letter 0
Space Separator 76364
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 165908
  • Humana Medicare contains many words: 4561 words

UHC

categorical

Approximate Distinct Count 3553
Approximate Unique (%) 2.5%
Missing 0
Missing (%) 0.0%
Memory Size 9.8 MB

Length

Mean 6.4517
Standard Deviation 2.541
Median 7
Minimum 1
Maximum 12

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 13044
Lowercase Letter 0
Space Separator 19612
Uppercase Letter 13044
Dash Punctuation 0
Decimal Number 754875
  • The top 2 categories (3925.33, 4378.0032) take over 50.0%
  • UHC contains many words: 3508 words

UHC Medicare

categorical

Approximate Distinct Count 4772
Approximate Unique (%) 12.5%
Missing 104578
Missing (%) 73.2%
Memory Size 2.6 MB
  • The most frequent value (*) occurs over 5.38 times as often as the second most frequent value (112.08)

Length

Mean 5.5147
Standard Deviation 2.5747
Median 6
Minimum 1
Maximum 13

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 0
Lowercase Letter 0
Space Separator 6004
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 162920
  • UHC Medicare contains many words: 4583 words

Tricare

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.2%
Missing 142200
Missing (%) 99.6%
Memory Size 37.2 KB

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 0
Lowercase Letter 0
Space Separator 1120
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 0

MedCost

categorical

Approximate Distinct Count 11616
Approximate Unique (%) 8.1%
Missing 0
Missing (%) 0.0%
Memory Size 9.9 MB
  • The most frequent value ( ** ) occurs over 3.09 times as often as the second most frequent value ( * )

Length

Mean 7.8394
Standard Deviation 2.5444
Median 9
Minimum 3
Maximum 13

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 26
Lowercase Letter 0
Space Separator 285520
Uppercase Letter 26
Dash Punctuation 0
Decimal Number 528256
  • MedCost contains many words: 11614 words

De-identified Minimum

categorical

Approximate Distinct Count 1682
Approximate Unique (%) 1.2%
Missing 0
Missing (%) 0.0%
Memory Size 9.8 MB
  • The most frequent value ( ** ) occurs over 2.46 times as often as the second most frequent value ( $1,140.00 )

Length

Mean 7.1029
Standard Deviation 2.5523
Median 8
Minimum 3
Maximum 12

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 0
Lowercase Letter 0
Space Separator 285520
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 433236
  • De-identified Minimum contains many words: 1680 words
  • The most frequent word (114000) occurs over 7.56 times as often as the second most frequent word (100)
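The price columns are profiled as categorical because their cells are strings such as " $1,140.00 ", " ** ", or "0.05". The sketch below shows one way such cells might be converted to numbers. It assumes "*", "**", and "N/A" are placeholders rather than prices, which the report does not confirm; `parse_price` is a hypothetical helper.

```python
from decimal import Decimal, InvalidOperation

# Assumption: these strings mark "no price" rather than a real amount.
PLACEHOLDERS = {"*", "**", "N/A", ""}

def parse_price(value):
    """Convert a cell such as ' $1,140.00 ' to Decimal; None if not a price."""
    if value is None:
        return None
    text = str(value).strip()
    if text in PLACEHOLDERS:
        return None
    # Drop the currency symbol and thousands separators before parsing.
    cleaned = text.replace("$", "").replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None

print(parse_price(" $1,140.00 "))  # 1140.00
print(parse_price(" ** "))         # None
print(parse_price("0.05"))         # 0.05
```

Decimal is used here instead of float so that currency amounts keep their exact two-decimal representation.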

De-identified Maximum

categorical

Approximate Distinct Count 7398
Approximate Unique (%) 5.2%
Missing 0
Missing (%) 0.0%
Memory Size 10.2 MB

Length

Mean 10.2479
Standard Deviation 1.9965
Median 11
Minimum 3
Maximum 20

Sample

1st row *
2nd row *
3rd row *
4th row *
5th row *

Letter

Count 0
Lowercase Letter 0
Space Separator 285520
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 791056
  • De-identified Maximum contains many words: 7397 words

Filename

categorical

Approximate Distinct Count 12
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 17.8 MB

Length

Mean 65.4341
Standard Deviation 2.7539
Median 68
Minimum 61
Maximum 70

Sample

1st row 56-0547479_NovantH...
2nd row 56-0547479_NovantH...
3rd row 56-0547479_NovantH...
4th row 56-0547479_NovantH...
5th row 56-0547479_NovantH...

Letter

Count 7485486
Lowercase Letter 6764391
Space Separator 0
Uppercase Letter 721095
Dash Punctuation 142760
Decimal Number 1284840

system

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 9.7 MB

Length

Mean 6
Standard Deviation 0
Median 6
Minimum 6
Maximum 6

Sample

1st row NOVANT
2nd row NOVANT
3rd row NOVANT
4th row NOVANT
5th row NOVANT

Letter

Count 856560
Lowercase Letter 0
Space Separator 0
Uppercase Letter 856560
Dash Punctuation 0
Decimal Number 0
  • system has words of constant length

Missing Values