
How to encode only the categorical data in a DataFrame

  •  -5
  • Ajay Kumar  · Tech Community  · 6 years ago

    How can I encode only the categorical columns of a DataFrame, leaving the numeric columns unchanged?

    Income  Length of Residence Median House Value  Number of Vehicles  Percentage Asian    Percentage Black    Percentage English Speaking Percentage Hispanic Percentage White    MakeDescr   SeriesDescr Msrp
    1   90000   15.0    F   4   1   1   71  6   81  HYUNDAI Sonata-4 Cyl.   19395.0
    2   125000  7.0 H   1   11  1   91  1   81  JEEP    Grand Cherokee-V6   29135.0
    3   90000   8.0 F   1   1   1   71  6   86  JEEP    Liberty 20700.0
    4   125000  8.0 F   3   1   1   86  6   86  VOLKSWAGEN  Passat-V6   28750.0
    5   90000   8.0 F   1   1   1   71  6   81  JEEP    Wrangler    20210.0
    6   110000  7.0 G   5   6   6   71  6   76  HYUNDAI Santa Fe-V6 25645.0
    7   110000  7.0 G   3   11  6   71  6   71  HYUNDAI Sonata-4 Cyl.   15999.0
    8   125000  8.0 G   1   1   11  81  6   76  HYUNDAI Santa Fe-V6 23645.0
    9   125000  9.0 G   1   6   1   91  1   86  CHEVROLET TRUCK Trailblazer EXT 32040.0
    10  110000  8.0 E   2   6   46  81  16  26  JEEP    Wrangler-V6 18660.0
    11  125000  11.0    G   3   6   1   76  1   86  CHEVROLET TRUCK Silverado 2500 HD   31775.0
    12  125000  12.0    G   2   11  6   66  1   71  CHEVROLET   Cobalt  13675.0
    13  125000  13.0    G   2   1   16  95  6   71  HYUNDAI Veracruz-V6 28600.0
    15  110000  11.0    F   5   6   41  61  11  41  HYUNDAI Santa Fe    22499.0
    16  125000  9.0 F   2   1   6   91  1   81  HYUNDAI Santa Fe    22499.0
    17  125000  8.0 G   2   11  11  66  1   66  MITSUBISHI  Endeavor-V6 32602.0
    18  110000  12.0    E   1   6   46  81  16  26  HYUNDAI Accent-4 Cyl.   10899.0
    19  90000   9.0 F   4   1   6   71  6   81  JEEP    Grand Cherokee-6 Cyl.   29080.0
    21  125000  8.0 G   1   6   1   76  1   86  MITSUBISHI  Endeavor-V6 29302.0
    22  110000  12.0    F   2   6   26  66  11  51  HYUNDAI Santa Fe    22499.0
    23  90000   9.0 F   1   6   6   66  6   76  HYUNDAI Santa Fe-V6 20995.0
    24  125000  9.0 H   1   6   1   91  1   81  HYUNDAI Sonata-V6   18799.0
    25  90000   14.0    F   2   1   6   71  11  81  HYUNDAI Elantra-4 Cyl.  13299.0
    26  125000  9.0 G   3   1   11  81  6   76  JEEP    Grand Cherokee-6 Cyl.   29080.0
    27  125000  8.0 H   5   6   1   91  1   81  CHEVROLET TRUCK Trailblazer 29395.0
    28  110000  12.0    E   4   6   41  61  11  36  HYUNDAI Sonata-4 Cyl.   15999.0
    29  110000  10.0    E   1   6   41  61  11  36  HYUNDAI Santa Fe-V6 20995.0
    30  125000  10.0    F   2   6   1   71  6   86  CHEVROLET TRUCK Tahoe   37000.0
    32  90000   10.0    F   1   1   1   71  6   86  MITSUBISHI  Galant-V6   19997.0
    33  125000  12.0    F   1   1   1   86  6   86  CHEVROLET TRUCK Trailblazer 28175.0
    ... ... ... ... ... ... ... ... ... ... ... ... ...
    4451    110000  9.0 F   3   6   41  61  11  36  NISSAN  Sentra-4 Cyl.   17990.0
    4452    125000  11.0    G   2   1   11  81  6   76  CHEVROLET TRUCK Tahoe   39515.0
    4453    125000  8.0 H   1   6   1   91  1   81  HYUNDAI Elantra-4 Cyl.  15195.0
    4454    110000  10.0    F   3   6   41  61  11  41  HYUNDAI Genesis-4 Cyl.  26750.0
    4455    125000  7.0 H   4   11  1   76  1   76  HYUNDAI Sonata-4 Cyl.   19695.0
    4456    125000  9.0 G   5   6   1   76  1   86  NISSAN  Altima  22500.0
    4457    110000  11.0    E   1   6   46  81  16  26  GMC LIGHT DUTY  Denali  51935.0
    4458    125000  6.0 H   1   11  1   76  1   76  JEEP    Liberty-V6  24865.0
    4459    125000  12.0    G   3   1   16  95  6   71  HONDA   Accord-V6   26700.0
    4460    125000  7.0 F   1   1   1   86  6   86  HYUNDAI Veloster-4 Cyl. 17300.0
    4461    90000   10.0    F   2   6   11  66  6   71  CADILLAC    SRX-V6  42210.0
    4463    110000  8.0 F   3   6   26  61  11  56  GMC LIGHT DUTY  Acadia  42390.0
    4468    125000  8.0 G   1   1   1   91  1   86  HONDA   Pilot-V6    40820.0
    4469    125000  10.0    H   5   11  1   91  1   81  TOYOTA  Highlander-V6   30695.0
    4470    110000  12.0    F   1   6   41  61  11  41  HYUNDAI Elantra-4 Cyl.  15195.0
    4473    110000  13.0    F   1   6   21  66  6   61  ACURA   TSX 32910.0
    4476    125000  9.0 G   1   6   1   76  1   86  BMW X3  36750.0
    4482    125000  10.0    H   1   6   1   91  1   81  SUBARU  Forester-4 Cyl. 21195.0
    4486    125000  11.0    H   2   6   1   91  1   81  GMC LIGHT DUTY  Yukon XL    44315.0
    4492    125000  10.0    H   2   6   1   91  1   81  BMW 5 Series    53400.0
    4493    110000  12.0    G   2   6   6   71  6   76  ACURA   TL  33725.0
    4494    125000  12.0    F   3   1   1   86  6   86  ACURA   TL  33725.0
    4495    125000  12.0    F   3   1   1   86  6   86  ACURA   TL  33725.0
    4496    125000  7.0 G   5   1   11  81  6   76  ACURA   TL  33325.0
    4497    125000  9.0 G   1   6   1   76  1   86  ACURA   TL  33725.0
    4498    125000  12.0    G   3   1   11  81  6   76  ACURA   TL  33725.0
    4499    110000  14.0    G   8   11  6   71  6   71  ACURA   TL  33725.0
    4501    125000  9.0 G   3   11  6   66  1   71  FORD    Taurus-V6   20050.0
    4502    110000  2.0 G   4   11  6   71  6   71  DODGE   Stratus-4 Cyl.  15910.0
    4503    125000  8.0 F   1   1   1   86  6   86  DODGE   Stratus-4 Cyl.  19145.0
    
    1 answer  |  as of 6 years ago
  •  0
  •   Victor Valente    6 years ago
    # Using the standard scikit-learn LabelEncoder.
    from sklearn.preprocessing import LabelEncoder

    # Encode all string columns, assuming every categorical column has dtype object.
    # A separate encoder is kept per column so the codes can be mapped back later.
    encoders = {}
    for c in df.select_dtypes(include=['object']).columns:
        print("Encoding column " + c)
        encoders[c] = LabelEncoder()
        df[c] = encoders[c].fit_transform(df[c])
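
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same idea. The three-row frame is a hypothetical sample shaped like the table in the question, and `encode_categoricals` is an illustrative helper name, not part of pandas or scikit-learn. It assumes every non-numeric column is categorical, which may not hold if the frame also contains dates or booleans.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample with a few of the question's columns (not the real data set).
df = pd.DataFrame({
    "Income": [90000, 125000, 90000],
    "Median House Value": ["F", "H", "F"],
    "MakeDescr": ["HYUNDAI", "JEEP", "JEEP"],
    "Msrp": [19395.0, 29135.0, 20700.0],
})


def encode_categoricals(frame):
    """Return an encoded copy of frame and the fitted encoder for each column.

    Assumption: every non-numeric column is categorical.
    """
    out = frame.copy()
    encoders = {}
    # Numeric columns are skipped; only the remaining columns are encoded.
    for col in out.select_dtypes(exclude=["number"]).columns:
        enc = LabelEncoder()
        out[col] = enc.fit_transform(out[col])
        encoders[col] = enc
    return out, encoders


encoded, encoders = encode_categoricals(df)
print(encoded)

# The stored encoders can map integer codes back to the original labels.
print(encoders["MakeDescr"].inverse_transform(encoded["MakeDescr"]))
```

`LabelEncoder` assigns codes in sorted label order, so here `F` becomes 0 and `H` becomes 1. The scikit-learn documentation describes `LabelEncoder` as intended for target labels; for feature columns, `OrdinalEncoder` or one-hot encoding may be a better fit depending on the model.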