
Python strings in a set give strange results

string-comparison comparison set python

Andy · 3 years ago

My code reads the header of a CSV file and converts it into a column_name => column index lookup:

import csv

class CSVOutput:
  def __init__(self, csv_file, required_columns):
    csv_reader = csv.reader(csv_file)

    # Construct lookup table for header
    self.header = {}
    for idx, column in enumerate(next(csv_reader)):
      print(f"{column.lower().strip()} == key: {column.lower().strip() == 'key'}")
      print(f"{column.lower().strip()} is key: {column.lower().strip() is 'key'}")
      self.header[column.lower().strip()] = idx

    print(self.header)

    # Load the row data into memory/index it against key
    key_idx = self.header['key']

with open("test.csv") as csv_file:
    data = CSVOutput(csv_file, {})

When I run this, I get the following output and error:

{'key': 0, 'col1': 1, 'col2': 2}

key == key: False
key is key: False
col1 == key: False
col1 is key: False
col2 == key: False
col2 is key: False

Traceback (most recent call last):
  File "D:\compare.py", line 74, in <module>
    actual_data = CSVOutput(act_csv, required_columns)
  File "D:\compare.py", line 40, in __init__
    key_idx = self.header['key']
KeyError: 'key'

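Basically, there seems to be an inequality between the literal 'key' and the 'key' loaded from the file. I tried viewing the source file in Notepad++ with all symbols shown, but I saw no difference. I have just looked at the CSV file in a hex editor, and I can see that at the very start, before Key, are the bytes EF BB BF. I am not sure whether this is the root of my problem, but if it is, why doesn't strip() remove it, and how should I handle it?

Any ideas?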

1 reply | up to 3 years ago

Daweo · 3 years ago

EF BB BF

This is the UTF-8 BOM (byte order mark). You can use the utf-8-sig encoding to handle such files by passing the encoding argument to the open function, like this:

with open("test.csv",encoding="utf-8-sig") as csv_file:
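
As for why strip() does not help: decoded as plain UTF-8, the bytes EF BB BF become the character U+FEFF, which Python does not treat as whitespace, so it stays attached to the first column name. Below is a minimal sketch that reproduces this without the original file. The sample bytes and the helper read_header are hypothetical, made up for illustration; only the Key header and the EF BB BF prefix come from the question.

```python
import csv
import io

# Hypothetical sample: a small CSV whose bytes start with the UTF-8 BOM (EF BB BF).
raw = b"\xef\xbb\xbfKey,Col1,Col2\r\n1,a,b\r\n"

def read_header(data: bytes, encoding: str) -> list:
    """Decode `data` with `encoding` and return the lowercased, stripped header row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return [column.lower().strip() for column in next(csv.reader(text))]

plain = read_header(raw, "utf-8")      # BOM survives as U+FEFF in the first name
fixed = read_header(raw, "utf-8-sig")  # BOM is consumed by the codec

print(repr(plain[0]))      # '\ufeffkey'
print(plain[0] == "key")   # False
print("\ufeff".isspace())  # False, which is why strip() leaves it alone
print(fixed)               # ['key', 'col1', 'col2']
```

Printing repr() of a suspicious string, rather than the string itself, is a quick way to make invisible characters like this one show up. utf-8-sig also reads files that have no BOM, so it should be safe to use when you are not sure whether one is present.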
