Scanning a directory tree with Python and reading .csv files into a DataFrame

  • Stacey  · 7 years ago

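    I am trying to walk a directory tree, and for every csv file I encounter along the way I want to open the file and read columns 0 and 15 into a DataFrame (I will then process it and move on to the next file). I can walk the directory tree with the following: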

    rootdir = r'C:/Users/stacey/Documents/Alco/auditopt/'
    for dirName,sundirList, fileList in os.walk(rootdir):
             print('Found directory: %s' % dirName)
             for fname in fileList:
                 print('\t%s' % fname)
                 df = pd.read_csv(fname, header=1, usecols=[0,15],parse_dates=[0], dayfirst=True,index_col=[0], names=['date', 'total_pnl_per_pos'])
                 print(df)
    

    But I get the error:

    FileNotFoundError: File b'auditopt.os-pnl.BBG_XASX_ARB_S-BBG_XTKS_7240_S.csv' does not exist.
    

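    The files I am trying to read definitely exist. They are in MS Excel .csv format, so I don't know whether that is the problem. If it is, could someone tell me how to read an MS Excel .csv into a DataFrame?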

    Found directory: C:/Users/stacey/Documents/Alco/auditopt/
    Found directory: C:/Users/stacey/Documents/Alco/auditopt/roll_597_oe_2017-03-10
            tradeopt.os-pnl.BBG_XASX_ARB_S-BBG_XTKS_7240_S.csv
    Traceback (most recent call last):
    
      File "<ipython-input-24-3753e367432d>", line 1, in <module>
        runfile('C:/Users/stacey/Documents/scripts/Pair_Results_Code_1.0.py', wdir='C:/Users/stacey/Documents/scripts')
    
      File "C:\Anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
        execfile(filename, namespace)
    
      File "C:\Anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
        exec(compile(f.read(), filename, 'exec'), namespace)
    
      File "C:/Users/stacey/Documents/scripts/Pair_Results_Code_1.0.py", line 49, in <module>
        main()
    
      File "C:/Users/stacey/Documents/scripts/Pair_Results_Code_1.0.py", line 36, in main
        df = pd.read_csv(fname, header=1, usecols=[0,15],parse_dates=[0], dayfirst=True,index_col=[0], names=['date', 'total_pnl_per_pos'])
    
      File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 646, in parser_f
        return _read(filepath_or_buffer, kwds)
    
      File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 389, in _read
        parser = TextFileReader(filepath_or_buffer, **kwds)
    
      File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 730, in __init__
        self._make_engine(self.engine)
    
      File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 923, in _make_engine
        self._engine = CParserWrapper(self.f, **self.options)
    
      File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 1390, in __init__
        self._reader = _parser.TextReader(src, **kwds)
    
      File "pandas\parser.pyx", line 373, in pandas.parser.TextReader.__cinit__ (pandas\parser.c:4184)
    
      File "pandas\parser.pyx", line 667, in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:8449)
    
    FileNotFoundError: File b'tradeopt.os-pnl.BBG_XASX_ARB_S-BBG_XTKS_7240_S.csv' does not exist
    
    1 answer  |  as of 6 years ago
  •   cs95 abhishek58g    7 years ago

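    By default, os.walk does not give you full paths: the entries in the file list are bare file names, relative to the directory currently being visited. pd.read_csv then looks for them in the current working directory, which is why the file is reported as not existing. You need to build the full path yourself.

    Use os.path.join to make this easy.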

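    import os
    # Inside the inner loop: dirName and fname are the loop variables
    # from the os.walk loop in the question.
    full_path = os.path.join(dirName, fname)
    df = pd.read_csv(full_path, ...)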
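
    To put the pieces together, here is a minimal sketch of the path-building step on its own. The helper name iter_csv_paths and the .csv extension filter are assumptions added for illustration; neither appears in the question or the answer above.

```python
import os


def iter_csv_paths(rootdir):
    """Yield the full path of every .csv file found under rootdir.

    Hypothetical helper for illustration only.
    """
    for dir_name, _subdirs, file_names in os.walk(rootdir):
        for fname in file_names:
            # Skip anything that is not a csv file (assumed filter).
            if not fname.lower().endswith('.csv'):
                continue
            # os.walk yields bare file names, so prepend the directory
            # currently being visited to get a path that can be opened.
            yield os.path.join(dir_name, fname)
```

    Each yielded path can then be passed to pd.read_csv in place of fname, with the same keyword arguments as in the question. Keeping the walk in a separate generator means the path logic can be checked without reading any files.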