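Has anyone run into this problem before? Any suggestions about the cause?
The script creates the file containing the genome sequence, but the error only appears at the end of the process.
The line in my script: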
File "scripts/list_ncbi_download_genome_vs_02.py", line 97, in <module>
SeqIO.write(SeqIO.parse(genbank_file, "genbank"), genome_file, "fasta")
The error that appears (rest of the traceback):
File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/__init__.py", line 481, in write
count = writer_class(fp).write_file(sequences)
File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/Interfaces.py", line 209, in write_file
count = self.write_records(records)
File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/Interfaces.py", line 193, in write_records
for record in records:
File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/__init__.py", line 600, in parse
for r in i:
File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 478, in parse_records
record = self.parse(handle, do_features)
File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 462, in parse
if self.feed(handle, consumer, do_features):
File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 434, in feed
self._feed_feature_table(consumer, self.parse_features(skip=False))
File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 159, in parse_features
raise ValueError("Premature end of line during features table")
I could live with this, but it is not nice to run a process to completion and only then have the error show up.
The file can be downloaded from:
https://github.com/felipelira/files_to_test/blob/master/GCF_000302915.1_Pav631_1.0_genomic.gbff
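One thing I could rule out first, assuming the cause is a truncated download or an incomplete decompression (I have not confirmed this): every record in a GenBank flat file ends with a line containing only `//`, so a file whose last non-blank line is something else was probably cut short. The sketch below uses only the standard library; the function name `genbank_looks_complete` and the `tail_bytes` parameter are made up for illustration.

```python
import io


def genbank_looks_complete(path, tail_bytes=4096):
    """Return True if the last non-blank line of the file is the '//' terminator.

    Only the final tail_bytes of the file are read, so this stays cheap
    even for large .gbff files.
    """
    with io.open(path, "rb") as handle:
        handle.seek(0, 2)                       # jump to the end of the file
        size = handle.tell()
        handle.seek(max(0, size - tail_bytes))  # step back to read the tail
        tail = handle.read().decode("ascii", "replace")
    lines = [line.strip() for line in tail.splitlines() if line.strip()]
    return bool(lines) and lines[-1] == "//"
```

Running this on each entry of `list_uncompressed` before the conversion loop would flag a suspicious file up front instead of at the end of parsing. It does not validate the records themselves, only that the file ends where a record ends.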
The block in my script that calls the command is:
## rename and move files to the output directory created in the command line:
genome_dict = {}
genome_list = []
for genbank_file in list_uncompressed:
    organism = genbank_file.split('/')[0]
    file_name = genbank_file.split('/')[-1]
    genome_file = organism + '_' + file_name.split('_')[0] + '_' + file_name.split('_')[1] + '.fna'
    genome_list.append(genome_file)
    genome_dict[genome_file.replace('.fna', '')] = organism
    #print genome_dict
    print "Dealing with GenBank record %s" % genome_file
    SeqIO.write(SeqIO.parse(genbank_file, "genbank"), os.path.join(outdir, genome_file), "fasta")
    print "Genome saved %s" % genome_file
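As far as I can tell from the traceback, `SeqIO.parse` yields records one at a time and `SeqIO.write` writes them as they arrive, which would explain why the `.fna` file already exists when the `ValueError` is raised on a later record. A possible workaround, sketched below under that assumption, is to write to a temporary file and only move it into place if the whole conversion succeeds. The helper `convert_atomically` and its `convert` callback are hypothetical names, not part of Biopython.

```python
import os
import tempfile


def convert_atomically(convert, source_path, target_path):
    """Run convert(source_path, tmp_path); keep the output only on success."""
    target_dir = os.path.dirname(target_path) or "."
    # Create the temporary file next to the target so the rename stays
    # on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".part", dir=target_dir)
    os.close(fd)
    try:
        result = convert(source_path, tmp_path)
        # On POSIX this replaces an existing target; on Windows it fails
        # if the target already exists.
        os.rename(tmp_path, target_path)
        return result
    except Exception:
        # Remove the partial output, then let the original error propagate.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Hypothetical usage inside the loop above (requires Biopython):
# from Bio import SeqIO
# convert_atomically(
#     lambda src, dst: SeqIO.write(SeqIO.parse(src, "genbank"), dst, "fasta"),
#     genbank_file, os.path.join(outdir, genome_file))
```

This does not fix the parsing error itself. It only ensures that a failed conversion leaves no half-written FASTA file in the output directory.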