Reading FASTA-format data into a dictionary with Python.
def read_fasta(file_path):
    """
    Read a FASTA file and return a dictionary of sequence IDs and sequences.

    :param file_path: path to the FASTA file
    :return: a dictionary whose keys are sequence IDs and whose values are the corresponding sequences
    """
    sequences = {}
    sequence_id = None
    sequence_lines = []
    with open(file_path, 'r') as file:
        for line in file:
            line = line.strip()
            if line.startswith('>'):
                # A new header closes the previous record, so store it first
                if sequence_id is not None:
                    sequences[sequence_id] = ''.join(sequence_lines)
                sequence_id = line[1:]  # drop '>' and use the rest of the header line as the sequence ID
                sequence_lines = []
            else:
                sequence_lines.append(line)
    # Don't forget to add the last sequence after the loop
    if sequence_id is not None:
        sequences[sequence_id] = ''.join(sequence_lines)
    return sequences
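
read_fasta uses the entire header line (everything after '>') as the dictionary key, so a header such as ">seq1 first example" produces the key "seq1 first example". Many FASTA files put a free-text description after the ID. The sketch below is a hypothetical variant, not part of the original function: it assumes the ID is the first whitespace-delimited token of the header, and the names read_fasta_ids and demo are made up for illustration. It also skips blank lines and ignores any sequence text that appears before the first header.

```python
import os
import tempfile


def read_fasta_ids(file_path):
    """Hypothetical variant of read_fasta: keys are the first token of each header."""
    sequences = {}
    sequence_id = None
    with open(file_path, "r") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # Assumption: the ID is the text before the first whitespace.
                fields = line[1:].split(None, 1)
                sequence_id = fields[0] if fields else ""
                sequences[sequence_id] = []
            elif sequence_id is not None:
                # Collect sequence lines; they are joined once at the end.
                sequences[sequence_id].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in sequences.items()}


def demo():
    """Write a tiny two-record FASTA file to a temp path and parse it."""
    content = ">seq1 first example\nACGT\nTTGA\n>seq2\nGGCC\n"
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
        tmp.write(content)
        path = tmp.name
    try:
        return read_fasta_ids(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

As in read_fasta, a repeated ID overwrites the earlier record. Whether truncating the header is appropriate depends on the file: if two records share a first token but differ in their descriptions, the full-header keys of read_fasta keep them distinct while this variant does not.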