Wednesday, January 31, 2024

[FIXED] How to append a new list to an existing CSV file?

January 31, 2024 csv, python, python-2.x

Issue

I already have a CSV file created from a list using csv.writer. I want to append another list, built in a for loop, to that CSV file as new columns.

The first code to create a CSV file is as follows:

with open("output.csv", "wb") as f:
    writer = csv.writer(f)
    for row in zip(master_lst):
        writer.writerow(row)

I created the CSV file using the list master_lst and the output is as follows:

read
ACACCUGGGCUCUCCGGGUACC
ACGGCUACCUUCACUGCCACCC
AGGCAGUGUGGUUAGCUGGUUG

Then I create another list (ind_lst) in a for loop, and the contents of that list have to be appended columnwise to the CSV file created in the previous step. I used the following code:

with open("output.csv", "ab") as f:
    writer = csv.writer(f)
    for row in zip(ind_lst):
        writer.writerow(row)

The output I obtained is as follows:

read
ACACCUGGGCUCUCCGGGUACC
ACGGCUACCUUCACUGCCACCC
AGGCAGUGUGGUUAGCUGGUUG
sample1
3
3
1
sample2
4
4
1

However I need the output columnwise as follows:

read                         sample1     sample2
ACACCUGGGCUCUCCGGGUACC         3            4
ACGGCUACCUUCACUGCCACCC         3            4
AGGCAGUGUGGUUAGCUGGUUG         1            1

I looked for solutions, but I could only find ones that append row by row (for example, "append new row to old csv file python"), whereas I need to append columnwise.

I also tried writer.writerows instead of writer.writerow, but I get this error:

_csv.Error: sequence expected

The output is as follows:

read
ACACCUGGGCUCUCCGGGUACC
ACGGCUACCUUCACUGCCACCC
AGGCAGUGUGGUUAGCUGGUUG
s                        a   m   p  l  e 1

As you can see, it writes each character of the first list element into its own cell and then terminates with an error. I am a beginner in Python, so if anyone could help solve this issue that would be awesome.

EDIT:

The master_lst is created using the following code:

infile = open(sys.argv[1], "r")
lines = infile.readlines()[1:]
master_lst = ["read"]
for line in lines:
    line = line.strip().split(',')
    fourth_field = line[3]
    master_lst.append(fourth_field)

The ind_lst is created using the following code:

for file in files:
    ind_lst = []
    if file.endswith('.fa'):
        # Column header: the file name without its extension.
        first = file.split(".")
        first_field = first[0]
        ind_lst.append(first_field)
        fasta = open(file)
        individual_dict = {}
        for line in fasta:
            line = line.strip()
            if line == '':
                continue
            if line.startswith('>'):
                header = line.lstrip('>')
                individual_dict[header] = ''
            else:
                individual_dict[header] += line
    # Look up every entry of master_lst (skipping the "read" header).
    for key in master_lst[1:]:
        a = 0
        if key in individual_dict.keys():
            a = individual_dict[key]
        else:
            a = 0
        ind_lst.append(a)

Solution

You're actually trying to append several columns to the existing file, even though the data for these new columns is all stored in a single list. It might be better to arrange the data in ind_lst differently, but since you haven't shown how that's done, the code in this answer works with the format in your question.
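To make that rearrangement concrete, here is a minimal Python 3 sketch (not the Python 2 code used in this answer). The helper name split_columns is hypothetical, and it assumes the flat list holds one complete column after another, each with as many entries as the file has rows.

```python
def split_columns(flat, n_rows):
    """Split a flat list into consecutive columns of n_rows items, then transpose to rows."""
    if len(flat) % n_rows != 0:
        raise ValueError("flat list length must be a multiple of n_rows")
    # Cut the flat list into one chunk per column.
    columns = [flat[i:i + n_rows] for i in range(0, len(flat), n_rows)]
    # zip(*columns) pairs the k-th item of every column, giving one tuple per row.
    return list(zip(*columns))


ind_lst = ['sample1', '3', '3', '1', 'sample2', '4', '4', '1']
new_cols = split_columns(ind_lst, 4)
print(new_cols)
# [('sample1', 'sample2'), ('3', '4'), ('3', '4'), ('1', '1')]
```

Each resulting tuple holds the extra values for one row of the file, so it can be appended directly to the row read from the existing CSV.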

Modifying a CSV file in place is tricky because it is really just a text file, so it is much easier to create a new file with the merged data, delete the original, and then rename the new file to match it (you've now been warned).

import csv
from itertools import izip  # Python 2
import os
import tempfile

master_lst = [
    'read',
    'ACACCUGGGCUCUCCGGGUACC',
    'ACGGCUACCUUCACUGCCACCC',
    'AGGCAGUGUGGUUAGCUGGUUG'
]

ind_lst = [
    'sample1',
    '3',
    '3',
    '1',
    'sample2',
    '4',
    '4',
    '1'
]

csv_filename = 'output.csv'

def grouper(n, iterable):
    's -> (s0,s1,...sn-1), (sn,sn+1,...s2n-1), (s2n,s2n+1,...s3n-1), ...'
    return izip(*[iter(iterable)]*n)

# first create file to update
with open(csv_filename, 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(((row,) for row in master_lst))

# Rearrange ind_lst so it's a list of pairs of values.
# The number of resulting pairs should be equal to length of the master_lst.
# Result for example data:  [('sample1', 'sample2'), ('3', '4'), ('3', '4'), ('1', '1')]
new_cols = (zip(*grouper(len(master_lst), ind_lst)))
assert len(new_cols) == len(master_lst)

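# delete=False stops the temp file from being unlinked on close (it is renamed
# instead); dir='.' keeps it on the same filesystem as the CSV for os.rename.
with open(csv_filename, 'rb') as fin, tempfile.NamedTemporaryFile('r+b', dir='.', delete=False) as temp_file: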
    reader = csv.reader(fin)
    writer = csv.writer(temp_file)
    nc = iter(new_cols)
    for row in reader:
        row.extend(next(nc))  # add new columns to each row
        writer.writerow(row)
    else:  # for loop completed, replace original file with temp file
        fin.close()
        os.remove(csv_filename)
        temp_file.flush()  # flush the internal file buffer
        os.fsync(temp_file.fileno())  # force writing of all data in temp file to disk
        os.rename(temp_file.name, csv_filename)

print('done')

Contents of file after creation followed by update:

read,sample1,sample2
ACACCUGGGCUCUCCGGGUACC,3,4
ACGGCUACCUUCACUGCCACCC,3,4
AGGCAGUGUGGUUAGCUGGUUG,1,1
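
For readers on Python 3, the same idea can be sketched as follows. This is an illustration under stated assumptions rather than a port of the code above: the function name append_columns is hypothetical, the whole file is read into memory first, and the temporary file is created with tempfile.mkstemp in the CSV's own directory so that os.replace can swap it in without crossing filesystems. The demo writes into a temporary directory only to avoid touching a real output.csv.

```python
import csv
import os
import tempfile


def append_columns(csv_path, new_cols):
    """Rewrite csv_path with the tuples in new_cols appended to each existing row."""
    directory = os.path.dirname(os.path.abspath(csv_path))
    with open(csv_path, newline='') as fin:
        rows = list(csv.reader(fin))
    if len(rows) != len(new_cols):
        raise ValueError("need exactly one tuple of new values per existing row")
    # Write the merged rows to a temp file next to the original.
    fd, temp_path = tempfile.mkstemp(suffix='.csv', dir=directory)
    try:
        with os.fdopen(fd, 'w', newline='') as fout:
            writer = csv.writer(fout)
            for row, extra in zip(rows, new_cols):
                writer.writerow(row + list(extra))
        os.replace(temp_path, csv_path)  # swap the merged file in for the original
    except Exception:
        os.remove(temp_path)  # clean up the temp file if anything went wrong
        raise


master_lst = ['read', 'ACACCUGGGCUCUCCGGGUACC',
              'ACGGCUACCUUCACUGCCACCC', 'AGGCAGUGUGGUUAGCUGGUUG']
new_cols = [('sample1', 'sample2'), ('3', '4'), ('3', '4'), ('1', '1')]

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, 'output.csv')
    # Create the one-column file first, as in the question.
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows([value] for value in master_lst)
    append_columns(path, new_cols)
    with open(path, newline='') as f:
        result = list(csv.reader(f))

print(result[0])
# ['read', 'sample1', 'sample2']
```

In Python 3 the csv module expects text-mode files opened with newline='', which is why the 'wb' and 'rb' modes from the Python 2 code do not appear here.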

Answered By - martineau

This answer was collected from Stack Overflow and tested by PythonFixing community admins; it is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
