当前位置：首页 > news >正文

对多个 CSV 文件做关联

news 来源：原创 2024/5/2 17:36:12

【问题】
I have two CSV files as follows:

CSV1:

**ID Name Address Ph**
 1 Mr.C dsf 142
 2 Ms.N asd 251
 4 Mr.V fgg 014
 12 Ms.S trw 547

CSV2:

 **ID Name Service Day**
1 Mr.C AAA Mon
 2 Ms.N AAA Mon
 2 Ms.N BBB Tue
 2 Ms.N AAA Sat

As you can see very quickly CSV1 file is unique in having only 1 instance of every ID whilst CSV2 has repeats.

I am trying to match two CSV files based on ID and then wherever they match adding to CSV2 file the Address and Ph fields from CSV1. This is then saved as a new output file leaving the two original CSV files intact.

I have written a code but here’s what’s happening:

Either all the entries from CSV1 get added against the last row of CSV2
Or all the entries from CSV2 get the same address details appended against them

Here’s what I have done so far.

import csv
csv1=open('C:\csv1file.csv')
csv2=open('C:\csv2file.csv')
csv1reader=csv.reader(csv1)
csv2reader=csv.reader(csv2)
outputfile=open('C:\mapped.csv', 'wb')
csvwriter=csv.writer(outputfile)
counter=0
header1=csv1reader.next()
header2=csv2reader.next()
csvwriter.writerow(header2+header1[2:4])
for row1 in csv1reader:
 for row2 in csv2reader:
 if row1[0]==row2[0]:
 counter=counter+1
 csvwriter.writerow(row2+row1[2:4])

I am running this code in Python 2.7. As you might have guessed the two different results that I am getting are based on the indentation of the csvwriter statement in the above code. I feel I am quite close to the answer and understand the logic but somehow the loop doesn’t loop very well.

Can any one of you please assist?

Thanks.

别人给出的解法，有 bug 未解决

csvwriter.writerow(header2+header1[2:4])
csv2copy=[]
for row2 in csv2reader: csv2copy.append(row2)
for row1 in csv1reader:
 for row2 in csv2copy:
 print row1,row2,counter
 if row1[0]==row2[0]:
 counter=counter+1
 csvwriter.writerow(row2+row1[2:4])

【回答】

这是典型的左连接，python 要循环实现，比较麻烦。可以用 SPL 实现，代码简单易懂：

	A
1	=file("D:\\csv1.csv").import@t()
2	=file("D:\\csv2.csv").import@t()
3	>A2.switch(ID,A1:ID)
4	=A2.new(ID.ID,Name,Service,Day,ID.Address,ID.Ph)
5	=file("D:\\result.txt").export@t(A4)

A3：将 A2 中的 ID 切换为指引字段