This can be done by simply parsing the file line by line and checking if the current asset
is the same as the previous one.
First, let us store the assets and tags as a list of pairs in transformed_data, so that the tags are easy to access, like so:
[ [ asset1, tag1 ], [ asset2, tag2 ], ... ]
Note that I assume each row contains only an asset and a tag, separated by whitespace.
# Some constants to improve readability
ASSET_FIELD = 0
TAG_FIELD = 1

# Open the file to parse
with open('data.csv') as csv_file:
    transformed_data = []
    # Skip the header row
    for line in csv_file.readlines()[1:]:
        # Extract the asset and its tag (split on whitespace)
        asset, tag = line.split()
        # If the asset matches the last asset in transformed_data, i.e. the previous asset read
        if transformed_data and transformed_data[-1][ASSET_FIELD] == asset:
            # Append to the previous tag
            transformed_data[-1][TAG_FIELD] += ', ' + tag
        # Otherwise, start a new entry
        else:
            transformed_data.append([asset, tag])
And this gives:
[['Laptop,', '856231'], ['Desktop,', '665786, 125548'], ['Laptop,', '657843']]
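For comparison, here is a minimal sketch of the same idea using itertools.groupby, which only groups adjacent equal keys and so keeps the later 'Laptop,' row separate. The function name merge_consecutive and the sample lines are my own; the sample is reconstructed from the output above, not taken from the actual file.

```python
from itertools import groupby


def merge_consecutive(lines):
    """Merge the tags of consecutive rows that share the same asset."""
    # Same whitespace split as above, so the asset keeps its trailing comma.
    # Blank lines are skipped; every other line is assumed to hold exactly
    # an asset and a tag.
    rows = (line.split() for line in lines if line.strip())
    merged = []
    # groupby only groups *adjacent* rows with an equal key
    for asset, group in groupby(rows, key=lambda row: row[0]):
        merged.append([asset, ', '.join(row[1] for row in group)])
    return merged


# Hypothetical sample rows, with the header already skipped
sample = [
    'Laptop, 856231\n',
    'Desktop, 665786\n',
    'Desktop, 125548\n',
    'Laptop, 657843\n',
]
print(merge_consecutive(sample))
```

Since it takes any iterable of lines, you could also pass it csv_file.readlines()[1:] directly.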
Now, if we want, we can convert it back to a list of strings:
# Join each row into a string
transformed_data = [' '.join(row) for row in transformed_data]
print(transformed_data)
And this shows:
['Laptop, 856231', 'Desktop, 665786, 125548', 'Laptop, 657843']
You can do whatever you want with it, and even write it back to the file. Remember to reattach the headers!
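Writing the result back could look roughly like this. It is only a sketch: write_merged is a name I made up, and it assumes the first line of the file is the header and that every other non-blank line holds exactly an asset and a tag.

```python
def write_merged(src_path, dst_path):
    """Copy the header from src_path, then write the merged rows to dst_path."""
    with open(src_path) as src:
        # The first line is assumed to be the header; the rest are data rows
        header, *lines = src.read().splitlines()

    merged = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        asset, tag = line.split()
        if merged and merged[-1][0] == asset:
            # Same asset as the previous row: append the tag to it
            merged[-1][1] += ', ' + tag
        else:
            merged.append([asset, tag])

    with open(dst_path, 'w') as dst:
        # Reattach the header before the merged rows
        dst.write(header + '\n')
        for row in merged:
            dst.write(' '.join(row) + '\n')
```

You would call it as write_merged('data.csv', 'merged.csv'), where merged.csv is just a placeholder name. Writing to a separate file keeps the original around in case something goes wrong.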
Edit: If you are getting \n in the strings, simply do:
# Join each row into a string and strip any newline characters
transformed_data = [' '.join(row).replace('\n', '') for row in transformed_data]
print(transformed_data)