This could be done easily using a regular expression but I am guessing you do not wish to use that.
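For comparison, here is a minimal sketch of what the regular expression check might look like. The pattern and the helper name leading_number are illustrative assumptions, not part of the original problem:

```python
import re

# Assumed pattern: one or more digits at the start of the line, then a period.
NUMBERED = re.compile(r'(\d+)\.')

def leading_number(line):
    """Return the leading number as an int, or None if the line is not numbered."""
    match = NUMBERED.match(line)
    return int(match.group(1)) if match else None

print(leading_number('12. First entry'))   # 12
print(leading_number('Some other text'))   # None
```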
Instead, the problem can be solved by reading the file one line at a time and checking whether each line starts with a number followed by a period. If it does, start building up a list of lines until you find the next numbered line.
Python's int() function attempts to convert a string into a number, and find('.') is used to locate the end of the number. If the text before the period is not a number, int() raises a ValueError exception. In this case, add the line to the current list of lines.
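The number check can be tried on its own. This is a small sketch with a hypothetical helper name, parse_number, wrapping the same int() and find('.') calls:

```python
def parse_number(line):
    """Return the integer before the first '.', or None if there is none."""
    try:
        return int(line[:line.find('.')])
    except ValueError:
        return None

print(parse_number('7. Title'))      # 7
print(parse_number('Plain text.'))   # None
print(parse_number('No period'))     # None
```

One caveat: find() returns -1 when there is no period, so the slice then drops only the last character. A line made up purely of digits, such as '123', would therefore be read as the number 12. If the input can contain such lines, an explicit check for the period would be safer.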
If there was a number, first write any existing entry to the csv file, and then start a new entry.
At the end, there will not be a final number line to trigger the next write, so add another call to write the final row to the csv.
For example:
import csv

# Python 3: open the output in text mode with newline='' (Python 2 needed 'wb')
with open('text.txt') as f_input, open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    entry = []

    for line in f_input:
        line = line.strip()    # Remove the trailing newline and surrounding whitespace
        if line:               # Skip empty lines
            try:
                number = int(line[:line.find('.')])
                if entry:      # Write out the previous entry, if any
                    csv_output.writerow(entry)
                entry = [number]
            except ValueError:
                entry.append(line)

    if entry:                  # Write the final entry
        csv_output.writerow(entry)
Python's csv library takes a list and automatically adds the necessary commas between entries when writing to the csv output file. If an entry contains a comma, it automatically adds quotes around it.
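The quoting behaviour can be seen without touching a file. This sketch writes one row into an in-memory buffer; the row contents are made up for illustration:

```python
import csv
import io

buffer = io.StringIO()          # In-memory stand-in for the output file
writer = csv.writer(buffer)

# The last field contains a comma, so the writer wraps it in quotes
writer.writerow([1, 'Plain text', 'Text, with a comma'])

print(buffer.getvalue())        # 1,Plain text,"Text, with a comma"
```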