[Solved] Tokenizing and parsing with Python


To retrieve the data from file.txt:

data = {}
with open('file.txt', 'r') as f:  # open the file for reading
    for line in f:  # read line by line
        if ' : ' not in line:  # skip blank or malformed lines
            continue
        key, value = line.split(' : ', 1)  # split on the first ' : ' only
        data[key.lower()] = value.rstrip()  # lower-case the key, strip the trailing '\n'

Then, data['name'] returns 'Sid'.

EDIT:
As the question has been updated, this is the new solution:

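data = {}
with open('file.txt', 'r') as f:
    # The first block is the header; each block after a blank line is a description
    header, *descriptions = f.read().split('\n\n')
    for line in header.split('\n'):
        key, value = line.split(' : ', 1)  # split on the first ' : ' only
        data[key.lower()] = value.rstrip()
    for description in descriptions:
        key, value = description.split('\n', 1)  # first line is the key, the rest is the value
        data[key[1:]] = value  # drop the first character of the key line
print(data)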

You might have to adapt this if there is stray whitespace on the blank lines between blocks or at the end of the keys.
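
For instance, here is a rough sketch of a more whitespace-tolerant version. The function name parse, the sample text and the '#' marker in front of the description keys are assumptions made for illustration, since the real file.txt is not shown here; adjust them to your actual format.

```python
import re

def parse(text):
    """Parse a header of 'Key : value' lines followed by description blocks."""
    data = {}
    # Split on blank lines, even when they contain stray spaces or tabs.
    blocks = [b.strip() for b in re.split(r'\n\s*\n', text) if b.strip()]
    if not blocks:
        return data
    header, *descriptions = blocks
    for line in header.splitlines():
        key, sep, value = line.partition(':')  # split on the first colon only
        if sep:  # ignore lines that have no colon at all
            data[key.strip().lower()] = value.strip()
    for description in descriptions:
        key, _, value = description.partition('\n')
        # Assumption: the first character of the key line is a marker,
        # which is what key[1:] drops in the solution above.
        data[key.strip()[1:].strip()] = value.strip()
    return data

# Hypothetical sample with trailing spaces and an uneven number of blank lines.
sample = "Name : Sid  \nAge : 30\n \n#Summary  \nLikes Python.\nLikes parsing.\n\n\n#Notes\nNone yet.\n"
print(parse(sample))
```

Using partition instead of split means a line without a separator is skipped rather than raising a ValueError.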

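A shorter way to do this might be to use a regex and the group() method of the match objects it returns (there is no re.group() function; group() belongs to the match object).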
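
As a minimal sketch of that regex idea, covering only the 'Key : value' lines: the sample text below is an assumption, and the pattern is just one possible way to write it, not necessarily the shortest.

```python
import re

# Assumed sample; in practice this would be the content read from file.txt.
text = "Name : Sid\nAge : 30\n"

# One 'Key : value' pair per line. MULTILINE makes ^ and $ match at line
# boundaries; the [ \t]* parts absorb spaces around the colon and at line end.
pattern = re.compile(
    r'^(?P<key>[^:\n]+?)[ \t]*:[ \t]*(?P<value>.*?)[ \t]*$',
    re.MULTILINE,
)

# finditer yields match objects; group() pulls out the named parts.
data = {m.group('key').lower(): m.group('value') for m in pattern.finditer(text)}
print(data)
```

With this sample, data['name'] gives 'Sid', as in the first solution.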
