[Solved] How to write correct regex format for finding and replacing lines in a file

Question

Accepted Answer

In the general case, one would parse and write BDF files with a library such as pyNastran.

In this specific case, however, your approach is reasonable: the principle works, but your regular expressions are wrong. Note also that you need to use raw strings or escape each \ in the paths; unrecognized backslash escapes in ordinary string literals are deprecated and can lead to hard-to-find errors.
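As a small illustration of the backslash point, the sketch below compares the two safe spellings of a Windows path. It only compares strings and opens no file; the variable names are made up for the example. In Python 3 the unescaped spelling "C:\Users\..." does not even compile, because \U starts a Unicode escape.

```python
# A raw string keeps every backslash exactly as typed.
raw_path = r"C:\Users\sony\Desktop\PBUSH1.BDF"

# The same path in an ordinary string, with each backslash escaped.
escaped_path = "C:\\Users\\sony\\Desktop\\PBUSH1.BDF"

# Both spellings produce the identical string.
same = raw_path == escaped_path

# In an ordinary string "\t" is a single tab character, so a folder name
# starting with "t" would be silently corrupted; the raw string keeps
# the backslash and the letter as two separate characters.
tab_len = len("\t")
raw_len = len(r"\t")

print(same, tab_len, raw_len)
```

The same reasoning applies to regular expression patterns such as \s, which is why patterns are usually written as raw strings too.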

import re

# must use raw strings for paths, otherwise we need to
# escape \ characters
input1 = r"C:\Users\sony\Desktop\PBUSH1.BDF"
input2 = r"C:\Users\sony\Desktop\PBUSH2.BDF"

output = r"C:\Users\sony\Desktop\OUTPUT.BDF"

with open(input1) as f1, open(input2) as f2:
    dat1 = f1.read()
    dat2 = f2.read()

# use finditer instead of findall so that we will get 
# a match object for each match.
#
# For each matching line we also have one subgroup, containing the
# "PBUSH   NNN     " part, whereas the whole regex matches until
# the next end of line
matches = re.finditer(r'^(PBUSH\s+[0-9]+\s+).*$', dat1, flags=re.MULTILINE)

for match in matches:
    # for each match we construct a regex that looks like
    # "^PBUSH   123      .*$", then replace all matches thereof
    # with the contents of the whole line
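    # re.escape keeps the prefix literal; the function replacement
    # inserts the line as-is, without interpreting backslashes
    line_re = '^{}.*$'.format(re.escape(match.group(1)))
    dat2 = re.sub(line_re, lambda m: match.group(0), dat2, flags=re.MULTILINE)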

with open(output, 'w') as outf:
    outf.write(dat2)
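
To try the idea without touching any files, here is a minimal sketch that runs the same find-and-replace on two in-memory strings. The sample PBUSH lines and the helper name replace_pbush_lines are invented for illustration and are not taken from the question's files; real BDF cards may need stricter handling, which is where a parser such as pyNastran comes in.

```python
import re

# Hypothetical sample data standing in for the contents of the two BDF files.
dat1 = (
    "PBUSH   101     K       1.0E+5  1.0E+5  1.0E+5\n"
    "PBUSH   102     K       2.0E+5  2.0E+5  2.0E+5\n"
)
dat2 = (
    "GRID    1               0.0     0.0     0.0\n"
    "PBUSH   101     K       9.9E+9  9.9E+9  9.9E+9\n"
    "PBUSH   103     K       3.0E+5  3.0E+5  3.0E+5\n"
)


def replace_pbush_lines(source, target):
    """Overwrite each PBUSH line in `target` whose "PBUSH   NNN     "
    prefix also occurs in `source` with the full line from `source`."""
    pattern = re.compile(r'^(PBUSH\s+[0-9]+\s+).*$', flags=re.MULTILINE)
    for match in pattern.finditer(source):
        # Escape the prefix so it is matched literally.
        line_re = '^{}.*$'.format(re.escape(match.group(1)))
        # A function replacement inserts the new line verbatim.
        new_line = match.group(0)
        target = re.sub(line_re, lambda m: new_line, target,
                        flags=re.MULTILINE)
    return target


result = replace_pbush_lines(dat1, dat2)
print(result)
```

Only the PBUSH 101 line is replaced here: PBUSH 102 has no counterpart in the second string, so it is not added, and PBUSH 103 is left untouched. Note that the prefix must match character for character, including the exact amount of whitespace.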