[Solved] Can head, sed and regex work together in one bash script?

Accepted Answer

Like this (taking one for the team :)? Using awk (note: it creates files named like Abc 1:2, or whatever is between <b> and <sup>):

$ awk '
BEGIN {
    FS="<sup>"                 # split at this delimiter
}
{
    if($1==p) {                # same key (text before <sup>) as the previous line
        b=b "     " $0         # append to the output buffer
    }
    else {                     # key changed: flush the previous group
        if(NR>1) {             # nothing to flush on the first line
            print b >> (t[n])  # t[n] still holds the previous group's file name
            # close(t[n])      # uncomment if needed
        }
        n=split($1,t,/<b>/)    # t[n] is the text after <b>, used as the file name
        b=$0                   # reset buffer
    }
    p=$1                       # remember the key for the next line
}
END {
    print b >> (t[n])          # flush the rest of the buffer
}' file

Output of cat Abc\ 1\:2:

<p><nsup></nsup> <b>Abc 1:2<sup>varied text     <p><nsup></nsup> <b>Abc 1:2<sup>varied text

Depending on the awk flavor used, if you start running out of file descriptors, add a close(t[n]) after each print >>.
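
For reference, here is a minimal sketch of the same idea with close() enabled. It is not a drop-in replacement for the script above. The three input lines, the temporary directory and the text after <sup> are made up for illustration; only the line layout is taken from the output shown earlier. It assumes mktemp -d is available.

```shell
#!/bin/sh
# Work in a scratch directory so the generated files do not clutter anything.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical sample input: two lines for "Abc 1:2", one for "Abc 1:3".
cat > file <<'EOF'
<p><nsup></nsup> <b>Abc 1:2<sup>first
<p><nsup></nsup> <b>Abc 1:2<sup>second
<p><nsup></nsup> <b>Abc 1:3<sup>third
EOF

awk '
BEGIN { FS = "<sup>" }               # $1 is everything before <sup>
$1 != p {                            # key changed: a new group starts here
    if (NR > 1) {                    # nothing buffered before the first line
        print b >> (t[n])            # t[n] is still the previous file name
        close(t[n])                  # release the descriptor right away
    }
    n = split($1, t, /<b>/)          # t[n] is the text after <b>
    b = $0                           # start a new buffer
    p = $1                           # remember the key
    next
}
{ b = b "     " $0 }                 # same key: append to the buffer
END {
    if (NR > 0) {                    # skip the flush on empty input
        print b >> (t[n])
        close(t[n])
    }
}' file

ls
```

With this input, ls should list Abc 1:2, Abc 1:3 and file. Since every group is flushed exactly once, closing the file right after the print costs nothing and keeps the number of open descriptors at one.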