[Solved] Finding all possible common substrings from a file consisting of strings using c++


Basically, for each line, compare it with the next line: if the next line is not longer than the current line, or the next line does not begin with the current line, then the current line is unique. This can be done with a single linear pass because the list is sorted: any entry that begins with the current entry will sort directly after it.

A non-algorithmic optimization (micro-optimization) is to avoid substr, which creates a new string. We can simply compare against the other string as though it were truncated, without actually creating a truncated string; string::compare with a position and length does this in place.

vector<string> unique_lines;
for (size_t j = 0; j + 1 < lines.size(); ++j)
{
    const string& line = lines[j];
    const string& next_line = lines[j + 1];

    // If the line is not a prefix of the next line,
    // add it to the list of unique lines.
    // compare() checks the prefix in place, without the
    // temporary string that substr() would create.
    if (line.size() >= next_line.size() ||
        next_line.compare(0, line.size(), line) != 0)
        unique_lines.push_back(line);
}

// The last line has no following line that could extend it,
// as the lines are sorted, so it is always kept.
if (!lines.empty())
    unique_lines.push_back(lines.back());

// The desired output will be contained in 'unique_lines'.
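
The snippet above assumes 'lines' is already sorted. As a rough, self-contained sketch of the whole step, the function below sorts a copy of the input and then applies the same adjacent-line check. The name unique_lines_of is hypothetical (it does not come from the original question), and the sketch assumes plain std::string comparison is the intended ordering.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of the original answer.
// Takes the lines by value so the caller's vector is left untouched,
// sorts the copy, and drops each line that is a prefix of the line
// sorted directly after it.
std::vector<std::string> unique_lines_of(std::vector<std::string> lines)
{
    std::vector<std::string> unique_lines;
    if (lines.empty())
        return unique_lines;

    // After sorting, any line that extends another line is grouped
    // right behind it, so comparing neighbours is enough.
    std::sort(lines.begin(), lines.end());

    for (std::size_t j = 0; j + 1 < lines.size(); ++j)
    {
        const std::string& line = lines[j];
        const std::string& next_line = lines[j + 1];

        // compare(0, n, s) checks the first n characters of next_line
        // against s without allocating a temporary string.
        if (line.size() >= next_line.size() ||
            next_line.compare(0, line.size(), line) != 0)
            unique_lines.push_back(line);
    }

    // The last line has nothing after it, so it is always kept.
    unique_lines.push_back(lines.back());
    return unique_lines;
}
```

For example, given the lines "abc", "ab", "abd" and "x", the sorted order is "ab", "abc", "abd", "x"; "ab" is dropped because "abc" begins with it, and the result is "abc", "abd", "x".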
