[Solved] Removing Code by using rereplace


Scott is right, and Leigh was right before when you asked a similar question: jSoup is your best option.

As for a regex solution: it is possible, but there are problems that regex cannot always solve. For instance, if either of the first two tables contains nested tables, this regex can trip. (Note that text is not required between the tables; I’m just demonstrating that things can be between them.)

(If there is always a nested table, regex can handle it. But if there is only sometimes a nested table, in other words the structure is unknown, it gets a lot messier.)

<cfsavecontent variable="sampledata">
<body>
<table cellpadding="4"></table>stuff
is <table border="5" cellspacing="7"></table>between
<table border="3"></table>the
<table border="2"></table>tables
<table></table>
</body>
</cfsavecontent>

<cfset sampledata = rereplace(sampledata,"(?s)(.*?<table.*?>.*?<\/table>.*?)(<table.*?>.*?<\/table>)(.*)","\1\3","ALL") />
<cfoutput><pre>#htmleditformat(sampledata)#</pre></cfoutput>

Here is what the pattern does:

(?s) sets . to match newlines as well.
(.*?<table.*?>.*?<\/table>.*?) Matches everything before the first table, the first table, and everything between it and the second table and sets it as capture group 1.
(<table.*?>.*?<\/table>) Matches the second table and creates capture group 2.
(.*) matches everything after the second table and creates capture group 3.

The third parameter, \1\3, then keeps the first and third capture groups, which drops the second table.
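To make the nesting caveat concrete, here is a small sketch that runs the same pattern against input where the second table contains a nested table. The variable names nested and result, and the id attributes, are made up for illustration.

```cfml
<cfsavecontent variable="nested">
<table id="first"></table>
<table id="second"><tr><td><table id="inner"></table></td></tr></table>
<table id="third"></table>
</cfsavecontent>

<!--- Same pattern as above. The lazy .*?<\/table> in group 2 stops at the
      FIRST closing tag it meets, which belongs to the inner table. --->
<cfset result = rereplace(nested,"(?s)(.*?<table.*?>.*?<\/table>.*?)(<table.*?>.*?<\/table>)(.*)","\1\3","ONE") />
<cfoutput><pre>#htmleditformat(result)#</pre></cfoutput>
```

Because group 2 ends at the inner table's closing tag, the outer table's trailing </td></tr></table> is left stranded in the output. That is the kind of breakage a real HTML parser avoids.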

If you have control of the source document, you can create html comments like

<!-- table1 -->
  <table>...</table>
<!-- /table1 -->

And then use that in the regex and end up with a more regex-friendly document.
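A minimal sketch of that idea, assuming the second table is wrapped in hypothetical table2 markers of the same form (the variable names markedUp and cleaned are also made up for illustration):

```cfml
<cfsavecontent variable="markedUp">
<body>
<!-- table1 -->
<table border="1"><tr><td>keep me</td></tr></table>
<!-- /table1 -->
<!-- table2 -->
<table border="2"><tr><td><table><tr><td>nested</td></tr></table></td></tr></table>
<!-- /table2 -->
</body>
</cfsavecontent>

<!--- Remove everything from the opening table2 marker through its closing
      marker. Nested tables no longer matter, because the pattern anchors on
      the comments rather than on the table tags. --->
<cfset cleaned = rereplace(markedUp,"(?s)<!-- table2 -->.*?<!-- /table2 -->","","ONE") />
<cfoutput><pre>#htmleditformat(cleaned)#</pre></cfoutput>
```

This only works if the markers are unique and always present, which is why it depends on controlling the source document.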

Still, Scott said it best about not using the proper tool for the task:

That is like telling a carpenter, build me a house, but don’t use a hammer.

These tools are created because programmers frequently run into precisely the problem you’re having, and so they create a tool, and often freely share it, because it does the job much better.
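For comparison, here is a rough sketch of the same removal done with jSoup. It assumes the jsoup jar is on the ColdFusion classpath and that sampledata still holds the original, unmodified HTML. The "body > table" selector is my assumption about where the tables sit in your document.

```cfml
<cfscript>
// Assumption: the jsoup jar is available on the ColdFusion classpath
jsoup = createObject("java", "org.jsoup.Jsoup");
doc = jsoup.parse(sampledata);

// Select only tables that are direct children of <body>,
// so nested tables are not counted
tables = doc.select("body > table");

if (tables.size() gte 2) {
    // Index 1 is the second table; remove() detaches it from the document
    tables.get(javaCast("int", 1)).remove();
}

writeOutput("<pre>" & htmlEditFormat(doc.body().html()) & "</pre>");
</cfscript>
```

Because the parser works on the element tree, a nested table inside the second table is removed along with it and nothing is left dangling.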
