This should do what you want. It removes every tag, meaning everything between < and > including the brackets themselves, and leaves just the text that sat between the tags.
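/*-- Assign the fileref before the DATA step --*/
filename INDEXIN URL "http://www.zug.com/";
data HTMLData;
infile INDEXIN; /* read from the URL fileref; INPUT needs this */
input; /* load the next line into _INFILE_ */
textline = _INFILE_;
/*-- Clear out the HTML text --*/
re1 = prxparse("s/<(.|\n)*?>//");
call prxchange(re1, -1, textline); /* -1 = replace every match in the line */
run;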
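If you want to check the regex without hitting the network, here is a minimal sketch that runs the same substitution on a hard-coded string. The sample HTML and the length of 200 are just assumptions for illustration, not part of the original problem.

```sas
data _null_;
  length textline $200;                      /* arbitrary length for the demo */
  textline = '<p>Hello <b>world</b></p>';    /* made-up sample input */
  re1 = prxparse("s/<(.|\n)*?>//");          /* same pattern as above */
  call prxchange(re1, -1, textline);         /* strip every tag in place */
  put textline=;                             /* log shows: textline=Hello world */
run;
```

One thing to keep in mind: the DATA step reads the page one line at a time, so a tag that is split across two lines of the source will not be matched by this pattern.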