[Solved] How can I extract data from an HTML file using R? [closed]

Accepted Answer

Questions like this usually need to show some effort, so next time please state the exact problem and include what you have already tried. To get you started, here is an example that uses the XML package, applying XPath together with strsplit to grab the desired result.

library(XML)
doc <- htmlParse("http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM410750")
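# Take the first text node that follows a <br> inside the justified cell,
# then keep the part after ": ". The attribute value uses single quotes so
# it does not terminate the surrounding R string.
x <- xpathSApply(doc, "//td[@style='text-align: justify']/text()[preceding-sibling::br][1]",
    function(X) strsplit(xmlValue(X), ': ')[[1]][2])
x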
# [1] "Uninfected"
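
If you want to try the same idea without hitting the live page, here is a minimal sketch that parses an inline HTML string instead. The fragment below is made up for illustration (it is an assumption, not the real GEO markup), and the field names in it are hypothetical. It grabs every text node in the cell and splits each "key: value" pair, so you can look up a value by name rather than by position.

```r
library(XML)

# Hypothetical HTML fragment standing in for the real page (assumption).
html <- '<html><body><table><tr>
<td style="text-align: justify">Gender: Male<br>Infection: Uninfected<br>Tissue: Lung<br></td>
</tr></table></body></html>'

# asText = TRUE tells htmlParse to treat the string as content, not a file or URL.
doc <- htmlParse(html, asText = TRUE)

# Pull the value of every text node in the justified cell.
nodes <- xpathSApply(doc, "//td[@style='text-align: justify']/text()", xmlValue)
nodes <- trimws(nodes)
nodes <- nodes[nzchar(nodes)]   # drop any empty strings

# Split each "key: value" string and build a named character vector.
parts  <- strsplit(nodes, ": ", fixed = TRUE)
values <- setNames(sapply(parts, `[`, 2), sapply(parts, `[`, 1))

values[["Infection"]]
```

Looking values up by name is a bit more robust than relying on the position of the br, since the order of the lines on the page may change.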