How to extract data from multiple iframes using jsoup and write it to one XML file
I'm using jsoup to extract data from an HTML page. I'm able to extract the data if the page has one iframe. But if the page has links that open another iframe, how do I extract the data from the second iframe and write all of the data to one XML file? Please help me with this.
One approach is to parse the parent page for iframe tags and extract each "src" attribute. The "src" values can then be used to download each iframe's content, parse it, and, if necessary, combine the results.
    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class IframeFetcher {
        public static void main(String[] args) throws IOException {
            String url = "http://example.com/";
            Document document = Jsoup.connect(url).get();
            Elements es = document.select("iframe");

            // Extract the iframe sources.
            int iframeCount = es.size();
            String[] iframeSrc = new String[iframeCount];
            int i = 0;
            for (Element e : es) {
                // absUrl resolves a relative "src" against the parent page's URL.
                iframeSrc[i] = e.absUrl("src");
                i++;
            }

            // Get the iframe content.
            Document[] iframeDoc = new Document[iframeCount];
            int j = 0;
            for (String s : iframeSrc) {
                // Pay attention that the correct URL is built at this point!
                iframeDoc[j] = Jsoup.connect(s).get();
                j++;
            }

            /* Now you have the parent site and its iframe "children" as documents.
               I have no experience in combining documents. If nothing else works,
               you may try document.toString(). */
        }
    }
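On the point about building the correct URL: an iframe's "src" is often relative to the page that contains it, so simply gluing the two strings together can produce an address that does not exist. Here is a minimal sketch of the resolution step using only the JDK's `java.net.URI`. The class name `IframeUrlResolver` and the example paths are made up for illustration; jsoup's `absUrl("src")` does a similar job when the document was fetched with a base URL.

```java
import java.net.URI;

public class IframeUrlResolver {

    // Resolves an iframe "src" value against the URL of the page that contains it.
    // Throws IllegalArgumentException if either string is not a valid URI
    // (for example, if it contains unescaped spaces).
    public static String resolve(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src).toString();
    }

    public static void main(String[] args) {
        String page = "http://example.com/news/index.html";
        // Relative to the page's directory.
        System.out.println(resolve(page, "frames/a.html"));
        // Relative to the site root.
        System.out.println(resolve(page, "/frames/b.html"));
        // Already absolute, so it is returned unchanged.
        System.out.println(resolve(page, "http://other.example/c.html"));
    }
}
```

The resolved string is what would be passed to `Jsoup.connect(...)` for each iframe.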
To write the documents to a file, you can use this code:
    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.io.IOException;
    import org.jsoup.nodes.Document;

    public class Write2File {
        public static void saveFile(Document xmlContent, String saveLocation) throws IOException {
            // try-with-resources closes the writer even if write() throws.
            try (BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter(saveLocation))) {
                bufferedWriter.write(xmlContent.toString());
            }
            System.out.println("File writing completed.");
        }
    }
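The code above saves a single document. For getting everything into one XML file, one possible way (not the only one, and not something taken from jsoup itself) is to put the text extracted from each iframe under a common root element and serialize that with the JDK's own XML classes. In the sketch below, the class name `CombinedXmlWriter`, the element names `iframes` and `iframe`, and the file name `combined.xml` are all assumptions chosen for illustration. The map values would be whatever you extracted from each jsoup document, for example the result of calling `text()` on it. Note that `Document` here is `org.w3c.dom.Document`, not the jsoup class.

```java
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class CombinedXmlWriter {

    // Writes one <iframe src="..."> element per map entry under a single <iframes> root.
    public static void write(Map<String, String> contentBySrc, String saveLocation) throws Exception {
        Document xml = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = xml.createElement("iframes");
        xml.appendChild(root);

        for (Map.Entry<String, String> entry : contentBySrc.entrySet()) {
            Element frame = xml.createElement("iframe");
            frame.setAttribute("src", entry.getKey());
            // Text content is escaped on output, so a stray '<' or '&' cannot break the file.
            frame.setTextContent(entry.getValue());
            root.appendChild(frame);
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        try (OutputStream out = Files.newOutputStream(Paths.get(saveLocation))) {
            transformer.transform(new DOMSource(xml), new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        // LinkedHashMap keeps the iframes in the order they were added.
        Map<String, String> content = new LinkedHashMap<>();
        content.put("http://example.com/frames/a.html", "Text from the first iframe");
        content.put("http://example.com/frames/b.html", "Text with <markup> & symbols");
        write(content, "combined.xml");
    }
}
```

Storing the extracted text rather than raw HTML keeps the output well-formed even when the iframe pages themselves are not valid XML.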