C# - Using HtmlAgilityPack to extract text that is not between tags and comes after a specific node


HTML code:

 <b> car </b>     <br></br>   car can drive.     <br></br>     <br></br> 

C# code:

    HtmlAgilityPack.HtmlDocument doc = new HtmlWeb().Load("http://website.com/x.html");
    if (doc != null)
    {
        // Select the <b> element whose text contains 'car'
        HtmlNode link = doc.DocumentNode.SelectSingleNode("//b[contains(text(), 'car')]");

        webBrowser1.DocumentText = link.InnerText;
        webBrowser1.AllowNavigation = true;
        webBrowser1.ScriptErrorsSuppressed = true;
        webBrowser1.Visible = true;
    }

What I manage to get: car

What I need to get:
car
car can drive.

Any suggestions? I have tried adding the next nodes to the XPath, but both attempts gave NullReferenceExceptions: "//b[contains(text(), 'car')/br]" and "//b[contains(text(), 'car')/br/br]"

Thanks in advance. PS: I would like to avoid regex.

XPath is case-sensitive (for more on this, see: "Is it possible to ignore case using XPath and C#?"). In addition, the second phrase that contains 'car' is not a child of the b element; it is a separate text node that follows it. You will have to work like this:

    HtmlDocument doc = new HtmlWeb().Load("http://website.com/x.html");
    // translate() lower-cases each text node so that the match on 'car' ignores case
    foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//text()[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'car')]"))
    {
        Console.WriteLine(node.InnerText);
    }

In a console application, this outputs:

     car
       car can drive.
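
If only the text that comes directly after the matched b element is wanted (rather than every text node containing 'car'), the `following-sibling` axis is another option. The following is a minimal sketch, not part of the original answer: the `TextAfterNode` class and its `Extract` helper are hypothetical names, and the HTML is the snippet from the question loaded from a string via `LoadHtml` instead of from the website. The null-conditional operators avoid the NullReferenceException that occurs when `SelectSingleNode` finds nothing.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class TextAfterNode
{
    // Hypothetical helper: returns the first non-blank text node that follows
    // the <b> element containing `keyword`, or null when nothing matches.
    public static string Extract(string html, string keyword)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // The keyword is spliced into the XPath unescaped, which is
        // acceptable for a sketch but not for untrusted input.
        HtmlNode bold = doc.DocumentNode.SelectSingleNode(
            "//b[contains(text(), '" + keyword + "')]");

        // [normalize-space()] skips whitespace-only text nodes between the tags;
        // [1] keeps the nearest remaining text node after the <b> element.
        HtmlNode text = bold?.SelectSingleNode(
            "following-sibling::text()[normalize-space()][1]");

        return text?.InnerText.Trim();
    }

    public static void Main()
    {
        const string html =
            "<b> car </b>     <br></br>   car can drive.     <br></br>     <br></br>";

        Console.WriteLine(Extract(html, "car"));
    }
}
```

Because the lookup starts from the b element itself, this does not depend on the following sentence also containing the keyword, and no regex is needed.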
