html parsing - Python etree parser isn't appending an element


If you look at the log, you'll see that the row I'm getting from Postgres has been turned from a string into an element (I print the string, print the element, and print the iselement boolean!), yet when I try to append it, the error says it's not an element. Huff, puff.

import sys
from HTMLParser import HTMLParser
from xml.etree import cElementTree as etree
import xml.etree.ElementTree as et
from xml.etree.ElementTree import Element, SubElement, tostring
import psycopg2
import psycopg2.extras

def main():
    # Connect to an existing database
    conn = psycopg2.connect(dbname="**", user="**", password="**", host="/tmp/", port="**")

    # Open a cursor to perform database operations
    cur = conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor)

    cur.execute("select * from landingpagedata;")
    rows = cur.fetchall()

    class LinksParser(HTMLParser):
        def __init__(self):
            HTMLParser.__init__(self)
            self.tb = etree.TreeBuilder()

        def handle_starttag(self, tag, attributes):
            self.tb.start(tag, dict(attributes))

        def handle_endtag(self, tag):
            self.tb.end(tag)

        def handle_data(self, data):
            self.tb.data(data)

        def close(self):
            HTMLParser.close(self)
            return self.tb.close()

    template = 'template.html'

    # parser.feed(open('landingindex.html').read()) #for testing
    # root = parser.close()

    for row in rows:
        parser = LinksParser()

        parser.feed(open(template).read())
        root = parser.close()

        #title
        title = root.find(".//title")
        title.text = row['title']

        #headline
        h1_id_headline = root.find(".//h1")
        h1_id_headline.text = row['h1_id_headline']
        # print row['h1_id_headline']

        #intro
        p_class_intro = root.find(".//p[@class='intro']")
        p_class_intro.text = row['p_class_intro']
        # print row['p_class_intro']

Here is where the problems occur!

        #recommended
        p_class_recommendedbackground = root.find(".//div[@class='recommended_background_div']")
        print p_class_recommendedbackground
        p_class_recommendedbackground.clear()
        newelement = et.fromstring(row['p_class_recommendedbackground'])
        print row['p_class_recommendedbackground']
        print et.iselement(newelement)
        p_class_recommendedbackground.append(newelement)

        html = tostring(root)
        f = open(row['page_name'], 'w').close()
        f = open(row['page_name'], 'w')
        f.write(html)
        f.close()
        # f = ''
        # html = ''
        parser.reset()
        root = ''

    # Close communication with the database
    cur.close()
    conn.close()

if __name__ == "__main__":
    main()

My log is this:

{background: url(/images/courses/azrealestate.png) center no-repeat;}
<Element 'div' at 0x10a999720>
<p class="recommended_background">materials are aimed aspiring real estate sales associates wish obtain arizona real estate salesperson license, provided <a href="http://www.re.state.az.us/" style="text-decoration: underline;">arizona department of real estate</a>.</p>
True
Traceback (most recent call last):
  File "/users/morgan13/programming/landingpagebuilder/landingpages/landingbuildertest.py", line 108, in <module>
    main()
  File "/users/morgan13/programming/landingpagebuilder/landingpages/landingbuildertest.py", line 84, in main
    p_class_recommendedbackground.append(newelement)
TypeError: must be Element, not Element
[Finished in 0.1s with exit code 1]

I can reproduce your error message this way:

from xml.etree import cElementTree as etree
import xml.etree.ElementTree as et

croot = etree.Element('root')
child = et.Element('child')
croot.append(child)
# TypeError: must be Element, not Element

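The root cause of the problem is that you are mixing the cElementTree implementation of ElementTree with the xml.etree.ElementTree implementation of ElementTree. Never the twain should meet: an element created by one implementation is not accepted by the append method of the other, even though both classes are named Element.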

So the fix is to pick one, say etree, and replace all occurrences of the other (e.g. replace et with etree).
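A minimal sketch of that fix, assuming a single module is imported under the name etree and used for every step (parsing, building and serializing). The template string and the row dictionary here are hypothetical stand-ins for template.html and the database row; they are not taken from the original program.

```python
import xml.etree.ElementTree as etree  # one implementation for everything

# Hypothetical stand-ins for the template file and one database row.
TEMPLATE = (
    "<html><body>"
    "<div class='recommended_background_div'><p>old</p></div>"
    "</body></html>"
)
row = {
    "p_class_recommendedbackground":
        "<p class='recommended_background'>new text</p>",
}

root = etree.fromstring(TEMPLATE)
div = root.find(".//div[@class='recommended_background_div']")

# clear() removes children, text and attributes, so keep the attributes.
attributes = dict(div.attrib)
div.clear()
div.attrib.update(attributes)

# The new element comes from the same module as the tree, so append() works.
new_element = etree.fromstring(row["p_class_recommendedbackground"])
div.append(new_element)

html = etree.tostring(root)
```

Because both the tree and the new element are created through the same etree name, append never sees a foreign Element class. The same idea applies to the tostring, Element and SubElement imports: take them from whichever module you settled on.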

