I was using PyQuery to extract the entire inner HTML of the element with the class item_detail, like this:
from pyquery import PyQuery
d = PyQuery(htmlstr)
detail = d('.item_detail').html()
That worked, but when I displayed the extracted result in Chrome it looked quite strange. The cause was an iframe inside it. In the original markup it was <iframe src=XXX></iframe>, but after extracting it with PyQuery it came out as <iframe src=XXX />, which is XML-style serialization. HTML does not allow an iframe to be self-closed, so Chrome treats everything after it as the iframe's content, and that is apparently why the rest of the page rendered oddly.
Looking closer, every <br> had also turned into <br />, so the whole thing had been written out as XML!! I only wanted to extract one part, so who said you could rewrite it without permission!!!! Do you think you're Fujiyoshiro!!!!
All I wanted was plain HTML, so I checked the official documentation, and it turns out you do it like this:
d = PyQuery(htmlstr)
detail = d('.item_detail').html(method='html')
That's all. I got a little stuck on this, so I wrote it down as a memo.