[Tutor] String Problem
Crusier
2015-07-06 07:44:32 UTC
Dear All,

I have used urllib.request to download some information from a site.

I am currently using Python 3.4. My program is as follows:

import urllib.request

response = urllib.request.urlopen(
    'http://www.hkex.com.hk/eng/ddp/Contract_Details.asp?PId=175')

saveFile = open('HKEX.txt','w')
saveFile.write(str(response.read()))
saveFile.close()



And the result is as follows:

d align="right"> - </td><td align="right">0</td><td
align="right">8.56</td><td align="right">N/A</td><td
align="right">1</td></tr>\r\n\t\t\t\t\t\t\t\t<tr id="tr56"
class="tableHdrB1" align="center"><td align="centre">C Jul-15 -
23.00</td><td align="right"> - </td><td align="right"> - </td><td
align="right">0.01</td><td align="right"> - </td><td align="right"> -
</td><td align="right"> - </td><td align="right">0</td><td
align="right">0.01</td><td align="right">N/A</td><td
align="right">467</td></tr>\r\n\t\t\t\t\t\t\t\t<tr id="tr57"
class="tableHdrB2" align="center"><td align="centre">P Jul-15 -
23.00</td><td align="right"> - </td><td align="right"> - </td><td
align="right"> - </td><td align="right"> - </td><td align="right"> -
</td><td align="right"> - </td><td align="right">0</td><td
align="right">9.56</td><td align="right">N/A</td><td
align="right">0</td></tr>\r\n\t\t\t\t\t\t\t\t<tr id="tr58"
class="tableHdrB1" align="center"><td align="centre">C Jul-15 -
24.00</td><td align="right"> - </td><td align="right"> - </td><td
align="right">0.01</td><td align="right"> - </td><td align="right"> -
</td><td align="right"> - </td><td align="right">0</td><td
align="right">0.01</td><td align="right">N/A</td><td
align="right">156</td></tr>\r\n\t\t\t\t\t\t\t\t<tr id="tr59"
class="tableHdrB2" align="center"><td align="centre">P Jul-15 -
24.00</td><td align="right"> - </td><td align="right"> - </td><td
align="right"> - </td><td align="right"> - </td><td align="right"> -
</td><td align="right"> - </td><td align="right">0</td><td
align="right">10.56</td><td align="right">N/A</td><td
align="right">0</td></tr>\r\n\t\t\t\t\t\t\t\t<tr id="tr60"
class="tableHdrB1" align="center"><td align="centre">C Jul-15 -
25.00</td><td align="right"> - </td><td align="right"> - </td><td
align="right">0.01</td><td align="right"> - </td><td align="right"> -
</td><td align="right"> - </td><td align="right">0</td><td
align="right">0.01</td><td align="right">N/A</td><td
align="right">6</td></tr>\r\n\t\t\t\t\t\t\t\t<tr id="tr61"
class="tableHdrB2" align="center"><td align="centre">P Jul-15 -
25.00</td><td align="right"> - </td><td align="right"> - </td><td
align="right"> - </td><td align="right"> - </td><td align="right"> -
</td><td align="right"> - </td><td align="right">0</td><td
align="right">11.56</td><td align="right">N/A</td><td
align="right">0</td></tr>\r\n\t\t\t\t\t\t\t\t<tr id="tr62"
class="tableHdrB1" align="center"><td align="centre">C Aug-15 -
8.75</td><td align="right"> - </td><td align="right"> - </td><td
align="right"> - </td><td align="right"> - </td><td align="right"> -
</td><td align="right"> - </td><td align="right">0</td><td
align="right">4.71</td><td align="right">N/A</td><td
align="right">0</td></tr>\r\n\t\t\t\t\t\t\t\t<tr id="tr63"
class="tableHdrB2" align="center"><td align="centre">P Aug-15 -
8.75</td><td align="right"> - </td><td align="right">0.03</td><td
align="right">0.05</td><td align="right"> - </td><td align="right"> -
</td><td align="right"> - </td><td align="right">0</td><td
align="right">0.01</td><td align="right">N/A</td><td
align="right">35</td></tr>\r\n\t\t\t\t\t\t\t\t<tr id="tr64"
class="tableHdrB1" align="center"><td align="centre">C Aug-15 -
9.00</td><td align="right"> - </td><td align="right"> - </td><td
align="right"> - </td><td align="right"> - </td><t

Please let me know how to deal with this string. I would like to put the data
into a table first. Eventually, I hope to be able to put all of it into a
database. I need some guidance on which area of coding I should look into.

Thank you
Hank
_______________________________________________
Tutor maillist - ***@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
Mark Lawrence
2015-07-06 08:25:08 UTC
Post by Crusier
Dear All,
I have used urllib.request to download some information from a site.
import urllib.request
response = urllib.request.urlopen(
    'http://www.hkex.com.hk/eng/ddp/Contract_Details.asp?PId=175')
saveFile = open('HKEX.txt','w')
saveFile.write(str(response.read()))
saveFile.close()
[snipped]
Post by Crusier
Please let me know how to deal with this string. I would like to put the data
into a table first. Eventually, I hope to be able to put all of it into a
database. I need some guidance on which area of coding I should look into.
Thank you
Hank
Start here
https://docs.python.org/3/library/html.parser.html#example-html-parser-application
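
As a rough illustration of the html.parser approach, here is a minimal sketch that collects the text of each TD cell, grouped by TR row. The class name TableRowParser and the shortened sample string are made up for this example; the sample is cut down from the output in the original post. It also assumes the page has been decoded to text first (for example with response.read().decode('utf-8'), where the encoding is a guess), since calling str() on the raw bytes is what leaves the literal \r\n\t sequences in the saved file.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row.

    Hypothetical helper for illustration; not part of the standard library.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td>.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Sample shortened from the output shown in the original post.
sample = (
    '<tr id="tr56" class="tableHdrB1" align="center">'
    '<td align="centre">C Jul-15 - 23.00</td>'
    '<td align="right"> - </td>'
    '<td align="right">0.01</td>'
    '<td align="right">467</td></tr>'
)

parser = TableRowParser()
parser.feed(sample)
parser.close()
print(parser.rows)
```

Each entry in parser.rows is a plain list of strings, so it could be passed on to the csv module or to a database insert later.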
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

Cameron Simpson
2015-07-06 08:19:45 UTC
Post by Crusier
Dear All,
I have used urllib.request to download some information from a site.
import urllib.request
response = urllib.request.urlopen(
    'http://www.hkex.com.hk/eng/ddp/Contract_Details.asp?PId=175')
saveFile = open('HKEX.txt','w')
saveFile.write(str(response.read()))
saveFile.close()
d align="right"> - </td><td align="right">0</td><td
[...]
Post by Crusier
Please let me know how to deal with this string. I would like to put the data
into a table first. Eventually, I hope to be able to put all of it into a
database. I need some guidance on which area of coding I should look into.
Look into the BeautifulSoup library, which will parse HTML. That will let you
locate the TABLE element and extract the content by walking the rows (TR) and
cells (TD).

Start here:

http://www.crummy.com/software/BeautifulSoup/bs4/doc/

You can install bs4 using pip, or in other ways:

http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-beautiful-soup
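
A minimal sketch of walking the rows and cells with bs4 might look like the following. It assumes the beautifulsoup4 package is installed and uses a made-up sample string, shortened from the output in the original post and wrapped in a TABLE element for illustration. The real page may need extra filtering (for example on the row id or class attributes), which is not shown here.

```python
from bs4 import BeautifulSoup  # third-party package: pip install beautifulsoup4

# Sample shortened from the output in the original post; the <table>
# wrapper is added here only so the fragment is complete.
sample = (
    '<table>'
    '<tr id="tr57" class="tableHdrB2" align="center">'
    '<td align="centre">P Jul-15 - 23.00</td>'
    '<td align="right"> - </td>'
    '<td align="right">9.56</td>'
    '<td align="right">0</td></tr>'
    '</table>'
)

# "html.parser" selects the parser from the standard library,
# so no additional parser package is needed.
soup = BeautifulSoup(sample, "html.parser")

rows = []
for tr in soup.find_all("tr"):
    # get_text(strip=True) drops the padding around values such as " - ".
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if cells:
        rows.append(cells)

print(rows)
```

The result is a list of rows, each a list of cell strings, which is a convenient shape for writing to a CSV file or inserting into a database table.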

Cheers,
Cameron Simpson <***@zip.com.au>

30 years ago, I made some outrageous promises about AI. I didn't deliver.
Neither did you. This is all your fault. - Marvin Minsky, IJCAI'91 (summary)