BlogPump: Blog Post Client with Web Crawler (1) – Big Picture

The big picture is:

1) Module A: Interface to the supported weblog servers, used to post and retrieve web pages, articles, and other content;

2) Module B: Container that holds the data behind the editor or list views;

3) Module C: Interface to popular search engines, used to fetch the pages you want or information relevant to an article;

4) Module D: Profile management, for flexible combinations of source, pattern, and destination;

5) Module E: Data persistence module that stores and reads data locally.

The actions themselves are fixed and hard-coded: “Do Crawl”, “Do Post”, and “Do Save/Read”. How and where each action runs depends on the plugin and the profile.
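
To make the split between fixed actions and pluggable behavior concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Profile` fields, the `CRAWLERS` and `POSTERS` registries, and the two stand-in plugins are hypothetical names, not BlogPump's actual design.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical plugin registries: plugin name -> callable.
CRAWLERS: Dict[str, Callable[[str], List[str]]] = {}
POSTERS: Dict[str, Callable[[str, List[str]], int]] = {}


@dataclass
class Profile:
    """Hypothetical profile: ties a source and a pattern to a destination."""
    source: str       # name of a registered crawler plugin
    pattern: str      # what to look for, e.g. a search keyword
    destination: str  # name of a registered poster plugin
    target: str       # where to post, e.g. a blog identifier


def do_crawl(profile: Profile) -> List[str]:
    """Fixed action: the profile decides which crawler plugin runs."""
    return CRAWLERS[profile.source](profile.pattern)


def do_post(profile: Profile, items: List[str]) -> int:
    """Fixed action: the profile decides which poster plugin runs."""
    return POSTERS[profile.destination](profile.target, items)


# Two stand-in plugins so the sketch runs without any network access.
def fake_search(pattern: str) -> List[str]:
    return [f"article about {pattern} #{i}" for i in (1, 2)]


posted: List[str] = []


def fake_blog(target: str, items: List[str]) -> int:
    posted.extend(f"{target}: {item}" for item in items)
    return len(items)


CRAWLERS["fake-search"] = fake_search
POSTERS["fake-blog"] = fake_blog

profile = Profile("fake-search", "python", "fake-blog", "my-blog")
count = do_post(profile, do_crawl(profile))
```

With this layout, supporting a new search engine or weblog server would mean registering one more callable and pointing a profile at it, while `do_crawl` and `do_post` stay unchanged.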

Language:

First-stage target server: hosted websites

HTML parser: Beautiful Soup or lxml
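
As a rough idea of what the crawler needs from its parser, the sketch below pulls the link targets out of a page. To stay dependency-free it uses `html.parser` from the Python standard library as a stand-in, not Beautiful Soup or lxml; the `LinkCollector` class and `extract_links` function are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag that has one."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(
        self, tag: str, attrs: List[Tuple[str, Optional[str]]]
    ) -> None:
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html: str) -> List[str]:
    """Return the link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


sample = (
    '<p><a href="http://example.com/a">A</a> '
    '<a name="x">no href</a> <a href="/b">B</a></p>'
)
links = extract_links(sample)
```

Beautiful Soup and lxml offer the same kind of extraction with less code and better tolerance of malformed markup, which is presumably why they are the candidates here.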

Protocol for posting: XML-RPC
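
For a feel of what a post looks like on the wire, here is a small sketch that builds an XML-RPC request with Python's standard `xmlrpc.client` module. It assumes the target server speaks the MetaWeblog API (`metaWeblog.newPost`), which this post does not confirm; the `build_new_post_request` helper and the sample credentials are hypothetical.

```python
import xmlrpc.client


def build_new_post_request(
    blog_id: str,
    username: str,
    password: str,
    title: str,
    body: str,
    publish: bool = True,
) -> str:
    """Serialize a metaWeblog.newPost call to XML without sending it.

    Assumption: the server implements the MetaWeblog API.
    """
    content = {"title": title, "description": body}
    params = (blog_id, username, password, content, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


request_xml = build_new_post_request(
    "1", "user", "secret", "Hello", "<p>First post</p>"
)

# Round-trip the payload to check what the server would receive.
params, method = xmlrpc.client.loads(request_xml)
```

To actually send the post, `xmlrpc.client.ServerProxy(url).metaWeblog.newPost(...)` takes the same five arguments, where `url` is the XML-RPC endpoint of the blog (for many hosted blogs a path ending in `xmlrpc.php`, but that depends on the server).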

GUI:

Contact us

Admin: Bryan Wu