Package elisa :: Package plugins :: Package ugly :: Package stage6 :: Module stage_media :: Class StageParser

Class StageParser

This class implements a parser that retrieves video titles and URLs from a Stage6 HTML page.

Instance Methods

__init__(self, string_to_parse)

get_tags(self)
Returns a list of tags as strings (return type: list of strings).
 
get_pages(self)
Returns an integer representing the last page.
 
get_watchlist(self)
Returns a list of dictionaries of the form {'label': '', 'href': '', 'img': ''}.
 
get_videos(self)
Returns a list of videos, with their name, URL and thumbnail.

Class Variables
  reg_href = re.compile(r'href="(.*)"')
  reg_href_avatar = re.compile(r'href="(.*)"><acronym')
  reg_img = re.compile(r'alt="(.*)" src="(.*)"')
  reg_time = re.compile(r'<img (.*)/></acronym>(.*)')
  reg_img_avatar = re.compile(r'src="(.*)" alt=')
  reg_img_title = re.compile(r'<acronym title="(.*)" class="no-b...
  reg_title = re.compile(r'title="(.*)">(.*)</a></')
  reg_video_id = re.compile(r'video/(.*)/')
  reg_pages = re.compile(r'>(.*)</a>')
  reg_watch_type = re.compile(r'<div class="user-watch" id="(.*)...
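
Every pattern above captures with the greedy group (.*), so each one is presumably meant to be applied to a single line of HTML that contains only the attribute of interest. The sketch below shows what reg_href captures on two made-up lines; the sample markup is an assumption and is not taken from a real Stage6 page.

```python
import re

# Same pattern as StageParser.reg_href.
reg_href = re.compile(r'href="(.*)"')

# Hypothetical input lines; real Stage6 markup may differ.
simple_line = '<a href="/video/1234/">'
busy_line = '<a href="/video/1234/" title="Some clip">'

# One quoted attribute on the line: the group holds just the URL.
simple_match = reg_href.search(simple_line).group(1)

# Two quoted attributes: the greedy group runs to the last double quote.
busy_match = reg_href.search(busy_line).group(1)

print(simple_match)
print(busy_match)
```

On the second line the captured text includes the start of the title attribute, which is why line-by-line input with one attribute per line matters for patterns written this way.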
Method Details

__init__(self, string_to_parse)
(Constructor)

Parameters:
  • string_to_parse (string) - the HTML code to parse

get_tags(self)

Returns a list of tags as strings, parsed from the HTML data stored in self._to_parse.
Returns: list of strings

get_pages(self)

Returns an integer representing the last page. If there are no pages, returns zero.
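
The page does not show how get_pages is implemented. As a rough sketch of the documented behaviour, the hypothetical function last_page below applies the reg_pages pattern line by line and keeps the highest numeric link label, falling back to zero. The function name, the "highest number wins" rule, and the sample pagination markup are all assumptions.

```python
import re

# Same pattern as StageParser.reg_pages.
reg_pages = re.compile(r'>(.*)</a>')


def last_page(html):
    """Hypothetical stand-in for get_pages: highest numeric link label, or 0."""
    highest = 0
    for line in html.splitlines():
        match = reg_pages.search(line)
        # Skip lines without a link and links whose label is not a number.
        if match and match.group(1).strip().isdigit():
            highest = max(highest, int(match.group(1)))
    return highest


# Made-up pagination markup, one link per line.
sample = '\n'.join([
    '<a href="?page=1">1</a>',
    '<a href="?page=2">2</a>',
    '<a href="?page=7">7</a>',
    '<a href="?page=2">Next</a>',
])

print(last_page(sample))
print(last_page('<p>no pagination here</p>'))
```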

get_videos(self)

Returns a list of videos, with their name, URL and thumbnail, parsed from the HTML data stored in self._to_parse.

Returns: list of (string, string, string)
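
To illustrate the shape of the return value, the sketch below builds (name, URL, thumbnail) tuples with the reg_video_id and reg_img patterns. It is not the actual get_videos implementation, which this page does not show. The function parse_videos, the pairing of a link line with the next image line, and the sample markup are hypothetical.

```python
import re

# Same patterns as the StageParser class variables of these names.
reg_img = re.compile(r'alt="(.*)" src="(.*)"')
reg_video_id = re.compile(r'video/(.*)/')


def parse_videos(html):
    """Hypothetical stand-in for get_videos: list of (name, url, thumbnail)."""
    videos = []
    current_url = None
    for line in html.splitlines():
        id_match = reg_video_id.search(line)
        if id_match:
            # Remember the link; the thumbnail is expected on a later line.
            current_url = '/video/%s/' % id_match.group(1)
            continue
        img_match = reg_img.search(line)
        if img_match and current_url:
            name, thumbnail = img_match.groups()
            videos.append((name, current_url, thumbnail))
            current_url = None
    return videos


# Made-up markup, one tag per line.
sample = '\n'.join([
    '<a href="/video/1234/">',
    '<img alt="First clip" src="/thumbs/1234.jpg"/>',
    '</a>',
    '<a href="/video/5678/">',
    '<img alt="Second clip" src="/thumbs/5678.jpg"/>',
    '</a>',
])

for video in parse_videos(sample):
    print(video)
```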

Class Variable Details

reg_img_title

Value:
re.compile(r'<acronym title="(.*)" class="no-border">')

reg_watch_type

Value:
re.compile(r'<div class="user-watch" id="(.*)">')
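
As a quick check of what the two full patterns capture, the sketch below runs them against made-up lines. The attribute values are illustrative assumptions, not real Stage6 data.

```python
import re

# Full patterns as listed above.
reg_img_title = re.compile(r'<acronym title="(.*)" class="no-border">')
reg_watch_type = re.compile(r'<div class="user-watch" id="(.*)">')

# Made-up lines for illustration only.
acronym_line = '<acronym title="Some user" class="no-border">'
watch_line = '<div class="user-watch" id="watch-video">'

img_title = reg_img_title.search(acronym_line).group(1)
watch_type = reg_watch_type.search(watch_line).group(1)

# A line that lacks the literal class text yields no match at all.
no_match = reg_watch_type.search('<div class="other" id="x">')

print(img_title)
print(watch_type)
print(no_match)
```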