Zapzilla export

Draft 0.2

Introduction

With Zapzilla you can take snapshots of Teletext and Closed Caption pages in several formats. To save the page displayed in a window, select Export or Zapzilla / Export from the context menu, then choose the desired format. Note that the page is saved as displayed: this includes, for example, text highlighted after a search operation, but not scaling. To save text with scaling or in JPEG format, please see the Screenshot plugin.

Export Dialog

Zapping composes a file name for you from the last directory entered, the station name if known, the page number and an extension suitable for the format. Naturally you can modify the name, choose one from the history or browse your file system.
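The composition of the default name can be illustrated with a minimal sketch. The function name `compose_export_name`, the hyphen separator and the replacement of spaces and slashes are assumptions made for illustration; the actual naming scheme used by Zapping may differ.

```python
import os

def compose_export_name(last_dir, station, page_number, extension):
    """Build a default export path (hypothetical helper, not Zapping's code)."""
    page = str(page_number)
    # Leave out the station name when it is not known.
    parts = [station, page] if station else [page]
    # Assumed clean-up: avoid spaces and path separators in the file name.
    stem = "-".join(parts).replace("/", "_").replace(" ", "_")
    return os.path.join(last_dir, stem + "." + extension)
```

For example, `compose_export_name("/tmp", "BBC One", 100, "html")` yields a path ending in `BBC_One-100.html`, and with an unknown station only the page number and extension remain.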

The options are dynamically added, depending on the format. All settings are saved across sessions. See below for a detailed description.

HTML - HyperText Markup Language

HTML specification

Exports the page as an HTML page, that is, preformatted text in a fixed-width typewriter font. Zapzilla picks the encoding that can represent the page most efficiently, for example ISO-8859-7 for Greek; characters outside that encoding are written as Unicode character references.
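One way to picture this choice is the following sketch, which tries a few candidate encodings and keeps the one producing the shortest output. The names `pick_charset` and `encode_html_text`, the candidate list and the "shortest output wins" rule are assumptions for illustration; how Zapzilla actually selects the encoding is not described here.

```python
# Assumed candidate list; the real set of encodings may be different.
CANDIDATES = ["ascii", "iso-8859-1", "iso-8859-5", "iso-8859-7", "iso-8859-8"]

def encode_html_text(text, charset):
    """Encode text, writing unrepresentable characters as &#NNNN; references."""
    return text.encode(charset, errors="xmlcharrefreplace")

def pick_charset(text, candidates=CANDIDATES):
    """Return the candidate that encodes the text in the fewest bytes."""
    # min() keeps the first candidate on a tie, so plain ASCII text stays ASCII.
    return min(candidates, key=lambda cs: len(encode_html_text(text, cs)))
```

With Greek text, ISO-8859-7 needs one byte per letter while every other candidate needs a multi-byte character reference, so it wins the comparison.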

For characters that cannot be represented even in Unicode, i.e. all sorts of graphics characters, you can enter a replacement. A single character (" " - space) stands for itself; alternatively you can enter the Unicode number in decimal ("32") or hexadecimal ("0x20") format.
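The three accepted forms of the replacement entry can be sketched as below. The function `parse_replacement` is hypothetical, and treating a one-character entry such as "7" as the character itself rather than as a number is an assumption drawn from the rule that a single character stands for itself.

```python
def parse_replacement(entry):
    """Turn the dialog entry into one replacement character (illustrative only)."""
    if len(entry) == 1:
        # A single character stands for itself, including a lone digit.
        return entry
    if entry.lower().startswith("0x"):
        # Hexadecimal Unicode number, e.g. "0x20".
        return chr(int(entry, 16))
    # Decimal Unicode number, e.g. "32".
    return chr(int(entry, 10))
```

All of `" "`, `"32"` and `"0x20"` therefore select the space character.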

HTML can preserve a number of text attributes: underlined, bold and italic style, blinking, and foreground and background colors. These are stored by embedding CSS information, so you must enable style sheets in your browser to actually see color. Zapzilla creates anchors for URLs and e-mail addresses found in Teletext pages; page numbers are not linked.

The HTML file will include a title displaying the station name and page number. To concatenate HTML pages into a larger file the dialog offers an option to omit the page header.

PPM - Portable PixMap

PPM manual page

Exports the page as a PPM image, in raw (P6) format.

The font used to render Teletext and Closed Caption pages attempts to mimic a real TV. One implication is that a real TV doubles the vertical resolution by overlaying the same image on both fields of the picture. The dialog includes an option to create a picture with the correct aspect ratio; however, this duplicates every line of the image, doubling its size without adding any new information.
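The effect of the aspect ratio option on the file can be seen in a small sketch of a raw (P6) PPM encoder. The function `encode_ppm` and its `double_lines` parameter are hypothetical names; the sketch assumes 8 bit RGB rows and is not Zapzilla's implementation.

```python
def encode_ppm(width, rows, double_lines=False):
    """Encode RGB rows (bytes objects of width * 3 bytes each) as raw P6 PPM."""
    for row in rows:
        if len(row) != width * 3:
            raise ValueError("each row must hold width * 3 bytes")
    if double_lines:
        # Repeat every scan line once: twice the height, no new information.
        rows = [row for row in rows for _ in (0, 1)]
    # P6 header: magic number, width and height, maximum color value.
    header = f"P6\n{width} {len(rows)}\n255\n".encode("ascii")
    return header + b"".join(rows)
```

Doubling a 2 x 2 image turns the header from `2 2` into `2 4` and doubles the pixel data from 12 to 24 bytes.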

On a further note, the pixel clock of a real TV differs from that of computer monitors, creating rectangular instead of square pixels. As a result, the image looks slightly taller on a monitor than it would on a TV.

As another way to grab the page in this format, you can select a region to be copied to the clipboard and paste the image into your favourite GNU image manipulation program.

PNG - Portable Network Graphics

PNG manual page

Exports the page as a PNG image, using an 8 bit palette with alpha channel.

This format contains the same information as PPM. Additionally transparency and overlay attributes are preserved and the file is much smaller. A title is included, displaying the station name and page number. The aspect ratio issue discussed in the PPM section applies here as well.

Text

Exports the page as plain ASCII text, stripped of all styles and colors. Naturally, Greek, Cyrillic, Hebrew and Arabic Teletext pages and accented characters cannot be translated to ASCII, so since Zapping version 0.6.3 you can choose from a variety of character sets, including Latin-1 and Unicode UTF-8. For graphics characters you can specify a replacement character as described in the HTML section.

To preserve color and a few text styles you can enable the insertion of VT 100/200 control codes. Since terminals support only eight of the up to 4096 colors possible with Teletext Level 2.5 ("HiText"), the page colors will be approximated.
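A plausible form of this approximation is sketched below. It assumes the 4096 colors are 12 bit RGB values with 4 bits per channel, that the control codes are the usual ANSI/ECMA-48 SGR color sequences, and that each channel is simply rounded to off or full intensity. The function names are hypothetical and the method Zapzilla really uses is not documented here.

```python
def nearest_ansi_color(r, g, b):
    """Map 4 bit channels (0..15) to an ANSI color index 0..7."""
    # ANSI order: 0 black, 1 red, 2 green, 3 yellow, 4 blue, 5 magenta, 6 cyan, 7 white.
    # A channel of 8 or more is closer to full intensity (15) than to zero.
    return int(r >= 8) | (int(g >= 8) << 1) | (int(b >= 8) << 2)

def sgr_foreground(r, g, b):
    """Return the escape sequence selecting the approximated foreground color."""
    return f"\x1b[{30 + nearest_ansi_color(r, g, b)}m"
```

An orange tone such as (15, 9, 1) is rounded to yellow, so `sgr_foreground(15, 9, 1)` produces the sequence ESC `[33m`.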

As another way to grab the text of a page, you can select a rectangular or wrapping region to be copied to the clipboard. However, this mode supports ASCII only, and color will be lost.

VTX - VideoTeXt format

This is the format used by the VideoteXt Teletext viewer, and by the vbidecode program included with the bttv driver, to store Teletext pages. It has been added for tools importing VTX pages. In case you wonder, "Videotext" is the name of the Teletext service in Germany, and actually a trademark.

VTX stores pages in raw Teletext Level 1.0 format. This impairs Greek, Cyrillic, Hebrew and Arabic pages and strips off any national characters transmitted at Level 1.5 as well as the additional colors, styles and graphics of Level 2.5/3.5. The Zapping hyperlinks, i.e. page numbers, URLs and e-mail addresses, are not preserved.