Readme.defs 


This file describes in further detail how the strips.def file is constructed.

Strips can be defined in one of two ways. The first is as a standalone definition; the second, using classes, is described further down. In either case, a definition can provide the image URL by generating it (as from the current date) or by searching a web page for it. Let's look at an example of generating first:

 strip badtech
 	name Badtech
 	artist James Sharman
 	homepage http://www.badtech.com/
 	type generate
 	imageurl http://www.badtech.com/a/%-y/%-m/%-d.jpg
 	provides any
 end
 

In the first line, we specify the short name of the strip that will be used to refer to it on the command line. This must be unique. Next, "name" specifies the name of the strip to display in the HTML output. "artist" specifies the strip artist's name, which will be displayed on the same line as the name of the strip. "homepage" is the address of the strip's homepage, used for the link in the output. "type" can be either "generate" or "search". Here we are using "generate" to generate a URL. "imageurl" is the address of the image. You are allowed to use a number of special variables. Single letters preceded by the "%" symbol, such as "%Y", "%d", "%m", etc. are interpreted as date variables and passed to the strftime function for conversion. "date --help" provides a compatible reference. You can also use a "$" followed by any of the above variables, such as "$homepage". In this example, that simply substitutes "http://www.badtech.com/" in place of "$homepage".

The other type of URL generation, searching, is as follows:

 strip joyoftech
 	name The Joy of Tech
 	homepage http://www.joyoftech.com/joyoftech/
 	type search
 	searchpattern <IMG.+?src="(joyimages/\d+\.gif)\"
 	matchpart 1
 	baseurl http://www.joyoftech.com/joyoftech/
 	provides latest
 end
 

"strip", "name", and "homepage" all function as above. The difference is the "type search" line and the lines that follow. "searchpattern" is a Perl regular expression that must be written to match the strip's URL. Not shown is "searchpage", which would ordinarily go above "searchpattern". This is a URL to a web page and is only needed if the URL to the strip image is not found on the homepage. The same special variables listed above for "imageurl" may also be used here. "matchpart" tells the script which parenthetical section (there must be at least one) will contain the desired URL (see man perlre on $n variables for more). Note that this line is only mandatory for values other than 1. "baseurl" only needs to be specified if the "searchpattern" regular expression does not match a full URL (that is, it does not start with http:// and contain the host). If specified, it is prepended to whatever "searchpattern" matched. Not shown is "urlsuffix", which will be appended to whatever "searchpattern" matched. Finally, "imageurl" can also be used here in place of "baseurl" and "urlsuffix". It is useful when the address of the strip image must be constructed from a known portion and a variable portion that is searched for. Simply specify the URL template and insert "$match" wherever you wish the result of the search to be put. See the strips "8bit" and "pgs" for examples of how this works.
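
As an illustration of how these keywords fit together, here is a small Python sketch. It is only my approximation: the real script is Perl and uses Perl regular expressions, and the function name "resolve_image_url" and the sample page text are made up for the example.

```python
import re


def resolve_image_url(html, searchpattern, matchpart=1,
                      baseurl="", urlsuffix="", imageurl=None):
    """Return the strip image URL found in html, or None if nothing matches."""
    found = re.search(searchpattern, html)
    if found is None:
        return None
    match = found.group(matchpart)  # the parenthetical section chosen by "matchpart"
    if imageurl is not None:
        # Template form: "$match" marks where the searched-for part goes.
        return imageurl.replace("$match", match)
    return baseurl + match + urlsuffix


# Made-up page content, for illustration only.
page = '<p><IMG border=0 src="joyimages/451.gif" alt="today"></p>'
print(resolve_image_url(
    page,
    r'<IMG.+?src="(joyimages/\d+\.gif)"',
    baseurl="http://www.joyoftech.com/joyoftech/",
))
# http://www.joyoftech.com/joyoftech/joyimages/451.gif
```

With "imageurl" set to a template such as "http://www.example.com/strips/$match.png" (a hypothetical address), the same function would drop the matched text into the template instead of using "baseurl" and "urlsuffix".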

The "provides" line indicates which type of strips the definition can provide: either "any" for a definition that can provide the strip for any given date or "latest" for a definition that can only provide the current strip. This is used so that the program can skip definitions that only provide the latest strip when running with the --date option.
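
In other words, the filtering might look something like this Python sketch. The function name and the dictionary layout are assumptions of mine, not the script's actual data structures.

```python
def usable_definitions(definitions, date_requested):
    """Return the short names that can serve the request, sorted by name."""
    if not date_requested:
        # Without --date, every definition can provide today's strip.
        return sorted(definitions)
    return sorted(name for name, fields in definitions.items()
                  if fields.get("provides") == "any")


definitions = {
    "badtech": {"provides": "any"},
    "joyoftech": {"provides": "latest"},
}
print(usable_definitions(definitions, date_requested=True))   # ['badtech']
print(usable_definitions(definitions, date_requested=False))  # ['badtech', 'joyoftech']
```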

Two additional keywords are not shown. The first is "referer". Some webservers insist that the HTTP Referer header (HTTP_REFERER) be set to the address of the HTML page that the image is on, or they will not return the image. This is to prevent other sites from linking to the image and (presumably) scripts like this from functioning. By default, the script sets the referer to the searchpage (if specified), or to the homepage (if no searchpage was specified). If the webserver for some reason needs a referer other than the searchpage or homepage, it can be specified with this keyword. The second keyword is "prefetch". This was added because it seemed at one point that sfgate required a certain page to be downloaded immediately before the strip images could be loaded. The syntax is simply "prefetch [URL]". Any URL specified will be downloaded immediately before the strip image. If this URL cannot be retrieved (error 404 from the webserver, etc), no attempt will be made to download the strip image.
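
The fallback order for the referer and the prefetch behaviour can be sketched in Python as follows. This is only an outline under my own assumptions: "fetch" is a hypothetical callable that returns the downloaded bytes or None on failure, and whether the prefetch request itself carries a referer is not stated above, so this sketch sends none.

```python
def effective_referer(defn):
    """Pick "referer", else "searchpage", else "homepage"."""
    return defn.get("referer") or defn.get("searchpage") or defn["homepage"]


def download_strip(defn, image_url, fetch):
    """Fetch the strip image, honouring "prefetch" if the definition has one.

    fetch(url, referer) is a hypothetical helper, not part of dailystrips.
    """
    prefetch = defn.get("prefetch")
    if prefetch and fetch(prefetch, None) is None:
        # The prefetch page could not be retrieved: skip the image entirely.
        return None
    return fetch(image_url, effective_referer(defn))


requests_made = []


def fake_fetch(url, referer):
    # Records each request instead of touching the network.
    requests_made.append((url, referer))
    return b"image data"


defn = {"homepage": "http://www.example.com/", "prefetch": "http://www.example.com/gate"}
download_strip(defn, "http://www.example.com/today.gif", fake_fetch)
print(requests_made)
```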

New feature: you can now put little snippets of Perl code right into the definition. For example, the definition for The Norm uses this to generate the day number for 14 days ago. The Norm website uses Javascript to generate the image URL, so it couldn't be searched for and previously there was no way to work with dates other than the current date. Here's how it works: just insert <code:Perl code>. No need to quote the code, just put it where "Perl code" is. Just don't forget to escape any > that may happen to be in your code.
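
For the curious, locating these snippets in a definition value could be done roughly like the Python sketch below. Two things here are assumptions on my part: that an escaped ">" is written as "\>", and the "evaluate" callable, which stands in for actually running the Perl snippet.

```python
import re

# "<code:...>" where a literal ">" inside the snippet is assumed to be written "\>".
CODE_TAG = re.compile(r"<code:((?:\\>|[^>])*)>")


def expand_code_tags(value, evaluate):
    """Replace each <code:...> tag with str(evaluate(snippet))."""
    def run(tag):
        snippet = tag.group(1).replace("\\>", ">")  # undo the escaping
        return str(evaluate(snippet))
    return CODE_TAG.sub(run, value)


def fake_evaluate(snippet):
    # Stand-in for running the snippet; it only reports the snippet's length.
    return len(snippet)


print(expand_code_tags("http://www.example.com/<code:1 \\> 0>.gif", fake_evaluate))
# http://www.example.com/5.gif
```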

The other method of specifying strips is to use classes. This method is used when there are several strips provided by the same webserver that all have an identical definition, except for some strip-specific elements. Classes work as follows:

First, the class is declared:

 class ucomics-srch
 	homepage http://www.ucomics.com/$strip/view$1.htm
 	type search
 	searchpattern (/$1/(\d+)/$1(\d+)\.(gif|jpg))
 	matchpart 1
 	baseurl http://images.ucomics.com/comics
 	provides latest
 end
 

This is just like a strip definition, except "class" is the first line. The value for "class" must be unique among other classes but will not conflict with the names of strips. Strip-specific elements are specified using special variables "$x", where "x" is a number from 0 to 9. When the definition file is parsed, these variables are retrieved from the strip definition, shown below.

 strip calvinandhobbes
 	name Calvin and Hobbes
 	useclass ucomics-srch
 	$1 ch
 end
 

This definition is like a normal definition except the second line is "useclass" followed by the name of the class to use. Below that, the strip-specific "$x" variables must be specified. Values already declared in the class can be overridden (if necessary) by simply specifying them in the strip definition.
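
Here is a rough Python sketch of how a strip and its class might be combined, using the "$x" form of the variables described above. The function name "apply_class" and the dictionary layout are my own; the real parser is Perl and may well work differently.

```python
import re


def apply_class(strip_fields, class_fields):
    """Merge class fields into a strip and fill in "$0".."$9" variables."""
    merged = dict(class_fields)
    merged.update(strip_fields)  # values in the strip override the class

    numbered = {key[1:]: value for key, value in merged.items()
                if re.fullmatch(r"\$\d", key)}

    def fill(match):
        # Unknown "$x" variables are left alone in this sketch.
        return numbered.get(match.group(1), match.group(0))

    resolved = {}
    for key, value in merged.items():
        if key == "useclass" or re.fullmatch(r"\$\d", key):
            continue  # bookkeeping entries, not part of the final definition
        resolved[key] = re.sub(r"\$(\d)", fill, value)
    return resolved


ucomics_srch = {
    "type": "search",
    "searchpattern": r"(/$1/(\d+)/$1(\d+)\.(gif|jpg))",
    "matchpart": "1",
    "baseurl": "http://images.ucomics.com/comics",
    "provides": "latest",
}
calvinandhobbes = {"name": "Calvin and Hobbes", "useclass": "ucomics-srch", "$1": "ch"}
print(apply_class(calvinandhobbes, ucomics_srch)["searchpattern"])
# (/ch/(\d+)/ch(\d+)\.(gif|jpg))
```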

For your convenience, "groups" of strips may also be defined. These allow you to use a single keyword on the command line to refer to a whole set of strips. The construct is as follows:

 group favorites
 	desc My Favorite Comics
 	include peanuts
 	include foxtrot, userfriendly
 end
 

The group name must be unique among all groups, but will not conflict with strips or classes (in fact it might be useful to have a group for each class - I might even make that automatic). "desc" is a description of the group that will be shown by --list. Everything after an "include" is added to the list of strips. You may specify one or more strips per "include" line, whatever you prefer. Groups are referenced on the command line by an "@" symbol followed by the name (bare words not preceded by an "@" symbol will be considered strips). Finally, you may use "exclude [strips]" lines instead of "include" lines to make the group contain every strip except those specified.
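
Turning command-line words into a list of strips could then work along the lines of this Python sketch. The function name, the data layout, and the choice to drop duplicates are assumptions made for illustration.

```python
def expand_args(args, groups, all_strips):
    """Expand "@group" words into strips; bare words are taken as strips."""
    selected = []
    for arg in args:
        if arg.startswith("@"):
            group = groups[arg[1:]]
            if "exclude" in group:
                # An "exclude" group means every strip except those listed.
                members = [s for s in all_strips if s not in group["exclude"]]
            else:
                members = group.get("include", [])
        else:
            members = [arg]
        for strip in members:
            if strip not in selected:  # dropping duplicates is this sketch's choice
                selected.append(strip)
    return selected


all_strips = ["badtech", "foxtrot", "peanuts", "userfriendly"]
groups = {
    "favorites": {"include": ["peanuts", "foxtrot", "userfriendly"]},
    "nobadtech": {"exclude": ["badtech"]},  # hypothetical group
}
print(expand_args(["@favorites", "badtech"], groups, all_strips))
# ['peanuts', 'foxtrot', 'userfriendly', 'badtech']
```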

Notes:

  • As of 1.0.10, only date variables use the "%" symbol - everything else now uses "$"
  • For classes, variables declared in the strip definition take precedence over those specified in the class, if there is any conflict
  • You cannot use "$variablename" to refer to a variable below the current line (assuming that you use the standard order) if the referenced variable is itself a reference - the script only parses "$variablename" references once. This is a bug and is scheduled to be fixed.
  • If no "searchpage" is specified for definitions of type "search", the value of "homepage" is used.
  • If no "referer" is specified, the value of "searchpage" is used. If this has not been set (in the case of definitions that generate the URL or search definitions that use the homepage as the searchpage), the value of "homepage" is used.
  • There may be additional problems lurking in the definition file-parsing code. It currently works fairly well, but needs to be re-written properly.
  • Group, strip, and class names can contain pretty much any character except semicolon, space, and pipe.
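
To tie the pieces together, here is a minimal Python sketch of reading the block syntax described in this file ("strip", "class", or "group" on the first line, keyword lines, then "end"). It is not the script's real parser, which is written in Perl; the function name and the returned layout are assumptions, and it does not check names or handle anything beyond what is described above.

```python
def parse_definitions(text):
    """Parse strip/class/group blocks into {kind: {name: {keyword: value}}}."""
    result = {"strip": {}, "class": {}, "group": {}}
    current = None
    for raw in text.splitlines():
        parts = raw.strip().split(None, 1)
        if not parts:
            continue  # blank line
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if current is None:
            if key in result:
                current = result[key].setdefault(value, {})
        elif key == "end":
            current = None
        elif key in ("include", "exclude"):
            # These may repeat and may list several strips per line.
            names = [name.strip() for name in value.split(",") if name.strip()]
            current.setdefault(key, []).extend(names)
        else:
            current[key] = value
    return result


sample = """
strip badtech
	name Badtech
	type generate
	provides any
end

group favorites
	desc My Favorite Comics
	include peanuts
	include foxtrot, userfriendly
end
"""
parsed = parse_definitions(sample)
print(parsed["strip"]["badtech"]["type"])       # generate
print(parsed["group"]["favorites"]["include"])  # ['peanuts', 'foxtrot', 'userfriendly']
```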


Copyright ©2001-2003 Andrew Medico <amedico @ amedico . dhs . org>