| Eric Lindsay 2005-12-09, 8:00 am |
| I need to update my antiquated manual (originally vi) web page
production methods for my amateur site, plus a few other sites (about
400 pages). I particularly want to ensure my resultant files have very
little in the body before the h1 and first paragraph, to assist with
search engine placement. I also want to ensure the pages are standards
compliant (or very close) even before validating. When writing the
pages by hand I tend to make a fair number of typos, and sometimes miss
closing tags, forget alts, and that sort of error.
The sites all use static pages. I plan to generate revised pages at
home and automate the uploads using find -newer.
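As a rough sketch of how the find -newer step might look in Bash: the stamp file name (.last_upload) and the function name list_changed are assumptions made for illustration, not anything already in place.

```shell
#!/bin/bash
# Sketch only: list the generated files that changed since the last upload.
# The stamp file name (.last_upload) and the function name are assumptions.

# list_changed SITE_DIR
# Prints every regular file under SITE_DIR that was modified more recently
# than SITE_DIR/.last_upload. If there is no stamp file yet, every file is
# listed, so the first run uploads everything.
list_changed() {
    local site_dir=$1
    local stamp="$site_dir/.last_upload"
    if [ -f "$stamp" ]; then
        find "$site_dir" -type f -newer "$stamp" ! -name '.last_upload'
    else
        find "$site_dir" -type f ! -name '.last_upload'
    fi
}

# After a successful upload, record the time for the next run:
#   touch "$site_dir/.last_upload"
```

The output of list_changed can be fed to whatever upload tool is in use. Touching the stamp only after the upload succeeds means a failed upload is simply retried next time; a file edited while the upload is running could still be missed until its next change.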
My first cut is to separate out the DTD, {title, keywords, description},
and the rest of the head into separate boilerplate files. I'm also
planning to separate out the body of the page from the navigation, the
footer, the bread crumb trail, and any regular sidebars, all of which
will go into boilerplate files.
Each part of the web sites is in a separate directory, so each will
have its own set of boilerplate (normally fairly similar, of course).
The directory structure should be something like:
topic1 - texts - boilerplate
       - images
topic2 - texts - boilerplate
       - images
etc.
The topic directories will contain the resulting .html files, plus
topic.css, favicon, robots.txt etc., as required. I'd hope to end up
with a site wide .css, and specific topic .css files as required. The
texts directories will contain the main body text. I was thinking of
going through all the topic directories, looping through all the .txt
files in each texts directory and generating the final pages in the
topic directories.
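The loop over topics and text files could be sketched along these lines. The function name list_sources is hypothetical, and the layout (a texts directory directly inside each topic directory) is taken from the structure described above.

```shell
#!/bin/bash
# Sketch only: walk every topic directory and report its .txt sources.
# The function name list_sources is an assumption.

# list_sources SITE_ROOT
# Prints "topic-dir<TAB>text-file" for every .txt file found in a texts/
# subdirectory one level below SITE_ROOT. Topics without a texts/
# directory are skipped.
list_sources() {
    local root=$1 topic txt
    for topic in "$root"/*/; do
        topic=${topic%/}                  # drop the trailing slash
        [ -d "$topic/texts" ] || continue
        for txt in "$topic"/texts/*.txt; do
            [ -f "$txt" ] || continue     # the glob matched nothing
            printf '%s\t%s\n' "$topic" "$txt"
        done
    done
}
```

Each output line names the directory that should receive the finished page and the text file it is built from, so a later step can read the pairs and do the actual assembly.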
Because different topics may need different sidebars or different
numbers of boilerplate files, I was going to control the process with
the equivalent of a makefile in each texts directory. This will just
contain a line for each web page in a format like:
output1.html dtd title head output1.txt nav footer
output1.html dtd title head output1.txt sidebar1 nav footer breadcrumb
That way pages or topics that need different boilerplate contents can be
accommodated. Plus I can detect when files are missing.
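A minimal sketch of a script that reads such a control file, checks for missing parts, and concatenates the rest. Several things here are assumptions: the control file name (pages.list), the function name build_pages, and the rule that parts ending in .txt live in the texts directory while all other parts live in its boilerplate subdirectory.

```shell
#!/bin/bash
# Sketch only: assemble pages from a control file, one page per line.
# The file name pages.list, the function name, and the lookup rule for
# parts are all assumptions.

# build_pages TEXTS_DIR OUT_DIR
# Each line of TEXTS_DIR/pages.list is: output.html part1 part2 ...
# A page is written only when every listed part exists; missing parts are
# reported on stderr and the function returns non-zero.
build_pages() {
    local texts=$1 out=$2 status=0
    local target parts part path missing
    while read -r target parts || [ -n "$target" ]; do
        case $target in ''|'#'*) continue ;; esac   # skip blanks, comments
        missing=0
        set --                        # reuse "$@" as the list of part files
        for part in $parts; do        # unquoted on purpose: split on spaces
            case $part in
                *.txt) path="$texts/$part" ;;
                *)     path="$texts/boilerplate/$part" ;;
            esac
            if [ -f "$path" ]; then
                set -- "$@" "$path"
            else
                echo "$target: missing $path" >&2
                missing=1
            fi
        done
        if [ "$#" -eq 0 ]; then
            echo "$target: no parts listed" >&2
            missing=1
        fi
        if [ "$missing" -eq 0 ]; then
            cat "$@" > "$out/$target"
        else
            status=1
        fi
    done < "$texts/pages.list"
    return $status
}
```

Called as build_pages topic1/texts topic1, this would write each complete page into the topic directory and leave pages with missing parts untouched. File names containing spaces or glob characters are not handled.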
At the moment I plan to keep the main text files with their regular html
markup included (at least between the body and the sidebar tags). I'm
planning to generate new pages with a script that prompts for material,
and includes most of the html automatically. That is mainly to cut down
my typos on entering html. I would be editing the resulting txt files
with a regular text editor, or an html editor if I find one I really
like. Lots of my files need regular updates as new material arrives.
One minor departure from the strict sequence of assembling the final
file just by concatenating boilerplate and main is that I plan to stick
the title, keywords and description for the head at the end of the main
text files. I'll be generating these all automatically from h1 and the
first paragraph when using the script for new files. Naturally this
won't be perfect, so I expect to do some editing of them later. The assembly
of the final html will involve extracting the title, keywords and
description from the main file, sticking them in the correct place in
the head, and not passing them along within the body of the text.
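The extraction step might be sketched as below. The storage format is not settled above, so this assumes each item sits on its own line at the end of the text file as "title: ...", "keywords: ..." and "description: ..."; that format and the function names are hypothetical.

```shell
#!/bin/bash
# Sketch only: pull title/keywords/description out of a text file.
# Assumes (hypothetically) one "key: value" line per item in the file.

# meta_value FILE KEY
# Prints the value of the last "KEY: value" line in FILE.
meta_value() {
    sed -n "s/^$2:[[:space:]]*//p" "$1" | tail -n 1
}

# body_only FILE
# Prints FILE with the three metadata lines removed, so they are not
# passed along in the body of the page.
body_only() {
    grep -v -E '^(title|keywords|description):' "$1"
}

# head_block FILE
# Prints the head elements built from the metadata in FILE. Values are
# inserted as-is, with no HTML escaping. An XHTML DTD would need the
# meta tags closed with " />".
head_block() {
    printf '<title>%s</title>\n' "$(meta_value "$1" title)"
    printf '<meta name="keywords" content="%s">\n' "$(meta_value "$1" keywords)"
    printf '<meta name="description" content="%s">\n' "$(meta_value "$1" description)"
}
```

With this arrangement the assembly would emit head_block where the title boilerplate would otherwise go and body_only in place of the raw text file. Any body line that happens to start with "title:", "keywords:" or "description:" would also be stripped, so a more distinctive marker may be safer.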
At the moment I'm prototyping in the Bash shell, mostly on the basis that
it is available to me, and I can remember some of it.
Any suggestions about items I have forgotten would be most welcome, as
would suggestions for a better technique. I have looked at a bunch of
the blog generating programs, but they didn't seem to be aimed at what I
want to do.
--
http://www.ericlindsay.com
|