2. Getting Started

2.1 Main Structural Tags

You create a web page in a tree-structured fashion using text files that provide the framework for the content. This tree structure can then be further developed into a matrix of linked pages, linking not just to each other but also to other pages on the Internet. The first page that users encounter when they enter a Uniform Resource Locator (URL) such as https://www.rhyshaden.com/ into their web browser is the top page of the tree for that particular website. The name of this first page can vary depending on the web server configuration; the most commonly used names for this HTML file are index.html, index.htm, default.html and welcome.html. If no index file is configured, many web servers respond to a browser pointing at a directory by returning a directory listing. This is not very secure, so the server would normally be configured with the name of the index file. This index file, e.g. index.html, will open if the browser is directed at the directory in which the file resides. The directory is denoted by the final / in the URL that was typed into the client browser.

The first step is to open a text editor such as Notepad (in Windows) and create a new file called, say, Web1.htm. HTML works by way of tag pairs which form containers. A browser looks for these tag pairs and acts on the text contained within the pairs. The parameters within the tags define attributes that instruct the parser within the browser how to treat that text or content. There are some tags that do not have 'container terminators' and are standalone tags. One example is the <HR> tag; others include <BR> and <IMG>. We will come on to these later.

Many of the tags can have attributes, and some of these have values assigned to them. These attributes determine how the tag behaves. The format of an attribute with a value is ATTRIBUTE_NAME="value". The value can be a number (decimal, or hexadecimal denoted by #, as used for colours) or a percentage, or even a word such as a colour name. HTML 4.01 only allows the quotation marks to be left out when the value consists solely of letters, digits, hyphens, full stops, underscores and colons, so it is safest to use them every time.
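
As a rough illustration of the attribute format, the fragment below shows each kind of value in turn. The particular values are assumptions chosen for the sketch; <HR> and <FONT> are both tags covered elsewhere in this tutorial.

```html
<!-- A value in pixels -->
<HR SIZE="4">

<!-- A value as a percentage -->
<HR WIDTH="50%">

<!-- A colour given as a word, then as a hexadecimal number -->
<FONT COLOR="red">This text is red.</FONT>
<FONT COLOR="#0000FF">This text is blue.</FONT>
```

The hexadecimal value contains a # character, so its quotation marks cannot be left out.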

It is possible to nest tags within tags, provided that you close the nested tags in the appropriate places, so that the inner pair is closed before the outer pair. Unknown tags are ignored by the browser and are generally not displayed on the screen, although any text they enclose still is. No matter how many spaces or tabs are entered within the HTML document itself, the parser translates all of them together as one space. You can add multiple spaces by typing &nbsp; the required number of times; this is the character entity for the non-breaking space. We will come across more character entities later. Line returns within the HTML text are treated in the same way as spaces and do not start a new line on the page.
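
A small sketch of nesting and of the way white space is handled may help here. It assumes the <B> (bold), <I> (italic) and <P> (paragraph) tag pairs, which are standard HTML 4.01 but have not yet been introduced in this tutorial.

```html
<!-- The inner <I> pair is closed before the outer <B> pair -->
<B>Bold, <I>bold and italic</I>, bold again</B>

<!-- The runs of spaces and the line return are each shown as one space -->
<P>These     words
are   shown   with   single   spaces.</P>

<!-- Three non-breaking spaces give a visible gap of three spaces -->
<P>A&nbsp;&nbsp;&nbsp;gap between the first two words.</P>
```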

It is good practice to annotate the HTML code, particularly if it is complex. You can add comments by typing them between <!-- and -->. It is conventional to leave a space after the opening comment delimiter and before the closing comment delimiter. Tag and attribute names are not case-sensitive, so capitalising them is a matter of style and does not change the size of the HTML file. Spaces, tabs and comments do add to its size, and it is possible to use utilities to compact your HTML code by stripping out spaces and tabs, enabling the page to be downloaded a little more speedily.
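
For example, a comment might be written as follows. The wording of the comments is only an assumption for illustration; the browser does not display any of it.

```html
<!-- This is a comment on a single line -->

<!--
  A comment can also run over several lines,
  which is handy for longer notes.
-->
<HR>
```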

For a web page there are three essential tag pairs:

The tags <HTML> and </HTML> enclose an entire web document, telling the browser that this file is for the browser's attention.
The <HEAD> and </HEAD> tags enclose the description of the page, including its title.
The <BODY> and </BODY> tags contain the rest of the page.

If we type these into Notepad like this:
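
A minimal sketch of such a file, assuming it holds nothing but the three tag pairs in the order just described:

```html
<HTML>

<HEAD>
</HEAD>

<BODY>
</BODY>

</HTML>
```

Since nothing sits between the <BODY> tags, there is no content for the browser to show.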

Then, in a Web browser, this turns out like this:

Bloomin' marvellous!

What shall we try next? Let's use the tags <TITLE> and </TITLE> like this:
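
As a sketch, the title pair goes between the <HEAD> tags. The title text used here is an assumption, borrowed from the META example later in this section.

```html
<HTML>

<HEAD>
<TITLE>Writing For The Web Tutorial</TITLE>
</HEAD>

<BODY>
</BODY>

</HTML>
```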

And guess what this gives us:

Hmmm! OK, so the document title appears in the browser's title bar, but it is getting a little boring now!

Don't panic! What's happening is that the <TITLE> tag pair keeps the title out of the page itself, but supplies the name that is used when anyone bookmarks the page, and that web search engines pick up.

2.2 Headings

Let's get our heading in view by using the heading tags <H1> and </H1>. We will also throw in some centring using <CENTER> and stick in a couple of horizontal lines with <HR>.
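
A sketch of how the body might look with these tags in place. The heading text is an assumption, reusing the title from the earlier step.

```html
<BODY>

<HR>
<CENTER>
<H1>Writing For The Web Tutorial</H1>
</CENTER>
<HR>

</BODY>
```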

This gives the following display:

The <HR> tag has the following attributes:

WIDTH - this can take a value in pixels or a percentage of the width of the page.
ALIGN - by default the line is centred on the page, however you can align the line to the LEFT or to the RIGHT if you wish.
NOSHADE - changes the default, bevelled line to a plain line.
SIZE - changes the thickness of the line in pixels.
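
The attributes above can be combined in a single tag. The values in this sketch are arbitrary and chosen only for illustration.

```html
<!-- Half the page width, against the left margin, 5 pixels thick, no bevel -->
<HR WIDTH="50%" ALIGN="LEFT" SIZE="5" NOSHADE>

<!-- 300 pixels wide, against the right margin, default thickness and shading -->
<HR WIDTH="300" ALIGN="RIGHT">
```

NOSHADE takes no value, so it is written on its own.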

Using H1 gives the largest size of letters in the heading. We can use H2, H3, etc. up to H6 to give progressively smaller headings. The heading tag automatically creates a blank line underneath to separate the heading from the next block of content. Using the <CENTER> tags centres everything contained within them horizontally in the browser window (be careful of the American spelling of center!). As an aside, you will notice how the browser reshapes the text as you re-size your browser window.

2.3 Meta Tags

Before we leave this 'heading' part of the HTML document, it is worth looking at the <META> tag. The META tag allows you to decide what information is picked up by some search engines on the World Wide Web; it is completely invisible to the user. The following example shows how I might advertise this tutorial on the Internet:

<HEAD>
<TITLE>Writing For The Web Tutorial</TITLE>

<META NAME="description" CONTENT="A tutorial showing you how to create Web Pages.">

<META NAME="keywords" CONTENT="WWW, HTML, tags, frames, tables, links">

</HEAD>

Notice how the META tag sits within the HEAD tags, and also notice that there is some text, or CONTENT, associated with a NAME which I called description. Many search engines will grab the title 'Writing For The Web Tutorial' and the content of the description that I wrote down. Both NAME and CONTENT are attributes of the META tag.

Another META tag was also created which used the name keywords, and the CONTENT of this is a number of words which are likely to be used within a search. Someone typing one of these words in a search is more likely to see your site come up in the results. As a guideline, you are allowed to use up to a total of 1024 characters for the keywords.

If you did not wish your page to be indexed by a search engine, then you could use robots as a NAME to exclude this particular page from search engines. You would give it the CONTENT value of noindex to prevent Internet spiders from indexing the page, or nofollow to prevent them from following links on that particular page.
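
As a sketch, a robots META tag sits between the <HEAD> tags like the others. Combining both values in one comma-separated CONTENT, as here, is a common convention; use just one of them if you only need one effect.

```html
<HEAD>
<TITLE>Writing For The Web Tutorial</TITLE>

<!-- Ask spiders not to index this page and not to follow its links -->
<META NAME="robots" CONTENT="noindex, nofollow">

</HEAD>
```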

There is one general rule: you should not use any HTML formatting information within your META tags.

Other meta tag examples include <META NAME="author" CONTENT="Rhys Haden">, <META NAME="copyright" CONTENT="2003, Rhys Haden"> and a recent one <META NAME="MSSmartTagsPreventParsing" CONTENT="TRUE"> which will prevent browsers that have Smart Tags enabled from attaching tags to words and phrases on the website.

You are able to supply the equivalent of HTTP response header information, normally sent by the web server, by using the META tag's HTTP-EQUIV attribute. For instance, you may wish to inform Internet caching engines, browser caches and web robots that particular content included within a page has a specified expiry date, i.e. once content has reached a certain age you want the clients to request updated content. Examples include stock market, weather or travel information. You would do this with a line such as <META HTTP-EQUIV="expires" CONTENT="Sun, 10 Aug 2003 12:31:00 GMT">, with the date written in the HTTP date format. Another example is if you want to inform clients which language a certain page has been written in, e.g. <META HTTP-EQUIV="content-language" CONTENT="en-gb">. Also used is content-type, which informs browsers of the character set being used on the page, e.g. <META HTTP-EQUIV="content-type" CONTENT="text/html; charset=utf-8">. The values for the HTTP-EQUIV attribute are header names used within the HTTP protocol.
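
Put together, the head of a page using these three headers might look like the sketch below. The title is an assumption carried over from the earlier example, and the expiry date is written in the HTTP date format, which ends in GMT.

```html
<HEAD>
<TITLE>Writing For The Web Tutorial</TITLE>

<!-- Character set of the page; best placed early in the head -->
<META HTTP-EQUIV="content-type" CONTENT="text/html; charset=utf-8">

<!-- Language the page is written in -->
<META HTTP-EQUIV="content-language" CONTENT="en-gb">

<!-- After this date and time, clients should request a fresh copy -->
<META HTTP-EQUIV="expires" CONTENT="Sun, 10 Aug 2003 12:31:00 GMT">

</HEAD>
```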

One interesting header that has often been used is the refresh header. This has been used, for instance, when a web page has moved from one URL to another. Simply deleting a page from a location that users are used to visiting results in a 404 error; instead, you can automatically redirect the user by way of the refresh header. An example would be <META HTTP-EQUIV="refresh" CONTENT="10; URL=http://new.html">. The value 10 means that the refresh will occur after 10 seconds. The URL indicates to the client browser which page to go to. Note how a single pair of quote marks surrounds both the interval value and the URL, and that there is a space after the semicolon. Leaving out the semicolon and the URL would result in just the current page being refreshed, which is useful if the content on that page changes frequently. You can create a crude animation by cycling through a number of URLs that just contain graphics.
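
The two uses of the refresh header can be sketched as follows. The address www.example.com/new.html and the 30 second interval are placeholders made up for the illustration. A page would carry one of these tags or the other, not both.

```html
<!-- Alternative 1: after 10 seconds, send the visitor to the page's new address -->
<META HTTP-EQUIV="refresh" CONTENT="10; URL=http://www.example.com/new.html">

<!-- Alternative 2: no URL given, so reload the current page every 30 seconds -->
<META HTTP-EQUIV="refresh" CONTENT="30">
```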

If you did not use META tags then search engines would just grab the title and the first few words to use as a summary.

2.4 Body Tags

The <BODY> tags indicate where the viewable parts of the page begin and end. You can use attributes within the <BODY> tags to affect what happens to the whole of that web page. The following attributes can be used:

ALINK="..." - If you click and hold on a link, it will turn whatever colour you specify here.
VLINK="..." - If you have visited a particular link, it will turn whatever colour you specify here.
LINK="..." - All links appear in the colour specified here until they have been visited.
BGCOLOR="..." - Specifies the background colour for the page.
TEXT="..." - All text will appear in this colour unless overridden by style sheets or the <FONT> tags.
BACKGROUND="..." - Specifies an image to use as a background to the page. This image is tiled across the screen. There is an additional attribute, BGPROPERTIES="FIXED", supported by Internet Explorer, which keeps the background in the same place on the screen even when the page is scrolled.
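
As a sketch, several of these attributes can be set together in the opening <BODY> tag. The colour values and the image file name paper.gif are assumptions chosen for the illustration.

```html
<!-- White page, black text, blue links, purple visited links, red active links -->
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000" BACKGROUND="paper.gif">

<H1>Writing For The Web Tutorial</H1>

</BODY>
```

Setting BGCOLOR as well as BACKGROUND gives the page a sensible colour if the image fails to load.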

3. Manipulating Text
