The Semanticator

Avdi March 1, 2005 16 Comments

To all authors of weblogging/journaling tools, documentation engines, Wiki engines, etc.:

The <i> tag is a style tag. It conveys no semantic information. It could indicate emphasis, a quote, a citation, an address, or any number of other elements of text. As such, it is bad. As in the opposite of good. As in, your tool should never, ever, under any circumstances automatically generate or insert this markup without explicit instruction from the user. Even then, it should only do so grudgingly, and it should insult the user on general principle.

The <em> tag should always be used where emphasis is desired.

The same goes for <b> and <strong>. Just because most browsers interpret <strong> to mean bold doesn’t mean <b> is shorthand for strong.

Thank you for your cooperation.

16 Comments

secilliack March 1, 2005 at 20:28

what are these “tags” you speak of, and does one use them?
(i know, my stupidity is ridiculous)
1. secilliack March 1, 2005 at 20:28
 
 HOW does one use them.
 jesus.
 1. avdi March 1, 2005 at 20:56
 
 Tags are the basis of HTML, the language the web is written in. Your browser reads the tags and renders the text appropriately – e.g., if it comes across <em>OMG!</em>, what you’ll actually see is OMG! rendered with emphasis (usually italics). HTML tags can also be used when writing LJ entries and comments, which is how I did that example.
 
 Unfortunately, HTML has some older tags in it left over from the bad old days before they’d come up with better ways to style text, whose sole purpose is to explicitly say “this text should be italic” or “this text should be bold”. That doesn’t say anything about the actual content of the text. 99% of the time when people use <i> they really mean <em>. But software tools don’t make this any easier. Invariably, when some programmer comes out with a tool for, say, posting to livejournal, it has buttons for italic and bold instead of “emphasis” and “strong emphasis”. I hate this because it makes it impossible for software to understand the intent of the author, which makes it much harder to, for example, automatically speak the text such that a blind person can understand it.
 
 Sorry if this was more than you wanted to know…
 2. secilliack March 2, 2005 at 01:27
 
 awesome.
 i feel really stupid for having to ask, but not so much now that i know.
 thanks :)
 3. avdi March 1, 2005 at 21:02
 
 On a more practical note, some tags you may find useful and/or fun in livejournal:
 
 em – for emphasis
 strong – for more emphasis
 you can nest them for strong emphasis
 blockquote – for quoting lengthy passages, e.g.
 
 ‘Twas brillig and the slithy toves
 Did gyre and gimble in the wabe
 
 Just enclose the text you want to tag between a begin tag, like <em>, and an end tag, like </em> (note the slash).
spirilis March 1, 2005 at 21:31

Certainly a good point, I always forget about those ;)
Maybe I should bust out my HTML guide and reread it with a baseball bat to my head…
1. avdi March 1, 2005 at 21:48
 
 I can forgive users when they manually use the tags. It’s when tools generate them that it drives me up the wall. The particular straw that broke this camel’s back was a code documentation tool (RDoc, for Ruby) which formats _foo_ as <em>foo</em>, but formats *foo* as <b>foo</b>. It’s not even consistent in its tag misuse. And the documentation for the tool says to use underscores for “italic” text, even though it’s really inserting <em> tags. Gah! People who have never cracked the HTML spec should NOT write tools which generate it.
 
 I’ve never understood how tools written by programmers, for programmers, could be this wrong. As a programmer I think in semantic terms – when I scribble “do *not* press the red button” I’m not specifically imagining the word ‘not’ as bold or italic or underlined, I’m just trying to convey the importance of it. And I would expect a tool to format that information appropriately for the output medium, rather than depending on me for styling guidance.
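A minimal sketch, in Python, of the kind of conversion described here: it maps _foo_ to <em> and *foo* to <strong>, so the output records intent rather than styling. The function name to_semantic_html and the regular expressions are assumptions made for illustration; this is not how RDoc or any other real tool is implemented.

```python
import html
import re

# Hypothetical patterns (not taken from any real tool): a marker only counts
# when it hugs the text it wraps and is not glued to a neighbouring word, so
# snake_case names and spaced-out arithmetic like "2 * 3" are left alone.
EMPHASIS = re.compile(r"(?<!\w)_(?=\S)(.+?)(?<=\S)_(?!\w)")
STRONG = re.compile(r"(?<!\w)\*(?=\S)(.+?)(?<=\S)\*(?!\w)")


def to_semantic_html(text: str) -> str:
    """Turn _emphasis_ and *strong emphasis* into semantic HTML tags."""
    # Escape &, < and > first so the author's text cannot inject markup.
    escaped = html.escape(text, quote=False)
    # Emit <em>/<strong> (meaning), never <i>/<b> (appearance).
    with_em = EMPHASIS.sub(r"<em>\1</em>", escaped)
    return STRONG.sub(r"<strong>\1</strong>", with_em)


if __name__ == "__main__":
    print(to_semantic_html("do *not* press the red button"))
```

With this, "do *not* press the red button" becomes "do <strong>not</strong> press the red button", and the decision between bold, italics, or a change in speaking voice is left to the browser, stylesheet, or screen reader.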
 1. spirilis March 1, 2005 at 21:49
 
 Maybe the programmer was drunk ;)
 2. leleth_faery March 1, 2005 at 22:02
 
 I *hate* it when programs automatically convert things. If I want something bold, italic or underlined, I’ll tell it so!!!!
 
 If the point of an emphasis tag and a strong emphasis tag is for things like a computer to make audible a text, why won’t they be programmed to interpret italic and bold as such? what else are they used for? *thinks* I suppose some people will write everything in italics, and then it isn’t actually requiring emphasis, but that’s an anomaly, I’d think…
 3. spirilis March 1, 2005 at 22:10
 
 The argument dates back to earlier days of HTML. HTML is meant to be a logical structure-oriented language, rather than presentation-oriented. Boldface and Italic are presentation-oriented attributes, rather than indicators of structure. “Strong” and “Em(phasis)” are more logical attributes. I guess the original HTML designers got lazy and added Boldface, Italic, etc. in there for lack of thoroughness in the intended design of the language.
 4. avdi March 1, 2005 at 22:38
 
 It wasn’t the original HTML designers. In the early days of the web Netscape started extending HTML like crazy with markup that only worked in Navigator, in order to entice people to use Navigator.
 5. avdi March 1, 2005 at 22:34
 
 spirilis is correct in his history. Understand, though, I’m not talking about having ‘i’ tags magically converted to ‘em’ behind the scenes. I’m saying that when I’m writing something that I intend to go on the internet, I don’t WANT an ‘Italic’ button and a ‘Bold’ button like in Word (frankly I don’t want them in Word either, but that’s another story). I want “Em” and “Strong” buttons, because that’s what I’m actually thinking, and I don’t know how it’s going to be read on the other end. Likewise, if a program reformats text with conventions like _emphasis_ and *strong emphasis* into HTML, I want it to use semantic tags, not stylistic tags.
 
 If the point of an emphasis tag and a strong emphasis tag is for things like a computer to make audible a text, why won’t they be programmed to interpret italic and bold as such? what else are they used for?
 
 Off the top of my head, ‘cite’, ‘em’, and ‘address’ are all shown as italics by common browsers – but they mean completely different things! If we use ‘i’ for all of them it semantically impoverishes the text, not only making things harder for those using alternative readers, but reducing the opportunities for future programs to do clever things we might not have even thought of with existing pages. For example, here’s an automatically generated page on Mark Pilgrim’s blog which lists every citation he’s made, grouped by the person he cited. That’s not possible without rich semantic markup.
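To make the point concrete, here is a small Python sketch using the standard library's html.parser. It groups text by the inline tag that wraps it. The class and function names, and the sample sentences about "Alice", are made up for illustration and have nothing to do with how the page mentioned above is actually generated.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Only these tags are tracked in this sketch; everything else is ignored.
TRACKED = {"cite", "em", "address", "i"}


class TextByTag(HTMLParser):
    """Collect the text found inside each tracked tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []              # stack of tracked tags currently open
        self.found = defaultdict(list)   # tag name -> list of enclosed texts

    def handle_starttag(self, tag, attrs):
        if tag in TRACKED:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Pop back to (and including) the matching open tag.
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        if self.open_tags and data.strip():
            self.found[self.open_tags[-1]].append(data.strip())


def text_by_tag(markup: str) -> dict:
    parser = TextByTag()
    parser.feed(markup)
    parser.close()
    return dict(parser.found)


# Hypothetical sample text: the same sentence marked up two ways.
semantic = ("As <cite>Alice</cite> said, this is <em>important</em>. "
            "<address>1 Main St</address>")
stylistic = ("As <i>Alice</i> said, this is <i>important</i>. "
             "<i>1 Main St</i>")

if __name__ == "__main__":
    print(text_by_tag(semantic))
    print(text_by_tag(stylistic))
```

For the semantic version a program can pull out just the citation ("Alice"). For the stylistic version all three pieces of text land under ‘i’, and nothing is left to tell a citation from emphasis or an address.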
 
 For absolute, pinpoint control of style we have CSS. That’s why it was invented, to get style information back out of HTML.
 6. leleth_faery March 1, 2005 at 22:59
 
 wow. Thanks. that’s neat to know. I’m not sure it’ll drill the i, u, and b tags out of me, though, since I’ve been using those for something like 9 years now…
 7. avdi March 1, 2005 at 23:09
 
 I’m guilty myself of using ‘i’ to quote people, partly because some browsers aren’t even aware of the ‘q’ (quote) tag. It’s not so much individual users typing them out by hand that bugs me, as programmers who should know better creating tools which churn out misused tags.
der_m March 1, 2005 at 23:28

in my own journalling, i usually use italics for emphasis, or if you’d imagine i might lean in as i’m saying it. and i use bold when i might be growling or shouting something. that sorta fits <em> and <strong> rather nicely… but i’ve always mistaken <em> for whatever the tag is for “write this like you were a printer from the 80s” and never knew about <strong>.
danke :)
1. avdi March 1, 2005 at 23:40
 
 whatever the tag is for “write this like you were a printer from the 80s”
 
 That would be tt. (For TeleType! How could a good UNIX boy not know the markup for teletype? ;-) )
