Currently, the World Wide Web is based primarily on documents written in HTML, a language that is useful for describing, with an emphasis on visual presentation, a body of structured text interspersed with multimedia objects such as images and interactive forms. HTML has limited ability to classify the blocks of text on a page, apart from the roles they play in a typical document’s organization and in the desired visual layout.
For example, with HTML and a tool to render it (perhaps Web browser software, perhaps another user agent), one can create and present a page that lists items for sale. The HTML of this catalog page can make simple, document-level assertions such as “this document’s title is ‘Widget Superstore’”. But there is no capability within the HTML itself to unambiguously assert that, say, item number X586172 is an Acme Gizmo with a retail price of €199, or that it is a consumer product. Rather, HTML can only say that the span of text “X586172” is something that should be positioned near “Acme Gizmo” and “€199”, etc. There is no way to say “this is a catalog” or even to establish that “Acme Gizmo” is a kind of title or that “€199” is a price. There is also no way to express that these pieces of information are bound together in describing a discrete item, distinct from other items perhaps listed on the page.
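The limitation described above can be seen in a minimal sketch. The catalog markup below is hypothetical (the table layout and tag choices are assumptions made for illustration, not taken from any real page), and the parser is Python's standard `html.parser`. Apart from the document title, every value a program recovers is just text inside a table cell, with nothing to mark one as an item number, another as a name, and another as a price.

```python
from html.parser import HTMLParser

# Hypothetical catalog markup; the tags and layout are assumptions for illustration.
CATALOG_HTML = """
<html>
  <head><title>Widget Superstore</title></head>
  <body>
    <table>
      <tr><td>X586172</td><td>Acme Gizmo</td><td>€199</td></tr>
    </table>
  </body>
</html>
"""


class TextCollector(HTMLParser):
    """Collects (enclosing tag, text) pairs, which is all the markup offers."""

    def __init__(self):
        super().__init__()
        self._stack = []   # currently open tags, innermost last
        self.spans = []    # (tag, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace-only runs between tags
            enclosing = self._stack[-1] if self._stack else None
            self.spans.append((enclosing, text))


def collect_spans(html):
    """Return every non-empty text span together with its enclosing tag."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.spans


spans = collect_spans(CATALOG_HTML)
for tag, text in spans:
    print(tag, text)
```

The title is the only span whose role the markup identifies. The three `td` spans are indistinguishable in kind, and any program that treats the third cell as a price is relying on a convention outside the HTML.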
The Semantic Web addresses this shortcoming, using the descriptive technologies RDF and OWL, and the data-centric, customizable markup language XML. These technologies are combined in order to provide descriptions that supplement or replace the content of Web documents. Thus, content may manifest as descriptive data stored in Web-accessible databases, or as markup within documents (particularly, in XHTML interspersed with XML, or, more often, purely in XML, with layout/rendering cues stored separately). The machine-readable descriptions allow content managers to add meaning to the content, thereby facilitating automated information gathering and research by computers.
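As a rough illustration of what such machine-readable descriptions look like, the sketch below models the catalog item as subject-predicate-object statements, the basic shape of RDF data. It is a simplified stand-in written in plain Python rather than a real RDF toolkit. The `ex:` property and class names are a hypothetical vocabulary, and the second item is invented purely to show that items stay distinct.

```python
# Each statement is a (subject, predicate, object) triple, the shape RDF uses.
# The "ex:" names are a hypothetical vocabulary invented for this sketch,
# and item ex:Y000001 is made up to show a second, separate item.
TRIPLES = [
    ("ex:X586172", "rdf:type", "ex:ConsumerProduct"),
    ("ex:X586172", "ex:title", "Acme Gizmo"),
    ("ex:X586172", "ex:priceAmount", 199),
    ("ex:X586172", "ex:priceCurrency", "EUR"),
    ("ex:Y000001", "rdf:type", "ex:ConsumerProduct"),
    ("ex:Y000001", "ex:title", "Acme Widget"),
]


def describe(triples, subject):
    """Gather every property asserted about one subject into a dict."""
    return {p: o for s, p, o in triples if s == subject}


def instances_of(triples, cls):
    """List the subjects declared to be members of the given class."""
    return [s for s, p, o in triples if p == "rdf:type" and o == cls]


gizmo = describe(TRIPLES, "ex:X586172")
products = instances_of(TRIPLES, "ex:ConsumerProduct")
print(gizmo)
print(products)
```

Because every statement names its subject, the title and price are explicitly bound to item X586172 and kept apart from other items, and a program can ask for all consumer products without guessing from layout.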
Text based on the Wikipedia article on the Semantic Web, released under the GNU Free Documentation Licence.