Chapter 3: Introduction to Technological Aspects of Privacy
Web Infrastructure: HTTP, HTML, HTTPS and XML
The web is only one part of the internet. Hypertext transfer protocol (HTTP) and Hypertext markup language (HTML), both invented by Tim Berners-Lee, drive the web; HTTPS adds encryption, and Extensible markup language (XML) describes the data itself rather than how it is displayed.
The internet is broader than the web: it also carries email, IP telephony, file sharing, and IoT traffic. Hypertext transfer protocol (HTTP) manages data communications on the web and defines how browsers and servers respond to commands, while Hypertext markup language (HTML) is the language in which web pages are authored. Sir Tim Berners-Lee invented both at CERN around 1990, which made hyperlinking between documents possible.
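To make "defines browser/server behavior" concrete, the following is a minimal sketch of the text messages HTTP exchanges. It opens no network connection. The helper names (build_get_request, parse_status_line), the host example.org, and the sample response are illustrative assumptions, not part of the chapter.

```python
from typing import Tuple


def build_get_request(host: str, path: str = "/") -> str:
    """Format a minimal HTTP/1.1 GET request, roughly what a browser sends."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, resource, protocol version
        f"Host: {host}",         # the Host header is required in HTTP/1.1
        "Connection: close",
        "",                      # a blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response: str) -> Tuple[str, int, str]:
    """Split the first line of a server response into version, code, and reason."""
    status_line = response.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


# Hypothetical exchange; the response text is hard-coded for illustration.
request = build_get_request("example.org", "/index.html")
sample_response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
version, code, reason = parse_status_line(sample_response)

print(request)
print(version, code, reason)
```

Note that everything here travels as readable text. Plain HTTP applies no encryption, so anyone positioned on the network path can read such messages.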
HTTPS encrypts the connection between browser and website; by 2016, HTTPS traffic had overtaken unencrypted HTTP traffic. Extensible markup language (XML) also uses tags, but where HTML describes how content is displayed, XML describes the data being produced. Because this makes the data easy to process automatically and in high volumes, XML calls for extra privacy attention.
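The contrast between display markup and data markup can be sketched with Python's standard xml.etree.ElementTree module. The tag names (customers, customer, name, email), the sample records, and the extract_emails helper are invented for illustration only.

```python
import xml.etree.ElementTree as ET

# HTML marks up presentation: bold, italic, paragraph. Nothing in the tags
# says which piece of text is a name and which is an email address.
HTML_SNIPPET = "<p><b>Jane Doe</b> <i>jane@example.org</i></p>"

# XML marks up meaning: each tag names the data item it contains.
XML_RECORDS = """
<customers>
  <customer>
    <name>Jane Doe</name>
    <email>jane@example.org</email>
  </customer>
  <customer>
    <name>John Roe</name>
    <email>john@example.org</email>
  </customer>
</customers>
"""


def extract_emails(xml_text: str) -> list:
    """Collect every <email> value from a <customers> document."""
    root = ET.fromstring(xml_text.strip())
    return [customer.findtext("email") for customer in root.iter("customer")]


emails = extract_emails(XML_RECORDS)
print(emails)
```

The same few lines would work unchanged on a file holding millions of records, which is the kind of automated, high-volume processing that makes XML-tagged personal data a privacy concern.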
⚠️ HTML vs XML
Exam trap: HTML describes how content should be displayed; XML describes the data itself. Conflating them is a common error.
Key terms - quick answers
What is “Hypertext transfer protocol (HTTP)”?
An application protocol that formats and transmits messages over a TCP/IP network and defines how servers and browsers respond to commands.
What is “Hypertext markup language (HTML)”?
A content-authoring language used to create web pages; HTML5 is the most recent major version and can play audio and video without plug-ins.
What is “HTTPS”?
Hypertext transfer protocol secure, which transfers data between browser and website over an encrypted connection; by 2016, HTTPS traffic exceeded unencrypted HTTP traffic.
What is “Extensible markup language (XML)”?
A language that describes content in terms of the data being produced (not how it is displayed), enabling automated high-volume processing.