;;; site/comp/uri-ns.scm
(import "util.scm")
(define title "A Draft Specification for Decentralised, URI-based Namespaces")
`(,@(import "header.scm")
(p "Author: snit")
(p "Date: 2025-08")
(p)
(h1 ,title)
(p)
(h2 "Abstract")
(p "This specification defines a new Uniform Resource Identifier (URI) scheme, ns, as an alternative to Uniform Resource Names (URNs). The intent of the new scheme is to provide a decentralised means of identifying namespaced resources. This document details the canonical syntax of the scheme, and the means by which namespaces may be defined and used.")
(p)
(h2 "Table of Contents")
(ul (li "1. Introduction")
(li "2. Terminology")
(li "3. Grammar")
(li "3.1. Syntax Components")
(li "4. URI Equality")
(li "5. Namespace Shorthand")
(li "6. Security Considerations")
(li "7. Further Considerations")
(li "7.1. Expanded Location")
(li "7.2. Nested Namespaces")
(li "8. Examples")
(li "9. References")
(li "10. Acknowledgements"))
(p)
(h2 "1. Introduction")
(p "A Uniform Resource Name (URN) is a Uniform Resource Identifier (URI) designed to be a persistent, location-independent resource identifier. Unfortunately, URNs rely heavily on a centralised namespace registry to consistently group resources and prevent conflicts. As the specification defines no means of creating new namespaces other than registering with a specific centralised authority, URNs are limited to organisations with the resources and patience to go through the proper registration procedures. As a result, only around one hundred namespaces are currently registered, many of which have very little use outside of specific organisations or fields, which may have contributed to URNs' failure to gain widespread adoption.")
(p)
(p "As an alternative, many areas in which namespaces are used, such as XML namespaces, rely on HTTP URIs like http://www.w3.org/1999/xhtml. This is an unsatisfactory solution, and often leads to confusion for beginners. When first learning how namespaces function, many implicitly assume that all HTTP URIs resolve to a specific resource, and carry that assumption over to the URIs used in namespaces. Alas, the URI may be anything whatsoever, and it need not resolve to anything at all. Ideally, there would exist a URI scheme which identifies namespaced resources in the same decentralised manner as HTTP URIs, whilst carrying no implication of resolvability. This is where the namespace URI scheme comes in.")
(p)
(p "The namespace URI scheme sacrifices the location-independent, and, indeed, temporal-independent, nature of URNs in favour of decentralised namespacing. With namespace URIs, users point to a location they control on a given date, allowing them to specify any arbitrary namespace they would like, just as with the HTTP URIs from earlier, and without the need for a central registry of namespaces. For example, the following URI could theoretically be used to identify the current section of this document:")
(pre "ns:isekai.rocks:202508:ns:introduction")
(p)
(h2 "2. Terminology")
(q "The key words \"MUST\", \"MUST NOT\", \"REQUIRED\", \"SHALL\", \"SHALL NOT\", \"SHOULD\", \"SHOULD NOT\", \"RECOMMENDED\", \"NOT RECOMMENDED\", \"MAY\", and \"OPTIONAL\" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.")
(p)
(h2 "3. Grammar")
(p "The syntax for a namespace URI is defined via the following ABNF grammar:")
(pre "nsURI = \"ns:\" location \":\" date \":\" namespace [ \":\" resource ]"
""
"location = host ; see: RFC3986"
"date = year [ month [ day ] ]"
"namespace = 1*ipchar ; see: RFC3987"
"resource = 1*( ipchar / \"/\" / \"?\" ) ; see: RFC3987"
""
"year = 4DIGIT"
"month = 2DIGIT"
"day = 2DIGIT")
(p)
(p "The location is case-insensitive, but the namespace and resource components are not. A pair of namespaces or resources that differ only in capitalisation MUST NOT be considered equal.")
(p)
(p "The location and date MUST consist of ASCII characters; internationalised domain names MUST be Punycode-encoded. The namespace and resource MAY be percent-encoded.")
(p)
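(p "As a non-normative illustration, the grammar above can be approximated by a short Python sketch. The names NsUri and parse_ns_uri are hypothetical and not part of this specification. The sketch assumes that the location is a registered name or IPv4 address (bracketed IP literals are not handled) and that the namespace itself contains no colon, and it checks only the overall shape of each component rather than the full RFC3986 and RFC3987 character rules.")
(pre "from typing import NamedTuple, Optional"
""
""
"class NsUri(NamedTuple):"
"    location: str"
"    date: str"
"    namespace: str"
"    resource: Optional[str]"
""
""
"def parse_ns_uri(uri):"
"    # Split into at most five parts so that colons inside the"
"    # resource (as used by nested namespaces) are left untouched."
"    parts = uri.split(':', 4)"
"    # ABNF string literals match case-insensitively, so accept NS: too."
"    if len(parts) < 4 or parts[0].lower() != 'ns':"
"        raise ValueError('not a namespace URI: ' + uri)"
"    location, date, namespace = parts[1:4]"
"    resource = parts[4] if len(parts) == 5 else None"
"    # date = year [ month [ day ] ], i.e. 4, 6 or 8 ASCII digits"
"    if len(date) not in (4, 6, 8) or not (date.isascii() and date.isdigit()):"
"        raise ValueError('malformed date: ' + date)"
"    if not location or not namespace or resource == '':"
"        raise ValueError('empty component in: ' + uri)"
"    return NsUri(location, date, namespace, resource)")
(p "For example, parse_ns_uri('ns:example.com:2025:email:user@example.com:2025:myns') yields the location example.com, the date 2025, the namespace email, and the resource user@example.com:2025:myns.")
(p)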
(h3 "3.1. Syntax Components")
(p "In general, the namespace URI consists of the following components. Details on each are below.")
(pre "ns:<location>:<date>:<namespace>[:<resource>]")
(p)
(p "The location consists of a host component as defined in RFC3986. When creating a new namespace, the author MUST own the location it falls under, or have prior permission from those that do.")
(p)
(p "The date component MUST refer to a period of time in which the namespace author owns/owned, or had permission from the owners of, the location component. The date MAY be in the past, but MUST NOT be in the future. Date components refer to the entire time range for which they are specified; that is, a date of 20250828 refers to the entire day of August 28th, 2025, 202508 refers to the entire month, and 2025 refers to the entire year. Less specific dates SHALL NOT be considered equal to more specific dates that fall within their range.")
(p)
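(p "The range semantics of the date component can be sketched in Python as follows. This is a minimal, non-normative sketch: date_range and is_not_future are hypothetical names, the Gregorian calendar is assumed, and the caller supplies the value of today because this specification does not name a time zone. Since less specific dates are never equal to more specific ones, comparing the date components as plain strings is sufficient for equality; the range is only needed for checks such as the one against future dates.")
(pre "import calendar"
"from datetime import date"
""
""
"def date_range(component):"
"    # Return the first and last day covered by a date component."
"    year = int(component[0:4])"
"    if len(component) == 4:"
"        return date(year, 1, 1), date(year, 12, 31)"
"    month = int(component[4:6])"
"    if len(component) == 6:"
"        last_day = calendar.monthrange(year, month)[1]"
"        return date(year, month, 1), date(year, month, last_day)"
"    day = date(year, month, int(component[6:8]))"
"    return day, day"
""
""
"def is_not_future(component, today):"
"    # Assumption: a date is acceptable once its range has begun."
"    # Whether the whole range must already have elapsed is not"
"    # settled by the text above."
"    start, _ = date_range(component)"
"    return start <= today")
(p "Under these assumptions, date_range('202508') returns the pair of 1 August 2025 and 31 August 2025, and an impossible month or day raises ValueError.")
(p)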
(p "The namespace is a collection of common resources. The namespace MAY be identified directly, by excluding the resource component. The namespace SHALL be unique for a given location and date; if changes to its definition are needed, at least one of the location, date, or namespace MUST be altered.")
(p)
(p "The resource is a specific object or concept identified within a namespace. The expected structure of a resource for a given namespace is not specified here. The structure SHALL be specified when the namespace is created. So long as the URI remains the same, the resource identified by the URI MUST NOT change.")
(p)
(h2 "4. URI Equality")
(p "Two namespace URIs are considered equal if and only if they consist of the same codepoints when using the same character encodings, after the location and date are converted to lowercase, and after the URIs are percent-encoded.")
(p)
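(p "One possible reading of this rule is sketched below in Python; it is not a definitive implementation. The names SAFE, normalise and ns_uri_equal are hypothetical. The sketch assumes that the location contains no colon, that percent-encoding means writing every code point outside ASCII as UTF-8 octets, and that the hexadecimal digits of existing escapes compare equal regardless of case. It relies on urllib.parse.quote from the Python standard library.")
(pre "import re"
"from urllib.parse import quote"
""
"# Characters that quote() must leave alone: the percent sign of"
"# existing escapes plus the delimiters allowed by the grammar."
"SAFE = \"%:/?@!$&'()*+,;=\""
""
""
"def normalise(uri):"
"    scheme, location, rest = uri.split(':', 2)"
"    # Only the location is case-insensitive; the date is all digits."
"    head = scheme + ':' + location.lower() + ':'"
"    # Percent-encode anything outside the safe set as UTF-8 octets."
"    encoded = quote(head + rest, safe=SAFE)"
"    # Upper-case the hex digits of every escape before comparing."
"    return re.sub('%[0-9A-Fa-f]{2}', lambda m: m.group(0).upper(), encoded)"
""
""
"def ns_uri_equal(a, b):"
"    return normalise(a) == normalise(b)")
(p "With this sketch, ns:Example.COM:2025:docs and ns:example.com:2025:docs compare equal, whereas ns:example.com:2025:Docs and ns:example.com:2025:docs do not, because the namespace is case-sensitive.")
(p)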
(h2 "5. Namespace Shorthand")
(p "Namespace URIs may get long and unwieldy quickly, especially when nested (see: Section 7.2). Users and software MAY define shorthand for a given namespace. The shorthand SHOULD be of the form:")
(pre "ns:<identifier>[:<resource>]")
(p)
(p "The mechanism by which such shorthand is defined in a given context is unspecified. URIs which use shorthand forms SHOULD be canonicalised before exiting the current context, such as when being transmitted over the network, or passed into external software. Shorthand forms SHALL NOT be assumed to be unique. If two or more namespaces are found to use the same shorthand in a given context, the shorthand(s) MUST be altered to be unambiguous.")
(p)
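(p "Because the mechanism is left unspecified, the following Python sketch shows only one way a context might manage shorthand. The class ShorthandTable and its methods define and expand are hypothetical. The sketch assumes that a shorthand is recognised by looking up the first component after the scheme in a context-local table, and it repeats the expansion so that shorthand defined in terms of other shorthand is fully canonicalised.")
(pre "class ShorthandTable:"
"    def __init__(self):"
"        self._by_identifier = {}"
""
"    def define(self, identifier, namespace_uri):"
"        # Refuse to bind one identifier to two different namespaces,"
"        # since shorthand must stay unambiguous within a context."
"        known = self._by_identifier.get(identifier)"
"        if known is not None and known != namespace_uri:"
"            raise ValueError('ambiguous shorthand: ' + identifier)"
"        self._by_identifier[identifier] = namespace_uri"
""
"    def expand(self, uri, max_depth=16):"
"        # Canonicalise before the URI leaves the current context."
"        for _ in range(max_depth):"
"            parts = uri.split(':', 2)"
"            if len(parts) < 2 or parts[1] not in self._by_identifier:"
"                return uri"
"            uri = self._by_identifier[parts[1]]"
"            if len(parts) == 3:"
"                uri += ':' + parts[2]"
"        raise ValueError('shorthand nesting too deep: ' + uri)")
(p "For instance, after define('email', 'ns:example.com:2025:email'), calling expand('ns:email:user@example.com:2025:myns') returns ns:example.com:2025:email:user@example.com:2025:myns. The depth limit is a design choice of the sketch that guards against shorthand definitions which refer to each other in a cycle.")
(p)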
(h2 "6. Security Considerations")
(p "No means is provided by which a given namespace can be proven to be valid for the given location and date components. Do your research, and prefer namespaces from trusted parties when possible.")
(p)
(p "The lack of central authority on namespaces means that it is up to the users of a namespace to use it correctly and accurately. It is not hard at all to use a namespace incorrectly, whether by ignorance or malice.")
(p)
(h2 "7. Further Considerations")
(p "This specification is a draft. Although it can be used in its current state, a few open questions still need to be settled before it can be considered complete.")
(p)
(h3 "7.1. Expanded Location")
(p "One limitation of this specification is that the location component is restricted to domain names and literal IP addresses. Should this be expanded? Not everyone has a domain, or a static IP address, under which they can create the namespaces they might need. It might be worth expanding the location to include one or more of email, JID, UUID, or PGP key. The three other than UUID would allow namespaces to be created under a unique user identity, although email and JID would have to be disambiguated if both were included. The UUID is unwieldy for human use, but perhaps it could be useful in the automated creation of ephemeral namespaces; however, the use-case for this is as yet unclear to the author of this document.")
(p)
(p "Even if more options are provided, such options are not defined in a decentralised manner. Each addition would have to be added to this specification manually, with the approval of a central party. This is certainly not an ideal situation, but, if location extensibility is found to be sought after, perhaps Section 7.2 could help.")
(p)
(h3 "7.2. Nested Namespaces")
(p "Section 7.1 of this specification demonstrates how the location component is currently static or limited to centralised extensions. One possible way to work around this is by defining namespaces whose resource component is itself a namespace URI, minus the scheme. For example, if a user wishes to create namespaces under their email address, they could define the following namespace:")
(pre "ns:example.com:2025:email")
(p "Which expects a resource formatted the same as the usual namespace URI, except where the location is an email address instead of a domain/IP:")
(pre "ns:example.com:2025:email:user@example.com:2025:myns")
(p)
(p "One downside of this mechanism is that someone might define a namespace which allows the use of XMPP JIDs, but do so under their email address. Such scenarios quickly get out of hand, even when intentionally trying to keep the URI short:")
(pre "ns:example.com:2025:email:alice@example.com:2025:xmpp:bob@example.com:2025:myns")
(p)
(p "One workaround to make human usage more convenient is the shorthand notation demonstrated in Section 5. If the following mappings are defined:")
(pre "ns:example.com:2025:email -> email"
"ns:email:alice@example.com:2025:xmpp -> xmpp")
(p "Then the earlier example shrinks to become")
(pre "ns:xmpp:bob@example.com:2025:myns")
(p "Which is less than half its original length.")
(p)
(p "The biggest downside of the shorthand approach is that each shorthand has to be manually defined for every context, and possibly collectively agreed upon to be defined by default for the most common subset. No such defaults will be defined in this document.")
(p)
(h2 "8. Examples")
(p "TODO: Provide some example use-cases to actually justify this document's existence")
(p)
(h2 "9. References")
(a "https://www.rfc-editor.org/bcp/bcp14" "BCP 14")
(a "https://www.rfc-editor.org/rfc/rfc2119" "RFC2119 - Key words for use in RFCs to Indicate Requirement Levels")
(a "https://www.rfc-editor.org/rfc/rfc5234" "RFC5234 - Augmented BNF for Syntax Specifications: ABNF")
(a "https://www.rfc-editor.org/rfc/rfc3986" "RFC3986 - Uniform Resource Identifier (URI): Generic Syntax")
(a "https://www.rfc-editor.org/rfc/rfc3987" "RFC3987 - Internationalized Resource Identifiers (IRIs)")
(a "https://www.rfc-editor.org/rfc/rfc8141" "RFC8141 - Uniform Resource Names (URNs)")
(a "https://www.rfc-editor.org/rfc/rfc8174" "RFC8174 - Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words")
(p)
(h2 "10. Acknowledgements")
(p "The majority of this specification would not have been possible without the effort put into previous RFCs by far smarter people. In particular, RFC4151, which covers a use-case somewhat similar to that of this document, and which was only discovered halfway through writing, was a massive source of inspiration for the finer details of this specification.")
(a "https://www.rfc-editor.org/rfc/rfc4151" "RFC4151 - The 'tag' URI Scheme"))