diff options
author | Michał Górny <mgorny@gentoo.org> | 2017-09-14 23:14:39 +0200 |
---|---|---|
committer | Ulrich Müller <ulm@gentoo.org> | 2017-10-09 12:08:51 +0200 |
commit | c6fe2071a2e83be2203196ad7f9459941821a034 (patch) | |
tree | d81e1d9898c05917e05203af9803b581dff0d915 /glep-0068.rst | |
parent | glep-0045: Mark Final since GLEP 1 now uses ISO 8601 dates (diff) | |
download | glep-c6fe2071a2e83be2203196ad7f9459941821a034.tar.gz glep-c6fe2071a2e83be2203196ad7f9459941821a034.tar.bz2 glep-c6fe2071a2e83be2203196ad7f9459941821a034.zip |
Rename all GLEPs to .rst
Diffstat (limited to 'glep-0068.rst')
-rw-r--r-- | glep-0068.rst | 520 |
1 files changed, 520 insertions, 0 deletions
diff --git a/glep-0068.rst b/glep-0068.rst new file mode 100644 index 0000000..36f3dff --- /dev/null +++ b/glep-0068.rst @@ -0,0 +1,520 @@ +GLEP: 68 +Title: Package and category metadata +Version: $Revision$ +Last-Modified: $Date$ +Author: Michał Górny <mgorny@gentoo.org> +Status: Final +Type: Standards Track +Content-Type: text/x-rst +Created: 14-Mar-2016 +Post-History: 16-Mar-2016 +Replaces: 34, 46, 56 +Requires: 67 + +Abstract +======== + +This GLEP specifies the format of files used to describe category and package +metadata (``metadata.xml``). + + +Motivation +========== + +At the moment of writing this GLEP, category and package ``metadata.xml`` +lacked proper specification. PMS Appendix A [#PMS-A]_ specified that +the format of this file is beyond its scope, deferring the specification +to the DTD file. + +The original metadata.dtd file [#METADATA-DTD]_ (the version before cleanups +related to this spec) did not serve well as the specification. Due to +the technical limitations on DTD format, it was both unable to enforce +the specification fully and explain it in a readable form. Furthermore, +it lacked some important details such as the format of ``<pkg/>`` entries. + +Besides that, there were numerous alterations to the format. GLEP 34 added +metadata files for category descriptions, GLEP 46 added upstream information, +GLEP 56 added USE flag descriptions, GLEP 67 altered the maintainer +descriptions. Furthermore, there were additions and removals done without +a formal specification, e.g. addition of slot descriptions. + +Sadly, some of those GLEPs are partially in conflict with other specifications +— for example, the ``<pkg/>`` element as described in GLEP 56 is different +than the one originally proposed and used in metadata.xml. + +Therefore, the motivation for this GLEP is to provide unified, clear +and complete specification for both category-wide and package-wide +metadata.xml files. It is meant to combine previous GLEPs, relevant +discussions and implementation in order to provide the specification that is +closest to the originally intended meaning while preserving best compatibility +with existing tools and data. + + +Specification +============= + +Metadata files +-------------- + +This specification provides two kinds of metadata files: category metadata +files and package metadata files. Both kinds of files use XML file format +with structure defined in this GLEP. The XML structure does not use +a namespace and must not contain any elements outside the scope of this +specification. + +Category metadata files are named ``metadata.xml`` and located inside category +directories in an ebuild repository. Their structure is described +in `Category metadata`_ section. + +Package metadata files are named ``metadata.xml`` and located inside package +directories in an ebuild repository. Their structure is described +in `Package metadata`_ section. + +Text data +--------- + +The following text data types are used: +- text data, +- multi-line text data. + +In case of text data, all whitespace inside the element is normalized +(consecutive whitespace sequences are replaced by a single SP). Trailing +and leading whitespace is stripped. + +In case of multi-line text data, all whitespace except for newline characters +is normalized. Newlines are used to delimit lines of text. Leading +and trailing lines of text that are either empty or consist purely of +whitespace are stripped. Afterwards, the whitespace belonging to +the indentation common to all non-empty lines of text is stripped. + +Optionally, interspersing text with ``<cat/>`` and ``<pkg/>`` elements can be +allowed. In this case, ``<cat/>`` element is used to reference a category +inside the repository, and must contain a valid category name. ``<pkg/>`` +is used to reference a package, and must contain a valid qualified package +name. + +Common attributes +----------------- + +The following common attributes are allowed on multiple elements: +- language specifiers, +- restriction specifiers. + +Language specifiers are used whenever an element supports variants +in different languages. In this case, each occurrence of the element may +contain an optional ``lang=""`` attribute that contains a ISO 639-1 language +code. In case no ``lang=""`` attribute is provided, an implicit default +of ``en`` is assumed. + +Restriction specifiers are used whenever an element supports restricting to +specific package versions. In this case, each occurence of the element may +contain an optional ``restrict=""`` attribute that contains an EAPI 0 +dependency specification that has to match one or more versions of the +package. In this case, the metadata provided by the element applies only to +the package versions matching the restriction. + +Category metadata +----------------- + +The category metadata file uses ``<catmetadata/>`` top-level element. This +element can contain, in any order: + +- zero or more ``<longdescription/>`` elements containing category + descriptions in different languages (at most one for each language). + The category description is formed of multi-line text, optionally + interspersed with ``<cat/>`` and ``<pkg/>`` elements. + +Package metadata +---------------- +Top-level structure +~~~~~~~~~~~~~~~~~~~ +The package metadata file uses ``<pkgmetadata/>`` top-level element. This +element can contain, in any order: + +- zero or more ``<longdescription/>`` elements containing package descriptions + in different languages, possibly restricted to specific package versions + (at most one for each combination of language and package version). + The package description is formed of multi-line text, optionally + interspersed with ``<cat/>`` and ``<pkg/>`` elements. + +- zero or more ``<maintainer/>`` elements listing package maintainers, + optionally restricted to specific package versions. The maintainer format + is detailed in `Maintainer descriptions`_. + +- zero or more ``<slots/>`` elements containing slot descriptions in different + languages (at most one for each language), as detailed + in `Slot descriptions`_. + +- zero or more ``<use/>`` elements containing USE flag descriptions + in different languages (at most one for each language), as detailed + in `USE flag descriptions`_. + +- at most one ``<upstream/>`` element providing information on upstream + of the package, as detailed in `Upstream descriptions`_. + +Maintainer descriptions +~~~~~~~~~~~~~~~~~~~~~~~ +Each ``<maintainer/>`` element describes a single maintainer. + +The ``<maintainer/>`` element has an obligatory ``type=""`` attribute whose +value can be either ``person`` or ``project``. + +The ``<maintainer/>`` element contains the following elements, in any order: + +- exactly one ``<email/>`` element that contains the maintainer's e-mail + address (used as unique identifier), + +- at most one ``<name/>`` element that contains the maintainer's + human-readable name (real name or nickname), + +- zero or more ``<description/>`` elements that explain the role + of the maintainer in different languages (at most one ``<description/>`` + for each language). + +Slot descriptions +~~~~~~~~~~~~~~~~~ +Each ``<slots/>`` element describes slots of a package (in specific language). + +The ``<slots/>`` element can contain the following elements: + +- zero or more ``<slot/>`` elements describing specific ebuild slots + (at most one for each slot name). + The ``<slot/>`` element contains an obligatory ``name=""`` attribute stating + the slot to which the description applies, and contains slot description as + text. Alternatively, a slot name of ``*`` can be used to indicate a single + description applying to all slots (no other ``<slot/>`` elements may be used + in this case). + +- at most one ``<subslots/>`` element describing the role of subslots (all + of them) as text. + +USE flag descriptions +~~~~~~~~~~~~~~~~~~~~~ +Each ``<use/>`` element describes USE flags of a package (in specific +language). + +The ``<use/>`` element can contain the following elements: + +- zero or more ``<flag/>`` elements describing specific USE flags, optionally + restricted to specific package versions (at most one entry for a combination + of USE flag name and package version). The ``<flag/>`` element contains + an obligatory ``name=""`` attribute stating the name of the USE flag to + which the description applies, and contains text, optionally interspersed + with ``<cat/>`` and ``<pkg/>`` elements. + +Upstream descriptions +~~~~~~~~~~~~~~~~~~~~~ +The ``<upstream/>`` element provides information on the upstream of a package. +It contains the following elements: + +- zero or more ``<maintainer/>`` elements listing package's upstream + maintainers, as described in `Upstream maintainer descriptions`_, + +- at most one ``<changelog/>`` element containing URL to an on-line copy + of upstream changelog, + +- zero or more ``<doc/>`` elements containing URLs to on-line copies + of upstream documentation in different languages (at most one for each + language), + +- at most one ``<bugs-to/>`` element containing upstream bug reporting URL, + that can optionally be a ``mailto:`` URL, + +- zero or more ``<remote-id/>`` elements listing package identities on package + identification trackers. Each of those elements has an obligatory + ``type=""`` attribute that matches a pre-defined name of package + identification tracker, and a value that is an identifier specific to + the tracker. The list of available trackers and their specific identifiers + are outside scope of this specification. + +Upstream maintainer descriptions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Each ``<maintainer/>`` element inside ``<upstream/>`` describes a single +upstream maintainer. + +The ``<maintainer/>`` element has an optional ``status=""`` attribute whose +value can be either ``active`` or ``inactive``. If not specified, an implicit +``unknown`` value is assumed. + +The ``<maintainer/>`` element has the following attributes, in any order: + +- at most one ``<email/>`` element that contains the maintainer's e-mail + address, + +- exactly one ``<name/>`` element that contains the maintainer's + human-readable name (real name or nickname). + + +Rationale +========= + +Information sources +------------------- + +The basic source of information on current metadata.xml format was +``metadata.dtd`` as of 2016-03-02 [#ORIGINAL-METADATA-XML]_. Whenever the DTD +was unclear, appropriate GLEPs were referenced in order to deduce the original +intent. Whenever the GLEPs were unclear or the elements missed GLEPs, original +mailing list discussions were referenced. + +Removed elements +---------------- + +Compared to the original DTD, the following elements were removed (both +in the spec and in the updated DTD file): + +- package-scope ``<changelog/>`` element was removed. It dates back to the + original metadata.xml proposal [#ORIGINAL-METADATA-XML]_ but it was never + implemented — instead, plain text ChangeLogs were used. Furthermore, + GLEP 46 introduced ``<changelog/>`` inside ``<upstream/>`` with + different type which collided with the global declaration due to DTD + limitations. + +- package-scope ``<natural-name/>`` element was removed. It was available for + 1.5yr and after that time, it reached four packages providing it and no + known tool supporting/using it. It was used only to provide a copy of + package name with correct case (e.g. libressl -> LibreSSL), therefore + the information provided by it was considered redundant. + +- top-level ``<packages/>`` variant was removed. It was never used and it was + really unclear what its use would be. In any case, this made the DTD + simpler. + +<pkg/> value format +------------------- + +A debate on valid format of ``<pkg/>`` element values preceded the writing of +this GLEP. The DTD did not specify a value format restriction on this, only +suggested that it is used *for cross-linking*. Further on, GLEP 56 redefined +its value to *a valid CP or CPV*. The practical uses did not include +the latter case; however, it was common to include EAPI 1 slot specifiers or +even EAPI 5 slot operators following the qualified package names. + +After finding the Doug Goldstein's blog post on introduction of <pkg/> +elements [#USE-FLAG-METADATA]_, it turned out that the original intent was to +*allow cross-linking/referencing from packages.gentoo.org*. Since the latter +uses qualified package names as identifiers, it was decided to restrict +``<pkg/>`` elements to reference those. For entries that include slot +specifiers, it is recommended to move the slot specifiers out of ``<pkg/>`` +element. + +Language identifiers +-------------------- + +Originally, the DTD used implicit default value of ``C``. However, this value +was not in line with real language specifiers found in ``metadata.xml``. +The latter usually took form of ISO 639-1 language codes which do not form +a valid (complete) locale identifiers, while the former is not a valid +language identifier in any of the considered standards. Furthermore, since +``en`` was commonly used to identify English in metadata.xml files, +and no tools relied on the implicit default defined in the DTD, it was decided +to change the implicit default to ``en``. + +Package restrictions +-------------------- + +Originally, the DTD described the ``restrict=""`` attribute as: *the format +of this attribute is equal to the format of DEPEND lines in ebuilds.* This +specification is based upon this definition. However, for practical reasons it +added three clarifications to it: + +- only package dependency specifications are allowed (i.e. no USE-conditionals + or multiple dependency specifications), + +- only EAPI=0 dependency specifications are allowed, since ``metadata.xml`` + provides no EAPI identification mechanism and it predates EAPI, + +- only dependencies referencing the same package are allowed. + +Furthermore, DTD added a special case for ``*`` value that *applies if there +are no other tags that apply*. This behavior was not used at all, and being +at least a bit confusing (compared to the common use of ``*`` to imply +matching everything), it was removed. + +Upstream block +-------------- + +The upstream block was defined by GLEP 46. However, this GLEP is ambiguous +at the best. Tiziano Müller (one of the original authors) has explained +the intent behind most of the elements of the GLEP. + +In particular, he confirmed that the GLEP lists all elements that are allowed +explicitly, and no implicit inclusions were meant to be allowed. This means +that the ``<maintainer/>`` element does not allow a ``<description/>``. + +He also confirmed that unless noted otherwise, elements were not allowed to +be used more than once. This affects ``<bugs-to/>`` and ``<changelog/>`` +elements. Repetitions of ``<doc/>`` were only allowed because DTD technically +didn't permit restricting them while allowing uses of different languages. + +At the time of writing this GLEP, only a single Gentoo package was using +multiple ``<bugs-to/>`` elements, and no packages were using multiple +``<changelog/>`` or ``<doc/>`` elements (or non-English docs). For this +reason, this GLEP enforces the original intent of *at most one* element. + +Rationale for upstream maintainer descriptions +---------------------------------------------- + +The proper contents of the ``<maintainer/>`` elements in ``<upstream/>`` +blocks were unclear in the DTD since the technical file format limitation +implied that all elements and attributes added for the Gentoo maintainers +also applied to upstream maintainers, and vice versa. + +The comments in the DTD clearly separated attributes between the two — +i.e. stated that the ``type`` attribute is used only for Gentoo maintainers, +while the ``status`` attribute is used only for upstream maintainers. However, +package version restrictions and maintainer descriptions were also implicitly +allowed on them. Since neither of the two was allowed by GLEP 46, this +specification disallows them. + + +Backwards Compatibility +======================= + +This specification does not introduce any new elements or attributes compared +to the current DTD. Therefore, all ``metadata.xml`` files created in its +compliance will be read correctly by the existing tools and will conform +to the current DTD. + +However, this specification is more strict than the rules enforced by the DTD. +Therefore, not all existing ``metadata.xml`` will be conforming to the spec, +even though they would be correct according to the DTD. New tools will +consider the files incorrect and request developers to fix them. + + +Reference implementation +======================== + +Parsing metadata.xml +-------------------- + +Since the metadata.xml format provided by this specification is compatible +with existing tool, no new implementation is required for reading those files. + +Checking metadata.xml validity +------------------------------ + +To provide more strict checking of metadata.xml files, XML schema file is +provided in the Gentoo xml-schema repository [#XML-SCHEMA]_. This schema +provides: + +- element structure checks, + +- data duplication checks (e.g. multiple descriptions for the same flag + but see below), + +- partial value correctness checks. + +The limitations of the schema are: + +- values are verified using simple regular expressions, so not all format + violations will be caught (e.g. the rule will consider ``app-foo/bar-1`` + a valid qualified package name when the version suffix is disallowed), + +- cross-references can not be checked (package references, category + references, URLs, project identifiers), + +- ``<maintainer type=""/>`` correctness can not be checked, + +- data duplication checks are done per ``restrict=""`` value rather than + per every package version matched by the restriction. Therefore, multiple + definitions that are applied to a single package by two different + ``restrict=""`` rules will not be caught. + +Example metadata.xml file +------------------------- + +.. code:: xml + + <?xml version='1.0' encoding='UTF-8'?> + <pkgmetadata> + <maintainer type='person'> + <email>developer@example.com</email> + <name>Example Developer</name> + </maintainer> + <maintainer type='project'> + <email>project@example.com</email> + <name>Example Project</name> + </maintainer> + <maintainer type='person'> + <email>upstream@example.com</email> + <name>Upstream Developer</name> + <description>Upstream developer, wishing to be CC-ed on bugs</description> + </maintainer> + <longdescription> + First paragraph of extensive description. + + Second paragraph. + </longdescription> + <longdescription lang='de'> + Erster Absatz mit detaillierter Beschreibung. + + Zweiter Absatz. + </longdescription> + <slots> + <slot name='11'>Compatibility slot providing libfoo.so.11 only.</slot> + <subslots> + Match SONAME of libfoo.so. + </subslots> + </slots> + <slots lang='de'> + <slot name='11'>Kompatibilitäts-Slot, installiert ausschließlich libfoo.so.11.</slot> + <subslots> + Subslot ist stets identisch mit dem SONAME von libfoo.so. + </subslots> + </slots> + <use> + <flag name='foo'>Enables foo feature</flag> + <flag name='bar' restrict='<dev-libs/foo-12'>Enables bar feature (requires <pkg>dev-libs/bar</pkg>)</flag> + <flag name='bar' restrict='>=dev-libs/foo-12'>Enables bar feature</flag> + </use> + <use lang='de'> + <flag name='foo'>Konfiguriert das Paket mit Unterstütztung für foo</flag> + <flag name='bar' restrict='<dev-libs/foo-12'>Konfiguriert das Paket mit Unterstütztung für bar (benötigt <pkg>dev-libs/bar</pkg>)</flag> + <flag name='bar' restrict='>=dev-libs/foo-12'>Konfiguriert das Paket mit Unterstütztung für bar</flag> + </use> + <upstream> + <maintainer status='active'> + <email>upstream@example.com</email> + <name>Upstream Developer</name> + </maintainer> + <maintainer status='inactive'> + <!-- e-mail unknown --> + <name>John Smith</name> + </maintainer> + <changelog>http://www.example.com/releases.html</changelog> + <doc>http://www.example.com/doc.html</doc> + <doc lang='de'>http://www.example.com/doc.de.html</doc> + <bugs-to>http://www.example.com/issues.html</bugs-to> + <remote-id type='foohub'>example/foo</remote-id> + </upstream> + </pkgmetadata> + +German translations provided by tamiko. + + +References +========== + +.. [#PMS-A] PMS Appendix A + https://projects.gentoo.org/pms/5/pms.html#x1-163000A + +.. [#METADATA-DTD] The original metadata.dtd file + https://gitweb.gentoo.org/data/dtd.git/tree/metadata.dtd?id=a908a93b5afe295359e0a01814c9bef8b5268bcd + +.. [#ORIGINAL-METADATA-XML] The original metadata.xml proposal (gentoo-dev) + http://thread.gmane.org/gmane.linux.gentoo.devel/9663 + +.. [#USE-FLAG-METADATA] Doug Goldstein: USE flag metadata + https://cardoe.wordpress.com/2007/11/19/use-flag-metadata/ + +.. [#XML-SCHEMA] Gentoo XML schema + https://gitweb.gentoo.org/data/xml-schema.git/ + + +Copyright +========= + +This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 +Unported License. To view a copy of this license, visit +http://creativecommons.org/licenses/by-sa/3.0/. |