summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorMatthew Marchese <maffblaster@gentoo.org>2017-06-28 19:06:44 +0000
committerMatthew Marchese <maffblaster@gentoo.org>2017-06-28 19:06:44 +0000
commit1046c75310ffeaae45f1b0486029ba6c1376eb2a (patch)
treea69934abdc2191dab7462e9824fb00b175e16a4f
parent/* QA check logic */ Add whitespace in attempt to correct margins around the ... (diff)
downloadglep-1046c75310ffeaae45f1b0486029ba6c1376eb2a.tar.gz
glep-1046c75310ffeaae45f1b0486029ba6c1376eb2a.tar.bz2
glep-1046c75310ffeaae45f1b0486029ba6c1376eb2a.zip
Light editing: Sentence case for section headers. Whitespace.
-rw-r--r--GLEP:73.mw79
1 files changed, 66 insertions, 13 deletions
diff --git a/GLEP:73.mw b/GLEP:73.mw
index 73e51ad..55e9196 100644
--- a/GLEP:73.mw
+++ b/GLEP:73.mw
@@ -7,10 +7,13 @@
}}
==Abstract==
+
This GLEP proposes using automated solving to satisfy REQUIRED_USE constraints. It lists the problems with the current handling of REQUIRED_USE, and explains how auto-solving would solve them. It specifies the algorithms that can be used to verify the constraints, automatically solve them and check whether they can be solved. It provides the rationale for all the design decisions, and considers the compatibility with the PMS, and between the constraints and the package managers before and after the GLEP is used.
==Motivation==
+
===The issues with REQUIRED_USE===
+
[https://projects.gentoo.org/pms/6/pms.html#x1-910008.2.7 REQUIRED_USE] has been introduced in EAPI 4 as a solution to the problem of enforcing specific relations between USE flags. According to the [https://dev.gentoo.org/~zmedico/portage/doc/ch06s03s05.html#package-ebuild-eapi-4-metadata-required-use Portage documentation on REQUIRED_USE], it has been specifically targeted as a more data-oriented and machine-friendly alternative to verifying the validity of USE flag choice in ebuild phases.
At the moment of writing, REQUIRED_USE is used in around 25% of the ebuilds in Gentoo. It is an obligatory part of some eclasses, e.g. in the Python ecosystem. Its uses include improving clarity of user choices, simplifying ebuilds via copying upstream feature dependencies and enforcing valid data for USE dependencies. Nevertheless, a number of developers raise strong arguments against using REQUIRED_USE.
@@ -23,9 +26,11 @@ The commonly noted disadvantages of REQUIRED_USE are:
Those arguments have resulted in a number of developers avoiding REQUIRED_USE. For example, [https://wiki.gentoo.org/wiki/Project:Qt/Policies#Handling_different_versions_of_Qt the Qt team policies] discourage using it unless absolutely necessary. The attempts of avoiding REQUIRED_USE sometimes result in suboptimal descriptions of USE flags or even inconsistent use of them.
===The providers problem===
+
A very specific case of a problem where REQUIRED_USE has some use is the ''providers'' problem. That is, whenever a package has a feature that can be supplied by more than one library of choice, and the user needs to choose between the providers. The exact form of this problem depends on the number of providers and whether the feature is optional.
The commonly used solutions include:
+
* Using one or more binary flags to toggle between the providers (with number of the flags < number of providers). This is most readable with only two providers, e.g. with ''USE=libressl'' meaning ''use LibreSSL instead of OpenSSL'', and ''USE=-libressl'' meaning ''use OpenSSL''. For packages with optional SSL/TLS feature, there is also an additional ''USE=ssl'' to toggle that feature, and with ''USE=-ssl'', the ''libressl'' flag is meaningless (ignored). This is usually the least intrusive method but it's unreadable and causes the flags to be confusing.
* Using unary flags for providers along with REQUIRED_USE. In this case, each provider gets an explicit flag and REQUIRED_USE is used to force selecting exactly one of them. For optional feature, there is either an additional feature flag or it is disabled when all providers are disabled. This is usually the most readable solution although it frequently requires adjusting flags.
* Using unary flags without REQUIRED_USE. In this case, if user selects more than one provider (or does not select any), the package decides which one is preferred and uses that. For optional feature, again there could be either an additional feature flag or it could be disabled by disabling all the providers. This is less intrusive than the previous solution but it's less readable (it is unclear which provider is actually used) and unsuitable for USE dependencies.
@@ -33,7 +38,8 @@ The commonly used solutions include:
As noted, all of the mentioned solutions have their specific disadvantages. This causes different developers to use different solutions for specific problems. Sometimes, it could go as bad as to have more than one solution applied to a single problem, or different concepts used inconsistently by different developers.
===Automatic solving as the solution===
-This GLEP focuses on the idea of establishing automated solving of REQUIRED_USE as a solution to the forementioned issues. In this context, REQUIRED_USE is extended not only to specify what combinations of USE flags are valid but also how to proceed from a disallowed flag set to one that satisfies the constraints.
+
+This GLEP focuses on the idea of establishing automated solving of REQUIRED_USE as a solution to the aforementioned issues. In this context, REQUIRED_USE is extended not only to specify what combinations of USE flags are valid but also how to proceed from a disallowed flag set to one that satisfies the constraints.
This clearly resolves the first two issues with REQUIRED_USE. Since REQUIRED_USE mismatches are solved automatically, there is no explicit user interaction required. No changes are done in configuration files — since the solving is meant to be deterministic, the package manager can recalculate the effective USE flag set using the input USE flag set and the REQUIRED_USE constraint.
@@ -44,7 +50,9 @@ Solving the most common problems with REQUIRED_USE makes it possible to extend i
Furthermore, the non-intrusive version of REQUIRED_USE could be used extensively to conditionally mask meaningless flags and map equivalent flag sets into a single common set of choice. This can further improve readability (by making flags clearly indicate what it used, e.g. by disabling all SSL/TLS provider flags when SSL/TLS is disabled) and improve compatibility between binary packages (by reducing the number of incompatible USE flag sets).
==Specification==
+
===Restrictions on REQUIRED_USE format===
+
The REQUIRED_USE format is defined by [https://projects.gentoo.org/pms/6/pms.html the PMS]. This specification requires the following additional restrictions being enforced:
* An any-of group (||), at-most-one-of (??) and an exactly-one-of (^^) group can contain only flat USE flag items. In particular, no other group can be nested inside it.
@@ -57,7 +65,9 @@ As a result, unlimited nesting is allowed only for use-conditional groups. All o
* keeping the specification and implementation relatively simple.
===The algorithm for satisfying REQUIRED_USE constraints===
+
====Processing algorithm====
+
The existing package managers have to validate REQUIRED_USE constraints while evaluating the dependency graph. The current validation action is replaced by the following algorithm:
# Check whether the REQUIRED_USE constraint is satisfied by the USE flags enabled by the current user configuration. If it is, accept the package (the algorithm stops).
@@ -67,9 +77,11 @@ The existing package managers have to validate REQUIRED_USE constraints while ev
# If the attempt at solving failed, report a REQUIRED_USE mismatch and abort.
====REQUIRED_USE verification algorithm====
+
The verification algorithm is implied by the meanings of REQUIRED_USE constructs as defined by the PMS. It is repeated here for completeness and for reuse in further algorithms.
The REQUIRED_USE constraint is considered satisfied if ''all'' the top-level items evaluate to true. An item evaluates to true if, depending on the item type:
+
* A '''USE flag name''' that is not prefixed by an exclamation mark evaluates to true if the named flag is enabled. Accordingly, a USE flag name that is prefixed by an exclamation mark evaluates to true if the named flag is disabled.
* For a '''USE-conditional group''' the condition needs to be tested first (according to the same rule). If the condition evaluates to true, the USE-conditional group is true only if all items in it evaluate to true. If the condition evaluates to false, the USE-conditional group always evaluates to true and the items inside it need not to be tested.
* An '''any-of group''' (||) evaluates to true if at least one of the items in it evaluates to true.
@@ -77,6 +89,7 @@ The REQUIRED_USE constraint is considered satisfied if ''all'' the top-level ite
* An '''at-most-one-of group''' (??) evaluates to true if at most one of the items in it evaluates to true.
====Constraint group reordering algorithm====
+
The constraint solving algorithm is built on ''prefer leftmost'' assumption for all any-of, exactly-one-of and at-most-one-of groups. That is, if the constraint is not satisfied by the current set of enabled USE flags, the algorithm prefers enforcing the leftmost constraints and disabling rightmost.
Due to different system profiles, it might be impossible to automatically solve the constraint using the leftmost flag specified by ebuild (e.g. when it is masked). In order to account for this, the specification provides a group reordering (sorting) phase before the solving algorithm.
@@ -84,6 +97,7 @@ Due to different system profiles, it might be impossible to automatically solve
The reordering applies to any-of, exactly-one-of and at-most-one-of groups. Per the format restriction, each group can only contain flat USE flags.
For each of the items in the group, if the item names a forced/masked USE flag:
+
* if the item evaluates to true according to the flag's value, it is moved to the leftmost position in the group,
* if the item evaluates to false according to the flag's value, it is moved to the rightmost position in the group,
@@ -92,6 +106,7 @@ Relative positions of multiple forced/masked flags are of no relevance since tho
This reordering ensures that if a flag is forced, it is always preferred over other choices; and if it is masked, it is never preferred. This makes it possible to easily account for all possible cases without having to provide a detailed algorithm to handle various possible results.
====REQUIRED_USE solving algorithm====
+
If the REQUIRED_USE constraint is not satisfied according to the initial set of USE flags implied by the configuration, the package manager attempts to alter the USE flags according to REQUIRED_USE.
Before solving, a set of '''immutable flags''' is determined based on forced and masked USE flags. If a flag is either forced or masked, it is marked immutable and the algorithm can not alter its value. If a particular rule would cause the flag to be altered, the solving is aborted and an error is reported.
@@ -99,6 +114,7 @@ Before solving, a set of '''immutable flags''' is determined based on forced and
The solving algorithm is applied at least once, and the REQUIRED_USE is rechecked after each application. The package manager may support running multiple iterations of the algorithm, in which case it needs to either limit the allowed number of iterations or abort after obtaining one of the values previously given by the algorithm (hitting an infinite loop).
In order to enforce REQUIRED_USE, each top-level item in REQUIRED_USE that did not evaluate to true needs to be enforced. All items are enforced in order, left to right. Depending on the item type, enforcing implies:
+
* For a '''USE flag name''' that is not prefixed by an exclamation mark, the named flag is enabled. If it is prefixed by an exclamation mark, the named flag is disabled.
* For a '''USE-conditional group''', the condition (LHS) is evaluated first. If the condition evaluates to true, all the items inside the group are enforced, in order. If it evaluates to false, the group is skipped.
* For an '''any-of group''' that did evaluate to false, the first (left-most) item in the group is enforced.
@@ -107,9 +123,10 @@ In order to enforce REQUIRED_USE, each top-level item in REQUIRED_USE that did n
The negative enforcing action can be applied to plain '''USE flag names''' only. If the name is not prefixed by an exclamation mark, then the flag is disabled. If the name is prefixed by an exclamation mark, it is enabled appropriately.
-
===QA checks to verify REQUIRED_USE solutions===
+
====Context to QA checks====
+
All of the QA checks are performed in context of a specific set of forced and masked USE flags, called ''immutable flags''. All of the checks need to be repeated for every set. Since they can alter the preferences inside any-of, at-most-one-of and exactly-one-of groups, it may also be necessary to perform a separate transformation for each set.
The complete set of immutable flag combinations can be obtained using the following algorithm:
@@ -123,6 +140,7 @@ The complete set of immutable flag combinations can be obtained using the follow
Afterwards, all checks should be performed for all unique values in the result set.
====Requirements for REQUIRED_USE constraints====
+
In order to verify the ability to solve REQUIRED_USE reliably, the QA check tools should ensure that the following conditions are met:
# no valid combination of USE flags can result in the constraint requesting the same flag to be simultaneously both enabled and disabled;
@@ -130,11 +148,13 @@ In order to verify the ability to solve REQUIRED_USE reliably, the QA check tool
# no constraint in REQUIRED_USE may alter flags in such a way that any of the constraints preceding it would start to apply and change the resulting flags in a second iteration.
====Concept for transforming REQUIRED_USE into implications====
+
The algorithms used to verify REQUIRED_USE rely on them being expressed in a ''flat implication form''. In this form, the constraints are expressed as zero or more ''implications''. Each implication specifies zero or more conjunctive ''conditions'', and one or more ''effects''. It is equivalent to a nested USE-conditional group. If all of the ''conditions'' are met, the ''effects'' are applied.
If a constraint is valid, then the solutions of its transformation are the same as of the original.
By idea, the transformation consists of the following steps:
+
# Reordering all any-of (||), at-most-one-of (??) and exactly-one-of (^^) groups according to the [[#Constraint_group_reordering_algorithm|reordering algorithm]].
# Replacing all any-of (||), at-most-one-of (??) and exactly-one-of (^^) groups according to the following transformations:
#* <code>^^ ( a b c… )</code> → <code>|| ( a b c… ) ?? ( a b c… )</code>,
@@ -152,6 +172,7 @@ Nodes and edges are numbered to explain the ordering. Furthermore, the final (ef
[[File:ReqUseExampleGraph.svg|framed|Example graph for REQUIRED_USE]]
Traversing this graph produces the following paths, in order:
+
# '''a<sub>(1)</sub>'''
# b<sub>(2)</sub> → c<sub>(3)</sub> → '''d<sub>(4)</sub>'''
# b<sub>(2)</sub> → c<sub>(3)</sub> → '''!b<sub>(5)</sub>'''
@@ -159,6 +180,7 @@ Traversing this graph produces the following paths, in order:
# b<sub>(8)</sub> → '''g<sub>(9)</sub>'''
Those paths are roughly equivalent to the following USE-conditional group constructs:
+
# <code>a</code>
# <code>b? ( c? ( d ) )</code>
# <code>b? ( c? ( !b ) )</code>
@@ -168,6 +190,7 @@ Those paths are roughly equivalent to the following USE-conditional group constr
Except that the value of ''b'' for constraint 4 is considered from the initial value rather than the one possibly altered by constraint 3. Constraint 5 uses a separate condition, and so uses the new value of ''b''.
====Algorithm for transforming REQUIRED_USE into implications====
+
Steps 2 through 4 of the fore-mentioned transformation can be performed using the following recursive function. It should be applied to every top-level REQUIRED_USE item, in order.
It should be noted that for the purpose of distinguishing separate branches, all the condition objects need to have an unique identity. In Python this occurs naturally via instantiating an object. In other languages an explicit unique identifier may need to be included.
@@ -246,10 +269,13 @@ In order to determine the final set of flags ''F<sub>n</sub>'', with specific se
** If the condition ''C<sub>i</sub>'' always evaluates to true, update ''F'' with ''E<sub>i</sub>'' (''F<sub>i+1</sub> = F<sub>i</sub> ∪ {E<sub>i</sub>} ∖ {¬E<sub>i</sub>}'').
==Rationale==
+
===Restrictions for allowed REQUIRED_USE syntax===
+
The specification imposes a number of arbitrary restrictions to REQUIRED_USE syntax, in particular by restricting the possible nesting and disallowing other complex constructs. The main goal is to simplify the algorithms used and make the results more obvious. This is at cost of prohibiting constructs that are rarely used, and usually could be replaced by simpler and more readable constructs.
====Nested any-of, at-most-one-of, exactly-one-of groups====
+
The first and most important restriction is that nesting of any-of, at-most-one-of and exactly-one-of groups is forbidden. While technically such constructs could work, some of them are not really meaningful and others are really confusing. At the time of writing, nested ||/??/^^ groups were used in exactly two Gentoo packages. The specific uses were:
# app-admin/bacula: <code>|| ( ^^ ( mysql postgres sqlite ) bacula-clientonly )</code>
@@ -266,6 +292,7 @@ The second use is much more confusing. It means that both ''gl3plus'' and either
As can be seen, in both cases the alternative constructs were both more readable and shorter than the nested expressions. In the first case, it is also the more natural way of expressing the problem. While replacing expressions that have more than two subexpressions would be harder, there were no uses of such expressions so far, and the potential ambiguity makes them unlikely to appear.
====All-of groups====
+
The second restriction imposed by this GLEP is disallowing all-of groups. The PMS allows them anywhere but in reality they are only meaningful inside ||, ?? and ^^ groups (elsewhere they do not have any effect, and can be inlined into parent block). Inside those groups, they imply that the item is considered matched only if all items inside the all-of group match.
The meaning of all-of groups inside || is pretty clear. However, inside ?? and ^^ some confusion may occur. In particular, for a general case of:
@@ -297,6 +324,7 @@ The third case is potentially valid. It indicates that either ''deprecated'' or
That is, if user enables ''gtk3'' and ''gtk3'' requires ''introspection'', then it seems more reasonable to enable ''introspection'' than to ignore the ''gtk3'' flag and force ''deprecated'' module instead.
====USE-conditionals inside ||, ??, ^^ groups====
+
The last restriction forbids using USE-conditional groups inside any-of, at-most-one-of and exactly-one-of groups. Those indicate that some of the items inside the group are to be considered its members only if the relevant flags are enabled. They are logically equivalent to all-of groups, i.e. <code>|| ( foo? ( bar ) ... )</code> and <code>|| ( ( foo bar ) ... )</code>, except they have a different semantic — the latter form suggests enabling both flags, the former suggests considering ''bar'' only if ''foo'' is already enabled.
Supporting USE-conditional groups properly would most likely require splitting the parent group into multiple variants for different initial values of USE conditionals. Considering the above equality, it would also be inconsistent with the ban on all-of groups. Finally, those groups have little real value.
@@ -307,11 +335,12 @@ The only use case in Gentoo was in media-video/mpv:
It indicates that the OpenGL video output requires selecting one of the variants, with the ''libmpv'' variant being allowed only without CLI enabled. While this may be technically valid, it is confusing. Furthermore, other REQUIRED_USE constraints already require that either ''cli''' or ''libmpv'' is enabled, making ''!cli'' imply ''libmpv''. Therefore, the USE-conditional in the constraint is redundant.
-
===Solving algorithm===
+
The solving algorithm attempts to enforce REQUIRED_USE in the most natural way, interpreting the constraints as developer suggestions on how to make the constraint apply.
====Application of different types of constraints====
+
The algorithm aims to solve mismatched constraints in the most natural way, presuming that this interpretation is the most likely to be correct.
For the USE-conditional groups, it assumes that they mean ''if X is true, then Y should also be true''. Appropriately, the algorithm does not alter the flag in the condition (''X''); instead, if the condition is true, it enforces the expression inside the group (''Y'').
@@ -319,6 +348,7 @@ For the USE-conditional groups, it assumes that they mean ''if X is true, then Y
For other groups, the algorithm applies the natural interpretation presuming that the items in group are stated in decreasing preference order, with the left-most item being most preferred. That is, if the group evaluates to false, it enforces a solution that either disables all enabled items except for the left-most already enabled, or enables the first item if no item is enabled.
====Reordering of ||, ??, ^^ groups====
+
The left-most-preferred assumption about the groups results in the solving algorithm relying on the ability to enable the item and disable other items. This is not possible if the relevant flag is masked, or (in cases of ??, ^^) some other flag is forced. If that were the case, the ordering inside those groups would have to be strictly limited by the 'common denominator' between the profiles. This would sometimes result in less preferred options being encouraged, or even impossible to express constraints — e.g. if the preferred implementation would not be stable but the package were stabilized.
To account for this, the groups are transformed to account for forced/masked (immutable) flags. The transformation is done through reordering the items because this keeps the specification as simple as possible. It does not to cover specifically how to interpret immutable flags in different kind of groups, and how to handle the groups afterwards. Instead, reordering results in the forced flags being preferred naturally, and the masked flags being discouraged naturally.
@@ -326,6 +356,7 @@ To account for this, the groups are transformed to account for forced/masked (im
It also naturally handles the case when forced/masked flags result in impossible to satisfy constraints. Those cases do not need to be detected by the reordering algorithm implicitly, and instead just cause solver to fail early.
====Left-to-right constraint application====
+
The solving algorithm applies all changes necessary to enforce the constraints in order, left to right. Enforcing a specific ordering, combined with the PMS specifying how ebuild and eclass values for REQUIRED_USE are combined, makes the algorithm deterministic. Applying left-to-right is also the most natural way of doing it, making it easy for developers to predict the results.
Originally I had considered making the algorithm work independently of constraint order. However, this would clearly defining what the desired solution is, and finding an algorithm to enforce that. To achieve a deterministic solution, we would most likely have to require developers to provide groups that do not overlap. That is, for example:
@@ -339,6 +370,7 @@ would be unacceptable since with both ''a'' and ''b'' flags enabled, the constra
While this is a possible solution, expressing complex constraints would be very hard. Developers would no longer be able to naturally express the constraints, and instead would have to determine the correct sets of conditions for each requested result.
====Single vs multiple iterations====
+
This GLEP does not specifically restrict the implementations to doing simple or multiple iterations. Both options have their advantages.
A single iteration can successfully solve all valid REQUIRED_USE constraints, as long as they are properly ordered. An implementation using a single iteration has simpler error handling — it is only necessary to verify whether the REQUIRED_USE actually matches after enforcing it. It is also reasonable to request developers to order their constraints for a single iteration solving.
@@ -350,26 +382,32 @@ For most of the real-life use cases, two iterations should be able to solve all
c? ( d ) b? ( c ) a? ( b )
===QA checks/verification===
+
====The necessity of verification====
+
The purpose of REQUIRED_USE constraint verification is to ensure that for all valid combinations of input USE flags, the solver will be able to find a valid solution. This needs to be done explicitly since complex REQUIRED_USE constraints may trigger solving issues with non-obvious USE flag combinations, causing the developers to miss the issue.
Since the solver must be able to deal with non-solvable constraints (by reporting them and letting the user deal with them), verification is not a strict necessity for enforcing REQUIRED_USE. However, it improves the user experience, and so is a worthwhile addition to the QA tools in place.
To provide the best coverage, it is beneficial to integrate the verification into the tools commonly used by developers — repoman and pkgcheck, including the CI runs. For this to be possible, the algorithm must meet two requirements:
-* it must be fast enough not to cause significant increase in repoman/pkgcheck run time for the full repository,
-* it must not trigger a large number of false positives, and if any are triggered, they should be easy to work around.
+
+* It must be fast enough not to cause significant increase in repoman/pkgcheck run time for the full repository.
+* It must not trigger a large number of false positives, and if any are triggered, they should be easy to work around.
====Context to the checks====
+
As noted in the specification part, all of them checks need to be repeated for all possible sets of the immutable flags. This is necessary since the immutable flags can alter the solutions significantly. In particular:
-* they can alter the preferred choices in the any-of, at-most-one-of and exactly-one-of groups,
-* they can cause some of the constraints to be unable to be satisfied,
-* they can cause some of the USE-conditional groups to be disabled entirely.
+
+* They can alter the preferred choices in the any-of, at-most-one-of and exactly-one-of groups,
+* They can cause some of the constraints to be unable to be satisfied,
+* They can cause some of the USE-conditional groups to be disabled entirely.
To account for that and avoid the case where REQUIRED_USE solving would fail on some of the profiles, the verification should be performed for all combinations of immutable flags found throughout the enabled classes of profiles. Only the flags that apply to the REQUIRED_USE constraint in question need to be considered.
Due to the EAPI 5 [https://projects.gentoo.org/pms/6/pms.html#x1-600005.2.11 stable masking], the immutable flags have to separately be calculated for ~arch and stable keywords. The stable variant does not need to be considered unless the package is actually stable or being stabilized, to avoid unnecessarily cluttering up {{Path|package.use.stable.mask}} and/or {{Path|package.use.stable.force}} for packages that are going to stay in ~arch.
====The requirements for REQUIRED_USE====
+
The rules imposed for verification aim to cover most of the common cases of unsolvable constraints. In particular:
''no valid combination of USE flags can result in the constraint requesting the same flag to be simultaneously both enabled and disabled''
@@ -381,11 +419,13 @@ The rules imposed for verification aim to cover most of the common cases of unso
: The addition condition for the second iteration change has been added to account for the common case of <code>a? ( b ) c? ( a b )</code>. While technically the second clause causes the first to start to apply, the second one already covers that case explicitly, so a second iteration would not change the result.
====Transformation into implication form====
+
The transformation of REQUIRED_USE into implication form is used to provide a form of the original constraint that is more convenient for analysis.
Firstly, the diverse (convenience) item types are all converted into a combination of implications and plain USE flags. The latter can express all the original constraints exactly, provided that the any reordering necessary is done prior to the transformation. As a result, we gain both simplified set of items that need to be considered, and a clear logical mapping of behavior associated any-of, at-most-one-of and exactly-one-of groups.
All of the transformed forms are built by definition, from the verification and solving algorithm:
+
* Any-of group constraints are satisfied if at least one of the items match. Therefore, the solving only applies if none of them does, in which case the first item is enforced. Appropriately, the result of transformation is the enforcement of first item conditional to the negation of all other items (the condition for the first item is omitted as redundant — enforcing a flag that is already enabled does not change anything).
* At-most-one-of group constraints are satisfied if no more than one item matches. The solving is applied if more than one item is enabled, in which case all but the first enabled item are forcibly disabled. Since disabling an already disabled flag does not change anything, this can be simplified to disabling all the remaining items if the left-most item is matched. The transformation does exactly that, for each item that can be possibly enabled, left-to-right.
* Exactly-one-of group constraints are satisfied if exactly one item matches. Logically this is equivalent to both having at least one item and no more than one item matching. Therefore, this constraint can be converted into a combination of an any-of group and an at-most-one-of-group, for which the transforms are already defined.
@@ -397,12 +437,14 @@ A plain graph could be used to visualize relations between different conditions
Thirdly, having the graph (tree) of conditions, we can easily traverse them. In doing so, we construct paths that precisely express which conditions need to be met for a particular enforcement to apply. Since the constraints are applied in order, we need to traverse the graph in this specific order, and write the paths down in the same order.
In doing the two last steps, it is important that we preserve the identity of the original condition nodes. This is necessary to distinguish between two cases:
+
# <code>a? ( b c )</code>
# <code>a? ( b ) a? ( c )</code>
Since the solving algorithm is applied recursively to USE-conditional groups, in the first case the outer ''a'' condition is not reevaluated between processing ''b'' and ''c''. In the latter case, the use of separate groups causes reevaluation of the condition.
While in this specific example there is no technical difference between the two forms, it becomes apparent when dealing with the following corner case:
+
# <code>a? ( !a b )</code>
# <code>a? ( !a ) a? ( b )</code>
@@ -410,8 +452,10 @@ In both cases, applying the first sub-item disables ''a''. However, only in the
In order to prevent that from happening, the verification algorithms need to be able to determine that the ''a'' condition is the same node in both resulting flattened expressions, and appropriately account for the fact that it is not affected by the enforcement. In the reference implementation, this is done via preserving the identity of the node, and doing identity-wide node comparison.
-==Backwards Compatibility==
+==Backwards compatibility==
+
===Compliance with the PMS===
+
This GLEP does not break the PMS compliance in any way. The syntax used by the constraints is a subset of the [https://projects.gentoo.org/pms/6/pms.html#x1-780008.2 REQUIRED_USE syntax allowed by the PMS]. The semantic extends the one defined in the PMS in non-conflicting way.
The PMS does not require a very specific behavior for REQUIRED_USE. The [https://projects.gentoo.org/pms/6/pms.html#x1-910008.2.7 USE state constraints section] requires that the package manager does not use (build/install) package versions where REQUIRED_USE constraints are not met.
@@ -419,20 +463,23 @@ The PMS does not require a very specific behavior for REQUIRED_USE. The [https:/
However, it does not require the package manager to verbosely report the conflict which the package managers actually do. That considered, it should not cause any non-compliance if this verbose reporting is (partially) replaced by automatic solving. If the solving succeeds, the constraints are met and the package manager can proceed with building/installing the package. If it does not, the existing behavior of reporting the issue is preserved.
===New constraints vs non-compliant package managers===
+
This GLEP preserves full syntax compatibility with the existing package managers. The constraints written for auto-solving will still work correctly in the package managers not supporting it, resulting in regular REQUIRED_USE mismatch. Furthermore, the extended semantic meaning can result in improved readability of constraints, and therefore the messages issued by the package managers. Users aware of the auto-solving rules will have a suggested algorithm for satisfying REQUIRED_USE.
The only potential danger is that the auto-solving will result in more extensive use of REQUIRED_USE and less concern for whether they are satisfied by default, resulting in more frequent REQUIRED_USE mismatches. Avoiding this problem should be done on policy level, requiring the developers not to rely purely on auto-solving through a migration period.
===Old constraints vs auto-solving===
+
Most of the existing REQUIRED_USE constraints are already compatible with auto-solving. There are three problematic cases:
-# constraints that are disallowed per [[#Restrictions_on_REQUIRED_USE_format|the restrictions on REQUIRED_USE format]],
-# constraints that can not be solved by the algorithm,
-# constraints that give sub-optimal (non-preferred) solutions.
+# Constraints that are disallowed per [[#Restrictions_on_REQUIRED_USE_format|the restrictions on REQUIRED_USE format]],
+# Constraints that can not be solved by the algorithm,
+# Constraints that give sub-optimal (non-preferred) solutions.
While the impact and details differ for each case, it can be commonly noted that all of them can be reliably fixed before implementing auto-solving, and — as noted above — the fixes will not break existing package managers.
====Constraints disallowed in this GLEP====
+
For simplification, this GLEP will reject some of the REQUIRED_USE forms that are valid per the PMS. They will be rejected for all combinations of USE flags that do not satisfy the constraint. However, this is not a major issue for three reasons:
# The unsupported constraints are extremely rare, of low value and fixing them improves readability. As listed in [[#Restrictions_for_allowed_REQUIRED_USE_syntax|rationale for the restrictions]], there were a total of 8 packages being affected at the time of writing, and fixing them was already in progress.
@@ -440,6 +487,7 @@ For simplification, this GLEP will reject some of the REQUIRED_USE forms that ar
# This GLEP does not strictly disallow the package manager from solving those constraints, only does not specify the solutions for them. Therefore, the package managers may implement custom extensions to solve them. However, they should still warn that this is non-portable and unreadable.
====Constraints that can not be solved====
+
Not all valid REQUIRED_USE constraints can be reliably solved. There are two major cases for that:
# Constraints that toggle flags that caused previous conditions not to apply. Solving those may require more than one iteration of the solving algorithm. However, they usually can be fixed easily by reordering.
@@ -448,6 +496,7 @@ Not all valid REQUIRED_USE constraints can be reliably solved. There are two maj
However, the problem usually applies to only some of the disallowed USE flag combinations. The verification algorithm should be able to detect most of those cases.
====Constraints with sub-optimal solutions====
+
While this specification uses an algorithm that attempts to read REQUIRED_USE constraints in the most natural way, not all constraints in Gentoo are written in this manner. Especially, many any-of, at-most-one-of and exactly-one-of groups are written with no specific ordering in mind. In some cases, they are used interchangeably with USE-conditional groups. Some USE-conditional groups are written without concern for clearly establishing the relation between the condition and the items inside the group.
While the auto-solving algorithm is able to solve many of those constraints, the solution can be considered sub-optimal as they do not follow the solution that the developers would knowingly suggest. For example, per the current rules the two following constraints are equivalent:
@@ -459,12 +508,16 @@ However, per the auto-solving semantic the first one will favor enabling the dep
This is probably the most important issue since there is no easy way to automatically detect that.
-==Reference Implementation==
+==Reference implementation==
+
===Proof-of-concept code===
+
The reference implementation of various algorithms and the scripts used to test them are included in [https://github.com/mgorny/required-use the required-use project on GitHub]. However, it needs to be noted that they are still work-in-progress.
===PkgCore===
+
The implementation of the following parts of the specification have been submitted to the PkgCore package manager for inclusion:
+
* validation of REQUIRED_USE constraints for compliance with the restricted syntax: [https://github.com/pkgcore/pkgcheck/pull/58 pkgcheck PR#58].
All of those bits are actively used in the pkgcheck fork used for the [[Project:Repository_mirror_and_CI|Repository mirror & CI]] project for CI.