POSIX shell: does `$` lose its special meaning if it is the last character in a word?












16














On ash, dash and bash, when I run



$ echo ab$


it returns



ab$


Is this behavior specified by POSIX or is it just a common convention in POSIX-compliant shells? I couldn't find anything on the POSIX Shell Command Language page that mentions this behavior.






























  • 2




    The better question is "Does $ gain a special meaning if it is the last character in a word?" There is no single special meaning assigned to $; it is used to introduce multiple, but distinct, expansions, like parameter expansion ${...}, command substitution $(...), and arithmetic expressions $((...)). Some shells introduce additional contexts, like ksh's command-substitution variant x=${ echo foo; echo bar;} (which differs from the standard $(...) by not executing the commands in a subshell).
    – chepner
    21 hours ago









































shell posix














edited yesterday by Sparhawk
asked yesterday by Harold Fischer




















3 Answers


















19














A $ followed by a space (or, IMO, by no character at all) is unspecified by POSIX.




The '$' character is used to introduce parameter expansion, command substitution, or arithmetic evaluation. If an unquoted '$' is followed by a character that is not one of the following:




  • A numeric character

  • The name of one of the special parameters (see Special Parameters)

  • A valid first character of a variable name

  • A <left-curly-bracket> ( '{' )

  • A <left-parenthesis>


the result is unspecified.




To make it explicit, an unquoted $ that is not followed by a character in this regex:



 [0-9@*#?$!_a-zA-Z{(-]


is explicitly unspecified: any result is allowed by POSIX.



That is: any specific result is not guaranteed by POSIX.



Or, if it is used, there is no way to know from POSIX alone what the shell will do.
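As a rough illustration of the list above, the same check can be written as a shell case statement. This is only a sketch: the function name after_dollar is made up for this example and is not part of POSIX or of any shell. The empty-string branch is reported separately, because whether "no following character" counts as unspecified depends on how the sentence in the standard is read.

```shell
# Hypothetical helper: classify the single character that follows an
# unquoted '$'. Pass the character (or an empty string) as $1.
after_dollar() {
    case $1 in
        '')                                      echo 'nothing follows' ;;
        [0-9])                                   echo 'numeric character' ;;
        '@' | '*' | '#' | '?' | '-' | '$' | '!') echo 'special parameter' ;;
        [_a-zA-Z])                               echo 'variable name' ;;
        '{' | '(')                               echo 'bracketed expansion' ;;
        *)                                       echo 'unspecified' ;;
    esac
}

after_dollar +    # prints: unspecified
after_dollar a    # prints: variable name
```

The quoted alternatives are used instead of one bracket expression so that none of the special characters can be misread as pattern syntax.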





A $ quoted with " or ' must be followed by the closing quote character, so it cannot be the last character of a word: the word includes its quotes. From 2.3 Token Recognition:




(…) the result token shall contain exactly the characters that appear in the input (…), unmodified, including any embedded or enclosing quotes (…).




The only other option left is a $ quoted with a backslash, which is interpreted as the literal character, losing its special meaning (if any).





Conclusion



So, yes, a trailing $ could lose its special meaning of starting an expansion, either by being backslash-quoted or, while unquoted, by being unspecified.



However, all implementations that I know of accept a trailing $ as part of the preceding word (if any) without any error or warning.



By trailing I mean that the following character ends a word (<blank>, |, &, ;, <, >, or NUL) or that the end of input was signaled.





Let's walk through the processing steps of the two basic sequences that interpret the command line. The first divides the command line into tokens (and identifies them) and is explained in 2.3 Token Recognition.



echo a$abc



With echo a$abc, at some point, the $ gets to be processed:




  1. step 1: As a $ is not the end of input, keep going.

  2. steps 2 and 3: The previous character a was not an operator, keep going.

  3. step 4: A $ is not a <backslash>, a single-quote, or a double-quote, keep going.

  4. step 5: The current character is an unquoted '$' or '`', the shell shall identify the start of any candidates for parameter expansion.

  5. step 5: The shell shall read sufficient input to determine the end of the unit to be expanded (as explained in the cited sections).

  6. Needs to go into the cited sections to decide if the token is valid.

  7. Section 2.6 Word Expansions: The '$' character is used to introduce an expansion.

  8. The unquoted '$' is followed by an a, it is one of the valid characters.

  9. Keep reading to the following <blank> to find the end of the unit to be expanded

  10. The $abc thus collected is delimited and tagged as an expansion.
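Step 9 can be observed directly: without braces, the shell takes the longest valid name after the $. A minimal sketch, with the variable names a, abc, longest and braced chosen only for illustration:

```shell
a=1
abc=XYZ

longest=a$abc      # the name read is abc, not a, so the value is aXYZ
braced=a${a}bc     # braces end the name early, so the value is a1bc

printf '%s\n' "$longest" "$braced"
```

The braced form is the usual way to stop the name before the characters that follow it.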


echo a$ a



For echo a$ a all the steps above are the same until step 8, but here:




  8. The unquoted '$' is followed by a <blank>, which is not one of the valid characters, so the result is unspecified. Generally, implementations delimit a token that is followed by a space, but do not tag it as an expansion.


echo a$



Again, most of the steps above apply for echo a$, but step 8 changes to:





  8. The unquoted '$' is followed by a NUL (the terminating character of a C string), which is not one of the valid characters. The result is therefore unspecified, AFAICT.

    Following comments: if the phrase "If an unquoted '$' is followed by a character (…)" is read to mean that no character follows when the $ is followed by a NUL or a newline, or when the end of input was reached by some other indicator (EOT, etc.), then echo a$ is not a valid expansion but also does not fall into the unspecified case: the token gets delimited but not tagged as an expansion.



    Most implementations delimit the token that is followed by an invalid character or by the end of input, but do not tag it as an expansion, anyway.




In summary



echo a$abc # expansion of parameter abc, is valid and specified.
echo a$#c # expansion of special parameter #, valid and specified.
echo a$+c # unspecified behavior.
echo a$ c # is unspecified.
echo a$ # unspecified IMO, unclear by an alternate interpretation.
echo $ # Falls into the same issue.

























  • 2




    If we’re being fiddly, “if an unquoted ‘$’ is followed by a character that is not ...” and “a $ that is not followed by” are not the same thing. It appears that a $ at the end of input does create a literal word by the letter of the specification with no unspecified behaviour invoked.
    – Michael Homer
    18 hours ago










  • @MichaelHomer Are you saying that echo ab$ moreinput invokes unspecified behavior and echo ab$ does not?
    – Harold Fischer
    16 hours ago






  • 1




    Sorry, what I meant was that a $ followed by nothing at all, as in Harold's echo ab$, isn't "followed by a character that is not ..." because it isn't followed by a character. I wasn't talking about double-quoting (or quoting at all).
    – Michael Homer
    14 hours ago








  • 1




    I guess not then? Here's where my confusion lies: "$... followed by a character that is not ..." requires that $ is followed by a character, and that the character in question is not one on the list. If $ is not followed by anything, it certainly is not followed by a character. This is distinct from "$ not followed by a character on this list", which would encompass both $ followed by unlisted characters and $ not followed by any character.
    – Michael Homer
    14 hours ago






  • 1




    That is, "followed by a character that is not one of" and "not followed by a character that is one of" are not equivalent.
    – Michael Homer
    14 hours ago



















9














$ does not have a special meaning by itself (try echo $), only when it is combined with other characters after it to form an expansion, e.g. $var (or ${var}), $(util), $((1+2)).



The $ gets its "special" meaning as defining an expansion in the POSIX standard under the section Token Recognition:




If the current character is an unquoted $ or `, the shell shall identify the start of any candidates for parameter expansion, command substitution, or arithmetic expansion from their introductory unquoted character sequences: $ or ${, $( or `, and $((, respectively. The shell shall read sufficient input to determine the end of the unit to be expanded. While processing the characters, if instances of expansions or quoting are found nested within the substitution, the shell shall recursively process them in the manner specified for the construct that is found. The characters found from the beginning of the substitution to its end, allowing for any recursion necessary to recognize embedded constructs, shall be included unmodified in the result token, including any embedded or enclosing substitution operators or quotes. The token shall not be delimited by the end of the substitution.




So, if $ does not form an expansion, other parsing rules come into effect:




If the previous character was part of a word, the current character shall be appended to that word.




That covers your ab$ string.



In the case of a lone $ (the "new word" would be the $ by itself):




The current character is used as the start of a new word.




The meaning of the generated word containing a $ that is not a standard expansion is explicitly defined as unspecified by POSIX.



Also note that $ is the last character in $$, but that this happens to be the special parameter that expands to the current shell's PID. In bash, !$ may invoke a history expansion (the last argument of the previous command). So in general, no, $ is not without meaning at the end of an unquoted word, but at the end of a word it at least does not denote a standard expansion.





























  • @Isaac I deleted the parenthesis that I'm assuming you're referring to.
    – Kusalananda
    yesterday










  • @Isaac Ah, I see. I thought "left unspecified" would mean the same as "defined as unspecified", but I could definitely make that wording better.
    – Kusalananda
    yesterday










  • It seems reasonable that In the case of a lone $ (the "new word" would be the $ by itself): and that is what has been generally implemented, but what sounds reasonable and what the spec states don't always match. What the spec clearly states is: "not followed by (…)", and a trailing $ character is "not followed by (…)". So, it is explicitly unspecified.
    – Isaac
    yesterday












  • @Isaac Yes. This is what I say. The section I'm quoting is on token recognition. The lone $ is recognised as a word token. It is not recognised as an expansion. The meaning of that lone $ word, i.e. the action that the shell takes, is unspecified. This is what I say.
    – Kusalananda
    yesterday












  • @Isaac (sigh), yes, and this is what I now have in my answer: "The meaning of the generated word containing a $ that is not a standard expansion is explicitly defined as unspecified by POSIX." What do you suggest that I reword it as?
    – Kusalananda
    yesterday



















5














Depending on the exact situation, this is either explicitly unspecified (so implementations may do as they will) or required to happen as you observed. In your exact scenario echo ab$, POSIX mandates the output "ab$" that you observed and it is not unspecified. A quick summary of all the different cases is at the end.



There are two elements: first tokenising into words, and then interpretation of those words.





Tokenisation



POSIX tokenisation requires a $ that is not the start of a valid parameter expansion, command substitution, or arithmetic expansion to be considered a literal part of the WORD token being constructed. This is because rule 5 ("If the current character is an unquoted $ or `, the shell shall identify the start of any candidates for parameter expansion, command substitution, or arithmetic expansion from their introductory unquoted character sequences: $ or ${, $( or `, and $((, respectively") does not apply, as none of those expansions are viable there. Parameter expansion requires a valid name to appear there, and an empty name is not valid.



Since this rule did not apply, we continue following until we find one that does. The two candidates are #8 ("If the previous character was part of a word, the current character shall be appended to that word.") and #10 ("The current character is used as the start of a new word."), which apply to echo a$ and echo $ respectively.



There is also a third case of the form echo a$+b which falls through the same crack, since + is not the name of a special parameter. This one we'll return to later, since it triggers different parts of the rules.



The specification thus requires that the $ be considered a part of the word syntactically, and it can then be further processed later on.
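A small way to see this word splitting in practice is to let the shell count the words it produced. This is a sketch only: the variable names count, first, second and third are chosen for the example, and the output in the comment is what the shells named in the question produce.

```shell
# Three words, each ending in (or consisting of) a literal '$'.
set -- a$ $ b$

count=$#
first=$1
second=$2
third=$3

printf '%s words: <%s> <%s> <%s>\n' "$count" "$first" "$second" "$third"
# 3 words: <a$> <$> <b$>
```

Each $ stays attached to the word it appeared in (rule 8), and the lone $ becomes a word of its own (rule 10).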





Word expansion



After the input has been parsed in this way, with the $ included in the word, word expansions are applied to each of the words that have been read. Each word is processed individually.



It is specified that:




If an unquoted '$' is followed by a character that is not one of the following:




  • A numeric character

  • The name of one of the special parameters (see Special Parameters)

  • A valid first character of a variable name

  • A <left-curly-bracket> ( '{' )

  • A <left-parenthesis>


the result is unspecified.




"Unspecified" is a particular term here meaning that




  1. A conforming shell can choose any behaviour in this case

  2. A conforming application cannot rely on any particular behaviour


In your example, echo ab$, the $ is not followed by any character, so this rule does not apply and the unspecified result is not invoked. There is simply no expansion incited by the $, so it is literally present and printed out.



Where it would apply is in our third case from above: echo a$+b. Here $ is followed by +, which is not a number, special parameter (@, *, #, ?, -, $, !, or 0), start of a variable name (underscore or an alphabetic from the portable character set), or one of the brackets. In this case, the behaviour is unspecified: a conforming shell is permitted to invent a special parameter called + to expand, and a conforming application should not assume that the shell does not. The shell could do anything else it liked as well, including reporting an error.



For example, zsh, including in its POSIX mode, interprets $+b as "is variable b set" and substitutes either 1 or 0 in its place. It similarly has extensions for ~ and =. This is conforming behaviour.



Another place this could happen is echo "a$ b". Again, the shell is permitted to do as it wishes, and you as the script author should escape the $ if you want literal output. If you don't, it may work, but you can't rely on it. This is the absolute letter of the specification, but I don't think this sort of granularity was intended or considered.
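For reference, these are the three standard ways to quote a $ so that the result no longer depends on what follows it. The variable names are only there to make the example checkable:

```shell
with_backslash=a\$b          # backslash quotes the next character
in_single_quotes='a$ b'      # nothing is special inside single quotes
in_double_quotes="a\$ b"     # inside double quotes, \$ is a literal $

printf '%s\n' "$with_backslash" "$in_single_quotes" "$in_double_quotes"
# a$b
# a$ b
# a$ b
```

printf '%s\n' is used rather than echo so that the output does not depend on how a particular echo treats its arguments.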





In summary





  • echo ab$: literal output, fully specified


  • echo a$ b: literal output, fully specified


  • echo a$ b$: literal output, fully specified


  • echo a$b: expansion of parameter b, fully specified


  • echo a$-b: expansion of special parameter -, fully specified


  • echo a$+b: unspecified behaviour


  • echo "a$ b": unspecified behaviour


For a $ at the end of a word, you are permitted to rely on the behaviour and it must be treated literally and passed on to the echo command as part of its argument. That is a conformance requirement on the shell.
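The specified rows of the summary can be checked with a few assignments. This is a sketch under the assumption that b is set to X beforehand; the names trailing, expanded and flags are made up for the example. The unspecified rows are deliberately left out, since their results may differ between shells.

```shell
b=X
set -- ab$ a$b a$-b

trailing=$1    # ab$ : the trailing $ is literal
expanded=$2    # aX  : expansion of parameter b
flags=$3       # a, then the current option flags ($-), then b

printf '%s\n' "$trailing" "$expanded" "$flags"
```

The third line of output varies with the shell's option flags, so only its first and last characters are predictable.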

























  • 1




    Awesome summary at the end
    – Harold Fischer
    15 hours ago










  • As an aside, this also means echo a$ b$ would be fully specified, correct?
    – Harold Fischer
    15 hours ago










  • @HaroldFischer Yes, each word could have its own $.
    – Michael Homer
    15 hours ago










  • Wow, thanks. Great teaching. I get the output of a0 in zsh 5.6.2 for echo a$+b. I understand it's the invention of zsh.
    – Christopher
    14 hours ago










  • @Christopher, yes, in zsh $+var or ${+var} expands to 1 if $var is set and 0 otherwise (see also $#var from csh to get the number of elements of the array, or the length of a variable). Note that zsh only tries to be POSIX compliant in sh emulation (when invoked as sh or after emulate sh). In sh emulation, $#var is disabled as $#var means something different in POSIX/Bourne, but $+var is not, as $+var is unspecified anyway by POSIX. See also $[var], unspecified by POSIX but used as arithmetic expansion by bash/zsh (based on an early POSIX draft IIRC).
    – Stéphane Chazelas
    14 hours ago













Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "106"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f492397%2fposix-shell-does-lose-its-special-meaning-if-it-is-the-last-character-in-a%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























3 Answers
3






active

oldest

votes








3 Answers
3






active

oldest

votes









active

oldest

votes






active

oldest

votes









19














A $ followed by an space (or no character (IMO)) is unspecified by POSIX.




The '$' character is used to introduce parameter expansion, command substitution, or arithmetic evaluation. If an unquoted '$' is followed by a character that is not one of the following:




  • A numeric character

  • The name of one of the special parameters (see Special Parameters)

  • A valid first character of a variable name

  • A <left-curly-bracket> ( '{' )

  • A <left-parenthesis>


the result is unspecified.




To make it explicit, an unquoted $ that is not followed by a character in this regex:



 [0-9@*#?$!_a-zA-Z{(-]


is explicitly unspecified: any result is allowed by POSIX.



That is: any specific result is not guaranteed by POSIX.



Or, if used, there is not way to know what would be done by following POSIX.





A quoted $ (either with " or ') must be followed by the quoting character, so, a $ could not be the last character of a word. Understand that a word must contain the quotes. From [2.3 Token Recognition][3]




(…) the result token shall contain exactly the characters that appear in the input (…), unmodified, including any embedded or enclosing quotes (…).




The only other option left is a quoted $ with a backslash, which is to be interpreted as the "raw" character losing its special meaning (if any).





Conclusion



So, yes, a trailing could $ lose its special meaning of starting an expansion, either by being backlash quoted or, while unquoted, by being unspecified.



However, all implementations that I know of accept a trailing $ as part of the preceding word (if any) without any error or warning.



In trailing I mean that the following character ends a word (<blank>, |, &, ;, <, >, or NUL) or the end of input was signaled.





Let's take a walk through the processing steps of the two basic sequences that interpret the command line. One divides the command line into tokens (and identify them) and is explained in 2.3 Token Recognition



echo a$a



With echo a$abc, at some point, the $ gets to be processed:




  1. step 1: As a $ is not the end of input, keep going.

  2. steps 2 and 3: The previous character a was not an operator, keep going.

  3. step 4: A $ is not a <backslash>, a single-quote, or a double-quote, keep going.

  4. step 5: The current character is an unquoted '$' or '`', the shell shall identify the start of any candidates for parameter expansion.

  5. step 5: The shell shall read sufficient input to determine the end of the unit to be expanded (as explained in the cited sections).

  6. Needs to go into the cited sections to decide if the token is valid.

  7. Section 2.6 Word Expansions: The '$' character is used to introduce an expansion.

  8. The unquoted '$' is followed by an a, it is one of the valid characters.

  9. Keep reading to the following <blank> to find the end of the unit to be expanded

  10. The $abc thus collected is delimited and tagged as an expansion.


echo a$ a



For echo a$ a all the steps above are the same until step 8, but here:




  1. The unquoted '$' is followed by a <blank>, it is not one of the valid characters, it is therefore unspecified. Generally, implementations delimit the token that is followed by an space, but do not tag it as an expansion.


echo a$



Again, most of the steps above apply for echo a$, but 8 change to:





  1. The unquoted '$' is followed by a NUL (the terminating character for a C string), it is therefore not one of the valid characters. It is therefore unspecified AFAICT.



    Following comments: If the interpretation of the string: If an unquoted '$' is followed by a character (…) is to claim that no character follows when there is a following NUL, newline, or that the end of the input was reached by some other indicator (EOT, etc.), then, the echo a$ is not a valid expansion (but doesn't fall into the unspecified case), the token gets delimited but not tagged as an expansion.



    Most implementations delimit the token that is followed by an invalid character or by the end of input, but do not tag it as an expansion, anyway.




In summary



echo a$abc # expansion of parameterabc, is valid and specified.
echo a$#c # expansion of special parameter#, valid and specified.
echo a$+c # unspecified behavior.
echo a$ c # is unspecified.
echo a$ # unspecified IMO, unclear by an alternate interpretation.
echo $ # Falls into the same issue.






share|improve this answer



















  • 2




    If we’re being fiddly, “if an unquoted ‘$’ is followed by a character that is not ...” and “a $ that is not followed by” are not the same thing. It appears that a $ at the end of input does create a literal word by the letter of the specification with no unspecified behaviour invoked.
    – Michael Homer
    18 hours ago










  • @MichaelHomer Are you saying that echo ab$ moreinput invokes unspecified behavior and echo ab$ does not?
    – Harold Fischer
    16 hours ago






  • 1




    Sorry, what I meant was that a $ followed by nothing at all, as in Harold's echo ab$, isn't "followed by a character that is not ..." because it isn't followed by a character. I wasn't talking about double-quoting (or quoting at all).
    – Michael Homer
    14 hours ago








  • 1




    I guess not then? Here's where my confusion lies: "$... followed by a character that is not ..." requires that $ is followed by a character, and that the character in question is not one on the list. If $ is not followed by anything, it certainly is not followed by a character. This is distinct from "$ not followed by a character on this list", which would encompass both $ followed by unlisted characters and $ not followed by any character.
    – Michael Homer
    14 hours ago






  • 1




    That is, "followed by a character that is not one of" and "not followed by a character that is one of" are not equivalent.
    – Michael Homer
    14 hours ago
















19














A $ followed by an space (or no character (IMO)) is unspecified by POSIX.




The '$' character is used to introduce parameter expansion, command substitution, or arithmetic evaluation. If an unquoted '$' is followed by a character that is not one of the following:




  • A numeric character

  • The name of one of the special parameters (see Special Parameters)

  • A valid first character of a variable name

  • A <left-curly-bracket> ( '{' )

  • A <left-parenthesis>


the result is unspecified.




To make it explicit, an unquoted $ that is not followed by a character in this regex:



 [0-9@*#?$!_a-zA-Z{(-]


is explicitly unspecified: any result is allowed by POSIX.



That is: any specific result is not guaranteed by POSIX.



Or, if used, there is not way to know what would be done by following POSIX.





A quoted $ (either with " or ') must be followed by the quoting character, so, a $ could not be the last character of a word. Understand that a word must contain the quotes. From [2.3 Token Recognition][3]




(…) the result token shall contain exactly the characters that appear in the input (…), unmodified, including any embedded or enclosing quotes (…).




The only other option left is a quoted $ with a backslash, which is to be interpreted as the "raw" character losing its special meaning (if any).





Conclusion



So, yes, a trailing could $ lose its special meaning of starting an expansion, either by being backlash quoted or, while unquoted, by being unspecified.



However, all implementations that I know of accept a trailing $ as part of the preceding word (if any) without any error or warning.



In trailing I mean that the following character ends a word (<blank>, |, &, ;, <, >, or NUL) or the end of input was signaled.





Let's take a walk through the processing steps of the two basic sequences that interpret the command line. One divides the command line into tokens (and identify them) and is explained in 2.3 Token Recognition



echo a$a



With echo a$abc, at some point, the $ gets to be processed:




  1. step 1: As a $ is not the end of input, keep going.

  2. steps 2 and 3: The previous character a was not an operator, keep going.

  3. step 4: A $ is not a <backslash>, a single-quote, or a double-quote, keep going.

  4. step 5: The current character is an unquoted '$' or '`', the shell shall identify the start of any candidates for parameter expansion.

  5. step 5: The shell shall read sufficient input to determine the end of the unit to be expanded (as explained in the cited sections).

  6. Needs to go into the cited sections to decide if the token is valid.

  7. Section 2.6 Word Expansions: The '$' character is used to introduce an expansion.

  8. The unquoted '$' is followed by an a, it is one of the valid characters.

  9. Keep reading to the following <blank> to find the end of the unit to be expanded

  10. The $abc thus collected is delimited and tagged as an expansion.


echo a$ a



For echo a$ a all the steps above are the same until step 8, but here:




  1. The unquoted '$' is followed by a <blank>, it is not one of the valid characters, it is therefore unspecified. Generally, implementations delimit the token that is followed by an space, but do not tag it as an expansion.


echo a$



Again, most of the steps above apply for echo a$, but 8 change to:





  1. The unquoted '$' is followed by a NUL (the terminating character for a C string), it is therefore not one of the valid characters. It is therefore unspecified AFAICT.



    Following comments: If the interpretation of the string: If an unquoted '$' is followed by a character (…) is to claim that no character follows when there is a following NUL, newline, or that the end of the input was reached by some other indicator (EOT, etc.), then, the echo a$ is not a valid expansion (but doesn't fall into the unspecified case), the token gets delimited but not tagged as an expansion.



    Most implementations delimit the token that is followed by an invalid character or by the end of input, but do not tag it as an expansion, anyway.




In summary



echo a$abc # expansion of parameter abc; valid and specified.
echo a$#c # expansion of special parameter #; valid and specified.
echo a$+c # unspecified behavior.
echo a$ c # unspecified.
echo a$ # unspecified IMO; unclear under an alternate interpretation.
echo $ # falls into the same issue.
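
The cases in the summary can be checked against a particular shell with a small script. This is only a sketch: show is a hypothetical helper (not part of any standard) that brackets each argument so word boundaries are visible, and the values given to abc and to the positional parameters are arbitrary. For the unspecified cases the output is whatever the shell at hand happens to do; ash, dash and bash are reported above to keep the $ as a literal character.

```shell
#!/bin/sh
# show: hypothetical helper that prints each of its arguments in
# brackets, making the resulting words visible.
show() {
    for arg in "$@"; do
        printf '[%s]' "$arg"
    done
    printf '\n'
}

abc=X              # arbitrary value for the specified a$abc case
set -- one two     # arbitrary positional parameters, so $# is 2

show a$abc         # specified: parameter expansion of abc
show a$#c          # specified: special parameter #
show a$+c          # unspecified by POSIX
show a$ c          # unspecified by POSIX
show a$            # the trailing-$ case under discussion
show $             # a lone $
```

Running the same script under several shells is a quick way to see whether they agree on the unspecified cases; agreement between shells is still not a guarantee from POSIX.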

























  • 2




    If we’re being fiddly, “if an unquoted ‘$’ is followed by a character that is not ...” and “a $ that is not followed by” are not the same thing. It appears that a $ at the end of input does create a literal word by the letter of the specification with no unspecified behaviour invoked.
    – Michael Homer
    18 hours ago










  • @MichaelHomer Are you saying that echo ab$ moreinput invokes unspecified behavior and echo ab$ does not?
    – Harold Fischer
    16 hours ago






  • 1




    Sorry, what I meant was that a $ followed by nothing at all, as in Harold's echo ab$, isn't "followed by a character that is not ..." because it isn't followed by a character. I wasn't talking about double-quoting (or quoting at all).
    – Michael Homer
    14 hours ago








  • 1




    I guess not then? Here's where my confusion lies: "$... followed by a character that is not ..." requires that $ is followed by a character, and that the character in question is not one on the list. If $ is not followed by anything, it certainly is not followed by a character. This is distinct from "$ not followed by a character on this list", which would encompass both $ followed by unlisted characters and $ not followed by any character.
    – Michael Homer
    14 hours ago






  • 1




    That is, "followed by a character that is not one of" and "not followed by a character that is one of" are not equivalent.
    – Michael Homer
    14 hours ago






















edited 7 hours ago

























answered yesterday









Isaac






















9














$ does not have a special meaning by itself (try echo $), only when it is combined with another character after it to form an expansion, e.g. $var (or ${var}), $(util), $((1+2)).



The $ gets its "special" meaning as defining an expansion in the POSIX standard under the section Token Recognition:




If the current character is an unquoted $ or `, the shell shall identify the start of any candidates for parameter expansion, command substitution, or arithmetic expansion from their introductory unquoted character sequences: $ or ${, $( or `, and $((, respectively. The shell shall read sufficient input to determine the end of the unit to be expanded. While processing the characters, if instances of expansions or quoting are found nested within the substitution, the shell shall recursively process them in the manner specified for the construct that is found. The characters found from the beginning of the substitution to its end, allowing for any recursion necessary to recognize embedded constructs, shall be included unmodified in the result token, including any embedded or enclosing substitution operators or quotes. The token shall not be delimited by the end of the substitution.




So, if $ does not form an expansion, other parsing rules come into effect:




If the previous character was part of a word, the current character shall be appended to that word.




That covers your ab$ string.



In the case of a lone $ (the "new word" would be the $ by itself):




The current character is used as the start of a new word.




The meaning of the generated word containing a $ that is not a standard expansion is explicitly defined as unspecified by POSIX.



Also note that $ is the last character in $$, which is the special parameter that holds the current shell's PID. In bash, !$ may invoke a history expansion (the last argument of the previous command). So in general, no, $ is not without meaning at the end of an unquoted word, but at the end of a word it does at least not denote a standard expansion.
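
The difference between $$ and a trailing lone $ can be seen with a couple of assignments. This is a minimal sketch with arbitrary variable names (pid and word); the literal result for word is what ash, dash and bash are reported to do in the question, not something this answer claims POSIX guarantees.

```shell
#!/bin/sh
# $$ is a standard expansion: the process ID of the current shell.
pid=$$

# A $ with nothing after it is kept as a literal character by the
# shells discussed here (ash, dash, bash).
word=ab$

printf 'pid:  %s\n' "$pid"
printf 'word: %s\n' "$word"
```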





























  • @Isaac I deleted the parenthesis that I'm assuming you're referring to.
    – Kusalananda
    yesterday










  • @Isaac Ah, I see. I thought "left unspecified" would mean the same as "defined as unspecified", but I could definitely make that wording better.
    – Kusalananda
    yesterday










  • It seems reasonable that In the case of a lone $ (the "new word" would be the $ by itself): and that is what has been generally implemented, but what sounds reasonable and what the spec states don't always match. What the spec clearly states is: "not followed by (…)", and a trailing $ character is "not followed by (…)". So, it is explicitly unspecified.
    – Isaac
    yesterday












  • @Isaac Yes. This is what I say. The section I'm quoting is on token recognition. The lone $ is recognised as a word token. It is not recognised as an expansion. The meaning of that lone $ word, i.e. the action that the shell takes, is unspecified. This is what I say.
    – Kusalananda
    yesterday












  • @Isaac (sigh), yes, and this is what I now have in my answer: "The meaning of the generated word containing a $ that is not a standard expansion is explicitly defined as unspecified by POSIX." What do you suggest that I reword it as?
    – Kusalananda
    yesterday
























edited yesterday

























answered yesterday









Kusalananda













  • @Isaac I deleted the parenthesis that I'm assuming you're referring to.
    – Kusalananda
    yesterday










  • @Isaac Ah, I see. I thought "left unspecified" would mean the same as "defined as unspecified", but I could definitely make that wording better.
    – Kusalananda
    yesterday










  • It seems reasonable that In the case of a lone $ (the "new word" would be the $ by itself): and that is what has been generally implemented, but what sounds reasonable and what the spec states don't always match. What the spec clearly states is: "not followed by (…)", and a trailing $ character is "not followed by (…)". So, it is explicitly unspecified.
    – Isaac
    yesterday












  • @Isaac Yes. This is what I say. The section I'm quoting is on token recognition. The lone $ is recognised as a word token. It is not recognised as an expansion. The meaning of that lone $ word, i.e. the action that the shell takes, is unspecified. This is what I say.
    – Kusalananda
    yesterday












  • @Isaac (sigh), yes, and this is what I now have in my answer: "The meaning of the generated word containing a $ that is not a standard expansion is explicitly defined as unspecified by POSIX." What do you suggest that I reword it as?
    – Kusalananda
    yesterday


















  • @Isaac I deleted the parenthesis that I'm assuming you're referring to.
    – Kusalananda
    yesterday










  • @Isaac Ah, I see. I thought "left unspecified" would mean the same as "defined as unspecified", but I could definitely make that wording better.
    – Kusalananda
    yesterday










  • It seems reasonable that In the case of a lone $ (the "new word" would be the $ by itself): and that is what has been generally implemented, but what sounds reasonable and what the spec states don't always match. What the spec clearly states is: "not followed by (…)", and a trailing $ character is "not followed by (…)". So, it is explicitly unspecified.
    – Isaac
    yesterday












  • @Isaac Yes. This is what I say. The section I'm quoting is on token recognition. The lone $ is recognised as a word token. It is not recognised as an expansion. The meaning of that lone $ word, i.e. the action that the shell takes, is unspecified. This is what I say.
    – Kusalananda
    yesterday












  • @Isaac (sigh), yes, and this is what I now have in my answer: "The meaning of the generated word containing a $ that is not a standard expansion is explicitly defined as unspecified by POSIX." What do you suggest that I reword it as?
    – Kusalananda
    yesterday
















@Isaac I deleted the parenthesis that I'm assuming you're referring to.
– Kusalananda
yesterday




@Isaac I deleted the parenthesis that I'm assuming you're referring to.
– Kusalananda
yesterday












@Isaac Ah, I see. I thought "left unspecified" would mean the same as "defined as unspecified", but I could definitely make that wording better.
– Kusalananda
yesterday




@Isaac Ah, I see. I thought "left unspecified" would mean the same as "defined as unspecified", but I could definitely make that wording better.
– Kusalananda
yesterday












It seems reasonable that In the case of a lone $ (the "new word" would be the $ by itself): and that is what has been generally implemented, but what sounds reasonable and what the spec states don't always match. What the spec clearly states is: "not followed by (…)", and a trailing $ character is "not followed by (…)". So, it is explicitly unspecified.
– Isaac
yesterday






It seems reasonable that In the case of a lone $ (the "new word" would be the $ by itself): and that is what has been generally implemented, but what sounds reasonable and what the spec states don't always match. What the spec clearly states is: "not followed by (…)", and a trailing $ character is "not followed by (…)". So, it is explicitly unspecified.
– Isaac
yesterday







































5














Depending on the exact situation, this is either explicitly unspecified (so implementations may do as they will) or required to happen as you observed. In your exact scenario echo ab$, POSIX mandates the output "ab$" that you observed and it is not unspecified. A quick summary of all the different cases is at the end.



There are two elements: first tokenising into words, and then interpretation of those words.





Tokenisation



POSIX tokenisation requires that a $ that is not the start of a valid parameter expansion, command substitution, or arithmetic expansion be considered a literal part of the WORD token being constructed. This is because rule 5 ("If the current character is an unquoted $ or `, the shell shall identify the start of any candidates for parameter expansion, command substitution, or arithmetic expansion from their introductory unquoted character sequences: $ or ${, $( or `, and $((, respectively") does not apply, as none of those expansions are viable there. Parameter expansion requires a valid name to appear there, and an empty name is not valid.



Since this rule does not apply, we continue down the list of rules until we find one that does. The two candidates are #8 ("If the previous character was part of a word, the current character shall be appended to that word.") and #10 ("The current character is used as the start of a new word."), which apply to echo a$ and echo $ respectively.



There is also a third case of the form echo a$+b which falls through the same crack, since + is not the name of a special parameter. This one we'll return to later, since it triggers different parts of the rules.



The specification thus requires that the $ be considered a part of the word syntactically, and it can then be further processed later on.
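As a rough illustration of the two tokenisation cases above, the following sketch sets the positional parameters from words containing a trailing or lone $ and prints each resulting word in angle brackets. The choice of set -- and printf is only for demonstration; the output described assumes a shell that behaves as ash, dash and bash were observed to in the question.

```shell
# Replace the positional parameters with four words.
# ab$ and a$, b$ : the $ is appended to the word being built (rule 8).
# $              : the $ starts a new word of its own (rule 10).
set -- ab$ $ a$ b$

# Print one argument per line, wrapped in angle brackets,
# so that word boundaries are visible.
printf '<%s>\n' "$@"
```

Each $ should arrive as an ordinary character of its word, giving four lines: <ab$>, <$>, <a$> and <b$>.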





Word expansion



After the input has been parsed in this way, with the $ included in the word, word expansions are applied to each of the words that have been read. Each word is processed individually.



It is specified that:




If an unquoted '$' is followed by a character that is not one of the following:




  • A numeric character

  • The name of one of the special parameters (see Special Parameters)

  • A valid first character of a variable name

  • A <left-curly-bracket> ( '{' )

  • A <left-parenthesis>


the result is unspecified.




"Unspecified" is a particular term here meaning that




  1. A conforming shell can choose any behaviour in this case

  2. A conforming application cannot rely on any particular behaviour


In your example, echo ab$, the $ is not followed by any character, so this rule does not apply and the unspecified result is not invoked. There is simply no expansion triggered by the $, so it remains literally present and is printed out.



Where it would apply is in our third case from above: echo a$+b. Here $ is followed by +, which is not a number, special parameter (@, *, #, ?, -, $, !, or 0), start of a variable name (underscore or an alphabetic from the portable character set), or one of the brackets. In this case, the behaviour is unspecified: a conforming shell is permitted to invent a special parameter called + to expand, and a conforming application should not assume that the shell does not. The shell could do anything else it liked as well, including reporting an error.



For example, zsh, including in its POSIX mode, interprets $+b as "is variable b set" and substitutes either 1 or 0 in its place. It similarly has extensions for ~ and =. This is conforming behaviour.



Another place this could happen is echo "a$ b". Again, the shell is permitted to do as it wishes, and you as the script author should escape the $ if you want literal output. If you don't, it may work, but you can't rely on it. This is the absolute letter of the specification, but I don't think this sort of granularity was intended or considered.
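If you want a literal $ without relying on the unspecified cases, quoting or escaping it sidesteps the question. A minimal sketch follows; the variable names single, escaped and plus are made up for this example.

```shell
# Three portable spellings of a literal dollar sign.
single='a$ b'      # single quotes: nothing inside is expanded
escaped="a\$ b"    # backslash inside double quotes keeps the $ literal
plus=a\$+b         # unquoted backslash escape, avoiding the unspecified $+ case

# Print each value on its own line.
printf '%s\n' "$single" "$escaped" "$plus"
```

The first two lines printed should both be a$ b and the third a$+b, regardless of what extensions the shell attaches to "$ " or $+.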





In summary





  • echo ab$: literal output, fully specified


  • echo a$ b: literal output, fully specified


  • echo a$ b$: literal output, fully specified


  • echo a$b: expansion of parameter b, fully specified


  • echo a$-b: expansion of special parameter -, fully specified


  • echo a$+b: unspecified behaviour


  • echo "a$ b": unspecified behaviour


For a $ at the end of a word, you are permitted to rely on the behaviour and it must be treated literally and passed on to the echo command as part of its argument. That is a conformance requirement on the shell.
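The fully specified rows of the summary can be tried out with a short script like the one below. It is only a sketch: the variable names and the value assigned to b are arbitrary, and the unspecified cases (a$+b and "a$ b") are deliberately left out because their result may differ between shells.

```shell
b=VALUE

literal=ab$     # trailing $: nothing follows it, so it stays literal
expanded=a$b    # $b: ordinary parameter expansion
flags=a$-b      # $-: the current option flags, followed by a literal b

printf '%s\n' "$literal" "$expanded" "$flags"
```

The first line printed should be ab$ and the second aVALUE. The third starts with a and ends with b, with whatever option flags the running shell reports in between (possibly none).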

























  • 1




    Awesome summary at the end
    – Harold Fischer
    15 hours ago










  • As an aside, this also means echo a$ b$ would be fully specified, correct?
    – Harold Fischer
    15 hours ago










  • @HaroldFischer Yes, each word could have its own $.
    – Michael Homer
    15 hours ago










  • Wow, thanks. Great teaching. I get the output of a0 in zsh 5.6.2 for echo a$+b. I understand it's the invention of zsh.
    – Christopher
    14 hours ago










  • @Christopher, yes, in zsh $+var or ${+var} expands to 1 if $var is set and 0 otherwise (see also $#var from csh to get the number of elements of an array, or the length of a variable). Note that zsh only tries to be POSIX compliant in sh emulation (when invoked as sh or after emulate sh). In sh emulation, $#var is disabled because $#var means something different in POSIX/Bourne, but $+var is not, as $+var is unspecified by POSIX anyway. See also $[var], which is unspecified by POSIX but used as arithmetic expansion by bash/zsh (based on an early POSIX draft, IIRC).
    – Stéphane Chazelas
    14 hours ago


























edited 14 hours ago

























answered 15 hours ago









Michael Homer




























