Page 10 of The TeXbook says that \show\cs, where \cs is any control sequence, outputs the meaning of \cs. For example, \show\thinspace outputs

> \thinspace=macro:
->\kern .16667em .

Why does \show\ output:

> \^^M=macro:
->\ .

Why is there a ^^M here? I know that ^^M refers to the Enter key on the keyboard, but could you explain why it appears?

Joseph Wright
Y. zeng
  • I am confused about why the title needed to be changed to "\show" instead of "\show\ ", and why the "why" was changed to "where". – Y. zeng Oct 27 '23 at 11:47
  • I changed it because you are showing the end-of-line command – Joseph Wright Oct 27 '23 at 11:49
  • @JosephWright But "\show" wouldn't output "^^M" – Y. zeng Oct 27 '23 at 11:50
  • @JosephWright But why 'where'? The original made sense. I don't understand the 'where'? – cfr Oct 27 '23 at 11:51
  • @Y.zeng It will, but it is arguably a slightly different question. For example, David wouldn't need to mention space-stripping if the space weren't there. – cfr Oct 27 '23 at 11:56
  • @Y.zeng Er, not sure how that happened: I've fixed – Joseph Wright Oct 27 '23 at 11:58
  • @cfr I have tested and confirm that it will. So, there is no difference between "\show" and "\show\ "? It is weird. – Y. zeng Oct 27 '23 at 12:00
  • As David says in his answer, TeX strips spaces at end-of-lines before tokenization, so \ [end-of-line] and \[end-of-line] are identical – Joseph Wright Oct 27 '23 at 12:02
  • @Y.zeng Because the space is at the end of the line, it gets removed from the input stream. So \show\ <end of line> becomes \show\<end of line>. – cfr Oct 27 '23 at 12:03
  • @JosephWright I have a question: I can add a % symbol at the end of the line. But if it is "\show", I can't add % after it. If we could add it, we could tell the difference. – Y. zeng Oct 27 '23 at 12:06
  • @cfr Have you seen this post: https://tex.stackexchange.com/questions/699598/space-vs-in-tex?noredirect=1#comment1738912_699598 , which says the last space can be added by \. Does \show cause the last space to be removed? – Y. zeng Oct 27 '23 at 12:08
  • @Y.zeng OK, I've reverted as this is clearly part of what is confusing you – Joseph Wright Oct 27 '23 at 12:08
  • @Y.zeng The \ there is because the macro right before would otherwise swallow the space. But \ at the end of a line won't stop TeX removing the space. – cfr Oct 27 '23 at 12:09
  • @cfr If the space at the end of the line is removed, what is the meaning of the last "\" in "\show\"? The last "\" has no meaning. – Y. zeng Oct 27 '23 at 12:16
  • @Y.zeng That's formally a different question - \[newline] is the control sequence you get with [newline] - it's just a character like [space] or a or whatever, so can be used to make one – Joseph Wright Oct 27 '23 at 12:19
  • @Y.zeng It applies to the new line. So you get an escaped new line, maybe. – cfr Oct 27 '23 at 12:19
  • @JosephWright If "\show\ " is equal to "\show", this means that nothing can be added after the backslash, since there may not be a next line after the text. The TeXbook doesn't say this. – Y. zeng Oct 27 '23 at 12:24
  • @cfr If there is no new line after it? – Y. zeng Oct 27 '23 at 12:24
  • @Y.zeng As David has explained, this happens because TeX strips end-of-line spaces - if you had the \show\ in the middle of a line of other content, it would be different – Joseph Wright Oct 27 '23 at 12:25
  • @JosephWright As both of us know, what you get from \ at the end of a line of .tex-input also depends on the value of \endlinechar at the time of reading/pre-processing the line of .tex-input in question. The question is about the control sequence token whose name is formed by the carriage return character whose codepoint is 13(dec) in ASCII. Probably that token can colloquially be called "control carriage return" like in the TeXbook the control sequence token whose name is formed by the space character, whose codepoint is 32(dec) in ASCII, is called control space? – Ulrich Diez Oct 27 '23 at 13:12
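
The difference discussed in the comments above, between a backslash-space in the middle of a line and a backslash at the end of a line, can be sketched in plain TeX. This is only a minimal illustration, assuming plain TeX with the default \endlinechar=13; \scrollmode is there only so that each \show does not stop and wait for input.

```tex
% Assumes plain TeX (run with `tex`) and the default \endlinechar=13.
\scrollmode
% Backslash-space followed by more material on the same line:
% the space is not at the end of the line, so this is control space.
\show\ \relax
% Writing \^^M explicitly names the same token that a backslash
% at the very end of a line produces.
\show\^^M
\bye
```

Under these assumptions the first \show should report the primitive control space, and the second the macro \^^M that plain.tex defines as \ .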

1 Answer

How non-printable characters appear in TeX's output can be customised with the TCX options on the command line, but by default a newline is shown as ^^M (character 13). A \ at the end of a line is \<newline>, that is \^^M, because the trailing space character is stripped by TeX's file-reading routines before the input is tokenised. If you end the line with \ % instead, you will get a \  (control space) shown, as the control sequence will then have the name "space", not the name "newline".
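
As a minimal sketch, assuming plain TeX and the default \endlinechar=13, the two cases can be compared side by side. The \scrollmode line is only an addition so that \show does not pause for input.

```tex
% Assumes plain TeX with the default \endlinechar=13.
\scrollmode
% On the next line the backslash is the last character. Any spaces typed
% after it would be stripped, so the name is the end-of-line character.
\show\
% Here the % discards the end of the line, so the space after the
% backslash survives and the token is control space.
\show\ %
\bye
```

With plain TeX the first \show should print `> \^^M=macro:` followed by `->\ .`, as in the question, and the second should print `> \ =\ .` because control space is a primitive.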

Joseph Wright
David Carlisle
  • And plain.tex, line 505 defines \^^M as \. – wipet Oct 27 '23 at 11:49
  • Could you explain why ^^M represents the character 13? Don't you feel ^^M is a little weird? – Y. zeng Oct 27 '23 at 11:56
  • @Y.zeng if you subtract 32 (0x40) from M you have the ASCII code of the carriage return. See https://upload.wikimedia.org/wikipedia/commons/1/1b/ASCII-Table-wide.svg. This is why the character with code zero can be written ^^@, etc. – Rmano Oct 27 '23 at 11:59
  • It is the TeX syntax for ctrl-M, which is carriage return, \r, just as ^^J is ctrl-J, which is newline, \n. This is not TeX-specific (apart from the ^^ syntax); ctrl-M has end-of-line behaviour in general. – David Carlisle Oct 27 '23 at 12:03
  • @Rmano the ASCII code of the M is 77. 77-32=45. But the carriage return is 13. Did I make a mistake? – Y. zeng Oct 27 '23 at 12:04
  • @Y.zeng the offset for ctrl is 64 not 32 – David Carlisle Oct 27 '23 at 12:05
  • @Y.zeng It is that I forgot my multiplication; I am better in hex than in decimal (yep, 4x16 is not 32) – Rmano Oct 27 '23 at 12:07
  • @DavidCarlisle So ^^ represents ctrl key? – Y. zeng Oct 27 '23 at 12:10
  • @Rmano May you tell me why there is 4x16? – Y. zeng Oct 27 '23 at 12:10
  • @Y.zeng well it's not documented as that, it is documented as offsetting by 64, but ctrl is defined the same way (unless you have a non standard keyboard mapping, so yes) – David Carlisle Oct 27 '23 at 12:11
  • 0x40 in hexadecimal -> 4*16=64 in decimal – Rmano Oct 27 '23 at 12:11
  • @Rmano Where did you find 0x40? By your link I only find the 4D. – Y. zeng Oct 27 '23 at 12:18
  • @Y.zeng just standard way to write hex numbers. 4 is the high nibble of 4D, 4D-40=D... Using 0x in front of a number is almost a standard way to say it's hexadecimal. – Rmano Oct 27 '23 at 12:52
  • @wipet yes latex the same (line 565 of latex.ltx) – David Carlisle Oct 27 '23 at 13:12
  • Why did you say \<newline> and not \<carriage return>? || Probably it is worth mentioning that a backslash character (trailed by some space characters) at the end of a line of .tex-input being tokenized as a control sequence token whose name is formed by the carriage return character is due to \endlinechar usually having the value 13? And iirc it was you who drew my attention towards the circumstance that with some implementations of TeX (deviating from what is written in the TeXbook) not only space characters but also horizontal tab characters at the right ends of lines are stripped.;-) – Ulrich Diez Oct 27 '23 at 13:24
  • Why does ^^M represent the character 13? TeX's ^^⟨character⟩-notation is modelled on caret-notation for non-printable ASCII control characters. And caret-notation in turn is designed so that in most cases the character whose codepoint is the number k in ASCII is represented by means of the k-th letter of the Latin alphabet. E.g., M is the 13th letter in the Latin alphabet. – Ulrich Diez Oct 27 '23 at 13:30
  • @UlrichDiez well since tex normalizes 10, 13, and 10-13 pairs on input (to \endlinechar=13) the names are a bit arbitrary to be honest. – David Carlisle Oct 27 '23 at 13:31
  • @DavidCarlisle So the point of view is what a user might have typed. Now I got it. Thanks. ;-) – Ulrich Diez Oct 27 '23 at 13:33
  • @DavidCarlisle linefeed, carriage return and linefeed-carriage return all being treated as end of line when reading input - am I right when assuming that this is specified neither in the TeXbook nor in TeX the program, but is a peculiarity of Web2C-implementations? – Ulrich Diez Oct 27 '23 at 13:48
  • @UlrichDiez tex just assumes that the input consists of "lines"; it does not have to be a stream with separator characters, it could be a record-based system, or a punched card per line, or whatever. So yes, this is web2c, but it is an explicitly authorised system-specific part of tex.web – David Carlisle Oct 27 '23 at 13:57
  • @UlrichDiez web2c is a "well-developed pascal site" :-) From tex.web: "Since the inner loop of |input_ln| is part of \TeX's ``inner loop''---each character of input comes in at this place---it is wise to reduce system overhead by making use of special routines that read in an entire array of characters at once, if such routines are available. The following code uses standard \PASCAL\ to illustrate what needs to be done, but finer tuning is often possible at well-developed \PASCAL\ sites. @^inner loop@>" – David Carlisle Oct 27 '23 at 14:01
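
The offset of 64 mentioned in the comments above can be checked inside TeX itself. The following is a small sketch, assuming plain TeX; it uses only \message, \number and the backtick character-constant syntax, and the labels in the messages are just illustrative text.

```tex
% Assumes plain TeX. Each \message writes a character code to the
% terminal and the log file.
\message{M: \number`\M}            % 77, the ASCII code of the letter M
\message{ctrl-M: \number`\^^M}     % 13 = 77 - 64, carriage return
\message{ctrl-J: \number`\^^J}     % 10 = 74 - 64, line feed
\message{hex form: \number`\^^0d}  % 13 again, via two lowercase hex digits
\bye
```

The terminal and log should show 77, 13, 10 and 13 in that order: ^^M is the character 64 positions below M, and ^^0d is another way of writing the same character.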