The short answer to "why do my DFT responses differ?" is that you have a chirp with a very small time-bandwidth product.
Chirps that last longer in time, or sweep over a larger frequency range, become closer to an ideal chirp.

(from https://en.wikipedia.org/wiki/Chirp_spectrum, though the page is also somewhat confusing overall)
The shape of the rect function in frequency has nothing to do with the sampling frequency $f_s$ (assuming you're sampling comfortably above the Nyquist rate), and nothing to do with adding extra zero padding.
I agree that the original book was confusing by drawing the graph as a straight line instead of a rectangle.
The post by @Envidia already illustrates how a larger time-bandwidth product makes the frequency response approach the ideal rect function, with the Gibbs ringing being unavoidable for finite-length discrete signals.
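Both claims above (the time-bandwidth product matters, while $f_s$ and zero padding do not) can be checked numerically. This is a minimal sketch, not the code from the question: the helper name in_band_energy_fraction, the complex chirp centred on t = 0, and the specific durations, bandwidths and sampling rates are all assumptions chosen for illustration. It measures how much of the chirp's energy lands inside the nominal band, which approaches 1 as the spectrum approaches the ideal rectangle.

```python
import numpy as np

def in_band_energy_fraction(duration, bandwidth, fs, pad_factor=16):
    """Fraction of a complex linear chirp's energy inside [-B/2, +B/2].

    Hypothetical helper for illustration only.
    """
    n = int(round(duration * fs))
    t = np.arange(n) / fs - duration / 2            # time axis centred on 0
    sweep_rate = bandwidth / duration               # Hz per second
    x = np.exp(1j * np.pi * sweep_rate * t ** 2)    # inst. frequency = sweep_rate * t
    nfft = pad_factor * n                           # zero padding only interpolates the DFT
    power = np.abs(np.fft.fft(x, nfft)) ** 2
    freqs = np.fft.fftfreq(nfft, d=1 / fs)
    in_band = np.abs(freqs) <= bandwidth / 2
    return power[in_band].sum() / power.sum()

# Same duration and sampling rate, very different time-bandwidth products
small_tb = in_band_energy_fraction(duration=1.0, bandwidth=1.0, fs=2000.0)
large_tb = in_band_energy_fraction(duration=1.0, bandwidth=200.0, fs=2000.0)
print(f"TB = 1:   {small_tb:.3f} of the energy is in band")
print(f"TB = 200: {large_tb:.3f} of the energy is in band")

# Same chirp (TB = 50), different sampling rates and amounts of zero padding
variants = [
    in_band_energy_fraction(1.0, 50.0, fs=500.0, pad_factor=4),
    in_band_energy_fraction(1.0, 50.0, fs=2000.0, pad_factor=4),
    in_band_energy_fraction(1.0, 50.0, fs=2000.0, pad_factor=16),
]
print("TB = 50 variants:", np.round(variants, 3))
```

The small time-bandwidth chirp should leak a noticeable share of its energy outside the band, while the TB = 50 variants should agree closely with each other regardless of sampling rate or padding.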
Re: the puzzle "why does the frequency 10 sinusoid move slower than the frequency 1->2 chirp?"
I just wanted to include this because I've made the same mistake and seen new signal processing students do this too. Most signal processing texts present mostly math formulas with continuous variables, so it's easy to think that writing cos(2 * pi * f * t) in Python or MATLAB means your angular phase is $2 \pi f t$ and therefore the frequency is the variable f, regardless of what the code variables t and f hold.
Though it seems like doing f = linspace(1, 2); cos(2 * pi * f * t) means that the cosine is "sweeping frequency from 1 to 2", the instantaneous frequency of $\cos(\phi(t))$ is the derivative of the phase, $\dot{\phi}(t)$ (as noted above), or $\dot{\phi}(t) / 2\pi$ in Hz, and not whatever the code variable f holds.
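Written out, with $f_{\text{inst}}$ as an assumed name for the instantaneous frequency in Hz and $t_0$ as an assumed name for the start time, the relationship between phase and frequency is:

$$
f_{\text{inst}}(t) = \frac{1}{2\pi} \frac{d\phi}{dt},
\qquad
\phi(t) = \phi(t_0) + 2\pi \int_{t_0}^{t} f_{\text{inst}}(\tau) \, d\tau
$$

For a constant frequency $f$ the integral gives $\phi(t) = \phi(t_0) + 2\pi f (t - t_0)$, which is the textbook $2 \pi f t$ up to a constant offset.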
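Therefore, doing this only appears to work if you start time at 0 and have f = linspace(1, 2). Even then it is not exact: by the product rule the slope of 2 * pi * f * t is $2\pi (f + t \dot{f})$, so the instantaneous frequency matches f only at t = 0 and overshoots it afterwards. Still, the phase starts on the frequency-1 line and ends on the frequency-2 line, so you get the plot you expect for the phase (the argument to the cosine) and the cosine itself: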

BUT if you have something like t = linspace(1, 2, n), the phase that sweeps the same frequency range is no longer just 2 * pi * f * t.
To sweep from $f_1$ to $f_2$, create the function $f(t)$ as a line running from $f_1$ at the first value of $t$ to $f_2$ at the last, solving for its slope and intercept from the values of $t$ you're considering. Then integrate it to get the phase, $\phi(t) = 2\pi \int f(t) \, dt$, rather than multiplying it by $t$. This ensures that the slope of the phase matches the frequencies you are trying to sweep over.
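The recipe above can be sketched in a few lines. The helper name linear_sweep_phase and its arguments are assumptions for illustration, not something from the original post; it builds the line $f(t)$ from the endpoints of whatever time axis you pass in, then integrates it numerically to get the phase.

```python
import numpy as np

def linear_sweep_phase(t, f_start, f_end):
    """Phase (radians) whose instantaneous frequency runs linearly from
    f_start at t[0] to f_end at t[-1]. Hypothetical helper."""
    slope = (f_end - f_start) / (t[-1] - t[0])      # Hz per second
    freq = f_start + slope * (t - t[0])             # f(t): line through both endpoints
    # Cumulative trapezoidal integral of f(t), starting from 0 at t[0]
    steps = 0.5 * (freq[1:] + freq[:-1]) * np.diff(t)
    integral = np.concatenate(([0.0], np.cumsum(steps)))
    return 2 * np.pi * integral

t = np.linspace(1, 2, 500)
phase = linear_sweep_phase(t, f_start=1, f_end=2)

# In this particular case f(t) = t, so the closed form is pi * (t**2 - 1)
closed_form = np.pi * (t ** 2 - 1)
print("max deviation from closed form:", np.max(np.abs(phase - closed_form)))
```

Because the time axis is an argument, the same function works whether t starts at 0, at 1, or anywhere else; the trapezoidal rule is exact for a linear $f(t)$ apart from rounding.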
Here's an example of the difference, using slightly different numbers: you are trying to "sweep frequency from 1 to 2", but the time you're looking at starts at 1 instead of 0 and runs for 1 second. The left plots are the phases, and the right plots are cos(phase), to show how fast they wiggle. The wrong way, in red, is doing
t = linspace(1, 2); f = linspace(1, 2); phase_wrong = 2 * pi * f * t. Comparing with the constant-slope phase for frequency 2, which definitely has a frequency of 2 everywhere, you can see that this method produces a phase with a steeper slope than that line, so it's not sweeping frequency from 1 to 2. (Here f equals t, so the phase is $2 \pi t^2$ and the frequency actually runs from 2 to 4.)
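A quick numerical check of what each phase is really sweeping, as a sketch using the same arrays as the example above. The helper name inst_freq_hz is an assumption; it differentiates the phase with np.gradient and converts from rad/s to Hz.

```python
import numpy as np

t = np.linspace(1, 2, 500)
f = np.linspace(1, 2, 500)

phase_wrong = 2 * np.pi * f * t          # the tempting version
phase_right = np.pi * (t ** 2 - 1)       # 2*pi times the integral of f from t = 1

def inst_freq_hz(phase, t):
    """Numerical derivative of the phase, converted from rad/s to Hz."""
    return np.gradient(phase, t) / (2 * np.pi)

freq_wrong = inst_freq_hz(phase_wrong, t)
freq_right = inst_freq_hz(phase_right, t)
print(f"wrong: {freq_wrong[0]:.2f} -> {freq_wrong[-1]:.2f} Hz")
print(f"right: {freq_right[0]:.2f} -> {freq_right[-1]:.2f} Hz")
```

The "wrong" phase comes out at roughly 2 to 4 Hz, while the integrated phase gives the intended 1 to 2 Hz.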

(code for the plot:)
import matplotlib.pyplot as plt
from numpy import pi, cos, linspace
n = 500
t = linspace(0, 2, n)
f = linspace(1, 2, n)
fig, axes = plt.subplots(1, 2)
ax = axes[0]
ax.plot(t, 2 * pi * 1 * t, c="b", label="freq=1")
ax.plot(t, 2 * pi * 2 * t, c="cyan", label="freq=2")
ax.plot(t, 2 * pi * f * t, c="red", label="sweep 1->2")
ax.set_title(r"$\phi(t)$")
ax.legend()
ax = axes[1]
ax.plot(t, cos(2 * pi * 1 * t), c="b", label="freq=1")
ax.plot(t, cos(2 * pi * 2 * t), c="cyan", label="freq=2")
ax.plot(t, cos(2 * pi * f * t), c="red", label="sweep 1->2")
ax.set_title(r"$\cos(\phi(t))$")
ax.legend()
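# Second figure: same target sweep, but time now starts at 1 instead of 0
fig, axes = plt.subplots(1, 2)
t = linspace(1, 2, n)
# Correct phase: 2*pi times the integral of f(tau) = tau from 1 to t
phi_right = pi * (t ** 2 - 1)
f = linspace(1, 2, n)
phi_wrong = 2 * pi * f * t
# Compare to a constant freq (straight phase line)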
const_freq = 2
ax = axes[0]
ax.plot(t, 2 * pi * const_freq * t, c="b", label=f"{const_freq = }")
# Make it start at same spot as constant freq phase line
offset1 = phi_right[0] - 2 * pi * const_freq * t[0]
ax.plot(t, phi_right - offset1, c="cyan", label="correct: f sweeps 1 -> 2")
offset2 = phi_wrong[0] - 2 * pi * const_freq * t[0]
ax.plot(t, phi_wrong - offset2, c="red", label="wrong way (2 pi f t)")
ax.set_title(r"$\phi(t)$")
ax.legend()
ax = axes[1]
ax.plot(t, cos(2 * pi * const_freq * t), c="b", label=f"{const_freq = }")
ax.plot(t, cos(phi_right), c="cyan", label="correct: f sweeps 1 -> 2")
ax.plot(t, cos(phi_wrong), c="red", label="wrong way (2 pi f t)")
ax.legend()
ax.set_title(r"$\cos(\phi(t))$")
plt.show()
(@OverLordGoldDragon also pointed out more real code examples of his, which can be more useful than the math-only formulas: https://overlordgolddragon.github.io/test-signals/#signal-general-forms-derivations )
[...] N -> inf? As I further noted, the energy of the magnitude spike doesn't near zero with increasing N, which it must if we are to conclude "it's because N is finite". It isn't even a spike, but worse, and unwrapped phase turned out linear - question updated. – OverLordGoldDragon Sep 03 '20 at 00:00

[...] fft to find real DFT, then finds magnitude as sqrt(imag^2 + real^2). Are you able to flatten the magnitude in MATLAB? – OverLordGoldDragon Sep 03 '20 at 00:32

If you want to read this in a textbook, Fundamentals of Radar Signal Processing by Mark A. Richards will go into LFM and show spectrums as well as the derivation of the analytical expressions. – Envidia Sep 03 '20 at 03:15

[...] tau=0.4 with 8*N padding - the width of the positive flat region doesn't budge. The limit is clearly a square wave here, not a flat line. I further discovered that one can obtain the phase/magnitude you've shown with a real chirp; the only difference is in how we defined t. I'll look closer - and thanks for the references. – OverLordGoldDragon Sep 03 '20 at 03:19

[...] bandWidth and Fs immensely affect phase & magnitude. I found that setting Fs = 2 * bandWidth does yield a flat line in the limit - but this is strange; shouldn't Fs >> bandWidth be more accurate? Yet that yields a square of ever-diminishing width, maybe even an impulse in the limit. E.g. compare bandWidth=1e6 / 2 with bandWidth=1e4, for Fs=1e6, tau=1. bandWidth=1e7 bombs entirely, guess per aliasing. – OverLordGoldDragon Sep 03 '20 at 03:40

[...] =1 from 0 to 0.5 of sampling frequency, meaning for all f. Found your recommended textbook, somehow the exact chapter needed: pg 170-175. It shows the limit as a square. You mentioned "zooms to the portion where the chirp exists in FD" - meaning what, exactly? How would one modify a chirp in time domain to yield a flat line for all f? – OverLordGoldDragon Sep 03 '20 at 04:11