I had a quick look at Gilbert's text and here is what I understood:
You are given a curve $\gamma=(x,y):I\to\mathbb R^2$, where $I\subset\mathbb R$ is some open interval.
Suppose for example that the curve is smooth and that $\gamma'(t)\neq 0$ for all $t\in I$.
Then, as I derived elsewhere, the length of $\gamma$ is given by $$\int_I \|\gamma'(t)\| \,\mathrm dt.$$ If you use the strange-looking notation $\gamma' = (\mathrm dx, \mathrm dy)$, then you obtain the "$\mathrm ds = \sqrt{\mathrm dx^2+\mathrm dy^2}$" part. (I may add a justification for this notation below later; for now that part is incomplete.)
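To make the length formula concrete, here is a minimal numerical sketch. The example curve (the upper half of the unit circle), the helper name `curve_length`, and the choice of the midpoint rule are my own assumptions for illustration and do not come from Gilbert's text.

```python
import math

def curve_length(dgamma, a, b, n=10_000):
    """Approximate the integral of ||gamma'(t)|| over (a, b) with the midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h          # midpoint of the k-th subinterval
        dx, dy = dgamma(t)             # components of gamma'(t)
        total += math.hypot(dx, dy) * h
    return total

# Hypothetical example curve: gamma(t) = (cos t, sin t) on I = (0, pi).
def half_circle_derivative(t):
    return (-math.sin(t), math.cos(t))

half_circle_length = curve_length(half_circle_derivative, 0.0, math.pi)
print(half_circle_length)  # should be close to pi
```

Here $\|\gamma'(t)\| = 1$ for all $t$, so the integral is just the length of the interval $I$, as expected for a half circle of radius $1$.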
Furthermore, an application of the Implicit Function Theorem [2; Satz 169.1] (details left as an exercise) shows that for each $t_0\in I$ there exist, after relabeling the axes $x$ and $y$ if necessary, an open interval $J\subset I$ containing $t_0$ and a smooth function $f_J:x(J)\to\mathbb R$ such that $\gamma(t) = (x(t),f_J(x(t)))$ for all $t\in J$.
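Then the length of $\gamma\vert_J$ is $$\int_J \|\gamma'(t)\| \,\mathrm dt = \int_J \|(x(t),f_J(x(t)))'\| \,\mathrm dt=\int_J\lvert x'(t)\rvert\sqrt{1+f_J'(x(t))^2}\,\mathrm dt.$$ Since $\gamma'=x'\cdot(1,f_J'\circ x)$ does not vanish on $J$, neither does $x'$, so $x$ is strictly monotone on $J$ and you may substitute $u=x(t)$ to obtain $$\int_{x(J)}\sqrt{1+f_J'(u)^2}\,\mathrm du,$$ or, in your notation, "$\mathrm ds = \sqrt{1+\left(\frac{\mathrm dy}{\mathrm dx}\right)^2}\,\mathrm dx$". (Here, $\frac{\mathrm dy}{\mathrm dx}$ is short for $f_J'$.) (You do have to be wary of the sign of $x'$: the factor $\lvert x'(t)\rvert$ is what keeps the result positive when $x$ is decreasing.)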
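The remark about the sign of $x'$ can be checked numerically. The following is only a sketch under assumptions of my own: the example curve $\gamma(t)=(-t,t^2)$ on $J=(1,2)$, the helper `midpoint_integral`, and all variable names are hypothetical and chosen so that $x'(t)=-1<0$ and $f_J(u)=u^2$.

```python
import math

def midpoint_integral(g, a, b, n=20_000):
    """Midpoint-rule approximation of the oriented integral of g from a to b."""
    h = (b - a) / n                      # negative if b < a
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

# Hypothetical example: gamma(t) = (-t, t^2) on J = (1, 2),
# so gamma(t) = (x(t), f(x(t))) with x(t) = -t and f(u) = u^2.
x = lambda t: -t
dx = lambda t: -1.0
df = lambda u: 2.0 * u

t0, t1 = 1.0, 2.0

# Parametric form: integral over J of |x'(t)| * sqrt(1 + f'(x(t))^2) dt.
parametric = midpoint_integral(
    lambda t: abs(dx(t)) * math.sqrt(1.0 + df(x(t)) ** 2), t0, t1
)

# Graph form: integral over the interval x(J) = (-2, -1) of sqrt(1 + f'(u)^2) du.
u_lo, u_hi = sorted((x(t0), x(t1)))
graph = midpoint_integral(lambda u: math.sqrt(1.0 + df(u) ** 2), u_lo, u_hi)

# Naive substitution from x(t0) to x(t1), ignoring the sign of x'.
naive = midpoint_integral(lambda u: math.sqrt(1.0 + df(u) ** 2), x(t0), x(t1))

print(parametric, graph, naive)  # naive is the negative of the other two
```

The parametric and graph forms agree, while integrating from $x(1)=-1$ to $x(2)=-2$ without the absolute value flips the sign.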
This part is not finished yet.
Then every point $t\in I$ has a neighborhood $J\subset I$ such that the image of $\gamma\vert_J$ is an embedded $1$-dimensional smooth submanifold of $\mathbb R^2$. This follows from the Local Parametrization Theorem [1; Theorem 2.5]. Indeed, such an $f_J$ provides the local parametrization required by [1; Theorem 2.5].
Literature
[2] Harro Heuser, Lehrbuch der Analysis, Teil 2, 11th edition, 2000.