
In my Calculus class, my math teacher said that differentials such as $dx$ are not numbers, and should not be treated as such.

In my physics class, it seems like we treat differentials exactly like numbers, and my physics teacher even said that they are in essence very small numbers.

Can someone give me an explanation which satisfies both classes, or do I just have to accept that the differentials are treated differently in different courses?

P.S. I took Calculus 2 so please try to keep the answers around that level.

P.P.S. Feel free to edit the tags if you think it is appropriate.

Qmechanic
Ovi
  • Related: https://physics.stackexchange.com/q/70376/2451 , https://mathoverflow.net/q/178267/13917 – Qmechanic Jan 09 '14 at 00:53
    @dmckee Depending on what you mean by "such shortcuts" I think they're perfectly rigorous either for reasons seen at anupam's link or because of the formalizations of nonstandard analysis or smooth infinitesimal analysis. (for a very brief demonstration of what calculus is like with the former, see http://math.stackexchange.com/a/623657/26369 ) – Mark S. Jan 09 '14 at 03:14
  • @MarkS. Yes, it seems that my nth-hand oral tradition information was much out of date. – dmckee --- ex-moderator kitten Jan 09 '14 at 03:40
    As a non-expert (although educated in Physics), it seemed to be enough for me to realize that things like going from $\frac{dy}{dx} = x$ to $dy = x dx$ or using $\frac{dy}{dx} = \frac{dy}{dt} \frac{dt}{dx}$ are not algebraic operations as they might first appear to be. We can do that, but there is a great deal of complicated math required to justify it (as seen in these answers). So I didn't think of them as numbers. I just thought of what I was doing as a shortcut that allowed me to keep track of what I was doing, and I made sure I never tried to apply algebraic operations blindly. – jpmc26 Jan 09 '14 at 08:42
    When somebody tells you that something is not a number, it probably means that it is not a real number, or that it doesn't belong to the real numbers or any other number set that the person speaking has in mind. For example people may say that infinite is not a number, yet it is - for example in Riemann sphere - but it is not a real number. The same goes for infinitesimals. – Theraot Jan 09 '14 at 09:55
    Since this is a frequently asked question by physics students, and phrased in physics language, it seems a valid question on Phys.SE. Also note that Math.SE already has several posts on differentials and infinitesimals. – Qmechanic Jan 09 '14 at 13:46
  • @jpmc26 The math is not so complicated. See my answer below. It becomes a bit more complicated if you want to do as much as possible without coordinates. But, that is not necessary for Calculus 2. – Tobias Jan 11 '14 at 13:05
    There is no effort to incorporate physics into this question, and it needs to be migrated, despite what QM says. – Larry Harson Jan 14 '14 at 23:26
    This is a fine question, for the differing treatment of differentials and infinitesimals between physics courses and calculus courses have caused students who have succeeded in Calculus class to be ill-prepared for their physics class! ( See https://journals.aps.org/prper/pdf/10.1103/PhysRevSTPER.9.020108) – James Fair Nov 12 '19 at 09:39
    It is more like physics courses are ill-prepared for students who have succeeded in Calculus class. ☺ – beroal May 21 '20 at 11:24

9 Answers


There is an old tradition, going back all the way to Leibniz himself and carried on a lot in physics departments, to think of differentials intuitively as "infinitesimal numbers". Through the course of history, big minds have criticized Leibniz for this (for instance the otherwise great Bertrand Russell in Chapter XXXI of "A History of Western Philosophy" (1945)) as being informal and unscientific.

But then something profound happened: William Lawvere, one of the most profound thinkers of the foundations of mathematics and of physics, taught the world about topos theory and in there about "synthetic differential geometry". Among other things, this is a fully rigorous mathematical context in which the old intuition of Leibniz and the intuition of plenty of naive physicists finds a full formal justification. In Synthetic differential geometry those differentials explicitly ("synthetically") exist as infinitesimal elements of the real line.

A basic exposition of how this works is on the nLab at

Notice that this is not just a big machine to produce something you already know, as some will inevitably hasten to think. On the contrary, this leads the way to the more sophisticated places of modern physics. Namely the "derived" or "higher geometric" version of synthetic differential geometry includes modern D-geometry which is at the heart for instance of modern topics such as BV-BRST formalism (see e.g. Paugam's survey) for the quantization of gauge theories, or for instance geometric Langlands correspondence, hence S-duality in string theory.
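To get a concrete feel for nilpotent infinitesimals without any topos machinery, one can play with the ring of dual numbers (mentioned in the comments below as a classical incarnation of the equation $\epsilon^2=0$). The following Python sketch is my own illustration, not part of any SDG library:

```python
# Dual numbers: formal expressions a + b*eps with eps**2 = 0.  Evaluating a
# polynomial at x + eps yields f(x) + f'(x)*eps in one pass, mirroring the
# Kock-Lawvere idea that a function is determined to first order on the
# infinitesimal interval D.

class Dual:
    def __init__(self, re, eps=0.0):
        self.re, self.eps = re, eps

    def _coerce(self, other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._coerce(other)
        return Dual(self.re + other.re, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._coerce(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps**2 = 0
        return Dual(self.re * other.re,
                    self.re * other.eps + self.eps * other.re)

    __rmul__ = __mul__

def f(x):
    return x * x * x + 2 * x     # f(x) = x^3 + 2x, so f'(x) = 3x^2 + 2

y = f(Dual(2.0, 1.0))            # evaluate at the displaced point 2 + eps
print(y.re, y.eps)               # 12.0 14.0  ->  f(2) = 12, f'(2) = 14
```

This is of course only the tiniest corner of the synthetic picture, but it shows the Leibnizian intuition computing correctly once $\epsilon^2=0$ is imposed.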

Urs Schreiber
    +1 Not quite at the OP's asked-for level, but most interesting indeed, and a collection of posts of this kind could make this a killer question and set of answers. You've emboldened me to write up Robinson's nonstandard analysis when I have more time if someone doesn't beat me to it. – Selene Routley Jan 09 '14 at 01:16
  • Is the infinitesimal interval $D$ used to cheat new generalized elements into our space $X$, which aren't there in the classical formulations? Some of the links speak of $D$ as a subset of a ring. Is there always an ordering on the infinitesimal interval and does one think of this as a collection of different small things, equipped with some notion of size for distinguishing them? It also seems like the Kock-Lawvere axiom (a functional "$ϵ^2=0$"?) seems suitable for extension to infinitesimal calculi like Ito calculus etc., is that right? Do we do super-stuff directly by adding this object? – Nikolaj-K Jan 10 '14 at 20:14
    There are two complementary aspects to this. On the one hand the categorical logic of toposes allows to formally speak of the subset of the real line of elements that square to 0. This is just what people following Leibniz intuitively did anyway, but categorical logic shows that and how exactly this is consistent. This is then how notably Anders Kock (http://home.imf.au.dk/kock/) wrote his two textbooks on synthetic differential geometry (http://home.imf.au.dk/kock/SGM-final.pdf): he speaks "synthetically" of the subset D of R on the elements that square to 0 and derives all of diff geometry. – Urs Schreiber Jan 11 '14 at 00:17
  • On the other hand one can choose to build concrete models for the axioms in which notably the textbooks by Kock are written, hence for toposes that validate the Kock-Lawvere axioms. In the typical such models the category of smooth manifolds is enlarged somewhat by objects known as "smooth loci", which include for instance the space formally dual to the "ring of dual numbers", which is just the ring embodying the equation "epsilon^2 = 0". This more concrete incarnation of SDG can be phrased entirely in classical logic and hence shows which classical notions embody the idea of infinitesimals. – Urs Schreiber Jan 11 '14 at 00:21
  • Hi Urs, while I agree that Lawvere's approach is a great accomplishment, presenting the historical development as you did seems to shortshrift Robinson's contribution to interpreting Leibniz's infinitesimal procedures. Robinson's original paper appeared in 1961, which would be earlier than Lawvere's approach. – Mikhail Katz Dec 10 '15 at 09:38
  • @Wet, I would encourage you to carry out your plan of writing up an answer in terms of Robinson's framework. – Mikhail Katz Dec 10 '15 at 09:39

(I'm addressing this from the point of view of standard analysis)

I don't think you will have a satisfactory understanding of this until you get to multivariable calculus, because in calculus 2 it's easy to think that $\frac{d}{dx}$ is all you need and that there's no need for $\frac{\partial}{\partial x}$ (this is false, and it has to do with why, in general, derivatives do not always behave like fractions). So that's one reason why differentials are not like numbers. There are some ways that differentials are like numbers, however.

I think the most fundamental bit is that if you're told that $f dx=dy$, this means that $y$ can be approximated as $y(x)=y(x_0)+f\cdot(x-x_0)+O((x-x_0)^2)$ close to the point $x_0$ (this raises another issue*). Since this first-order term is really all that matters after one applies the limiting procedures of calculus, this gives an argument for why such inappropriate treatment of differentials is allowable - higher-order terms don't matter. This is a consequence of Taylor's theorem, and it is what allows your physics teacher to treat differentials as very small numbers, because $x-x_0$ is like your "$dx$" and it IS a real number. What allows you to do things you can't do with a single real number is that the formula for $y(x)$ holds for all $x$, not just some $x$. This lets you apply all the complicated tricks of analysis.

If I get particularly annoyed at improper treatment of differentials and I see someone working through an example where they write, "Now we take the differential of $x^2+x$ giving us $(2x+1)dx$", I may imagine $dx$ being a standard real number, and that there's a little $+O(dx^2)$ tacked off to the side.

Your math teacher might argue, "You don't know enough about those theorems to apply them properly, so that's why you can't think of differentials as similar to numbers", while your physics teacher might argue, "The intuition is the really important bit, and you'd have to learn complicated math to see it as $O(dx^2)$. Better to focus on the intuition."

I hope I cleared things up instead of making them seem more complicated.

*(The O notation is another can of worms and can also be used improperly. Using the linked notation I am saying "$y(x)-y(x_0)-f\cdot(x-x_0)=O((x-x_0)^2)$ as $x\to x_0$". Note that one could see this as working against my argument - It's meaningless to say "one value of $x$ satisfies this equation", so when written in this form (which your physics prof. might find more obtuse and your math prof. might find more meaningful) it's less of an equation and more of a logical statement.)

See also: https://mathoverflow.net/questions/25054/different-ways-of-thinking-about-the-derivative

  • good answer, but only scratching the surface. Apart from the square of an infinitesimal, there is also the 'square' of an infinitesimal (which basically is a matrix, think of ds^2 in Landau-Lifshitz Volume II). And ds is actually not a differential form (as far as I can tell, since it is not linear), it seems a mess. – lalala Apr 21 '21 at 08:54

I think your math teacher is right. One way to see that differentials are not ordinary numbers is to look at their relation to so-called $1$-forms. I do not know if you have already seen forms in calculus 2, but they are easy to look up on the internet.

Since you chose a tag "integrals" in your question, let me give you an example based on an integral. Let's say you have a function $f(x^2+y^2)$ and want to integrate it over some area $A$:

$$\int_A f(x^2+y^2) \, dx \, dy$$

The important thing to realize here is that the $dx\,dy$ is actually just an abbreviation for $dx\wedge dy$. This $\wedge$ thingy is an operation (the wedge product - much like multiplication, but with slightly different rules) that can combine forms (in this case it combines two $1$-forms into a $2$-form). One important rule for wedge products is anti-commutation:

$$dx\wedge dy=-dy\wedge dx$$

This makes sure that $dx\wedge dx=0$ (where a physicist could cheat by saying that he neglects everything of order $O(dx^2)$, but that is mixing apples and pears and is frankly misleading). Why would differentials in integrals behave like this, and where is the physical meaning? Well, here you can think about the 'handedness' of a coordinate system. For instance, the integration measure $dx\wedge dy\wedge dz$ is Cartesian 'right-handed'. You can make it 'left-handed' by commuting the $dx$ with $dy$ to obtain $-dy\wedge dx\wedge dz$, but then the minus sign appears in front, which makes sure that your integration in a 'left-handed' coordinate system still gives you the same result as the initial 'right-handed' one.

In any case, to come back to the above integral example, let's say you like polar coordinates better to perform your integration. So you do the following substitution (assuming you already know how to take total differentials):

$$x = r \cos \phi~~~,~~~dx = dr \cos \phi - d\phi\, r \sin \phi$$ $$y = r \sin \phi~~~,~~~dy = dr \sin \phi + d\phi\, r \cos \phi$$

Multiplying out your $dx\wedge dy$ you find what you probably already know and expect:

$$dx\wedge dy = (dr \cos \phi - d\phi\, r \sin \phi)\wedge(dr \sin \phi + d\phi\, r \cos \phi)$$ $$ = \underbrace{dr\wedge dr}_{=0} \sin \phi\cos \phi + dr\wedge d\phi\, r \cos^2 \phi - d\phi\wedge dr\, r \sin^2 \phi - \underbrace{d\phi\wedge d\phi}_{=0}\, r^2 \cos \phi \sin \phi $$ $$=r(dr\wedge d\phi \cos^2 \phi - d\phi\wedge dr \sin^2 \phi)$$ $$=r(dr\wedge d\phi \cos^2 \phi + dr\wedge d\phi \sin^2 \phi)$$ $$=r\, dr\wedge d\phi ( \cos^2 \phi + \sin^2 \phi)$$ $$=r\, dr\wedge d\phi $$

With this the integral above expressed in polar coordinates will correctly read:

$$\int_A f(r^2)r\, dr \, d\phi$$

Here we suppressed the wedge product. It is important to realize that if we had not treated the differentials as $1$-forms here, the transformation of the integration measure $dx \, dy$ into the one involving $dr$ and $d\phi$ would not have worked out properly!
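As a sanity check: the factor $r$ is exactly the Jacobian determinant of the substitution, which is what the wedge-product algebra computes. A short sympy verification (my addition, just to confirm the computation above):

```python
import sympy as sp

r, phi = sp.symbols('r phi', positive=True)
x = r * sp.cos(phi)          # the substitution from the answer
y = r * sp.sin(phi)

# dx ^ dy picks up det(J) when changing coordinates, where J is the
# Jacobian of (x, y) with respect to (r, phi):
J = sp.Matrix([x, y]).jacobian([r, phi])
print(sp.simplify(J.det()))  # r
```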

Hope this example was down to earth enough and provides some feeling for how differentials are not simply very small numbers.

Michael Hardy
Kagaratsch
  • Interesting interpretation of differentials. But it should be said that the interpretation where differentials are something like small numbers works well too in your example. In this traditional view, there is no need to require that $dxdy = rdrd\phi$, but instead we require that the latter expression gives standard hypervolume (that is the standard volume of the corresponding domain), up to first order in the differentials. This then leads to determinant of the Jacobi matrix and to the factor $r$ in $rdrd\phi$. – Ján Lalinský Jan 09 '14 at 11:13
    It is misleading to interpret $dx\wedge dx=0$ as neglecting terms of $O(dx^2)$. The former is a geometrical thing: the area of a parallelogram with degenerate sides is zero. The latter is ingrained in the theory. – Emilio Pisanty Jan 09 '14 at 11:18
  • @Ján: I am not completely sure if I understand what you mean, but I feel that, since forms effectively encode the geometry of the space which is being integrated over, if you really choose to define differentials as simply very small numbers with no operator-like nature whatsoever, then you will have to impose the geometry bit in some other way. – Kagaratsch Jan 09 '14 at 11:24
  • @Emilio: I completely agree. That is why I pointed it out as cheating. Which is bad. In no way did I endorse it. Actually, let me edit my post and emphasize it more clearly, to avoid misunderstandings like this. – Kagaratsch Jan 09 '14 at 11:25
  • @Kagaratsch: if the integration is made in Euclidean space, we can forget about geometry, it is all just simple exchange of variables. Perhaps in curved spaces forms get more useful. – Ján Lalinský Jan 09 '14 at 11:29
  • @Ján: Personally, I usually prefer to keep things more general. Feels more rigorous, in a way. – Kagaratsch Jan 09 '14 at 11:37
  • Things should be made as simple as possible... although I like your example of differential forms, I do not think it is reasonable to explain differential by differential forms. It should be the other way around. – Ján Lalinský Jan 09 '14 at 11:46
  • Well, in my Analysis course back in the day differentials were introduced per definition as elements of the dual space, and therefore 1-form valued objects. Our Prof. actually explicitly disencouraged any 'simpler' notion for differentials. So, I don't know... – Kagaratsch Jan 09 '14 at 12:00
    I think that this interpretation of the change of coordinates is wrong. The integral of an $n$-form over an $n$-chain $c:[0,1]^n \to \mathbb R ^n$ is defined as $$\intop _c \omega = \intop _{[0,1]^n} c^* \omega , $$ because this gives exactly the result of a change of coordinates, if $\omega = f\,\text d x^1 \wedge \dots \wedge \text d x^n$. So if $c([0,1]^n)=A$, we may write $$\intop _A f = \intop _c \omega$$ as a consequence of the definition of the integral of forms. In other words, the change of coordinates through the pullback doesn't constitute a proof of the change of variables formula. – pppqqq Jan 09 '14 at 12:03
  • To see this from another light, since the proof that $g^* f\,\text dx^1 \dots = f\circ g\, |\det g'|\, \text d y^1\dots$ is not difficult at all, it wouldn't make sense to make all that fuss to prove the change of variables formula for integrals through complicated things like the partition of unity. – pppqqq Jan 09 '14 at 12:07
  • Last comment: this is not a critique to the suggestion to study differential forms, at all. But I think that Ron Maimon's answer here http://physics.stackexchange.com/questions/32296/introduction-to-differential-forms-in-thermodynamics is relevant. – pppqqq Jan 09 '14 at 12:14
  • Actually, $dx\,dy$ is short for the absolute value of $dx\wedge dy$! You can see this if you use an order-reversing coordinate transformation (say change to $d\phi\,dr$ instead of to $dr\,d\phi$). – Toby Bartels Jan 09 '14 at 15:12
  • @pppqqq: I am sorry, but I think your comments are beside the point here. This answer is in not supposed to be a proof of anything, but rather an example of how differentials can be very much unlike simple small numbers. – Kagaratsch Jan 09 '14 at 19:08
    @Toby: If this was true, any integral $\int_{-\infty}^{\infty} dx f(x^2)$ would be trivially zero which is clearly not the case. (Just substitute $x\to -y$. If $dx$ in the integral is abbreviation for $|dx|$, then you get $\int_{\infty}^{-\infty} f(y^2)dy$, pull out a minus by reversing the integration direction and relabel y back to x. You would get that the integral must be equal to plus or minus itself and therefore zero. Clearly not a correct result generally.) – Kagaratsch Jan 09 '14 at 19:17
    @Kagaratsch: You're looking at a different kind of integral! In $\int_a^b f(x) dx$, the $dx$ really is the 1-form $dx$, not its absolute value (as you say). But in $\int_A f(x,y) dx dy$, the $dx dy$ does not mean the 2-form $dx \wedge dy$, but rather its absolute value (as I said). This discrepancy is confusing, but it is the only way that orientation-reversing substitutions will come out correctly in both cases. – Toby Bartels Jan 10 '14 at 17:46
  • The difference between these is not really the dimension, but rather how the region of integration is specified. In the first case, we are given an oriented region of integration, going from $a$ to $b$ (even if $a$ happens to be greater than $b$). In the second case, we are merely given an unoriented region in the plane. Since differential forms are easier to work with than their absolute values, people eventually learn to specify oriented regions, but that's not the kind of area integral that shows up in basic Calculus courses. – Toby Bartels Jan 10 '14 at 17:47
  • If you look at the general formula for change of variables in an area integral in a Calculus textbook, you'll see a reference to the absolute value of the Jacobian. But if you look at a description of u-substitution for 1-dimensional integrals, then there is no absolute value. This is the same issue. Again, it's not really the dimension that matters but rather how the region of integration is specified. If you look at the formula for a flux integral through a surface, the absolute value is gone again, because now you are instructed to keep track of orientation. – Toby Bartels Jan 10 '14 at 18:02
  • @Toby: The Jacobian is just a convenient way to write the prefactor, but you never take the absolute value of the differential forms. Just repeat the above substitution argument with $\int_{\mathbb{R}^N}dx_1 dx_2 ...dx_N f(x_1^2+x_2^2+...+x_N^2)$ for one of the $x_i$ to see that it works the same way in any number of dimensions. – Kagaratsch Jan 11 '14 at 00:09
  • I repeated it. I needed an absolute value. You should try it! Actually, your original example is more interesting; try it with $A$ being the unit circle, $f(x^2 + y^2) = 1$ (a constant function), and parametrize it using $0 \leq \phi \leq \pi$ and $-1 \leq r \leq 1$ (rather than $0 \leq \phi \leq 2\pi$ and $0 \leq r \leq 1$). The integral in question is simply taking the area of the unit circle, so the answer is $\pi$. Indeed, $\int_0^\pi \int_{-1}^1 |r| dr d\phi = \pi$. However, $\int_0^\pi \int_{-1}^1 r dr d\phi = 0$. – Toby Bartels Jan 11 '14 at 21:48
  • If you use differential forms, you must keep track of the orientation of the coordinate system $(r,\phi)$, note that it reverses, split the region up into two regions with consistent orientation, set up two integrals, reverse the sign on the one with reversed orientation, and get the correct answer as their sum. But the method in the Calculus books, using the absolute value of the Jacobian, gets the correct answer using only one integral, as I did it above. And using the absolute value of the differential forms reproduces this same result. – Toby Bartels Jan 11 '14 at 21:50
  • By the way, I'm not trying to argue that your answer is fundamentally wrong. I'm trying to correct what I see as a minor error. Your answer is a good one, and I voted it up when I first saw it. – Toby Bartels Jan 11 '14 at 22:45
    Could you move this discussion to a chat room if it is to continue? You can use [chat] or just click on the prompt to move to chat which should appear when you try to post a comment at some point. – David Z Jan 12 '14 at 00:57
  • @Toby Bartels and Kagaratsch: as far as Ī understood, your “differential form vs its absolute value” dispute is actually the differential forms vs densities thing. Mathematicians cared about it. – Incnis Mrsi Oct 24 '14 at 13:40
  • @Incnis Mrsi : Yes indeed. The absolute value of a top-rank exterior differential form is a density, and there are also (less famous) lower-rank densities that serve as (among other things) the absolute values of lower-rank exterior differential forms. More on these densities at MathOverflow here: https://mathoverflow.net/a/90714/ – Toby Bartels Dec 31 '20 at 03:36

In mathematics the notation $\def\d{\mathrm d}\d x$ actually denotes a linear form; this means that $\d x$ is a linear function taking a vector and giving a scalar.

Let us take a differentiable function $f$ defined over $\def\R{\mathbf R}\R$ and consider it at point $a$. The tangent to the curve of $f$ at the point $a$ has a slope $f'(a)$. The point on this tangent of abscissa $b$ has ordinate $f_a(b)=f(a)+(b-a)f'(a)$. $f_a(b)$ is the linear approximation of $f(b)$ knowing $f$ at point $a$.

We define then $\d x(b-a)=b-a$. We have $$f_a(b)-f(a)=f'(a)\d x(b-a),\tag{1}$$ and we write $$\d f_a=f'(a)\d x$$ which is the formula (1) written for linear forms. Indeed the linear form $\d f_a$ is defined by $$\d f_a(\epsilon)=f'(a)\d x(\epsilon)=f'(a)\epsilon.$$

In physics one often conflates $\d x$ (the linear form) and $\epsilon$ (the argument of $\d x$). I hope you understand why when looking at the last equation.

NOTE. This may seem quite useless but in dimension $n>1$ this becomes more interesting. You have indeed $$ \def\vec#1{\boldsymbol{#1}} \def\der#1#2{\frac{\partial #2}{\partial #1}} \d f_{\vec a}=\nabla f(\vec a)\cdot\d\vec r=\begin{pmatrix}\der {x_1}{f(\vec a)}\\\vdots\\\der {x_n}{f(\vec a)}\end{pmatrix}\cdot \begin{pmatrix}\d x_1\\\vdots\\\d x_n\end{pmatrix}$$ that translates into, for $\vec\epsilon=(\epsilon_1,\dots,\epsilon_n)\in\R^n$, $$ \d f_{\vec a}(\vec\epsilon)=\sum_{k=1}^n \der{x_k}{f(\vec a)}\d x_k(\vec\epsilon)=\sum_{k=1}^n\der{x_k}{f(\vec a)}\epsilon_k,$$ because $\d x_k(\vec\epsilon)=\epsilon_k$ ($\d x_k$ is the $k^{\rm th}$ coordinate form).
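A quick numerical illustration of this note (my own sketch; the function $f(x_1,x_2)=x_1^2x_2$ and the point $\vec a=(1,2)$ are arbitrary choices): the 1-form $\d f_{\vec a}$ eats a vector $\vec\epsilon$ and returns the first-order part of the increment.

```python
import sympy as sp

x1, x2, e1, e2 = sp.symbols('x1 x2 epsilon1 epsilon2')
f = x1**2 * x2               # an arbitrary illustrative function

a = {x1: 1, x2: 2}           # the point at which the 1-form df_a lives

# df_a applied to the vector (epsilon1, epsilon2): partials times components
df_a = sp.diff(f, x1).subs(a) * e1 + sp.diff(f, x2).subs(a) * e2
print(df_a)                  # 4*epsilon1 + epsilon2

# The exact increment f(a + eps) - f(a) agrees with df_a to first order;
# what remains is quadratic and higher in the epsilons:
increment = sp.expand(f.subs({x1: 1 + e1, x2: 2 + e2}) - f.subs(a))
print(increment - df_a)
```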

Tom-Tom

There is an old tradition going back all the way to Leibniz himself to think of differentials intuitively as "infinitesimal numbers". Through the course of history, big minds have criticized Leibniz for this. Thus, Russell accepted Cantor's claim that infinitesimals are inconsistent and even reproduced it in his book Principles of Mathematics in 1903.

But then something profound happened in 1961: Abraham Robinson, one of the most profound thinkers of the foundations of mathematics, taught the world a rigorous construction of infinitesimals in the traditional framework of the Zermelo-Fraenkel set theory, expressed in terms of the theory of types. Among other things, this is a fully rigorous mathematical context in which the old intuition of Leibniz and the intuition of plenty of naive physicists finds a full formal justification. In Robinson's framework those differentials explicitly exist as infinitesimal elements of a suitable real closed field.

A detailed exposition of how this works is in Robinson's 1966 book but simpler treatments have been developed since, such as the books by Martin Davis or by Robert Goldblatt, including exposition of differentiation via infinitesimals.

Notice that this is not just a big machine to produce something you already know, as some will inevitably hasten to think. On the contrary, this leads the way to the more sophisticated places of modern physics, as developed in detail in the book by Albeverio et al.:

Albeverio, Sergio; Høegh-Krohn, Raphael; Fenstad, Jens Erik; Lindstrøm, Tom. Nonstandard methods in stochastic analysis and mathematical physics. Pure and Applied Mathematics, 122. Academic Press, Inc., Orlando, FL, 1986. xii+514 pp.

Note 1. Lawvere's contribution in the framework of category theory dates from the 1970s.

Note 2. (In response to user Ovi's question) Robinson's framework is part of traditional analysis in the sense that it uses the traditional Zermelo-Fraenkel foundations and classical logic (as opposed to Lawvere's approach which relies on intuitionistic logic in a break with classical mathematics). Robinson's framework is an active research area today, featuring its own journal: Journal of Logic and Analysis (see http://logicandanalysis.org/) and an ever increasing number of monographs; most recently by Loeb and Wolff (see http://www.springer.com/us/book/9789401773263).

Note 3. Since the status of infinitesimals in nonstandard analysis seems to be of concern to some contributors here, I would like to add the following comments. In the axiomatic approach to nonstandard analysis, one can find infinitesimals in $\mathbb R$. Furthermore, this can be done effectively, in the sense of not using the axiom of choice at all. See this recent publication: https://arxiv.org/abs/2305.09672 Therefore these infinitesimals are numbers in the ordinary sense. They exist in a sense no different from $\mathbb R$ existing. Therefore they are on-topic here. The infinitesimals of nonstandard analysis are a vindication of Leibniz's perspective on the calculus, 3 centuries later. If one works with nonstandard extensions of the set-theoretic universe, then the construction of such an extension requires nonprincipal ultrafilters. However, in the axiomatic approach, no ultrafilters are needed to do nonstandard analysis, due to the conservativity result of Hrbacek and Katz; see https://doi.org/10.1016/j.apal.2021.102959

  • I have heard of Abraham Robinson's analysis, but why doesn't it seem to be used very often? Is it inferior to mainstream analysis? – Ovi Dec 17 '15 at 08:40
  • @Ovi, that would make a nice separate question. – Mikhail Katz Dec 17 '15 at 09:01
  • Robinson's can usefully be left to those with a passion for logic and the foundations of mathematics. It amounts to a fancy way of taking limits and is based on mathematics way beyond the scope of physicists. It was in fact the topic of my first published research paper, but I have since recognised that it is better left alone. – Charles Francis Jun 08 '23 at 05:22
  • In the axiomatic approach to nonstandard analysis, one can find infinitesimals in $\mathbb R$. Furthermore, this can be done effectively, in the sense of not using the axiom of choice at all. See this recent publication: https://arxiv.org/abs/2305.09672 Therefore these infinitesimals are numbers in the ordinary sense. They exist in a sense no different from $\mathbb R$ existing. Therefore it would be inaccurate to claim that they are off-topic here. The infinitesimals of nonstandard analysis are a vindication of Leibniz's perspective on calculus, 3 centuries later. @CharlesFrancis – Mikhail Katz Jun 08 '23 at 08:29
  • @CharlesFrancis I looked up your first published paper. Your starting point there is a nonstandard extension of the set-theoretic universe. The construction of such an extension requires nonprincipal ultrafilters. However, in the axiomatic approach, no ultrafilters are needed to do nonstandard analysis, due to the conservativity result of Hrbacek and Katz; see https://www.sciencedirect.com/science/article/pii/S0168007221000178?via%3Dihub – Mikhail Katz Jun 08 '23 at 08:37
  • @MikhailKatz, there are no infinitesimals in the real number line. They exist only in the nonstandard extension. This is precisely the sort of dangerous mistake which negates the value of non-standard analysis for practical purposes. And yes, like the real numbers, infinitesimals exist in mathematics and not in physics. They are certainly off topic here. – Charles Francis Jun 10 '23 at 09:41
  • @CharlesFrancis We proved otherwise, in the leading logic periodical Annals of Pure and Applied Logic (APAL). If you find problems with our proof I would certainly be happy to hear the details. – Mikhail Katz Jun 10 '23 at 20:23

As you can see from the variety of answers, there are many possibilities for interpreting differentials in a mathematically exact way.

One nice simple interpretation is as coordinates of tangential vectors.

Consider an equation $$ z = f(x,y) $$ describing a curved surface in three-dimensional space ($z$ is the height).

Then the equation $$ dz = \frac{\partial}{\partial x} f(x,y) \cdot dx + \frac{\partial}{\partial y} f(x,y) \cdot dy $$ describes the points $(\bar x,\bar y,\bar z)=(x+dx,y+dy,z+dz)$ of the tangential plane at the point $(x,y,z)$ on the surface. This equation is often called the tangent equation.

If you have some specific point $(x,y,z)$ given by coordinate values as numbers and would like to have a specific point on the tangent plane as well, just put in numbers for $dx$, $dy$ and $dz$. Thus, the differentials can stand for numbers. Why not?

So far so good. Now, why should the numbers be small? We assume that the surface is smooth at the point $(x,y,z)$, meaning that $f$ should be continuously differentiable there. Then $$ \frac{z+dz - f(x+dx,y+dy)}{|(dx,dy)|}\rightarrow 0 \quad\text{ for } |(dx,dy)|\rightarrow 0 $$ where $dz$ satisfies the above tangent equation. Here $|(dx,dy)|=\sqrt{dx^2 + dy^2}$ denotes the Euclidean norm.

The division by $|(dx,dy)|$ lets us look at a scaled picture of the surface around the point $(x,y,z)$. To keep angles as they are, we scale the picture evenly in all directions. The picture is always scaled such that the disturbance $(dx,dy)$ from the point $(x,y,z)$ is of the order of magnitude of 1. Even in this up-scaled picture, the height $z+dz$ of the disturbed point $(x+dx,y+dy,z+dz)$ on the tangential plane fits the corresponding height $f(x+dx,y+dy)$ on the curved surface better and better.

In sum: the tangent plane with the local coordinates $dx$, $dy$ and $dz$ fits the curved surface better the smaller the disturbances $dx,dy,dz$ are.


To clarify things let us consider an example. Let the curved surface be $$ z=x^2-y. $$ We pick the specific point with $x=1$ and $y=2$ yielding $z=1^2-2 = -1$. The tangent equation is $$ dz = 2x\cdot dx - dy, $$ and at our specific point $$ dz = 2 dx - dy. $$ To have a specific point on the tangent plane let us consider the differentials $dx=\frac14$ and $dy=1$ yielding $$ dz = 2\cdot\frac14 - 1 = -\frac12. $$

The location of this point on the tangent plane in 3d-space is $(x+dx,y+dy,z+dz)=\left(1+\frac14,2+1,-1-\frac12\right)=\left(\frac54,3,-\frac32\right)$.

At the same $x$- and $y$-coordinates we get on the curved surface the height $z'$ with $$ z' = f(x+dx,y+dy) = f\left(\frac54,3\right) = \left(\frac54\right)^2 - 3 = -\frac{23}{16} = -1.4375. $$ It is slightly off from the height $z+dz=-1.5$ of the corresponding point on the tangent plane.
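The worked example above can be checked with a short script (a sketch in Python; the surface and the numbers are exactly those of the example, and the loop illustrates the scaled-error limit from the tangent equation):

```python
from math import sqrt

# Surface z = f(x, y) from the example, with tangent plane at (1, 2, -1).
def f(x, y):
    return x**2 - y

x0, y0 = 1.0, 2.0
z0 = f(x0, y0)                       # -1

def dz_tangent(dx, dy):
    # Tangent equation at (1, 2): dz = 2x*dx - dy = 2*dx - dy
    return 2 * x0 * dx - dy

# The specific point from the example: dx = 1/4, dy = 1
dx, dy = 0.25, 1.0
dz = dz_tangent(dx, dy)              # -0.5
print(z0 + dz)                       # height on tangent plane: -1.5
print(f(x0 + dx, y0 + dy))           # height on curved surface: -1.4375

# Scaled error (z + dz - f(x+dx, y+dy)) / |(dx, dy)| -> 0 as the
# disturbance shrinks; here it decreases linearly with the scale s.
for s in (1.0, 0.1, 0.01, 0.001):
    sdx, sdy = s * 0.25, s * 1.0
    err = (z0 + dz_tangent(sdx, sdy) - f(x0 + sdx, y0 + sdy)) / sqrt(sdx**2 + sdy**2)
    print(err)
```

The last `err` values confirm numerically that the tangent plane fits the surface better and better as the disturbance shrinks.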


Even though I presented a numerical example here, in practice the differentials are more often used as variables to express relations between the differentials (with their interpretation as tangent coordinates).

In the context of tangent coordinates, the differential quotient $\frac{dy}{dx}=f'(x)$ is the ratio of the coordinates $dy$ and $dx$ of the tangent to the graph of $f$ at $x$.

As long as you avoid division by zero, you can divide by a differential $dx$ (as a tangent coordinate).

Tobias
  • 1,765
  • Tobias: "Then the equation $dz = [...]$ describes the points $(\bar x,\bar y,\bar z)=(x+dx,y+dy,z+dz)$ of the tangential plane at the point $(x,y,z)$ on the surface." -- How should be distinguished whether the indicated equation describes points/elements of a plane, and not (for instance) elements of some other surface, such as an "Osculating sphere" ("Schmiegekugel") to the given surface, at point $(x,y,z)$; or indeed any other surface containing point $(x,y,z)$? Or does that not even matter? – user12262 Jan 10 '14 at 06:30
  • I know what you mean. But no! Everything I wrote is to be taken literally. While reading the text, forget what you have learned about tangent vectors on manifolds in higher semesters. The above version works whenever you consider smooth $m$-dimensional sub-manifolds of $\mathbb{R}^n$ with $m\leq n$. This approach is simple and helps with practical problems where you have some canonical coordinates (e.g., envelopes, many problems from numerics, multi-body mechanics in computer simulation and so on). – Tobias Jan 10 '14 at 07:18
  • 1
    I have added an example to show how literal one can see it. – Tobias Jan 10 '14 at 07:58
  • 1
    @user12262 A good way to grasp the "nonuniqueness" you speak of is to think of tangent vectors as equivalence classes of $C^1$ paths - because you can show that this kind of discussion is independent of the class member. – Selene Routley Jan 10 '14 at 08:47
  • Tobias: "I know what you mean." -- Fabulous. (Btw., in my comment above I'd perhaps better have written of "Osculating ellipsoid" ("Schmiegeellipsoid") instead of spheres.) "[...] whenever you consider smooth $m$-dimensional sub-manifolds of ${\mathbb R}^n$" -- Well, I still need to digest your added example and WSA-(RV)'s comment. Whenever I read "curvature" I look at http://orbit.dtu.dk/en/publications/gram-matrix-analysis-of-finite-distance-spaces-in-constant-curvature%28669948fa-a80c-4e5f-8a19-f2c7b7f99d47%29/export.html ... – user12262 Jan 10 '14 at 16:50
  • @WetSavannaAnimal aka Rod Vance: WetSavannaAnimal aka Rod Vance: "think of tangent vectors as equivalence classes of $C^1$ paths" -- Good point; "sounds familiar". (I just had not remembered this formulation in my first comment to Tobias.) So: Tobias should call the indicated equation more correctly "describing points of the tangential surface at the point $(x,y,z)$", right? And if this question weren't tagged calculus I'd now insist on clarifying how to determine 1. Which events constitute a "path" (or "surface")?, and 2. Which of those belong to the same "equivalence class"?. – user12262 Jan 10 '14 at 21:03
  • @user12262 With "I know what you mean" I actually meant I know the stuff from [Jänich:Vektoranalysis] or [Abraham/Marsden:Tensor analysis Manifolds Applications] or [Abraham/Marsden:Math Foundations of Elasticity]. Nevertheless, if you have a distinguished linear Euclidean space (best expressed as $\mathbb{R}^n$) where your interesting object is a submanifold, then the above version is sufficient, correct and, most importantly, easy to understand. If I remember right, it is very close to what Königsberger teaches in his book "Analysis 2". – Tobias Jan 10 '14 at 21:31
  • @user12262 Note that the space $\mathbb{R}^n$ where our submanifold is embedded is a savior. It avoids the necessity of the equivalence classes of curves for the definition of tangent vectors. For a given submanifold of $\mathbb{R}^n$ all curves of a tangent vector equivalence class at some point of the submanifold have the same velocity vector at that point. This is the tangent vector as defined in my answer. (Note that the definition of a velocity vector only makes sense with the surrounding $\mathbb{R}^n$.) – Tobias Jan 10 '14 at 22:04
  • @Tobias: "[...] linear Euclidian space (best to be expressed as ${\mathbb R}^n$)" -- ${\mathbb R}^n$ is not Euclidean "on its own" but only together with a suitable "Euclidean/flat" distance function "${\mathbf d}_{\mathrm{flat}} : {\mathbb R}^n \times {\mathbb R}^n \rightarrow {\mathbb R}$". "For a given submanifold [...] all curves of a tangent vector equivalence class at some point [...] have the same velocity vector at that point of the submanifold. This is the tangent vector as defined in my answer." -- There seems to be a fine line between "definition" and "one more layer of obfuscation" ... – user12262 Jan 11 '14 at 08:41
  • @Tobias: p.s. I referenced S.L.Kokkendorff above especially to advertise the consideration (if not in calculus, then at least in physics) of metric spaces (or their generalizations) instead of manifolds; with the corresponding definitions of "curvature" in terms of distances (or even only in terms of distance ratios) by Gram determinants (and correspondingly of "flatness" by Cayley-Menger determinants) instead of "something having to do with coordinates". – user12262 Jan 11 '14 at 08:42
  • @user12262 Sorry, I should have written $\mathbb{R}^n$ equipped with the Euclidean norm. Thanks for the clarification. – Tobias Jan 11 '14 at 13:00
1

With the objective of keeping complexity to a minimum, the best "unifying" solution is to think of differentials, infinitesimals, numbers, etc. as mathematical symbols to which certain characteristics, properties, and mathematical operations (rules) are applicable.

Since not all rules are applicable to all symbols, you need to learn which rules are applicable to a particular set of symbols.

Whether you are learning fractions, decimals, differentials, etc., just learn the symbols and their particular rules and operations, and that will be sufficient 99% of the time.

Guill
  • 2,493
0

I agree with the answers already posted, but I feel they are missing an important aspect: differentials are mathematical objects, and for a physicist they are a theoretical tool (a very important one!), but still a tool. Like anything that isn't an experimental result (or an attempt at a fundamental theory), it is to be used in whatever way makes it useful, not in the way it "should be" used.

Take a wrench. If you ask a designer whether the wrench is a hammer, you would most likely get a no, rightfully, because it is not. But if you need to drive in a nail, and you find that the wrench works just fine and is more easily reachable for some reason, then go ahead and use the wrench!

Differentials in physics are often used as something they're not supposed to be: in the chain rule you can treat them as fractions, in the Jacobian you can treat them as tiny elements of length in the direction of the different coordinates. Does this mean that they are fractions or tiny lengths? Absolutely not! But if it works (meaning, you get a result that is experimentally correct), there is no reason why you shouldn't use it. There are times when it won't work: in those cases, you won't have Newton's Flaming Laser Sword by your side and will need to math things up. Of course, not knowing what differentials really are leaves your knowledge incomplete, but your calculations will still be right in the above cases. As my calculus teacher used to say, "engineers always use differentials as fractions, mathematicians always as differential forms; physicists learn that they are differential forms, and then use them as fractions".
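The "treat them as fractions" heuristic can be checked numerically. A sketch in Python, with a made-up pair of functions (not from the answer): for $y(t)=t^2$ and $x(t)=e^t$, naively "dividing the differentials" gives $\frac{dy}{dx}=\frac{dy/dt}{dx/dt}$, which agrees with the exact chain-rule result.

```python
import math

# Hypothetical example: y and x both parametrized by t.
def y(t):
    return t**2

def x(t):
    return math.exp(t)

t0, h = 1.0, 1e-6

# Central finite-difference approximations of dy/dt and dx/dt
dy_dt = (y(t0 + h) - y(t0 - h)) / (2 * h)   # ~ 2*t0 = 2
dx_dt = (x(t0 + h) - x(t0 - h)) / (2 * h)   # ~ e

# "Dividing the differentials": dy/dx = (dy/dt) / (dx/dt)
dy_dx_fraction = dy_dt / dx_dt

# Exact chain-rule value: dy/dx = 2t / e^t at t0 = 1
dy_dx_exact = 2 * t0 / math.exp(t0)

print(dy_dx_fraction, dy_dx_exact)  # the two agree to many digits
```

This is, of course, just the heuristic working in a case where it is known to work, not a justification of it.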

-1

The rigorous mathematical meaning of infinitesimals is given in analysis using the $\epsilon$-$\delta$ definition of a limit. As far as physics is concerned, $\epsilon$ generally refers to experimental precision, or margin of error. It means that if $\delta$ is a small enough number, then making it any smaller makes no practical difference to the predictions.

Strictly, the formal $\epsilon$-$\delta$ definition of a limit does not refer to the end point of an infinite process; it simply means that continuing the process further is empirically meaningless.

The reason mathematicians will not use $dx$ as a number is that $dx$ is only defined as part of an expression, not as something in itself. In physics $dx$ can be taken to mean $\delta x$, a number sufficiently small that experimental precision is not affected.

Charles Francis
  • 11,546
  • 4
  • 22
  • 37
  • "$dx$ is only defined as part of an expression, not as something in itself": This has been inaccurate for about 62 years already. Please see this answer: https://physics.stackexchange.com/a/224425/100943 – Mikhail Katz Jun 06 '23 at 12:51
  • I am familiar with the work of Robinson, and I regard it as off-topic here. Robinson's infinitesimals are not numbers in the ordinary sense, but are an extension thereof. Quite apart from the fact that only a very good mathematician can follow his construction, they raise philosophical issues concerning the meaning of "existence" in mathematics, not to mention the misunderstanding by physicists of that meaning. Should we really say Robinson's infinitesimals "exist" when not a single infinitesimal can be constructed? – Charles Francis Jun 08 '23 at 05:17
  • In the axiomatic approach to nonstandard analysis, one can find infinitesimals in $\mathbb R$. Furthermore, this can be done effectively, in the sense of not using the axiom of choice at all. See this recent publication: https://arxiv.org/abs/2305.09672 Therefore these infinitesimals are numbers in the ordinary sense. They exist in a sense no different from $\mathbb R$ existing. Therefore it would be inaccurate to claim that they are off-topic here. – Mikhail Katz Jun 08 '23 at 08:27
  • @MikhailKatz, there are no infinitesimals in the real number line. They exist only in the nonstandard extension. This is precisely the sort of dangerous mistake which negates the value of non-standard analysis for practical purposes. And yes, like the real numbers, infinitesimals exist in mathematics and not in physics. They are certainly off topic here. – Charles Francis Jun 10 '23 at 09:41
  • We proved otherwise, in the leading logic periodical Annals of Pure and Applied Logic (APAL). If you find problems with our proof I would certainly be happy to hear the details. – Mikhail Katz Jun 10 '23 at 20:23
  • Of course there is a mistake in your proof. The real numbers have an accepted rigorous mathematical definition and they do not include infinitesimals. You can only change that by changing the meaning of the words, which is essentially what you do by looking at different axiom sets. In any case you are playing with mathematical structures, not physical reality and your work remains off-topic in this forum. – Charles Francis Jun 12 '23 at 09:15
  • What in your opinion is the relation between the real numbers and the real world (or what you describe as physical reality)? Our follow-up paper is currently in press at the respected journal Real Analysis Exchange; see https://u.math.biu.ac.il/~katzmik/infinitesimals.html#23c and it seems to me that the editors and referees of that journal would know something about the real numbers :-) Frankly, you need to provide more solid reasons for your views, which are apparently not very convincing to the audience of this site, based on the score your answer received. – Mikhail Katz Jun 12 '23 at 09:20
  • I should note that I agree with you that mathematicians sometimes change the meaning of words, and one has to be on guard against that. For example, in developing his constructive mathematics, Errett Bishop changed the meaning of "continuous function", and used something based on uniform continuity instead, of course in the context of intuitionistic logic. However, nonstandard analysis uses classical logic and does not change the meaning of words. For example, in axiomatic NSA, the real numbers are defined, as usual, as the set of Dedekind cuts on the rationals; ... – Mikhail Katz Jun 13 '23 at 10:03
  • ... the rationals are defined as $\pm$ the ratios of natural numbers, and the natural numbers are the smallest inductive set. Your claim that NSA involves changing the meaning of words is in error. Similarly in error is your claim that NSA is "looking at different axiom sets", because its axioms incorporate the usual ones of ZF and certainly don't change them. The real difference is the introduction of a more expressive language, incorporating what already Leibniz referred to as the distinction between assignable and inassignable numbers... – Mikhail Katz Jun 13 '23 at 10:03
  • ... I assume the distinguished physicist Leibniz would not be off-topic for this forum. @Charles – Mikhail Katz Jun 13 '23 at 10:04
  • NSA does not change the meaning of words. However, you did change the meaning of words when you claimed an infinitesimal could be a real number. It is well established that while real numbers are useful in mathematics as used in physics, there is no empirical evidence of their physical existence. Indeed mathematics has no dependency on physics whatsoever. Also ask yourself why formal logic and set theory are no longer much studied as the foundation of mathematics. It was an ill-conceived foundation and quite unnecessary to modern mathematics. The answer I gave to the OP remains correct. – Charles Francis Jun 14 '23 at 08:26
  • In axiomatic NSA one enriches the language used by the addition of the "standard" predicate, which is a formalisation of Leibniz's distinction between assignable and inassignable numbers. This enables us to detect infinitesimals in R which were not detected before due insufficiently expressive language, as per Edward Nelson's take on this. I am fine with the statement that there is no empirical evidence of the existence of infinitesimals, so long as we acknowledge also that there is no physical evidence ... – Mikhail Katz Jun 14 '23 at 08:45
  • ... for the existence of a real number bigger than $10^{80}$ (the number of elementary particles in the universe). If you say that "mathematics has no dependency on physics whatsoever", then one cannot privilege one view of R over another based on purely empirical considerations. I am fine with category-theoretic foundations for mathematics (as opposed to set-theoretic ones). This issue is transverse to the idea that we can follow Leibniz in considering R as containing infinitesimals. – Mikhail Katz Jun 14 '23 at 08:46
  • Note that, according to this comment: https://mathoverflow.net/a/136360/28128 the ultraproduct is fundamentally a category-theoretic concept. Of course this is relevant for the model-theoretic approach to NSA, and is less relevant to the axiomatic approach, where the relevant foundations don't even require the existence of an ultrafilter. – Mikhail Katz Jun 14 '23 at 10:12