I have been rereading the first part of Baez and Muniain, on reformulating electromagnetism in the language of differential geometry.  Here are some notes; they mostly follow the book, but only the parts necessary for writing down and understanding the final equations.

Manifolds

   The spaces physicists are interested in studying are locally similar to \mathbb{R}^n.  For example, the 2-sphere S^2 – i.e., the surface x^2 + y^2 + z^2 = 1 – is locally similar to the plane \mathbb{R}^2, which is why the world looks flat.  To generalize this idea to spaces that locally look like \mathbb{R}^n, we define n-dimensional manifolds.

   First, we must define topological spaces.  A topological space is a set X and a collection of subsets of X, called the open sets, which satisfy: 1) the null set and X are open, 2) the intersection of any two open sets is also open, and 3) the union of any number of open sets is also open.
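
   For a finite set, the three axioms can be checked by brute force.  The following is only a sketch to make the definition concrete; the function name is_topology and the example sets are my own choices, not from the book.

```python
from itertools import combinations

def is_topology(X, opens):
    """Check the three open-set axioms for a finite collection on a finite set X."""
    X = frozenset(X)
    opens = {frozenset(U) for U in opens}
    # 1) the null set and X itself must be open
    if frozenset() not in opens or X not in opens:
        return False
    # 2) and 3): pairwise intersections and unions must stay in the collection.
    # For a finite collection, pairwise closure implies closure under any union.
    for U, V in combinations(opens, 2):
        if (U & V) not in opens or (U | V) not in opens:
            return False
    return True

X = {1, 2, 3}
good = [set(), {1}, {1, 2}, {1, 2, 3}]
bad = [set(), {1}, {2}, {1, 2, 3}]  # the union {1} | {2} = {1, 2} is missing

print(is_topology(X, good), is_topology(X, bad))
```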

   Topological spaces allow the definition of continuous functions.  A function between two topological spaces, f\colon X \rightarrow Y, is continuous if given any open set U \subseteq Y in the target space, the inverse image f^{-1}(U) \subseteq X is open.  Basically, this means the function sends “nearby” elements of X – “nearby” meaning the elements are all members of some open set – to “nearby” elements of Y.
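
   The inverse-image condition is easy to test on small finite spaces.  This is a toy sketch; the spaces, the maps f and g, and the helper names are made up for illustration.

```python
def preimage(f, U, X):
    """Inverse image f^{-1}(U) = {x in X : f(x) in U}, with f given as a dict."""
    return frozenset(x for x in X if f[x] in U)

def is_continuous(f, X, opens_X, opens_Y):
    """f is continuous if the inverse image of every open set of Y is open in X."""
    opens_X = {frozenset(U) for U in opens_X}
    return all(preimage(f, U, X) in opens_X for U in opens_Y)

X = {1, 2, 3}
opens_X = [set(), {1}, {1, 2}, {1, 2, 3}]
opens_Y = [set(), {"a"}, {"a", "b"}]

f = {1: "a", 2: "a", 3: "b"}  # f^{-1}({a}) = {1, 2}, which is open in X
g = {1: "b", 2: "a", 3: "a"}  # g^{-1}({a}) = {2, 3}, which is not open in X

print(is_continuous(f, X, opens_X, opens_Y), is_continuous(g, X, opens_X, opens_Y))
```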

   Now, we can see how to use the idea of open sets to link topological spaces and \mathbb{R}^n.  Consider a topological space M, covered with open sets U_i.  For each open set, we define a continuous function with a continuous inverse, called a chart: \varphi_i\colon U_i \rightarrow \mathbb{R}^n.  If these charts are defined such that the transition function \varphi_i \circ \varphi^{-1}_j\colon \mathbb{R}^n \rightarrow \mathbb{R}^n between the two \mathbb{R}^n spaces associated with the open sets U_i and U_j is smooth wherever it is defined (that is, on the image of the overlap U_i \cap U_j), then M is called a smooth n-dimensional manifold.  This manifold is a space of elements, separated into “nearby” groups, each of which can be related to \mathbb{R}^n; thus, the manifold looks like smoothly-connected patches of n-dimensional Euclidean space.  Using these local Euclidean spaces, we can define smooth functions on the manifold in a familiar manner; e.g., f\colon M \rightarrow \mathbb{R} is smooth if f \circ \varphi^{-1}_i\colon \mathbb{R}^n \rightarrow \mathbb{R} is smooth for all i.  We can then use these functions to define vector fields, and more, on the manifold, as if we were working in \mathbb{R}^n as usual.
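
   As a concrete case, the sphere S^2 can be covered by two stereographic charts, one omitting the north pole and one omitting the south pole.  The sketch below uses the standard stereographic formulas (my choice of example, not taken from the book) and checks numerically that the transition function is (u, v) \mapsto (u, v)/(u^2 + v^2), which is smooth away from the origin.

```python
import math

def phi_north(p):
    """Chart on S^2 minus the north pole (0, 0, 1): stereographic projection."""
    x, y, z = p
    return (x / (1 - z), y / (1 - z))

def phi_north_inv(c):
    """Inverse chart: send plane coordinates (u, v) back to a point on the sphere."""
    u, v = c
    s = u * u + v * v
    return (2 * u / (s + 1), 2 * v / (s + 1), (s - 1) / (s + 1))

def phi_south(p):
    """Chart on S^2 minus the south pole (0, 0, -1)."""
    x, y, z = p
    return (x / (1 + z), y / (1 + z))

def transition(c):
    """phi_south o phi_north^{-1}, defined on the overlap image, i.e. c != (0, 0)."""
    return phi_south(phi_north_inv(c))

c = (0.3, -1.2)
s = c[0] ** 2 + c[1] ** 2
expected = (c[0] / s, c[1] / s)
print(transition(c), expected)
```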

Vector Fields

   We are used to thinking of a vector v \in \mathbb{R}^n as a coordinate n-tuple, with n components v^{\mu}: (v^1,...,v^n).  We multiply these components by the “basis vectors” and add everything up to get the vector “object.”  However, this picture is unclear for manifolds; how should one define simultaneous basis vectors for two vectors at different points on the sphere, for example?  If one thinks of these two vectors as little arrows tangent to the sphere, they might lie in different planes entirely, which makes defining basis vectors and coordinates difficult.  Thus, we would like a coordinate-free definition of vector fields.

   To start, we note that one can differentiate a function on the manifold in the direction of a vector, with the directional derivative.  Consider the usual directional derivative of f in the direction of v, which we shall write as vf; this is simply v \cdot \nabla f = v^{\mu}\partial_{\mu}f.  Let us identify the vector field v not with the components v^{\mu}, but instead with the operator v^{\mu}\partial_{\mu}.  What does this mean?  Given a vector field v=v^{\mu}\partial_{\mu} and a function f on the manifold, at each “point” of the manifold we’ll take the derivative of f in the direction of v, giving us a new function v^{\mu}\partial_{\mu}f.  So combining a vector field with a function gives us another function on the manifold, which is related to derivatives of the original function.
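
   To make “a vector field eats a function and returns a function” concrete, here is a small numerical sketch on \mathbb{R}^2.  The helper vector_field and the use of central finite differences are my own illustrative choices; the derivatives are only approximate.

```python
def vector_field(components, h=1e-5):
    """Build the operator v = v^mu d_mu on functions of a point x in R^n.

    components(x) returns the tuple (v^1(x), ..., v^n(x)).
    """
    def v(f):
        def vf(x):
            total = 0.0
            for mu, v_mu in enumerate(components(x)):
                xp, xm = list(x), list(x)
                xp[mu] += h
                xm[mu] -= h
                # central-difference approximation to the partial derivative d_mu f
                total += v_mu * (f(xp) - f(xm)) / (2 * h)
            return total
        return vf  # v(f) is again a function of the point x
    return v

# Rotation field v = -y d_x + x d_y acting on f(x, y) = x^2 y
v = vector_field(lambda x: (-x[1], x[0]))
f = lambda x: x[0] ** 2 * x[1]
vf = v(f)  # exact result: -2 x y^2 + x^3

print(vf((1.0, 2.0)))  # close to the exact value -7
```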

   Now we can generalize this new picture of a vector field to a coordinate-free abstraction that contains all of the essential properties of the directional derivative.  Let C^{\infty}(M) be the set of real-valued smooth functions (which have infinitely many continuous derivatives) on the manifold M.  C^{\infty}(M) is a commutative algebra over \mathbb{R}, which just means that elements of C^{\infty}(M) combined with elements of \mathbb{R} obey a number of addition, multiplication, and distributive properties.  We define a vector field v on M to be a function v\colon C^{\infty}(M) \rightarrow C^{\infty}(M), obeying: 1) linearity over \mathbb{R}, i.e. v(f+g) = v(f)+v(g) and v(\alpha f) = \alpha v(f) for \alpha \in \mathbb{R}, and 2) the Leibniz law v(fg) = fv(g)+v(f)g, where f,g \in C^{\infty}(M).  Let Vect(M) be the set of all vector fields on M.  It is easy to check that Vect(M) is a module over C^{\infty}(M); that is, vector fields can be added together and multiplied by smooth functions.  Thus, this abstract definition satisfies our usual ideas of a vector field, without any coordinates involved.
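
   Both defining properties can be spot-checked numerically for a coordinate vector field on \mathbb{R}^2.  This is a rough sketch with a finite-difference derivative; the field, the test functions, and the point p are arbitrary choices of mine, and agreement is only up to discretization error.

```python
def make_v(components, h=1e-5):
    """Operator v = v^mu d_mu, with partial derivatives taken by central differences."""
    def v(f):
        def vf(x):
            out = 0.0
            for mu, c in enumerate(components(x)):
                xp, xm = list(x), list(x)
                xp[mu] += h
                xm[mu] -= h
                out += c * (f(xp) - f(xm)) / (2 * h)
            return out
        return vf
    return v

v = make_v(lambda x: (x[1], -x[0] * x[1]))  # v = y d_x - x y d_y
f = lambda x: x[0] ** 2 + x[1]
g = lambda x: x[0] * x[1]
p = (0.7, -1.3)

# Leibniz law: v(fg) = f v(g) + v(f) g, evaluated at the point p
leibniz_lhs = v(lambda x: f(x) * g(x))(p)
leibniz_rhs = f(p) * v(g)(p) + v(f)(p) * g(p)

# Linearity over the real numbers: v(a f + b g) = a v(f) + b v(g)
a, b = 2.0, -3.0
linear_lhs = v(lambda x: a * f(x) + b * g(x))(p)
linear_rhs = a * v(f)(p) + b * v(g)(p)

print(leibniz_lhs - leibniz_rhs, linear_lhs - linear_rhs)
```

   Evaluating v(f) at a single point p, as done here, is exactly the number v(f)(p).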

   Finally, we can define “arrows” at each “point” of the manifold, as in our original picture of the sphere and arrows tangent to points on it.  We will just evaluate the function returned by the vector field at the relevant point p \in M, i.e. v(f)(p).  Defining the tangent vector v_p\colon C^{\infty}(M) \rightarrow \mathbb{R} by v_p(f) = v(f)(p), we think of the real number v(f)(p) as the result of differentiating the function f in the direction of the tangent vector v_p.  We define the tangent space at p, T_pM, to be the set of all v_p.

   We can see that our original picture of identifying vectors with their components stems from the nature of the tangent space of \mathbb{R}^n.  Since for any point p \in \mathbb{R}^n, the tangent vectors (\partial_{\mu})_p \in T_p\mathbb{R}^n form a basis, we sloppily identify two vectors v_p and v_q at different points to be the same if v_p^{\mu}=v_q^{\mu}.  However, in actuality v_p=v_p^{\mu}(\partial_{\mu})_p \in T_p\mathbb{R}^n and v_q=v_q^{\mu}(\partial_{\mu})_q \in T_q\mathbb{R}^n are vectors in different tangent spaces, even if v_p^{\mu}=v_q^{\mu}.  We can get away with this sloppiness in \mathbb{R}^n, but not when dealing with more complex manifolds.

   In any case, this coordinate-free definition of vector fields allows us to define even more structures on the manifold.  These will be useful in distilling the essential qualities of Maxwell’s Equations.

   Today I read a bit of Ryder’s QFT (which was $20 when I bought it on Amazon; a day after I bought it, it dropped to $14 (!), but it is now again $53?), starting over with Chapter 2.  Ryder approaches relativistic wave equations from symmetry arguments, which I like.  Below are some notes to myself on his introduction of the Dirac equation.

   First, he demonstrates the problems with the Klein-Gordon equation for scalar particles,

     (\Box + m^2)\psi = 0, \ \ \Box \equiv \eta^{\mu\nu}\partial_{\mu}\partial_{\nu} = \frac{\partial^2}{\partial t^2}-\bold{\nabla}^2.

   The first problem is obvious; this equation is obtained by substituting the differential operators E \rightarrow \imath \hbar \frac{\partial}{\partial t} and \bold{p} \rightarrow - \imath \hbar \bold{\nabla}, as usual in quantum theory, into the relativistic energy-momentum relation p^{\mu}p_{\mu} = E^2 - \bold{p}^2 = m^2 (with c = 1, and \hbar set to 1 in the final equation).  But this relation gives both positive and negative energy solutions, E = \pm \sqrt{m^2 + \bold{p}^2}.  These negative energy solutions are the first problem.

   The second problem stems from attempting to modify the probabilistic interpretation of the wave function into one consistent with relativity.  Taking the usual non-relativistic probability density, \rho = \psi^{\ast} \psi, and attempting to make it the time component of a 4-vector

     j^{\mu} = (\rho, \bold{j}),

where

     \bold{j} = -\frac{\imath \hbar}{2m}(\psi^{\ast} \bold{\nabla} \psi - \psi \bold{\nabla} \psi^{\ast}) 

is the probability current, gives

     \rho = \frac{\imath \hbar}{2m} (\psi^{\ast} \frac{\partial \psi}{\partial t} - \psi \frac{\partial \psi^{\ast}}{\partial t}).

   However, since the Klein-Gordon equation is second-order in time, initial conditions for \psi and \frac{\partial \psi}{\partial t} can be chosen independently, and thus \rho is not guaranteed to be positive-definite – which makes a probabilistic interpretation difficult!
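
   A quick worked example (my own, using a plane-wave solution with normalization N as an assumed test case) shows the sign problem explicitly: the density simply follows the sign of the energy.

```latex
\psi = N e^{-\imath (E t - \bold{p}\cdot\bold{x})/\hbar}
\quad\Rightarrow\quad
\frac{\partial \psi}{\partial t} = -\frac{\imath E}{\hbar}\,\psi, \qquad
\frac{\partial \psi^{\ast}}{\partial t} = +\frac{\imath E}{\hbar}\,\psi^{\ast},

\rho = \frac{\imath \hbar}{2m}\left(-\frac{\imath E}{\hbar} - \frac{\imath E}{\hbar}\right)|N|^2
     = \frac{E}{m}\,|N|^2 .
```

   So for the negative energy solutions, E = -\sqrt{m^2 + \bold{p}^2}, the would-be probability density is negative.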

   To fix these problems, Dirac introduced his first-order equation.  Now, some texts, like Griffiths’s Elementary Particles book, attempt to introduce the Dirac equation by “factoring” the Klein-Gordon equation, and then showing that these factors must necessarily imply a first-order matrix equation – i.e., the Dirac equation.  Ryder does this too, but first derives the Dirac equation by considering the transformation of spinors under groups isomorphic to the Lorentz group.  In this way, the Dirac equation is given by symmetry principles.

   Even before showing this, as an analogy Ryder demonstrates that SU(2) is a double cover of SO(3).  I also found a good discussion of this here (there is also more detailed discussion of group representations and the Lorentz group in Maggiore and in Aitchison’s notes).  Basically, one can show that any element of SO(3), the group of orientation-preserving rotations of a 3-dimensional vector space, can be mapped to two elements of SU(2), the group of unitary, unit-determinant transformations of a 2-dimensional complex spinor space.  The Lie algebras of the two Lie groups, given by the commutator relations obeyed by the generators of the groups, have the same form: \left[J_i,J_j\right] = \imath \epsilon_{ijk} J_k for the angular momentum operators \bold{J} that generate SO(3) vs. \left[\frac{\sigma_i}{2},\frac{\sigma_j}{2}\right] = \imath \epsilon_{ijk} \frac{\sigma_k}{2} for the Pauli matrices \bold{\sigma} that generate SU(2).  The two-to-one mapping is instead a global property of the groups: a rotation by 2\pi is the identity in SO(3) but corresponds to -1 in SU(2), so U and -U map to the same rotation.
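
   The two-to-one map can be checked numerically.  The sketch below uses the standard formula R_{ij} = \frac{1}{2}\mathrm{Tr}(\sigma_i U \sigma_j U^{\dagger}) for the rotation associated with U; that formula and the helper names are my additions, not something taken from Ryder's text.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def su2(axis, theta):
    """U = exp(-i theta n.sigma / 2) = cos(theta/2) I - i sin(theta/2) n.sigma."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * (1 if i == j else 0)
             - 1j * s * sum(axis[k] * SIGMA[k][i][j] for k in range(3))
             for j in range(2)] for i in range(2)]

def to_so3(U):
    """Rotation matrix R_ij = (1/2) Tr(sigma_i U sigma_j U^dagger)."""
    Ud = dagger(U)
    R = []
    for i in range(3):
        row = []
        for j in range(3):
            M = matmul(matmul(SIGMA[i], U), matmul(SIGMA[j], Ud))
            row.append(((M[0][0] + M[1][1]) / 2).real)
        R.append(row)
    return R

theta = 0.9
U = su2((0, 0, 1), theta)                 # spinor rotation about the z axis
minus_U = [[-u for u in row] for row in U]
R_plus, R_minus = to_so3(U), to_so3(minus_U)
U_full_turn = su2((0, 0, 1), 2 * math.pi)  # equals -1, not +1

print(R_plus[0][0], math.cos(theta))
```

   Both U and -U give the same ordinary rotation by theta about z, while a full 2\pi turn in SU(2) lands on -1.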

   Likewise, the generators of the Lorentz group are similar to those of other groups; however, in this case, the similarity must be made explicit.  To see this, first notice that the set of pure Lorentz transformations (boosts), the generators of which we shall call \bold{K}, is not closed; e.g., the group commutator of boosts in the x and y directions gives a rotation about the z axis.  Thus, the true Lorentz group includes both boosts and rotations.  That is, the Lie algebra relations of the Lorentz group intertwine both \bold{J} and \bold{K}, like so:

     \left[K_i,K_j\right] = -\imath \epsilon_{ijk} J_k,
     \left[J_i,K_j\right] = \imath \epsilon_{ijk} K_k,

along with the usual commutator for rotations.
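
   These relations can be verified with explicit 4×4 matrices acting on (t, x, y, z).  The representation below, (J_i)_{jk} = -\imath\epsilon_{ijk} on the spatial block and K_i = -\imath(E_{0i} + E_{i0}), is one common convention that I am assuming here; other texts differ by signs.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def zeros():
    return [[0j] * 4 for _ in range(4)]

def matmul(A, B):
    return [[sum(A[a][c] * B[c][b] for c in range(4)) for b in range(4)] for a in range(4)]

def comm(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[a][b] - BA[a][b] for b in range(4)] for a in range(4)]

# Rotation generators: (J_i)_{jk} = -i eps_{ijk} on the spatial block (rows/cols 1..3)
J = []
for i in range(3):
    Ji = zeros()
    for j in range(3):
        for k in range(3):
            Ji[1 + j][1 + k] = -1j * eps(i, j, k)
    J.append(Ji)

# Boost generators: K_i mixes the time row/column with spatial direction i
K = []
for i in range(3):
    Ki = zeros()
    Ki[0][1 + i] = Ki[1 + i][0] = -1j
    K.append(Ki)

def eps_sum(coeff, gens, i, j):
    """coeff * sum_k eps_{ijk} gens[k]."""
    out = zeros()
    for k in range(3):
        for a in range(4):
            for b in range(4):
                out[a][b] += coeff * eps(i, j, k) * gens[k][a][b]
    return out

def close(A, B, tol=1e-12):
    return all(abs(A[a][b] - B[a][b]) < tol for a in range(4) for b in range(4))

algebra_ok = all(
    close(comm(J[i], J[j]), eps_sum(1j, J, i, j))        # [J_i, J_j] =  i eps J_k
    and close(comm(J[i], K[j]), eps_sum(1j, K, i, j))    # [J_i, K_j] =  i eps K_k
    and close(comm(K[i], K[j]), eps_sum(-1j, J, i, j))   # [K_i, K_j] = -i eps J_k
    for i in range(3) for j in range(3)
)
print(algebra_ok)
```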

   Now, we can write complex linear combinations of these 6 generators to form 6 new generators for the complexified Lorentz group:

     \bold{A_{\pm}} = \frac{1}{2}(\bold{J}\pm\imath\bold{K})

   Then, \bold{A_{\pm}} separately satisfy the commutation relation of the SU(2) group and commute with each other, \left[A_{+i},A_{-j}\right] = 0, so the Lorentz group SO(3,1) is roughly analogous to SU(2)\otimes SU(2).  The isomorphism is not exact because of the complexification; however, we can use this similarity to study the transformation effects of boosts on the two different types of spinors associated with each SU(2) group (defining the spinor space on which SU(2)\otimes SU(2) is represented as (j=\frac{1}{2},j=0)\oplus(j=0,j=\frac{1}{2})).  It can also be shown that these two spin-\frac{1}{2} spaces switch under parity (basically because \bold{J} is an axial vector, while \bold{K} is a true vector).  If we wish to include parity in our equations, it becomes necessary to consider both types of 2-component spinors, which transform differently under boosts, combined as one 4-component spinor.
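
   For completeness, here is the short calculation behind the SU(2) claim for \bold{A_{+}}, written out by me from the commutators above (using \left[K_i,J_j\right] = -\left[J_j,K_i\right] = \imath\epsilon_{ijk}K_k); the same steps with \imath \rightarrow -\imath give the result for \bold{A_{-}}, and the cross term vanishes.

```latex
\begin{aligned}
\left[A_{+i},A_{+j}\right]
 &= \tfrac{1}{4}\left(\left[J_i,J_j\right] + \imath\left[J_i,K_j\right]
    + \imath\left[K_i,J_j\right] - \left[K_i,K_j\right]\right) \\
 &= \tfrac{1}{4}\,\epsilon_{ijk}\left(\imath J_k - K_k - K_k + \imath J_k\right)
  = \imath\,\epsilon_{ijk}\,\tfrac{1}{2}\left(J_k + \imath K_k\right)
  = \imath\,\epsilon_{ijk}A_{+k}, \\
\left[A_{+i},A_{-j}\right]
 &= \tfrac{1}{4}\left(\left[J_i,J_j\right] - \imath\left[J_i,K_j\right]
    + \imath\left[K_i,J_j\right] + \left[K_i,K_j\right]\right) \\
 &= \tfrac{1}{4}\,\epsilon_{ijk}\left(\imath J_k + K_k - K_k - \imath J_k\right) = 0 .
\end{aligned}
```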

   By studying how this 4-component Dirac spinor transforms under boosts, one finds the Dirac equation.  Essentially, one boosts a rest frame Dirac spinor, and then uses the equivalence of both rest frame 2-component spinors to write relations between boosted 2-component spinors of both types.  This relation can be cast back into an equation involving a set of 4×4 matrices \gamma^{\mu}  and the Dirac spinor \psi, which is the Dirac equation:

     (\gamma^{\mu}p_{\mu}-m)\psi=(\imath\gamma^{\mu}\partial_{\mu}-m)\psi=0

   The \gamma^{\mu} obey the anti-commutation relation \{\gamma^{\mu},\gamma^{\nu}\}=2\eta^{\mu\nu}\bold{I}.  Of course, one could have found this relation by “factoring” the Klein-Gordon equation,

     p^{\mu}p_{\mu} - m^2 = (\alpha^{\kappa}p_{\kappa}+m)(\beta^{\lambda}p_{\lambda}-m) = 0,

finding that the \alpha^{\mu} and \beta^{\mu} are the \gamma^{\mu} matrices, but this is far less revealing of the rich structure of the Dirac spinor representation of the Lorentz group.
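
   The anti-commutation relation, and the fact that it makes \gamma^{\mu}p_{\mu} a “square root” of p^{\mu}p_{\mu}, can be checked with explicit matrices.  The sketch below assumes the Dirac (standard) representation of the \gamma^{\mu}, which may differ from the representation Ryder arrives at; the relation itself is representation-independent.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [1, -1, -1, -1]  # diagonal of the metric, signature (+, -, -, -)

def zeros():
    return [[0j] * 4 for _ in range(4)]

def matmul(A, B):
    return [[sum(A[a][c] * B[c][b] for c in range(4)) for b in range(4)] for a in range(4)]

def dirac_gammas():
    """gamma^0 = diag(1, 1, -1, -1); gamma^i has sigma_i and -sigma_i off the diagonal."""
    g0 = zeros()
    for a in range(2):
        g0[a][a], g0[a + 2][a + 2] = 1, -1
    gammas = [g0]
    for s in SIGMA:
        g = zeros()
        for a in range(2):
            for b in range(2):
                g[a][b + 2] = s[a][b]
                g[a + 2][b] = -s[a][b]
        gammas.append(g)
    return gammas

GAMMA = dirac_gammas()

def clifford_holds(tol=1e-12):
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} I for all mu, nu."""
    for mu in range(4):
        for nu in range(4):
            AB, BA = matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu])
            for a in range(4):
                for b in range(4):
                    target = 2 * ETA[mu] if (mu == nu and a == b) else 0
                    if abs(AB[a][b] + BA[a][b] - target) > tol:
                        return False
    return True

def slash(p_upper):
    """gamma^mu p_mu, lowering the index with the metric: p_mu = eta_{mu nu} p^nu."""
    out = zeros()
    for mu in range(4):
        for a in range(4):
            for b in range(4):
                out[a][b] += GAMMA[mu][a][b] * ETA[mu] * p_upper[mu]
    return out

p = (5.0, 1.0, 2.0, 3.0)                               # sample p^mu = (E, px, py, pz)
p_sq = sum(ETA[mu] * p[mu] ** 2 for mu in range(4))    # p^mu p_mu
p_slash_sq = matmul(slash(p), slash(p))                # should equal p_sq times I

print(clifford_holds(), p_sq, p_slash_sq[0][0])
```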

   Now the Dirac equation is first-order, and a positive-definite probability density can be found.  Negative energy solutions still exist, but because of the fermionic nature of the spin-\frac{1}{2} particles described by the equation, the Pauli exclusion principle can be invoked to prevent a particle from falling to infinitely negative energy.  However, for the exclusion principle to fix this problem requires that the negative energy states be filled by an infinity of particles – the so-called Dirac sea.  Dirac postulated that antiparticles would appear as “holes” in this sea, predicting the existence of the positron.  Despite the success of this prediction, this somewhat awkward picture of the Dirac sea is superseded by the full QFT formulation (which also allows bosonic antiparticles, despite the lack of a bosonic exclusion principle); however, even QFT requires its own sea of infinite energy – i.e., the zero-point energy of the vacuum.

   Thus, by extending the Lorentz group by parity and finding how the Dirac spinors on which the group is represented transform under boosts, one can derive the Dirac equation from group theoretic and symmetry arguments!

   Kind of sketchy for the first post of real substance, but this was mostly to check out the \LaTeX capabilities.  I wrote this post bit-by-bit over a few days, and have since then branched off from Ryder to peruse Maggiore and Aitchison and Hey.  I actually think I will make Aitchison and Hey my text of choice; the presentation is clearer than Ryder, if less interesting, and there are problems given in the book and solutions online at Aitchison’s website (see above for relevant links).  Nevertheless, I hope to find a clearer, mathematically rigorous presentation of the isomorphisms and relations between the various Lie groups in the future.  This looks like a good overview.

   Hello, this is my physics blog.  I decided to start a physics blog so I could blog about physics.  Tautological!  I chose WordPress for my blog so I could use \LaTeX to write equations.  I named my blog “on the seashore” for two reasons: 1) because of the above quote by Sir Isaac Newton, and 2) because I have just completed my physics undergraduate education on one coast, and am about to begin my physics graduate education on another coast.

   For this summer, I will be reviewing some old physics and learning some new physics, to prepare for entrance exams and doing research.  Mostly I will blog notes to myself to sort ideas out and to make sure I’m understanding things.  Here’s a list of topics and books I hope to get through:

Classical Mechanics: I think I will browse through Landau and Lifshitz again to refresh my memory.
Electrodynamics: Griffiths should be fine.
Quantum Mechanics: Sakurai should do the trick here. I should probably learn some of the approximations and scattering that I skipped over during my undergrad.
Statistical Mechanics: The stat mech course at my undergrad school was awful. So I am going to read Bloch’s notes to patch up the damage.
Quantum Field Theory: I have had difficulty finding a QFT text that I like. I started with Zee, which I found interesting but far too unstructured. Next I tried Weinberg, which was too heavy on unconventional math and Weinberg’s own formalisms. I then took a course that used Peskin & Schroeder, which I find light on physical motivation and heavy on phenomenological calculations; it is also lacking in important examples - the complex scalar field is left as an exercise to the reader, and took me quite a few pages to work through!  I like Lahiri & Pal, but unfortunately don’t have my own copy. I do have a copy of Ryder, which will be my text of choice for the summer. I will supplement it with Mark Srednicki’s notes, Griffiths’s Elementary Particles book, Aitchison & Hey, and Maggiore.
General Relativity: I’ll review Carroll’s book and try to tackle Wald.
Cosmology: I’ll review notes from a course on the Cosmic Microwave Background that I took last fall semester. I hope to do research in this area, so I’d better brush up!
Mathematical Physics, etc.: I’m going to make up for my spotty theoretical mathematics background by going through Geroch’s book. I also recently was given a copy of Baez & Muniain, which I like very much and will try to complete! I also got a third of the way through Zwiebach’s string theory book during this past semester, so I will try to finish that up.
Problems: I have copies of Cahn & Nadgorny’s two compilations of physics problems taken from graduate school qualifiers. I should work through some problems, since I haven’t really touched pen and paper in a whole semester!
