Wednesday, July 21, 2010

The "N" Field - part 2

To discuss my thoughts on the N field in this section, I will not use Lagrangians; rather, I will use Hamiltonians. The Lagrangian (L) is related to the Hamiltonian (H) as follows:

H = p ∙ v - L

The Hamiltonian represents the total energy of a system. The Hamiltonian of a free particle is merely the kinetic energy of the particle (together with its rest energy):

E² = c²p² + m²c⁴

There is a very simple trick to find the Hamiltonian of a charged particle under the influence of an electromagnetic field. The extension can be made by replacing the components of the relativistic four momentum with the components of the canonical four momentum.

p → p - qA

Where A = (V/c, A) is the electromagnetic four potential. In terms of space and time pieces, this is

E → E - qV

p → p - qA

Making this transformation on the free particle Hamiltonian yields

(E - qV)² = c²(p - qA)² + m²c⁴

Solving for E gives

E = √( c²(p - qA)² + m²c⁴ ) + qV

So we started with the energy of a free particle, and transformed the four momentum to get the energy of a particle that is subject to the electromagnetic field.
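
If you want to see this little derivation go through mechanically, here is a quick symbolic check (my own sketch, not part of the original post), done in one spatial dimension for brevity:

    import sympy as sp

    E, c, m, q = sp.symbols('E c m q', positive=True)
    V, p, A = sp.symbols('V p A', real=True)

    # free particle energy-momentum relation: E^2 = c^2 p^2 + m^2 c^4
    free = sp.Eq(E**2, c**2*p**2 + m**2*c**4)

    # extend four momentum to canonical four momentum: E -> E - qV, p -> p - qA
    coupled = free.subs({E: E - q*V, p: p - q*A}, simultaneous=True)

    # solve for E; the positive branch is the interacting Hamiltonian
    print(sp.solve(coupled, E))
    # two branches: E = q*V +/- sqrt(c**2*(p - q*A)**2 + m**2*c**4)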

Start with a free particle. Extend the momentum. End up with interaction terms.

The N field seems to be a similar type of extension - except in this case we aren't extending from four momentum to canonical four momentum, but rather from four potential to "canonical four potential", like this

A → A + (m/q) u

Where u is the four velocity of the particle.

So, whereas F is defined as

F = c ∂A̅

We can define N as

N = c ∂( A + (m/q) u )‾ = F + (mc/q) ∂u̅

So the point I am driving at here is this: on one hand we can derive the Hamiltonian of a particle in an electromagnetic field by starting with a free Hamiltonian, and extending the four momentum to the canonical four momentum. So, what might happen if we were to start with the "free" Maxwell equations (free here means sourceless) and extend the four potential to the "canonical four potential"? My guess is:

Start with a sourceless field. Extend the potential. End up with source terms.

The free Maxwell equations are written as

F∂ = c ∂A̅∂ = 0

Extending A to the canonical form converts F into N and the equation becomes

N∂ = 0

If the free vs. interaction Hamiltonian analogy holds true here, then this homogeneous equation may represent the inhomogeneous Maxwell equations.

Splitting this up into field terms and source terms we have

F∂ = -(mc/q) ∂u̅∂

The empirical form of the inhomogeneous Maxwell equations with source terms is given in terms of currents as

F∂ = j̅   (with the factors of μ₀ and c absorbed into the current)

We can make the correlation

j̅ = -(mc/q) ∂u̅∂

Of course, as stated previously, when the empirical current shows up as a source term, this represents approximating local disturbances in the potential wave as point disturbances, i.e. delta function sources. If we do not make this approximation, then j is always zero, and the local disturbances manifest themselves through the derivatives of the scalar EM field. Thus, if we do not make the approximation then we have

∂u̅∂ = 0

Or, more importantly, taking into account the fact that the four vector potential also obeys this equation, we can build a homogeneous wave equation for the canonical four momentum.

Saturday, June 26, 2010

The "N" Field

I have recently been exposed to a terrific idea, proposed by my friend Bill Polson, while enjoying a nice breakfast overlooking the coast.

He expressed a few ideas about Lagrangians that I think are rather revolutionary.

We usually think of an action integral as the integral of a scalar Lagrangian function along a parametrized path, with time as the parameter. The path that minimizes the result of this integral is the path that a particle takes.

S = ∫ L dt

Bill Polson introduced the idea of a vector valued Lagrangian. This is something I've done as well, but Bill does something rather special with his vector Lagrangian. He forms the action integral out of a line integral on a closed loop. The first branch of the loop represents the path the particle might take, while the returning branch contains a slight variation. He is then able to make a correlation, via Stokes' theorem, between the abstract idea of minimizing action, and the geometric properties of the vector Lagrangian field.

SBP = ∮ L ∙ dx

Bill introduces a vector field he calls N, which should contain vector and pseudo-vector portions. The action is stationary when integrated over the "area" contained inside of the closed path. He defined N as the "curl" of the vector Lagrangian field, but he meant the generalization of the curl to 4 dimensions, of course.

Now, I think this idea is rockin' as is. However, Bill went further and figured out what N needed to be in order to reproduce the Lagrangian of a charged particle in an EM field.

N = curl( A + u )

Again, the curl here is a generalization of the 3D curl of a vector field into 4 dimensions. Also, there are several factors such as c, charge and mass that Bill neglected for convenience.

Using the Clifford Algebra notation, this is how I would define N

N = F + (mc/q) ∂u̅

Here, I have inserted the mass, charge, and speed of light constants where they are appropriate. If I replace F with its definition in terms of the four-vector potential A, this becomes

N = (c/q) ∂( qA + mu )‾

The quantity under the conjugation bar is the canonical momentum p of a particle in an EM field. Using this we can build an analogy. As p is to A, N is to F.

Now we have

N = (c/q) ∂p̅

Or, just



The Action integral SBP expressed in terms of Clifford Algebra notation is now



Now, let's compare this with my version of the action integral, which is an integral of a scalar Lagrangian density function over a volume.

S = ∫ ℒ d⁴x

Thus, according to this definition of action, my vector valued Lagrangian field is



I don't define an N field, but my vector Lagrangian is the same as Bill's apart from a constant factor.

Thursday, March 4, 2010

The Scalar Field

In the previous post, we saw that a homogeneous paravector wave provided an exact description of Maxwell's equations, except there is an added scalar field, which is on the same footing as the Electric and Magnetic fields. This scalar field does not appear in standard electromagnetic theory, and we are going to find out why.

To begin, we will define the electro-magnetic bi-paravector F in terms of the potential A

A = ( V/c, A )

F = c ∂A̅

F = ( ∂V/∂t + c ∇•A, -∂A/∂t - ∇V + ic ( ∇ × A ) )

We can make the following definitions for the fields

E = -∂A/∂t - ∇V

B = ∇ × A

φ = (1/c) ∂V/∂t + ∇•A


In terms of these fields the electro-magnetic bi-paravector is given by

F = ( cφ, E + icB )

If you want to recover the orthodox version of the electro-magnetic field tensor, then just take the vector portion of this, ignoring the scalar portion. For now, we are not going to ignore the scalar portion, so we can see what physical implications it has.

To begin with, we see that the scalar field is a Lorentz invariant. It is the same in all reference frames.

The vector fields are gauge invariant. We can express the classical concept of a gauge transformation as follows

A' → A + ∂Χ

F' → F + c ∂∂̅Χ

Now, if we want to get full gauge invariance, we should expect the second term to vanish. You can show that since Χ is a scalar, the vector portion of the second term vanishes identically (∂∂̅ = (1/c²)∂²/∂t² - ∇² is a purely scalar operator), therefore the E and B fields are trivially gauge invariant.

We can achieve gauge invariance of the scalar field only in the case where the scalar Χ belongs to a restricted set of functions which obey the condition

∂∂̅Χ = 0

This expression reduces to the scalar homogeneous wave equation. The idea of a restricted gauge invariance provides for some non-trivial gauge transformations, which end up being valid, even in the context of the traditional, unrestricted, gauge principle.
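
As a sanity check - my own sketch, not from the original post - sympy can verify that a gauge function from this restricted set leaves E, B and the scalar field all unchanged. I use units with c = 1 to keep the factors simple, and Χ = e^(t-x) as a sample solution of the wave equation:

    import sympy as sp

    t, x, y, z = sp.symbols('t x y z', real=True)
    X = (x, y, z)

    grad = lambda f: sp.Matrix([sp.diff(f, v) for v in X])
    div = lambda F: sum(sp.diff(F[i], v) for i, v in enumerate(X))
    curl = lambda F: sp.Matrix([sp.diff(F[2], y) - sp.diff(F[1], z),
                                sp.diff(F[0], z) - sp.diff(F[2], x),
                                sp.diff(F[1], x) - sp.diff(F[0], y)])

    # arbitrary sample potentials
    V = x*y*sp.exp(-t)
    A = sp.Matrix([t*y, x*z, y*sp.sin(t)])

    def fields(V, A):
        return (-sp.diff(A, t) - grad(V),   # E
                curl(A),                    # B
                sp.diff(V, t) + div(A))     # the scalar field (c = 1)

    chi = sp.exp(t - x)                     # solves the homogeneous wave equation
    Vp = V + sp.diff(chi, t)                # scalar piece of A + (gradient of chi)
    Ap = A - grad(chi)                      # vector piece carries the minus sign

    E, B, phi = fields(V, A)
    Ep, Bp, phip = fields(Vp, Ap)
    print(sp.simplify(Ep - E), sp.simplify(Bp - B))  # zero for ANY chi
    print(sp.simplify(phip - phi))  # zero only because chi solves the wave equation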

For instance, consider a paravector Ψ, which is not the gradient of a scalar, but which does satisfy the expression

∂Ψ̅ = 0

Such a paravector can be used as a gauge function, in the sense that

A' → A + Ψ

F' → F

Such a gauge function ends up being non-trivially gauge invariant for the vector fields. This means that the vector fields are gauge invariant, but it is due to the form of the gauge function, not because of mathematical identities.

We will escalate the concept of gauge invariance to include the non-trivial gauge functions as well. The higher principle that we used to introduce the non-trivial gauge functions requires that we restrict the traditional scalar gauge functions to only include solutions to the homogeneous wave equation. This refinement resides within the limitations of traditional electro-magnetic theory, though it may require us to rethink the physical meaning behind various gauge transformations.

For instance, the Lorenz gauge

(1/c) ∂V/∂t + ∇•A = 0

If you remember the definition of the scalar field, you see that applying the Lorenz gauge is the same as making the statement

φ = 0

Since the derivatives of the scalar field are equated with the source terms, we see that setting this field to zero, or to any constant for that matter, has the physical meaning of having no source terms. Conveniently, this is exactly the case in which the Lorenz gauge is employed.

We will continue discussing the scalar field in the next post, where we will ask the question, "what force does the scalar field apply to a particle?"

The Wave Equation

Note: I usually prefer to use the nabla symbol (∇) to represent the vector portion of the gradient. If you are using IE and can see the nabla symbol as an upside down triangle rather than a box, then you are lucky :(

I suggest using firefox, or another browser. Sorry microsoft, I love ya, but I need my math notation.


We are going to construct the homogeneous wave equation for a paravector wavefunction. The reason for doing this is that I have a hunch that it may be possible to show that this homogeneous wave equation is a natural result of the definition of the derivative with respect to a paravector variable, but so far this hunch has only the status of a hypothesis. Therefore, for the sake of my hypothesis, we will proceed on this path.

To generate the wave equation, we will operate on a paravector two times with the gradient operator. Begin by letting the gradient operate on an arbitrary paravector that we will call Ψ. Remember that this must be done in a specific way in order to maintain relativistic significance.

Θ = ∂Ψ̅

This quantity is a bi-paravector. Let's act on this bi-paravector with another gradient operation in a relativistically correct way, and set the result to zero, for a homogeneous wave.

Θ∂ = ∂Ψ̅∂ = 0

This is the expression for a homogeneous paravector wave. Let's see if this wave equation resembles anything that we have any physical intuition for.

Let's express the gradient and the wavefunction in terms of their scalar and vector parts (I will write the scalar part of Ψ as ψ₀ and its vector part as ψ)

∂ = ( (1/c) ∂/∂t, -∇ )

Ψ = ( ψ₀, ψ )

When we act with the gradient operator, we treat it as if we were multiplying it, in the following manner

∂Ψ̅ = ( (1/c) ∂/∂t, -∇ )( ψ₀, -ψ )

= ( (1/c) ∂ψ₀/∂t + ∇•ψ, -(1/c) ∂ψ/∂t - ∇ψ₀ + i∇ × ψ )

For the sake of brevity, rather than writing out all of these terms, we will just consider that the resulting bi-paravector has three parts: a scalar part, a real vector part, and an imaginary vector part.

Θ = ( θ, θR + iθI )

θ = (1/c) ∂ψ₀/∂t + ∇•ψ

θR = -(1/c) ∂ψ/∂t - ∇ψ₀

θI = ∇ × ψ

Now we apply the second gradient to the bi-paravector

Θ∂ = ( θ, θR + iθI )( (1/c) ∂/∂t, -∇ ) = 0

( (1/c) ∂θ/∂t - ∇•θR - i∇•θI, (1/c) ∂θR/∂t + (i/c) ∂θI/∂t - ∇θ + i∇ × θR - ∇ × θI ) = 0

Now, this is a long and hairy equation, which involves four pieces: a real scalar, an imaginary scalar, a real vector and an imaginary vector. Physically speaking, there is a scalar term, a pseudo-scalar term, a vector term, and a pseudo-vector term. If the result of this expression is truly zero, then each of these four pieces must independently equal zero as well. We will set them to zero, and rearrange them slightly

∇•θR = (1/c) ∂θ/∂t

∇•θI = 0

∇ × θR = -(1/c) ∂θI/∂t

∇ × θI = -∇θ + (1/c) ∂θR/∂t

Can you see it yet? The first time I saw this, I about peed my pants.

These equations resemble the Maxwell equations in their differential form. We can make the following substitutions in order to get the equations closer in form to Maxwell's equations

θR → cE

θI → B

These substitutions constrain the wave function Ψ to take on the role of the electro-magnetic four-potential

Ψ → A = ( V/c, A )


Using these assignments of our wave variables, we can express the wave equation in a form that is very similar to the Maxwell equations.

∇•E = ∂θ/∂t

∇•B = 0

∇ × E = -∂B/∂t

∇ × B = -∇θ + (1/c²) ∂E/∂t

Now, other than the discovery that the Maxwell equations are a manifestation of a paravector wave equation, there are a few other interesting tidbits you should take note of.

First, there is no magnetic monopole, and there is no way to introduce one without messing up the covariance of the expression.

Second, the source terms are represented by the derivatives of a scalar field. What is this scalar field?

Third, it is the homogeneous wave equation which represents the electro-magnetic field with sources. We usually introduce inhomogeneous terms in order to represent sources.
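
Here is a small sympy check of these four equations - my own sketch, not from the original post - using units with c = 1 and a plane-wave potential, which satisfies the homogeneous wave equation:

    import sympy as sp

    t, x, y, z = sp.symbols('t x y z', real=True)
    X = (x, y, z)

    grad = lambda f: sp.Matrix([sp.diff(f, v) for v in X])
    div = lambda F: sum(sp.diff(F[i], v) for i, v in enumerate(X))
    curl = lambda F: sp.Matrix([sp.diff(F[2], y) - sp.diff(F[1], z),
                                sp.diff(F[0], z) - sp.diff(F[2], x),
                                sp.diff(F[1], x) - sp.diff(F[0], y)])

    V = sp.Integer(0)
    A = sp.Matrix([0, sp.exp(t - x), 0])   # any f(t - x) solves the wave equation

    E = -sp.diff(A, t) - grad(V)
    B = curl(A)
    theta = sp.diff(V, t) + div(A)         # the extra scalar field

    print(sp.simplify(div(E) - sp.diff(theta, t)))             # 0
    print(sp.simplify(div(B)))                                 # 0
    print(sp.simplify(curl(E) + sp.diff(B, t)))                # zero vector
    print(sp.simplify(curl(B) + grad(theta) - sp.diff(E, t)))  # zero vector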

Gradient Operator

I think some of the physics that is encoded in the algebra manifests itself mainly in the derivatives. For instance, the true meat of the complex algebra is the analytic functions. Can we come up with some type of space-time version of an analytic function? I've made a few attempts, and I've seen other attempts, but in general I'm just not satisfied. There is no parallel that is as clean and clear as the analytic functions of a complex variable. The troubles with derivatives in the algebraic representation of space-time all boil down to non-commutativity. I'll keep searching.

In the meantime, let's discuss a derivative operator, which I believe is a valid operator, though I wish it would arise in a cleaner fashion. Since I am not satisfied with any particular derivation of this operator, I will just introduce it without derivation: the algebraic gradient

∂ = ∂^μ e_μ

Here the notation ∂^μ makes reference to the contra-variant (raised index) partial derivative with respect to the μth coordinate. Remember that a standard partial derivative is defined with the index lowered.

∂_μ = ∂/∂x^μ

If we use the minkowski metric to raise the index, we must change the sign on the spatial terms of the gradient.

∂ = (1/c) ∂/∂t e0 - ∂/∂x e1 - ∂/∂y e2 - ∂/∂z e3

Now, in tensor algebras, we use index balancing in order to construct relativistically significant quantities. In this algebra we don't deal with the component indexes much, but we still need to be concerned with making relativistically significant quantities.

We can use the fact that the gradient transforms like a para-vector. Thus, if we want the gradient to act on a paravector, it needs to do so like this

∂A̅

Which results in a biparavector.

If the gradient acts on a biparavector it must do so like this

B∂ or ∂B

Which results in a paravector. Note that the gradient acting from the right is a distinct operation due to non-commutativity.

And if the gradient acts on a spinor, it must do so like this

∂S

which results in another spinor. I only know that this expression is relativistically significant, but since I haven't yet seen this operation in action, I have no physical intuition as to its significance.

Friday, February 26, 2010

The Classical Spinor

In the previous post we discussed the different relativistically significant quantities. We will now use these quantities to quantify physical parameters.

Consider a particle that is travelling on a path with arbitrary acceleration. At any point on the path, it is possible to find a co-moving reference frame, or a frame where the particle appears to be instantaneously at rest. We will call this the "rest frame", even though we must continually change this frame as the particle progresses.

The term "rest" in relativity does not mean "motionless", for even if the particle is not moving in space, it is moving in time. Strictly speaking it is moving in time at the speed of light. In the rest frame of the particle, the velocity is always along the time direction. Thus, we can represent the velocity of the particle with a scalar.

u = c

If we want to transform this velocity from the rest frame back to the frame where we are observing the particle, we need to apply a Lorentz transformation.

u' = LuL† = cLL†

As we have previously seen, a para-vector such as u' can be a composition of spinors. Thus, we can consider the spinor L to characterize the velocity of the particle. As part of his development of the Algebra of Physical Space, W. E. Baylis chooses to assign a special name to the spinor L. He calls it the Classical Spinor, or Classical Eigen-Spinor, and denotes it Λ. Baylis uses units where c = 1, but we aren't going to do that here. Therefore we need to include this factor in the definition of the classical spinor.
Λ = √c L

This is called the classical spinor, because it is the spinor which is representative of the classical trajectory of the particle, and so in some respect representative of the particle itself. Using the classical spinor has many advantages. For instance, we not only can determine the velocity of the particle, but we can also determine the spatial orientation of the coordinate system that the particle resides in.

We previously saw that the Lorentz transformation can be given in an exponential form, in terms of the generators of the transformation. We can likewise construct the classical spinor from generators, which are functions of the particle's proper time τ. For instance, we can express a particle that is spinning at a constant rate as

Λ = √c e^((1/2) ωτ)

In this case, the generator of the transformation is the vector ωτ. We call the vector ω the spatial rotation rate, and require that it is purely imaginary. The classical spinor can describe a particle with quantum-like spin, without the need of any quantum postulates. For instance, the classical spinor changes sign upon a full rotation, and requires two full rotations before the sign is restored. In this sense, a particle represented by a classical eigenspinor can be associated with a spin 1/2 particle, like an electron.

If we want a particle that can accelerate as well as spin, then we allow the rotation rate to have real as well as imaginary parts. If the rotation rate has arbitrary real and imaginary parts, we call it the space-time rotation rate, and designate it as Ω.

A classical eigenspinor with a constant space-time rotation rate represents a particle that is spinning and accelerating at a constant rate.
Λ = √c e^((1/2) Ωτ)

We can take a derivative with respect to τ in order to achieve what Baylis calls the equation of motion for the classical spinor.
dΛ/dτ = (1/2) ΩΛ

This equation ends up being valid, even when the space-time rotation rate is not constant.

So what are the possible values Ω can take? Knowing that the magnitude of the 4-velocity of a particle is constant, we also know that the magnitude of the classical spinor is constant. This constraint helps us determine the restriction on Ω.

Λ̅Λ = c e^((1/2) (Ω + Ω̅) τ)

In order for Λ̅Λ to be constant, we need the factor involving the exponent to go away. This will happen if the quantity in the exponent involving Ω is always zero. This quantity in the exponent just happens to represent the scalar portion of Ω.

There are two ways of interpreting this result. We may say that the physical quantity that we associate with Ω can not have a scalar portion, or we can say that the physical quantity can have a scalar portion, but the scalar portion does not contribute to Λ. Though Baylis uses the first interpretation, I personally prefer the second. If you want to take the second interpretation, then you need to modify the expression for the equation of motion of Λ.


dΛ/dτ = (1/2) <Ω>V Λ

The reason I prefer this interpretation, is because it introduces a symmetry. The symmetry here is that we can freely change the scalar portion of Ω without changing the resulting trajectory of the particle. This symmetry should have physical consequences, as all symmetries do. If these physical consequences can be justified, then this form of the equation will be justified.
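
To make this concrete, here is a numerical sketch of my own (not from Baylis): representing the basis elements by the 2x2 Pauli matrices - an assumed model of the algebra - a purely imaginary rotation rate gives a spinor that keeps the four-velocity u = ΛΛ† at rest while flipping sign after one full spatial rotation:

    import numpy as np
    from scipy.linalg import expm

    c = 1.0                                            # units with c = 1
    e3 = np.array([[1, 0], [0, -1]], dtype=complex)    # Pauli matrix for e3

    w0 = 2.0
    omega = 1j * w0 * e3       # purely imaginary vector: a spatial rotation rate

    def Lam(tau):
        return np.sqrt(c) * expm(0.5 * omega * tau)

    # u = Lambda Lambda-dagger stays (c, 0): the particle "spins" while at rest
    print(np.allclose(Lam(1.3) @ Lam(1.3).conj().T, c * np.eye(2)))   # True

    tau_full = 2 * np.pi / w0                        # one full spatial rotation
    print(np.allclose(Lam(tau_full), -Lam(0)))       # sign flip after one turn
    print(np.allclose(Lam(2 * tau_full), Lam(0)))    # restored after two turns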

Tuesday, February 23, 2010

Para-Vectors, Bipara-Vectors, and Spinors

We can think of a para-vector as the linear combination of a real scalar with a real vector. The para-vector is the equivalent of the four-vector in tensor based formalisms. If we were using tensor language, the four-vector would have been defined in terms of its transformation properties. Likewise, we can define a para-vector from its transformation properties.

For instance, I can say that an object is a para-vector without making reference to the state of the components, but merely by noting the way that it transforms. Para-vectors transform like this:

A' = LAL†

As previously stated, a bipara-vector is the product of 2 paravectors. However, relativity dictates that any physically significant quantity is covariant, or able to maintain the same form after a Lorentz transformation. Consider what will happen if we merely multiply two para-vectors together

C = AB

Now let's apply a Lorentz transformation to the equation

C' = LAL†LBL†

Due to the L†L factor that ends up getting sandwiched in between A and B, we see that C is not a covariant quantity - it changes its form after the transformation. However, consider the definition

C = AB̅

C' = LAL†(LBL†)‾ = LAL† L̅† B̅ L̅

C' = L AB̅ L̅

C' = LCL̅

In the second line we use the fact that L̅L = 1 (and therefore L†L̅† = (L̅L)† = 1). This allows the sandwich factor to be "dissolved", and allows C' to take the same form as C. Thus, according to this definition, C is a covariant quantity. We have also derived the transformation rule for a bipara-vector. We can use this transformation law as the definition of a bipara-vector.

Note that in both cases, the para-vector transforms to a para-vector, and the bipara-vector transforms to a bipara-vector. Both transformation laws also preserve the "group" property of the transformation. In other words, if I were to perform 2 transformations in a row, the result could be equivalent to some single overall transformation.

If we must have a transformation law in order for a quantity to be physically significant, we ask: if L itself is physically significant, is it a para-vector, or a bipara-vector? In order to answer this, consider the action of two sequential transformations

A' = L2LAL†L2†

B' = L2LBL̅ L̅2

In both of these cases L acts before L2. This set of sequential transformations would have been equivalent to a single transformation by a composite L'.

L' = L2L

We can consider here that L has been transformed by L2. If we consider this to be a valid transformation law, then we see that L does not transform like a para-vector, or a bipara-vector. We call quantities that transform in this way "Spinors". Spinors transform with only a single application of the transformation like this:

S' = LS

Spinors behave like basic building blocks of relativistically significant quantities. For instance, we can use two spinors to construct a para-vector.

A = S1S2†

This quantity obeys the transformation law of a para-vector.

We can also use two spinors to construct a bipara-vector

B = S1S̅2

This quantity obeys the transformation law of a bipara-vector.

To recap, here are our 3 relativistically significant quantities

S' = LS ; spinor

A' = LAL† ; para-vector

B' = LBL̅ ; bipara-vector

Monday, February 22, 2010

Lorentz Transformations

Now we begin to introduce a little physics. (The form of the multiplication law is already hiding a bit of physics.)

We are going to derive a Lorentz transformation. To be specific, we are going to derive a proper Lorentz transformation, in other words, the transformation can include boosts and rotations, but not inversions.

We may expect that a general transformation can be obtained by the following rule

A' = S A R

We require two elements, S and R, since multiplication on the left is a distinct operation from multiplication on the right.

Using this general form, we first require that the transformation does not change the real-ness, or imaginary-ness of A. We can do this by assuming that A is real, and setting A' equal to the Hermitian conjugate of A'. This condition is stated as

R† A S† = S A R

Since A is entirely real, this condition can only be satisfied if

R = S†

Thus, the form of the proper Lorentz transformation must be

A' = S A S†

The other condition on a Lorentz transformation is that it leaves the space-time interval invariant. The space-time interval is the length of the displacement in space-time. We know how to find such a length - we use a dot product, which has previously been defined for our algebra. In other words, the dot product defined for our algebra must be invariant under a Lorentz transformation.

<A'B̅'>S = <AB̅>S

We can expand the left hand side of the equation as follows

(1/2) ( SAS†(SBS†)‾ + SBS†(SAS†)‾ ) = SS̅ (S̅S)† <AB̅>S

Thus the dot product remains invariant only if the factor involving S is equal to 1. For a proper Lorentz transformation, this is accomplished if

S̅S = 1

We give the symbol L to such a quantity that satisfies these conditions. The components of L can be parametrized in terms of a direction N and an angle θ.

L = ( cosh(θ/2), N sinh(θ/2) )

In this parametrization θ is allowed to be real, imaginary, or complex. If θ is purely real then it represents the rapidity of a Lorentz boost in the direction of N. If θ is purely imaginary then it represents the rotation angle of a spatial rotation around an axis N. If θ is complex, then there is a combination of boost and rotation, in a screw-like motion.

We can also use an exponential form to describe L. For instance, a general Lorentz transformation can be written as

L = e^((1/2) (Γ + iΘ))

Here Γ and Θ are pure vectors that are entirely real, and they represent the 6 generators of Lorentz transformations. Using this form, we can represent the Lorentz transformation as

A' = e^((1/2) (Γ + iΘ)) A e^((1/2) (Γ - iΘ))

This form is reminiscent of the group theoretic operator approach that is used in quantum mechanics. This will not be the last time we are reminded of quantum mechanics...
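
These conditions are easy to check numerically. The sketch below is my own, and assumes the 2x2 Pauli-matrix representation of the basis elements, in which the algebraic conjugate becomes the matrix adjugate and the space-time interval becomes the determinant:

    import numpy as np
    from scipy.linalg import expm

    e = [np.eye(2, dtype=complex),                      # e0
         np.array([[0, 1], [1, 0]], dtype=complex),     # e1
         np.array([[0, -1j], [1j, 0]], dtype=complex),  # e2
         np.array([[1, 0], [0, -1]], dtype=complex)]    # e3

    def bar(M):   # algebraic conjugate = 2x2 adjugate
        return np.array([[M[1, 1], -M[0, 1]], [-M[1, 0], M[0, 0]]])

    # rapidity 0.7 boost along x combined with a 0.3 radian rotation about z
    L = expm(0.5 * (0.7 * e[1] + 1j * 0.3 * e[3]))
    print(np.allclose(bar(L) @ L, np.eye(2)))           # L-bar L = 1

    a = np.random.randn(4)
    A = sum(a[mu] * e[mu] for mu in range(4))           # a random paravector
    Ap = L @ A @ L.conj().T                             # A' = L A L-dagger

    # the space-time interval (the determinant) is unchanged
    print(np.allclose(np.linalg.det(Ap), np.linalg.det(A)))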

(Scalar, Vector) Notation

We have already determined how the basis elements of the algebra multiply with each other. However, if each element can have up to 8 components, then the product of two of these elements could have up to 64 terms. This is not very friendly.

In this post we will present the algebraic multiplication law in terms of familiar vector operations such as dot and cross products.

We can represent an arbitrary element of the algebra as a linear combination of 4 pieces, using (scalar, vector) notation

A = (a + ib, C + iD)

Here the scalar portions are designated by lowercase italicized symbols, while the vector portions are uppercase non-italicized symbols. These 4 quantities represent the scalar, pseudo-scalar, vector, and pseudo-vector quantities.

We will first consider the multiplication of elements that are purely real.

A = (a, A)
B = (b, B)

The multiplication of two such elements results in 16 terms. These terms can be written using the vector dot and cross products as

AB = (ab + A ∙ B, aB + bA + iA × B)

The right hand side of this equation contains 3 distinct portions, a scalar, a vector, and a pseudo-vector term. Now let A and B represent general elements. This can be done in the following way

A = C + iD
B = E + iF
AB = CE - DF + iCF + iDE


Here C, D, E, and F are all purely real algebraic elements.

If an algebraic element is purely real, it is called a para-vector. Some examples of para-vectors are position, or momentum

x = (ct, X)
p = (E/c, P)

We have applied the appropriate factor of c in both cases, in order for the scalar portion to have the correct scale with respect to the vector portion - relativistically speaking. We see that we have associated known relativistic four vectors with algebraic para-vectors.

The multiplication of two para-vectors results in a bi-para-vector. Such a quantity usually corresponds with a multi-indexed tensor in standard relativistic treatments.
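
The multiplication law is easy to test. The sketch below (mine, not the author's) implements the (scalar, vector) product rule and checks it against the 2x2 Pauli-matrix representation, which I am assuming as a concrete model of the algebra:

    import numpy as np

    def mul(a, A, b, B):
        # (a, A)(b, B) = (ab + A.B, aB + bA + i A x B)
        return a*b + np.dot(A, B), a*B + b*A + 1j*np.cross(A, B)

    sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
             np.array([[0, -1j], [1j, 0]], dtype=complex),
             np.array([[1, 0], [0, -1]], dtype=complex)]

    def rep(s, V):   # (scalar, vector) -> 2x2 matrix
        return s * np.eye(2) + sum(V[k] * sigma[k] for k in range(3))

    a, A = 1.5, np.array([1.0, 2.0, 0.5])
    b, B = -0.3, np.array([0.2, -1.0, 3.0])
    s, V = mul(a, A, b, B)
    print(np.allclose(rep(s, V), rep(a, A) @ rep(b, B)))   # True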

Imaginary Numbers, Spatial Inversions, and Duals

Here is the situation so far. We have an 8 dimensional algebra which has two conjugate operations. The conjugate operations allow us to split any element of the algebra into 4 sub categories:
  • Real Scalar
  • Imaginary Scalar
  • Real Vector
  • Imaginary Vector

We have already determined the physical significance of the scalar vs. vector categories - we use this to encode the space/time split. Can we determine a physical significance for the real and imaginary split as well?

These categories are defined by their behavior when one of the two conjugates are applied. For instance, objects in the scalar category are unaffected by the algebraic conjugate, and objects in the real category are unaffected by the hermitian conjugate.

How do these categories behave when both conjugates are applied at the same time?

  • Real Scalar - unchanged
  • Imaginary Scalar - flips sign
  • Real Vector - flips sign
  • Imaginary Vector - unchanged

This behavior can be explained if we state that applying both conjugates simultaneously has the physical meaning of spatial inversion. We should already know that there are 4 different kinds of quantities that behave differently under spatial inversion.

  • Scalars - unchanged under spatial inversion
  • Pseudo-Scalars - flips sign under spatial inversion
  • Vectors - flips sign under spatial inversion
  • Pseudo-Vectors - unchanged under spatial inversion

If we assign the physical meaning of spatial inversion to the action of both conjugates applied at the same time, then we can determine an actual physical meaning for the imaginary portion of the algebra - namely, purely imaginary quantities are pseudo-quantities.

For instance, we might represent both time and volume with a real number. However, we would expect volume to change signs upon spatial inversion, whereas time should not.

Likewise, the Electric field should change signs upon a spatial inversion, but the Magnetic field should not.

We have known about pseudo-quantities for a very long time. However, it is the usual practice to place these pseudo-quantities in the same 4 dimensional space as the non-pseudo-quantities, with the stipulation that "you can't add vectors with pseudo-vectors". In other words, a physics equation cannot contain both vector and pseudo-vector pieces.

The reason for this is that these pieces are linearly independent. The introduction of the imaginary unit in the algebra helps us distinguish pseudo-quantities, and also enforces the linear independence of these quantities.

In 3 dimensions, the primary example of a pseudo-vector is the cross product

C = A × B

the example of a pseudo-scalar is the triple product

D = A ∙ (B × C)

With this identification of the physical meaning behind the real/imaginary split, we can safely state that the algebra represents an 8 dimensional space, without needing to postulate any weird parallel universes. Rather we merely provide 4 dimensions for normal quantities, and 4 linearly independent dimensions for pseudo-quantities.

So if we are using an 8 dimensional space, how come we are using a 4 dimensional metric? The answer is that the "pseudo" half, or imaginary half, of the space uses a metric that is implied, and which can be determined. The pseudo metric is the same as the non-pseudo metric, except for an overall sign change. Remember that we chose the (1, -1, -1, -1) metric. The pseudo metric is then (-1, 1, 1, 1).

Most texts on relativity state that the overall sign of the metric is unimportant. We see here that the sign of the metric distinguishes the pseudo space from the non-pseudo space. So would it have made a difference if we had chosen the (-1, 1, 1, 1) metric to begin with? The answer is no. I might still choose the time component to correspond with the scalar basis element, but I would then have to choose e0 = i as the scalar basis element. I would still be able to derive the rest of the algebra. The only difference would be the assignment of i to pseudo quantities.

In other words, the choice of metric is related to a duality principle, which has much more physical significance than a mere sign convention. In fact, the duality principle that allows us to have a choice of metrics is the same duality principle which allows us to compose a dual electromagnetic field.

In order to find the dual representation of any algebraic element, we need merely multiply by -i. This operation is equivalent to the Hodge dual used in other representations. The only problem with this dual operator is that it doesn't exactly undo itself. Meaning, two applications of -i results in an overall sign change.

I personally like to use the following for a type of dual operator, since it un-does itself nicely:

A' = iA†
A'' = i(A')† = i(iA†)† = A

This duality is a very profound part of nature, or at least in our representation of nature. It seems like an easy thing to overlook, and therefore it probably contains a bounty of rich truths - hidden in plain sight.
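
In the 2x2 matrix sketch of the algebra (my own assumption; the Hermitian conjugate is then the conjugate transpose), the self-undoing property is a one-liner to verify:

    import numpy as np

    A = np.random.randn(2, 2) + 1j * np.random.randn(2, 2)

    dual = lambda M: 1j * M.conj().T          # A' = i A-dagger
    print(np.allclose(dual(dual(A)), A))      # True: two applications restore A

    # multiplication by -i, by contrast, does not undo itself:
    print(np.allclose(-1j * (-1j * A), -A))   # True: it produces a sign change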

The Hermitian Conjugate

Now that we have complex scalars to work with, we need to start to determine what the imaginary portion of the algebra physically represents.

To do this, we will introduce the Hermitian conjugate. The Hermitian conjugate is designated by a superscripted dagger, like A†.

When the Hermitian conjugate acts on a scalar quantity, it has the effect of a standard complex conjugate.

a† = a*

When the Hermitian conjugate acts on a product of algebraic elements, it reverses the order of the elements, similar to the action of the algebraic conjugate

(AB)† = B† A†


If the Hermitian conjugate does nothing when it acts on an element, that element is called "Hermitian" or real. Anti-hermitian, or imaginary elements change sign when acted on by the Hermitian conjugate. The basis elements are assumed to be real elements. Thus, in order to take the Hermitian conjugate of a given element, we merely take the complex conjugate of the components

A† = (a^μ)* e_μ


Using the algebraic conjugate we were able to split an algebraic element into a scalar and vector portion, which we physically identify with the time and space portions. We can make a similar split using the Hermitian conjugate into real and imaginary portions.

<A>R = (1/2) (A + A†)
<A>I = (1/2) (A - A†)


When using basis elements that are derived from the Minkowski metric, we see that we can change the sign on the i's merely by reversing the order of all products of basis elements. For this reason, the Hermitian conjugate is also sometimes referred to as the "Reversal operator". In other words, if we can represent any element as the product of a set of real elements, the Reversal operator merely reverses the order of all of the factors - which ends up having the same effect as the Hermitian conjugate.

The Minkowski Metric

As we have seen, we can begin to establish a multiplication rule for the algebra, as long as we know what the metric of the underlying vector space is.

For our purpose, we will consider the Minkowski Metric of special relativity as our starting point. We will use the (1, -1, -1, -1) convention of this metric. Using this metric requires that we introduce 4 basis elements. The fundamental identity applied to a 4x4 matrix gives us 16 expressions that will encode how we multiply these 4 basis elements.

When we say we want to use the Minkowski metric, we are actually implying that we want to discuss space-time, not just any old vector space which has a Minkowski norm.

At this point, we have a decision to make regarding how we will represent space and time in this algebra. Although space and time have the same footing in special relativity, we know from practical experience that there is a physical distinction between space and time. So far, in our algebra, the only thing that we have which might encode this distinction is the ability to separate quantities out into scalar and vector portions.

If we decide to associate the scalar portion of an element with the time component, and the vector portion of the element with the spatial components, then we also make a statement about the physical interpretation of the conjugate operation. We are saying that the algebraic conjugate does nothing to the time component since it is a pure scalar, and it flips the sign of any space component which are pure vectors.

If we make the decision to encode the space/time boundary with the scalar/vector boundary, then we can immediately determine the identity of e0. The basis vector associated with time must be the basis associated with scalars.

e0 = 1

Making this identification automatically solves 7 of the 16 expressions introduced by the fundamental identity. We now only need to figure out the last 9, which correspond with how the spatial basis elements multiply with each other.

e̅iej = -e̅jei ; i ≠ j

e̅iei = -1

Since the algebraic conjugate only applies an overall change of sign to the purely spatial basis elements, we can recast these equations into

eiej = -ejei ; i ≠ j

(ei)² = 1

The first equation states that if we swap the order of any 2 spatial basis elements, we need to change the sign of the product. The second equation states that any basis element multiplied by itself is equal to 1.

As of yet we have not determined how to represent the product of 2 spatial basis elements. Before we can do this, we need to take a look at the product of all 3 distinct spatial basis elements

e1e2e3

We don't really know what this quantity is yet, but we do know that if we change the order of the basis elements, it will change the overall sign of the quantity. We also know that applying an algebraic conjugate to any individual factor will also change the sign of the product.

The first thing we are going to do is apply the algebraic conjugate to this triple product.

(e1e2e3)‾ = e̅3 e̅2 e̅1

Since each of these elements is a pure spatial quantity, the action of the conjugate accumulates to a single minus sign on the product. The application of the conjugate has applied an overall odd permutation to the order of the elements, which results in a second minus sign. Thus we have the identity

(e1e2e3)‾ = e1e2e3


In other words, the algebraic conjugate does nothing to the triple product. From our definition, then, the triple product is a scalar. We may assume that the triple product must therefore be 1 - the only scalar with unit magnitude. However, consider what happens if we square the triple product

(e1e2e3)(e1e2e3) = -e1e2e3e3e2e1 = -e1e2e2e1 = -e1e1 = -1

Thus, the triple product is a scalar that results in -1 if it is squared. The only quantity that can do this is the imaginary unit, i. Thus we can make the identification

e1e2e3 = i

Consider multiplying the triple product on the right by e3

e1e2e3e3 = e1e2 = ie3

We can perform a similar trick in order to determine the products of the other spatial basis vectors. The end result is the following list of multiplication rules

(eμ)² = 1; all μ
e0eμ = eμ; all μ
e1e2 = -e2e1 = ie3
e2e3 = -e3e2 = ie1
e3e1 = -e1e3 = ie2

The good news is that these rules encapsulate the entire multiplication rule for the algebra. Since we are using the Minkowski metric, we have basically built special relativity right into the algebra itself. The bad news is that our scalars are no longer real numbers, rather they are complex numbers. Aside from not knowing up front what physical significance the imaginary unit has, we have also doubled the size of our algebra from 4 dimensions to 8. We still only have 4 basis vectors, but our scalars are now a 2 dimensional sub-algebra. No need to fear - the complexities of complex scalars will be addressed shortly.
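
These multiplication rules are exactly the rules obeyed by the 2x2 Pauli matrices over the complex numbers, so they can be checked with a few lines of numpy. This is a sketch of my own; the post itself never commits to a matrix realization:

    import numpy as np

    e0 = np.eye(2, dtype=complex)
    e1 = np.array([[0, 1], [1, 0]], dtype=complex)
    e2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
    e3 = np.array([[1, 0], [0, -1]], dtype=complex)

    print(np.allclose(e1 @ e1, e0))              # (e1)^2 = 1
    print(np.allclose(e1 @ e2, -e2 @ e1))        # spatial elements anticommute
    print(np.allclose(e1 @ e2, 1j * e3))         # e1 e2 = i e3
    print(np.allclose(e1 @ e2 @ e3, 1j * e0))    # the triple product is i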

The Underlying Vector Space

Since it is possible to represent physical quantities with vectors, we desire to somehow connect our algebra to this vector space. We can state that the algebra has all of the linear properties that the vector space has. The only thing that a vector space has, which our algebra does not have, is a dot product.

The dot product of a vector space introduces a "norm" function, where we can identify a particular scalar value that is characteristic of an element of the vector space. We have, up to now, defined two characteristic scalar values. We will use the value of the determinant to represent the "norm", or the "length" of the element.

So, how are we going to go about associating this algebra with a vector space?

First, we will assign a set of basis elements to the algebra. These basis elements will be denoted as a bold lower case e. Our algebra represents a multi-dimensional linear space, and so we will need to distinguish the basis elements that correspond to each dimension. This is done by a subscripted index variable, as is standard in most relativistic notation - i.e. eμ.

Every element of the algebra can now be considered as a linear combination of these basis vectors. The scalars that multiply the basis vectors are called the components. The components are generally represented by lower case italicized symbols with superscripted index variables - i.e. aμ.

The index variables represent the particular index we are interested in. We will use standard relativistic conventions with these index variables. First, if an upper and lower index match, then we sum over all values of the index. Second, if the index can represent any of the space-time dimensions then it is represented by a lower case greek letter. If the index represents only one of the spatial dimensions, then it is represented by a lowercase latin letter.

Using the linear combination of basis elements, as well as the index conventions, we can represent an element A in terms of its components aμ like this:

A = a^μ e_μ

We expect that the components of a particular element of the algebra should be the same as the components of the corresponding quantity in the associated vector space.

Now, lets consider what might happen if we try to associate the determinant of an element A with the dot product of the components of A.

A̅A = a^μ a^ν g_μν

Here g is the metric of the associated vector space. We can expand the left hand side in terms of the components of A.

A̅A = a^μ e̅_μ a^ν e_ν

Since A̅A is already a scalar, we can take the scalar part without changing anything.

A̅A = a^μ a^ν (1/2) (e̅_μ e_ν + e̅_ν e_μ)

We can now form a relationship between the products of the basis elements of the algebra and the metric of the associated vector space

(1/2) (e̅_μ e_ν + e̅_ν e_μ) = g_μν

This relationship is the fundamental identity which allows us to define multiplication of the basis vectors in such a way that we are assured that the components of the element in the algebra are identical to the components of the vectors in the associated vector space. This relationship also DEFINES the multiplication rules between the basis elements. Thus, the metric is what defines the multiplication rule for the algebra. Since the metric is a non-trivial aspect of physics, we have a clue here that the multiplication rule for the algebra actually carries substantial physics with it.

This relationship between basis elements and the metric is very similar to what is called the "Fundamental Identity of a Clifford Algebra". The only difference is that in the Clifford Algebra version does not contain any algebraic conjugates.

Of course, my version is better :)

The dot product between two different elements is defined as

<AB̅>S = a^μ b^ν g_μν

If you would like to generalize this to a generalized "dot product" that can be associated with any square matrices, then it would have the following form

(1/n) Tr( A Adjoint(B) ) = a^μ b^ν g_μν

Here, n is the dimension of the matrix.
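
Sticking with the 2x2 matrix sketch (my assumption, with the algebraic conjugate realized as the matrix adjugate), both the fundamental identity and the generalized trace form of the dot product check out numerically:

    import numpy as np

    e = [np.eye(2, dtype=complex),
         np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

    def bar(M):   # algebraic conjugate = 2x2 adjugate
        return np.array([[M[1, 1], -M[0, 1]], [-M[1, 0], M[0, 0]]])

    g = np.diag([1.0, -1.0, -1.0, -1.0])
    for mu in range(4):
        for nu in range(4):
            lhs = (bar(e[mu]) @ e[nu] + bar(e[nu]) @ e[mu]) / 2
            assert np.allclose(lhs, g[mu, nu] * np.eye(2))   # fundamental identity

    a, b = np.random.randn(4), np.random.randn(4)
    A = sum(a[mu] * e[mu] for mu in range(4))
    B = sum(b[mu] * e[mu] for mu in range(4))
    print(np.allclose(np.trace(A @ bar(B)) / 2, a @ g @ b))  # n = 2 here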

Determinant and Trace

So far we have discovered 2 scalars that can characterize a particular element of the algebra. The first is given by

A̅A = a

If we were to represent our algebra by square matrices, then this quantity would be the determinant of the matrix which represents A.

The second characterizing scalar is called the "scalar part" of A, and is given by

<A>S = (1/2)(A + A̅)

To see how these two scalars relate to each other, we need to introduce the exponential function. Consider that we have an element that is expressed as the exponent of A

e^A

Taking the exponent of an algebraic element that we don't know much about seems kind of ambiguous. However, if you must, consider that the exponent converges from the infinite sum

e^A = Σ_n (1/n!) A^n

Now let us determine what happens if we take the algebraic conjugate of e^A.

First, we know that the conjugate commutes with respect to addition, so the conjugate gets applied to each term of the infinite sum separately.

Next, we know that the conjugate only affects A^n, since 1/n! is a scalar.

Finally, we know that the conjugate of A^n is just the nth power of the conjugate of A

(A^n)‾ = (A̅)^n

The end result of all of this is

(e^A)‾ = e^A̅

Multiplying e^A by its conjugate produces a scalar, which we have identified with the determinant of e^A.

e^A (e^A)‾ = e^A e^A̅ = det(e^A)

Now, if we multiply the exponent of two real numbers together, this is the same as exponentiating the sum of the two numbers. For instance

e^x e^y = e^(x+y)

This rule does not apply generally, since the algebraic elements do not commute with respect to multiplication. However, the elements A and A̅ DO commute, and so the exponential rule CAN be applied in this instance.

e^A e^A̅ = e^(A + A̅)

The factor in the exponent is a multiple of the scalar part of A as we have previously defined.

Thus we can relate the determinant of A with the scalar portion of A through the exponential function

det(e^A) = e^A e^A̅ = e^(A + A̅) = e^(2<A>S)

This equation can be compared to a parallel equation from the determinant theory of square matrices

det(e^A) = e^(Tr(A))

Thus we see that if we were to use square matrices to represent the elements of our algebra, then the "scalar part" is merely a multiple of the trace of the matrix.

In the theory of Lie Algebra, a matrix that has a trace of zero is used to represent the generator of a specific transformation. In our algebra, such a generator would be an element whose scalar portion is zero.
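
Both determinant relations are quick to confirm in the 2x2 sketch, where the scalar part is Tr(A)/2 - again my own check, not the post's:

    import numpy as np
    from scipy.linalg import expm

    e = [np.eye(2, dtype=complex),
         np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

    a = np.random.randn(4) + 1j * np.random.randn(4)
    A = sum(a[mu] * e[mu] for mu in range(4))       # a random algebra element

    scalar_part = np.trace(A) / 2                   # <A>_S in this representation
    d = np.linalg.det(expm(A))
    print(np.allclose(d, np.exp(2 * scalar_part)))  # det(e^A) = e^(2<A>_S)
    print(np.allclose(d, np.exp(np.trace(A))))      # det(e^A) = e^(Tr(A))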

Sunday, February 21, 2010

Scalars and Vectors

The algebraic conjugate operation affords us the ability to observe a distinction between different kinds of quantities. First, we know from the definition of the conjugate

A̅A = a

that when the algebraic conjugate acts on a scalar, it does nothing. For instance

a̅ = (A̅A)‾ = A̅ (A̅)‾ = A̅A = a

We are going to make this another defining property of a scalar - it is invariant to the algebraic conjugate.

Now there are several ways to define a scalar in math and physics, and so far I have made reference to none of these. Rather I have so far stated that a scalar is
  • able to commute with respect to multiplication by any element of the algebra
  • invariant with respect to application of the algebraic conjugate
We will see that these defining properties of a scalar lead to the other definitions that we know so well. We will also see that these 2 defining properties end up being mutually satisfied - at least for the end result of our physical algebra.

A̅A is a characteristic scalar associated with A. We can find another characteristic scalar. Rewrite A into terms that have symmetric and anti-symmetric combinations with A̅.

A = (1/2)(A + A̅) + (1/2)(A - A̅)

If you expand the expression on the right hand side, you will see that it reduces to A. The expression on the right hand side has 2 terms. The first term is invariant to the algebraic conjugate, and thus we see that the first term is a scalar term. The application of the word "scalar" in this context becomes very similar to the original historical usage introduced by Hamilton.

If we remove the scalar part, what are we left with? The algebra represents multi-component objects, and the scalar part only represents single component objects. Thus we can assume that there are several components associated with the remaining portion, if we strip the scalar portion away.

If we apply the algebraic conjugate to the second, or anti-symmetric term, it changes sign. This is to be the defining property of a Vector. Again, the term vector has a host of meanings, and definitions. The usage as applied here is different than the standard relativistic version of the word, but it is similar to the original usage coined by Hamilton. We are going to be talking about vectors much more in future posts.

Since we know that can always decompose any element into a scalar and a vector part, we will use a special notation to represent this

<A>S = (1/2)(A + A̅)

<A>V = (1/2)(A - A̅)

And we will use the algebraic conjugate to detect a pure scalar or a pure vector
  • If the conjugate does nothing to the algebraic element, it must be a pure scalar
  • If the conjugate flips the sign of the element, it must be a pure vector

The very naive idea of a scalar and a vector is that a scalar is a single component quantity, and a vector is a multi-component quantity. As the algebra develops we will see that this is a pretty reasonable first approach to these quantities.

Most elements in the algebra have a linear combination of both a scalar and a vector part. The word we use to talk about a linear combination of a scalar and a vector is "para-vector". The meaning behind a para-vector is closer to the meaning of the relativistic vector.
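
If we model elements as 2x2 complex matrices (an assumption I use for all of these sketches), the conjugate becomes the matrix adjugate and the scalar/vector split looks like this:

    import numpy as np

    def bar(M):   # algebraic conjugate = 2x2 adjugate
        return np.array([[M[1, 1], -M[0, 1]], [-M[1, 0], M[0, 0]]])

    A = np.random.randn(2, 2) + 1j * np.random.randn(2, 2)
    S = (A + bar(A)) / 2
    V = (A - bar(A)) / 2

    print(np.allclose(S, np.trace(A) / 2 * np.eye(2)))   # pure scalar
    print(np.allclose(bar(V), -V))            # the conjugate flips the vector part
    print(np.allclose(S + V, A))              # the split reassembles A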

Friday, February 19, 2010

Algebraic Conjugate

In order to more fully define the operation of division, we need to introduce a special operation called the algebraic conjugate.

The algebraic conjugate acts on elements of the algebra. An element that has been acted on by the algebraic conjugate operation is denoted by the symbol with a bar over the top, such as A̅. The algebraic conjugate is defined such that

A̅A = a

We don't really care yet what the value of a is, since we haven't really defined the conjugate, or the multiplication. What does matter is that a is a scalar. Using this definition of the algebraic conjugate, we can relate it to the "division" or inverse operation.

A⁻¹ = 1 / A = (1 / A) ( A̅ / A̅ ) = A̅ / (AA̅) = (1 / a) A̅

Now, a is a real number, whose inverse is defined. This relationship between the inverse, the conjugate, and a is present in an identity from matrix algebra:

A⁻¹ = (1 / Det(A) ) Adjoint(A)

Thus, if we were using matrices to represent our physical quantities, a would be the determinant of the matrix and A̅ would be the matrix Adjoint.

If we make the requirement that the conjugate operation should be linear, then we have the following property

(A + B)‾ = A̅ + B̅

We can use the relationship between the conjugate and the inverse to show that

(AB)‾ = B̅ A̅

Finally, since we know that taking an inverse twice should do nothing, we can again employ the relationship between the conjugate and the inverse to show that taking the conjugate twice also does nothing.
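
As a concrete illustration (my own, using 2x2 complex matrices as the representation that the matrix-algebra identity above hints at), the conjugate is the adjugate, and every property in this post can be verified numerically:

    import numpy as np

    def bar(M):   # 2x2 adjugate: swap the diagonal, negate the off-diagonal
        return np.array([[M[1, 1], -M[0, 1]], [-M[1, 0], M[0, 0]]])

    A = np.random.randn(2, 2) + 1j * np.random.randn(2, 2)
    B = np.random.randn(2, 2) + 1j * np.random.randn(2, 2)

    a = np.linalg.det(A)
    print(np.allclose(bar(A) @ A, a * np.eye(2)))      # A-bar A = a, a scalar
    print(np.allclose(np.linalg.inv(A), bar(A) / a))   # inverse = (1/a) A-bar
    print(np.allclose(bar(A @ B), bar(B) @ bar(A)))    # order reversal
    print(np.allclose(bar(bar(A)), A))                 # conjugating twice does nothing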

Algebraic Representation

In physics, we need a way to represent physically meaningful concepts in a quantifiable way. When discussing things like time, or mass, we can choose real numbers to represent these concepts quantitatively. The real numbers form an algebra, which basically means that if you add them or multiply them, the result is also a real number.

The principal form of representation for most physical objects is a vector, which does not belong to an algebra. This means that there is no intrinsic product defined so that two vectors can be multiplied by each other to form another vector. This lack of a defined multiplication in our basic representation of physical quantities deprives us of an understanding of how these quantities interplay with each other.

If we want to solve the universe we need to figure out how to represent our physical quantities as algebras rather than vector spaces, so we don't miss out on the added information provided by this multiplication.

We will employ a linear algebra for the representation. What this means is that although we can decompose a quantity like position into its principle components, the full representation can be achieved by a square matrix, AND this representation is preferred over the component form.

An element of our physical algebra will be denoted by a bold symbol, such as A.

The product of two such elements can be written as

AB = C

The omission of the multiplication symbol distinguishes an algebraic multiplication from other types of multiplication such as dot products, or cross products. At this point, looking at an equation such as this is pretty pointless, since we don't know exactly how to multiply the elements together, even if we know the physical meaning of the symbols. We will discover how to multiply the elements of our physical algebra by first examining what types of properties this multiplication is expected to have.

In general, the members of an algebra are not expected to commute with respect to multiplication. What this means is

AB ≠ BA

generally. In other words, the order of multiplication is usually important.

There are usually elements in the algebra that commute with every other element in the algebra. We call these special elements scalars and denote them by a non-bold lowercase italicized symbol, such as a.

In other words, we know that for scalars we can always re-arrange the order of multiplication.

aB = Ba

always.

Common scalar values in physics represent quantities such as time, or mass. We are used to representing these scalar quantities with real numbers, and we will continue to do so. This means that we require the real numbers to be present in our algebraic representation of the universe.

Since the real numbers are included in our algebra, we know that we have access to two special real numbers, 0 and 1. These two numbers help us define the operations of subtraction and division.

For instance, if we know that

A + B = 0

We can rewrite the sum

C + B = C - A

Likewise, if we know that

AB = 1

We can rewrite the product

CB = CA⁻¹

We cannot strictly call this a "division", since there would be two possible definitions - given that we cannot in general rearrange the order of multiplication.
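
To see why "division" is ambiguous, here is a tiny numerical illustration of my own (using 2x2 matrices as a stand-in for algebra elements - an assumption, since we have not yet chosen a representation):

    import numpy as np

    A = np.array([[0, 1], [1, 0]], dtype=complex)
    B = np.array([[1, 0], [0, -1]], dtype=complex)

    print(np.allclose(A @ B, B @ A))    # False: the order of multiplication matters

    C = A @ B
    Binv = np.linalg.inv(B)
    print(np.allclose(C @ Binv, A))     # True: "dividing" on the right recovers A
    print(np.allclose(Binv @ C, A))     # False: "dividing" on the left does not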