The MEASUREMENT EQUATION
of a generic radio telescope
AIPS++ Implementation Note nr 185

J.E.Noordam
(jnoordam@nfra.nl)

15 February 1996, version 2.0

File:  /aips++/nfra/185.latex Symbols File:  /aips++/nfra/megi-symbols.tex

Abstract: This note is a step towards an ‘official’ AIPS++ description of the Measurement Equation, based on an agreed set of names and conventions. The latter have been defined in a separate TeX file, and can (should) be used in subsequent AIPS++ documents to ensure consistency.

Contents

1 INTRODUCTION
2 THE M.E. FOR A SINGLE POINT SOURCE
 2.1 The feed-based instrumental Jones matrices
 2.2 The Jones matrix of a Tied Array feed
 2.3 Jones matrices for multiple beams
3 THE FULL MEASUREMENT EQUATION
 3.1 Summing and averaging
 3.2 Interferometer-based effects
4 POLARISATION COORDINATES
5 GENERIC FORM OF JONES MATRICES
 5.1 Ionospheric Faraday rotation (Fi(ρ, ri))
 5.2 Atmospheric gain (Ti(ρ, ri))
 5.3 Fourier Transform kernel (Ki(ri.ρ))
 5.4 Projection matrix (Pi) if γxa = γyb
 5.5 Projection matrix (Pi) if γxa ≠ γyb
 5.6 Voltage primary beam (Ei(ρ))
 5.7 Position-independent receptor cross-leakage (Di)
 5.8 Commutation (Yi)
 5.9 Hybrid (Hi)
 5.10 Electronic gain (Gi)
 5.11 Do we need a configuration matrix (Ci)?
6 THE ORDER OF JONES MATRICES
 6.1 Overview of commutation properties
 6.2 Overview of Jones matrix forms
 6.3 Allowable changes of order
 6.4 VisJones and SkyJones
  6.4.1 Tied Array
A APPENDIX: CONVENTIONS
 A.1 Some definitions
 A.2 Labels, sub- and super-scripts
 A.3 Coordinate frames
 A.4 Matrices and vectors
 A.5 Miscellaneous parameters

1 INTRODUCTION

The matrix-based Measurement Equation (ME) of a Generic Radio Telescope was developed by Hamaker, Bregman and Sault [2] [3], based on earlier work by Bregman [1]. After discussion by Noordam [5] and Cornwell [6] [7] [8] [9] [10] [11], the M.E. has been adopted as the generic foundation of the uv-data calibration and imaging part of AIPS++. In the not too distant future, an ‘official’ AIPS++ description of the ME will be needed, with agreed conventions and nomenclature (see Appendix A). This note is a step towards that goal.

The heart of the M.E. is formed by the 2 × 2 feed-based ‘Jones’ matrices, which describe the effects of various parts of the observing instrument on the signal. The main section of this document is devoted to describing the basic form of the Jones matrices in linear and circular polarisation coordinates. Another section discusses the conditions under which their order may be modified (matrices do not always commute).

It is expected that the details of the M.E. (and of this note) will be refined during the first few iterations of design and implementation of AIPS++. But the structure of the M.E. formalism as presented here appears to be rich enough to accommodate all existing and planned radio telescopes. This includes ‘exotic’ ones like cylindrical mirrors, phased arrays, and interferometer arrays with very dissimilar antennas. Further refinements should only require the addition of new Jones matrices, or devising new expressions for existing matrix elements.

In order to test this bold assertion, the various institutes might endeavour to model their own telescopes in terms of the precise and common language of the M.E., using this note as a reference. The following ‘rules’ are probably good ones:

It is also good to realise that there are two basic forms of ME, which should not be confused: In the physical form, each instrumental effect is modelled separately by its own matrix. This is useful for simulation purposes. In the mathematical form, effects are ‘lumped together’ if they cannot be solved for separately. Example: the various contributions to the receiver gain, and tropospheric gain.

Acknowledgements: The author has greatly benefited from detailed discussions with Jayaram Chengalur, Jaap Bregman, Johan Hamaker, Tim Cornwell, Wim Brouw and Mark Wieringa.

2 THE M.E. FOR A SINGLE POINT SOURCE

For the moment, it will be assumed that there is a single point source at an arbitrary position (direction) ρ = ρ(l,m) w.r.t. the fringe-tracking centre, and that observing bandwidth and integration time are negligible. Multiple and extended sources, and the effects of non-zero bandwidth and integration time will be treated for the Full Measurement Equation in section 3.

For a given interferometer, the measured visibilities can be written as a 4-element ‘coherency vector’ Vij, which is related to the so-called ‘Stokes vector’ I(l,m) of the observed source by a matrix equation,

$$ \vec V_{ij} = \begin{pmatrix} v_{ipjp} \\ v_{ipjq} \\ v_{iqjp} \\ v_{iqjq} \end{pmatrix} = (J_i \otimes J_j^{*})\, S \begin{pmatrix} I \\ Q \\ U \\ V \end{pmatrix}_{l,m} \tag{1} $$

The subscripts i and j are the labels of the two feeds that make up the interferometer. The subscripts p and q are the labels of the two output IF-channels from each feed.1

The ‘Stokes matrix’ S is a constant 4 × 4 coordinate transformation matrix. It is discussed in detail in section 4 below. The real heart of the M.E. is the ‘direct matrix product’ Ji ⊗ Jj* of two 2 × 2 feed-based Jones matrices.

The ‘Stokes-to-Stokes’ transmission of a Stokes vector through an ‘optical’ element may be described by multiplication with a 4 × 4 Mueller matrix ℳij [2] [3]. Using equation 1:

$$ \vec I_{out}(l,m) = S^{-1}\,\vec V_{ij} = S^{-1}(J_i \otimes J_j^{*})\,S\,\vec I_{in}(l,m) = \mathcal{M}_{ij}(l,m)\,\vec I_{in}(l,m) \tag{2} $$

Mueller matrices are useful in simulation, when studying the effect of instrumental effects on a test source I(l,m). They can be easily generalised to the full M.E. (see section 3).
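As an illustration of equations 1 and 2, the following minimal Python sketch builds the direct product Ji ⊗ Jj*, applies the linear-coordinate Stokes matrix S+ of section 4, and forms the corresponding Mueller matrix. The function names (`kron`, `coherency`, `mueller`) and the sample Jones matrices are hypothetical choices made for this example only; they are not part of any AIPS++ interface.

```python
# Pure-Python sketch of equations 1 and 2 (all names are illustrative).

def kron(a, b):
    """Direct (Kronecker) product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def conj(a):
    """Element-wise complex conjugate of a matrix."""
    return [[x.conjugate() for x in row] for row in a]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def matvec(m, v):
    """Matrix times column vector."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

# Stokes matrix for linear polarisation coordinates (equation 15).
S_LIN = [[0.5, 0.5, 0.0, 0.0],
         [0.0, 0.0, 0.5, 0.5j],
         [0.0, 0.0, 0.5, -0.5j],
         [0.5, -0.5, 0.0, 0.0]]
# S is unitary up to a constant: S^-1 = 2 (S^*)^T.
S_LIN_INV = [[2 * S_LIN[c][r].conjugate() for c in range(4)] for r in range(4)]

def coherency(j_i, j_j, stokes):
    """V_ij = (J_i (x) J_j^*) S I for a single point source (equation 1)."""
    return matvec(kron(j_i, conj(j_j)), matvec(S_LIN, stokes))

def mueller(j_i, j_j):
    """M_ij = S^-1 (J_i (x) J_j^*) S (equation 2)."""
    return matmul(S_LIN_INV, matmul(kron(j_i, conj(j_j)), S_LIN))

unit = [[1, 0], [0, 1]]
gain = [[2, 0], [0, 1j]]            # made-up diagonal gain for feed i
source = [1.0, 0.1, 0.0, 0.0]       # test source (I, Q, U, V)

v_unit = coherency(unit, unit, source)   # perfect instrument
v_gain = coherency(gain, unit, source)   # gain error on feed i only
m_unit = mueller(unit, unit)             # should be the 4x4 unit matrix
```

For a perfect instrument the Mueller matrix reduces to the unit matrix, which is a convenient sanity check on the chosen Stokes matrix convention.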

2.1 The feed-based instrumental Jones matrices

It will be assumed (for the moment) that all instrumental effects can be factored into feed-based contributions, i.e. any interferometer-based effects are assumed to be negligible (see section 3). The 4 × 4 interferometer response matrix Jij then consists of a ‘direct matrix product’2 Ji ⊗ Jj* of two 2 × 2 feed-based response matrices, called ‘Jones matrices’. The reader will note that this factoring is the polarimetric generalisation of the familiar ‘Selfcal assumption’, in which the (scalar) gains are assumed to be feed-based rather than interferometer-based.

The 2 × 2 Jones matrix Ji for feed i can be decomposed into a product of several 2 × 2 Jones matrices, each of which models a specific feed-based instrumental effect in the signal path:

$$ J_i = G_i\,[H_i]\,[Y_i]\,B_i\,K_i\,T_i\,F_i = G_i\,[H_i]\,[Y_i]\,(D_i E_i P_i)\,K_i\,T_i\,F_i \tag{3} $$

in which

 Fi(ρ,ri)   ionospheric Faraday rotation
 Ti(ρ,ri)   atmospheric complex gain
 Ki(ρ.ri)   factored Fourier Transform kernel
 Pi         projected receptor orientation(s) w.r.t. the sky
 Ei(ρ)      voltage primary beam
 Di         position-independent receptor cross-leakage
 [Yi]       commutation of IF-channels
 [Hi]       hybrid (conversion to circular polarisation coordinates)
 Gi         electronic complex gain (feed-based contributions only)

Matrices between brackets ([ ]) are not present in all systems. Bi is the ‘Total Voltage Pattern’ of an arbitrary feed, which is usually split up into three sub-matrices: DiEiPi. Jones matrices that model ‘image-plane’ effects depend on the source position (direction) ρ. Some also depend on the antenna position ri. Of course most of them depend on time and frequency as well. The various Jones matrices are treated in some detail in section 5.

Since the Jones matrices do not always commute with each other, their order is important. In principle, they should be placed in the ‘physical’ order, i.e. the order in which the signal is affected by them while traversing the instrument. In practice, this is not always possible or desirable. Section 6 discusses the implications of choosing a different order.

2.2 The Jones matrix of a Tied Array feed

The output signals from the two IF-channels of a ‘tied array’ are weighted sums of the IF-channel signals from n individual feeds. A tied array is itself a feed (see definition in appendix A), modelled by its own Jones matrix. For a single point source, we get:

$$ J_i^{\,tied\ array} = Q_i \sum_n w_{in}\,J_{in} \tag{4} $$

and for an interferometer between two tied arrays i and j with n and m constituent feeds respectively:

$$ J_{ij} = (J_i \otimes J_j^{*}) = (Q_i \otimes Q_j^{*}) \sum_n \sum_m w_{in}\,w_{jm}^{*}\,(J_{in} \otimes J_{jm}^{*}) \tag{5} $$

See also section 6.4. The matrix Qi models electronic gain effects on the added signal of the tied array feed i. The Qi can be solved by the usual Selfcal methods, in contrast to instrumental errors in the constituent feeds before adding. The latter will often cause decorrelation, and thus closure errors in an interferometer.

Since a tied array feed can be modelled by a Jones matrix, it can be combined with any other type of feed to form an interferometer. Examples are the use of WSRT and VLA as tied arrays in VLBI arrays. Note that this is made possible by factoring the Fourier Transform kernel Kij(uij.ρ) into Ki(ri.ρ) and Kj(rj.ρ), and including the latter in the Jones matrices of the individual feeds (see equ 28).

Obviously, the primary beam of a tied array can be rather complicated, but it is fully modelled by equ 4. Moreover, the contributing feeds in a tied array are allowed to be quite dissimilar. It is not even necessary for their receptors (dipoles) to be aligned with each other! Thus, equation 4 can also be used to model ‘difficult’ telescopes like Ooty or MOST, or an element of the future Square Km Array (SKAI). This puts the crown on the remarkable power of the Measurement Equation.
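The weighted sum of equation 4 is easy to sketch numerically. In the Python fragment below, `tied_array_jones` is a hypothetical helper and the two constituent feeds (one of them with its receptors rotated by 90 degrees) are made-up values, chosen only to show that dissimilar feeds can be combined into one Jones matrix.

```python
# Sketch of equation 4: J_tied = Q * sum_n w_n J_n (names are illustrative).

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def tied_array_jones(q, weights, feed_jones):
    """Combine the Jones matrices of the constituent feeds into one."""
    total = [[0j, 0j], [0j, 0j]]
    for w, j in zip(weights, feed_jones):
        for r in range(2):
            for c in range(2):
                total[r][c] += w * j[r][c]
    return matmul2(q, total)

feeds = [
    [[1.0, 0.0], [0.0, 1.0]],     # nominal feed
    [[0.0, -1.0], [1.0, 0.0]],    # receptors rotated by 90 degrees
]
q = [[2.0, 0.0], [0.0, 2.0]]      # gain applied to the added signal
j_tied = tied_array_jones(q, [0.5, 0.5], feeds)
```

The result is an ordinary 2 × 2 matrix, so it can be used wherever a single-feed Jones matrix is expected.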

2.3 Jones matrices for multiple beams

Using the definition in appendix A, each beam in a multiple beam system should be treated like a separate logical feed, modelled by its own Jones matrix. Any communality between them can be modelled in the form of shared parameters in the expressions for the various matrix elements.

3 THE FULL MEASUREMENT EQUATION

3.1 Summing and averaging

For k ‘real’ incoherent sources, observed with a ‘real’ telescope, equ 1 becomes:

$$ \vec V_{ij} = \frac{1}{\Delta t\,\Delta f}\int dt\,df\; \sum_k \frac{1}{\Delta l\,\Delta m}\int dl\,dm\; (J_i \otimes J_j^{*})\,S\,\vec I(l,m) \tag{6} $$

The visibility vector Vij is integrated over the extent of the sources (dldm), over the integration time (dt) and over the channel bandwidth (df). Integration over the aperture (dudv) is taken care of by the primary beam properties.

There are only four integration coordinates, whose units are determined by the flux density units in which I is expressed: energy/sec/Hz/beam. These coordinates define a 4-dimensional ‘integration cell’. If the variation of V(f,t,l,m) is linear over this cell, integration is not necessary:

$$ \vec V_{ij} = \sum_k \vec V_{0k}(f_0,t_0,l_0,m_0) \tag{7} $$

in which V0k is the value for source k at the centre of the cell, for Δf = 1 Hz and Δt = 1 sec. If the variation of V(f,t,l,m) over the cell can be approximated by a polynomial of order 3, then it is sufficient to calculate only the 2nd derivative(s) at the centre of the cell:

$$ \vec V_{int} = \sum_k \left[ \vec V_{0k} + \frac{1}{12}\left( \frac{\partial^2 \vec V_{0k}}{\partial f^2}(\Delta f)^2 + \frac{\partial^2 \vec V_{0k}}{\partial t^2}(\Delta t)^2 + \frac{\partial^2 \vec V_{0k}}{\partial l^2}(\Delta l)^2 + \frac{\partial^2 \vec V_{0k}}{\partial m^2}(\Delta m)^2 \right) \right] \tag{8} $$

Here it is assumed that the 2nd derivatives are constant over the cell, i.e. the cross-derivatives ∂²V0/∂p1∂p2 are zero.

3.2 Interferometer-based effects

Until now, we have assumed that all instrumental effects could be factored into feed-based contributions, i.e. we have ignored any interferometer-based effects. This is justified for a well-designed system, provided that the signal-to-noise ratio is large enough (thermal noise causes interferometer-based errors, albeit with an average of zero). However, if systematic errors do occur, they can be modelled:

$$ \vec V'_{ij} = X_{ij}\,(\vec A_{ij} + M_{ij}\,\vec V_{ij}) \tag{9} $$

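The 4 × 4 diagonal matrix X, the ‘Correlator matrix’, represents interferometer-based corrections that are applied to the uv-data in software by the on-line system. An example is the Van Vleck correction. In the newest correlators, it approaches a constant (x) times the unit matrix 𝒰.

$$ X_{ij} = \begin{pmatrix} x_{ipjp} & 0 & 0 & 0 \\ 0 & x_{ipjq} & 0 & 0 \\ 0 & 0 & x_{iqjp} & 0 \\ 0 & 0 & 0 & x_{iqjq} \end{pmatrix} \approx x\,\mathcal{U} \tag{10} $$

The 4 × 4 diagonal matrix M represents multiplicative interferometer-based effects.

$$ M_{ij} = \begin{pmatrix} m_{ipjp} & 0 & 0 & 0 \\ 0 & m_{ipjq} & 0 & 0 \\ 0 & 0 & m_{iqjp} & 0 \\ 0 & 0 & 0 & m_{iqjq} \end{pmatrix} \approx \mathcal{U} \tag{11} $$

The 4-element vector Aij represents additive interferometer-based effects. Examples are receiver noise, and correlator offsets.

$$ \vec A_{ij} = \begin{pmatrix} a_{ipjp} \\ a_{ipjq} \\ a_{iqjp} \\ a_{iqjq} \end{pmatrix} \approx \vec 0 \tag{12} $$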

In some cases, interferometer-based effects can be calibrated, e.g. when they appear to be constant in time. It will be interesting to see how many of them will disappear as a result of better modelling with the Measurement Equation. In any case, it is desirable that the cause of interferometer-based effects is properly understood (simulation!).

4 POLARISATION COORDINATES

In the 2 × 2 signal domain, the electric field vector E of the incident plane wave can be represented either in a linear polarisation coordinate frame (x,y) or a circular polarisation coordinate frame (r,l). Jones matrices are linear operators in the chosen frame:

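$$ \vec V_i^{+} = \begin{pmatrix} v_{ip} \\ v_{iq} \end{pmatrix} = J_i^{+} \begin{pmatrix} e_x \\ e_y \end{pmatrix} \qquad \text{or} \qquad \vec V_i^{\odot} = J_i^{\odot} \begin{pmatrix} e_r \\ e_l \end{pmatrix} \tag{13} $$

For linear polarisation coordinates, equation 1 becomes:

$$ \vec V_{ij}^{+} = (J_i^{+} \otimes J_j^{+*})(\vec E \otimes \vec E^{*}) = (J_i^{+} \otimes J_j^{+*}) \begin{pmatrix} e_x e_x^{*} \\ e_x e_y^{*} \\ e_y e_x^{*} \\ e_y e_y^{*} \end{pmatrix} = (J_i^{+} \otimes J_j^{+*})\,S^{+}\,\vec I(l,m) \tag{14} $$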

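and there is a similar expression for circular polarisation coordinates. Thus, as emphasised in [2], the Stokes vector I(l,m) and the coherency vector Vij represent the same physical quantity, but in different abstract coordinate frames. A ‘Stokes matrix’ S is a coordinate transformation matrix in the 4 × 4 coherency domain: S+ transforms the representation from Stokes coordinates (I,Q,U,V) to linear polarisation coordinates (xx,xy,yx,yy). Similarly, S⊙ transforms to circular polarisation coordinates (rr,rl,lr,ll). Following the convention of [4], we write:3

$$ S^{+} = \frac{1}{2}\begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & i \\ 0 & 0 & 1 & -i \\ 1 & -1 & 0 & 0 \end{pmatrix} \qquad S^{\odot} = \frac{1}{2}\begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & i & 0 \\ 0 & 1 & -i & 0 \\ 1 & 0 & 0 & -1 \end{pmatrix} \tag{15} $$

S-matrices are almost unitary, i.e. except for a normalising constant: $S^{-1} = 2(S^{*})^{T}$. S cannot be factored into feed-based parts. The two Stokes matrices are related by:

$$ S^{\odot} = (\mathcal{H} \otimes \mathcal{H}^{*})\,S^{+} \qquad S^{+} = (\mathcal{H}^{-1} \otimes (\mathcal{H}^{-1})^{*})\,S^{\odot} \tag{16} $$

with4

$$ \mathcal{H} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix} \qquad \mathcal{H}^{-1} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix} \tag{18} $$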

Most Jones matrices will have the same form in both polarisation coordinate frames. But if a Jones matrix is expressed in terms of parameters that are defined in one of the two frames, it will have two different but related forms. This is the case for Faraday rotation Fi, receptor orientation Pi, and receptor cross-leakage Di, in which the orientation w.r.t. the x,y frame plays a role. The two forms of a Jones matrix A can be converted into each other by the coordinate transformation matrix ℋ and its inverse:

$$ A^{\odot} = \mathcal{H}\,A^{+}\,\mathcal{H}^{-1} \qquad A^{+} = \mathcal{H}^{-1}\,A^{\odot}\,\mathcal{H} \tag{19} $$

The conversion may be done by hand, using (the elements a,b,c,d may be complex):

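$$ \mathcal{H}\begin{pmatrix} a & c \\ d & b \end{pmatrix}\mathcal{H}^{-1} = 0.5 \begin{pmatrix} (a+b) - i(c-d) & (a-b) + i(c+d) \\ (a-b) - i(c+d) & (a+b) + i(c-d) \end{pmatrix} \tag{20} $$

$$ \mathcal{H}^{-1}\begin{pmatrix} a & c \\ d & b \end{pmatrix}\mathcal{H} = 0.5 \begin{pmatrix} (a+b+c+d) & i(a-b-c+d) \\ -i(a-b+c-d) & (a+b-c-d) \end{pmatrix} \tag{21} $$

Applying these general expressions to rotation Rot(α) and ellipticity Ell(α,−α) matrices (see Appendix for their definition), the conversions are:

$$ \begin{aligned} \mathcal{H}\,Rot(\alpha)\,\mathcal{H}^{-1} &= Diag(\exp(i\alpha),\exp(-i\alpha)) \\ \mathcal{H}\,Rot(\alpha,\beta)\,\mathcal{H}^{-1} &= \text{see equation 33} \\ \mathcal{H}\,Ell(\alpha,-\alpha)\,\mathcal{H}^{-1} &= Rot(\alpha) \end{aligned} \tag{22} $$

$$ \begin{aligned} \mathcal{H}^{-1}\,Rot(\alpha)\,\mathcal{H} &= Ell(\alpha,-\alpha) \\ \mathcal{H}^{-1}\,Ell(\alpha,-\alpha)\,\mathcal{H} &= Diag(\exp(i\alpha),\exp(-i\alpha)) \end{aligned} \tag{23} $$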

Usually, all matrices in a ‘Jones chain’ will be defined in the same coordinate frame. An exception is the case where linear dipole receptors are used in conjunction with a ‘hybrid’ Hi to create pseudo-circular receptors:

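$$ \begin{aligned} J_i &= G_i^{\odot}\,H_i\,D_i^{+} E_i^{+} P_i^{+} K_i^{+} T_i^{+} F_i^{+} && (\text{using } S = S^{+}) \\ &= G_i^{\odot}\,(H_i D_i^{+} \mathcal{H}^{-1})\,\mathcal{H}\,E_i^{+} P_i^{+} K_i^{+} T_i^{+} F_i^{+} && (\text{using } S = S^{+}) \\ &= G_i^{\odot} D_i^{\odot}\,\mathcal{H}\,E_i^{+} P_i^{+} K_i^{+} T_i^{+} F_i^{+} && (\text{using } S = S^{+}) \\ &= G_i^{\odot} D_i^{\odot} E_i^{\odot} P_i^{\odot} K_i^{\odot} T_i^{\odot} F_i^{\odot}\,\mathcal{H} && (\text{using } S = S^{+}) \\ &= G_i^{\odot} D_i^{\odot} E_i^{\odot} P_i^{\odot} K_i^{\odot} T_i^{\odot} F_i^{\odot} && (\text{using } S = S^{\odot}) \end{aligned} \tag{24} $$

in which Hi represents an electronic implementation of the coordinate transformation matrix ℋ. All these expressions are equivalent in the sense that, in conjunction with the indicated Stokes matrix, they produce a coherency vector in circular polarisation coordinates. The choice of which expression to use depends on whether one wishes to model the feed explicitly in terms of its physical (dipole) properties, or whether one wishes to regard it as a ‘black box’ circular feed with unknown internal structure.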
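The conversion between the two polarisation coordinate frames (equations 19 and 20) can be checked numerically with a few lines of Python. This is only a sketch: the helper names are hypothetical, and the sign convention assumed for Rot(α), with −sin α in the upper-right element, is inferred from the projection matrices of section 5 rather than taken from the Appendix.

```python
import cmath
import math

R2 = 1 / math.sqrt(2)
H = [[R2, 1j * R2], [R2, -1j * R2]]          # linear -> circular (equation 18)
H_INV = [[R2, R2], [-1j * R2, 1j * R2]]      # its inverse

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def to_circular(a_lin):
    """A(circular) = H A(linear) H^-1 (equation 19)."""
    return matmul2(matmul2(H, a_lin), H_INV)

def to_linear(a_circ):
    """A(linear) = H^-1 A(circular) H (equation 19)."""
    return matmul2(matmul2(H_INV, a_circ), H)

def rot(alpha):
    """Rotation matrix; the sign convention is an assumption of this sketch."""
    return [[math.cos(alpha), -math.sin(alpha)],
            [math.sin(alpha), math.cos(alpha)]]

def closed_form(mat):
    """Right-hand side of equation 20 for mat = [[a, c], [d, b]]."""
    (a, c), (d, b) = mat
    return [[0.5 * ((a + b) - 1j * (c - d)), 0.5 * ((a - b) + 1j * (c + d))],
            [0.5 * ((a - b) - 1j * (c + d)), 0.5 * ((a + b) + 1j * (c - d))]]

chi = 0.3                                    # arbitrary rotation angle (rad)
f_circ = to_circular(rot(chi))               # expect Diag(exp(i chi), exp(-i chi))
f_back = to_linear(f_circ)                   # round trip back to Rot(chi)
general = [[1 + 2j, 0.1], [-0.2j, 0.9]]      # arbitrary complex test matrix
explicit = to_circular(general)
formula = closed_form(general)
```

With this convention a pure rotation in linear coordinates becomes a diagonal phase matrix in circular coordinates, as stated in equation 22.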

5 GENERIC FORM OF JONES MATRICES

In this section, the ‘generic’ form of various 2 × 2 feed-based instrumental Jones matrices (operators) will be treated in some detail.

It will be noted that for each matrix, the 4 elements have been given an ‘official’ name (e.g. fixx). The (possibly naive) idea is that, if the structure of the Measurement Equation is more or less complete, these ‘standard’ matrix elements could be referred to explicitly by their official names in other AIPS++ documents (and code), for instance to replace them with specific expressions for particular telescopes or purposes.

The subscript convention is as follows: yibp is an element of matrix Y for feed i, which models the ‘coupling factor’ for the signal going from receptor b to IF-channel p. Where possible, the expressions have been reduced to matrices like the diagonal matrix (Diag), rotation matrix (Rot) etc. These are defined in the Appendix.

5.1 Ionospheric Faraday rotation (Fi(ρ,ri))

The matrix $F_i^{+}$ represents (ionospheric) Faraday rotation of the electric vector over an angle χi w.r.t. the celestial x,y-frame. Since χi is defined in one of the polarisation coordinate frames, there will be two different forms for Fi (see also section 4). For linear polarisation coordinates:

$$ F_i^{+}(\vec\rho,\vec r_i) = \begin{pmatrix} f_{ixx} & f_{iyx} \\ f_{ixy} & f_{iyy} \end{pmatrix} = Rot(\chi_i) \tag{25} $$

In circular polarisation coordinates, the matrix $F_i^{\odot}$ is a diagonal matrix which introduces a phase difference, or rather a delay difference. It expresses the fact that ionospheric Faraday rotation is caused by a (strongly frequency-dependent) difference in propagation velocity between right-hand and left-hand circularly polarised signals when travelling through a charged medium like the ionosphere. In terms of the Faraday rotation angle χi (see above), we get:

$$ F_i^{\odot}(\vec\rho,\vec r_i) = \begin{pmatrix} f_{irr} & f_{ilr} \\ f_{irl} & f_{ill} \end{pmatrix} = \mathcal{H}\,F_i^{+}\,\mathcal{H}^{-1} = Diag(\exp(i\chi_i),\exp(-i\chi_i)) \tag{26} $$

In principle, the Faraday rotation angle is a function of source direction and feed position: χi = χi(ρ,ri). However, Faraday rotation is a large-scale effect, so it will usually have the same value for all sources in the primary beam: χi = χ(ri). For arrays smaller than a few km, the rotation angle will usually also be the same for all feeds: χi = χ(t). These assumptions reduce the number of independent parameters considerably.

5.2 Atmospheric gain (Ti(ρ,ri))

The matrix $T_i^{+}$ represents complex atmospheric gain: refraction, extinction and perhaps non-isoplanaticity. Since $T_i^{+}$ does not depend on a polarisation coordinate frame, there is only one form:

$$ T_i^{+} = T_i^{\odot} = T_i(\vec\rho,\vec r_i) = \begin{pmatrix} t_i & 0 \\ 0 & t_i \end{pmatrix} = Mult(t_i) \tag{27} $$

The matrix is diagonal because the atmosphere is not supposed to cause cross-talk. The diagonal elements are assumed to be equal, because the atmosphere is not supposed to affect polarisation.

Atmospheric effects in the ‘pupil-plane’ (i.e. originating directly above the feeds) can be modelled with a complex gain. It is less clear how to deal with effects that originate higher up in the atmosphere, i.e. between pupil plane and image plane.

A phase screen over the array can be modelled as $t_i = \exp(i\psi_i)$, in which the phase is assumed to be a low-order 2D polynomial as a function of the feed position r: $\psi_i = a_0(t) + a_1(t)\,r_i + a_2(t)\,r_i^2 + \cdots$

5.3 Fourier Transform kernel (Ki(ri.ρ))

The matrix Ki represents the Fourier Transform kernel (which can also be seen as a phase weight factor). It is factored into feed-based parts in order to be able to model a tied array (see section 2.2). Since Ki does not depend on the polarisation coordinate frame, there is only one form:

$$ K_i^{+} = K_i^{\odot} = K_i(\vec r_i\cdot\vec\rho) = \begin{pmatrix} k_{iaa} & 0 \\ 0 & k_{ibb} \end{pmatrix} = \begin{pmatrix} k_i & 0 \\ 0 & k_i \end{pmatrix} = Mult(k_i) \tag{28} $$

in which $k_i = \frac{1}{\sqrt{n}}\exp(i 2\pi\,\vec r_i\cdot\vec\rho/\lambda)$, which depends on the projected feed position ri and the source direction ρ = ρ(l,m) w.r.t. the fringe tracking centre ρftc, and $n = \sqrt{1 - l^2 - m^2} \approx 1 - 0.5(l^2 + m^2)$.

If kiaa = kibb, the interferometer matrix Kij = (Ki ⊗ Kj*) is a 4 × 4 diagonal matrix with equal elements. This is equivalent to a multiplicative factor of the familiar form $k_{ij} = k_i k_j^{*} = \frac{1}{n}\exp(i 2\pi(\vec r_i - \vec r_j)\cdot\vec\rho/\lambda) = \frac{1}{n}\exp(i\,\vec u_{ij}\cdot\vec\rho)$, i.e. the Fourier Transform kernel or ‘phase weight’ for the baseline uij. For small fields, n ≈ 1, so uij.ρ = (ul + vm + w(n − 1)) ≈ (ul + vm) becomes a 2D FT.

The receptors of a feed are practically always co-located, i.e. they have the same phase-centre: ria = rib = ri, so kiaa = kibb = ki. But note that it is possible to model receptors that are not co-located, i.e. ria ≠ rib. It is not immediately obvious why one would want to do such a thing, but it is good to know that the formalism allows it.
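The factoring of the Fourier Transform kernel into feed-based parts can be verified with a short Python sketch. The function names are hypothetical, the positive sign of the exponent is an assumption, and the feed position is taken as a triple (x, y, z) so that r.ρ = xl + ym + z(n − 1), by analogy with the expression for uij.ρ in the text.

```python
import cmath
import math

def k_feed(r, l, m, wavelength):
    """Feed-based kernel k_i = n^(-1/2) exp(i 2 pi r.rho / lambda)."""
    n = math.sqrt(1.0 - l * l - m * m)
    path = r[0] * l + r[1] * m + r[2] * (n - 1.0)
    return cmath.exp(2j * math.pi * path / wavelength) / math.sqrt(n)

def k_baseline(r_i, r_j, l, m, wavelength):
    """Interferometer kernel k_ij = (1/n) exp(i 2 pi (r_i - r_j).rho / lambda)."""
    n = math.sqrt(1.0 - l * l - m * m)
    b = [r_i[k] - r_j[k] for k in range(3)]
    path = b[0] * l + b[1] * m + b[2] * (n - 1.0)
    return cmath.exp(2j * math.pi * path / wavelength) / n

# Made-up feed positions (metres), source offsets (radians) and wavelength.
r_i = (100.0, 20.0, 5.0)
r_j = (-30.0, 60.0, 1.0)
l, m, lam = 0.01, -0.02, 0.21

product = k_feed(r_i, l, m, lam) * k_feed(r_j, l, m, lam).conjugate()
direct = k_baseline(r_i, r_j, l, m, lam)
```

The product of the two feed-based factors reproduces the baseline kernel, which is what allows Ki to be placed inside the Jones matrix of an individual feed.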

5.4 Projection matrix (Pi) if γxa = γyb

The ‘Projection matrix’ models the projected orientation of the receptors w.r.t. the electrical x,y frame on the sky, as seen from the direction of the source (see also section 5.6 below). Since the orientations are defined in one of the polarisation coordinate frames, there will be two different forms for Pi (see section 4). For linear polarisation coordinates:

$$ P_i^{+} = \begin{pmatrix} p_{ixa} & p_{iya} \\ p_{ixb} & p_{iyb} \end{pmatrix} = \begin{pmatrix} \cos\gamma_{xa} & -\sin\gamma_{xa} \\ \sin\gamma_{xa} & \cos\gamma_{xa} \end{pmatrix} = Rot(\gamma_{xa}) \tag{29} $$

in which γxa is the projected angle between the positive x-axis and the orientation of receptor a (see also Appendix A). There is an implicit assumption here that the feed has perpendicular receptors and is fully steerable, which is the case for the majority of existing telescopes. See the next section for the case where the projected orientations are not perpendicular (γxa ≠ γyb).

For circular polarisation coordinates:

$$ P_i^{\odot} = \begin{pmatrix} p_{ira} & p_{ila} \\ p_{irb} & p_{ilb} \end{pmatrix} = \mathcal{H}\,P_i^{+}\,\mathcal{H}^{-1} = Diag(\exp(i\gamma_{xa}),\exp(-i\gamma_{xa})) \tag{30} $$

It is sometimes useful to introduce an intermediate coordinate frame, attached to the feed i. In that case: γxa = γxi + γia = β + γia. The ‘offset’ angle γia between receptor a and the frame of feed i will be zero in most cases. The angle β is the parallactic angle, i.e. the angle between two great circles through the source, and through the celestial North Pole and the local zenith respectively. This parallactic angle is zero for an equatorial feed, and varies smoothly with HA(t) for an alt-az feed:

$$ \sin\beta = \cos LAT\,\sin HA \qquad \cos\beta = \cos DEC\,\sin LAT - \sin DEC\,\cos LAT\,\cos HA \tag{31} $$
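A practical way to evaluate the parallactic angle from equation 31 is sketched below in Python. Taken literally, the two right-hand sides are not normalised to unit length (in the standard spherical-triangle relations both sides carry a common factor, the sine of the zenith distance), so the sketch recovers β from their ratio with `atan2`. The function name and the sample angles are illustrative assumptions.

```python
import math

def parallactic_angle(ha, dec, lat):
    """Parallactic angle beta (radians) from hour angle, declination, latitude.

    Uses the two expressions of equation 31 as the numerator and
    denominator of atan2, so their common scale factor drops out.
    """
    s = math.cos(lat) * math.sin(ha)
    c = (math.cos(dec) * math.sin(lat)
         - math.sin(dec) * math.cos(lat) * math.cos(ha))
    return math.atan2(s, c)

# Illustrative values (radians): source at declination 0.5, site at latitude 0.9.
beta_transit = parallactic_angle(0.0, 0.5, 0.9)   # on the meridian
beta_west = parallactic_angle(0.4, 0.5, 0.9)      # after transit
beta_east = parallactic_angle(-0.4, 0.5, 0.9)     # before transit
```

For an alt-az feed the angle passes through zero at transit (for a source that culminates south of the zenith) and is antisymmetric in hour angle.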

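5.5 Projection matrix (Pi) if γxa ≠ γyb

The M.E. formalism must also be able to deal with more ‘exotic’ antennas like parabolic cylinders (Arecibo, MOST) or horizontal dipole arrays (SKAI). In those cases, the projected angles of the two receptors will generally not be equal, i.e. γxa ≠ γyb.

NB: The angle γyb of receptor b is defined w.r.t. the y-axis rather than the x-axis. This ensures that γyb = γxa, so that matrix $P_i^{+}$ reduces to a simple rotation Rot(γxa), in the common case described in section 5.4 above.

For linear polarisation coordinates $P_i^{+}$ becomes a ‘pseudo-rotation’ (compare with equ 29 above):

$$ P_i^{+} = \begin{pmatrix} p_{ixa} & p_{iya} \\ p_{ixb} & p_{iyb} \end{pmatrix} = \begin{pmatrix} \cos\gamma_{xa} & -\sin\gamma_{xa} \\ \sin\gamma_{yb} & \cos\gamma_{yb} \end{pmatrix} = Rot(\gamma_{xa},\gamma_{yb}) \tag{32} $$

For circular polarisation coordinates:

$$ \begin{aligned} P_i^{\odot} &= \begin{pmatrix} p_{ira} & p_{ila} \\ p_{irb} & p_{ilb} \end{pmatrix} = \mathcal{H}\,P_i^{+}\,\mathcal{H}^{-1} \\ &= 0.5 \begin{pmatrix} \cos\gamma_{xa} + \cos\gamma_{yb} + i(\sin\gamma_{xa} + \sin\gamma_{yb}) & \cos\gamma_{xa} - \cos\gamma_{yb} - i(\sin\gamma_{xa} - \sin\gamma_{yb}) \\ \cos\gamma_{xa} - \cos\gamma_{yb} + i(\sin\gamma_{xa} - \sin\gamma_{yb}) & \cos\gamma_{xa} + \cos\gamma_{yb} - i(\sin\gamma_{xa} + \sin\gamma_{yb}) \end{pmatrix} \end{aligned} \tag{33} $$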

The future large radio telescopes may have feeds in the form of dipole arrays, possibly tilted over an angle α towards the South w.r.t. the local horizontal plane. In that case, the projected angle γxa between a North-South (NS) dipole and the x-axis differs from the projected angle γyb between an East-West (EW) dipole and the y-axis (I hope this is correct now):

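$$ \begin{aligned} \cos\gamma_{xa} &= \cos HA\,\sin DEC\,\cos(LAT - \alpha) - \cos DEC\,\sin(LAT - \alpha) \\ \sin\gamma_{xa} &= \sin HA\,\cos(LAT - \alpha) \\ \cos\gamma_{yb} &= \cos HA \\ \sin\gamma_{yb} &= \sin HA\,\sin DEC \end{aligned} \tag{34} $$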

5.6 Voltage primary beam (Ei(ρ))

The effects of the primary beam are ignored by [2], which deals implicitly with on-axis sources observed by feeds with fully steerable parabolic mirrors. The AIPS++ M.E. must of course deal with the general case, including ‘exotic’ telescopes like Arecibo, MOST and SKAI. To this end, we define a total voltage pattern matrix Bi, which fully describes the conversion of the incident electric field (V/m) into two voltages (V):

$$ B_i^{+}(\vec\rho) = \begin{pmatrix} b_{ixa} & b_{iya} \\ b_{ixb} & b_{iyb} \end{pmatrix} \qquad B_i^{\odot}(\vec\rho) = \begin{pmatrix} b_{ira} & b_{ila} \\ b_{irb} & b_{ilb} \end{pmatrix} \tag{35} $$

NB: Since the Jones matrix Ji is feed-based, it deals with voltage beams. The power beam for interferometer ij is modelled by Bi ⊗ Bj*. Note that the formalism deals implicitly with interferometers between feeds with quite dissimilar primary beams.

In practice, it is often convenient to split the matrix Bi into a chain of sub-matrices (see equation 3):

$$ B_i = D_i\,E_i(\vec\rho)\,P_i $$

This is most useful in the common case of a fully steerable parabolic antenna. The voltage patterns of its feed(s) have a fixed shape, which is rotated and translated w.r.t. the sky when pointing the antenna in different directions. What remains after splitting off Pi and Di is an (approximately) real and diagonal matrix Ei which describes the position-dependent primary beam attenuation and the position-dependent leakage (see also equation 38 below):

$$ E_i^{+}(\vec\rho) = E_i^{\odot}(\vec\rho) = E_i(\vec\rho) = \begin{pmatrix} e_{iaa} & e_{iba} \\ e_{iab} & e_{ibb} \end{pmatrix} \approx Diag(e_{iaa},e_{ibb}) \tag{36} $$

As an example, the diagonal elements of $E_i^{+}$ for an idealised axially symmetric gaussian beam and dipole receptors would look like:

$$ \begin{aligned} e_{iaa} &= \exp\left\{-\left[\left(\frac{l'_{ia}}{\sigma_a(1+\epsilon_a)}\right)^2 + \left(\frac{m'_{ia}}{\sigma_a(1-\epsilon_a)}\right)^2\right]\right\} \\ e_{ibb} &= \exp\left\{-\left[\left(\frac{l'_{ib}}{\sigma_b(1+\epsilon_b)}\right)^2 + \left(\frac{m'_{ib}}{\sigma_b(1-\epsilon_b)}\right)^2\right]\right\} \end{aligned} \tag{37} $$

Note that the two receptor beams are each described in their own coordinate frame (l'ia, m'ia) and (l'ib, m'ib) projected on the sky (see Appendix A). The projection matrix Pi only takes care of electrical rotation, but not of the rotation of the voltage beam on the sky!

Equation 37 illustrates that the voltage beam of a dipole receptor will be slightly elongated in the direction of the dipole by a factor (1 + 𝜖), even if the mirror is perfectly circular and symmetrical. Obviously, the two asymmetric voltage beams of a feed will not coincide, because they are oriented differently. The resulting position-dependent difference is one cause of off-axis instrumental polarisation.

In reality, things will be more complicated, especially for off-axis sources. For instance, standing waves between the primary mirror and the frontend box, or scattering off support legs, may cause position-dependent leakage terms. Since these cannot be part of Di, they must be modelled as off-diagonal elements of Ei itself.

In general, Ei will be more complicated for antennas with less symmetry. In some exotic cases, it may not be very useful to split off Di or even Pi, although it is always allowed. In any case, the M.E. formalism offers a framework for the full description of the primary beam of any radio telescope that can be conceived.

5.7 Position-independent receptor cross-leakage (Di)

The off-diagonal elements eiba and eiab of Ei describe ‘leakage’ between receptors, i.e. the extent to which each receptor is sensitive to the radiation that is supposed to be picked up by the other one.

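It is customary to split off the position-independent part (eiba and eiab) of this leakage into a separate matrix Di, leaving the position-dependent part (e'iba and e'iab) in Ei:

$$ E'_i(\vec\rho) = \begin{pmatrix} e_{iaa} & e_{iba} + e'_{iba} \\ e_{iab} + e'_{iab} & e_{ibb} \end{pmatrix} \approx \begin{pmatrix} 1 & e_{iba}/e_{ibb} \\ e_{iab}/e_{iaa} & 1 \end{pmatrix} \begin{pmatrix} e_{iaa} & e'_{iba} \\ e'_{iab} & e_{ibb} \end{pmatrix} = \begin{pmatrix} d_{iaa} & d_{iba} \\ d_{iab} & d_{ibb} \end{pmatrix} \begin{pmatrix} e_{iaa} & e'_{iba} \\ e'_{iab} & e_{ibb} \end{pmatrix} = D_i\,E_i(\vec\rho) \tag{38} $$

Usually, the position-dependent leakage coefficients e'iba and e'iab are assumed to be zero, but that is not always justified.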
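The factoring in equation 38 is exact only to first order in the leakage terms, which a small numerical experiment makes visible. The Python sketch below uses a hypothetical helper `split_leakage` and made-up leakage values of about one percent; the names with the suffix `_dep` stand for the position-dependent leakage terms.

```python
# Sketch of equation 38: total beam matrix ~ D * E to first order in leakage.

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def split_leakage(e_aa, e_bb, e_ba, e_ab, e_ba_dep, e_ab_dep):
    """Return (D, E): position-independent leakage D and remaining beam E."""
    d = [[1.0, e_ba / e_bb], [e_ab / e_aa, 1.0]]
    e = [[e_aa, e_ba_dep], [e_ab_dep, e_bb]]
    return d, e

# Made-up values: beam gains near 1, leakage terms of order 1e-2 and 1e-3.
e_aa, e_bb = 0.9, 0.8
e_ba, e_ab = 0.01 + 0.02j, -0.015j      # position-independent leakage
e_ba_dep, e_ab_dep = 0.003, 0.002j      # position-dependent leakage

d, e = split_leakage(e_aa, e_bb, e_ba, e_ab, e_ba_dep, e_ab_dep)
total = [[e_aa, e_ba + e_ba_dep], [e_ab + e_ab_dep, e_bb]]
product = matmul2(d, e)
worst = max(abs(product[r][c] - total[r][c])
            for r in range(2) for c in range(2))
```

The residual `worst` is a product of two leakage terms, i.e. second order, so it is negligible as long as the leakage itself is small.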

If the leakage coefficients are determined empirically by calibration, it is not necessary to know the details of the leakage mechanism. It is sufficient to solve for the elements of Di. In that case, there is only one form:

$$ D_i^{+} = D_i^{\odot} = D_i = \begin{pmatrix} d_{iaa} & d_{iba} \\ d_{iab} & d_{ibb} \end{pmatrix} \tag{39} $$

But in many cases, position-independent leakage can be physically explained by deviations ϕ from the nominal receptor position angles (see Pi), and by deviations 𝜃 from nominal receptor ‘ellipticities’ 𝜃. For linear polarisation coordinates:

$$ D_i^{+} = \begin{pmatrix} d_{iaa} & d_{iba} \\ d_{iab} & d_{ibb} \end{pmatrix} = Ell(\theta_{ia},\theta_{ib})\,Rot(\phi_{ia},\phi_{ib}) \approx Ell(\theta_{ia},-\theta_{ia})\,Rot(\phi_{ia}) \tag{40} $$

The ≈ sign gives the approximation for a well-designed system. Often the two receptors are mounted in a single unit, so position angle deviations caused by mechanical bending of the feed structure are the same for both: ϕia = ϕib. One might also argue that ellipticity should be a reciprocal effect, so that 𝜃ib = −𝜃ia. This is roughly consistent with WSRT experience, and these two assumptions are implicit in equ 27 of [3]. However, for high accuracy polarisation measurements, the parameters for each receptor should be at least partly independent.

For circular polarisation coordinates (see equ 22):

$$ D_i^{\odot} = \mathcal{H}\,D_i^{+}\,\mathcal{H}^{-1} = (\mathcal{H}\,Ell(\theta_{ia},\theta_{ib})\,\mathcal{H}^{-1})\,(\mathcal{H}\,Rot(\phi_{ia},\phi_{ib})\,\mathcal{H}^{-1}) \approx Rot(\theta_{ia})\,Diag(\exp(i\phi_{ia}),\exp(-i\phi_{ia})) \tag{41} $$

Again, the ≈ sign gives the approximation for ϕia = ϕib and 𝜃ib = −𝜃ia. See equation 33 for an expression for (ℋ Rot(ϕia,ϕib) ℋ⁻¹) where ϕia ≠ ϕib. The expression for (ℋ Ell(𝜃ia,𝜃ib) ℋ⁻¹) with 𝜃ib ≠ −𝜃ia is similar, but with real coefficients, as expected for circular polarisation coordinates.

5.8 Commutation (Yi)

In some systems, the receptor signals can be switched (commuted) between IF-channels for calibration.

$$ Y_i = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad \text{or} \qquad Y_i = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{42} $$

5.9 Hybrid (Hi)

In some cases, circularly polarised receptors consist of linearly polarised dipoles, followed by a ‘hybrid’. The latter is an electronic implementation of the coordinate transformation matrix from linear to circular polarisation coordinates:

$$ H_i = \mathcal{H} \tag{43} $$

See equation 18 for the definition of ℋ. If no hybrid is present, Hi is the unit matrix. Any gain effects in these electronic components are ignored, or rather they are assumed to be ‘absorbed’ by the gain matrix Gi.

5.10 Electronic gain (Gi)

The matrix Gi represents the product of all complex electronic gain effects per output IF-channel p and q. It models the effects of all feed-based electronics (amplifiers, mixers, LO, cables etc). (The correlator causes interferometer-based effects, which are discussed in section 3).

$$ G_i^{+} = G_i^{\odot} = \begin{pmatrix} g_{ipp} & g_{iqp} \\ g_{ipq} & g_{iqq} \end{pmatrix} \approx \begin{pmatrix} g_{ip} & 0 \\ 0 & g_{iq} \end{pmatrix} = Diag(g_{ip},g_{iq}) \tag{44} $$

The ≈ sign indicates that electronic cross-talk is assumed to be absent in well-designed systems, i.e. gipq = giqp = 0. Since this kind of crosstalk is not necessarily reciprocal, gipq ≠ giqp in general.

In reality, Gi will be a product of many electronic gain matrices, one for each linear electronic component in the system: $G_i = G_i^{LNA}\,G_i^{mixers}\,G_i^{cables}\,G_i^{IFsystem}$. Although a solver will not be able to distinguish these different effects from each other, the decomposition is useful for simulation of instrumental effects.

5.11 Do we need a configuration matrix (Ci)?

NB: This section is a little polemical, and should disappear when things are more settled.

There has been some debate about the concept of a ‘configuration matrix’ Ci, as proposed by [2], which models the nominal feed configuration. It represents an idealised coordinate transformation ‘from the frame of the rotating antenna mount to the electronic voltage frame’. It models any rotation of the receptors w.r.t. ‘the antenna mount’, which must be added to the ‘parallactic’ rotation Pi of the antenna w.r.t. the sky. Ci also models the hybrid Hi if present, but it ignores the primary beam Ei. Any deviations from this idealised behaviour are covered by the ‘leakage’ matrix Di.

However, the proposed Ci is most suitable for the special case of fully steerable parabolic antennas. The introduction of an intermediate antenna coordinate frame seems an unnecessary complication in those cases where the mirror is not steerable, or is absent entirely (like in a dipole array). Moreover, Ci violates the rules of modelling by lumping together two effects that have nothing to do with each other, and do not even occur at the same point in the signal path.

In principle it is a good idea to have one matrix that models the transition from electric fields (V/m) to electric voltages (V), and this is precisely what Bi does. This very general matrix can be split up if relevant into sub-matrices like Pi, Ei and Di. The matrix Hi has no part in this, since it represents a rearranging of electronic signals (V), just like Yi (and will come after Yi if present!). The projection matrix Pi takes care of the entire orientation angle of the receptors w.r.t. the sky, which is the only thing that really counts.

6 THE ORDER OF JONES MATRICES

The Jones matrices in equation 3 generally do not commute, so their order is important. In principle, the matrices must be placed in the ‘physical’ order, i.e. the order of the signal propagation path. But in the equations that are enshrined in existing reduction packages, this is often not the case. This begs the question why these ‘wrong’ equations seem to produce so many good (even spectacular) results. The question is especially important since a different order often results in considerable gains in computational efficiency.

The answer is that, for existing (arrays of) circularly symmetric parabolic feeds, many Jones matrices can be approximated by matrices that do commute with at least some of the others.

6.1 Overview of commutation properties

We will analyse this in terms of those special matrices (see Appendix for their definition), whose commutation properties are:

In order to study the general implications of changing the order of multiplication, we take the two products m.M and M.m of two general matrices (whose elements may be complex):

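$$
\begin{pmatrix} a & c \\ d & b \end{pmatrix}
\begin{pmatrix} A & C \\ D & B \end{pmatrix}
=
\begin{pmatrix} aA + cD & aC + cB \\ dA + bD & dC + bB \end{pmatrix}
\qquad
\begin{pmatrix} A & C \\ D & B \end{pmatrix}
\begin{pmatrix} a & c \\ d & b \end{pmatrix}
=
\begin{pmatrix} aA + dC & cA + bC \\ aD + dB & cD + bB \end{pmatrix}
\tag{45}
$$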

The difference (i.e. commutation error) between the two matrix products can be expressed as a matrix Δ:

$$
mM = Mm + \Delta = Mm +
\begin{pmatrix}
cD - dC & -c(A - B) + C(a - b) \\
d(A - B) - D(a - b) & -(cD - dC)
\end{pmatrix}
\tag{46}
$$

Thus, taking the wrong matrix order introduces fractional errors of the following order in the result:  
- in the diagonal elements: of the order of c/a, i.e. the ratio of non-diagonal and diagonal elements of the original matrices (which is often small).  
- in the off-diagonal elements: of the order of (a − b)/a, i.e. they become smaller as the diagonal elements of the original matrices become more nearly equal.

If one of the two matrices is diagonal, e.g. c = d = 0 then this reduces to:

$$
mM = Mm +
\begin{pmatrix}
0 & C(a - b) \\
D(b - a) & 0
\end{pmatrix}
\tag{47}
$$

The (not very surprising) conclusion is that the error caused by taking the wrong matrix order is smaller when one of the matrices is diagonal, and the values of its diagonal elements are almost equal.
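The commutation error can be checked numerically. The following is a minimal sketch in plain Python, not part of any AIPS++ code: the function names and the test values are hypothetical, and the element layout [[a, c], [d, b]] follows the matrices in equation 45. It compares the directly computed difference m.M − M.m with the element-wise form of equation 46, and confirms that a diagonal matrix with equal elements commutes exactly.

```python
# Minimal sketch (hypothetical helper names): commutation error of 2x2 matrices.
# A matrix is a nested list [[a, c], [d, b]], as laid out in equation 45.

def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[r][k] * q[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def commutation_error(m, M):
    """Delta = m.M - M.m, computed directly from the two products."""
    mM, Mm = matmul(m, M), matmul(M, m)
    return [[mM[r][c] - Mm[r][c] for c in range(2)] for r in range(2)]

def delta_closed_form(m, M):
    """Delta written out element by element, as in equation 46."""
    (a, c), (d, b) = m
    (A, C), (D, B) = M
    return [[c * D - d * C, C * (a - b) - c * (A - B)],
            [d * (A - B) - D * (a - b), -(c * D - d * C)]]

# Arbitrary complex test matrices with small off-diagonal elements.
m = [[1.0 + 0.1j, 0.02], [0.03j, 0.9]]
M = [[2.0, 0.05j], [0.04, 2.1 - 0.2j]]

direct = commutation_error(m, M)
closed = delta_closed_form(m, M)
max_diff = max(abs(direct[r][c] - closed[r][c])
               for r in range(2) for c in range(2))

# A diagonal matrix with equal elements (a 'multiplication') commutes with M.
mult = [[0.7j, 0], [0, 0.7j]]
delta_mult = commutation_error(mult, M)
max_mult = max(abs(x) for row in delta_mult for x in row)
```

Here `max_diff` measures the agreement between the two forms of Δ, and `max_mult` is the residual commutation error for the multiplication matrix.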

6.2 Overview of Jones matrix forms

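It is sufficient to discuss the commutation properties of the feed-based Jones matrices because, if $A_i$ commutes with $B_i$ and $A_j$ with $B_j$, then $(A_i \otimes A_j^{*})$ commutes with $(B_i \otimes B_j^{*})$:

$$
(J_i \otimes J_j^{*}) = (A_i B_i Z_i) \otimes (A_j^{*} B_j^{*} Z_j^{*})
= (A_i \otimes A_j^{*})\,(B_i \otimes B_j^{*})\,(Z_i \otimes Z_j^{*})
\tag{48}
$$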
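The factorisation in equation 48 rests on the mixed-product property of the Kronecker (outer) product. A minimal sketch in plain Python follows, assuming that the interferometer matrix is formed as the Kronecker product of the Jones matrix of feed i with the complex conjugate of that of feed j; the helper names and the numerical values are hypothetical and serve only as an illustration.

```python
# Minimal sketch (hypothetical helper names): the mixed-product property
# (A_i B_i) (x) conj(A_j B_j) = (A_i (x) conj(A_j)) (B_i (x) conj(B_j)).

def matmul(p, q):
    """Product of two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[r][k] * q[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def conj(p):
    """Element-wise complex conjugate of a matrix."""
    return [[complex(x).conjugate() for x in row] for row in p]

def kron(p, q):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[p[r // 2][c // 2] * q[r % 2][c % 2] for c in range(4)]
            for r in range(4)]

# Arbitrary feed-based 2x2 matrices for feeds i and j.
A_i = [[1.0, 0.1j], [0.0, 0.9]]
B_i = [[0.8 + 0.2j, 0.0], [0.05, 1.1]]
A_j = [[1.05, 0.0], [0.02j, 0.95]]
B_j = [[0.9, 0.03], [0.0, 1.0 - 0.1j]]

# Left: Jones matrices multiplied per feed first, then combined.
lhs = kron(matmul(A_i, B_i), conj(matmul(A_j, B_j)))
# Right: each factor combined per interferometer first, then multiplied.
rhs = matmul(kron(A_i, conj(A_j)), kron(B_i, conj(B_j)))

max_diff = max(abs(lhs[r][c] - rhs[r][c]) for r in range(4) for c in range(4))
```

Because the two sides agree for arbitrary matrices, any re-ordering that is valid for the 2 × 2 feed-based matrices carries over to the 4 × 4 interferometer-based products.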

Inspecting the various Jones matrices separately:

 

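$F^{+}_i$ = pure rotation $\mathrm{Rot}(\chi_i)$

$F^{\odot}_i$ = diagonal matrix $\mathrm{Diag}(\exp i\chi_i,\ \exp -i\chi_i)$

$T^{+}_i, T^{\odot}_i$ = multiplication $\mathrm{Mult}(t_i)$

$K_i$ = multiplication $\mathrm{Mult}(\exp i\vec{\rho}.\vec{r}_i)$    if $\vec{r}_{ia} = \vec{r}_{ib}$ (virtually always the case)

$P^{+}_i$ = pure rotation $\mathrm{Rot}(\gamma_{xa})$    if $\gamma_{xa} = \gamma_{yb}$

$P^{\odot}_i$ = diagonal matrix $\mathrm{Diag}(\exp i\gamma_{xa},\ \exp -i\gamma_{xa})$    if $\gamma_{xa} = \gamma_{yb}$

$P^{+}_i$ = pseudo-rotation $\mathrm{Rot}(\gamma_{xa}, \gamma_{yb})$    if $\gamma_{xa} \neq \gamma_{yb}$

$P^{\odot}_i$ = a general matrix    if $\gamma_{xa} \neq \gamma_{yb}$

$E^{+}_i, E^{\odot}_i$ = diagonal matrix $\mathrm{Diag}(e_{iaa}, e_{ibb})$    if no cross-leakage ($e_{iab} = e_{iba} = 0$)

       = multiplication $\mathrm{Mult}(e_i)$    if also $e_{iaa} = e_{ibb}$ for all $\vec{\rho}$

$D^{+}_i, D^{\odot}_i \approx$ unit matrix $\mathcal{U}$    if small leakage, i.e. ($d_{iab} \approx d_{iba} \approx 0$)

$D^{+}_i$ = $\mathrm{Ell}(\theta_{ia}, -\theta_{ib})\ \mathrm{Rot}(\phi_{ia}, \phi_{ib})$

       $\approx \mathrm{Ell}(\theta_{ia}, -\theta_{ia})\ \mathrm{Rot}(\phi_{ia})$    if $\theta_{ib} = \theta_{ia}$ and $\phi_{ib} = \phi_{ia}$

$D^{\odot}_i$ = $(\mathcal{H}\,\mathrm{Ell}(\theta_{ia}, -\theta_{ib})\,\mathcal{H}^{-1})\ (\mathcal{H}\,\mathrm{Rot}(\phi_{ia}, \phi_{ib})\,\mathcal{H}^{-1})$

       $\approx \mathrm{Rot}(\theta_{ia})\ \mathrm{Diag}(\exp i\phi_{ia},\ \exp -i\phi_{ia})$    if $\theta_{ib} = \theta_{ia}$ and $\phi_{ib} = \phi_{ia}$

$[Y_i]$ = anti-diagonal matrix: a problem, if present....

$[H_i]$ = effectively hidden if present, see equation 24

$G_i$ = diagonal matrix $\mathrm{Diag}(g_{ipp}, g_{iqq})$    if no cross-talk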

Problems are caused predominantly by matrices with non-zero off-diagonal elements like Di, Yi, and Pi if γxa ≠ γyb. Of these, only Di is present in all telescopes. Pi will be a problem for SKAI, because γxa ≠ γyb.

6.3 Allowable changes of order

The following changes in the order of Jones matrices are allowed, but only under the indicated conditions. NB: Some Jones matrices will commute if it can be assumed that the observed source is compact, dominating, unpolarised and near the centre of the field. This is often the case.

6.4 VisJones and SkyJones

The Jones matrices may be split up into two groups: $J_i = J^{vis}_i J^{sky}_i$. In these terms, the full M.E. (ignoring normalisation factors, see equ 6) becomes:

$$
\vec{V}_{ij} = \int dt\,df\ (J^{vis}_i \otimes J^{vis*}_j) \sum_k \int dl\,dm\ (J^{sky}_i \otimes J^{sky*}_j)\ S\,\vec{I}_k
\tag{49}
$$

We now see the reason for placing the integration over f and t to the left of the sum over k sources. Since it is computationally advantageous to minimise the number of Jones matrices that operate in the image plane, it must be investigated whether Jones matrices that do not depend on the source position can be moved to the left in the chain, using the rules in section 6.3 above. Depending on the chosen coordinate system, (and always keeping in mind the conditions for re-ordering Jones matrices), the following split appears to be the maximum obtainable:

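$$
J^{vis}_i = K^{0}_i\,(G_i T_i)\,D^{+}_i P^{+}_i F^{+}_i \qquad (\text{using } S = S^{+})
\tag{50}
$$

$$
\phantom{J^{vis}_i} = K^{0}_i\,(G_i T_i F^{\odot}_i)\,D^{\odot}_i P^{\odot}_i \qquad (\text{using } S = S^{\odot})
\tag{51}
$$

$$
J^{sky}_i = E_i\,K'_i
\tag{52}
$$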

This is what is done implicitly in some existing reduction packages.
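The computational point of the split can be illustrated with a small numerical sketch. It is written in plain Python under stated assumptions: the helper names, the two-source example and all numerical values are hypothetical; integration over time, frequency and source extent is ignored; and the product S I of each source is replaced by an arbitrary 4-element coherency vector. The position-independent part is applied once, after the sum over sources, and gives the same result as applying the full chain to every source separately.

```python
# Minimal sketch (hypothetical names and values): applying the
# position-independent 'vis' part once, after summing over sources.

def matmul(p, q):
    """Product of two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[r][k] * q[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def conj(p):
    """Element-wise complex conjugate of a matrix."""
    return [[complex(x).conjugate() for x in row] for row in p]

def kron(p, q):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[p[r // 2][c // 2] * q[r % 2][c % 2] for c in range(4)]
            for r in range(4)]

def apply(mat, vec):
    """Matrix times vector."""
    return [sum(mat[r][k] * vec[k] for k in range(len(vec)))
            for r in range(len(mat))]

def add(u, v):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(u, v)]

# Position-independent parts for feeds i and j (arbitrary values).
jvis_i = [[1.1, 0.02j], [0.01, 0.9 + 0.1j]]
jvis_j = [[0.95, 0.0], [0.03j, 1.05]]

# Per source k: (sky part for feed i, sky part for feed j, coherency vector).
sources = [
    ([[0.9, 0.0], [0.0, 0.8]], [[0.85, 0.0], [0.0, 0.9]], [1.0, 0.1j, -0.1j, 1.0]),
    ([[0.5j, 0.0], [0.0, 0.4j]], [[0.45, 0.0], [0.0, 0.5]], [0.3, 0.0, 0.0, 0.2]),
]

# Full chain applied to every source separately.
direct = [0j] * 4
for jsky_i, jsky_j, coh in sources:
    full = kron(matmul(jvis_i, jsky_i), conj(matmul(jvis_j, jsky_j)))
    direct = add(direct, apply(full, coh))

# Sky part summed over sources first, vis part applied once.
sky_sum = [0j] * 4
for jsky_i, jsky_j, coh in sources:
    sky_sum = add(sky_sum, apply(kron(jsky_i, conj(jsky_j)), coh))
split = apply(kron(jvis_i, conj(jvis_j)), sky_sum)

max_diff = max(abs(x - y) for x, y in zip(direct, split))
```

In the split form the 4 × 4 matrix of the vis part is built and applied once per visibility rather than once per source, which is where the saving comes from when many sources contribute.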

6.4.1 Tied Array

For a tied array (ignoring integration and weight factors for the moment), equation 5 becomes:

$$
\vec{V}_{ij} = (Q_i \otimes Q_j^{*}) \sum_n \sum_m (J^{vis}_{in} \otimes J^{vis*}_{jm}) \sum_k (J^{sky}_{ink} \otimes J^{sky*}_{jmk})\ S\,\vec{I}_k
\tag{53}
$$

Under extremely favourable conditions, i.e. if:  
- individual feed beams per tied array are identical.  
- Faraday rotation is the same for an entire tied array  
- All receptors of a tied array have the same orientation.  
- receptor cross-leakages are small.  
- tied array feed signals are corrected before adding.  
- there are no delay errors.  
then equation 53 can be reduced to:

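$$
\vec{V}_{ij} = (Q_i \otimes Q_j^{*})\,(P_i \otimes P_j^{*})\,(F_i \otimes F_j^{*}) \sum_k (E_{ik} \otimes E_{jk}^{*}) \sum_n \sum_m (K_{ink} \otimes K_{jmk}^{*})\ S\,\vec{I}_k
\tag{54}
$$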

References

[1]   J.D.Bregman, J.E.Noordam Matrix formalism for Interferometric Polarisation Calibration. Internal proposal to AIPS++ project, April 1993.

[2]   J.P.Hamaker, J.D.Bregman, R.J. Sault Understanding Radio Polarimetry I: Mathematical foundations. Accepted by Astronomy and Astrophysics, Sept 1995. (For a preprint, see http://www.nfra.nl/~hamaker).

[3]   R.J.Sault, J.P.Hamaker, J.D.Bregman Understanding Radio Polarimetry II: Instrumental calibration of an interferometer array. Accepted by Astronomy and Astrophysics, Sept 1995. (For a preprint, see http://www.nfra.nl/~hamaker).

[4]   J.P.Hamaker, J.D.Bregman Understanding Radio Polarimetry III: Interpreting the IAU/IEEE definitions of the Stokes parameters. Submitted to Astronomy and Astrophysics, Oct 1995. (For a preprint, see http://www.nfra.nl/~hamaker).

[5]   J.E.Noordam Some practical aspects of the matrix-based Measurement Equation of a generic radio telescope. AIPS++ Implementation note 182 (June 1995)

[6]   T.J.Cornwell Calibration and Imaging using the Measurement Equation for the Generic Interferometer. AIPS++ Implementation note 183 (July 1995)

[7]   T.J.Cornwell The Generic Interferometer I: Overview of Calibration and Imaging AIPS++ Implementation note 183 (August 1995)

[8]   T.J.Cornwell The Generic Interferometer II: Image Solvers AIPS++ Implementation note ... (revised version, Aug 1995) developing

[9]   T.J.Cornwell The Generic Interferometer III: Analysis of Calibration and Imaging AIPS++ Implementation note ... (Nov 1995) developing

[10]   T.J.Cornwell, M.H.Wieringa The Generic Interferometer IV: Design of Calibration and Imaging AIPS++ Implementation note ... (Dec 1995) developing

[11]   T.J.Cornwell The Generic Interferometer V: Specification of Calibration and Imaging AIPS++ Implementation note ... (Sept 1995) developing

[12]   A.R.Thompson, J.M.Moran, G.W.Swenson Interferometry and Synthesis in Radio Astronomy. John Wiley and Sons (1986)

[13]   R.A.Perley, F.R.Schwab, A.H.Bridle Synthesis Imaging in Radio Astronomy. Astronomical Society of the Pacific Conference Series, Vol 6 (1989)

A APPENDIX: CONVENTIONS

A consistent nomenclature and precise definitions are extremely important for a software package like AIPS++, which aspires to be a ‘world reduction package’, and to which workers with a large spacetime separation are supposed to contribute. One of the most sensitive areas in this respect is the Measurement Equation, which underlies the central subject of uv-calibration and imaging.

However, it is not easy to define, adopt and enforce the use of a suitable set of conventions. This appendix is a hopefully useful step in that process. It proposes coordinate conventions and some definitions (notably the one for feed!), and lists symbols that have been defined in a separate TeX file (referred to as \include(megi-symbols) in this LaTeX document). The TeX syntax is shown in small print (e.g. \FeedI), for easy reference.

A.1 Some definitions

The following definitions are displayed in a distinctive font throughout the text of this document in order to emphasize that they have been defined explicitly.

A.2 Labels, sub- and super-scripts

 

$i, j$   \FeedI, \FeedJ   feed labels

$a, b$   \RcpA, \RcpB   receptor labels, two per feed.

$p, q$   \IFP, \IFQ   IF-channel labels, two per feed.

$r, l$   \RPol, \LPol   circular polarisation (right, left)

$x, y$   \XPol, \YPol   linear polarisation (N-S, E-W)

$A^{+}, A^{\odot}$   A\ssLin, A\ssCir   superscripts for linear and circular polarisation

$A_i, A_{ij}$   A\ssI, A\ssIJ   feed subscripts

The subscript convention of matrix elements is as follows: Yibp refers to a matrix element of matrix Y for feed i, which models the coupling of the signal going from receptor b to IF-channel p.

A.3 Coordinate frames

Fig 1 gives an overview of the coordinate system(s) used. All angles on the Sky are measured counter-clockwise, i.e. in the direction North through East. When relevant, ‘axis’ means ‘positive axis’ (e.g. the positive x-axis). It is important to make a distinction between:

The beam frame(s): In order to calculate the effects of the primary beam on the signal of a source in direction ρ(l,m), the shape and position of the voltage beams of each receptor on the Sky has to be calculated. For fully steerable parabolic antennas, which have constant beamshapes, this can be done most conveniently in coordinate frames defined by the projected position angles of the receptors. To allow for the fact that the two beams of a feed are closely coupled, an intermediate feed-frame is defined also.

The electrical frame: For the polarisation of the signal, the only relevant parameters are the projected angles w.r.t. the ‘electrical’ axes x and y defined by the IAU.

NB: In order to see that two frames are needed, consider that Faraday rotation rotates the electric vector, but not the beam on the sky.

 

Frame of the entire telescope (single dish or array):

$\vec{r}$   \vvAntPos   Projected feed (receptor?) position vector

$u, v, w$   \ccU, \ccV, \ccW   Projected baseline coordinates

$\vec{u}$   \vvUVW   Projected baseline vector $\vec{u}\,(u, v, w)$

Electrical frame on the sky (IAU definition):

$x, y$   \ccX, \ccY   IAU electrical frame on the sky.

$z$   \ccZ   propagation direction of incident field.

$\gamma_{xy}$   \aaXY   Angle from x-axis to y-axis ($= \pi/2$)

$x, y$   \ccXPol, \ccYPol   linear polarisation coordinates.

$r, l$   \ccRPol, \ccLPol   circular polarisation coordinates.

Sky frame (w.r.t. fringe stopping centre):

$l, m, n$   \ccL, \ccM, \ccN   Coordinates (direction cosines)

$\vec{\rho}$   \vvLMN   Source direction vector $\vec{\rho}\,(l, m)$

$\vec{\rho}_{ftc}$   \vvFTC   Fringe Tracking Centre $\vec{\rho}_{ftc}\,(RA, DEC, f)$

$\vec{\rho}_{mc}$   \vvMC   Map Centre $\vec{\rho}_{mc}\,(l, m)$

$\gamma_{lm}$   \aaLM   Angle from l-axis to m-axis ($= \pi/2$)

$\gamma_{lx}$   \aaLX   Angle from l-axis to x-axis ($= \pi/2$)

Coordinate frame of feed i, projected on the sky:

$l_i, m_i$   \ccLI, \ccMI   Coordinates

$l_{i0}, m_{i0}$   \ccLIO, \ccMIO   Origin $(l, m)$ of feed-frame.

$\gamma_{li}$   \aaLI   Angle from l-axis to $l_i$-axis

$\gamma_{xi}$   \aaXI   Angle from x-axis to $l_i$-axis ($= -\gamma_{lx} + \gamma_{li}$)

Coordinate frame of receptor a of feed i, projected on the sky:

$l_{ia}, m_{ia}$   \ccLIA, \ccMIA   Coordinates

$l_{ia0}, m_{ia0}$   \ccLIAO, \ccMIAO   Origin $(l_i, m_i)$ of receptor-frame.

$\gamma_{ia}$   \aaIA   Angle from $l_i$-axis to $l_{ia}$-axis

$\gamma_{xa}$   \aaXA   Angle from x-axis to $l_{ia}$-axis ($= -\gamma_{lx} + \gamma_{li} + \gamma_{ia}$)

Coordinate frame of receptor b of feed i, projected on the sky:

$l_{ib}, m_{ib}$   \ccLIB, \ccMIB   Coordinates

$l_{ib0}, m_{ib0}$   \ccLIBO, \ccMIBO   Origin $(l_i, m_i)$ of receptor-frame.

$\gamma_{ib}$   \aaIB   Angle from $l_i$-axis to $l_{ib}$-axis

$\gamma_{yb}$   \aaYB   Angle from y-axis (!) to $l_{ib}$-axis ($= -\gamma_{xy} - \gamma_{lx} + \gamma_{li} + \gamma_{ib}$)


Figure 1: A (rather crowded) overview of the various coordinate frames for the Measurement Equation. See also the text. The origin of the Sky frame (l,m) is defined by the fringe stopping centre. The origin of the feed-frame (l i,m i) is defined by the pointing centre of feed i. The ‘pointing centres’ of the voltage beams of receptors a and b (marked with a and b) define the origins of the receptor-frames (l ia,m ia) and (l ib,m ib). The shapes and position offsets of these voltage beams are exaggerated, in order to emphasise that they do not necessarily coincide.


The coordinates $l_{ia}, m_{ia}$ and $l_{ib}, m_{ib}$ of the frames of receptors a and b in equ 37 are related to the celestial coordinate frame $l, m$ in a two-step process. First we define an intermediate feed-frame $l_i, m_i$ for feed i, projected on the Sky:

$$
\begin{pmatrix} l_i \\ m_i \end{pmatrix}
= \mathrm{Rot}(\gamma_{li})
\begin{pmatrix} l - l_{i0} \\ m - m_{i0} \end{pmatrix}
\tag{55}
$$

in which (li0,mi0) is the Pointing Centre of feed i, and Rot(γli) is a rotation over the projected angle γli between the positive l-axis of the Sky frame and the l i-axis of the feed-frame.

The voltage beams themselves are best modelled in a receptor-frame (see equ 37), again projected on the Sky. For receptor a we have:

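$$
\begin{pmatrix} l_{ia} \\ m_{ia} \end{pmatrix}
= \mathrm{Rot}(\gamma_{ia})
\begin{pmatrix} l_i - l_{ia0} \\ m_i - m_{ia0} \end{pmatrix}
\tag{56}
$$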

The matrix Rot(γia) represents a rotation over the angle γia between the positive l i-axis of the feed-frame and the l ia-axis of the relevant receptor-frame. For receptor b:

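$$
\begin{pmatrix} l_{ib} \\ m_{ib} \end{pmatrix}
= \mathrm{Rot}(\gamma_{ib})
\begin{pmatrix} l_i - l_{ib0} \\ m_i - m_{ib0} \end{pmatrix}
\tag{57}
$$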

(l ia0,m ia0) and (l ib0,m ib0) represent pointing offsets of receptor a and b respectively. These can be used to model ‘beam-squint’ of feeds that are not axially symmetric.
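The two-step transformation of equations 55 to 57 can be sketched as follows. This is an illustration only, in plain Python: the function names are hypothetical, and the sign convention of the rotation matrix (minus sign on the upper-right sine term) is an assumption that should be checked against the definition of Rot in section A.4.

```python
import math

# Minimal sketch (hypothetical names): sky frame -> feed-frame -> receptor-frame.
# ASSUMPTION: Rot(angle) = [[cos, -sin], [sin, cos]].

def rot(angle):
    """Pure rotation matrix over the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def to_frame(point, origin, angle):
    """Subtract the frame origin, then rotate (one step of equ 55-57)."""
    dl, dm = point[0] - origin[0], point[1] - origin[1]
    r = rot(angle)
    return (r[0][0] * dl + r[0][1] * dm, r[1][0] * dl + r[1][1] * dm)

def sky_to_receptor(lm, feed_origin, gamma_li, rcp_origin, gamma_ia):
    """Sky coordinates (l, m) to receptor-frame coordinates.

    feed_origin is the pointing centre of the feed, in sky coordinates;
    rcp_origin is the pointing offset of the receptor, in feed-frame coordinates.
    """
    feed_lm = to_frame(lm, feed_origin, gamma_li)   # equation 55
    return to_frame(feed_lm, rcp_origin, gamma_ia)  # equation 56 (or 57)

# A source at the pointing centre, with no receptor offset, maps to the origin.
at_centre = sky_to_receptor((0.01, 0.02), (0.01, 0.02), 0.3, (0.0, 0.0), 0.2)

# Without offsets, the two rotations add up to a single rotation.
combined = sky_to_receptor((1.0, 0.0), (0.0, 0.0), 0.3, (0.0, 0.0), 0.2)
```

A non-zero `rcp_origin` shifts the beam of one receptor relative to the feed pointing centre, which is how ‘beam-squint’ would enter such a model.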

A.4 Matrices and vectors

The following matrices and vectors play a role in the Measurement Equation:

 

$\vec{I}$   \vvIQUV   Stokes vector of the source (I,Q,U,V).

$\vec{V}, v$   \vvCoh, \vvCohEl   Coherency vector, and one of its elements.

$S$   \mmStokes   Stokes matrix, conversion between polarisation representations.

$S^{+}$   \mmStokes\ssLin   Conversion to linear representation.

$S^{\odot}$   \mmStokes\ssCir   Conversion to circular representation.

$\mathcal{M}$   \mmMueller   Mueller matrix: Stokes to Stokes through optical ‘element’

$X, x$   \mmXifr, \mmXifrEl   Correlator matrix (4 × 4).

$M, m$   \mmMifr, \mmMifrEl   Multiplicative interferometer-based gain matrix (4 × 4).

$\vec{A}, a$   \vvAifr, \vvAifrEl   Additive interferometer-based gain vector.

The following feed-based Jones matrices (2 × 2) have a well-defined meaning:

 

$J, j$   \mjJones, \mjJonesEl   Jones matrix, and one of its elements.

$F, f$   \mjFrot, \mjFrotEl   Faraday rotation (of the plane of linear pol.)

$T, t$   \mjTrop, \mjTropEl   Atmospheric gain (refraction, extinction).

$P, p$   \mjProj, \mjProjEl   Projected receptor angle(s) w.r.t. x, y frame

$B, b$   \mjBtot, \mjBtotEl   Total feed voltage pattern (i.e. B = DEP).

$E, e$   \mjBeam, \mjBeamEl   Traditional feed voltage beam.

$C, c$   \mjConf, \mjConfEl   Feed configuration matrix (...).

$D, d$   \mjDrcp, \mjDrcpEl   Leakage between receptors a and b.

$H, h$   \mjHybr, \mjHybrEl   Hybrid network, to convert to circular pol.

$G, g$   \mjGrec, \mjGrecEl   feed-based electronic gain.

$K, k$   \mjKern, \mjKernEl   Fourier Transform Kernel (baseline phase weight)

$K^{0}, k^{0}$   \mjKref, \mjKrefEl   FT kernel for the fringe-stopping centre.

$K', k'$   \mjKoff, \mjKoffEl   FT kernel relative to the fringe-stopping centre.

$Q, q$   \mjQsum, \mjQsumEl   Electronic gain of tied-array feed after summing.

Some special matrices and vectors:

Zero   \mmZero   Zero matrix

$\vec{0}$   \vvZero   Zero vector

$\mathcal{U}$   \mmUnit   Unit matrix

$\mathrm{Diag}(a,b)$   \mjDiag   Diagonal matrix with elements a, b

$\mathrm{Mult}(a)$   \mjMult   Multiplication with factor a

$\mathrm{Rot}(\alpha[,\beta])$   \mjRot   [pseudo] Rotation over an angle α, β

$\mathrm{Ell}(\alpha[,\beta])$   \mjEll   Ellipticity angle[s] α, β

$\mathcal{H}$   \mjLtoC   Signal conversion from linear to circular.

$\mathcal{H}^{-1}$   \mjCtoL   Signal conversion from circular to linear.

Definitions of some special matrices:

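$$
\mathrm{Diag}(a,b) \equiv \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}
\qquad
\mathrm{Diag}(a,a) = \mathrm{Mult}(a) = a \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\tag{58}
$$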

A ‘pure’ rotation Rot(α) is a special case of a ‘pseudo rotation’ Rot(α,β):

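$$
\mathrm{Rot}(\alpha,\beta) \equiv \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\beta & \cos\beta \end{pmatrix}
\qquad
\mathrm{Rot}(\alpha) \equiv \mathrm{Rot}(\alpha,\alpha) = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}
\tag{59}
$$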

Ellipticity:

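$$
\mathrm{Ell}(\alpha,\beta) \equiv \begin{pmatrix} \cos\alpha & i\sin\alpha \\ -i\sin\beta & \cos\beta \end{pmatrix}
\qquad
\mathrm{Ell}(\alpha) \equiv \mathrm{Ell}(\alpha,-\alpha) = \begin{pmatrix} \cos\alpha & i\sin\alpha \\ i\sin\alpha & \cos\alpha \end{pmatrix}
\tag{60}
$$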
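The relation between the linear and circular forms listed in section 6.2 can be checked with a short numerical sketch in plain Python. Everything here is an assumption made for illustration: the function names are hypothetical, the rotation and ellipticity matrices use the sign conventions Rot(α) = [[cos α, −sin α], [sin α, cos α]] and Ell(α) = Ell(α, −α), and the linear-to-circular conversion matrix is taken to be (1/√2) [[1, i], [1, −i]], which is the form implied by footnote 4. Under those assumptions a pure rotation becomes a diagonal matrix in the circular representation, and a pure ellipticity becomes a pure rotation.

```python
import cmath
import math

# Minimal sketch (hypothetical names; sign conventions are ASSUMPTIONS).

def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[r][k] * q[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def rot(alpha, beta=None):
    """[Pseudo] rotation; Rot(alpha) means Rot(alpha, alpha)."""
    beta = alpha if beta is None else beta
    return [[math.cos(alpha), -math.sin(alpha)],
            [math.sin(beta), math.cos(beta)]]

def ell(alpha, beta=None):
    """Ellipticity; Ell(alpha) means Ell(alpha, -alpha)."""
    beta = -alpha if beta is None else beta
    return [[math.cos(alpha), 1j * math.sin(alpha)],
            [-1j * math.sin(beta), math.cos(beta)]]

def diag(a, b):
    """Diagonal matrix with elements a, b."""
    return [[a, 0], [0, b]]

S2 = 1.0 / math.sqrt(2.0)
H = [[S2, 1j * S2], [S2, -1j * S2]]       # assumed linear -> circular
H_INV = [[S2, S2], [-1j * S2, 1j * S2]]   # its inverse, circular -> linear

def to_circular(m):
    """Express a linear-representation matrix in the circular representation."""
    return matmul(matmul(H, m), H_INV)

def max_abs_diff(p, q):
    """Largest element-wise difference between two 2x2 matrices."""
    return max(abs(p[r][c] - q[r][c]) for r in range(2) for c in range(2))

chi = 0.37    # arbitrary rotation angle (radians)
theta = 0.05  # arbitrary ellipticity angle (radians)

# Rot(chi) in the circular representation is Diag(exp(i chi), exp(-i chi)).
err_rot = max_abs_diff(to_circular(rot(chi)),
                       diag(cmath.exp(1j * chi), cmath.exp(-1j * chi)))

# Ell(theta) in the circular representation is Rot(theta).
err_ell = max_abs_diff(to_circular(ell(theta)), rot(theta))
```

With a different sign convention for the rotation matrix the diagonal elements would swap, so this sketch is best read as a consistency check rather than as a statement of the adopted convention.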

A.5 Miscellaneous parameters

 

$\beta$   \ppParall   Parallactic angle, from North pole to zenith

$HA$   \ppHA   Hour Angle

$RA$   \ppRA   Right Ascension

$DEC$   \ppDEC   Declination

$LAT$   \ppLAT   Latitude on Earth

$t$   \ccT   Time

$f$   \ccF   Frequency

$\chi$   \ppFarad   Faraday rotation angle

$a$   \ppAmpl   Amplitude

$\psi$   \ppPhase   Phase

$\zeta$   \ppPhaseZero   Phase zero

$\phi$   \ppRcpPosDev   Dipole position angle error

$\theta$   \ppRcpEllDev   receptor ellipticity

1 The generic IF-channel labels p and q are known as X and Y for WSRT and ATCA, and R and L for the VLA. They should not be confused with the two receptors a and b, since the signal in an IF-channel may be a linear combination of the receptor signals.

2 Also called the outer matrix product, or tensor product, or Kronecker product. See [2].

3 In one influential book [12], the factor 0.5 is omitted from S. This is clearly incorrect, since a single receptor can never measure more than one half of the total flux of an unpolarised source.

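4 One might argue that a more consistent form of $\mathcal{H}$ would be an expression in terms of the ± π/4 ellipticities that are intrinsic to a circular receptor:

$$
\mathcal{H}_{alternative} = \mathrm{Ell}(\pi/4, -\pi/4)
= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} \mathcal{H}
\tag{17}
$$

However, a choice for a different $\mathcal{H}$ should not be made lightly, since it would affect the deeply entrenched form of the Stokes matrices.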