SFXC workshop 2025 • Wide-field processing

This page outlines the wide-field correlation tutorial that was presented as part of the first SFXC workshop, held on 21–23 September 2025 at the Joint Institute for VLBI in Europe (JIVE). For more information and resources regarding the SFXC workshop, see the workshop webpage or return to the homepage.

This tutorial is an introduction to the method of wide-field VLBI and the associated correlation implementation in SFXC. By the conclusion of this tutorial, you should be able to:

On this page

  1. Introduction
  2. Project preparation and data download
  3. Correlator preparation
  4. Running the correlator
  5. Post processing
  6. Confirming the outcome
  7. Current & future developments
  8. Resources

A. Introduction

A1. Smearing

Wide-field VLBI is a specialised observing mode which correlates multiple positions across the primary beam of the interferometer. Wide-field VLBI correlation faces a fundamental challenge: to image a large fraction of the primary beam, the correlator must use ultra-fine temporal and frequency resolution to avoid time and bandwidth smearing.

Smearing is proportional to the baseline length (as shown in Figure A1), so doing this for the entire primary beam of a VLBI array produces huge data sets (often many tens of terabytes) and demands extreme computational resources, which is increasingly impractical given the higher bit rates of modern VLBI arrays.

Figure A1 - Example image illustrating the effects of smearing on the imaging of extragalactic sources. The background image is from the short baselines of MeerKAT which shows no smearing in the image. However, when looking at higher resolution (as with e-MERLIN) smearing affects the source which is far from the delay tracking centre.

A2. Multiple phase centre observing

Instead of correlating the whole beam at full resolution, software correlators implement the multiple phase centre observing mode (Deller et al. 2011). This process has three distinct steps, which are as follows (and shown in Figure A2):

  1. Initial correlation at high resolution
    The correlator internally processes the data with a fine time and frequency resolution. This retains the large field-of-view as smearing is kept to a minimum.

  2. Make copies & phase rotate to different positions within the primary beam
    The observer specifies multiple positions within the primary beam. These could be:
    • Sources of interest (e.g., calibrator + targets).
    • Or a grid to cover the field.
    We then phase rotate the delay tracking centre to these positions.
  3. Average the phase rotated data sets:
    • Average to a manageable smearing (e.g., 30–60″ field of view).
    • Produces small (∼GB) datasets per phase centre instead of a single massive file.
    • Each dataset can be calibrated and imaged independently and in parallel.
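
The three steps above can be illustrated with a toy, single-baseline, one-dimensional sketch. This is not how SFXC implements the shift internally; it only assumes that a point source offset by an angle l from the tracking centre adds a phase of 2π·u·l on a baseline of length u wavelengths. All numbers (u0, du, the offset) are hypothetical and chosen so that the phase wraps exactly once across the averaging interval.

```python
import cmath
import math


def fringe_samples(u_wavelengths, offset_rad):
    """Unit-amplitude visibilities of a point source offset by offset_rad (1-D toy model)."""
    return [cmath.exp(-2j * math.pi * u * offset_rad) for u in u_wavelengths]


def phase_rotate(vis, u_wavelengths, shift_rad):
    """Move the phase centre by shift_rad by removing the matching phase slope."""
    return [v * cmath.exp(2j * math.pi * u * shift_rad)
            for v, u in zip(vis, u_wavelengths)]


def average(vis):
    """Average a list of complex visibilities into a single sample."""
    return sum(vis) / len(vis)


# Step 1: fine-resolution samples (hypothetical numbers).
n_fine = 64                # fine samples inside one output integration
u0, du = 1.0e6, 2.0e5      # baseline (wavelengths) and its change over the interval
offset = 1.0 / du          # source offset (radians) giving exactly one phase wrap
u = [u0 + du * k / n_fine for k in range(n_fine)]
vis = fringe_samples(u, offset)

# Averaging without shifting: the source is smeared away.
smeared = abs(average(vis))

# Steps 2 and 3: phase rotate to the source position, then average.
recovered = abs(average(phase_rotate(vis, u, offset)))

print(f"amplitude without shift: {smeared:.3e}")
print(f"amplitude after shift:   {recovered:.3f}")
```

Without the shift the averaged amplitude collapses towards zero, whereas after rotating to the source position the full amplitude survives the averaging. This is why each phase centre can be stored at coarse resolution.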

Important - if you wish to use SFXC to perform wide-field correlations of interferometric arrays with shorter baselines (e.g., e-MERLIN), it may be simpler to carry out a standard correlation with higher time and frequency resolution to prevent smearing from affecting sources near the edge of the primary beam.

Figure A2 - Diagram illustrating the wide-field correlation process when using the multiple phase centre observing technique, including the fields of view defined by smearing after internal wide-field correlation, along with the shift and averaging steps involved.

B. Project preparation and data download

To prepare for the practical part of this tutorial, we need to ensure that you have the software and utilities available to execute all parts of the tutorial.

Utilities required

Software required

Folder structure

With the software installed, we now wish to set up the data, supporting scripts and configuration files. We recommend making a new folder (called for example n24l2_mpcc) for this tutorial so that we do not inadvertently affect other tutorials.

A recommended file structure, and an accompanying set of commands which produce it, can be found below. Note that the final wget command will re-download the baseband data, so you could instead copy the baseband data across to the raw_data folder to save time. We will only be using the No0005 scan, so we only require a subset of the baseband data.

└── n24l2_mpcc/
    ├── calibration/
    ├── N24L2_delays/
    ├── raw_data/
    │   ├── n24l2_cm_no0005.vdif
    │   ├── n24l2_de_no0005.vdif
    │   ├── n24l2_ef_no0005
    │   └── n24l2_hh_no0005
    ├── flag_weights.py
    ├── n24l2_calibration.py
    ├── n24l2.ctrl
    ├── n24l2.vix
    └── run_correlation_post_process.bash

mkdir -p n24l2_mpcc/calibration
cd n24l2_mpcc
wget https://www.jb.man.ac.uk/~radcliff/sfxc_workshop/n24l2.ctrl
wget https://www.jb.man.ac.uk/~radcliff/sfxc_workshop/run_correlation_post_process.bash
wget https://www.jb.man.ac.uk/~radcliff/sfxc_workshop/n24l2_calibration.py
wget https://www.jb.man.ac.uk/~radcliff/sfxc_workshop/flag_weights.py
wget -t45 -l1 -r -nd https://archive.jive.nl/sfxc-workshop/n24l2/ -A "n24l2*vix"
mkdir N24L2_delays
mkdir raw_data
cd raw_data

If you are on the JIVE cluster – symlink the raw data to this directory using ln -sf /data/n24l2/files/*no0005* .

Otherwise download just the No0005 scan using:

wget -t45 -l1 -r -nd https://archive.jive.nl/sfxc-workshop/n24l2/ -A "n24l2*no0005*"

Cheat script: If you wish to skip/run a selection of steps, you can run the run_correlation_post_process.bash script. Note that you will need to edit the ### -- INPUTS -- ### section to make it work.

C. Correlator preparation

C1. Calculate wide-field correlation parameters

As was shown in Figure A2, a key step in wide-field correlation is:

  1. Deciding what constitutes the acceptable smearing-constrained field-of-view for the internal wide-field correlation step — which determines how far you can extend from the original delay tracking centre (often the primary beam maximum). This is controlled by the fft_size_correlation and sub_integr_time parameters in the SFXC control file for bandwidth and time smearing, respectively.
  2. Deciding the acceptable smearing-constrained field of view for each phase-rotated data set, i.e. how far you can image from each phase centre before smearing starts to affect the images. This is governed by the number_channels and integr_time parameters in the SFXC control file for bandwidth and time smearing, respectively.

We therefore need to determine the values of the four parameters which control the smearing. There are some key considerations before we determine these values:

  1. Are the positions selected to be targeted clustered within a single area (e.g., close to the primary beam maximum)? If so, consider reducing the FoV for the initial wide-field correlation to decrease the computational overhead. You might also want to move the delay tracking centre if they are systematically clustered in a specific direction.
  2. Do the positions selected have large uncertainties associated with them? If so you may want to reduce the averaging of the phase shifted data-sets so that you can search a larger area around the positions.
  3. Consider the output file sizes, which can get very large very quickly. An estimate for the output file size is given by $\simeq \frac{N_\mathrm{pc}\cdot N_\mathrm{hr}\cdot N_{\mathrm{sta}}\left(N_{\mathrm{sta}}+1\right)\cdot N_{\mathrm{SB}}\cdot N_\nu\cdot N_{\mathrm{pol}} \cdot f}{74565.4 \cdot t_{\mathrm{int}}}\ \mathrm{GB}$, where $N_\mathrm{pc}$ = number of phase centres, $N_\mathrm{hr}$ = number of hours observing, $N_{\mathrm{sta}}$ = number of stations participating, $N_{\mathrm{SB}}$ = number of subbands, $N_\nu$ = number of channels per subband, $N_{\mathrm{pol}}$ = number of output polarisation products, $t_\mathrm{int}$ = integration time (seconds) and $f$ is a scaling factor that has empirically been found to be $\sim 1.4$ for $N_\nu \geq 1024$ and 1 otherwise. This size includes both the baselines and the auto-correlations (Campbell et al. 2019).
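
As a quick way to try the file-size estimate above, the formula can be wrapped in a small function. This is a minimal sketch: the function name and the example inputs (2 phase centres, 1 hour, 14 stations, 8 subbands, 64 channels, 4 polarisation products, 2 s integrations) are illustrative and are not the values used in this tutorial.

```python
def estimate_output_size_gb(n_pc, n_hr, n_sta, n_sb, n_chan, n_pol, t_int, f=None):
    """Approximate correlator output size in GB, following the estimate quoted above.

    If f is not given, the empirical scaling is used:
    1.4 when n_chan >= 1024, and 1 otherwise.
    """
    if f is None:
        f = 1.4 if n_chan >= 1024 else 1.0
    numerator = n_pc * n_hr * n_sta * (n_sta + 1) * n_sb * n_chan * n_pol * f
    return numerator / (74565.4 * t_int)


# Illustrative inputs only (not taken from the tutorial's control file).
size_gb = estimate_output_size_gb(
    n_pc=2, n_hr=1, n_sta=14, n_sb=8, n_chan=64, n_pol=4, t_int=2.0
)
print(f"Estimated output size: {size_gb:.2f} GB")
```

Because the size scales linearly with the number of phase centres and channels, and inversely with the integration time, doubling the averaging in either axis halves the output.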

Calculate the smearing values

Given these considerations, you should have an idea of what field of view you want for your observation and for the individual phase centres. We can now determine the frequency and time resolutions that correspond to these fields of view. The formulae described by Wrobel (1995) for the maximum FoV corresponding to 10% smearing are:

Bandwidth smearing: $\text{FoV} \lesssim 49.^{\prime \prime}5 \frac{1}{B} \frac{N_\nu}{\mathrm{BW_{SB}}}$ or $\lesssim 49 .^{\prime \prime} 5 \frac{1}{B} \frac{N_{\mathrm{SB}} N_\nu}{\mathrm{BW_{tot}}}$

Time smearing: $\text{FoV} \lesssim 18.^{\prime \prime}56 \frac{\lambda}{B} \frac{1}{t_{\mathrm{int}}}$

where $\mathrm{BW_{SB}}$ = bandwidth per subband (MHz), $\mathrm{BW_{tot}}$ = total bandwidth (MHz), $N_\mathrm{SB}$ = number of subbands, $N_\nu$ = number of frequency channels per subband (FFT size), $\text{FoV}$ = field of view (arcseconds) at a given baseline, $B$ = baseline length (in units of 1000 km), $\lambda$ = wavelength (cm) and $t_\mathrm{int}$ = integration time (seconds).
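
The two limits can also be evaluated, or inverted, with a few lines of Python. This is a sketch under stated assumptions: the baseline is passed in km and converted to units of 1000 km (which is how the calculator on this page implements the formulae), and the helper that picks a channel count assumes the FFT size should be a power of two. The function names are illustrative.

```python
import math


def bandwidth_smearing_fov_arcsec(baseline_km, n_chan, bw_sb_mhz):
    """Maximum FoV (arcsec) for 10% bandwidth smearing."""
    return 49.5 * (1000.0 / baseline_km) * (n_chan / bw_sb_mhz)


def time_smearing_fov_arcsec(baseline_km, wavelength_cm, t_int_s):
    """Maximum FoV (arcsec) for 10% time smearing."""
    return 18.56 * (wavelength_cm / (baseline_km / 1000.0)) / t_int_s


def min_channels_for_fov(fov_arcsec, baseline_km, bw_sb_mhz):
    """Smallest power-of-two channel count whose bandwidth-smearing FoV covers fov_arcsec.

    The power-of-two restriction is an assumption made for this sketch.
    """
    needed = fov_arcsec * (baseline_km / 1000.0) * bw_sb_mhz / 49.5
    if needed <= 1.0:
        return 1
    return 2 ** math.ceil(math.log2(needed))


def max_t_int_for_fov(fov_arcsec, baseline_km, wavelength_cm):
    """Longest integration time (s) whose time-smearing FoV covers fov_arcsec."""
    return 18.56 * (wavelength_cm / (baseline_km / 1000.0)) / fov_arcsec


# Example: 2500 km baseline, 32 MHz subbands, 18 cm wavelength.
print(bandwidth_smearing_fov_arcsec(2500, 2048, 32))   # FoV of the fine channelisation
print(time_smearing_fov_arcsec(2500, 18.0, 1.0))       # FoV for 1 s integrations
print(min_channels_for_fov(60.0, 2500, 32))            # channels for a 60 arcsec FoV
print(max_t_int_for_fov(60.0, 2500, 18.0))             # integration time for a 60 arcsec FoV
```

The first two functions suit the internal wide-field step (fft_size_correlation and sub_integr_time), while the inverse helpers are handy when choosing number_channels and integr_time for the individual phase centres.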

To help you calculate these values, there is a widget below which calculates the smearing FoV values for different channelisation, observing frequencies and integration times together with a table containing values corresponding to short (Western European) baselines and global VLBI. These values are from Campbell et al. (2019).

Interactive smearing FoV calculator

Computes upper-limit FoVs (in arcsec and arcmin) due to 10% bandwidth and time smearing. If $\mathrm{BW_{tot}} = N_\mathrm{SB} \cdot \mathrm{BW_{SB}}$, both bandwidth forms match.

Example inputs: $B$ = 2500 km, $N_\nu$ = 2048, $\mathrm{BW_{SB}}$ = 32 MHz, $t_\mathrm{int}$ = 1 s, $\lambda$ = 0.18 m.

Bandwidth smearing: 49.5″ · (1000/B[km]) · (Nν/BW_SB)

Time smearing: 18.56″ · (λ/B) · (1/t_int), implemented as 18.56″ · (λ[cm] / (B[km]/1000)) · (1/t_int), i.e. 18.56″ · (100000·λ[m] / B[km]) · (1/t_int)

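Bandwidth smearing

| $\mathrm{BW_{SB}}$ (MHz) | $N_\nu$ | FoV at B = 2,500 km (arcsec) | FoV at B = 10,000 km (arcsec) |
|---|---|---|---|
| 32 | 2048 | 1267.20 | 316.80 |
| 32 | 512 | 316.80 | 79.20 |
| 32 | 32 | 19.80 | 4.95 |
| 16 | 2048 | 2534.40 | 633.60 |
| 16 | 512 | 633.60 | 158.40 |
| 16 | 32 | 39.60 | 9.90 |
| 2 | 2048 | 20275.20 | 5068.80 |
| 2 | 512 | 5068.80 | 1267.20 |
| 2 | 32 | 316.80 | 79.20 |

Time smearing

| $\lambda$ (cm) | $t_\mathrm{int}$ (s) | FoV at B = 2,500 km (arcsec) | FoV at B = 10,000 km (arcsec) |
|---|---|---|---|
| 18.0 | 1.00 | 133.20 | 33.30 |
| 18.0 | 0.25 | 532.80 | 133.20 |
| 6.0 | 1.00 | 44.40 | 11.10 |
| 6.0 | 0.25 | 177.60 | 44.40 |
| 1.3 | 1.00 | 9.62 | 2.40 |
| 1.3 | 0.25 | 38.48 | 9.62 |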

C2. Edit the control (ctrl) file

Now that you have some approximate numbers for the time and bandwidth averaging needed for both steps of the correlation, we can edit the control file to set the four parameters that control the smearing.

If you inspect the n24l2.ctrl file, you can see that it is already set up for standard correlation apart from the paths to the data and output files. Edit the file to replace the <path_to_tutorial> parts with your real path. There is also a command line entry below that you can use.

Cheat script:

sed -i.bak 's|<path_to_tutorial>|/your/path/to/n24l2_mpcc|g' n24l2.ctrl

Replace /your/path/to/ with the real path to the n24l2_mpcc folder.

Next, we can convert the control file to do multiple phase centre correlation. We first specify this by entering:

multi_phase_center: true

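The internal wide-field correlation is then governed by the fft_size_correlation and sub_integr_time parameters (the cheat script below uses 16384 and 13056, respectively).

The averaging of the phase-shifted data is governed by the number_channels and integr_time parameters.

Finally, check that multi_phase_center is set to true so that multi-phase centre correlation is enabled.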

Cheat script:

jq '.multi_phase_center = (.multi_phase_center // true) | .sub_integr_time = (.sub_integr_time // 13056) | .fft_size_correlation = (.fft_size_correlation // 16384)' n24l2.ctrl > n24l2.ctrl.tmp && mv n24l2.ctrl.tmp n24l2.ctrl
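
For readers without jq, the same update can be sketched in Python. This assumes the control file is plain JSON (which the jq command above implies) and mirrors jq's `//` behaviour: a key is only filled in when it is missing, null or false, so values already present in the file are left alone. The function names are illustrative.

```python
import json
from pathlib import Path

# Values taken from the jq cheat script above.
MPC_DEFAULTS = {
    "multi_phase_center": True,
    "sub_integr_time": 13056,
    "fft_size_correlation": 16384,
}


def apply_mpc_defaults(ctrl, defaults=MPC_DEFAULTS):
    """Return a copy of ctrl with each default filled in where the key is missing, null or false."""
    updated = dict(ctrl)
    for key, value in defaults.items():
        current = updated.get(key)
        if current is None or current is False:
            updated[key] = value
    return updated


def update_ctrl_file(path):
    """Rewrite a JSON control file in place, keeping the original as <path>.bak."""
    path = Path(path)
    original_text = path.read_text()
    Path(str(path) + ".bak").write_text(original_text)
    ctrl = apply_mpc_defaults(json.loads(original_text))
    path.write_text(json.dumps(ctrl, indent=2) + "\n")
```

Calling `update_ctrl_file("n24l2.ctrl")` from the tutorial folder applies the change. If you want to force new values rather than only fill in missing ones, assign the keys directly instead.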

C3. Edit the vix file

With the control file now ready, we need to edit the vix file. First, find the $SOURCE section of the vix file and add the new positions to be correlated. Each must be in the format specified below:

def J0854_off;
  source_name = J0854_off;
  ra = 08h54m48.8749270s;
  dec = 20d06'31.140851";
  ref_coord_frame = J2000;
enddef;

Next, we need to include the new source names in all relevant scans in the $SCHED section, as shown below:

scan No0005;
   start = 2024y144d12h47m00s;
   mode = sess224.L1024;
   station = Cm : 0 sec : 600 sec : 0.000000000 GB :   : &n : 1;
   station = Da : 0 sec : 600 sec : 0.000000000 GB :   : &n : 1;
   station = De : 0 sec : 600 sec : 0.000000000 GB :   : &n : 1;
   station = Ef : 0 sec : 600 sec : 0.000000000 GB :   : &ccw : 1;
   station = Hh : 0 sec : 600 sec : 0.000000000 GB :   :   : 1;
   station = Ir : 0 sec : 600 sec : 0.000000000 GB :   : &ccw : 1;
   station = Jb : 0 sec : 600 sec : 0.000000000 GB :   : &n : 1;
   station = Kn : 0 sec : 600 sec : 0.000000000 GB :   : &n : 1;
   station = Mc : 0 sec : 600 sec : 0.000000000 GB :   : &n : 1;
   station = Nt : 0 sec : 600 sec : 0.000000000 GB :   : &n : 1;
   station = O8 : 0 sec : 600 sec : 0.000000000 GB :   :   : 1;
   station = Pi : 0 sec : 600 sec : 0.000000000 GB :   : &n : 1;
   station = Tr : 0 sec : 600 sec : 0.000000000 GB :   : &n : 1;
   station = Wb : 0 sec : 600 sec : 0.000000000 GB :   :   : 1;
   source = J0854+2006; source = J0854_off;

Cheat script:

ed -s n24l2.vix << 'ED'
407a
def J0854_off;
source_name = J0854_off;
ra = 08h54m48.8749270s;
dec = 20d06'31.140851";
ref_coord_frame = J2000;
enddef;
.
wq
ED

sed -i.bak -E 's/^[[:space:]]*source[[:space:]]*=[[:space:]]*J0854\+2006;[[:space:]]*$/source = J0854+2006; source = J0854_off;/' n24l2.vix

Note that there is no limit to the number of sources that you can specify here, and it is just a matter of defining the coordinates in the $SOURCE section and then adding the source name in the relevant scans ($SCHED section).
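
When many phase centres are needed (for example a grid), typing the definitions by hand becomes error-prone. The sketch below generates the $SOURCE entries and the matching scan line in the format shown above. The helper names are illustrative, and the output is meant to be pasted into the vix file rather than written to it automatically.

```python
def source_def(name, ra, dec):
    """Format one $SOURCE entry in the layout used in this tutorial."""
    return "\n".join([
        f"def {name};",
        f"  source_name = {name};",
        f"  ra = {ra};",
        f"  dec = {dec};",
        "  ref_coord_frame = J2000;",
        "enddef;",
    ])


def scan_source_line(names):
    """Format the source entries for a scan in the $SCHED section."""
    return " ".join(f"source = {name};" for name in names)


# The extra phase centre used in this tutorial.
phase_centres = [
    ("J0854_off", "08h54m48.8749270s", "20d06'31.140851\""),
]

for name, ra, dec in phase_centres:
    print(source_def(name, ra, dec))

print(scan_source_line(["J0854+2006"] + [pc[0] for pc in phase_centres]))
```

Appending further tuples to `phase_centres` produces one definition per position and a single scan line listing all of them.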

D. Running the correlator

Once this is complete, we can just run sfxc, as shown in the previous tutorials:

mpirun sfxc n24l2.ctrl n24l2.vix

This will produce two output files instead of one, one for each phase centre. The file names are suffixed with the source name, so you should have N24L2.cor_J0854+2006 and N24L2.cor_J0854_OFF in your current working directory.

E. Post processing

Now that these two output correlations are completed, we can continue with the standard post-processing steps. The only difference with standard correlation is that you do the steps with each .cor file separately. This is highly parallelisable!

Convert to Measurement Set with j2ms2:

Firstly, we convert into a measurement set, remembering that we need to define the setup reference station due to the mixed bandwidths correlation that was performed.

j2ms2 -o n24l2_1_1.ms eo:setup_ref_station=Ef N24L2.cor_J0854+2006
j2ms2 -o n24l2_2_1.ms eo:setup_ref_station=Ef N24L2.cor_J0854_off

Note that the normal practice is that the first measurement set also contains the correlated scans of the calibrator sources. This is because the calibration derived from the calibrators is equally applicable to all of the phase centres.

Flag low weights

We then flag the low weights in these data. Note that this is using a modified flag_weights.py that uses CASA.

casa --nologger --log2term -c flag_weights.py n24l2_1_1.ms 0.7 True
casa --nologger --log2term -c flag_weights.py n24l2_2_1.ms 0.7 True

Convert MS to FITS-IDI with tConvert:

Finally, we convert each of the measurement sets into a FITS-IDI file (one per phase centre).

tConvert n24l2_1_1.ms n24l2_1_1.IDI
tConvert n24l2_2_1.ms n24l2_2_1.IDI

If IDI files exceed 2 GB, they may be split into ~1.9 GB chunks (as on the EVN archive).
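
Since each phase centre is processed independently, the three steps above can be scripted and run side by side. The following is a minimal sketch that only wraps the commands shown in this section. It assumes j2ms2, casa and tConvert are on your PATH, and that running two CASA sessions concurrently in the same working directory is safe on your system (an assumption worth checking before scaling this up).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def post_process_commands(index, cor_file, ref_station="Ef", weight_threshold=0.7):
    """Build the j2ms2, flagging and tConvert commands for one phase centre."""
    ms = f"n24l2_{index}_1.ms"
    idi = f"n24l2_{index}_1.IDI"
    return [
        ["j2ms2", "-o", ms, f"eo:setup_ref_station={ref_station}", cor_file],
        ["casa", "--nologger", "--log2term", "-c", "flag_weights.py",
         ms, str(weight_threshold), "True"],
        ["tConvert", ms, idi],
    ]


def post_process(index, cor_file):
    """Run the three steps in order for one phase centre, stopping on the first failure."""
    for cmd in post_process_commands(index, cor_file):
        subprocess.run(cmd, check=True)


def post_process_all(cor_files, max_workers=2):
    """Process several phase centres concurrently, one worker thread per correlator file."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(post_process, i, f)
                   for i, f in enumerate(cor_files, start=1)]
        for future in futures:
            future.result()  # re-raise any error from the worker
```

For this tutorial, `post_process_all(["N24L2.cor_J0854+2006", "N24L2.cor_J0854_off"])` would reproduce the commands listed above. Threads are sufficient here because the heavy lifting happens in the external programs.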

F. Confirming the outcome

Finally, with the data correlated and post-processed, we can now calibrate these data to illustrate the phase shifting technique employed. To do this, copy your IDI files into the calibration folder and then change directory into that folder, so that we start the calibration process with a clean slate. With multiple phase centre observing, all phase centres are corrupted by almost the same calibration effects, so you only need to derive the calibration from a single phase centre and can then apply it to all of them. Only direction-dependent effects, such as primary beam corrections, require different corrections per phase centre.

In this tutorial, we are just going to fringefit these data to reveal the source location and are omitting other calibration steps such as flux scaling, amplitude corrections etc. If you wish to learn how to reduce the data, please refer to the JIVE VLBI school 2025.

Firstly, we will convert the two IDI files into a CASA-compatible measurement set,

importfitsidi(fitsidifile='n24l2_1_1.IDI',
              vis='n24l2_A.ms')
importfitsidi(fitsidifile='n24l2_2_1.IDI',
              vis='n24l2_B.ms')

We will then fringe-fit the phase centre with the source located at the delay tracking centre,

fringefit(vis='n24l2_A.ms',
          caltable='n24l2.sbd',
          refant='EF',
          corrdepflags=True,
          parang=True)

With the solutions obtained, we can now apply them to both measurement sets to align the phases. The visibilities should then interfere constructively at the location of the source.

applycal(vis='n24l2_A.ms',
         gaintable=['n24l2.sbd'],
         parang=True)
applycal(vis='n24l2_B.ms',
         gaintable=['n24l2.sbd'],
         parang=True)

To visualise this, we can generate an image of both phase centres.

tclean(vis='n24l2_A.ms',
       field='J0854+2006',
       imagename='J0854+2006_IM_dirty',
       imsize=2500,
       cell='1mas',
       niter=0)
tclean(vis='n24l2_B.ms',
       field='J0854_OFF',
       imagename='J0854_OFF_IM_dirty',
       imsize=2500,
       cell='1mas',
       niter=0)

These images will be extremely ugly as we are only using a few seconds of data. However, if you visualise this with the CASA viewer or CARTA, you should see that there is an amplitude spike at the source position, which is offset from the delay tracking centre by 0.5 arcseconds (as expected!). We visualise this in Figure F1, which shows a Declination slice at a fixed Right Ascension showing the fringe spikes at the expected source location with one phase centre being offset. In Figure F2, we clean this data-set to illustrate the location of the source. Note that we have fixed the restoring beam.
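
As a sanity check on where to look, the 0.5 arcsecond offset can be converted into pixels for the imaging parameters used above (cell='1mas', imsize=2500). This is only arithmetic on the values quoted in this section, and the function names are illustrative.

```python
def offset_in_pixels(offset_arcsec, cell_mas):
    """Convert an angular offset in arcseconds to pixels for a given cell size in mas."""
    return offset_arcsec * 1000.0 / cell_mas


def fits_in_image(offset_arcsec, cell_mas, imsize):
    """True if the offset lies within half the image width of the image centre."""
    return abs(offset_in_pixels(offset_arcsec, cell_mas)) < imsize / 2


pixels = offset_in_pixels(0.5, 1.0)
print(f"Expected offset from the image centre: {pixels:.0f} pixels")
print(f"Inside a 2500-pixel image: {fits_in_image(0.5, 1.0, 2500)}")
```

With 1 mas cells the spike should sit about 500 pixels from the centre of the offset image, comfortably inside the 1250-pixel half-width, so a larger offset would call for a bigger imsize or a coarser cell.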

Figure F1 - A central Right Ascension slice of the two (uncleaned) images of the two phase centres showing the increase in fringe amplitude height at the expected location of the source.

Figure F2 - Cleaned images of both phase centres. The restoring beam is circular for visualisation purposes, with the peak pixel identified by the red arrow.

Cheat script: Run the n24l2_calibration.py using CASA i.e., casa -c n24l2_calibration.py

G. Current & future developments

singularity exec --env CALC_DIR=/home/azimuth/n24l2/sfxc/sfxc/lib/calc10/data --bind /home:/home sfxc_ipp.simg mpirun sfxc n24l2_mpcc.ctrl n24l2_mpcc.vix
singularity run --app j2ms2 jive-casa.simg -o n24l2_1_1.ms N24L2.cor_J0854+2006
singularity run --app j2ms2 jive-casa.simg -o n24l2_2_1.ms N24L2.cor_J0854_off
casa --nologger --log2term -c flag_weights.py n24l2_1_1.ms 0.7 True
casa --nologger --log2term -c flag_weights.py n24l2_2_1.ms 0.7 True
singularity run --app tConvert jive-casa.simg n24l2_1_1.ms n24l2_1_1.IDI
singularity run --app tConvert jive-casa.simg n24l2_2_1.ms n24l2_2_1.IDI

Conversion generally uses helper software from the jive-casa repository: https://code.jive.eu/verkout/jive-casa.

H. Resources

Some technical papers/memos on wide-field correlation

1. Deller, A. T., et al., “DiFX-2: A More Flexible, Efficient, Robust, and Powerful Software Correlator”, PASP, 123(901), 275 (2011). DOI: 10.1086/658907
2. Morgan, J. S., et al., “VLBI imaging throughout the primary beam using accurate UV shifting”, A&A, 526, A140 (2011). DOI: 10.1051/0004-6361/201015138
3. Keimpema, A., et al., “The SFXC software correlator for very long baseline interferometry: algorithms and implementation”, Experimental Astronomy, 39(2), 259–279 (2015). DOI: 10.1007/s10686-015-9462-z
4. Campbell, B., “Field of View Calculations for SFXC”, EVN JIVE documentation, revised 3 January 2019. Retrieved from the EVN website. DOI: not available.
5. Wrobel, J. M., “VLBI Observing Strategies”, Very Long Baseline Interferometry and the VLBA, ASP Conf. Ser., 82, 411 (1995). Bibcode: 1995ASPC...82..411W

A selection of science papers using wide-field correlation

6. Middelberg, E., et al., “Mosaiced wide-field VLBI observations of the Lockman Hole/XMM”, A&A, 551, A97 (2013). DOI: 10.1051/0004-6361/201220374
7. Middelberg, E., et al., “Wide-field VLBA observations of the Chandra Deep Field South”, A&A, 526, A74 (2011). DOI: 10.1051/0004-6361/201015406
8. Deller, A. T., & Middelberg, E., “mJIVE-20: A Survey for Compact mJy Radio Objects with the Very Long Baseline Array”, AJ, 147, 14 (2014). DOI: 10.1088/0004-6256/147/1/14
9. Herrera Ruiz, N., et al., “The faint radio sky: VLBA observations of the COSMOS field”, A&A, 607, A132 (2017). DOI: 10.1051/0004-6361/201731163
10. Radcliffe, J. F., et al., “An ultra-deep multi-phase centre VLBI survey of GOODS-N: Observations and data reduction”, A&A, 619, A48 (2018). Bibcode: 2018A&A...619A..48R
11. Radcliffe, J. F., et al., “Nowhere to hide: Radio-faint AGN in the GOODS-N field”, A&A, 649, A27 (2021). DOI: 10.1051/0004-6361/202038591
12. Fenech, D., et al., “Wide-field global VLBI and MERLIN combined monitoring of supernova remnants in M82”, MNRAS, 408(1), 607–621 (2010). DOI: 10.1111/j.1365-2966.2010.17144.x
13. Morgan, J. S., et al., “Wide-field VLBI Observations of M31: A Unique Probe of the Ionized Interstellar Medium of a Nearby Galaxy”, ApJ, 768(1), 12 (2013). DOI: 10.1088/0004-637X/768/1/12
14. Chi, S., et al., “Deep, wide-field, global VLBI observations of the Hubble Deep Field North (HDF-N) and flanking fields (HFF)”, A&A, 550, A68 (2013). DOI: 10.1051/0004-6361/201220783
15. Spingola, C., et al., “A novel search for gravitationally lensed radio sources in wide-field VLBI imaging from the mJIVE-20 survey”, MNRAS, 483(2), 2125–2153 (2019). DOI: 10.1093/mnras/sty3189


Content built by Jack Radcliffe.

Built with ♥ — Markdown + HTML + CSS + Prism.js + a bit of AI + Jack Radcliffe (2025)