Observation
The VLBI observation of JVAS B1938+666 used for this work was performed using a global VLBI array combining antennas from the Very Long Baseline Array (VLBA) and European VLBI Network (EVN) at 1.7 GHz for 14 h at a data recording rate of 512 Mbits s−1 (ID: GM068; Principal Investigator: McKean). The observing strategy followed a standard phase referencing mode only for the VLBA part (phase calibrator J1933+654). Throughout the observations, scans on the fringe finder 3C454.3 were also included every ~4 h for the bandpass calibration. The correlation of the data from the 19 antennas was performed at the Joint Institute for VLBI-European Research Infrastructure Consortium (JIV-ERIC) and resulted in a dataset with eight intermediate frequencies, each with 8-MHz bandwidth and divided into 32 channels for two circular polarizations (RR and LL).
The data were processed within the Astronomical Image Processing System (aips)43 package following a typical calibration procedure for phase-referenced observations. The resulting deconvolved image was obtained with a restoring beam of 7.4 mas × 4.7 mas at a position angle of 32.1 degrees east of north33. The noise, assumed to be Gaussian and uncorrelated, was measured directly in the Fourier domain by subtracting time-adjacent visibilities to remove the source signal, then computing the root mean square (r.m.s.) within 30-minute time intervals. The triangle of highly sensitive baselines between Effelsberg, Jodrell Bank and Westerbork were flagged to prevent them from dominating the model. The baselines Green Bank–Hancock, Green Bank–Owens Valley, Green Bank–Pie Town and Los Alamos–Pie Town were also flagged for several 1- to 2-h time intervals owing to strong radio frequency interference.
The surface-brightness distribution of the background radio source consists of two radio lobes (hot spots), the brighter of which is lensed into an extended gravitational arc, whereas the fainter one is doubly imaged (Fig. 1). The physical properties of the source are derived and discussed in ref. 33.
Bayesian inference
We use the visibility-plane gravitational lens modelling software PRONTO34,35,36,37,38 to jointly infer the pixellated source surface-brightness vector s and lens parameters ηH from the observed visibilities d, as well as to obtain the Bayesian log-evidence for each model parametrization H. At each likelihood evaluation (the first level of inference), we obtain the maximum a posteriori source sMP for a given ηH and source regularization weight λs by solving
$${A}\,{{\mathbf{s}}}_{{\rm{MP}}}=(DL)^{\rm{T}}{C}^{-1}{\mathbf{d}},$$
(1)
where
$$A\equiv \left[{(DL)}^{\rm{T}}C^{-1}DL+{\lambda }_{{\mathbf{s}}}R_{{\mathbf{s}}}\right].$$
(2)
Here \(L({{\boldsymbol{\eta }}}_{H})\) is the lens operator, which maps light from the source plane to the lens plane, \({\lambda }_{{\boldsymbol{s}}}R_{{\boldsymbol{s}}}\) is the source prior covariance, which enforces the lens equation by penalizing strong surface-brightness gradients and \(C^{-1}\) is the noise covariance of the data. We solve equation (1) using a preconditioned conjugate gradient solver, where the Fourier operator \(D\) is implemented using a non-uniform fast Fourier transform. We refer the reader to ref. 37 for further details on the method.
In the second level of inference, we sample the lens parameters ηH and source regularization weight λs. The posterior is
$$P({{\boldsymbol{\eta }}}_{H},{\lambda }_{{\mathbf{s}}}| {\mathbf{d}})=\frac{P({\mathbf{d}}| {{\boldsymbol{\eta }}}_{{H}},{\lambda }_{{\mathbf{s}}})\,P({{\boldsymbol{\eta }}}_{{H}})\,P({\lambda }_{{\mathbf{s}}})}{P({\mathbf{d}}| {H}\;)}.$$
(3)
The likelihood (which is the evidence from the source-inversion step) is
$$\begin{array}{rcl}2\log P({\mathbf{d}}| {{\boldsymbol{\eta }}}_{{\rm{H}}},{\lambda }_{{\mathbf{s}}})&=&-{\chi }^{2}-{\lambda }_{{\mathbf{s}}}{{\mathbf{s}}}_{{\rm{MP}}}^{\rm{T}}R_{{\mathbf{s}}}\,{{\mathbf{s}}}_{{\rm{MP}}}-\log \det A\\ &&+\log \det ({\lambda }_{{\mathbf{s}}}R_{{\mathbf{s}}})+\log \det (2\uppi C^{-1}).\end{array}$$
(4)
This expression follows from the marginalization over all possible sources s when the noise and source prior are both Gaussian. As \({{{{\boldsymbol{R}}}}}_{{\mathbf{s}}}\) and \({{{{\boldsymbol{C}}}}}^{-1}\) are sparse, the terms containing them are straightforward to evaluate. Computing \(\log \det A\) is non-trivial; we approximate it using the preconditioner from the inference on sMP as described in ref. 37. The χ2 term is
$${\chi }^{2}={(DL{{\mathbf{s}}}_{{\rm{MP}}}-{\mathbf{d}})}^{\rm{T}}C^{-1}(DL{{\mathbf{s}}}_{{\rm{MP}}}-{\mathbf{d}}).$$
(5)
We optimize its evaluation using the fast-χ2 technique from ref. 38.
We compared different model parametrizations, H, using the log-evidence \(\log {{\mathcal{E}}}_{{\rm{H}}}\equiv \log P({\boldsymbol{d}}| {\rm{H}})\), which was obtained by marginalizing over all parameters. We used the nested sampling algorithm MULTINEST44 to compute the log-evidence and to sample the posterior distributions of ηH and λs.
Our method takes both noise and priors to be Gaussian. From a practical standpoint, these assumptions make our analysis computationally tractable by allowing the marginalization over all possible configurations of the source and pixellated potential corrections (‘Gravitational imaging’) using a single linear solution. Although it is the case that non-Gaussian noise statistics can, in principle, arise from calibration errors or correlated instrumental effects, we do not expect this to be a major issue given the high signal-to-noise ratio of the observation. Nevertheless, we tested our analysis on several independent sets of calibration solutions, including on individual spectral windows, and found our results to be robust across all cases. Most reassuringly, we found that the gap in the bright arc induced by object \({\mathcal{V}}\) cannot be suppressed by the self-calibration process.
Similarly, although there exist techniques for applying highly informative generative machine-learning models as non-Gaussian priors for ‘realistic’ sources and potentials (for example, ref. 45), these are susceptible to systematic biases learned from the training data (which are themselves taken from numerical simulations). False-positive subhalo detections are a particularly relevant concern for this work. Instead, we opt for highly uninformative Gaussian priors, which are agnostic to the shapes of the source surface brightness and the linearized potential corrections, which thus maximizes the ability of the data to drive the inference.
Parametric lens model components
We now introduce the surface mass density profiles used for parametric modelling in this work. All quantities labelled κ are in units of the critical surface mass density for strong lensing,
$${\varSigma }_{{\rm{cr}}}=\frac{{c}^{2}\,{D}_{\rm{s}}}{4\uppi \,G\,{D}_{\rm{ls}}\,{D}_{\rm{l}}},$$
(6)
where Ds, Dl and Dls are the angular diameter distances from observer to source, from observer to lens and from lens to source, respectively. Assuming a Planck 2015 (ref. 46) cosmology, \({\varSigma }_{{\rm{cr}}}=1.50\times 1{0}^{11}\,{M}_{\odot }\,{{\rm{arcsec}}}^{-2}\) for JVAS B1938+666 (zs = 2.059, zl = 0.881).
We model the smooth galaxy-scale mass distribution (the macromodel) as an elliptical power-law mass distribution (for example, ref. 47), with a projected surface mass density of
$$\kappa (\xi\;)=\frac{3-\gamma }{2}{\left(\frac{{R}_{{\rm{E}}}}{\xi }\right)}^{\gamma -1},$$
(7)
which we define in terms of an effective Einstein radius RE and the elliptical radius ξ2 ≡ x2q + y2/q, where q is the axis ratio, and γ is the three-dimensional logarithmic slope, where γ = 2 is isothermal. The additional parameters x0, y0 and θ0 set the position and orientation of the profile. We use FASTELL48 to compute the corresponding deflection angles.
To account for non-ellipticity in the lens galaxy (for example, diskiness or boxiness), we include multipole perturbations of orders m = 3 and 4, parametrized as
$${\kappa }_{m}(r,\theta )={\left(\frac{r}{1{\rm{arcsec}}}\right)}^{-(\gamma -1)}\left[{a}_{m}\sin (m\theta )+{b}_{m}\cos (m\theta )\right],$$
(8)
which we write in circular polar coordinates here for readability. The coefficients am and bm encode the amplitude and orientation of each multipole term and γ is fixed to that of the underlying elliptical power-law model (equation (7)). We use Gaussian priors for am and bm with μ = 0 and σ = 0.01, a choice motivated by the results of numerical simulations of massive elliptical galaxies (for example, ref. 49), which is also consistent with previous lens modelling results23,38,50. In addition, we include an external shear term in the macromodel, with strength Γ and direction θΓ.
For parametric modelling of the low-mass perturbers \({\mathcal{A}}\) and \({\mathcal{V}}\) (Table 1), we use the following spherically symmetric mass profiles. Models PJ_free and PJ_tidal are truncated isothermal profiles (PJ; for example, refs. 47,51) with a projected surface mass density of
$$\kappa (r)=\frac{{m}_{{\rm{tot}}}}{2\pi \,{r}_{t}^{2}{\Sigma }_{{\rm{cr}}}}\,\left(\frac{1}{x}-\frac{1}{\sqrt{{x}^{2}+1}}\right),$$
(9)
where rt is the truncation radius, x = r/rt and mtot is the total mass. We leave rt as a free parameter for PJ_free, whereas for PJ_tidal we set rt to be the tidal radius
$${r}_{t}=\frac{2R}{3}{\left(\frac{{m}_{{\rm{tot}}}}{2{M}_{{\rm{lens}}}( < R)}\right)}^{\frac{1}{3}}.$$
(10)
Here R is the three-dimensional radius to the centre of the lens and Mlens (<R) is the mass of the main lens enclosed within R; see ref. 20. Because we have no knowledge of the three-dimensional position of a low-mass perturber within the lens, we take R to be the projected distance to the perturber in the plane of the lens.
In the case of NFW_sub, we model perturber \({\mathcal{A}}\) as a spherical NFW profile47,52, with a surface mass density of
$$\kappa (x)=2{\kappa }_{s}\frac{1-F(x)}{{x}^{2}-1},$$
(11)
where x = r/rs, κs = ρsrs/Σcr and
$$F(x)=\left\{\begin{array}{ll}\frac{1}{\sqrt{{x}^{2}-1}}\,{\tan }^{-1}\sqrt{{x}^{2}-1}\quad &x > 1\\ \frac{1}{\sqrt{1-{x}^{2}}}\,{\tanh }^{-1}\sqrt{1-{x}^{2}}\quad &x < 1\\ 1\quad &x=1.\end{array}\right.\,$$
(12)
We found that using a mass-concentration-redshift relationship for CDM haloes (for example, ref. 53) leads to virial concentrations that are much too low to produce the required lensing effect for a halo of a given virial mass (see also refs. 20,23). Therefore, we leave the concentration of NFW_sub as a free parameter in this work. The redshift is fixed to that of the main lens.
GI
GI (refs. 17,24,34,39,54) is a technique for recovering pixellated (non-parametric) corrections to the lensing convergence. GI is well suited for finding local overdensities that have not been accounted for in the parametric mass model, ηH, which remains fixed during the GI procedure. Our GI formulation augments the lens operator (for example, equation (1)) with an extra block column representing linearized degrees of freedom in the lensing potential:
$$L_{{\rm{GI}}}\equiv \left[L\,| \,-LG_{{\mathbf{s}}}G_{{\boldsymbol{\psi }}}\right],$$
(13)
where
$${G}_{{\mathbf{s}}}\equiv \frac{\partial {\mathbf{s}}}{\partial {\mathbf{x}}}$$
(14)
is the matrix of source gradients and
$${G}_{{\boldsymbol{\psi }}}\equiv \frac{\partial {\boldsymbol{\alpha }}}{\partial {\boldsymbol{\psi }}},$$
(15)
is the matrix of derivatives of the deflection angles α with respect to the pixellated lensing potential ψ. Operator \(L_{{\rm{GI}}}\) acts on the concatenated vector of the pixellated source and linearized potential corrections
$${\mathbf{r}}\equiv \left(\begin{array}{c}{\mathbf{s}}\\ {\boldsymbol{\psi }}\end{array}\right),\,\,\Delta {\mathbf{r}}\equiv \left(\begin{array}{c}\Delta {\mathbf{s}}\\ \Delta {\boldsymbol{\psi }}\end{array}\right)$$
(16)
Similarly, we define the block regularization matrix
$${R}_{{\rm{GI}}}\equiv \left(\begin{array}{cc}{\lambda }_{{\mathbf{s}}}{R}_{{\mathbf{s}}}&0\\ 0&{\lambda }_{{\boldsymbol{\psi }}}{R}_{{\boldsymbol{\psi }}}\end{array}\right).$$
(17)
For this work, we apply three different forms for the potential regularization operator \({R}_{{\boldsymbol{\psi }}}\), checking them against one another to ensure the robustness of our results. These penalize total mass, surface density gradients or surface density curvature. In each case, \(R_{{\boldsymbol{\psi }}}\) is assembled using standard finite-difference operators for the gradient and Laplacian on a Cartesian grid.
It can be shown that the proper linearization over ψ yields a linear system analogous to equation (1):
$$\left[{(DL_{{\rm{GI}}})}^{\rm{T}}C^{-1}(DL_{{\rm{GI}}})+R_{{\rm{GI}}}\right]\,\Delta {{\mathbf{r}}}_{{\rm{MP}}}=-{(DL_{{\rm{GI}}})}^{\rm{T}}C^{-1}(DL{\mathbf{s}}-{\mathbf{d}})-{R}_{{\rm{GI}}}\,{\mathbf{r}}.$$
(18)
Formally, equation (18) represents a single Newton iteration in the space of s and ψ. After iterating until convergence, it can be shown that the likelihood with respect to the regularization weights, \(\log P({\mathbf{d}}| {\lambda }_{{\mathbf{s}}},{\lambda }_{{\boldsymbol{\psi }}})\) is obtained by substituting s → r, \({\lambda }_{{\mathbf{s}}}{R}_{{\mathbf{s}}}\to {R}_{{\rm{GI}}}\) and \({L}\to {R}_{{\rm{GI}}}\) in equation (4). A single likelihood evaluation in our GI procedure thus consists of the following.
-
(1)
Initialize r = 0.
-
(2)
Solve for ΔrMP.
-
(3)
Update r = r + ΔrMP.
-
(4)
Repeat steps 2 and 3 until convergence. We find that 10 iterations are generally sufficient for convergence; in practice, we use 20.
-
(5)
Evaluate \(\log P({\mathbf{d}}| {\lambda }_{{\mathbf{s}}},{\lambda }_{{\boldsymbol{\psi }}})\) using equation (4).
Note that our GI implementation differs from that of ref. 54, who used a single iteration per likelihood evaluation. Our approach of iterating to convergence allows the GI algorithm to more easily recover compact features in the lensing potential.
We use a standard simplex optimization algorithm to maximize the likelihood with respect to the regularization weights λs and λψ, where each likelihood evaluation requires 20 subiterations as described above. We used starting values for λs and λψ of 109 and 1013, respectively, with initial logarithmic step sizes of 0.5 dex. After optimizing for λs and λψ, we convert ψ into corrections to the lensing convergence (surface mass density) by means of \({\kappa }_{{\rm{GI}}}=-\frac{1}{2}{\nabla }^{2}{\boldsymbol{\psi }}\).
Our use of Gaussian priors for the linearized potential corrections was motivated by the need for an uninformative and computationally fast regularization term. A natural consequence of this choice is that ψ and hence κGI, prefer to resemble Gaussian random fields in regions where the data are uninformative. To aid in our interpretation of the GI results, we apply a 3σGI threshold to the convergence maps, treating any convergence over the threshold as a real feature. We compute σGI from the r.m.s. of the convergence corrections within the mask used for modelling the lensed images. The value of σGI must be empirically estimated in this way, as the operator-based iterative solver framework (‘Bayesian inference’) precludes direct manipulation of the posterior covariance matrix.
Expected number counts of detectable subhaloes
We express the differential mass function for WDM subhaloes in terms of the ‘half-mode mass’ Mhm, as
$$\frac{{\rm{d}}n}{{\rm{d}}m}={m}^{\alpha }{\left[1+{\left({\alpha }_{2}\frac{{M}_{{\rm{hm}}}}{m}\right)}^{\beta }\right]}^{\gamma }$$
(19)
with α = −1.9, α2 = 1.1, β = 1.0 and γ = −0.5 (refs. 55,56). The mass m is the total mass of a PJ subhalo. The expected number of subhaloes in the mass range \({M}_{\min }=1{0}^{6}\,{{M}}_{\odot }\) to \({M}_{\max }=1{0}^{7}\,{{M}}_{\odot }\) is then
$${\mu }_{{\rm{sub}}}={A}_{{\rm{sens}}}\,{f}_{{\rm{sub}}}\,\frac{{M}_{{\rm{lens}}}( < 2{R}_{{\rm{E}}})}{4\uppi {R}_{{\rm{E}}}^{2}}\frac{\mathop{\int}\nolimits_{\ln {M}_{\min }}^{\ln {M}_{\max }}m\frac{{\rm{d}}n}{{\rm{d}}m}\,{\rm{d}}\ln m}{\mathop{\int}\nolimits_{\ln {M}_{\min }}^{\ln {M}_{\max }}{m}^{\alpha +2}\,{\rm{d}}\ln m},$$
(20)
where RE is the Einstein radius of the lens and Mlens(<2RE) is the total projected mass of the lens within twice the Einstein radius. The variable fsub is the fraction of dark matter (normalized with respect to the CDM mass function) contained in subhaloes; we use fsub = 0.012 (ref. 57). The variable Asens is the area around the lensed arcs that we expect to be sensitive to the presence of subhaloes, which we obtain by thresholding the deconvolved image at 5σ above the noise for a total area of \({A}_{{\rm{sens}}}=1.06\times 1{0}^{-2}\,{{\rm{arcsec}}}^{2}\). This area roughly corresponds to one primary beam width (~5 mas) along each bright arc. Defining Asens in such a tight region around the brightest parts of the lensed arcs is an intentionally conservative choice made to ensure that all subhaloes ≳106 M⊙ within this region have been detected. Our use of a constant sensitivity within Asens is a rough but conservative approximation to more sophisticated sensitivity mapping techniques (for example, refs. 28,56).
Parametric modelling: macromodel and \({\mathcal{A}}\) only
The results of the fully parametric modelling are summarized in Table 1. We first consider those fully parametric models consisting of the macromodel and the previously detected perturber \({\mathcal{A}}\) (‘Parametric lens model components’). Because \({\mathcal{A}}\) lies ~50 mas away from the nearest lensed radio emission, its redshift cannot be robustly constrained by our VLBI observation alone. For this work, we therefore assume that \({\mathcal{A}}\) lies at the redshift of the main lens.
The best parametric models for \({\mathcal{A}}\), in terms of Bayesian log-evidence \(\Delta \log {\mathcal{E}}\), are truncated isothermal subhalo models (PJ), with either free or fixed (tidal) truncation radii being equally preferred (\({\mathcal{A}}=\) PJ_free or \({\mathcal{A}}=\) PJ_tidal; \(\Delta \log {\mathcal{E}}=16\)). We additionally tested an NFW profile with free virial concentration (\({\mathcal{A}}=\)NFW_sub; \(\Delta \log {\mathcal{E}}=13\)) as well as a completely smooth macromodel (\({\mathcal{A}}=\)None; \(\Delta \log {\mathcal{E}}\equiv 0\)). For simplicity, we chose the best model with fewer free parameters, \({\mathcal{A}}=\) PJ_tidal, as our fiducial model for the GI procedure. We found that the choice of profile \({\mathcal{A}}\) does not affect the inferred properties of \({\mathcal{V}}\).
GI detection of \({\mathcal{V}}\)
The results of the GI modelling, applied to the parametric model with \({\mathcal{A}}=\) PJ_tidal, are shown in Fig. 1 and Supplementary Fig. 1. For three different regularization types (penalizing the convergence, gradient of convergence or curvature of convergence) and two different grid resolutions (NGI = 512 and NGI = 1,024, corresponding to pixel sizes of ΔxGI = 3.5 mas and ΔxGI = 1.8 mas, respectively), we consistently recover a compact density correction well above the 3σGI threshold used to identify bona fide features in the surface mass density that were unaccounted for during the initial parametric modelling. We label this feature \({\mathcal{V}}\), as it was discovered using a VLBI observation of JVAS B1938+666.
Defining \({m}_{80,{\mathcal{V}}}\) to be the cylindrical mass contained within a projected radius of 80 pc on the lens plane (‘Parametric modelling: macromodel, \({\mathcal{A}}\) and \({\mathcal{V}}\)’), we found \(8.3\times 1{0}^{5}\,{{M}}_{\odot }\le {m}_{80,{\mathcal{V}}}\le 1.8\times 1{0}^{6}\,{{M}}_{\odot }\) for the six GI runs. The scatter in GI masses is due to the large number of degrees of freedom in the pixellated convergence map, combined with the GI technique’s lack of prior knowledge on the approximate sphericity of gravitationally bound astrophysical objects. GI is an important technique for identifying and estimating the masses of positive density corrections to a parametric lens model; however, it is necessary to verify detection \({\mathcal{V}}\) with independent parametric modelling.
Parametric modelling: macromodel, \({\mathcal{A}}\) and \({\mathcal{V}}\)
We repeat the fully parametric modelling procedure, this time also including a parametric profile for perturber \({\mathcal{V}}\). The results are summarized in Table 1 and Supplementary Fig. 3. We assume, for this work, that perturber \({\mathcal{V}}\) lies at the lens redshift of z = 0.881, using a truncated isothermal profile (PJ; ‘Parametric lens model components’). We find that, regardless of the choice of profile for detection \({\mathcal{A}}\), the inclusion of a parametric model for \({\mathcal{V}}\) increases the log-evidence by at least \(\Delta \log {\mathcal{E}} > 332\) (>25σ) in all cases. Models \({\mathcal{A}}=\) PJ_free, \({\mathcal{V}}=\) PJ_free and \({\mathcal{A}}=\) PJ_tidal, \({\mathcal{V}}=\) PJ_free are equally preferred. As before, we chose \({\mathcal{A}}=\) PJ_tidal, \({\mathcal{V}}=\) PJ_free with fewer free parameters for a more detailed discussion, and treat it as our ‘best model’, with a significance of \(\Delta \log {\mathcal{E}}=364\) over the smooth macromodel and \(\Delta \log {\mathcal{E}}=348\) (26σ) over \({\mathcal{A}}=\) PJ_tidal, \({\mathcal{V}}=\)None. For the rest of the discussion, it is implied that \({\mathcal{A}}=\) PJ_tidal.
In the \({\mathcal{V}}=\) PJ_free parametrization, \({\mathcal{V}}\) has a total mass of \({m}_{{\mathcal{V}}}=(2.82\pm 0.26)\times 1{0}^{6}\,{M}_{\odot }\) and a truncation radius \({r}_{\rm{t},{\mathcal{V}}}=149\pm 18\,{\rm{pc}}\). To express the inferred mass in a way that is independent of \({r}_{\rm{t},{\mathcal{V}}}\), we found 80 pc to be the radius at which the enclosed (projected) mass \({m}_{80,{\mathcal{V}}}\) is uncorrelated with \({r}_{\rm{t},{\mathcal{V}}}\). For \({\mathcal{V}}=\) PJ_free, \({m}_{80,{\mathcal{V}}}=(1.13\pm 0.04)\times 1{0}^{6}\,{M}_{\odot }\), which is consistent with the gravitational imaging results (Fig. 3). For \({\mathcal{V}}=\) PJ_tidal, \({m}_{80,{\mathcal{V}}}=(1.07\pm 0.04)\) agrees with \({\mathcal{V}}=\) PJ_free, which indicates that the enclosed mass at 80 pc is indeed particularly well constrained by the data. The position of \({\mathcal{V}}\) is constrained to 194 μas precision in the y direction and 86 μas in the x direction. For all models, \({m}_{80,{\mathcal{V}}}\), \({x}_{{\mathcal{V}}}\) and \({y}_{{\mathcal{V}}}\) were consistent within the 1σ uncertainties regardless of the choice of profile \({\mathcal{A}}\).
For the gravitational imaging models, we compared three different regularization types (penalizing the convergence, gradient of convergence or curvature of convergence) and two different grid resolutions (NGI = 512 and NGI = 1,024, corresponding to pixel sizes of 3.5 mas and 1.8 mas, respectively). The profiles were consistent to within ~50% at a radius of 80 pc (vertical dashed line). The discrete steps in the GI curves are due to the pixellated nature of the convergence corrections. Shaded regions denote the 1σ uncertainties in the enclosed mass profiles.
The tidal truncation radius of \({\mathcal{V}}=\) PJ_tidal (equation (10)) is \({r}_{\rm{t},{\mathcal{V}}}=53\pm 1\,{\rm{pc}}\), which is a factor of three smaller than that inferred using PJ_free. Relative to \({\mathcal{V}}=\) PJ_free, \({\mathcal{V}}=\) PJ_tidal is disfavoured by \(\Delta \log {\mathcal{E}}=-16\), which suggests that this smaller value of \({r}_{\rm{t},{\mathcal{V}}}\) is too compact to produce the required lensing effect. This is likely to be a result of the use of the projected two-dimensional separation as a proxy for the three-dimensional distance between the lens galaxy and the perturber; any offset of \({\mathcal{V}}\) along the line of sight would increase the three-dimensional distance and hence rt, alleviating this issue.
Unlike the Keck AO observation of JVAS B1938+666, in which perturber \({\mathcal{A}}\) lies on top of the infrared arc, the projected lens-plane distance between \({\mathcal{A}}\) and the nearest lensed radio emission is 400 pc; therefore, we expect only the cylindrical mass out to a projected radius of 400 pc, which we define as \({m}_{400,{\mathcal{A}}}\), to be well constrained by the VLBI data. For \({\mathcal{A}}=\) PJ_tidal, we find \({m}_{400,{\mathcal{A}}}=(5.0\pm 0.8)\times 1{0}^{7}\,{M}_{\odot }\) and \({r}_{\rm{t},{\mathcal{A}}}=243\pm 20\,{\rm{pc}}\). For \({\mathcal{A}}=\) PJ_free, \({m}_{400,{\mathcal{A}}}=(5.2\pm 0.7)\times 1{0}^{7}\,{M}_{\odot }\) and \({r}_{\rm{t},{\mathcal{A}}}=15\pm 20\,{\rm{pc}}\). Hence, \({m}_{400,{\mathcal{A}}}\) is well constrained and consistent between the two, whereas \({r}_{\rm{t},{\mathcal{A}}}\) is unconstrained and simply reflects the tidal radius relationship and the log-uniform prior (respectively) used for these profiles. For comparison, Despali et al. (ref. 20 and subsequent private communication) find a mass of \({m}_{400,{\mathcal{A}}}=(7.7\pm 0.1)\times 1{0}^{7}\,{{M}}_{\odot }\) for their best subhalo model of the Keck AO observation.
Source link