Cross-matching catalogues
Cross-matching catalogues#
Before we can crossmatch our two catalogues we first have to import the raw data. Every astronomy catalogue tends to have its own unique format so we’ll need to look at how to do this with each one individually.
We’ll look at the AT20G bright source sample survey first.
The raw data we’ll be using is the file table2.dat
from this page in the VizieR archives, but we’ll use the filename bss.dat
from now on.
Every catalogue in VizieR has a detailed README file that gives you the exact format of each table in the catalogue.
The full catalogue of bright radio sources contains 320 objects. The first few rows look like this (scroll right to see it all):
1 00 04 35.65 -47 36 19.1 0.87 0.04 0.97 0.06 0.90 0.04 0.995 0.030 17.63 Q 1.F.11.C PKS 0002-478
2 00 10 35.92 -30 27 48.3 0.74 0.03 0.72 0.04 0.63 0.03 0.315 0.009 0.419 0.013 1.19 La01 19.59 Q 1.F.11.. PKS 0008-307
3* 00 11 01.27 -26 12 33.1 0.64 0.07 0.82 0.07 0.69 0.03 0.210 0.006 1.096 Wr83 19.53 Q 4.F.44.C PKS 0008-264
The catalogue is organised in fixed-width columns, with the format of the columns being:
1: Object catalogue ID number (sometimes with an asterisk)
2-4: Right ascension in HMS notation
5-7: Declination in DMS notation
8-: Other information, including spectral intensities
We only need coordinates for crossmatching.
We can load specific columns with the usecols argument in NumPy’s loadtxt
function:
import numpy as np
cat = np.loadtxt('bss.dat', usecols=range(1, 7))
print(cat[0])
We’ve skipped the ID column, since the ID number is always the same as the row number.
Fixed-width columns and loadtxt
loadtxt
does not work for fixed-width columns if values are missing.
Since there are no missing ID, RA and dec values it is fine for loading the first few columns of the BSS catalogue.
%%capture # Suppress output
# Save your previous code, so you can run them without having to copy and paste again.
%run 2a.ipynb
%run 2c.ipynb
# Write your crossmatch function here.
from scipy.spatial import cKDTree
import numpy as np
def import_bss():
dat = np.loadtxt('data/bss.dat', usecols=range(1,7))
ret = []
for idx, row in enumerate(dat):
hRA, mRA, sRA, hDEC, mDEC, sDEC = row
RA = hms2dec(hRA, mRA, sRA)
DEC = dms2dec(hDEC, mDEC, sDEC)
ret.append( (idx+1, RA, DEC) )
return ret
def import_super():
dat = np.loadtxt('data/super.csv', delimiter=',', skiprows=1, usecols=[0,1])
ret = zip(np.linspace(1,len(dat), len(dat)), dat[:,0], dat[:,1])
return list(ret)
def crossmatch(cat1, cat2, max_dist):
dat1 = np.array(cat1)[:,1:]
dat2 = np.array(cat2)[:,1:]
tree = cKDTree(dat2)
d2d, idx = tree.query(dat1, distance_upper_bound=max_dist*4) # allow 400% margin of error
dists = []
for id1, id2 in enumerate(idx):
if id2<len(dat2):
ra1, dec1 = dat1[id1]
ra2, dec2 = dat2[id2]
dist = angular_dist(ra1,dec1,ra2,dec2)
if dist<max_dist:
dists.append(dist)
else:
dists.append(5*dist)
else:
dists.append(10*max_dist)
dists = np.array(dists)
matches = list(zip(1+np.where(dists<=max_dist)[0], 1+idx[dists<=max_dist], dists[dists<=max_dist].tolist()))
no_matches = (1+np.where((idx==tree.n)|(dists>max_dist))[0]).tolist()
return matches, no_matches
bss_cat = import_bss()
super_cat = import_super()
# First example in the question
max_dist = 40/3600
matches, no_matches = crossmatch(bss_cat, super_cat, max_dist)
print(matches[:3])
print(no_matches[:3])
print(len(no_matches))
# Second example in the question
max_dist = 5/3600
matches, no_matches = crossmatch(bss_cat, super_cat, max_dist)
print(matches[:3])
print(no_matches[:3])
print(len(no_matches))
[(1, 2, 0.00010988610938711933), (2, 4, 0.0007649845967243825), (3, 5, 0.00020863352870694596)]
[5, 6, 11]
9
[(1, 2, 0.00010988610938711933), (2, 4, 0.0007649845967243825), (3, 5, 0.00020863352870694596)]
[5, 6, 11]
40
Write a find_closest
function that takes a catalogue and the position of a target source (a right ascension and declination) and finds the closest match for the target source in the catalogue.
Your function should return the ID of the closest object and the distance to that object.
The right ascension and declination are in degrees.
The catalogue list has been loaded by import_bss from the previous question.
The full 320 object BSS catalogue is contained in data/bss.dat
for you to test your code on.
Here’s an example of how your function should work:
>>> cat = import_bss()
>>> find_closest(cat, 175.3, -32.5)
(156, 3.7670580226469053)
And here’s another example:
>>> cat = import_bss()
>>> find_closest(cat, 32.2, 40.7)
(26, 57.729135775621295)
# Write the find_closest function here.
def find_closest(cat, ra, dec):
pass
You can use this to test your function.
cat = import_bss()
# First example from the question
print(find_closest(cat, 175.3, -32.5))
# Second example in the question
print(find_closest(cat, 32.2, 40.7))
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/Users/arunkannawadi/Desktop/code/data-driven-astronomy/data_driven_astronomy/notebooks/week2/2d.ipynb Cell 9 in <cell line: 2>()
<a href='vscode-notebook-cell:/Users/arunkannawadi/Desktop/code/data-driven-astronomy/data_driven_astronomy/notebooks/week2/2d.ipynb#ch0000011?line=0'>1</a> # First example from the question
----> <a href='vscode-notebook-cell:/Users/arunkannawadi/Desktop/code/data-driven-astronomy/data_driven_astronomy/notebooks/week2/2d.ipynb#ch0000011?line=1'>2</a> print(find_closest(cat, 175.3, -32.5))
<a href='vscode-notebook-cell:/Users/arunkannawadi/Desktop/code/data-driven-astronomy/data_driven_astronomy/notebooks/week2/2d.ipynb#ch0000011?line=3'>4</a> # Second example in the question
<a href='vscode-notebook-cell:/Users/arunkannawadi/Desktop/code/data-driven-astronomy/data_driven_astronomy/notebooks/week2/2d.ipynb#ch0000011?line=4'>5</a> print(find_closest(cat, 32.2, 40.7))
NameError: name 'cat' is not defined