We got data to determine the 3-dimensional coordinates, absolute magnitude, and morphological type (to some extent) of each of the 250 000 (at the time of typing) galaxies in the SDSS. The SQL command used to grab this from the SDSS servers was:
select all s.ra,s.dec,round(s.z,7),round(s.eClass,7),p.petroRad_r,p.petroMag_r from specObj s join photoObj p on s.bestObjID=p.objID where specClass=2
Actually, not quite: we ran it in three parts. Note that this should not be taken as an example of how to write an SDSS query - it is not optimized!
select all s.ra,s.dec,round(s.z,7),round(s.eClass,7),p.petroRad_r,p.petroMag_r from specObj s join photoObj p on s.bestObjID=p.objID where specClass=2 and s.z <= 0.1 select all s.ra,s.dec,round(s.z,7),round(s.eClass,7),p.petroRad_r,p.petroMag_r from specObj s join photoObj p on s.bestObjID=p.objID where specClass=2 and s.z <= 0.3 and s.z > 0.1 select all s.ra,s.dec,round(s.z,7),round(s.eClass,7),p.petroRad_r,p.petroMag_r from specObj s join photoObj p on s.bestObjID=p.objID where specClass=2 and s.z > 0.3
The round commands refer to the precision with which data is reported. The number of decimal places given by Sloan is 6 or 7, so going further is pointless.
specClass is 2 for galaxies, and 3 or 4 for quasars.
Note that we were able to get commands like the above to work because we called it from a Fermilab server only accessible to those in institutions affiliated with the SDSS. Other servers limit the number of results returned at a time.
You have the choice of getting your results back from the Sloan server(s) in HTML or CSV (comma separated values) format.
perl csv2dat.pl sloangalaxies.csv sloangalaxies.dat 2
Either way, suppose you end up with a file "sloangalaxies.dat" that Matlab can read.
Suppose you want to bin the galaxies into N types. You have, somewhere else, found N pictures representing galaxies, and want to place each picture in a square at some orientation - that orientation forms a plane defined by vectors u=(u1 u2 u3) and v=(v1 v2 v3) in R3.
Get your pictures, and save them as Q x Q .sgi files. Suppose they are called gal1.sgi, gal2.sgi, ... ,etc. Q must be a power of 2. We got our pictures using the SDSS search tools and a image retrieval utility that Mark had written for an earlier project of his. (SDSS also has an image retrieval utility, but this one is much easier to use. You save a .csv file with ra and dec positions found by the first SDSS utility referred to above.)
Here is the distribution of eclass values in the 250K SDSS release 2 galaxies:
Open Matlab, and make sure the .m files above are in your path (with an addpath command or using the GUI). Then type
S = load ('sloangalaxies.dat'); z = S(:,3); posred = find(z>=0); S = S(posred,:); okpetmag = find (petmag > -9999); S = S(okpetmag,:); M = radecz2xxyyzz(S);
Note the 'single quotes' - Matlab hates "double quotes".
Only galaxies with positive redshift were considered (there are galaxies with negative redshift - but very few, e.g. 1 in the 250 000 galaxies in Sloan's second data release).
The 24 galaxies with no Petrosian magnitude measurements (default = -9999) were also removed.
M is a G x 8 matrix representing G galaxies, with for each galaxy:
What you have is a matrix with the columns above. What you want is one with the columns below.
Luminosity was set to 1000.2 * positive absolute magnitude, where
positive absolute magnitude = - absolute magnitude
absolute magnitude = apparent magnitude - 5 log10(luminosity distance) - k-correction - 25
luminosity distance = (1+z)(comoving distance in megaparsecs))
k-correction = -2.5 (1+alpha) log10 (1+z)
alpha was taken to be -0.5. (Actually, it should only be -0.5 for quasars, and should depend on eclass, but that would have taken too much work.)
A few notes to non-astronomers: The 'absolute' in 'absolute magnitude' refers to the brightness of an object independent of how far away it is, and not to being positive. Also, brighter stars have a lower magnitude. Greek astronomers decided that the brightest stars had (apparent) magnitude 0 and the faintest had (apparent) magnitude 6. The graph below shows the distribution of positive absolute magnitudes for the 250K galaxies in the SDSS release 2 set.
We wanted to make size proportional to radius (Petrosian radius behaves more consistently than other radii for galaxies at different distances), but couldn't as we couldn't work out how to get Partiview to get polysize set for each point.
Assume for concreteness that N, the number of galaxy bins, is 10 (the code will work for other positive integer values of N quite fine). In Matlab, type
petmag = M(:,6); dist = M(:,7); z = M(:,8); al = -0.5 kcorr = -2.5 * (1+al) * log (1+z) / log(10); lumdist = (1+z) .* dist; absmag = petmag - 5*log(lumdist)/log(10) - kcorr - 25; lumin = 100.^(-0.2 * absmag); M(:,5) = -absmag; M(:,6) = lumin; numbins = 10; M = addcoarsecol (M,4,numbins); P = size(M,1); M = horzcat (M, randompolyori(P)); save sloangalaxies.speck M -ascii
So now you have a text file sloangalaxies.speck that has all the galaxy positions, picture types (labels), and orientations. This is almost Partiview speck file format, but you need some headers. Open sloangalaxies.speck in a text editor and place the following lines at the start of it.
datavar 0 eclass datavar 1 mag datavar 2 lumin datavar 3 distmpc datavar 4 redshift datavar 5 pictype datavar 6 ori texturevar pictype polyorivar ori texture -a 1 gal1.sgi texture -a 2 gal2.sgi texture -a 3 gal3.sgi texture -a 4 gal4.sgi texture -a 5 gal5.sgi texture -a 6 gal6.sgi texture -a 7 gal7.sgi texture -a 8 gal8.sgi texture -a 9 gal9.sgi texture -a 10 gal10.sgi
You should also create a CF file, say sloan.cf, with the following parameters. You can change the parameters - read the Teuben-Levy Partiview manual for some documentation on what parameters mean. Or guess and tinker and guess and tinker...
filepath +:. object g1=galaxies include sloangalaxies.speck eval psize 1e4 eval ptsize 2 10 eval lum mag 5 25 eval slum 0.1 eval polysize 7 eval polysides 4 eval cmap spirell.cmap eval color eclass eval polymin 0.01 20 eval alpha 0.850 eval points on eval poly on eval label off eval censize 0 eval fade sph eval clip 0.1 100000000
Smaller values for the near clip distance led to annoying missing polygons for us in the WMAP sphere. Thanks to Selden Ball for suggesting this fix. It was in a different context, but he documented it!
At this stage, assuming you have Partiview installed, calling some version (this is the I-hope-we-dont-have-to-worry-about-paths version) of this command gives something workable, from which you can take pretty pictures.
At the moment, sloangalaxies.speck is a huge file (e.g. with Release 1 data it's got 100K objects, and with Release 2 data it'll have about 250K objects). It is useful to split this into smaller files for at least two reasons:
To do this, say
perl splitspecktx.pl sloangalaxies.speck sloan.cf sloansplit.cf
Note that you don't have to say anything about how many files you want to split sloangalaxies.speck into - it works that out by itself. The file sloansplit.cf is based on sloan.cf but has the groups split properly. Very nice! Works for this application - may not work for yours...
Will do as before, but much faster.
For reference purposes, the distribution of ra, dec and z values in Sloan's release 2 galaxies is shown below. Note that the vertical axes are not the same.
The z-distribution isnt quite clear, so here it is in more detail, just for the cases where z<1. Of the remaining 15 galaxies, 7 have z values between 1 and 2, 3 satisfy 2 < z < 3, 2 satisfy 3 < z < 4, and 3 satisfy 4 < z < 5.