How Galaxies were processed in the Sloan Galaxy Flythrough exhibit

Getting galaxy data from SDSS

We got data to determine the 3-dimensional coordinates, absolute magnitude, and morphological type (to some extent) of each of the 250 000 (at the time of typing) galaxies in the SDSS. The SQL command used to grab this from the SDSS servers was:

 select all s.ra,s.dec,round(s.z,7),round(s.eClass,7),p.petroRad_r,p.petroMag_r
from specObj s join photoObj p on s.bestObjID=p.objID 
where specClass=2 

Actually, not quite: we ran it in three parts. Note that this should not be taken as an example of how to write an SDSS query - it is not optimized!

 select all s.ra,s.dec,round(s.z,7),round(s.eClass,7),p.petroRad_r,p.petroMag_r
from specObj s join photoObj p on s.bestObjID=p.objID 
where specClass=2 and s.z <= 0.1

 select all s.ra,s.dec,round(s.z,7),round(s.eClass,7),p.petroRad_r,p.petroMag_r
from specObj s join photoObj p on s.bestObjID=p.objID 
where specClass=2 and s.z <= 0.3 and s.z > 0.1

 select all s.ra,s.dec,round(s.z,7),round(s.eClass,7),p.petroRad_r,p.petroMag_r
from specObj s join photoObj p on s.bestObjID=p.objID 
where specClass=2 and s.z > 0.3

The round commands refer to the precision with which data is reported. The number of decimal places given by Sloan is 6 or 7, so going further is pointless.

specClass is 2 for galaxies, and 3 or 4 for quasars.

Note that we were able to get commands like the above to work because we called it from a Fermilab server only accessible to those in institutions affiliated with the SDSS. Other servers limit the number of results returned at a time.

You have the choice of getting your results back from the Sloan server(s) in HTML or CSV (comma separated values) format.

Either way, suppose you end up with a file "sloangalaxies.dat" that Matlab can read.

Get picture files ready

Suppose you want to bin the galaxies into N types. You have, somewhere else, found N pictures representing galaxies, and want to place each picture in a square at some orientation - that orientation forms a plane defined by vectors u=(u1 u2 u3) and v=(v1 v2 v3) in R3.

Get your pictures, and save them as Q x Q .sgi files. Suppose they are called gal1.sgi, gal2.sgi, ... ,etc. Q must be a power of 2. We got our pictures using the SDSS search tools and a image retrieval utility that Mark had written for an earlier project of his. (SDSS also has an image retrieval utility, but this one is much easier to use. You save a .csv file with ra and dec positions found by the first SDSS utility referred to above.)

Here is the distribution of eclass values in the 250K SDSS release 2 galaxies:

Using Matlab to get a Partiview SPECK matrix

Open Matlab, and make sure the .m files above are in your path (with an addpath command or using the GUI). Then type

  S = load ('sloangalaxies.dat');
  z = S(:,3);
  posred = find(z>=0);
  S = S(posred,:);
  okpetmag = find (petmag > -9999);
  S = S(okpetmag,:);
  M = radecz2xxyyzz(S);  

Note the 'single quotes' - Matlab hates "double quotes".

Only galaxies with positive redshift were considered (there are galaxies with negative redshift - but very few, e.g. 1 in the 250 000 galaxies in Sloan's second data release).

The 24 galaxies with no Petrosian magnitude measurements (default = -9999) were also removed.

M is a G x 8 matrix representing G galaxies, with for each galaxy:

  1. x position (comoving coordinates)
  2. y position
  3. z position (z doesnt refer to redshift here, but it will soon...)
  4. eclass
  5. Petrosian radius
  6. Petrosian magnitude (apparent)
  7. distance from origin (megaparsecs)
  8. redshift

What you have is a matrix with the columns above. What you want is one with the columns below.

  1. x position (comoving coordinates)
  2. y position
  3. z position (z doesnt refer to redshift here, but it will soon...)
  4. eclass
  5. positive absolute magnitude
  6. luminosity
  7. distance from origin (megaparsecs)
  8. redshift
  9. label (from 1 to N, showing which galaxy type bin it's in)
  10. random polygon orientation u1
  11. random polygon orientation u2
  12. random polygon orientation u3
  13. random polygon orientation v1
  14. random polygon orientation v2
  15. random polygon orientation v3

Luminosity was set to 1000.2 * positive absolute magnitude, where

positive absolute magnitude = - absolute magnitude

absolute magnitude = apparent magnitude - 5 log10(luminosity distance) - k-correction - 25
luminosity distance = (1+z)(comoving distance in megaparsecs))
k-correction = -2.5 (1+alpha) log10 (1+z)
alpha was taken to be -0.5. (Actually, it should only be -0.5 for quasars, and should depend on eclass, but that would have taken too much work.)

A few notes to non-astronomers: The 'absolute' in 'absolute magnitude' refers to the brightness of an object independent of how far away it is, and not to being positive. Also, brighter stars have a lower magnitude. Greek astronomers decided that the brightest stars had (apparent) magnitude 0 and the faintest had (apparent) magnitude 6. The graph below shows the distribution of positive absolute magnitudes for the 250K galaxies in the SDSS release 2 set.

We wanted to make size proportional to radius (Petrosian radius behaves more consistently than other radii for galaxies at different distances), but couldn't as we couldn't work out how to get Partiview to get polysize set for each point.

Assume for concreteness that N, the number of galaxy bins, is 10 (the code will work for other positive integer values of N quite fine). In Matlab, type

      petmag = M(:,6);
      dist = M(:,7);
      z = M(:,8);
      al = -0.5
      kcorr = -2.5 * (1+al) * log (1+z) / log(10);
      lumdist = (1+z) .* dist;
      absmag = petmag - 5*log(lumdist)/log(10) - kcorr - 25;
      lumin = 100.^(-0.2 * absmag);
      M(:,5) = -absmag;
      M(:,6) = lumin;
      numbins = 10;
      M = addcoarsecol (M,4,numbins);
      P = size(M,1);
      M = horzcat (M, randompolyori(P));
      save sloangalaxies.speck M -ascii

Creating usable Partiview files

So now you have a text file sloangalaxies.speck that has all the galaxy positions, picture types (labels), and orientations. This is almost Partiview speck file format, but you need some headers. Open sloangalaxies.speck in a text editor and place the following lines at the start of it.

      datavar 0 eclass
      datavar 1 mag
      datavar 2 lumin
      datavar 3 distmpc
      datavar 4 redshift
      datavar 5 pictype
      datavar 6 ori 
      texturevar pictype
      polyorivar ori
      texture -a 1 gal1.sgi
      texture -a 2 gal2.sgi
      texture -a 3 gal3.sgi
      texture -a 4 gal4.sgi
      texture -a 5 gal5.sgi
      texture -a 6 gal6.sgi
      texture -a 7 gal7.sgi
      texture -a 8 gal8.sgi
      texture -a 9 gal9.sgi
      texture -a 10 gal10.sgi

You should also create a CF file, say sloan.cf, with the following parameters. You can change the parameters - read the Teuben-Levy Partiview manual for some documentation on what parameters mean. Or guess and tinker and guess and tinker...

   filepath +:.

object g1=galaxies
    include sloangalaxies.speck
   eval psize 1e4  
   eval ptsize 2 10  
   eval lum mag 5 25
   eval slum 0.1     
   eval polysize 7
   eval polysides 4  
   eval cmap spirell.cmap
   eval color eclass
   eval polymin 0.01 20
   eval alpha 0.850  
   eval points on    
   eval poly on      
   eval label off     

   eval censize 0
   eval fade sph
   eval clip 0.1 100000000

spirell.cmap

Smaller values for the near clip distance led to annoying missing polygons for us in the WMAP sphere. Thanks to Selden Ball for suggesting this fix. It was in a different context, but he documented it!

At this stage, assuming you have Partiview installed, calling some version (this is the I-hope-we-dont-have-to-worry-about-paths version) of this command gives something workable, from which you can take pretty pictures.

    partiview sloan.cf

Splitting up the data

At the moment, sloangalaxies.speck is a huge file (e.g. with Release 1 data it's got 100K objects, and with Release 2 data it'll have about 250K objects). It is useful to split this into smaller files for at least two reasons:

To do this, say

    perl splitspecktx.pl sloangalaxies.speck sloan.cf sloansplit.cf

Note that you don't have to say anything about how many files you want to split sloangalaxies.speck into - it works that out by itself. The file sloansplit.cf is based on sloan.cf but has the groups split properly. Very nice! Works for this application - may not work for yours...

Now, saying

    partiview sloansplit.cf

Will do as before, but much faster.


For reference purposes, the distribution of ra, dec and z values in Sloan's release 2 galaxies is shown below. Note that the vertical axes are not the same.

The z-distribution isnt quite clear, so here it is in more detail, just for the cases where z<1. Of the remaining 15 galaxies, 7 have z values between 1 and 2, 3 satisfy 2 < z < 3, 2 satisfy 3 < z < 4, and 3 satisfy 4 < z < 5.