DRAGONS - GSAOI Data Reduction Tutorial

Document ID

PIPE-USER-118_GSAOIImg-DRTutorial

Introduction

This tutorial covers the basics of reducing GSAOI data using DRAGONS.

The next two sections describe the required software and the dataset used throughout the tutorial. Chapter 2: Data Reduction contains a quick example of how to reduce data using the DRAGONS command line tools. Chapter 3: Reduction with API shows how to reduce the data using DRAGONS packages from within Python.

Software Requirements

Before you start, make sure you have DRAGONS properly installed and configured on your machine. You can test that by typing the following commands:

$ conda activate dragons
$ python -c "import astrodata"

Here, dragons is the name of the conda environment in which DRAGONS is installed. If you get an error message, make sure that:

  • Anaconda or MiniConda is properly installed;
  • A Conda Virtual Environment is properly created and is active;
  • AstroConda (STScI) is properly installed within the Virtual Environment;
  • DRAGONS was successfully installed within the Conda Virtual Environment;
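The same check can also be made from within Python without actually importing the package. The following is a minimal sketch using only the standard library; the helper name is_importable is ours and is not part of DRAGONS.

```python
import importlib.util


def is_importable(module_name):
    """Return True if `module_name` can be located in the active environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "astrodata" is the package tested by the shell command above.
    if is_importable("astrodata"):
        print("astrodata found: DRAGONS appears to be installed.")
    else:
        print("astrodata not found: check that the conda environment is active.")
```

Finding the package only shows that it is present in the environment; it does not prove that the installation is complete.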

Downloading the tutorial datasets

All the data needed to run this tutorial are found in the tutorial’s data package:

Download it and unpack it somewhere convenient.

cd <somewhere convenient>
tar xvf gsaoiimg_tutorial_datapkg-v1.tar
bunzip2 gsaoiimg_tutorial/playdata/*.bz2

The datasets are found in the subdirectory gsaoiimg_tutorial/playdata, and we will work in the subdirectory named gsaoiimg_tutorial/playground.
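If tar and bunzip2 are not available on your system, the unpacking steps above can be approximated with the Python standard library. This is only a sketch: the function name unpack_tutorial_data is hypothetical, and it assumes the archive unpacks into gsaoiimg_tutorial/playdata as described above.

```python
import bz2
import shutil
import tarfile
from pathlib import Path


def unpack_tutorial_data(tar_path, dest_dir="."):
    """Extract the tar archive, then decompress the .bz2 files in playdata."""
    dest = Path(dest_dir)
    with tarfile.open(tar_path) as archive:
        archive.extractall(dest)              # equivalent of `tar xvf`

    # Assumption: the data live in gsaoiimg_tutorial/playdata inside the archive.
    playdata = dest / "gsaoiimg_tutorial" / "playdata"
    decompressed = []
    for compressed in sorted(playdata.glob("*.bz2")):
        target = compressed.with_suffix("")   # drop the trailing .bz2
        with bz2.open(compressed, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        compressed.unlink()                   # bunzip2 also removes the .bz2 file
        decompressed.append(target)
    return decompressed
```

A call such as unpack_tutorial_data("gsaoiimg_tutorial_datapkg-v1.tar") returns the list of decompressed files, which is a quick way to confirm that the data are in place.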

Note

All the raw data can also be downloaded from the Gemini Observatory Archive. Using the tutorial data package is probably more convenient but if you really want to learn how to search for and retrieve the data yourself, see the step-by-step instructions in the appendix, Downloading from the Gemini Observatory Archive.

About the dataset

The table below contains a summary of the dataset downloaded in the previous section. Note that for GSAOI, the dark current is low enough that there is no need to correct for it.

Science         S20170505S0095-110    Kshort-band, on target, 60 s
Flats           S20170505S0030-044    Lamp on, Kshort, for science
                S20170505S0060-074    Lamp off, Kshort, for science
Standard star   S20170504S0114-117    Kshort, standard star, 30 s
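The file ranges in the table use a compact notation. The sketch below expands them into individual file names, assuming that the digits after the hyphen replace the last digits of the first file name (so S20170505S0095-110 runs from S20170505S0095 to S20170505S0110). The helper expand_range is ours, for illustration only.

```python
def expand_range(spec):
    """Expand 'S20170505S0095-110' into the individual file names it covers."""
    first, last_digits = spec.split("-")
    width = len(last_digits)
    prefix = first[:-width]          # part of the name shared by all files
    start = int(first[-width:])      # running number of the first file
    stop = int(last_digits)          # running number of the last file
    return [f"{prefix}{n:0{width}d}.fits" for n in range(start, stop + 1)]


dataset = {
    "science": ["S20170505S0095-110"],
    "flats": ["S20170505S0030-044", "S20170505S0060-074"],
    "standard star": ["S20170504S0114-117"],
}

for label, specs in dataset.items():
    names = [name for spec in specs for name in expand_range(spec)]
    print(f"{label}: {len(names)} files, {names[0]} to {names[-1]}")
```

Counting the expanded names is a simple way to check that nothing is missing from the playdata directory.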

Data Reduction with “reduce”

This chapter will guide you through reducing GSAOI data using command line tools. In this example we reduce a GSAOI observation of the resolved outskirts of a nearby galaxy. The observation is a dither-on-target with offset-to-sky sequence. Just open a terminal to get started.

While the example cannot possibly cover all situations, it will help you get acquainted with the reduction of GSAOI data with DRAGONS. We encourage you to look at the Tips and Tricks and Issues and Limitations chapters to learn more about GSAOI data reduction.

The DRAGONS installation comes with a set of handy scripts that are used to reduce astronomical data. The most important script is called “reduce”, which is extensively explained in the Recipe System Users Manual. It is through that command that a DRAGONS reduction is launched.

For this tutorial, we will also be using the following supplemental tools: “dataselect”, “showd”, “typewalk”, and “caldb”.

The dataset

If you have not already, download and unpack the tutorial’s data package. Refer to Downloading the tutorial datasets for the links and simple instructions.

The dataset specific to this example is described in:

Here is a copy of the table for quick reference.

Science         S20170505S0095-110    Kshort-band, on target, 60 s
Flats           S20170505S0030-044    Lamp on, Kshort, for science
                S20170505S0060-074    Lamp off, Kshort, for science
Standard star   S20170504S0114-117    Kshort, standard star, 30 s

Note

A master dark is not needed for GSAOI. The dark current is very low.

Set up the Local Calibration Manager

DRAGONS comes with a local calibration manager that uses the same calibration association rules as the Gemini Observatory Archive. This allows reduce to make requests to a local light-weight database for matching processed calibrations when needed to reduce a dataset.

Let’s set up the local calibration manager for this session.

In ~/.geminidr/, create or edit the configuration file rsys.cfg as follows:

[calibs]
standalone = True
database_dir = ${path_to_my_data}/gsaoiimg_tutorial/playground

This simply tells the system where to put the calibration database, the database that will keep track of the processed calibrations we are going to send to it.
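For scripted setups, the same file can be written from Python. The following is a sketch using the standard library configparser module; the function name write_rsys_cfg is hypothetical, and the function overwrites any existing rsys.cfg in the target directory, so use it with care.

```python
import configparser
from pathlib import Path


def write_rsys_cfg(config_dir, database_dir):
    """Write a [calibs] section like the one shown above to config_dir/rsys.cfg."""
    # interpolation=None keeps characters such as '$' or '%' untouched.
    config = configparser.ConfigParser(interpolation=None)
    config["calibs"] = {
        "standalone": "True",
        "database_dir": str(database_dir),
    }
    config_dir = Path(config_dir).expanduser()   # resolves a leading "~"
    config_dir.mkdir(parents=True, exist_ok=True)
    cfg_path = config_dir / "rsys.cfg"
    with open(cfg_path, "w") as handle:
        config.write(handle)                     # replaces any existing file
    return cfg_path
```

A typical call would be write_rsys_cfg("~/.geminidr", playground_path), where playground_path is a placeholder for the full path to your own gsaoiimg_tutorial/playground directory.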

Note

The tilde (~) in the path above refers to your home directory. Also, mind the dot in .geminidr.

Then initialize the calibration database:

caldb init

That’s it! It is ready to use!

You can add processed calibrations with caldb add <filename> (we will later), list the database content with caldb list, and remove a file from the database with caldb remove <filename> (it will not remove the file on disk). For more details, check the Recipe System Local Calibration Manager documentation, caldb.

Check files

For this example, all the raw files we need are in the same directory called ../playdata/. Let us learn a bit about the data we have.

Ensure that you are in the playground directory and that the conda environment that includes DRAGONS has been activated.

Let us call the command tool “typewalk”:

$ typewalk -d ../playdata/

directory:  /data/workspace/gsaoiimg_tutorial/playdata
     S20170504S0114.fits ............... (GEMINI) (GSAOI) (IMAGE) (RAW) (SIDEREAL) (SOUTH) (UNPREPARED)
     ...
     S20170505S0030.fits ............... (AZEL_TARGET) (CAL) (DOMEFLAT) (FLAT) (GEMINI) (GSAOI) (IMAGE) (LAMPON) (NON_SIDEREAL) (RAW) (SOUTH) (UNPREPARED)
     ...
     S20170505S0060.fits ............... (AZEL_TARGET) (CAL) (DOMEFLAT) (FLAT) (GEMINI) (GSAOI) (IMAGE) (LAMPOFF) (NON_SIDEREAL) (RAW) (SOUTH) (UNPREPARED)
     ...
     S20170505S0095.fits ............... (GEMINI) (GSAOI) (IMAGE) (RAW) (SIDEREAL) (SOUTH) (UNPREPARED)
     ...
     S20170505S0110.fits ............... (GEMINI) (GSAOI) (IMAGE) (RAW) (SIDEREAL) (SOUTH) (UNPREPARED)
Done DataSpider.typewalk(..)

This command will open every FITS file within the folder passed after the -d flag (recursively) and will print an unsorted table with the file names and the associated tags. For example, calibration files will always have the CAL tag. Flat images will always have the FLAT tag. This means that we can start getting to know a bit more about our data set just by looking at the tags. The output above was trimmed for presentation.
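If you save the typewalk output to a text file, the tags can be grouped with a few lines of Python. The sketch below parses text laid out like the listing above; it relies only on that layout (a file name followed by tags in parentheses) and does not use any DRAGONS API. The function name parse_typewalk is ours.

```python
import re

# A tag is an upper-case word (letters, digits, underscores) in parentheses.
TAG_PATTERN = re.compile(r"\(([A-Z0-9_]+)\)")


def parse_typewalk(text):
    """Map each file name in typewalk-style output to its set of tags."""
    tags_by_file = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].endswith(".fits"):
            continue  # skip the directory header, "..." and the closing line
        tags_by_file[fields[0]] = set(TAG_PATTERN.findall(line))
    return tags_by_file


sample = """
     S20170504S0114.fits ............... (GEMINI) (GSAOI) (IMAGE) (RAW) (SIDEREAL) (SOUTH) (UNPREPARED)
     S20170505S0030.fits ............... (AZEL_TARGET) (CAL) (DOMEFLAT) (FLAT) (GEMINI) (GSAOI) (IMAGE) (LAMPON) (NON_SIDEREAL) (RAW) (SOUTH) (UNPREPARED)
     S20170505S0060.fits ............... (AZEL_TARGET) (CAL) (DOMEFLAT) (FLAT) (GEMINI) (GSAOI) (IMAGE) (LAMPOFF) (NON_SIDEREAL) (RAW) (SOUTH) (UNPREPARED)
"""

tags = parse_typewalk(sample)
flats = sorted(name for name, file_tags in tags.items() if "FLAT" in file_tags)
print(flats)
```

For real selections, the dataselect tool introduced in the next section is the supported route; this parser is only meant to make the tag sets easier to inspect.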

Create File lists

This data set contains science and calibration frames. For some programs, it could contain different observed targets and different exposure times, depending on how you like to organize your raw data.

The DRAGONS data reduction pipeline does not organize the data for you. You have to do it. DRAGONS provides tools to help you with that.

The first step is to create lists that will be used in the data reduction process. For that, we use “dataselect”. Please refer to the “dataselect” documentation for details regarding its usage.

A list for the flats

Let us create the list containing the domeflats:

$ dataselect --tags FLAT ../playdata/*.fits -o flats_Kshort.list

We know that our dataset has only one filter (Kshort). If our dataset contained data with more filters, we would have had to use the --expr option to select the appropriate filter as follows:

$ dataselect --tags FLAT --expr "filter_name=='Kshort'" ../playdata/*.fits -o flats_Kshort.list

Note

To see the name of the filter, use “showd” (show descriptor):

$ showd ../playdata/*.fits -d filter_name
----------------------------------------------------
filename                                 filter_name
----------------------------------------------------
../playdata/S20170504S0114.fits   Kshort_G1105&Clear
...
...

A list for the standard star

In this case we have only one standard star. Indeed, we can confirm that by selecting on partner calibrations and showing the object name:

$ dataselect --expr 'observation_class=="partnerCal"' ../playdata/*.fits | showd -d object
----------------------------------------
filename                          object
----------------------------------------
../playdata/S20170504S0114.fits     9132
../playdata/S20170504S0115.fits     9132
../playdata/S20170504S0116.fits     9132
../playdata/S20170504S0117.fits     9132

If we had more than one object, a list for each standard star would be created by using the object descriptor as a selection criterion in “dataselect”:

$ dataselect --expr 'object=="9132"' ../playdata/*.fits -o std_9132.list

A list for the science observations

The rest is the data with your science target. Before we create a new list, let us check that indeed we have only one science target and a unique exposure time:

$ dataselect --expr 'observation_class=="science"' ../playdata/*.fits | showd -d object,exposure_time
---------------------------------------------------------
filename                           object   exposure_time
---------------------------------------------------------
../playdata/S20170505S0095.fits   NGC5128            60.0
../playdata/S20170505S0096.fits   NGC5128            60.0
...
../playdata/S20170505S0109.fits   NGC5128            60.0
../playdata/S20170505S0110.fits   NGC5128            60.0

Just to demonstrate how expressions are built, let us consider that we need to select only the files for which observation_class is science and exposure_time is 60 seconds. We also want to pass the output to a new list:

$ dataselect --expr '(observation_class=="science" and exposure_time==60.)' ../playdata/*.fits -o science.list
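The lists written by dataselect are plain text files, which makes them easy to inspect from Python. The sketch below assumes one file name per line and treats anything after a # as a comment; both points are assumptions about the list format, and the helper read_file_list is ours.

```python
from pathlib import Path


def read_file_list(list_path):
    """Return the file names stored in a list file, one per line."""
    names = []
    for line in Path(list_path).read_text().splitlines():
        entry = line.split("#", 1)[0].strip()   # assumption: '#' starts a comment
        if entry:                               # ignore blank lines
            names.append(entry)
    return names
```

For example, len(read_file_list("science.list")) gives the number of science frames selected by the command above.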

Create a Master Flat Field

The GSAOI Kshort master flat is created from a series of lamp-on and lamp-off dome exposures. They should all have the same exposure time. Each flavor is stacked (averaged), then the lamp-off stack is subtracted from the lamp-on stack and the result normalized.
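The arithmetic described above can be illustrated with a toy example. This is not the DRAGONS implementation: frames are reduced to short lists of pixel values, and normalizing by the mean level is an assumption made for the sketch (the pipeline may use a different statistic).

```python
from statistics import mean


def stack(frames):
    """Average a list of frames pixel by pixel (each frame is a flat list)."""
    return [mean(pixels) for pixels in zip(*frames)]


def make_master_flat(lamp_on_frames, lamp_off_frames):
    """Stack each flavor, subtract lamp-off from lamp-on, then normalize."""
    on_stack = stack(lamp_on_frames)
    off_stack = stack(lamp_off_frames)
    difference = [on - off for on, off in zip(on_stack, off_stack)]
    level = mean(difference)   # assumption: normalize by the mean level
    return [value / level for value in difference]


# Made-up pixel values: two lamp-on and two lamp-off frames of three pixels.
lamp_on = [[110.0, 210.0, 310.0], [90.0, 190.0, 290.0]]
lamp_off = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]

master_flat = make_master_flat(lamp_on, lamp_off)
print(master_flat)
```

Subtracting the lamp-off stack removes the signal that does not come from the lamp, and the normalization brings the average of the flat to one so that dividing by it preserves the overall flux level.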

We create the master flat field and add it to the calibration manager as follows:

$ reduce @flats_Kshort.list
$ caldb add S20170505S0030_flat.fits

The master flat file is found in two places: inside the same folder where you ran reduce and inside the calibrations/processed_flats/ folder, for safekeeping. Here is an example of a master flat:

[Figure: Master Flat - K-Short Band]

Note that this figure shows the masked pixels in white color but not all the detector features are masked. For example, the “Christmas Tree” on detector 2 can be easily noticed but was not masked.

Reduce Standard Star

The standard star is reduced essentially the same way as the science target (next section). The processed flat field that we added earlier to the local calibration database will be fetched automatically. Also, in this case the standard star was obtained using ROIs (Regions-of-Interest) which do not match the flat field. The software will recognize that the flat field is still valid and will crop it to match the ROIs.

$ reduce @std_9132.list

To stack, the tool disco_stu is needed for GSAOI. It is discussed later in this chapter.

$ disco `dataselect *_skyCorrected.fits --expr='observation_class=="partnerCal"'`

Reduce the Science Images

This is an observation of a galaxy with offset to sky. We need to turn off the additive offsetting of the sky because the target fills the field of view and does not represent a reasonable sky background. If the offsetting is not turned off in this particular case, it results in an over-subtraction of the sky frame.

Note

Unlike the other near-IR instruments, the additive offset_sky parameter is used by default to adjust the sky frame background for GSAOI instead of the multiplicative scale_sky parameter. It was found to work better when the sky background per pixel is very low, which is common due to the short exposure time needed to avoid saturating stars and the small pixel scale. The reader is encouraged to experiment with scale_sky if offset_sky does not seem to lead to an optimal sky subtraction.

(Remember that when the source is extended, both parameters normally need to be turned off.)
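A toy calculation shows why the offsetting must be turned off when the target fills the field. The sketch below is not the skyCorrect code: it models the additive offset as the difference between the medians of the science and sky frames, which is an assumption made for illustration, and it uses made-up pixel values.

```python
from statistics import median


def subtract_sky(science, sky, offset_sky=True):
    """Subtract a sky frame, optionally shifting it to the science background."""
    # Assumption: the additive offset is the difference of the two medians.
    shift = median(science) - median(sky) if offset_sky else 0.0
    return [sci - (s + shift) for sci, s in zip(science, sky)]


sky = [10.0, 10.0, 10.0, 10.0]          # off-target sky frame
target = [50.0, 60.0, 55.0, 65.0]       # extended source covering every pixel
science = [t + s for t, s in zip(target, sky)]

with_offset = subtract_sky(science, sky, offset_sky=True)
without_offset = subtract_sky(science, sky, offset_sky=False)

print(with_offset)      # target flux largely removed: over-subtraction
print(without_offset)   # target flux recovered
```

Because every pixel of the science frame contains target flux, the background estimated from it is far too high, and the offset raises the sky frame by that amount before it is subtracted.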

The sky frame comes from off-target sky observations. We feed the pipeline all the on-target and off-target frames. The software will split the on-target and the off-target appropriately using information in the headers.

Once we have our calibration files processed and added to the database, ready for retrieval, we can run reduce on our science data.

$ reduce @science.list -p skyCorrect:offset_sky=False

This command will generate flat-corrected and sky-subtracted files but will not stack them. You can find which file is which by its suffix (_flatCorrected or _skyCorrected). The on-target files are the ones that have been sky subtracted (_skyCorrected). There should be nine of them.

The frames are not stacked because of the high level of distortion in the GSAOI images that requires special software to correct and properly stack. The tool disco_stu (next section) must be used to stack GSAOI science data.

[Figure: S20170505S0095 - Flat corrected and sky subtracted]

The figure above shows an example of the sky-subtracted frames. The masked pixels are represented in white color.

Stack Sky-Subtracted Science Images

The final step is to stack the images. For that, you must be aware that GSAOI images are highly distorted and that this distortion must be corrected before stacking. The tool for distortion correction and image stacking is disco_stu.

Note

disco_stu is installed with conda when the standard Gemini software installation instructions are followed. To install after the fact:

conda install disco_stu

The simplest use of disco_stu is to run the command disco on the files to be stacked.

$ disco `dataselect *_skyCorrected.fits --expr 'observation_class=="science"'` -o my_Kshort_stack.fits

By default, disco will write the output file as disco_stack.fits. The -o flag allows us to override that and choose the name of the output stack.

For absolute distortion correction and astrometry, disco_stu can use a reference catalog provided by the user. Without a reference catalog, like above, only the relative distortion between the frames is accounted for. For more information about disco_stu see the disco_stu.pdf manual in $CONDA_PREFIX/share/disco_stu.

The output stack units are in electrons (header keyword BUNIT=electrons). The output stack is stored in a multi-extension FITS (MEF) file. The science signal is in the “SCI” extension, the variance is in the “VAR” extension, and the data quality plane (mask) is in the “DQ” extension.

The final image is shown below.

[Figure: Sky Subtracted and Stacked Final Image]

Reduction using API

There may be cases where you want to access the DRAGONS Application Program Interface (API) directly instead of using the command line wrappers to reduce your data. In this case, you will need to access the DRAGONS tools by importing the appropriate modules and packages.

The dataset

If you have not already, download and unpack the tutorial’s data package. Refer to Downloading the tutorial datasets for the links and simple instructions.

The dataset specific to this example is described in:

Here is a copy of the table for quick reference.

Science         S20170505S0095-110    Kshort-band, on target, 60 s
Flats           S20170505S0030-044    Lamp on, Kshort, for science
                S20170505S0060-074    Lamp off, Kshort, for science
Standard star   S20170504S0114-117    Kshort, standard star, 30 s

Note

A master dark is not needed for GSAOI. The dark current is very low.

Setting up

Importing Libraries

We first import the necessary modules and classes:

import glob

from gempy.adlibrary import dataselect
from recipe_system import cal_service
from recipe_system.reduction.coreReduce import Reduce

If you are still working with Python 2.7, also add from __future__ import print_function at the top of your script for compatibility with the print statement. If you are working with Python 3, it is not needed, but importing it will not break anything.

glob is a Python built-in package. It will be used to return a list with the input file names.

dataselect will be used to create file lists for the flats, the standard star and the science observations. The cal_service package is our interface with the local calibration database. Finally, the Reduce class is used to set up and run the data reduction.

Setting up the logger

We recommend using the DRAGONS logger. (See also Double messaging issue.)

from gempy.utils import logutils
logutils.config(file_name='gsaoi_data_reduction.log')

Setting up the Calibration Service

Before we continue, let’s be sure we have properly set up our calibration database and the calibration association service.

First, check that you already have an rsys.cfg file inside ~/.geminidr/. It should contain:

[calibs]
standalone = True
database_dir = ${path_to_my_data}/gsaoiimg_tutorial/playground

This tells the system where to put the calibration database. This database will keep track of the processed calibrations as we add them to it.

Note

The tilde (~) in the path above refers to your home directory. Also, mind the dot in .geminidr.

The calibration database is initialized and the calibration service is configured as follows:

caldb = cal_service.CalibrationService()
caldb.config()
caldb.init()

cal_service.set_calservice()

The calibration service is now ready to use. If you need more details, check the caldb section in the Recipe System Users Manual.

Create list of files

The next step is to create lists of files that will be used as input to each of the data reduction steps. Let us start by creating a list of all the FITS files in the directory ../playdata/.

all_files = glob.glob('../playdata/*.fits')
all_files.sort()

Before you carry on, you might want to do print(all_files) to check if they were properly read.

Now we can use the all_files list as an input to select_data(). The dataselect.select_data() function signature is:

select_data(inputs, tags=[], xtags=[], expression='True')

A list for the flats

Now you must create a list of FLAT images for each filter. The expression specifying the filter name is needed only if you have data from multiple filters. It is not really needed in this case.

list_of_flats_Ks = dataselect.select_data(
     all_files,
     ['FLAT'],
     [],
     dataselect.expr_parser('filter_name=="Kshort"')
)

A list for the standard star

For the standard star selection, we use:

list_of_std_stars = dataselect.select_data(
    all_files,
    [],
    [],
    dataselect.expr_parser('observation_class=="partnerCal"')
)

Here, we are passing empty lists as the second and third arguments since we do not need to use the tags for selection or for exclusion.

A list for the science data

Finally, the science data can be selected using:

list_of_science_images = dataselect.select_data(
    all_files,
    [],
    [],
    dataselect.expr_parser('(observation_class=="science" and exposure_time==60.)')
)

The exposure time is not really needed in this case since there are only 60-second frames, but it shows how you could have two selection criteria in the expression.

Create a Master Flat Field

As explained on the calibration webpage for GSAOI, dark subtraction is not necessary since the dark current is very low. Therefore, we can go ahead and start with the master flat.

A GSAOI K-short master flat is created from a series of lamp-on and lamp-off exposures. Each flavor is stacked, then the lamp-off stack is subtracted from the lamp-on stack and the result normalized.

We create the master flat field and add it to the calibration manager as follows:

reduce_flats = Reduce()
reduce_flats.files.extend(list_of_flats_Ks)
reduce_flats.runr()

caldb.add_cal(reduce_flats.output_filenames[0])

Once runr() is finished, we add the master flat to the calibration manager with caldb.add_cal().

Reduce Standard Star

The standard star is reduced essentially the same way as the science target (next section). The processed flat field that we added above to the local calibration database will be fetched automatically.

reduce_std = Reduce()
reduce_std.files.extend(list_of_std_stars)
reduce_std.runr()

For stacking the sky-subtracted standard star images, the easiest way is probably to use disco_stu’s command line interface as follows:

$ disco `dataselect *_skyCorrected.fits --expr='observation_class=="partnerCal"'`

If you really want or need to run disco_stu’s API, see the example later in this chapter where we do just that for the science frames.

Reduce the Science Images

The science observation uses a dither-on-target with offset-to-sky pattern. The sky frames from the offset-to-sky position will be automatically detected and used for the sky subtraction.

The master flat will be retrieved automatically from the local calibration database.

We use commands similar to the previous ones to set up and run the reduction of the science data:

reduce_target = Reduce()
reduce_target.files.extend(list_of_science_images)
reduce_target.uparms.append(('skyCorrect:offset_sky', False))
reduce_target.runr()

Stack Sky-subtracted Science Images

The final step is to stack the images. For that, you must be aware that GSAOI images are highly distorted and that this distortion must be corrected before stacking. The tool for distortion correction and image stacking is disco_stu.

Note

disco_stu is installed with conda when the standard Gemini software installation instructions are followed. To install after the fact:

conda install disco_stu

This package was created to be accessed via command line (See the Stack Sky-Subtracted Science Images command line section). Because of that, the API is not the most polished, and using it requires a fair number of steps. If you can use the command line interface, it is recommended that you do so. If not, then let’s get to work.

First, let’s import some libraries:

from collections import namedtuple

from disco_stu import disco
from disco_stu.lookups import general_parameters as disco_pars

Then we need to create a special class using namedtuple(). This object will hold information about matching the objects between files:

MatchInfo = namedtuple(
    'MatchInfo', [
        'offset_radius',
        'match_radius',
        'min_matches',
        'degree'
        ])

We now create two objects of the MatchInfo class, one for matching objects between frames and one for matching against a reference catalog:

object_match_info = MatchInfo(
    disco_pars.OBJCAT_ALIGN_RADIUS[0],
    disco_pars.OBJCAT_ALIGN_RADIUS[1],
    None,
    disco_pars.OBJCAT_POLY_DEGREE
)

reference_match_info = MatchInfo(
    disco_pars.REFCAT_ALIGN_RADIUS[0],
    disco_pars.REFCAT_ALIGN_RADIUS[1],
    disco_pars.REFCAT_MIN_MATCHES,
    disco_pars.REFCAT_POLY_DEGREE
)
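For readers unfamiliar with namedtuple, the short sketch below shows how positional arguments map onto the named fields. The numbers are hypothetical and chosen only for illustration; the tutorial takes the real values from disco_stu's general_parameters module.

```python
from collections import namedtuple

MatchInfo = namedtuple(
    "MatchInfo", ["offset_radius", "match_radius", "min_matches", "degree"]
)

# Hypothetical values, for illustration only (not disco_stu defaults).
positional = MatchInfo(3.0, 0.5, None, 3)
by_keyword = MatchInfo(offset_radius=3.0, match_radius=0.5,
                       min_matches=None, degree=3)

print(positional == by_keyword)         # both forms build the same tuple
print(positional.match_radius)          # fields are read by name
print(positional._replace(degree=2))    # returns a modified copy
```

Since a namedtuple is immutable, _replace() is the way to change one field: it returns a new object and leaves the original untouched.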

Finally, we call the disco() function and pass the arguments.

disco.disco(
    infiles=reduce_target.output_filenames,
    output_identifier="my_Kshort_stack",
    objmatch_info=object_match_info,
    refmatch_info=reference_match_info,
    pixel_scale=disco_pars.PIXEL_SCALE,
    skysub=False,
)

This function has many other parameters that can be used to customize this step but further details are out of the scope of this tutorial.

Tips and Tricks

This is a collection of tips and tricks that can be useful for reducing different data, or to do it slightly differently from what is presented in the example.

Sky Subtraction

For sky subtraction, there are two input parameters to skyCorrect that users should be aware of: scale_sky and offset_sky. Both serve to match the sky frames to the target frame before the subtraction. The first, scale_sky, is multiplicative and is turned off by default for GSAOI, while the second, offset_sky, is additive and is turned on by default for GSAOI.

The reason why offset_sky is favored for GSAOI is that often the flux in individual pixels can be very low and that is observed to make the multiplicative scale less accurate. In any case, from experience, it was found that offset_sky==True was more successful, more often, with GSAOI data, which is why it was set as the default.

Depending on the data and the science objectives, those two input parameters might have to be experimented with. The only combination we would not recommend is setting both of them on. (The software will not let you either.)

When there are offsets to sky, it is likely because the target fills the field of view and there is no usable sky. In those cases, all sky scaling and offsetting should be turned off (skyCorrect:scale_sky=False and skyCorrect:offset_sky=False). There is no sky to measure in the target frame, so any attempt at scaling or offsetting will result in an over-subtraction of the sky.

Issues and Limitations

Memory Issues

Some primitives use a lot of RAM and can cause a crash. Memory management in Python is notoriously difficult. The DRAGONS team is constantly trying to improve memory management within astrodata and the DRAGONS recipes and primitives. If an “Out of memory” crash happens to you, and if your observation sequence allows it, try to run the pipeline on fewer images at a time, for example on each dither pattern sequence separately.

Then to align and stack the pieces, run the alignAndStack recipe:

$ reduce @list_of_stacks -r alignAndStack
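Splitting a long file list into smaller pieces can be scripted. The sketch below cuts a list into fixed-size batches, which is a simplification: grouping by dither pattern sequence, as suggested above, depends on your observation and is not handled here. The function name and the file names are hypothetical.

```python
def split_into_batches(files, batch_size):
    """Split a list of file names into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]


# Hypothetical file names, for illustration only.
files = [f"frame_{n:02d}.fits" for n in range(1, 11)]

for number, batch in enumerate(split_into_batches(files, 4), start=1):
    print(number, batch)
```

Each batch can then be reduced on its own, and the resulting stacks passed together to the alignAndStack recipe.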

Double messaging issue

If you run the Reduce API without setting up a logger, you will notice that the output messages appear twice. To prevent this behaviour, set up a logger. This will send one of the output streams to a file, keeping the other on the screen. We recommend using the DRAGONS logger located in the gempy.utils.logutils module and its config() function:

from gempy.utils import logutils
logutils.config(file_name='gsaoi_data_reduction.log')

Downloading from the Gemini Observatory Archive

For this tutorial we provide a pre-made package with all the necessary data. Here we show how one can search and download the data directly from the archive, like one would have to do for their own program.

If you are just interested in trying out the tutorial, we recommend that you download the pre-made package (Downloading the tutorial datasets) instead of getting everything manually.

Step by step instructions

For this tutorial we selected data observed for the GS-2017A-Q-29 program on the night starting on May 04, 2017.

Science data

Access the GOA webpage.

In the search form, enter the following information:

  • Program ID: GS-2017A-Q-29
  • UTC Date: 20170504-20170505
  • Obs. Class: science

The search will return 16 files. Download them all by pressing the “Download all 16 files” button at the bottom.

Calibrations

Matching calibration files can be obtained by clicking on the Load Associated Calibrations tab. For this data, they are domeflats (lamp on and off) and a standard star observation.

The first four files are the standard star sequence. The other files are the lamp on and lamp off domeflats.

The table returned by the automatic calibration association has all that we need. Download everything by pressing the “Download all 34 files” button at the bottom.

Unpacking

Now, copy all the .tar files to the same place on your computer. Then use the tar and bunzip2 commands to decompress them. For example:

$ cd ${path_to_my_data}/
$ tar -xf gemini_data.tar
$ bunzip2 *.fits.bz2

(The tar file names may differ slightly depending on how you selected and downloaded the data from the Gemini Archive.)

Note

If you are using the manually selected data to run the tutorial, please remember to put all the data in a directory called playdata, and create a parallel directory called playground for running the tutorial. The tutorial makes assumptions about where everything is located.
