7. Supplemental tools

DRAGONS provides a number of command line tools that users should find helpful in executing reduce on their data. Some of those tools also offer an API.

These supplemental tools can help users discover information, not only about their own data, but about the Recipe System, such as available recipes, primitives, and defined tags.

If your environment has been configured correctly, these applications will work directly from the command line.

7.1. dataselect

The tool dataselect will help with the bookkeeping and with creating lists of input files to feed to the Recipe System. The tool has a command line script and an API. This tool finds files that match certain criteria defined with AstroData Tags and expressions involving AstroData Descriptors.

You can access the basic documentation from the command line by typing:

$ dataselect --help

usage: dataselect [-h] [--tags TAGS] [--xtags XTAGS] [--expr EXPRESSION] [--strict]
                  [--output OUTPUT] [--adpkg ADPKG] [--verbose] [--debug]
                  inputs [inputs ...]

Find files that matches certain criteria defined by tags and expression involving
descriptors.

positional arguments:
  inputs                Input FITS file

optional arguments:
  -h, --help            show this help message and exit
  --tags TAGS, -t TAGS  Comma-separated list of required tags.
  --xtags XTAGS         Comma-separated list of tags to exclude
  --expr EXPRESSION     Expression to apply to descriptors (and tags)
  --strict              Toggle on strict expression matching for exposure_time (not
                        just close) and for filter_name (match component number).
  --output OUTPUT, -o OUTPUT
                        Name of the output file
  --adpkg ADPKG         Name of the astrodata instrument package to useif not
                        gemini_instruments
  --verbose, -v         Toggle verbose mode when using -o
  --debug               Toggle debug mode

7.1.1. dataselect Command Line Tool

dataselect accepts a list of input files separated by spaces; wildcards are also accepted. Below are some usage examples.

  1. This command selects all the FITS files inside the raw directory with a tag that matches DARK.

    $ dataselect raw/*.fits --tags DARK
    
  2. To select darks of a specific exposure time:

    $ dataselect raw/*.fits --tags DARK --expr='exposure_time==20'
    
  3. To send that list to a file that can be used later:

    $ dataselect raw/*.fits --tags DARK --expr='exposure_time==20' -o dark20s.lis
    
  4. This command prints all the files in the raw directory that do not have the CAL tag, that is, it excludes the calibration files.

    $ dataselect raw/*.fits --xtags CAL
    
  5. The xtags can be used with tags. To select images that are not flats:

    $ dataselect raw/*.fits --tags IMAGE --xtags FLAT
    
  6. This command selects all the files with a specific target name:

    $ dataselect --expr 'object=="FS 17"' raw/*.fits
    
  7. This command selects all the files with an “observation_class” descriptor that matches the “science” value and a specific exposure time:

    $ dataselect --expr '(observation_class=="science" and exposure_time==60.)' raw/*.fits
    

7.1.2. dataselect API

The same selections presented in the command line section above can be done from the dataselect API. Here are the API versions of the examples presented in the previous section.

The list of files on disk must first be obtained with Python’s glob module.

>>> import glob
>>> all_files = glob.glob('raw/*.fits')

The dataselect module is located in gempy.adlibrary and must first be imported:

>>> from gempy.adlibrary import dataselect

  1. This command selects all the FITS files inside the raw directory with a tag that matches DARK.

    >>> all_darks = dataselect.select_data(all_files, ['DARK'])
    
  2. To select darks of a specific exposure time:

    >>> expression = 'exposure_time==20'
    >>> parsed_expr = dataselect.expr_parser(expression)
    >>> darks20 = dataselect.select_data(all_files, ['DARK'], [], parsed_expr)
    
  3. To send that list to a file that can be used later:

    >>> expression = 'exposure_time==20'
    >>> parsed_expr = dataselect.expr_parser(expression)
    >>> darks20 = dataselect.select_data(all_files, ['DARK'], [], parsed_expr)
    >>> with open('dark20s.lis', 'w') as f:
    ...     for filename in darks20:
    ...         f.write(filename + '\n')
    ...
    >>>
    

    Note that the need to write a list to a file on disk will probably not be very common when using the API, as Reduce will take the Python list directly.

  4. This command selects all the files in the raw directory that do not have the CAL tag, that is, it excludes the calibration files.

    >>> non_cals = dataselect.select_data(all_files, [], ['CAL'])
    
  5. The xtags can be used with tags. To select images that are not flats:

    >>> has_tags = ['IMAGE']
    >>> has_not_tags = ['FLAT']
    >>> non_flat_images = dataselect.select_data(all_files, has_tags, has_not_tags)
    
  6. This command selects all the files with a specific target name:

    >>> expression = 'object=="FS 17"'
    >>> parsed_expr = dataselect.expr_parser(expression)
    >>> stds = dataselect.select_data(all_files, expression=parsed_expr)
    
  7. This command selects all the files with an “observation_class” descriptor that matches the “science” value and a specific exposure time:

    >>> expression = '(observation_class=="science" and exposure_time==60.)'
    >>> parsed_expr = dataselect.expr_parser(expression)
    >>> sci60 = dataselect.select_data(all_files, expression=parsed_expr)
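
When a selection does need to go to disk, for example to share it with the command line tools, the round trip is short. The following is a minimal sketch using only the standard library. The helper names write_list and read_list are not part of DRAGONS, and the file names are hypothetical stand-ins for what select_data would return.

```python
import os
import tempfile

def write_list(filenames, path):
    """Write one file name per line, the layout used for dark20s.lis above."""
    with open(path, 'w') as f:
        for filename in filenames:
            f.write(filename + '\n')

def read_list(path):
    """Read the list back into a Python list, skipping blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

# Hypothetical selection result; in practice this comes from select_data().
darks20 = ['raw/N20160102S0423.fits', 'raw/N20160102S0424.fits']

# Write to a scratch directory so the sketch leaves the working tree untouched.
list_path = os.path.join(tempfile.mkdtemp(), 'dark20s.lis')
write_list(darks20, list_path)
restored = read_list(list_path)
```

The restored list has the same content and order as the original selection.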
    

7.1.3. The strict Flag

The strict flag applies to the descriptors central_wavelength, detector_name, disperser, exposure_time, and filter_name. To keep the user interface friendly, expressions match the exposure time and central wavelength on a “close enough” principle, and match the filter name, disperser, and detector name on the “pretty name” principle.

For example, if the exposure time in the header is 10.001 seconds, it is a lot nicer from a user’s perspective to ask for “10” seconds with exposure_time==10. Similarly, asking for the “H”-band filter is more natural than asking for the “H_G0203” filter.

However, there might be cases where the exposure time or the filter name must be matched exactly. In such cases, the strict flag should be activated. For example:

$ dataselect raw/*.fits --strict --expr='exposure_time==0.95'

And:

>>> expression = 'exposure_time==0.95'
>>> parsed_expr = dataselect.expr_parser(expression, strict=True)
>>> filelist = dataselect.select_data(all_files, expression=parsed_expr)
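
The difference between the default and strict behaviours can be pictured with a small standalone sketch. It is an illustration of the two principles, not the dataselect implementation: the tolerance value and the rule used to derive a “pretty name” (the text before the first underscore) are assumptions, and both function names are hypothetical.

```python
import math

def matches_exposure(header_value, requested, strict=False, rel_tol=0.01):
    """Compare exposure times exactly (strict) or within a tolerance.

    rel_tol is an illustrative value, not the threshold DRAGONS uses.
    """
    if strict:
        return header_value == requested
    return math.isclose(header_value, requested, rel_tol=rel_tol)

def matches_filter(header_value, requested, strict=False):
    """Compare filter names in full (strict) or by their "pretty name".

    Assumes the pretty name is the part before the component suffix,
    e.g. "H_G0203" -> "H".
    """
    if strict:
        return header_value == requested
    return header_value.split('_')[0] == requested

# The two cases discussed in the text above.
loose_time = matches_exposure(10.001, 10)
strict_time = matches_exposure(10.001, 10, strict=True)
loose_filter = matches_filter('H_G0203', 'H')
strict_filter = matches_filter('H_G0203', 'H', strict=True)
```

With the default behaviour both requests match; with strict matching neither does, because 10.001 is not 10 and “H_G0203” is not “H”.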

7.2. showd

The showd command line tool helps the user gather information about files on disk. The “d” in showd stands for “descriptor”. showd is used to show the value of specific AstroData descriptors for the files requested.

Its basic usage can be printed using the following command:

$ showd --help
usage: showd [-h] --descriptors DESCRIPTORS [--long] [--csv] [--adpkg ADPKG] [--debug]
             [inputs ...]

For each input file, show the value of the specified descriptors.

positional arguments:
  inputs                Input FITS files

optional arguments:
  -h, --help            show this help message and exit
  --descriptors DESCRIPTORS, -d DESCRIPTORS
                        comma-separated list of descriptor values to return
  --long                Long format for the descriptor value
  --csv                 Format as CSV list.
  --adpkg ADPKG         Name of the astrodata instrument package to useif not
                        gemini_instruments
  --debug               Toggle debug mode

One or more descriptors can be printed together. Here is an example:

$ showd -d object,exposure_time *.fits
----------------------------------------------
filename                object   exposure_time
----------------------------------------------
N20160102S0275.fits    SN2014J          20.002
N20160102S0276.fits    SN2014J          20.002
N20160102S0277.fits    SN2014J          20.002
N20160102S0278.fits    SN2014J          20.002
N20160102S0279.fits    SN2014J          20.002
N20160102S0295.fits      FS 17          10.005
N20160102S0296.fits      FS 17          10.005
N20160102S0297.fits      FS 17          10.005
N20160102S0298.fits      FS 17          10.005
N20160102S0299.fits      FS 17          10.005

Above is a human-readable table. It is also possible to return a comma-separated (CSV) list with the --csv flag:

$ showd -d object,exposure_time *.fits --csv
filename,object,exposure_time
N20160102S0275.fits,SN2014J,20.002
N20160102S0276.fits,SN2014J,20.002
N20160102S0277.fits,SN2014J,20.002
N20160102S0278.fits,SN2014J,20.002
N20160102S0279.fits,SN2014J,20.002
N20160102S0295.fits,FS 17,10.005
N20160102S0296.fits,FS 17,10.005
N20160102S0297.fits,FS 17,10.005
N20160102S0298.fits,FS 17,10.005
N20160102S0299.fits,FS 17,10.005
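
The CSV form is convenient for further processing. As a sketch, a few lines of the output above can be read with Python’s csv module; the text is pasted in as a string here, whereas in practice it would come from a file or a pipe.

```python
import csv
import io

# A few lines of the CSV output shown above, pasted in as a string.
showd_csv = """filename,object,exposure_time
N20160102S0275.fits,SN2014J,20.002
N20160102S0276.fits,SN2014J,20.002
N20160102S0295.fits,FS 17,10.005
N20160102S0296.fits,FS 17,10.005
"""

# DictReader uses the header row as the keys for every following row.
rows = list(csv.DictReader(io.StringIO(showd_csv)))

# Group file names by target name.
files_by_object = {}
for row in rows:
    files_by_object.setdefault(row['object'], []).append(row['filename'])

# All values are read as strings, so numeric descriptors need converting.
exposure_times = sorted({float(row['exposure_time']) for row in rows})
```

Note that every value arrives as a string, so numeric descriptors such as exposure_time must be converted explicitly.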

The showd command also integrates well with dataselect. You can use dataselect together with showd if you want to print the descriptor values of a data subset:

$ dataselect raw/*.fits --tags FLAT | showd -d object,exposure_time
----------------------------------------------
filename                object   exposure_time
----------------------------------------------
N20160102S0363.fits   GCALflat          42.001
N20160102S0364.fits   GCALflat          42.001
N20160102S0365.fits   GCALflat          42.001
N20160102S0366.fits   GCALflat          42.001
N20160102S0367.fits   GCALflat          42.001

The “pipe” (|) takes the dataselect output and passes it to showd.

7.3. showrecipes

The Recipe System will select the best recipe for your data, a choice that can be overridden when necessary. To see what sequence of primitives a recipe will execute, or which recipes are available for the dataset, one can use showrecipes.

7.3.1. Show Recipe Content

To see the content of the best-matched default recipe:

$ showrecipes S20170505S0073.fits
Recipe not provided, default recipe (makeProcessedFlat) will be used.
Input file: /path_to/S20170505S0073.fits
Input tags: ['FLAT', 'LAMPOFF', 'AZEL_TARGET', 'IMAGE', 'DOMEFLAT',
'GSAOI', 'RAW', 'GEMINI', 'NON_SIDEREAL', 'CAL', 'UNPREPARED', 'SOUTH']
Input mode: sq
Input recipe: makeProcessedFlat
Matched recipe: geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedFlat
Recipe location: /path_to/dragons/geminidr/gsaoi/recipes/sq/recipes_FLAT_IMAGE.py
Recipe tags: set(['FLAT', 'IMAGE', 'GSAOI', 'CAL'])
Primitives used:
   p.prepare()
   p.addDQ()
   p.nonlinearityCorrect()
   p.ADUToElectrons()
   p.addVAR(read_noise=True, poisson_noise=True)
   p.makeLampFlat()
   p.normalizeFlat()
   p.thresholdFlatfield()
   p.storeProcessedFlat()

To see the content of a specific recipe:

$ showrecipes S20170505S0073.fits -r makeProcessedBPM
Input file: /path_to/S20170505S0073.fits
Input tags: ['FLAT', 'LAMPOFF', 'AZEL_TARGET', 'IMAGE', 'DOMEFLAT',
'GSAOI', 'RAW', 'GEMINI', 'NON_SIDEREAL', 'CAL', 'UNPREPARED', 'SOUTH']
Input mode: sq
Input recipe: makeProcessedBPM
Matched recipe: geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedBPM
Recipe location: /path_to/dragons/geminidr/gsaoi/recipes/sq/recipes_FLAT_IMAGE.pyc
Recipe tags: set(['FLAT', 'IMAGE', 'GSAOI', 'CAL'])
Primitives used:
   p.prepare()
   p.addDQ()
   p.addVAR(read_noise=True, poisson_noise=True)
   p.ADUToElectrons()
   p.selectFromInputs(tags="DARK", outstream="darks")
   p.selectFromInputs(tags="FLAT")
   p.stackFrames(stream="darks")
   p.makeLampFlat()
   p.normalizeFlat()
   p.makeBPM()

7.3.2. Show Index of Available Recipes

Before asking for a specific recipe, it is useful to know which recipes are available to the dataset. To see the index of available recipes:

$ showrecipes S20170505S0073.fits --all
Input file: /path_to/S20170505S0073.fits
Input tags: set(['FLAT', 'LAMPOFF', 'AZEL_TARGET', 'IMAGE', 'DOMEFLAT',
'GSAOI', 'RAW', 'GEMINI', 'NON_SIDEREAL', 'CAL', 'UNPREPARED', 'SOUTH'])
Recipes available for the input file:
   geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedBPM
   geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedFlat
   geminidr.gsaoi.recipes.qa.recipes_FLAT_IMAGE::makeProcessedFlat

The output shows that there are two recipes for the SQ (Science Quality) mode and one recipe for the QA (Quality Assessment) mode. By default, the Recipe System uses the SQ mode for processing the data.
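
The mode can be read off each entry in the index. As a sketch, the three entries above can be grouped by mode with a few lines of Python. This assumes that the mode is always the package name just before the recipe library module, which holds for the entries shown here.

```python
from collections import defaultdict

# Recipe entries copied from the --all output above.
available = [
    'geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedBPM',
    'geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedFlat',
    'geminidr.gsaoi.recipes.qa.recipes_FLAT_IMAGE::makeProcessedFlat',
]

recipes_by_mode = defaultdict(list)
for entry in available:
    # "module.path::recipeName" -> module path and recipe name.
    module_path, recipe_name = entry.split('::')
    # Assumption: the mode is the package just before the recipe library.
    mode = module_path.split('.')[-2]
    recipes_by_mode[mode].append(recipe_name)
```

The grouping yields two recipes under sq and one under qa, matching the description above.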

As for the other commands, you can use the --help or -h flags on the command line to display the help message.

7.4. showpars

The showpars application is a simple command line utility that shows the available parameters and their defaults for a particular primitive function applicable to a given dataset. The primitives applicable to a dataset depend on its tag set (e.g. NIRI IMAGE, F2 SPECT, GMOS BIAS), that is, on the kind of data it contains. As a result, both the implementation of a named primitive and the parameters available on it can vary across data types. For example, F2 IMAGE stackFlats uses the generic implementation of the function, while GMOS IMAGE stackFlats overrides that generic method.

We examine the help on the command line of showpars:

$ showpars -h
usage: showpars [-h] [-v] [-d] [--adpkg ADPKG] [--drpkg DRPKG] filename primitive

Primitive parameter display, v3.1.0

positional arguments:
  filename       filename
  primitive      primitive name

optional arguments:
  -h, --help     show this help message and exit
  -v, --version  show program's version number and exit
  -d, --doc      show the full docstring
  --adpkg ADPKG  Name of the astrodata instrument package to use if not
                 gemini_instruments
  --drpkg DRPKG  Name of the DRAGONS instrument package to use if not geminidr

Two arguments are required: the dataset filename, and the primitive name of interest. As readers will note, showpars provides a wealth of information about the available parameters on the specified primitive, including allowable values or ranges of values:

$ showpars S20180516S0237.fits stackFlats
Dataset tagged as set(['RAW', 'GMOS', 'GEMINI', 'SIDEREAL', 'FLAT',
'UNPREPARED', 'IMAGE', 'CAL', 'TWILIGHT', 'SOUTH'])
Settable parameters on 'stackFlats':
========================================
 Name                       Current setting

suffix               '_stack'             Filename suffix
apply_dq             True                 Use DQ to mask bad pixels?
scale                False                Scale images to the same intensity?
operation            'mean'               Averaging operation
Allowed values:
    wtmean  variance-weighted mean
    mean    arithmetic mean
    median  median
    lmedian low-median

reject_method        'minmax'             Pixel rejection method
Allowed values:
    minmax  reject highest and lowest pixels
    none    no rejection
    varclip reject pixels based on variance array
    sigclip reject pixels based on scatter

hsigma               3.0                  High rejection threshold (sigma)
    Valid Range = [0,inf)
lsigma               3.0                  Low rejection threshold (sigma)
    Valid Range = [0,inf)
mclip                True                 Use median for sigma-clipping?
max_iters            None                 Maximum number of clipping iterations
    Valid Range = [1,inf)
nlow                 0                    Number of low pixels to reject
    Valid Range = [0,inf)
nhigh                0                    Number of high pixels to reject
    Valid Range = [0,inf)
memory               None                 Memory available for stacking (GB)
    Valid Range = [0.1,inf)

With this information, users can adjust parameters for particular primitive functions. As we have seen already, this can be done from the reduce command line or the Reduce class. Building on material covered in this manual, and continuing our example from above:

$ reduce -p stackFlats:nhigh=3 <fitsfiles> [ <fitsfile>, ... ]

And the reduction proceeds. When the stackFlats primitive begins, the new value for nhigh will be used.
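
The allowed values and valid ranges reported by showpars can also serve as a quick sanity check before launching a reduction. The following sketch transcribes a subset of the constraints from the stackFlats output above and builds the -p argument in the syntax just shown. The helper user_parameter is hypothetical and not part of DRAGONS, and the second input file name is made up for illustration.

```python
import math

# Constraints transcribed from the showpars output above (subset only).
ALLOWED_VALUES = {
    'operation': {'wtmean', 'mean', 'median', 'lmedian'},
    'reject_method': {'minmax', 'none', 'varclip', 'sigclip'},
}
VALID_RANGES = {  # inclusive lower bound, exclusive upper bound
    'hsigma': (0, math.inf),
    'lsigma': (0, math.inf),
    'nlow': (0, math.inf),
    'nhigh': (0, math.inf),
}

def user_parameter(primitive, name, value):
    """Return a 'primitive:name=value' string after a basic sanity check."""
    if name in ALLOWED_VALUES and value not in ALLOWED_VALUES[name]:
        raise ValueError(f'{value!r} is not an allowed value for {name}')
    if name in VALID_RANGES:
        low, high = VALID_RANGES[name]
        if not low <= value < high:
            raise ValueError(f'{value!r} is outside [{low},{high}) for {name}')
    return f'{primitive}:{name}={value}'

# Hypothetical input file names; the -p syntax is the one shown above.
command = ['reduce', '-p', user_parameter('stackFlats', 'nhigh', 3),
           'S20180516S0237.fits', 'S20180516S0238.fits']
```

A value outside the reported range, such as a negative nhigh, raises a ValueError before any command is assembled.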

Note

Advanced User. Inheritance and class overrides within the primitive and parameter hierarchies mean that one cannot simply look at any given primitive function and its parameters and extrapolate those to all such named primitives and parameters. Primitives and their parameters are tied to the particular classes designed for those datasets identified as a particular kind of data.
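
A toy class hierarchy illustrates why the same primitive name can behave differently across data types. All class names, the attribute stackFlats_defaults, and the parameter example_param below are hypothetical; they sketch the inheritance pattern only and do not reflect the actual DRAGONS class layout.

```python
class GenericImage:
    """Hypothetical generic primitive set (illustration only)."""

    # Parameter defaults live with the class that defines the primitive.
    stackFlats_defaults = {'example_param': 1}

    def stackFlats(self):
        return 'generic stackFlats'

class F2Image(GenericImage):
    """Inherits stackFlats and its defaults unchanged."""

class GMOSImage(GenericImage):
    """Overrides both the primitive and one of its defaults."""

    stackFlats_defaults = {**GenericImage.stackFlats_defaults,
                           'example_param': 2}

    def stackFlats(self):
        return 'GMOS-specific stackFlats'

f2_result = F2Image().stackFlats()
gmos_result = GMOSImage().stackFlats()
f2_default = F2Image.stackFlats_defaults['example_param']
gmos_default = GMOSImage.stackFlats_defaults['example_param']
```

Calling the same method name resolves to different code and different defaults depending on the class, which is why showpars needs a dataset as well as a primitive name.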

7.5. typewalk

The typewalk application examines files in a directory or directory tree and reports the data classifications through the astrodata tag sets. By default, typewalk will recurse all subdirectories under the current directory. Users may specify an explicit directory with the -d, --dir option.

typewalk supports the following options:

-h, --help            show this help message and exit
-b BATCHNUM, --batch BATCHNUM
                      In shallow walk mode, number of files to process at a time in
                      the current directory. Controls behavior in large data
                      directories. Default = 100.
-d TWDIR, --dir TWDIR
                      Walk this directory and report types. default is cwd.
-f FILEMASK, --filemask FILEMASK
                      Show files matching regex <FILEMASK>. Default is all .fits and
                      .FITS files.
-n, --norecurse       Do not recurse subdirectories.
--or                  Use OR logic on 'types' criteria. If not specified, matching
                      logic is AND (See --types). Eg., --or --types SOUTH GMOS IMAGE
                      will report datasets that are one of SOUTH *OR* GMOS *OR*
                      IMAGE.
-o OUTFILE, --out OUTFILE
                      Write reported files to this file. Effective only with --tags
                      option.
--tags TAGS [TAGS ...]
                      Find datasets that match only these tag criteria. Eg., --tags
                      SOUTH GMOS IMAGE will report datasets that are all tagged
                      SOUTH *and* GMOS *and* IMAGE.
--xtags XTAGS [XTAGS ...]
                      Exclude <xtags> from reporting.
--adpkg ADPKG         Name of the astrodata instrument package to useif not
                      gemini_instruments

Files are selected and reported through a regular expression mask which, by default, finds all “.fits” and “.FITS” files. Users can change this mask with the -f, --filemask option.
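
As a rough illustration of how such a mask filters file names, consider the sketch below. The pattern default_mask is an approximation of “all .fits and .FITS files” written for this example, not the expression typewalk actually uses, and custom_mask and the file names are hypothetical.

```python
import re

# Assumed pattern: an approximation of "all .fits and .FITS files".
default_mask = re.compile(r'.*\.(fits|FITS)$')

# Hypothetical custom mask: names starting with S20131010S, ending in .fits.
custom_mask = re.compile(r'S20131010S\d+.*\.fits$')

names = [
    'S20131010S0105.fits',
    'S20131010S0105_forStack.fits',
    'N20160102S0275.FITS',
    'notes.txt',
]

# match() anchors at the start of the name; "$" anchors at the end.
default_hits = [n for n in names if default_mask.match(n)]
custom_hits = [n for n in names if custom_mask.match(n)]
```

The default-style mask keeps the three FITS files, while the narrower mask keeps only the two whose names begin with S20131010S.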

As the --tags option indicates, typewalk can find and report data that match specific tag criteria. For example, a user might want to find all GMOS image flats (--tags GMOS IMAGE FLAT) under a certain directory. typewalk will locate and report all datasets that would match the AstroData tags, set(['GMOS', 'IMAGE', 'FLAT']).

A user may request that an output file be written containing all datasets matching the AstroData tag qualifiers passed by the --tags option. An output file is specified through the -o, --out option. Output files are formatted so they may be passed directly to the reduce command line via that application’s ‘at-file’ (@file) facility. See The @file Facility or the reduce help for more on ‘at-files’. However, for such use, dataselect is probably preferable as it is more versatile than typewalk.

Users may select tag matching logic with the --or switch. By default, qualifying logic is AND, i.e. the logic specifies that all tags must be present (x AND y); --or specifies that ANY tags, enumerated with --tags, may be present (x OR y). --or is only effective when the --tags option is specified with more than one tag.

As a simple example, find all F2 SPECT datasets in a directory tree:

$ typewalk --tags SPECT F2

Users may find the --xtags flag useful, as it provides a facility for filtering results further by allowing certain tags to be excluded from the report.

For example, find GMOS, IMAGE tag sets, but exclude ACQUISITION images from reporting:

$ typewalk --tags GMOS IMAGE --xtags ACQUISITION

directory: ../test_data/output
   S20131010S0105.fits .............. (GEMINI) (SOUTH) (GMOS) (IMAGE) (RAW)
   (SIDEREAL) (UNPREPARED)

   S20131010S0105_forFringe.fits .... (GEMINI) (SOUTH) (GMOS)
   (IMAGE) (NEEDSFLUXCAL) (OVERSCAN_SUBTRACTED) (OVERSCAN_TRIMMED)
   (PREPARED) (PROCESSED_SCIENCE) (SIDEREAL)

   S20131010S0105_forStack.fits ...... (GEMINI) (SOUTH) (GMOS) (IMAGE)
   (NEEDSFLUXCAL) (OVERSCAN_SUBTRACTED) (OVERSCAN_TRIMMED)
   (PREPARED) (SIDEREAL)
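
The AND, OR, and exclusion logic described in this section can be expressed with Python set operations. The sketch below is an illustration of that logic, not the typewalk implementation. The function report is hypothetical, and applying the exclusion after the AND/OR test is an assumption. The two tag sets are transcribed from the report above.

```python
def report(tagset, tags=(), xtags=(), use_or=False):
    """Decide whether a dataset with `tagset` would be reported."""
    tags, xtags = set(tags), set(xtags)
    if tags:
        # OR: any requested tag present; AND: all requested tags present.
        wanted = bool(tags & tagset) if use_or else tags <= tagset
    else:
        wanted = True  # no --tags criteria: nothing to filter on
    # Assumption: any excluded tag present removes the dataset.
    return wanted and not (xtags & tagset)

# Tag sets transcribed from the report above.
raw = {'GEMINI', 'SOUTH', 'GMOS', 'IMAGE', 'RAW', 'SIDEREAL', 'UNPREPARED'}
for_stack = {'GEMINI', 'SOUTH', 'GMOS', 'IMAGE', 'NEEDSFLUXCAL',
             'OVERSCAN_SUBTRACTED', 'OVERSCAN_TRIMMED', 'PREPARED', 'SIDEREAL'}

and_match = report(raw, tags=['GMOS', 'IMAGE'], xtags=['ACQUISITION'])
and_miss = report(raw, tags=['GMOS', 'SPECT'])
or_match = report(raw, tags=['GMOS', 'SPECT'], use_or=True)
excluded = report(for_stack, tags=['GMOS', 'IMAGE'], xtags=['PREPARED'])
```

The raw file passes the GMOS IMAGE request, fails GMOS SPECT under AND logic, passes it under OR logic, and the prepared file is dropped once PREPARED is excluded.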