7. Supplemental tools
DRAGONS provides a number of command line tools that users should find helpful
in executing reduce on their data. Some of those tools also offer an API.
These supplemental tools can help users discover information, not only about their own data, but about the Recipe System, such as available recipes, primitives, and defined tags.
If your environment has been configured correctly, these applications will work directly.
7.1. dataselect
The dataselect tool helps with bookkeeping and with creating lists
of input files to feed to the Recipe System. The tool has a command line
script and an API. It finds files that match criteria defined
with AstroData Tags and expressions involving AstroData Descriptors.
You can access the basic documentation from the command line by typing:
$ dataselect --help
usage: dataselect [-h] [--tags TAGS] [--xtags XTAGS] [--expr EXPRESSION] [--strict]
[--output OUTPUT] [--adpkg ADPKG] [--verbose] [--debug]
inputs [inputs ...]
Find files that matches certain criteria defined by tags and expression involving
descriptors.
positional arguments:
inputs Input FITS file
optional arguments:
-h, --help show this help message and exit
--tags TAGS, -t TAGS Comma-separated list of required tags.
--xtags XTAGS Comma-separated list of tags to exclude
--expr EXPRESSION Expression to apply to descriptors (and tags)
--strict Toggle on strict expression matching for exposure_time (not
just close) and for filter_name (match component number).
--output OUTPUT, -o OUTPUT
Name of the output file
--adpkg ADPKG Name of the astrodata instrument package to use if not
gemini_instruments
--verbose, -v Toggle verbose mode when using -o
--debug Toggle debug mode
7.1.1. dataselect Command Line Tool
dataselect accepts a space-separated list of input files, and wildcards are supported.
Below are some usage examples.
This command selects all the FITS files inside the raw directory with a tag that matches DARK:
$ dataselect raw/*.fits --tags DARK
To select darks of a specific exposure time:
$ dataselect raw/*.fits --tags DARK --expr='exposure_time==20'
To send that list to a file that can be used later:
$ dataselect raw/*.fits --tags DARK --expr='exposure_time==20' -o dark20s.lis
This command prints all the files in the raw directory that do not have the CAL tag (calibration files):
$ dataselect raw/*.fits --xtags CAL
The --xtags option can be used together with --tags. To select images that are not flats:
$ dataselect raw/*.fits --tags IMAGE --xtags FLAT
This command selects all the files with a specific target name:
$ dataselect --expr 'object=="FS 17"' raw/*.fits
This command selects all the files with an “observation_class” descriptor that matches the “science” value and a specific exposure time:
$ dataselect --expr '(observation_class=="science" and exposure_time==60.)' raw/*.fits
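The way required and excluded tags combine in the examples above can be pictured with a small pure-Python model. This is only a sketch of the selection logic, not the DRAGONS implementation: the `matches` function and the sample tag sets are hypothetical, and the real tool reads the tags from the files through AstroData.

```python
def matches(file_tags, tags=(), xtags=()):
    """Return True if every required tag is present and no excluded tag is."""
    file_tags = set(file_tags)
    has_required = set(tags) <= file_tags
    has_excluded = bool(set(xtags) & file_tags)
    return has_required and not has_excluded


# Hypothetical tag sets standing in for files on disk.
sample = {
    "dark.fits": {"DARK", "CAL", "RAW"},
    "flat.fits": {"FLAT", "IMAGE", "CAL", "RAW"},
    "science.fits": {"IMAGE", "RAW", "SIDEREAL"},
}

# Same idea as: dataselect raw/*.fits --tags IMAGE --xtags FLAT
non_flat_images = sorted(
    name for name, file_tags in sample.items()
    if matches(file_tags, tags=["IMAGE"], xtags=["FLAT"])
)
print(non_flat_images)
```

In this model, a file passes only when both conditions hold, which is why the flat is rejected even though it carries the IMAGE tag.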
7.1.2. dataselect API
The same selections presented in the command line section above can be made
with the dataselect API. Here are the API versions of the examples
presented in the previous section.
The list of files on disk must first be obtained with Python’s glob module.
>>> import glob
>>> all_files = glob.glob('raw/*.fits')
The dataselect module is located in gempy.adlibrary and must first be imported:
>>> from gempy.adlibrary import dataselect
This command selects all the FITS files inside the raw directory with a tag that matches DARK:
>>> all_darks = dataselect.select_data(all_files, ['DARK'])
To select darks of a specific exposure time:
>>> expression = 'exposure_time==20'
>>> parsed_expr = dataselect.expr_parser(expression)
>>> darks20 = dataselect.select_data(all_files, ['DARK'], [], parsed_expr)
To send that list to a file that can be used later:
>>> expression = 'exposure_time==20'
>>> parsed_expr = dataselect.expr_parser(expression)
>>> darks20 = dataselect.select_data(all_files, ['DARK'], [], parsed_expr)
>>> with open('dark20s.lis', 'w') as f:
...     for filename in darks20:
...         f.write(filename + '\n')
...
Note that writing the list to a file on disk will probably not be needed very often when using the API, as Reduce takes the Python list directly.
This command selects all the files in the raw directory that do not have the CAL tag (calibration files):
>>> non_cals = dataselect.select_data(all_files, [], ['CAL'])
Excluded tags can be used together with required tags. To select images that are not flats:
>>> has_tags = ['IMAGE']
>>> has_not_tags = ['FLAT']
>>> non_flat_images = dataselect.select_data(all_files, has_tags, has_not_tags)
This command selects all the files with a specific target name:
>>> expression = 'object=="FS 17"'
>>> parsed_expr = dataselect.expr_parser(expression)
>>> stds = dataselect.select_data(all_files, expression=parsed_expr)
This command selects all the files with an “observation_class” descriptor that matches the “science” value and a specific exposure time:
>>> expression = '(observation_class=="science" and exposure_time==60.)'
>>> parsed_expr = dataselect.expr_parser(expression)
>>> sci60 = dataselect.select_data(all_files, expression=parsed_expr)
7.1.3. The strict Flag
The strict flag applies to the descriptors central_wavelength, detector_name,
disperser, exposure_time, and filter_name.
To keep the user interface more friendly, in the expressions, the exposure
time and central wavelength are matched on a “close enough” principle and
the filter name, disperser and detector name are matched on the
“pretty name” principle.
For example, if the exposure time in the header is 10.001 seconds, it is much
nicer from a user’s perspective to ask for a match on “10” seconds: exposure_time==10.
Similarly, asking for the “H”-band filter is more natural than asking for the
“H_G0203” filter.
However, there might be cases where the exposure time or the filter name must
be matched exactly. In such cases, the strict flag should be activated.
For example:
$ dataselect raw/*.fits --strict --expr='exposure_time==0.95'
And, using the API:
>>> expression = 'exposure_time==0.95'
>>> parsed_expr = dataselect.expr_parser(expression, strict=True)
>>> filelist = dataselect.select_data(all_files, expression=parsed_expr)
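The difference between the default matching and strict matching can be illustrated with a short standalone sketch. It is not the code DRAGONS uses: the 0.01 tolerance, the function names, and the assumption that the component number is a trailing `_G` followed by digits (as in “H_G0203”) are all illustrative choices.

```python
import math
import re


def exposure_matches(header_value, requested, strict=False, tolerance=0.01):
    """Compare exposure times exactly (strict) or within an assumed tolerance."""
    if strict:
        return header_value == requested
    return math.isclose(header_value, requested, abs_tol=tolerance)


def filter_matches(header_value, requested, strict=False):
    """Compare filter names exactly (strict) or on the 'pretty name' only."""
    if strict:
        return header_value == requested
    # Assumed convention: the component number is a trailing "_G" plus digits.
    pretty_name = re.sub(r"_G\d+$", "", header_value)
    return pretty_name == requested


print(exposure_matches(10.001, 10))               # close enough
print(exposure_matches(10.001, 10, strict=True))  # exact comparison
print(filter_matches("H_G0203", "H"))             # pretty name
print(filter_matches("H_G0203", "H", strict=True))
```

With these assumptions, asking for 10 seconds or for the “H” filter succeeds by default and fails once strict matching is requested.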
7.2. showd
The showd command line tool helps the user gather information about files
on disk. The “d” in showd stands for “descriptor”. showd is used to
show the value of specific AstroData descriptors for the files requested.
Its basic usage can be printed using the following command:
$ showd --help
usage: showd [-h] --descriptors DESCRIPTORS [--long] [--csv] [--adpkg ADPKG] [--debug]
[inputs ...]
For each input file, show the value of the specified descriptors.
positional arguments:
inputs Input FITS files
optional arguments:
-h, --help show this help message and exit
--descriptors DESCRIPTORS, -d DESCRIPTORS
comma-separated list of descriptor values to return
--long Long format for the descriptor value
--csv Format as CSV list.
--adpkg ADPKG Name of the astrodata instrument package to use if not
gemini_instruments
--debug Toggle debug mode
One or more descriptors can be printed together. Here is an example:
$ showd -d object,exposure_time *.fits
----------------------------------------------
filename                  object exposure_time
----------------------------------------------
N20160102S0275.fits      SN2014J        20.002
N20160102S0276.fits      SN2014J        20.002
N20160102S0277.fits      SN2014J        20.002
N20160102S0278.fits      SN2014J        20.002
N20160102S0279.fits      SN2014J        20.002
N20160102S0295.fits        FS 17        10.005
N20160102S0296.fits        FS 17        10.005
N20160102S0297.fits        FS 17        10.005
N20160102S0298.fits        FS 17        10.005
N20160102S0299.fits        FS 17        10.005
Above is a human-readable table. It is also possible to return a
comma-separated (CSV) list with the --csv option:
$ showd -d object,exposure_time *.fits --csv
filename,object,exposure_time
N20160102S0275.fits,SN2014J,20.002
N20160102S0276.fits,SN2014J,20.002
N20160102S0277.fits,SN2014J,20.002
N20160102S0278.fits,SN2014J,20.002
N20160102S0279.fits,SN2014J,20.002
N20160102S0295.fits,FS 17,10.005
N20160102S0296.fits,FS 17,10.005
N20160102S0297.fits,FS 17,10.005
N20160102S0298.fits,FS 17,10.005
N20160102S0299.fits,FS 17,10.005
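The CSV form is convenient for further processing in a script. As a minimal sketch, the snippet below parses a shortened copy of the output above with Python's standard csv module and groups the file names by target. The text is embedded as a string here so the example is self-contained; in practice it would come from a file or from the captured output of showd.

```python
import csv
import io
from collections import defaultdict

# Shortened copy of the CSV output shown above.
showd_output = """filename,object,exposure_time
N20160102S0275.fits,SN2014J,20.002
N20160102S0276.fits,SN2014J,20.002
N20160102S0295.fits,FS 17,10.005
N20160102S0296.fits,FS 17,10.005
"""

# Group file names by the value of the "object" column.
files_by_object = defaultdict(list)
for row in csv.DictReader(io.StringIO(showd_output)):
    files_by_object[row["object"]].append(row["filename"])

for target, names in files_by_object.items():
    print(target, len(names))
```

Because the CSV reader splits on commas only, a target name containing a space, such as “FS 17”, is kept intact.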
The showd command also integrates well with dataselect. You can use
dataselect together with showd if you want to print
the descriptor values of a data subset:
$ dataselect raw/*.fits --tags FLAT | showd -d object,exposure_time
----------------------------------------------
filename                  object exposure_time
----------------------------------------------
N20160102S0363.fits     GCALflat        42.001
N20160102S0364.fits     GCALflat        42.001
N20160102S0365.fits     GCALflat        42.001
N20160102S0366.fits     GCALflat        42.001
N20160102S0367.fits     GCALflat        42.001
The “pipe” (|) takes the dataselect output and passes it to showd.
7.3. showrecipes
The Recipe System will select the best recipe for your data, and that choice
can be overridden when necessary. To see what sequence of primitives a
recipe will execute, or which recipes are available for the dataset, one
can use showrecipes.
7.3.1. Show Recipe Content
To see the content of the best-matched default recipe:
$ showrecipes S20170505S0073.fits
Recipe not provided, default recipe (makeProcessedFlat) will be used.
Input file: /path_to/S20170505S0073.fits
Input tags: ['FLAT', 'LAMPOFF', 'AZEL_TARGET', 'IMAGE', 'DOMEFLAT',
'GSAOI', 'RAW', 'GEMINI', 'NON_SIDEREAL', 'CAL', 'UNPREPARED', 'SOUTH']
Input mode: sq
Input recipe: makeProcessedFlat
Matched recipe: geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedFlat
Recipe location: /path_to/dragons/geminidr/gsaoi/recipes/sq/recipes_FLAT_IMAGE.py
Recipe tags: set(['FLAT', 'IMAGE', 'GSAOI', 'CAL'])
Primitives used:
p.prepare()
p.addDQ()
p.nonlinearityCorrect()
p.ADUToElectrons()
p.addVAR(read_noise=True, poisson_noise=True)
p.makeLampFlat()
p.normalizeFlat()
p.thresholdFlatfield()
p.storeProcessedFlat()
To see the content of a specific recipe:
$ showrecipes S20170505S0073.fits -r makeProcessedBPM
Input file: /path_to/S20170505S0073.fits
Input tags: ['FLAT', 'LAMPOFF', 'AZEL_TARGET', 'IMAGE', 'DOMEFLAT',
'GSAOI', 'RAW', 'GEMINI', 'NON_SIDEREAL', 'CAL', 'UNPREPARED', 'SOUTH']
Input mode: sq
Input recipe: makeProcessedBPM
Matched recipe: geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedBPM
Recipe location: /path_to/dragons/geminidr/gsaoi/recipes/sq/recipes_FLAT_IMAGE.pyc
Recipe tags: set(['FLAT', 'IMAGE', 'GSAOI', 'CAL'])
Primitives used:
p.prepare()
p.addDQ()
p.addVAR(read_noise=True, poisson_noise=True)
p.ADUToElectrons()
p.selectFromInputs(tags="DARK", outstream="darks")
p.selectFromInputs(tags="FLAT")
p.stackFrames(stream="darks")
p.makeLampFlat()
p.normalizeFlat()
p.makeBPM()
7.3.2. Show Index of Available Recipes
Of course, in order to ask for a specific recipe, it is useful to know which recipes are available for the dataset. To see the index of available recipes:
$ showrecipes S20170505S0073.fits --all
Input file: /path_to/S20170505S0073.fits
Input tags: set(['FLAT', 'LAMPOFF', 'AZEL_TARGET', 'IMAGE', 'DOMEFLAT',
'GSAOI', 'RAW', 'GEMINI', 'NON_SIDEREAL', 'CAL', 'UNPREPARED', 'SOUTH'])
Recipes available for the input file:
geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedBPM
geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedFlat
geminidr.gsaoi.recipes.qa.recipes_FLAT_IMAGE::makeProcessedFlat
The output shows that there are two recipes for the SQ (Science Quality) mode and one recipe for the QA (Quality Assessment) mode. By default, the Recipe System uses the SQ mode for processing the data.
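The recipe identifiers in that listing follow a visible pattern: a dotted module path, then `::`, then the recipe name. As an illustrative sketch, the snippet below splits the identifiers and groups the recipe names by mode. It assumes the mode is the second-to-last component of the module path, which holds for the three entries shown but is not guaranteed by anything stated here; `split_recipe` is a hypothetical helper.

```python
from collections import defaultdict

# Recipe identifiers copied from the listing above.
recipes = [
    "geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedBPM",
    "geminidr.gsaoi.recipes.sq.recipes_FLAT_IMAGE::makeProcessedFlat",
    "geminidr.gsaoi.recipes.qa.recipes_FLAT_IMAGE::makeProcessedFlat",
]


def split_recipe(identifier):
    """Split 'module.path::recipe' into (mode, recipe name)."""
    module_path, recipe_name = identifier.split("::")
    # Assumption: the mode is the second-to-last part of the module path.
    mode = module_path.split(".")[-2]
    return mode, recipe_name


recipes_by_mode = defaultdict(list)
for identifier in recipes:
    mode, name = split_recipe(identifier)
    recipes_by_mode[mode].append(name)

print(dict(recipes_by_mode))
```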
As for the other commands, you can use the --help or -h flags on
the command line to display the help message.
7.4. showpars
The showpars application is a simple command line utility that lets users
see the available parameters and defaults for a particular primitive
function applicable to a given dataset. The primitives applicable to a
dataset depend on its tag set (e.g. NIRI IMAGE, F2 SPECT, GMOS BIAS), that
is, on the kind of data being examined. As a result, the parameters available
on a named primitive function can vary across data types, as can the primitive
function itself. For example, F2 IMAGE stackFlats uses the generic
implementation of the function, while GMOS IMAGE stackFlats overrides that
generic method.
The command line help for showpars is:
$ showpars -h
usage: showpars [-h] [-v] [-d] [--adpkg ADPKG] [--drpkg DRPKG] filename primitive
Primitive parameter display, v3.1.0
positional arguments:
filename filename
primitive primitive name
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
-d, --doc show the full docstring
--adpkg ADPKG Name of the astrodata instrument package to use if not
gemini_instruments
--drpkg DRPKG Name of the DRAGONS instrument package to use if not geminidr
Two arguments are required: the dataset filename and the name of the primitive
of interest. showpars then reports the available parameters on the specified
primitive, including allowable values or ranges of values:
$ showpars S20180516S0237.fits stackFlats
Dataset tagged as set(['RAW', 'GMOS', 'GEMINI', 'SIDEREAL', 'FLAT',
'UNPREPARED', 'IMAGE', 'CAL', 'TWILIGHT', 'SOUTH'])
Settable parameters on 'stackFlats':
========================================
Name Current setting
suffix '_stack' Filename suffix
apply_dq True Use DQ to mask bad pixels?
scale False Scale images to the same intensity?
operation 'mean' Averaging operation
Allowed values:
wtmean variance-weighted mean
mean arithmetic mean
median median
lmedian low-median
reject_method 'minmax' Pixel rejection method
Allowed values:
minmax reject highest and lowest pixels
none no rejection
varclip reject pixels based on variance array
sigclip reject pixels based on scatter
hsigma 3.0 High rejection threshold (sigma)
Valid Range = [0,inf)
lsigma 3.0 Low rejection threshold (sigma)
Valid Range = [0,inf)
mclip True Use median for sigma-clipping?
max_iters None Maximum number of clipping iterations
Valid Range = [1,inf)
nlow 0 Number of low pixels to reject
Valid Range = [0,inf)
nhigh 0 Number of high pixels to reject
Valid Range = [0,inf)
memory None Memory available for stacking (GB)
Valid Range = [0.1,inf)
With this information, users can adjust parameters for particular primitive
functions. As we have seen already, this can be done from the reduce
command line or the Reduce class. Building on material covered in this
manual, and continuing our example from above:
$ reduce -p stackFlats:nhigh=3 <fitsfiles> [ <fitsfile>, ... ]
And the reduction proceeds. When the stackFlats primitive begins, the
new value for nhigh will be used.
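Before launching a long reduction, it can be handy to check a proposed override against what showpars reported. The sketch below is a hypothetical helper, not part of DRAGONS: it parses a `primitive:name=value` string like the one passed to `-p` and checks the value against a small table of allowed values and ranges copied by hand from the showpars output above.

```python
import math

# Constraints copied by hand from the showpars output above (a subset).
allowed_values = {
    "operation": {"wtmean", "mean", "median", "lmedian"},
    "reject_method": {"minmax", "none", "varclip", "sigclip"},
}
valid_ranges = {  # half-open ranges [low, high)
    "nlow": (0, math.inf),
    "nhigh": (0, math.inf),
    "hsigma": (0, math.inf),
    "lsigma": (0, math.inf),
}


def check_override(text):
    """Parse 'primitive:name=value' and check the value against the tables."""
    target, _, value = text.partition("=")
    primitive, _, name = target.partition(":")
    if name in allowed_values:
        ok = value in allowed_values[name]
    elif name in valid_ranges:
        low, high = valid_ranges[name]
        ok = low <= float(value) < high
    else:
        ok = True  # parameter not in our small table: nothing to check
    return primitive, name, value, ok


print(check_override("stackFlats:nhigh=3"))
print(check_override("stackFlats:operation=average"))
```

The table must be rebuilt for each dataset type, since, as discussed in this section, the parameters of a primitive can differ from one kind of data to another.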
Note
Advanced User. Inheritance and class overrides within the primitive and parameter hierarchies mean that one cannot simply look at any given primitive function and its parameters and extrapolate those to all such named primitives and parameters. Primitives and their parameters are tied to the particular classes designed for those datasets identified as a particular kind of data.
7.5. typewalk
The typewalk application examines files in a directory or directory tree
and reports the data classifications through the astrodata tag sets. By
default, typewalk will recurse through all subdirectories under the current
directory. Users may specify an explicit directory with the -d, --dir
option.
typewalk supports the following options:
-h, --help show this help message and exit
-b BATCHNUM, --batch BATCHNUM
In shallow walk mode, number of files to process at a time in
the current directory. Controls behavior in large data
directories. Default = 100.
-d TWDIR, --dir TWDIR
Walk this directory and report types. default is cwd.
-f FILEMASK, --filemask FILEMASK
Show files matching regex <FILEMASK>. Default is all .fits and
.FITS files.
-n, --norecurse Do not recurse subdirectories.
--or Use OR logic on 'types' criteria. If not specified, matching
logic is AND (See --types). Eg., --or --types SOUTH GMOS IMAGE
will report datasets that are one of SOUTH *OR* GMOS *OR*
IMAGE.
-o OUTFILE, --out OUTFILE
Write reported files to this file. Effective only with --tags
option.
--tags TAGS [TAGS ...]
Find datasets that match only these tag criteria. Eg., --tags
SOUTH GMOS IMAGE will report datasets that are all tagged
SOUTH *and* GMOS *and* IMAGE.
--xtags XTAGS [XTAGS ...]
Exclude <xtags> from reporting.
--adpkg ADPKG Name of the astrodata instrument package to use if not
gemini_instruments
Files are selected and reported through a regular expression mask which, by default, finds all “.fits” and “.FITS” files. Users can change this mask with the -f, --filemask option.
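The combination of a directory walk and a regular expression mask can be sketched with the standard library. This is an illustration of the idea, not typewalk's code: `find_files` is a hypothetical function and the default pattern is only an assumed equivalent of “all .fits and .FITS files”.

```python
import os
import re
import tempfile


def find_files(top=".", filemask=r".*\.(fits|FITS)$", recurse=True):
    """Yield paths under `top` whose file name matches the regex `filemask`."""
    pattern = re.compile(filemask)
    for dirpath, dirnames, filenames in os.walk(top):
        if not recurse:
            dirnames[:] = []  # stop os.walk from descending further
        for name in sorted(filenames):
            if pattern.match(name):
                yield os.path.join(dirpath, name)


# Demonstration on a throwaway directory tree with empty files.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    for rel in ("a.fits", "b.FITS", "notes.txt", os.path.join("sub", "c.fits")):
        open(os.path.join(top, rel), "w").close()
    deep = sorted(os.path.relpath(p, top) for p in find_files(top))
    shallow = sorted(os.path.relpath(p, top)
                     for p in find_files(top, recurse=False))

print(deep)     # includes the file in the subdirectory
print(shallow)  # top-level files only
```

The `recurse=False` case plays the role of the -n, --norecurse option in this sketch.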
As the --tags option indicates, typewalk can find and report data
that match specific tag criteria. For example, a user might want to find
all GMOS image flats (--tags GMOS IMAGE FLAT) under a certain directory.
typewalk will locate and report all datasets that match the
AstroData tags set(['GMOS', 'IMAGE', 'FLAT']).
A user may request that an output file be written containing all datasets
matching the AstroData tag qualifiers passed with the --tags option. An output
file is specified through the -o, --out option. Output files are
formatted so they may be passed directly to the reduce command line via
that application’s ‘at-file’ (@file) facility. See The @file Facility or the
reduce help for more on ‘at-files’. However, for such use, dataselect is
probably preferable as it is more versatile than typewalk.
Users may select tag matching logic with the --or switch. By default, qualifying logic is AND, i.e. all tags must be present (x AND y); --or specifies that ANY of the tags enumerated with --tags may be present (x OR y). --or is only effective when the --tags option is specified with more than one tag.
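The two matching modes map directly onto Python's `all` and `any`. The following sketch shows the difference on a made-up tag set; `tags_match` and the sample tags are hypothetical and are not taken from typewalk itself.

```python
def tags_match(file_tags, tags, use_or=False):
    """AND logic by default (all tags present); OR logic if use_or is True."""
    file_tags = set(file_tags)
    combine = any if use_or else all
    return combine(tag in file_tags for tag in tags)


# Hypothetical tag set for an F2 spectroscopy file.
f2_spect = {"GEMINI", "SOUTH", "F2", "SPECT", "RAW"}
wanted = ["SOUTH", "GMOS", "IMAGE"]

print(tags_match(f2_spect, wanted))               # AND: GMOS and IMAGE missing
print(tags_match(f2_spect, wanted, use_or=True))  # OR: SOUTH is enough
```

With a single tag the two modes give the same answer, which is why the switch only matters when more than one tag is listed.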
As a simple example, find all F2 SPECT datasets in a directory tree:
$ typewalk --tags SPECT F2
Users may find the --xtags flag useful, as it filters results further by excluding certain tags from the report.
For example, find GMOS, IMAGE tag sets, but exclude ACQUISITION images from reporting:
$ typewalk --tags GMOS IMAGE --xtags ACQUISITION
directory: ../test_data/output
S20131010S0105.fits .............. (GEMINI) (SOUTH) (GMOS) (IMAGE) (RAW)
(SIDEREAL) (UNPREPARED)
S20131010S0105_forFringe.fits .... (GEMINI) (SOUTH) (GMOS)
(IMAGE) (NEEDSFLUXCAL) (OVERSCAN_SUBTRACTED) (OVERSCAN_TRIMMED)
(PREPARED) (PROCESSED_SCIENCE) (SIDEREAL)
S20131010S0105_forStack.fits ...... (GEMINI) (SOUTH) (GMOS) (IMAGE)
(NEEDSFLUXCAL) (OVERSCAN_SUBTRACTED) (OVERSCAN_TRIMMED)
(PREPARED) (SIDEREAL)