2.2. The AstroData Object
The AstroData
object is an internal representation of a file on disk.
As of this version, only a FITS layer has been written, but AstroData
itself
is not limited to FITS.
The internal structure of the AstroData
object makes uses of
astropy.nddata.NDData
, astropy.table
, and
astropy.io.fits.Header
, the latter simply because it is a
convenient ordered dictionary.
Try it yourself
Download the data package (Try it yourself) if you wish to follow along and run the examples. Then
$ cd <path>/ad_usermanual/playground
$ python
2.2.1. Global vs Extension-specific
At the very top level, the structure is divided in two types of information. In the first category, there is the information that applies to the data globally, for example the information that would be stored in a FITS Primary Header Unit, a table from a catalog that matches the RA and DEC of the field, etc. In the second category, there is the information specific to individual science pixel extensions, for example the gain of the amplifier, the data themselves, the error on those data, etc.
Let us look at an example. The info()
method shows
the content of the AstroData
object and its organization, from the user’s
perspective.:
>>> import astrodata
>>> import gemini_instruments
>>> ad = astrodata.open('../playdata/N20170609S0154_varAdded.fits')
>>> ad.info()
Filename: N20170609S0154_varAdded.fits
Tags: ACQUISITION GEMINI GMOS IMAGE NORTH OVERSCAN_SUBTRACTED OVERSCAN_TRIMMED
PREPARED SIDEREAL
Pixels Extensions
Index Content Type Dimensions Format
[ 0] science NDAstroData (2112, 256) float32
.variance ndarray (2112, 256) float32
.mask ndarray (2112, 256) uint16
.OBJCAT Table (6, 43) n/a
.OBJMASK ndarray (2112, 256) uint8
[ 1] science NDAstroData (2112, 256) float32
.variance ndarray (2112, 256) float32
.mask ndarray (2112, 256) uint16
.OBJCAT Table (8, 43) n/a
.OBJMASK ndarray (2112, 256) uint8
[ 2] science NDAstroData (2112, 256) float32
.variance ndarray (2112, 256) float32
.mask ndarray (2112, 256) uint16
.OBJCAT Table (7, 43) n/a
.OBJMASK ndarray (2112, 256) uint8
[ 3] science NDAstroData (2112, 256) float32
.variance ndarray (2112, 256) float32
.mask ndarray (2112, 256) uint16
.OBJCAT Table (5, 43) n/a
.OBJMASK ndarray (2112, 256) uint8
Other Extensions
Type Dimensions
.REFCAT Table (245, 16)
The “Pixel Extensions” contain the pixel data. Each extension is represented
individually in a list (0-indexed like all Python lists). The science pixel
data, its associated metadata (extension header), and any other pixel or table
extensions directly associated with that science pixel data are stored in
a NDAstroData
object which is a subclass of astropy NDData
. We will
return to this structure later. An AstroData
extension is accessed like
any list: ad[0]
. To access the science pixels, one uses ad[0].data
; for
the object mask of the first extension, ad[0].OBJMASK
.
In the example above, the “Other Extensions” at the bottom of the
info()
display contains a REFCAT
table which in
this case is a list of stars from a catalog that overlaps the field of view
covered by the pixel data. The “Other Extensions” are global extensions. They
are not attached to any pixel extension in particular. To access a global
extension one simply uses the name of that extension: ad.REFCAT
.
2.2.2. Organization of the Global Information
All the global information is stored in attributes of the AstroData
object.
The global headers, or Primary Header Unit (PHU), is stored in the phu
attribute as an astropy.io.fits.Header
.
Any global tables, like REFCAT
above, are stored in the private attribute
_tables
as a Python dictionary with the name (eg. “REFCAT”) as the key.
All tables are stored as astropy.table.Table
. Access to those table
is done using the key directly as if it were a normal attribute, eg.
ad.REFCAT
. Header information for the table, if read in from a FITS table,
is stored in the meta
attribute of the astropy.table.Table
, eg.
ad.REFCAT.meta['header']
. It is for information only, it is not used.
2.2.3. Organization of the Extension-specific Information
The pixel data are stored in the AstroData
attribute nddata
as a list
of NDAstroData
object. The NDAstroData
object is a subclass of astropy
NDData
and it is fully compatible with any function expecting an NDData
as
input. The pixel extensions are accessible through slicing, eg. ad[0]
or
even ad[0:2]
. A slice of an AstroData object is an AstroData object, and
all the global attributes are kept. For example:
>>> ad[0].info()
Filename: N20170609S0154_varAdded.fits
Tags: ACQUISITION GEMINI GMOS IMAGE NORTH OVERSCAN_SUBTRACTED OVERSCAN_TRIMMED
PREPARED SIDEREAL
Pixels Extensions
Index Content Type Dimensions Format
[ 0] science NDAstroData (2112, 256) float32
.variance ndarray (2112, 256) float32
.mask ndarray (2112, 256) uint16
.OBJCAT Table (6, 43) n/a
.OBJMASK ndarray (2112, 256) uint8
Other Extensions
Type Dimensions
.REFCAT Table (245, 16)
Note how REFCAT
is still present.
The science data is accessed as ad[0].data
, the variance as ad[0].variance
,
and the data quality plane as ad[0].mask
. Those familiar with astropy
NDData
will recognize the structure “data, error, mask”, and will notice
some differences. First AstroData
uses the variance for the error plane, not
the standard deviation. Another difference will be evident only when one looks
at the content of the mask. NDData
masks contain booleans, AstroData
masks
are uint16
bit mask that contains information about the type of bad pixels
rather than just flagging them a bad or not. Since 0
is equivalent to
False
(good pixel), the AstroData
mask is fully compatible with the
NDData
mask.
Header information for the extension is stored in the NDAstroData
meta
attribute. All table and pixel extensions directly associated with the
science extension are also stored in the meta
attribute.
Technically, an extension header is located in ad.nddata[0].meta['header']
.
However, for obviously needed convenience, the normal way to access that header
is ad[0].hdr
.
Tables and pixel arrays associated with a science extension are
stored in ad.nddata[0].meta['other']
as a dictionary keyed on the array
name, eg. OBJCAT
, OBJMASK
. As it is for global tables, astropy tables
are used for extension tables. The extension tables and extra pixel arrays are
accessed, like the global tables, by using the table name rather than the long
format, for example ad[0].OBJCAT
and ad[0].OBJMASK
.
When reading a FITS Table, the header information is stored in the
meta['header']
of the table, eg. ad[0].OBJCAT.meta['header']
. That
information is not used, it is simply a place to store what was read from disk.
The header of a pixel extension directly associated with the science extension
should match that of the science extension. Therefore such headers are not
stored in AstroData
. For example, the header of ad[0].OBJMASK
is the
same as that of the science, ad[0].hdr
.
The world coordinate system (WCS) is stored internally in the wcs
attribute
of the NDAstroData
object. It is constructed from the header keywords when
the FITS file is read from disk, or directly from the WCS
extension if
present (see the next chapter). If the WCS is modified (for
example, by refining the pointing or attaching a more accurate wavelength
calibration), the FITS header keywords are not updated and therefore they should
never be used to determine the world coordinates of any pixel. These keywords are
only updated when the object is written to disk as a FITS file. The WCS is
retrieved as follows: ad[0].wcs
.
2.2.4. A Note on Memory Usage
When an file is opened, the headers are loaded into memory, but the pixels are not. The pixel data are loaded into memory only when they are first needed. This is not real “memory mapping”, more of a delayed loading. This is useful when someone is only interested in the metadata, especially when the files are very large.