spacepy.pycdf.CDF

class spacepy.pycdf.CDF(pathname, masterpath=None, create=None, readonly=None, encoding='utf-8')

Python object representing a CDF file.

Open or create a CDF file by creating an object of this class.

Parameters:
pathname : string

name of the file to open or create

masterpath : string

name of the master CDF file to use in creating a new file. If not provided, an existing file is opened; if provided but evaluates to False (e.g., ''), an empty new CDF is created.

create : bool

Create a new CDF even if masterpath isn’t provided

readonly : bool

Open the CDF read-only. Default True if opening an existing CDF; False if creating a new one. A readonly CDF with many variables may be slow to close on CDF library versions before 3.8.1. See readonly().

encoding : str, optional

Text encoding to use when reading and writing strings. Default 'utf-8'.

Raises:
CDFError

if CDF library reports an error

Warns:
CDFWarning

if CDF library reports a warning and interpreter is set to error on warnings.

Examples

Open a CDF by creating a CDF object, e.g.:

>>> cdffile = pycdf.CDF('cdf_filename.cdf')

Be sure to close() or save() when done.

Note

Existing CDF files are opened read-only by default; see readonly() to change.

CDF supports the with keyword, like other file objects, so:

>>> with pycdf.CDF('cdf_filename.cdf') as cdffile:
...     #do brilliant things with the CDF

will open the CDF, execute the indented statements, and close the CDF when finished or when an error occurs. The Python docs include more detail on this ‘context manager’ ability.

CDF objects behave like a Python dictionary, where the keys are the names of variables in the CDF and the values are Var objects. Like a dictionary, they are iterable, so it is easy to loop over all of the variables in a file. Some examples:

  1. List the names of all variables in the open CDF cdffile:

    >>> cdffile.keys()
    >>> for k in cdffile: #Alternate
    ...     print(k)
    
  2. Get a Var object for the variable named Epoch:

    >>> epoch = cdffile['Epoch']
    
  3. Determine if a CDF contains a variable named B_GSE:

    >>> if 'B_GSE' in cdffile:
    ...     print('B_GSE is in the file')
    ... else:
    ...     print('B_GSE is not in the file')
    
  4. Find how many variables are in the file:

    >>> print(len(cdffile))
    
  5. Delete the variable Epoch from the open CDF file cdffile:

    >>> del cdffile['Epoch']
    
  6. Display a summary of variables and types in open CDF file cdffile:

    >>> print(cdffile)
    
  7. Open the CDF named cdf_filename.cdf, read all the data from all variables into dictionary data, and close it when done or if an error occurs:

    >>> with pycdf.CDF('cdf_filename.cdf') as cdffile:
    ...     data = cdffile.copy()
    

This last example can be very inefficient, as it reads the entire CDF. Normally it’s better to treat the CDF as a dictionary and access only the data needed, which will be pulled transparently from disk. See Var for more subtle examples.
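As a minimal sketch of reading only what is needed, the helper below pulls just the requested variables from an open CDF and skips names that are absent. The name read_selected is hypothetical and not part of pycdf; the sketch relies only on the dictionary-style access and the [...] whole-variable slice shown in this section.

```python
def read_selected(cdf, names):
    """Read only the named variables from an open CDF.

    cdf is an open pycdf.CDF (or any mapping of name -> Var-like object);
    names is an iterable of variable names. Missing names are skipped.
    """
    # 'name in cdf' tests membership; cdf[name][...] reads that variable's data.
    return {name: cdf[name][...] for name in names if name in cdf}


# Usage (assumes spacepy is installed and the file exists):
# from spacepy import pycdf
# with pycdf.CDF('cdf_filename.cdf') as cdffile:
#     data = read_selected(cdffile, ['Epoch', 'B_GSE'])
```

Only the listed variables are read from disk; everything else in the file is left untouched.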

Potentially useful dictionary methods and related functions:

The CDF user’s guide section 2.2 has more background information on CDF files.

The attrs Python attribute acts as a dictionary referencing CDF attributes (do not confuse the two); all the dictionary methods above also work on the attribute dictionary. See gAttrList for more on the dictionary of global attributes.

Creating a new CDF from a master (skeleton) CDF has similar syntax to opening one:

>>> cdffile = pycdf.CDF('cdf_filename.cdf', 'master_cdf_filename.cdf')

This creates and opens cdf_filename.cdf as a copy of master_cdf_filename.cdf.

Using a skeleton CDF is recommended over making a CDF entirely from scratch, but this is possible by specifying a blank master:

>>> cdffile = pycdf.CDF('cdf_filename.cdf', '')

When CDFs are created in this way, they are opened read-write; see readonly() to change.

By default, new CDFs (without a master) are created in version 3 format. To create a version 2 (backward-compatible) CDF, use Library.set_backward():

>>> pycdf.lib.set_backward(True)
>>> cdffile = pycdf.CDF('cdf_filename.cdf', '')

Add variables by direct assignment, which will automatically set type and dimension based on the data provided:

>>> cdffile['new_variable_name'] = [1, 2, 3, 4]

or, if more control is needed over the type and dimensions, use new().

Although it is supported to assign Var objects to Python variables for convenience, there are some minor pitfalls that can arise when changing a CDF that will not affect most users. This is only a concern when assigning a zVar object to a Python variable, changing the CDF through some other variable, and then trying to use the zVar object via the originally assigned variable.

Deleting a variable:

>>> var = cdffile['Var1']
>>> del cdffile['Var1']
>>> var[0] #fail, no such variable

Renaming a variable:

>>> var = cdffile['Var1']
>>> cdffile['Var1'].rename('Var2')
>>> var[0] #fail, no such variable

Renaming via the same variable works:

>>> var = cdffile['Var1']
>>> var.rename('Var2')
>>> var[0] #succeeds, aware of new name

Deleting a variable and then creating another variable with the same name may lead to some surprises:

>>> var = cdffile['Var1']
>>> var[...] = [1, 2, 3, 4]
>>> del cdffile['Var1']
>>> cdffile.new('Var1', data=[5, 6, 7, 8])
>>> var[...]
[5, 6, 7, 8]

attr_num(attrname)

Get the attribute number and scope by attribute name

attrs

Global attributes for this CDF in a dict-like format.

add_attr_to_cache(attrname, num, scope)

Add an attribute to the name-to-number cache

add_to_cache(varname, num)

Add a variable to the name-to-number cache

backward

True if this CDF was created in backward-compatible mode.

checksum([new_val])

Set or check the checksum status of this CDF.

clear_attr_from_cache(attrname)

Mark an attribute deleted in the name-to-number cache

clear_from_cache(varname)

Mark a variable deleted in the name-to-number cache

clone(zVar[, name, data])

Clone a zVariable (from another CDF or this) into this CDF

close()

Closes the CDF file

col_major([new_col])

Finds the majority of this CDF file

compress([comptype, param])

Set or check the compression of this CDF

copy()

Make a copy of all data and attributes in this CDF

from_data(filename, sd)

Create a new CDF file from a SpaceData object or similar

new(name[, data, type, recVary, dimVarys, ...])

Create a new zVariable in this CDF

raw_var(name)

Get a "raw" Var object.

readonly([ro])

Set or check the readonly status of this CDF

save()

Saves the CDF file but leaves it open.

var_num(varname)

Get the variable number of a particular variable name

version()

Get version of library that created this CDF

attrs

Global attributes for this CDF in a dict-like format. See gAttrList for details.

backward

True if this CDF was created in backward-compatible mode (for opening with CDF library before 3.x)

add_to_cache(varname, num)

Add a variable to the name-to-number cache

This maintains a cache of name-to-number mappings for zVariables to keep from having to query the CDF library constantly. It’s mostly an internal function.

Parameters:
varname : bytes

name of the zVariable. Note this is NOT a string in Python 3!

num : int

number of the variable

add_attr_to_cache(attrname, num, scope)

Add an attribute to the name-to-number cache

This maintains a cache of name-to-number mappings for attributes to keep from having to query the CDF library constantly. It’s mostly an internal function.

Parameters:
attrname : bytes

name of the attribute. Note this is NOT a string in Python 3!

num : int

number of the attribute

scope : bool

True if global scope; False if variable scope.

attr_num(attrname)

Get the attribute number and scope by attribute name

This maintains a cache of name-to-number mappings for attributes to keep from having to query the CDF library constantly. It’s mostly an internal function.

Parameters:
attrname : bytes

name of the attribute. Note this is NOT a string in Python 3!

Returns:
out : tuple

attribute number, scope (True for global) of this attribute

Raises:
CDFError

if attribute is not found

checksum(new_val=None)

Set or check the checksum status of this CDF. If checksums are enabled, the checksum will be verified every time the file is opened.

Returns:
out : boolean

True if the checksum is enabled or False if disabled

Other Parameters:
new_val : boolean

True to enable checksum, False to disable, or leave out to simply check.

clear_from_cache(varname)

Mark a variable deleted in the name-to-number cache

Will remove a variable, and all variables with higher numbers, from the variable cache.

Does NOT delete the variable!

This maintains a cache of name-to-number mappings for zVariables to keep from having to query the CDF library constantly. It’s mostly an internal function.

Parameters:
varname : bytes

name of the zVariable. Note this is NOT a string in Python 3!
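The cache behaviour described for add_to_cache(), var_num() and clear_from_cache() can be illustrated with a small stand-alone sketch. This is not pycdf's implementation: the class name NameNumberCache and its methods are hypothetical, and the sketch only mirrors the documented rule that clearing a name also drops every entry with a higher number.

```python
class NameNumberCache:
    """Illustrative name-to-number cache (not pycdf's actual code)."""

    def __init__(self):
        self._numbers = {}  # maps name (bytes) -> number (int)

    def add(self, name, num):
        """Remember that `name` has number `num`."""
        self._numbers[name] = num

    def lookup(self, name):
        """Return the cached number, or None if a library query would be needed."""
        return self._numbers.get(name)

    def clear(self, name):
        """Drop `name` and every entry with a higher number from the cache."""
        num = self._numbers.get(name)
        if num is None:
            return
        self._numbers = {k: v for k, v in self._numbers.items() if v < num}


cache = NameNumberCache()
for number, varname in enumerate([b'Epoch', b'B_GSE', b'Density']):
    cache.add(varname, number)
cache.clear(b'B_GSE')  # forgets B_GSE (1) and Density (2); Epoch (0) remains
```

As in the documented method, clearing an entry here only forgets the mapping; nothing is deleted from any file.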

clear_attr_from_cache(attrname)

Mark an attribute deleted in the name-to-number cache

Will remove an attribute, and all attributes with higher numbers, from the attribute cache.

Does NOT delete the attribute!

This maintains a cache of name-to-number mappings for attributes to keep from having to query the CDF library constantly. It’s mostly an internal function.

Parameters:
attrname : bytes

name of the attribute. Note this is NOT a string in Python 3!

clone(zVar, name=None, data=True)

Clone a zVariable (from another CDF or this) into this CDF

Parameters:
zVar : Var

variable to clone

Returns:
out : Var

The newly-created zVar in this CDF

Other Parameters:
name : str

Name of the new variable (default: name of the original)

data : boolean (optional)

Copy data, or only type, dimensions, variance, attributes? (default: True, copy data as well)
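A minimal sketch of using clone() to copy variable definitions from one open CDF into another without their data. The helper name clone_structure is hypothetical; it uses only the clone(zVar, name=None, data=True) signature and the dictionary-style access documented here.

```python
def clone_structure(src, dst, names=None):
    """Clone variables from src into dst, copying no data.

    src and dst are open CDF objects (dst must be writable); names defaults
    to every variable in src. Returns the names of the cloned variables.
    """
    cloned = []
    for name in (list(src) if names is None else names):
        # data=False copies type, dimensions, variance and attributes only.
        dst.clone(src[name], data=False)
        cloned.append(name)
    return cloned
```

This could be used, for example, to build an empty file with the same layout as an existing one before filling it with new records.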

close()

Closes the CDF file

Although called on object destruction (__del__()), to ensure all data are saved, the user should explicitly call close() or save().

Raises:
CDFError

if CDF library reports an error

Warns:
CDFWarning

if CDF library reports a warning

col_major(new_col=None)

Finds the majority of this CDF file

Returns:
out : boolean

True if column-major, False if row-major

Other Parameters:
new_col : boolean

Specify True to change to column-major, False to change to row-major, or do not specify to check the majority rather than changing it. (default is check only)

compress(comptype=None, param=None)

Set or check the compression of this CDF

Sets compression on entire file, not per-variable.

See section 2.6 of the CDF user’s guide for more information on compression.

Returns:
out : tuple

(comptype, param) currently in effect

Other Parameters:
comptype : ctypes.c_long

type of compression to change to, see CDF C reference manual section 4.10. Constants for this parameter are in const. If not specified, will not change compression.

param : ctypes.c_long

Compression parameter, see CDF CRM 4.10 and const. If not specified, will choose reasonable default (5 for gzip; other types have only one possible parameter.)

See also

Var.compress()

Examples

Set file cdffile to gzip compression, compression level 9:

>>> cdffile.compress(pycdf.const.GZIP_COMPRESSION, 9)

copy()

Make a copy of all data and attributes in this CDF

Returns:
out : CDFCopy

SpaceData-like object of all data

classmethod from_data(filename, sd)

Create a new CDF file from a SpaceData object or similar

The CDF named filename is created, opened, filled with the contents of sd (including attributes), and closed.

sd should be a dictionary-like object; each key will be made into a variable name. An attribute called attrs, if it exists, will be made into global attributes for the CDF.

Each value of sd should be array-like and will be used as the contents of the variable; an attribute called attrs, if it exists, will be made into attributes for that variable.

Parameters:
filename : string

name of the file to create

sd : spacepy.datamodel.SpaceData

data to put in the CDF. This structure cannot be nested, i.e., it must contain only dmarray and no SpaceData objects.
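Because from_data() does not accept nested structures, it can help to check the input first. The sketch below is an illustration under stated assumptions: nested_keys is a hypothetical helper, not part of pycdf, and it treats any value that is itself a mapping as nested.

```python
from collections.abc import Mapping


def nested_keys(sd):
    """Return the keys of sd whose values are themselves dictionary-like."""
    return [key for key, value in sd.items() if isinstance(value, Mapping)]


# Usage (assumes spacepy is installed; sd is a SpaceData of dmarray values):
# from spacepy import pycdf
# if not nested_keys(sd):
#     pycdf.CDF.from_data('cdf_filename.cdf', sd)
```

An empty result means every value is a plain array-like entry, which is the layout from_data() expects.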

new(name, data=None, type=None, recVary=None, dimVarys=None, dims=None, n_elements=None, compress=None, compress_param=None, sparse=None, pad=None)

Create a new zVariable in this CDF

Note

Either data or type must be specified. If type is not specified, it is guessed from data.

This creates a new variable. If using a “master CDF” with existing variables and no records, simply assign the new data to the variable, or the “whole variable” slice:

>>> cdf['ExistingVariable'] = data
>>> cdf['ExistingVariable'][...] = data

Parameters:
name : str

name of the new variable

Returns:
out : Var

the newly-created zVariable

Other Parameters:
data

data to store in the new variable. If this has an attrs attribute (e.g., dmarray), it will be used to populate attributes of the new variable. Similarly, the CDF type, record variance, etc. will, by default, be taken from data if it is a VarCopy. This can be overridden by specifying other keywords.

type : ctypes.c_long

CDF type of the variable, from const. See section 2.5 of the CDF user’s guide for more information on CDF data types.

recVary : boolean

record variance of the variable (default True)

dimVarys : list of boolean

dimension variance of each dimension, default True for all dimensions.

dims : list of int

size of each dimension of this variable, default zero-dimensional. Note this is the dimensionality as defined by CDF, i.e., for record-varying variables it excludes the leading record dimension. See Var.

n_elements : int

number of elements, should be 1 except for CDF_CHAR, for which it’s the length of the string.

compress : ctypes.c_long

Compression to apply to this variable, default None. See Var.compress().

compress_param : ctypes.c_long

Compression parameter if compression used; reasonable default is chosen. See Var.compress().

sparse : ctypes.c_long

New in version 0.2.3.

Sparse records type for this variable, default None (no sparse records). See Var.sparse().

pad

New in version 0.2.3.

Pad value for this variable, default None (do not set). See Var.pad().

Raises:
ValueError

if neither data nor sufficient typing information is provided.

Notes

Any given data may be representable by a range of CDF types; if the type is not specified, pycdf will guess which CDF types can represent this data. This breaks down to:

  1. If input data is a numpy array, match the type of that array

  2. Proper kind (numerical, string, time)

  3. Proper range (stores highest and lowest number provided)

  4. Sufficient resolution (EPOCH16 or TIME_TT2000 required if datetime has microseconds or below.)

If more than one value satisfies the requirements, types are returned in preferred order:

  1. Type that matches precision of data first, then

  2. integer type before float type, then

  3. Smallest type first, then

  4. signed type first, then

  5. specifically-named (CDF_BYTE) vs. generically named (CDF_INT1)

TIME_TT2000 is always the preferred time type if it is available. Otherwise, EPOCH16 is preferred over EPOCH if data specifies below the millisecond level (rule 1), but otherwise EPOCH is preferred (rule 2).

Changed in version 0.3.0: Before 0.3.0, EPOCH or EPOCH16 were used if not specified. Now TIME_TT2000 is always the preferred type.
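The time-type preference described above can be sketched as a small stand-alone function. This is not pycdf's code: preferred_time_type and its tt2000_available flag are hypothetical, and the return values are plain strings rather than constants from const.

```python
import datetime


def preferred_time_type(times, tt2000_available=True):
    """Pick a CDF time type name for datetimes, following the stated preference."""
    if tt2000_available:
        return 'CDF_TIME_TT2000'  # always preferred when available
    # Sub-millisecond detail needs the higher-resolution EPOCH16.
    if any(t.microsecond % 1000 for t in times):
        return 'CDF_EPOCH16'
    return 'CDF_EPOCH'


coarse = [datetime.datetime(2010, 1, 1, 0, 0, 0, 5000)]  # 5 ms, no finer detail
fine = [datetime.datetime(2010, 1, 1, 0, 0, 0, 5)]  # 5 microseconds
```

With tt2000_available=False, the coarse list maps to the EPOCH name and the fine list to the EPOCH16 name.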

For floats, four-byte is preferred unless eight-byte is required:

  1. absolute values between 0 and 3e-39

  2. absolute values greater than 1.7e38

This will switch to an eight-byte double in some cases where four bytes would be sufficient for IEEE 754 encoding, but where DEC formats would require eight.
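The float-size rule can be written out as a short check. This is a sketch of the rule as stated here, not pycdf's implementation: needs_eight_byte is a hypothetical name, the thresholds are copied from the text, and "between 0 and 3e-39" is assumed to exclude zero itself.

```python
def needs_eight_byte(values):
    """Return True if any value falls outside the range kept as a four-byte float."""
    for value in values:
        magnitude = abs(value)
        # Very small nonzero or very large magnitudes call for a double.
        if 0 < magnitude < 3e-39 or magnitude > 1.7e38:
            return True
    return False
```

For example, a list containing 1e-40 or -2e38 would call for an eight-byte type under this rule, while ordinary values such as 1.0 or 0.0 would not.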

raw_var(name)

Get a “raw” Var object.

Normally a Var will perform translation of values for certain types (to/from Unicode for CHAR variables on Py3k, and to/from datetime for all time types). A “raw” object does not perform this translation, on read or write.

This does not affect the data on disk, and in fact it is possible to maintain multiple Python objects with access to the same zVariable.

Parameters:
name : str

name or number of the zVariable

readonly(ro=None)

Set or check the readonly status of this CDF

If the CDF has been changed since opening, setting readonly mode will have no effect.

Note

Before version 3.8.1 of the NASA CDF library, closing a CDF that has been opened readonly, or setting readonly False, may take a substantial amount of time if there are many variables in the CDF, as a (potentially large) cache needs to be cleared. If upgrading to a newer CDF library is not possible, specifying readonly=False when opening the file is an option. However, this may make some reading operations slower.
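A minimal sketch of the two ways mentioned here to get a writable CDF. The helper name ensure_writable is hypothetical; it relies only on readonly() returning the current status when called without arguments and accepting False to switch to read/write.

```python
def ensure_writable(cdf):
    """Switch an open CDF to read/write if it is currently read-only.

    Returns True if the mode was changed, False if it was already writable.
    """
    if cdf.readonly():  # no argument: check only
        cdf.readonly(False)  # may be slow on CDF libraries before 3.8.1
        return True
    return False


# Alternative (assumes spacepy is installed): open read/write from the start.
# from spacepy import pycdf
# cdffile = pycdf.CDF('cdf_filename.cdf', readonly=False)
```

Opening with readonly=False avoids the later mode switch, at the possible cost of slower reads noted above.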

Returns:
out : Boolean

True if CDF is read-only, else False

Other Parameters:
ro : Boolean

True to set the CDF readonly, False to set it read/write, or leave out to check only.

Raises:
CDFError

if bad mode is set

save()

Saves the CDF file but leaves it open.

If closing the CDF, close() is sufficient; there is no need to call save() before close().

Note

Relies on an undocumented call of the CDF C library, which is also used in the Java interface.

Raises:
CDFError

if CDF library reports an error

Warns:
CDFWarning

if CDF library reports a warning

var_num(varname)

Get the variable number of a particular variable name

This maintains a cache of name-to-number mappings for zVariables to keep from having to query the CDF library constantly. It’s mostly an internal function.

Parameters:
varname : bytes

name of the zVariable. Note this is NOT a string in Python 3!

Returns:
out : int

Variable number of this zVariable.

Raises:
CDFError

if variable is not found

version()

Get version of library that created this CDF

Returns:
out : tuple

version of CDF library, in form (version, release, increment)
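The returned tuple can be formatted or compared directly. The two helpers below are hypothetical conveniences, not part of pycdf, and assume the tuple elements are integers.

```python
def format_cdf_version(version):
    """Format a (version, release, increment) tuple as 'v.r.i'."""
    return '.'.join(str(part) for part in version)


def created_before(version, reference):
    """Return True if the creating library version is older than reference."""
    # Tuples compare element by element, so (3, 7, 1) < (3, 8, 1).
    return tuple(version) < tuple(reference)


# Usage with an open CDF named cdffile:
# print(format_cdf_version(cdffile.version()))
```

Note that version() describes the library that created the file, which may differ from the CDF library currently installed.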