dbprocessing.DButils.DButils

class dbprocessing.DButils.DButils(mission='Test', db_var=None, echo=False, engine=None)[source]

Utility routines for DBProcessing class

All of these may be user called but are meant to be internal routines for DBProcessing

Warning

It is strongly encouraged to make sure the database is closed before the program terminates, either by calling closeDB() or deleting instances of this object (with an explicit del or by allowing it to go out of scope.) If this object still exists at interpreter exit, it will attempt to close the database, but the functionality to do so may have already been torn down. See for example Python issue 39513.

__init__(mission='Test', db_var=None, echo=False, engine=None)[source]

Initialize the DButils class

Parameters
missionstr

Name of the mission. This may be the name of a .sqlite file or the name of a Postgresql database; see Specifying a database for Postgresql support (implemented by postgresql_url()).

echobool, default False

if True, the Engine will log all statements as well as a repr() of their parameter lists to the logger

enginestr, optional

DB engine to connect to (e.g. sqlite, postgresql). Defaults to sqlite if mission is an existing file, else postgresql.

Other Parameters
db_var

Does nothing
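The default choice of engine described above can be pictured as a simple file test. This is a sketch of the documented rule only; the function name `guess_engine` is hypothetical, and the use of `os.path.isfile` (rather than some other existence check) is an assumption.

```python
import os.path


def guess_engine(mission):
    """Return 'sqlite' if mission names an existing file, else 'postgresql'.

    Hypothetical helper illustrating the documented default only.
    """
    return 'sqlite' if os.path.isfile(mission) else 'postgresql'
```

Passing engine explicitly avoids relying on this guess, for example when a sqlite file has not been created yet.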

Methods

ProcessqueueClean([dryrun])

Keep only latest version of each file in the process queue.

ProcessqueueFlush()

remove everything from the process queue

ProcessqueueGet([index, instance])

Get the file at the head of the queue (from the left)

ProcessqueueGetAll([version_bump])

Return the entire contents of the process queue

ProcessqueueLen()

Return the number of files in the process queue

ProcessqueuePop([index])

pop a file off the process queue (from the left)

ProcessqueuePush(fileid[, version_bump, MAX_ADD])

Push a file onto the process queue (onto the right)

ProcessqueueRawadd(fileid[, version_bump, ...])

raw add file ids to the process queue

ProcessqueueRemove(item[, commit])

remove a file from the queue by name or number

addCode(filename, relative_path, ...[, ...])

Add an executable code to the DB

addFile([filename, data_level, version, ...])

Add a datafile to the database.

addFilecodelink(resulting_file_id, source_code)

Add a file code link to the database

addFilefilelink(resulting_file_id, source_file)

Add a file file link to the database

addInspector(filename, relative_path, ...[, ...])

Add an inspector to the DB.

addInstrument(instrument_name, satellite_id)

Add an instrument to the database

addInstrumentproductlink(instrument_id, ...)

Add an instrument product link to the database

addLogging(currently_processing, ...[, pid, ...])

Add an entry to the logging table

addMission(mission_name, rootdir, incoming_dir)

Add a mission to the database

addProcess(process_name, output_product, ...)

Add a process to the database

addProduct(product_name, instrument_id, ...)

Add a product to the database

addRelease(filename, release[, commit])

Given a filename or file_id add an entry to the release table

addSatellite(satellite_name, mission_id)

Add a satellite to the database

addUnixTimeTable()

Add a table containing a file's Unix start/stop time.

addproductprocesslink(input_product_id, ...)

Add a product process link to the database

checkDiskForFile(file_id[, fix])

Check if the file existence on disk matches database record

checkFileSHA(file_id)

Given a file id or name check the db checksum and the file checksum

checkFiles([limit])

Check files in the DB, return inconsistent files and why

checkIncoming([glb])

Check the incoming directory for the current mission

closeDB()

Close the database connection

codeIsActive(ec_id, date)

Determine if a code is active and newest version.

commitDB()

Do the commit to the DB

currentlyProcessing()

Checks the db to see if it is currently processing

delFilecodelink(f[, commit])

Remove entries from Filecodelink for a given file

delFilefilelink(f[, commit])

Remove entries from Filefilelink

delInspector(i)

Removes an inspector from the db

delProduct(pp)

Removes a product from the db

delProductProcessLink(ll)

Removes a product process link from the db

editTable(table, my_id, column[, my_str, ...])

Apply string editing operations on a single row, column of a table

fileIsNewest(filename[, debug])

Query the database: is this filename or file_id the newest version?

file_id_Clean(invals)

Given a list of file IDs return only newest versions of matching files.

getActiveInspectors()

Query the db and returns all active inspectors

getAllCodes([active])

Return a list of all codes

getAllCodesFromProcess(proc_id)

Given a process id return the code ids that perform that process

getAllFileIds([fullPath, startDate, ...])

Return all the file ids in the database. All parameters are optional; if not specified, default is "all".

getAllFilenames([fullPath, startDate, ...])

Return all the file names in the database

getAllInstruments()

Return dictionaries of instrument traceback dictionaries

getAllProcesses([timebase])

Get all processes

getAllProducts([id_only])

Return a list of all products as instances

getAllSatellites()

Return dictionaries of satellite, mission objects

getChildTree(inprod)

Given an input product return a list of its output product ids

getChildrenProcesses(file_id)

Given a file, return all the processes that use this as input

getCodeDirectory()

Return the code directory for the current mission

getCodeFromProcess(proc_id, utc_file_date)

Given a process id return the code id that performs that process on a particular date.

getCodeID(codename)

Return the codeID for a code's filename.

getCodePath(code_id)

Given a code_id return the full name (path and all) of the code

getCodeVersion(code_id)

Given a code_id get the code version

getDirectory(column[, default])

Look up directory for the specified column.

getEntry(table, args)

Return entry instance from any table in DB

getErrorPath()

Return the error directory for the current mission

getFileDates(file_id)

Given a file_id or name return the dates it spans

getFileFullPath(filename)

Return the full path to a file given the name or id

getFileID(filename)

Return the fileID for the input filename

getFileParents(file_id[, id_only])

Given a file_id (or filename) return the files that went into making it

getFileVersion(fileid)

Return the version instance for a file

getFilecodelink_bycode(code_id)

Given a code_id return the file_id of all files it created

getFilecodelink_byfile(file_id)

Given a file_id return the code_id that created it, or None

getFiles([startDate, endDate, level, ...])

Query database for file records, with filters.

getFilesByCode(code_id[, newest_version, ...])

Given a code_id (or name) return the files that were created using it

getFilesByDate(daterange[, newest_version])

Return files in the db with utc_file_date in the range specified

getFilesByInstrument(inst_id[, level, ...])

Given an instrument_id return all the file instances associated with it

getFilesByProduct(prod_id[, newest_version])

Given a product_id or name return all the files associated with it

getFilesByProductDate(product_id, daterange)

Return the files by product id with utc_file_date in range specified

getFilesByProductTime(product_id, daterange)

Return the files in the db by product id with any data in date range

getIncomingPath()

Return the incoming directory for the current mission

getInputProductID(process_id[, range])

Return the input products for a particular process.

getInspectorDirectory()

Return the inspector directory for the current mission

getInstrumentID(name[, satellite_id])

Return the instrument_id for a given instrument.

getMissionDirectory()

Return the root directory for the current mission

getMissionID(mission_name)

Given a mission name return its ID

getMissions()

Return a list of all the missions

getProcessFromInputProduct(product)

Given a product id return all the processes that use that as an input

getProcessFromOutputProduct(outProd)

Gets process from the db that have the output product

getProcessID(proc_name)

Given a process name return its id

getProcessTimebase(process_id)

Return the timebase for a process

getProductID(product_name)

Return the product ID for an input product name

getProductParentTree()

go through the db and return a tree of all products and their parents

getProductsByInstrument(inst_id)

Get all the products for a given instrument

getProductsByLevel(level)

Get all the products for a given level

getRunProcess()

Return a list of the processes whose output_timebase is "RUN"

getSatelliteID(sat_name)

Returns the satellite ID for an input satellite name

getSatelliteMission(sat_name)

Given a satellite or satellite id return the mission

getTraceback(table, in_id[, in_id2])

Master routine for all the getXXXTraceback functions

list_release(rel_num[, fullpath])

Given a release number return all the filenames with the release

openDB(engine[, db_var, verbose, echo])

Setup python to talk to the database

purgeProcess(proc[, commit])

Remove process and productprocesslink

renameFile(filename, newname)

Rename a file in the db

resetProcessingFlag(comment)

Query the db and reset a processing flag

startLogging()

Add an entry to the logging table in the DB, logging

stopLogging(comment)

Finish the entry to the processing table in the DB, logging

tag_release(rel_num)

Tag all the newest versions of files to a release number (integer)

updateCodeNewestVersion(code_id[, is_newest])

Update a code to indicate whether it's the newest version.

updateInspectorSubs(insp_id)

Update an existing inspector performing the {} replacements

updateProcessSubs(proc_id)

Update an existing process performing the {} replacements

updateProductSubs(product_id)

Update an existing product performing the {} replacements

ProcessqueueClean(dryrun=False)[source]

Keep only latest version of each file in the process queue.

This is determined by product_id and utc_file_date. Also sorts queue by level, date

Parameters
dryrunbool, default False

Do not actually make changes to the queue.
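The cleaning rule for ProcessqueueClean (keep the newest version for each product_id and utc_file_date, then sort by level and date) can be sketched on plain Python data. This is an illustration under stated assumptions, not the library's implementation: the record type `QueuedFile`, the function `clean_queue`, and the use of tuples as versions are all hypothetical.

```python
import collections
import datetime

# Hypothetical record type; the real queue holds file_ids backed by the file table.
QueuedFile = collections.namedtuple(
    'QueuedFile',
    ['file_id', 'product_id', 'utc_file_date', 'version', 'data_level'])


def clean_queue(queue):
    """Keep the highest version per (product_id, utc_file_date); sort by level, date."""
    newest = {}
    for f in queue:
        key = (f.product_id, f.utc_file_date)
        if key not in newest or f.version > newest[key].version:
            newest[key] = f
    return sorted(newest.values(),
                  key=lambda f: (f.data_level, f.utc_file_date))


queue = [
    QueuedFile(1, 10, datetime.date(2020, 1, 2), (1, 0, 0), 1.0),
    QueuedFile(2, 10, datetime.date(2020, 1, 2), (1, 1, 0), 1.0),
    QueuedFile(3, 11, datetime.date(2020, 1, 1), (1, 0, 0), 0.0),
]
cleaned = [f.file_id for f in clean_queue(queue)]  # [3, 2]
```

File 1 is dropped because file 2 is a newer version of the same product and date; file 3 sorts first because it has the lower level.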

ProcessqueueFlush()[source]

remove everything from the process queue

This is as optimized as it can be

ProcessqueueGet(index=0, instance=False)[source]

Get the file at the head of the queue (from the left)

Returns
file_idint

the file_id of the file retrieved from the queue

ProcessqueueGetAll(version_bump=False)[source]

Return the entire contents of the process queue

Parameters
version_bumpbool, default False

Include the version bump information

Returns
list

All file_id in the process queue, optionally the version_bump information as well.

ProcessqueueLen()[source]

Return the number of files in the process queue

Returns
int

Count of files in the queue

ProcessqueuePop(index=0)[source]

pop a file off the process queue (from the left)

Returns
file_idint

the file_id of the file popped from the queue

Other Parameters
indexint

the index in the queue to pop

ProcessqueuePush(fileid, version_bump=None, MAX_ADD=150)[source]

Push a file onto the process queue (onto the right)

Parameters
fileidint or str

the file_id or filename to put on the process queue.

Returns
file_idint

file_id of the file placed on queue, as grabbed from the db.

ProcessqueueRawadd(fileid, version_bump=None, commit=True)[source]

raw add file ids to the process queue

Warning

This might break things if an id is added that does not exist; it is meant to be fast and to be used after getting the ids. It is safe against adding ids that are already in the queue.

Parameters
fileidint or Iterable

the file_id or sequence of file ids to add

Returns
numint

the number of entries added to the processqueue
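The behavior described for ProcessqueueRawadd (accept one id or many, skip ids already queued, report how many were added) can be sketched with a plain list standing in for the queue. The function `raw_add` is hypothetical and does not touch a database; like the real method, it makes no check that the ids exist.

```python
def raw_add(queue, fileid):
    """Append ids not already queued; return how many were added.

    Does not check that the ids exist in the database.
    """
    try:
        ids = list(fileid)
    except TypeError:  # a single int rather than an iterable
        ids = [fileid]
    present = set(queue)
    added = 0
    for i in ids:
        if i not in present:
            queue.append(i)
            present.add(i)
            added += 1
    return added


queue = [1, 2]
n_added = raw_add(queue, [2, 3, 3, 4])  # adds 3 and 4 only
```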

ProcessqueueRemove(item, commit=True)[source]

remove a file from the queue by name or number

Parameters
itemint or str

filename or file_id of file to remove from the process queue.

commitbool, default True

Commit changes to the database when done.

addCode(filename, relative_path, code_start_date, code_stop_date, code_description, process_id, version, active_code, date_written, output_interface_version, newest_version, arguments=None, cpu=1, ram=1)[source]

Add an executable code to the DB

Creates a record in code table.

Parameters
filenamestr

the filename of the code.

relative_pathstr

the relative path (relative to mission code directory).

code_start_datedatetime

start of validity of the code.

code_stop_datedatetime

end of validity of the code.

code_descriptionstr

description of the code (50 char).

process_idint

process_id of the process this code implements.

versionVersion or str

Version of the code.

active_codebool

if the code is active.

date_writtendatetime

date the code was written.

output_interface_versionint

Interface version of files produced by the code.

newest_versionbool

Is the code the newest version.

argumentsstr, optional

Additional command line arguments to pass to the code, default none (no extra arguments).

cpuint, default 1

Relative CPU usage of code (usually in terms of threads).

ramfloat, default 1

Relative memory usage of code.

Returns
code_idint

code_id of newly created record.

addFile(filename=None, data_level=None, version=None, file_create_date=None, exists_on_disk=None, utc_file_date=None, utc_start_time=None, utc_stop_time=None, check_date=None, verbose_provenance=None, quality_comment=None, caveats=None, met_start_time=None, met_stop_time=None, product_id=None, shasum=None, process_keywords=None, quality_checked=None)[source]

Add a datafile to the database.

Adds record to file.

Parameters
filenamestr

Filename to add.

data_levelfloat

The data level of the file.

versionVersion

The version of the file to create.

file_create_datedatetime

Date and time the file was created.

exists_on_diskbool

Does the file exist on disk.

product_idint

product_id of the product the file belongs to.

utc_file_datedate

The UTC date of the file.

utc_start_timedatetime

UTC of first timestamp in file.

utc_stop_timedatetime

UTC of last timestamp in file.

check_datedatetime

The date the file was quality checked.

verbose_provenancestr

Verbose provenance of the file.

quality_commentstr

Comment on quality from quality check.

caveatsstr

Caveats on use of file.

met_start_timeint

MET of first timestamp in file.

met_stop_timeint

MET of last timestamp in file.

Returns
int

file_id of the newly inserted file record.

Notes

All arguments are technically optional, but the insertion to the database may fail if an argument is not provided for a column which requires a non-NULL value. See file.

addFilecodelink(resulting_file_id, source_code)

Add a file code link to the database

Connects file to code that made it via filecodelink.

Parameters
resulting_file_idint

file_id of the created file

source_codeint

code_id of the code that created the file

addFilefilelink(resulting_file_id, source_file)

Add a file file link to the database

Links a file to one of its input files via filefilelink.

Parameters
resulting_file_idint

file_id of the output file.

source_fileint

file_id of the input file.

addInspector(filename, relative_path, description, version, active_code, date_written, output_interface_version, newest_version, product, arguments=None)[source]

Add an inspector to the DB.

Creates a record in inspector table.

Parameters
filenamestr

the filename of the inspector

relative_pathstr

the relative path (relative to mission inspector directory).

descriptionstr

description of the inspector (50 char).

versionVersion or str

Version of the code.

active_codebool

if the inspector is active.

date_writtendatetime

date the inspector was written.

output_interface_versionint

Written to database, but not used.

newest_versionbool

Is the inspector the newest version.

productint

product_id of the product this inspector identifies.

argumentsstr, optional

Additional keywords to pass to the inspect() method, default none (no extra arguments).

Returns
inspector_idint

inspector_id of newly created record.

addInstrument(instrument_name, satellite_id)[source]

Add an instrument to the database

Creates record in instrument.

Parameters
instrument_namestr

The name of the instrument (instrument_name).

satellite_idint

satellite_id of the satellite associated with the instrument.

addInstrumentproductlink(instrument_id, product_id)

Add an instrument product link to the database

Links a product to its instrument via instrumentproductlink.

Parameters
instrument_idint

instrument_id of the instrument.

product_idint

product_id of the product.

addLogging(currently_processing, processing_start_time, mission_id, user, hostname, pid=None, processing_end_time=None, comment=None)[source]

Add an entry to the logging table

Parameters
currently_processingbool

is the db currently processing?

processing_start_timedatetime

the time the processing started

mission_idint

the mission_id the processing is for

userstr

the user doing the processing

hostnamestr

the hostname that initiated the processing

pidint, optional

the process id that did the processing, default null

processing_end_timedatetime, optional

the time the processing stopped, default null

commentstr

comment about the processing run

Returns
Logging

instance of the class for the logging table.

addMission(mission_name, rootdir, incoming_dir, codedir=None, inspectordir=None, errordir=None)[source]

Add a mission to the database

Optional directories which are not specified will be inserted into the database as nulls, and the default will be determined at runtime.

Parameters
mission_namestr

the name of the mission

rootdirstr

the root directory of the mission

incoming_dirstr

directory for incoming files

codedirstr, optional

directory containing codes; default, see getCodeDirectory()

inspectordirstr, optional

directory containing product inspectors; default, see getInspectorDirectory()

errordirstr, optional

directory to contain error files; default, see getErrorPath()

addProcess(process_name, output_product, output_timebase, extra_params=None, trigger=None)[source]

Add a process to the database

Parameters
process_namestr

the name of the process (process_name).

output_productint

the output product id (output_product).

output_timebasestr

Timebase to use for output files, options RUN, ORBIT, DAILY, WEEKLY, MONTHLY, YEARLY, FILE (output_timebase).

extra_paramsstr, optional

extra parameters to pass to the code (extra_params).

Other Parameters
trigger

Unused.

addProduct(product_name, instrument_id, relative_path, format, level, product_description)[source]

Add a product to the database

Adds record to product.

Parameters
product_namestr

the name of the product

instrument_idint

the instrument the product is from

relative_pathstr

relative path for the product

formatstr

the format of the product filenames

addRelease(filename, release, commit=False)[source]

Given a filename or file_id add an entry to the release table

Parameters
filenameint or str

filename or file_id of file to add to a release.

releaseint

Release number to add file to.

commitbool, default False

Commit changes to the database when done.

See also

Releases

addSatellite(satellite_name, mission_id)[source]

Add a satellite to the database

Parameters
satellite_namestr

the name of the satellite

mission_idint

mission.mission_id of mission to add to

Returns
int

satellite.satellite_id of newly-added satellite.

addUnixTimeTable()[source]

Add a table containing a file’s Unix start/stop time.

Used for migrating databases; doing file searches based on the Unix time is faster than the UTC timestamp. This will also populate the time columns from a file’s UTC start/stop time.

Raises
RuntimeError

If the Unix time table already exists

addproductprocesslink(input_product_id, process_id, optional, yesterday=0, tomorrow=0)

Add a product process link to the database

Connects input product to output via productprocesslink.

Parameters
input_product_idint

product_id of the input product.

process_idint

process.process_id of the process for which input_product_id is an input.

optionalbool

if the input product is optional (vs. required)

yesterdayint, default 0

How many extra days back do you need

tomorrowint, default 0

How many extra days forward do you need
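One reading of yesterday and tomorrow is that they widen the window of input dates needed to build one output date. The sketch below illustrates that reading only; the function `input_dates` is hypothetical, and the exact way the processing chain uses these columns is an assumption here.

```python
import datetime


def input_dates(utc_file_date, yesterday=0, tomorrow=0):
    """Dates of input files wanted for one output date, earliest first."""
    return [utc_file_date + datetime.timedelta(days=d)
            for d in range(-yesterday, tomorrow + 1)]


# One extra day back, none forward: the day before plus the day itself.
dates = input_dates(datetime.date(2020, 3, 1), yesterday=1, tomorrow=0)
```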

checkDiskForFile(file_id, fix=False)[source]

Check if the file existence on disk matches database record

Parameters
file_idint

file_id of the file to check

fixbool, default False

Set to have the DB fixed to match the file system. This is NOT guaranteed to be safe.

Returns
bool

True if consistent, False otherwise

checkFileSHA(file_id)[source]

Given a file id or name check the db checksum and the file checksum

Parameters
file_idint or str

filename or file_id of file to check

Returns
bool

If the calculated checksum of file on disk matches the checksum in the database.
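The comparison made by checkFileSHA can be sketched as hashing the file on disk and comparing against the stored value. This is an illustration only: the function `file_matches_checksum` is hypothetical, and the choice of SHA-1 is an assumption, since the text above does not name the algorithm.

```python
import hashlib


def file_matches_checksum(path, recorded_hexdigest, algorithm='sha1'):
    """True if the digest of the file at path equals the recorded hex digest.

    The choice of SHA-1 as the default is an assumption for illustration.
    """
    digest = hashlib.new(algorithm)
    with open(path, 'rb') as f:
        # Read in chunks so large data files need not fit in memory.
        for chunk in iter(lambda: f.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest() == recorded_hexdigest
```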

checkFiles(limit=None)[source]

Check files in the DB, return inconsistent files and why

Parameters
limitint

Maximum number of files to check, default all

Returns
list of tuple

All files with problems. Each element is (filename, result), where “result” is 1 for a bad checksum and 2 if file not found on disk.

checkIncoming(glb='*')[source]

Check the incoming directory for the current mission

Parameters
glbstr, optional

Glob pattern that files must match.

Returns
list of str

All files in the incoming directory

closeDB()[source]

Close the database connection

Examples

>>> pnl.closeDB()

codeIsActive(ec_id, date)[source]

Determine if a code is active and newest version.

Parameters
ec_idint or str

code_id or code_description of the code to check.

datedate

Check if code is valid for files on this date (corresponds to utc_file_date).

Returns
bool

If code is active, newest version, and date falls within the code’s valid date range.
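The three conditions in the return value of codeIsActive can be written as one predicate. This sketch takes the relevant columns as plain arguments rather than querying the database; the function name `code_is_active` is hypothetical, and treating both ends of the date range as inclusive is an assumption.

```python
import datetime


def code_is_active(active_code, newest_version, code_start_date,
                   code_stop_date, date):
    """True if the code is active, newest, and date is in its valid range."""
    return bool(active_code and newest_version
                and code_start_date <= date <= code_stop_date)


ok = code_is_active(True, True, datetime.date(2010, 1, 1),
                    datetime.date(2020, 1, 1), datetime.date(2015, 6, 1))
```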

commitDB()[source]

Do the commit to the DB

currentlyProcessing()[source]

Checks the db to see if it is currently processing

Used to ensure that two processing runs do not happen at the same time.

Returns
bool or int

False or the current process id

Examples

>>> pnl.currentlyProcessing()

delFilecodelink(f, commit=True)

Remove entries from Filecodelink for a given file

Remove record from filecodelink if the file was created by a code.

Parameters
fint or str

file_id or filename of file to unassociate with code.

commitbool, default True

Commit changes to the database when done.

delFilefilelink(f, commit=True)

Remove entries from Filefilelink

Remove record from filefilelink if the file is in either source_file or resulting_file.

Parameters
fint or str

file_id or filename of file to remove from link.

commitbool, default True

Commit changes to the database when done.

delInspector(i)[source]

Removes an inspector from the db

Parameters
iint

inspector.inspector_id of inspector to delete

delProduct(pp)[source]

Removes a product from the db

Parameters
ppint or str

product_id or product_name of product to remove.

delProductProcessLink(ll)

Removes a product process link from the db

Parameters
lllist

Two elements, process_id and input_product_id of record to remove from productprocesslink.

Notes

Untested!

editTable(table, my_id, column, my_str=None, after_flag=None, ins_after=None, ins_before=None, replace_str=None, combine=False)[source]

Apply string editing operations on a single row, column of a table

For a specified row and column of a table, update the value according to operations specified by the combination of the kwargs.

To replace all instances of a string with another, set replace_str to the string to replace and my_str to the new value to replace it with.

To append a string to all instances of a string, set ins_after to the existing string and my_str to the value to append.

To prepend a string to all instances of a string, set ins_before to the existing string and my_str to the value to prepend.

When operating on the arguments column of the code table, and after_flag is specified, all three of these operations will only apply to the “word” (whitespace-separated) after the “word” in after_flag. See examples.

When operating on the arguments column of the code table, combine may be set to True to combine every word that follows each instance of after_flag into a comma-separated list after a single instance of after_flag. See examples.

One and only one of ins_after, ins_before, replace_str and combine can be specified; there is no default operation. If ins_after, ins_before, or replace_str are specified, my_str must be.

Note

Written and tested for code table. Not thoroughly tested for others.
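The string operations described above can be sketched on plain strings, without a database. This is an illustration of the documented behavior, not the code editTable uses: the functions `edit_string`, `edit_after_flag` and `combine_flag` are hypothetical, and splitting and rejoining on whitespace (which collapses repeated spaces) is a simplification.

```python
def edit_string(value, my_str=None, ins_after=None, ins_before=None,
                replace_str=None):
    """Apply exactly one of the three string operations to value."""
    chosen = [x for x in (ins_after, ins_before, replace_str) if x is not None]
    if len(chosen) != 1 or my_str is None:
        raise ValueError('specify my_str and exactly one operation')
    if replace_str is not None:
        return value.replace(replace_str, my_str)
    if ins_after is not None:
        return value.replace(ins_after, ins_after + my_str)
    return value.replace(ins_before, my_str + ins_before)


def edit_after_flag(arguments, after_flag, **kwargs):
    """Apply edit_string only to the word that follows each after_flag."""
    words = arguments.split()
    for i in range(1, len(words)):
        if words[i - 1] == after_flag:
            words[i] = edit_string(words[i], **kwargs)
    return ' '.join(words)


def combine_flag(arguments, after_flag):
    """Merge the words following each after_flag into one comma-separated word."""
    words = arguments.split()
    values, rest, first = [], [], None
    i = 0
    while i < len(words):
        if words[i] == after_flag and i + 1 < len(words):
            if first is None:
                first = len(rest)  # where the single combined flag will go
            values.append(words[i + 1])
            i += 2
        else:
            rest.append(words[i])
            i += 1
    if first is not None:
        rest[first:first] = [after_flag, ','.join(values)]
    return ' '.join(rest)


replaced = edit_after_flag('-i foobar -j foobar -k foobar', '-j',
                           my_str='baz', replace_str='bar')
combined = combine_flag('-i foo -i bar -j baz', '-i')
appended = edit_string('scripts', my_str='2.0', ins_after='scripts')
```

With these inputs, `replaced`, `combined` and `appended` reproduce the strings shown in the examples for this method.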

Parameters
tablestr

Name of the table to edit.

my_idint

Record to edit; most commonly the numerical ID (primary key) but also supports string matching on other columns as provided by getEntry().

columnstr

name of column to edit

my_strstr, optional

String to add or replace. Required with ins_after, ins_before, replace_str.

after_flagstr, optional

Only replace string in words immediately following this word. Only supported in arguments column of code table. Default: replace in all.

ins_afterstr, optional

Value to insert my_str after. Conflicts with ins_before, replace_str, combine.

ins_beforestr, optional

Value to insert my_str before. Conflicts with ins_after, replace_str, combine.

replace_strstr, optional

Value to replace with my_str. Conflicts with ins_after, ins_before, combine.

combinebool, default False

If true, combine all instances of words after the word in after_flag. Conflicts with ins_after, ins_before, replace_str.

Raises
ValueError

for any invalid combination of arguments.

RuntimeError

if multiple rows match my_id.

Examples

All examples assume an open DButils instance in dbu and an existing code of ID 1. These examples use command line flags but the treatment of strings is general.

>>> # Replace a string after a flag
>>> code = dbu.getEntry('Code', 1)
>>> code.arguments = '-i foobar -j foobar -k foobar'
>>> dbu.editTable('code', 1, 'arguments', my_str='baz',
...               replace_str='bar', after_flag='-j')
>>> code.arguments
-i foobar -j foobaz -k foobar

>>> # Combine multiple instances of a flag into one
>>> code = dbu.getEntry('Code', 1)
>>> code.arguments = '-i foo -i bar -j baz'
>>> dbu.editTable('code', 1, 'arguments', after_flag='-i',
...               combine=True)
>>> code.arguments
-i foo,bar -j baz

>>> # Append a string to every instance
>>> code = dbu.getEntry('Code', 1)
>>> code.relative_path = 'scripts'
>>> dbu.editTable('code', 1, 'relative_path', ins_after='scripts',
...               my_str='2.0')
>>> code.relative_path
scripts2.0

fileIsNewest(filename, debug=False)[source]

Query the database: is this filename or file_id the newest version?

Parameters
filenameint or str

filename or file_id

Returns
bool

True if the file is the latest version, False if not

file_id_Clean(invals)[source]

Given a list of file IDs return only newest versions of matching files.

Matching is defined as same product_id and same utc_file_date.

Parameters
invalslist of int or of str

All file_id or filename to check.

Returns
list of int

Those file_id from invals which are the newest version of that file.

getActiveInspectors()[source]

Query the db and returns all active inspectors

Returns
list of tuple

For each active inspector, returns full filename (from relative_path and filename), description, arguments, and product.

getAllCodes(active=True)[source]

Return a list of all codes

Parameters
activebool, default True

Only return codes which are marked active_code and newest_version.

Returns
list

All codes

getAllCodesFromProcess(proc_id)[source]

Given a process id return the code ids that perform that process

Also returns the valid dates for each code

Parameters
proc_idint

process_id of process to look up.

Returns
list of tuple

For every active, newest version code that implements the process, code_id, code_start_date, and code_stop_date.

getAllFileIds(fullPath=True, startDate=None, endDate=None, level=None, product=None, code=None, instrument=None, exists=None, newest_version=False, limit=None)[source]

Return all the file ids in the database. All parameters are optional; if not specified, default is “all”.

Parameters
startDatedatetime, optional

First date to include, based on utc_file_date

endDatedatetime, optional

Last date to include (inclusive)

levelfloat, optional

Only include files of this level.

productint, optional

product_id of files to include

codeint, optional

Only return files created by code with ID of code_id

instrumentint, optional

Only return files with instrument instrument_id

existsbool, optional

Only return files that exist on disk, based on exists_on_disk.

newest_versionbool, default False

Only return files that are the newest version (of their product and date)

limitint

Limit number of results, default all

Returns
list of int

File ID of all files matching requirements.

Other Parameters
fullPathbool, default True

unused

getAllFilenames(fullPath=True, startDate=None, endDate=None, level=None, product=None, code=None, instrument=None, exists=None, newest_version=False, limit=None)[source]

Return all the file names in the database

All parameters are optional; if not specified, default is “all”.

Parameters
fullPathbool, default True

Return full path (if False, just filename)

startDatedatetime, optional

First date to include, based on utc_file_date

endDatedatetime, optional

Last date to include (inclusive)

levelfloat, optional

Only include files of this level.

productint, optional

product_id of files to include

codeint, optional

Only return files created by code with ID of code_id

instrumentint, optional

Only return files with instrument instrument_id

existsbool, optional

Only return files that exist on disk, based on exists_on_disk.

newest_versionbool, default False

Only return files that are the newest version (of their product and date)

limitint

Limit number of results, default all

Returns
list of str

Filename of all files matching requirements.

getAllInstruments()[source]

Return dictionaries of instrument traceback dictionaries

Returns
dict

dictionaries of instrument traceback dictionaries

getAllProcesses(timebase='all')[source]

Get all processes

Parameters
timebasestr, optional

Limit to processes with this output_timebase (default: all).

Returns
Query

process table records

getAllProducts(id_only=False)[source]

Return a list of all products as instances

Parameters
id_onlybool, default False

Return only the product_id, instead of the entire record.

Returns
Query or list of int

Complete product records for all products, or just product_id (if id_only).

getAllSatellites()[source]

Return dictionaries of satellite, mission objects

Returns
dict

dictionaries of satellite, mission objects

getChildTree(inprod)[source]

Given an input product return a list of its output product ids

Parameters
inprodint

product_id of the input product.

Returns
list of int

product_id of all products that can be made from inprod.

getChildrenProcesses(file_id)[source]

Given a file, return all the processes that use this as input

Parameters
file_idint or str

file_id or filename

Returns
list of int

process_id for all processes which can use the given file as input.

getCodeDirectory()[source]

Return the code directory for the current mission

Returns
str

Code directory for current mission (i.e. codedir, if defined).

See also

getDirectory()

getCodeFromProcess(proc_id, utc_file_date)[source]

Given a process id return the code id that performs that process on a particular date.

Parameters
proc_idint

process_id of process to look up.

utc_file_datedatetime

Date on which the code must be valid.

Returns
int

code_id for the active, newest version code that implements the process, and is valid on the given date. Returns None if there is no match.

Raises
DBError

If there is more than one matching code.

getCodeID(codename)[source]

Return the codeID for a code’s filename.

Parameters
codenamestr or int

filename or code_id of code to look up.

Returns
int

code_id of given code.

getCodePath(code_id)[source]

Given a code_id, return the full name (path and all) of the code

Parameters
code_idint or str

code_id or code_description of code to look up.

Returns
str

Full path to code.

getCodeVersion(code_id)[source]

Given a code_id, return the code version

Parameters
code_idint or str

code_id or code_description of code to look up.

Returns
Version

Version of the code.

getDirectory(column, default=None)[source]

Look up directory for the specified column.

The mission rootdir may be absolute or relative to current path. Directory requested may be in db as absolute or relative to mission root. Home dir references are expanded.

Parameters
columnstr

Name of column in mission to look up.

defaultstr, optional

Default to return if directory not found in mission table, default None.
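The path rules above (root directory absolute or relative to the current path, requested directory absolute or relative to the mission root, home directory references expanded) can be sketched with the standard library. The function name `resolve_directory` is hypothetical and this is not the method's actual implementation.

```python
import os.path


def resolve_directory(rootdir, directory):
    """Resolve a directory stored in the database against the mission root."""
    # Expand "~" and make the root absolute relative to the current path.
    rootdir = os.path.abspath(os.path.expanduser(rootdir))
    directory = os.path.expanduser(directory)
    # os.path.join discards rootdir when directory is already absolute,
    # so absolute entries pass through unchanged.
    return os.path.normpath(os.path.join(rootdir, directory))
```

For example, a relative entry such as `incoming` ends up under the mission root, while an absolute entry is returned as given (after normalization).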

getEntry(table, args)[source]

Return entry instance from any table in DB

Parameters
tablestr

Name of the table

argsint or str

Argument to identify entry. This is first tried as a primary key (integer or sequence of integers); if that fails, then assumed to be a name and used for a lookup via get[table]ID.

Returns
various types

Matching record from the table. If there is no primary key match and the table does not support name lookup, returns None.

Raises
DBNoData

if argument is not found as primary key and name lookup fails (but not if name lookup is not available).
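The lookup order described above (primary key first, then name, with different outcomes depending on whether name lookup exists) can be sketched with plain dictionaries. The names `get_entry`, `rows_by_pk`, `ids_by_name`, and `NoData` are hypothetical; the real method works against database tables and raises DBNoData.

```python
class NoData(Exception):
    """Stand-in for DBNoData in this sketch."""


def get_entry(rows_by_pk, ids_by_name, arg):
    """Look up arg as a primary key, then fall back to a name lookup.

    ids_by_name is None for a table that does not support name lookup.
    """
    if arg in rows_by_pk:
        return rows_by_pk[arg]
    if ids_by_name is None:
        # No primary key match and no name lookup available.
        return None
    if arg not in ids_by_name:
        # Name lookup is available but failed.
        raise NoData("no entry for {!r}".format(arg))
    return rows_by_pk[ids_by_name[arg]]


products = {1: "record for product 1", 2: "record for product 2"}
product_names = {"prodA": 1, "prodB": 2}
```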

getErrorPath()[source]

Return the error directory for the current mission

Returns
str

Error directory for current mission (i.e. errordir, if defined).

See also

getDirectory()
getFileDates(file_id)[source]

Given a file_id or name return the dates it spans

Parameters
file_idint or str

file_id or filename of file to look up.

Returns
list of datetime

First and last UTC timestamp of file.

getFileFullPath(filename)[source]

Return the full path to a file given the name or id

TODO: this is slow. The current query made it much faster, but it can probably be improved further.

Parameters
filenamestr or int

filename or file_id of file to look up.

Returns
str

Full path to the file.

getFileID(filename)[source]

Return the fileID for the input filename

Parameters
filenamestr or int

filename or file_id of file to look up.

Returns
int

file_id of input file.

getFileParents(file_id, id_only=False)[source]

Given a file_id (or filename) return the files that went into making it

Parameters
file_idint or str

file_id or filename of the file of interest.

id_onlybool, default False

Return only the file_id, instead of the entire record.

Returns
Query or list of int

Complete file records for all input files, or just file_id (if id_only).

getFileVersion(fileid)[source]

Return the version instance for a file

Parameters
fileidint or str

file_id or filename of the file of interest.

Returns
Version

Version of the file.

Given a code_id return the file_id of all files it created

Parameters
code_idint or str

code_id or code_description of the code to look up.

Returns
Query

file_id of all files created by the code.

Given a file_id return the code_id that created it, or None

Parameters
file_idint or str

file_id or filename of the file to look up.

Returns
int

code_id of the code that created the file.

getFiles(startDate=None, endDate=None, level=None, product=None, code=None, instrument=None, exists=None, newest_version=False, limit=None, startTime=None, endTime=None)[source]

Query database for file records, with filters.

All parameters are optional; if not specified, default is “all”.

Parameters
startDatedatetime, optional

First date to include, based on utc_file_date

endDatedatetime, optional

Last date to include (inclusive)

levelfloat, optional

Only include files of this level.

productint, optional

product_id of files to include

codeint, optional

Only return files created by code with ID of code_id

instrumentint, optional

Only return files with instrument instrument_id

existsbool, optional

Only return files that exist on disk, based on exists_on_disk.

newest_versionbool, default False

Only return files that are the newest version (of their product and date)

limitint, optional

Limit number of results, default all

startTimedatetime, optional

Include files containing timestamps at or after this time, utc_start_time

endTimedatetime, optional

Include files containing timestamps at or before this time, utc_stop_time

Returns
list

File records of all files matching requirements.

getFilesByCode(code_id, newest_version=False, id_only=False)[source]

Given a code_id (or name) return the files that were created using it

Parameters
code_idint or str

Only return files created by code with this code_id or code_description

newest_versionbool, default False

Only return files that are the newest version (of their product and date)

id_onlybool, default False

Only return file IDs, not complete file record.

Returns
list

File records of all files matching requirements.

getFilesByDate(daterange, newest_version=False)[source]

Return files in the db with utc_file_date in the range specified

Parameters
daterangelist of datetime

First and last date to include, based on utc_file_date.

newest_versionbool, default False

Only return files that are the newest version (of their product and date).

Returns
list

File records of all files matching requirements.
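The newest_version filter keeps one file per product and date. A minimal sketch of that idea on in-memory records follows; the `FileRec` tuple, its field names, and the representation of a version as a comparable tuple are assumptions for illustration, not the library's data model.

```python
import datetime
from collections import namedtuple

# Hypothetical file record; version is a tuple so that it compares in order.
FileRec = namedtuple("FileRec", "file_id product_id utc_file_date version")


def newest_only(files):
    """Keep the highest-version file for each (product, date) pair."""
    best = {}
    for f in files:
        key = (f.product_id, f.utc_file_date)
        if key not in best or f.version > best[key].version:
            best[key] = f
    return sorted(best.values(), key=lambda f: f.file_id)


day = datetime.date(2013, 1, 1)
files = [
    FileRec(1, 7, day, (1, 0, 0)),
    FileRec(2, 7, day, (1, 1, 0)),
    FileRec(3, 8, day, (1, 0, 0)),
]
```

Here files 1 and 2 share a product and date, so only file 2 (the higher version) survives alongside file 3.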

getFilesByInstrument(inst_id, level=None, newest_version=False, id_only=False)[source]

Given an instrument_id return all the file instances associated with it

Parameters
inst_idint or str

Only return files with this instrument_id or instrument_name

levelfloat, optional

Only include files of this level, default all.

newest_versionbool, default False

Only return files that are the newest version (of their product and date)

id_onlybool, default False

Only return file IDs, not complete file record.

Returns
list

File records of all files matching requirements.

getFilesByProduct(prod_id, newest_version=False)[source]

Given a product_id or name return all the files associated with it

Parameters
prod_idint or str

product_id or product_name of files to include.

newest_versionbool, default False

Only return files that are the newest version (of their product and date).

Returns
list

File records of all files matching requirements.

getFilesByProductDate(product_id, daterange, newest_version=False)[source]

Return the files by product id with utc_file_date in range specified

Parameters
product_idint

product_id of files to include.

daterangelist of datetime

First and last date to include, based on utc_file_date.

newest_versionbool, default False

Only return files that are the newest version (of their product and date).

Returns
list

File records of all files matching requirements.
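A small usage sketch, written against the signature documented above. The helper `newest_files_for_day` and the `FakeDButils` stand-in are hypothetical; the stand-in exists only so the example runs without a database, and a real DButils instance would be passed in its place.

```python
import datetime


def newest_files_for_day(dbu, product_id, day):
    """Return newest-version file records for one product on one day."""
    # daterange is documented as the first and last date to include,
    # so a single day is passed as both ends of the range.
    return dbu.getFilesByProductDate(
        product_id, [day, day], newest_version=True)


class FakeDButils:
    """Hypothetical stand-in that records the call instead of querying."""

    def getFilesByProductDate(self, product_id, daterange,
                              newest_version=False):
        self.last_call = (product_id, daterange, newest_version)
        return []
```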

getFilesByProductTime(product_id, daterange, newest_version=False)[source]

Return the files in the db by product id with any data in date range

A file with a UTC time range overlapping at all with daterange is considered a match, so a returned file may also include some times outside of the range.

Parameters
product_idint

product_id of files to include.

daterangelist of datetime

Range of times to include, based on utc_start_time and utc_stop_time.

newest_versionbool, default False

Only return files that are the newest version (of their product and date).

Returns
list

File records of all files matching requirements.
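The matching rule ("overlapping at all") is an interval overlap test. The sketch below shows the test in isolation; the function name `overlaps` is hypothetical, and treating the endpoints as inclusive is an assumption rather than something the text above states.

```python
import datetime


def overlaps(file_start, file_stop, range_start, range_stop):
    """True if the file's time span shares any instant with the range."""
    # Assumption: endpoints are inclusive, so touching intervals match.
    return file_start <= range_stop and file_stop >= range_start


t = datetime.datetime
# File covering all of 2 Jan 2013.
f_start, f_stop = t(2013, 1, 2, 0, 0), t(2013, 1, 2, 23, 59, 59)
```

A file covering 2 Jan matches a range from noon on 2 Jan to noon on 3 Jan, even though most of that range lies outside the file.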

getIncomingPath()[source]

Return the incoming directory for the current mission

Returns
str

Incoming directory for current mission (i.e. incoming_dir).

See also

getDirectory()
getInputProductID(process_id, range=False)[source]

Return the input products for a particular process.

Parameters
process_idint

process_id of process to look up.

rangebool, default False

Also return number of days in past/future to use as inputs.

Returns
list

Result of query: each element has input_product_id and optional; if range is True, each element also has yesterday and tomorrow.

getInspectorDirectory()[source]

Return the inspector directory for the current mission

Returns
str

Inspector directory for current mission (i.e. inspectordir, if defined).

See also

getDirectory()
getInstrumentID(name, satellite_id=None)[source]

Return the instrument_id for a given instrument.

Parameters
namestr or int

instrument_name or instrument_id.

satellite_idint or str

Only return results for satellite with this satellite_id or satellite_name.

Returns
int

instrument_id.

getMissionDirectory()[source]

Return the root directory for the current mission

Returns
str

Root directory for current mission (i.e. rootdir).

getMissionID(mission_name)[source]

Given a mission name return its ID

Parameters
mission_namestr

Name of mission, i.e. mission_name.

Returns
int

mission_id for the corresponding mission.

See also

Missions
getMissions()[source]

Return a list of all the missions

Returns
list of str

Names of all missions in the database.

Notes

Ordinarily there is only one mission per database.

getProcessFromInputProduct(product)[source]

Given a product id return all the processes that use that as an input

Use getProductID() first if you have a product name (or are not sure whether you have a name or an ID).

Parameters
productint

product_id of product.

Returns
list of int

process_id of all processes which use product as an input.

getProcessFromOutputProduct(outProd)[source]

Get the process from the db that has the given output product

Parameters
outProdint

product_id of product.

Returns
int

process_id of process which produces product as an output.

Notes

Assumes there is only one process that makes a product; this is common but not necessarily enforced.

getProcessID(proc_name)[source]

Given a process name return its id

Parameters
proc_namestr or int

process_name or process_id.

Returns
int

process_id.

getProcessTimebase(process_id)[source]

Return the timebase for a process

Parameters
process_idint or str

process_id or process_name of the desired process.

Returns
str

output_timebase for the process.

getProductID(product_name)[source]

Return the product ID for an input product name

Parameters
product_namestr

the name of the product to get the id of. Also supports a sequence of names, a single product ID (to confirm existence), or a sequence of product IDs.

Returns
product_idint

the product ID for the input product name

getProductParentTree()[source]

Go through the db and return a tree of all products and their parents

This makes it possible to write a script that runs all files not yet processed.

Returns
list

Each entry has two elements: a product_id and a list of product_id for all products that can be made from it.

See also

getChildTree()
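The return structure of getProductParentTree() (a list of two-element entries, each a product_id plus a list of product_id that can be made from it) can be walked to find everything downstream of a product. The helper `descendants` and the sample `tree` are hypothetical and only illustrate the shape of the data.

```python
def descendants(tree, product_id):
    """Return every product reachable from product_id, sorted by id."""
    children = {parent: list(kids) for parent, kids in tree}
    seen = []
    stack = list(children.get(product_id, []))
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue  # already visited via another path
        seen.append(pid)
        stack.extend(children.get(pid, []))
    return sorted(seen)


# Sample data in the documented shape: [product_id, [child product_ids]].
tree = [[1, [2, 3]], [2, [4]], [3, []], [4, []]]
```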
getProductsByInstrument(inst_id)[source]

Get all the products for a given instrument

Parameters
inst_idint or str

instrument_id or instrument_name for instrument

Returns
list of int

product_id for every product associated with this instrument.

getProductsByLevel(level)[source]

Get all the products for a given level

Parameters
levelfloat

Data level to look up

Returns
list of int

product_id for every product with level equal to level.

getRunProcess()[source]

Return a list of the processes whose output_timebase is “RUN”

Returns
list

Full process record for all RUN timebase processes.

getSatelliteID(sat_name)[source]

Returns the satellite ID for an input satellite name

Parameters
sat_namestr

the satellite_name to get the id of. Also supports a sequence of names, a single satellite ID (to confirm existence), or a sequence of satellite IDs.

Returns
satellite_idint

the satellite_id for the input satellite name

getSatelliteMission(sat_name)[source]

Given a satellite name or satellite_id, return the mission

Parameters
sat_nameint or str

satellite_id or satellite_name.

Returns
various

Complete record from mission table.

getTraceback(table, in_id, in_id2=None)[source]

Master routine for all the getXXXTraceback functions

The “traceback” is the set of records across tables that are relevant to one particular record.

This is implemented as several large select statements with joins; these are tested and work.

Parameters
tablestr

Name of the table to look up.

in_idint

ID, usually primary key on the table, for the record to look up.

Returns
dict

Keyed by table name, values are records from that table (instances of table types created by Table).

Other Parameters
in_id2

Not used.

Examples

>>> tb = dbu.getTraceback('File', 500)
>>> tb['product'].product_name
u'rbspb_int_ect-mageisM35-ns-L05'
list_release(rel_num, fullpath=True)[source]

Given a release number return all the filenames with the release

Parameters
rel_numint

Release number to list

fullpathbool, default True

Include full path to files (not just filenames)

Returns
list of str

All filenames in the release.

openDB(engine, db_var=None, verbose=False, echo=False)[source]

Set up Python to talk to the database

Parameters
enginestr

DB engine to connect to

verbosebool, default False

if True, will print out extra debugging

echobool, default False

if True, the Engine will log all statements as well as a repr() of their parameter lists to the logger

Other Parameters
db_var

Does nothing

purgeProcess(proc, commit=True)[source]

Remove process and productprocesslink

Removes a process record from the database and all productprocesslink records for that process.

Parameters
procint

process_id of process to delete.

commitbool, default True

Commit changes to the database when done.

renameFile(filename, newname)[source]

Rename a file in the db

Parameters
filenamestr or int

filename or file_id of file to rename.

newnamestr

New name to write to database.

Notes

Does not rename the file on disk. Operates on filename only (not entire path).

resetProcessingFlag(comment)[source]

Query the db and reset a processing flag

Parameters
commentstr

the comment to enter into the processing log DB

Returns
bool

True - Success, False - Failure

startLogging()[source]

Add an entry to the logging table in the DB, logging

stopLogging(comment)[source]

Finish the entry to the processing table in the DB, logging

Parameters
commentstr

a comment to insert into the DB

tag_release(rel_num)[source]

Tag all the newest versions of files to a release number (integer)

Parameters
rel_numint

Tag all “newest version” files as part of this release.

See also

Releases
updateCodeNewestVersion(code_id, is_newest=False)[source]

Update a code to indicate whether it’s the newest version.

Assumption is that the newest version of a code should be the only active one, so sets both newest_version and active_code fields in the database.

Parameters
code_idint or str

code_id or code_description for code to update.

is_newestbool, default False

Set newest_version and active_code (True), or not newest, and inactive (False).

updateInspectorSubs(insp_id)[source]

Update an existing inspector performing the {} replacements

Updates the database, replacing the generic {} references with the actual values for the inspector.

Parameters
insp_idint

inspector_id of inspector to update.

updateProcessSubs(proc_id)[source]

Update an existing process performing the {} replacements

Updates the database, replacing the generic {} references with the actual values for the process.

Parameters
proc_idint or str

process_id or process_name of process to update

updateProductSubs(product_id)[source]

Update an existing product performing the {} replacements

Updates the database, replacing the generic {} references with the actual values for the product.

Parameters
product_idint or str

product_id or product_name of product to update
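The {} replacement performed by the update...Subs methods can be pictured as filling named fields in a stored string. The sketch below is an illustration only: the helper `apply_subs` and the field names `{MISSION}`, `{LEVEL}`, and `{PRODUCT}` are assumptions chosen for the example, and the real methods write the result back to the database.

```python
def apply_subs(template, values):
    """Replace each {KEY} in template with its value; leave others as-is."""
    for key, value in values.items():
        template = template.replace("{" + key + "}", str(value))
    return template


# Hypothetical field names and values, for illustration only.
values = {"MISSION": "demo", "LEVEL": 1.0, "PRODUCT": "prodA"}
```

Plain string replacement is used here rather than `str.format` so that a field with no known value stays in the string instead of raising an error.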


Release: 0.1.0 Doc generation date: Feb 10, 2022