stompy.io — Reading and writing of various formats and data sources¶
Subpackages¶
Submodules¶
stompy.io.match_datasets module¶
Venturing into generic code to match two datasets.
Not remotely generic at this point, and makes some assumptions about dimensions, depth, time, etc.
stompy.io.qnc module¶
-
class
stompy.io.qnc.
QDataset
(*args, **kws)[source]¶ Bases:
netCDF4._netCDF4.Dataset
-
class
VarProxy
(dataset, varname)[source]¶ Bases:
object
represents a variable about to be defined, but waiting for dimension names and data, via setattr.
-
add_dimension
(dim_name, length)[source]¶ create dimension if it doesn’t exist, otherwise check that the length requested matches what does exist.
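The create-or-check behaviour described above can be sketched without netCDF at all. In this minimal illustration the dims dictionary stands in for a dataset's dimensions, and the name ensure_dimension is hypothetical; it is not stompy's implementation.

```python
def ensure_dimension(dims, dim_name, length):
    """Create dim_name in dims if missing; otherwise verify the length.

    dims is a plain dict mapping dimension name -> length. It is a
    stand-in for a netCDF dataset's dimensions (an assumption made
    for this sketch only).
    """
    if dim_name not in dims:
        dims[dim_name] = length
    elif dims[dim_name] != length:
        raise ValueError(
            "dimension %r has length %d, but %d was requested"
            % (dim_name, dims[dim_name], length)
        )
    return dims[dim_name]


dims = {}
ensure_dimension(dims, "time", 10)   # creates the dimension
ensure_dimension(dims, "time", 10)   # lengths match, so nothing changes
```

Requesting "time" again with a different length would raise ValueError in this sketch.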
-
alias
(**kwargs)[source]¶ Previously this just copied the variables; it now updates self.variables instead, which works even if not writing to the file.
-
copy
(skip=[], fn=None, **create_args)[source]¶ make a deep copy of self, into a writable, diskless QDataset. If fn is given, the target is a netCDF file on disk.
-
interpolate_dimension
(int_dim, int_var, new_coordinate, max_gap=None, gap_fields=None, int_mode='nearest')[source]¶ return a new dataset as a copy of this one, but with the given dimension interpolated according to varname=values
typically this would be done to a list of datasets, after which they could be appended.
it can also be used to collapse a ‘dependent’ coordinate into an independent coordinate - e.g. if depth bins are a function of time, this can be used to interpolate onto a constant depth axis, which will also remove the time dimension from that depth variable.
max_gap: jumps in the source variable greater than max_gap are filled with nan (or -99 if int valued). For now this is only supported when int_dim has just one dimension.
gap_fields: None, or a list of variable names to be masked based on gaps.
- int_mode:
- ‘nearest’ - grab the integer value from the nearest sample. ‘linear’ may be added in the future, which would cast to float.
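The combination of 'nearest' lookup and max_gap masking can be illustrated on plain lists. This is a sketch under stated assumptions, not the QDataset code: the function name interp_nearest and its fill argument are hypothetical, and the source coordinate is assumed to be sorted in ascending order.

```python
import bisect
import math


def interp_nearest(src_coord, src_values, new_coord, max_gap=None, fill=math.nan):
    """Nearest-sample lookup of src_values onto new_coord.

    src_coord must be sorted ascending. If max_gap is given, targets that
    fall strictly inside a jump in src_coord larger than max_gap get
    fill instead of a sample value.
    """
    result = []
    for x in new_coord:
        i = bisect.bisect_left(src_coord, x)   # first index with src_coord[i] >= x
        lo = max(i - 1, 0)
        hi = min(i, len(src_coord) - 1)
        in_gap = (
            max_gap is not None
            and src_coord[lo] < x < src_coord[hi]
            and src_coord[hi] - src_coord[lo] > max_gap
        )
        if in_gap:
            result.append(fill)
        elif abs(x - src_coord[lo]) <= abs(src_coord[hi] - x):
            result.append(src_values[lo])
        else:
            result.append(src_values[hi])
    return result


# Integer-valued data, so -99 is used as the fill value instead of nan.
coarse = interp_nearest([0, 1, 2, 10], [5, 6, 7, 8], [0.4, 1.6, 6.0, 10.0],
                        max_gap=3, fill=-99)
```

Here the target 6.0 lies inside the jump from 2 to 10, which is wider than max_gap, so it is masked rather than assigned a neighbour's value.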
-
stompy.io.qnc.
anon_dim_name
(size, **kws)[source]¶ Name given to on-demand dimensions. kws: unused, but might include the type?
-
stompy.io.qnc.
concatenate
(ncs, cat_dim, skip=[], new_dim=None)[source]¶ ncs is an ordered list of QDataset objects. If a single QDataset is given, it will be copied at the metadata level.
new_dim: if given, then fields not having cat_dim, but differing between datasets, will be concatenated along new_dim.
For convenience, elements of ncs which are None are silently dropped.
-
stompy.io.qnc.
downsample
(ds, dim, stride, lowpass=True)[source]¶ Lowpass variables along the given dimension, and resample at the given stride.
lowpass=False => decimate, no lowpass
lowpass=<float> => lowpass window size is lowpass*stride
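The three lowpass settings can be illustrated on a single list of samples. This sketch assumes a simple moving average as the lowpass filter and a window of stride samples when lowpass=True; both are assumptions, since the filter actually used is not stated here, and the name downsample_1d is hypothetical.

```python
def downsample_1d(values, stride, lowpass=True):
    """Optionally smooth values with a moving average, then keep
    every stride-th sample.

    lowpass=False   -> plain decimation
    lowpass=True    -> window of stride samples (assumed default)
    lowpass=<float> -> window of round(lowpass * stride) samples
    """
    if lowpass is False:
        smoothed = list(values)
    else:
        factor = 1.0 if lowpass is True else float(lowpass)
        window = max(1, int(round(factor * stride)))
        half = window // 2
        smoothed = []
        for i in range(len(values)):
            # The window is truncated at the start and end of the series.
            chunk = values[max(0, i - half): i - half + window]
            smoothed.append(sum(chunk) / len(chunk))
    return smoothed[::stride]


samples = [0, 1, 2, 3, 4, 5, 6, 7]
decimated = downsample_1d(samples, stride=2, lowpass=False)
filtered = downsample_1d(samples, stride=2, lowpass=True)
```

Decimation keeps the raw samples at indices 0, 2, 4 and 6, while the filtered result averages neighbouring samples before taking the same indices.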
-
stompy.io.qnc.
linear_to_orthogonal_nc
(nc_src, lin_dim, ortho_dims, nc_dst=None)[source]¶ copy a dataset, changing a linear dimension into a pair of orthogonal dimensions
stompy.io.rbr module¶
-
class
stompy.io.rbr.
Calibration
(txt, coefs, units)[source]¶ Bases:
object
a container for calibration information - these aren’t actually used, though
-
class
stompy.io.rbr.
Rbr
(dat_file, instrument_tz=<UTC>, target_tz=<UTC>)[source]¶ Bases:
object
-
remove_spikes
(ci, method='d2', d2_threshold=30)[source]¶ attempt to automatically remove the spikes. not the best idea, but hopefully saves some time for a quick look at data
- d2_threshold: number of standard deviations in the 2nd derivative beyond which a sample is considered an outlier
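A second-derivative spike test of the kind named by method='d2' can be sketched as follows. The details are assumptions: this version uses a centered second difference and the population standard deviation, and d2_spike_flags is a hypothetical name, not part of stompy.io.rbr.

```python
import statistics


def d2_spike_flags(values, d2_threshold=30):
    """Return one boolean per sample: True where the centered second
    difference exceeds d2_threshold standard deviations."""
    n = len(values)
    d2 = [0.0] * n
    for i in range(1, n - 1):
        d2[i] = values[i - 1] - 2 * values[i] + values[i + 1]
    sigma = statistics.pstdev(d2)
    if sigma == 0:
        return [False] * n   # perfectly smooth series, nothing to flag
    return [abs(d) > d2_threshold * sigma for d in d2]


# A short synthetic record with one spike; a lower threshold is used
# because 30 standard deviations is only reachable in long records.
record = [0.0] * 100
record[50] = 10.0
flags = d2_spike_flags(record, d2_threshold=5)
```

Flagged samples could then be replaced with nan or interpolated over, depending on how the quick look is meant to be used.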
-
-
class
stompy.io.rbr.
RbrHex
(dat_file, instrument_tz=<UTC>, target_tz=<UTC>)[source]¶ Bases:
stompy.io.rbr.Rbr
subclass for reading hex files.
-
class
stompy.io.rbr.
RbrRsk
(dat_file, instrument_tz=<UTC>, target_tz=<UTC>)[source]¶ Bases:
stompy.io.rbr.Rbr
-
class
stompy.io.rbr.
RbrText
(dat_file, instrument_tz=<UTC>, target_tz=<UTC>)[source]¶ Bases:
stompy.io.rbr.Rbr
stompy.io.rdb module¶
Tools for reading RDB files, the text-based format often used in USGS data. See stompy/test/data for examples of this type of data.
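For orientation, an RDB file consists of comment lines beginning with #, a tab-separated line of column names, a line of column format codes, and then tab-separated data rows. The sketch below parses that layout with the standard library only; parse_rdb and the sample text are hypothetical and do not reproduce the Rdb class.

```python
def parse_rdb(text):
    """Split RDB text into (column_names, rows); rows are dicts of strings."""
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    names = lines[0].split("\t")
    # lines[1] holds the column format codes (e.g. 5s, 20d) and is skipped.
    rows = [dict(zip(names, ln.split("\t"))) for ln in lines[2:]]
    return names, rows


sample = (
    "# hypothetical sample, not real USGS data\n"
    "agency_cd\tsite_no\tdatetime\tvalue\n"
    "5s\t15s\t20d\t14n\n"
    "USGS\t00000000\t2008-01-13 00:31\t12.5\n"
)
names, rows = parse_rdb(sample)
```

Values are left as strings here; converting dates and numbers is a separate step.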
-
class
stompy.io.rdb.
Rdb
(text=None, source_file=None, fp=None)[source]¶ Bases:
object
-
data
()[source]¶ assuming that only one data type was requested, try to figure out which column it is, and return that data. For single-valued columns, this will expand the data out to be the right length.
-
parse_date
(s)[source]¶ parse a date like ‘2008-01-13 00:31’ into a float representing absolute days since 0 AD
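A conversion of this kind can be sketched with the datetime module. The exact epoch is an assumption: this version counts days the way datetime.toordinal() does, with 0001-01-01 as day 1, which may or may not match the offset used by Rdb.parse_date. The name parse_date_to_days is hypothetical.

```python
from datetime import datetime


def parse_date_to_days(s):
    """Parse 'YYYY-MM-DD HH:MM' into fractional days.

    Assumes the day count follows datetime.toordinal(), where
    0001-01-01 is day 1 in the proleptic Gregorian calendar.
    """
    dt = datetime.strptime(s, "%Y-%m-%d %H:%M")
    seconds = dt.hour * 3600 + dt.minute * 60 + dt.second
    return dt.toordinal() + seconds / 86400.0


days = parse_date_to_days("2008-01-13 00:31")
```

The integer part identifies the calendar day and the fractional part is the time of day, here 31 minutes past midnight.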
-
record_count
= 0¶
-
stompy.io.rdb_codes module¶
Handle database of USGS codes used in RDB files, namely for parameters (e.g. streamflow in cfs) and statistics (e.g. mean)
-
stompy.io.rdb_codes.
sanitize_code
(code)[source]¶ Make a canonical text version of a code - a 5 digit, 0-padded string
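As a minimal sketch of the canonical form described above, assuming the input is an integer or a string of digits (sanitize_code_sketch is a hypothetical stand-in, not the module's function):

```python
def sanitize_code_sketch(code):
    """Return code as a 5-character, zero-padded digit string."""
    return "%05d" % int(code)


# The same code given as an int, a short string, and an already padded string.
padded = [sanitize_code_sketch(c) for c in (60, "60", "00060")]
```

All three inputs map to the same key, which is what makes lookups in a code table insensitive to how the code was written.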
stompy.io.rdradcp module¶
A mostly direct translation of rdradcp.m to python. 1/3/2013: Updated with DKR changes to rdradcp.m
-
stompy.io.rdradcp.
adcp_merge_nmea
(r, gps_fn, adjust_to_utc=False)[source]¶ parse a NMEA file from WinRiver (i.e. with RDENS sentences), and add lat/lon to r. adjust_to_utc: use GPS time to modify the hours place of r.mtime
-
stompy.io.rdradcp.
checkheader
(fd)[source]¶ Given an open file object, read the ensemble size, skip ahead, make sure we can read the cfg bytes of the next ensemble, come back to the starting place, and report success.
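The peek-ahead-and-rewind pattern can be sketched against an in-memory file. The byte layout used here is an assumption for illustration: each ensemble starts with the ID bytes 0x7F 0x7F, followed by a little-endian 16-bit byte count that excludes a trailing 2-byte checksum. The name check_next_header is hypothetical and the real checkheader may differ in where it leaves the file position.

```python
import io
import struct


def check_next_header(fd):
    """Peek ahead to confirm another ensemble follows, then rewind.

    fd must be positioned just after the two ID bytes of an ensemble.
    Assumed layout: 0x7F 0x7F, a little-endian uint16 byte count that
    excludes the 2-byte checksum, the rest of the ensemble, checksum.
    """
    start = fd.tell()
    valid = False
    raw = fd.read(2)
    if len(raw) == 2:
        (nbytes,) = struct.unpack("<H", raw)
        if nbytes > 0:
            # The ID bytes and the count are already behind us; skip the
            # remainder of the ensemble plus its checksum.
            fd.seek(nbytes - 2, io.SEEK_CUR)
            valid = fd.read(2) == b"\x7f\x7f"
    fd.seek(start)   # come back to the starting place
    return valid


# Two synthetic 10-byte ensembles, each followed by a 2-byte checksum.
ensemble = b"\x7f\x7f" + struct.pack("<H", 10) + bytes(6) + bytes(2)
fd = io.BytesIO(ensemble * 2)
fd.seek(2)                      # just past the first ensemble's ID bytes
first_ok = check_next_header(fd)
fd.seek(14)                     # just past the second ensemble's ID bytes
second_ok = check_next_header(fd)
```

The first check succeeds because a second ensemble follows; the second fails because the data ends, and in both cases the file position is restored.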
-
stompy.io.rdradcp.
invalidate_from_bed
(r)[source]¶ where bottom track is good, nan out data in bottom 5%
-
stompy.io.rdradcp.
rd_buffer
(fd, num_av, msg=<function msg_print>)[source]¶ RH: return ens=None, hdr=None if there’s a problem
returns (ens,hdr,cfg,pos)
-
stompy.io.rdradcp.
rd_fixseg
(fd)[source]¶ Reads the configuration data from the fixed leader. Returns Config, nbyte.
-
stompy.io.rdradcp.
rdradcp
(name, num_av=5, nens=-1, baseyear=2000, despike='no', log_fp=None)[source]¶ The original documentation from Rich Pawlowicz’s code:
RDRADCP Read (raw binary) RDI ADCP files, ADCP=RDRADCP(NAME) reads the raw binary RDI BB/Workhorse ADCP file NAME and puts all the relevant configuration and measured data into a data structure ADCP (which is self-explanatory). This program is designed for handling data recorded by moored instruments (primarily Workhorse-type but can also read Broadband) and then downloaded post-deployment. For vessel-mount data I usually make p-files (which integrate nav info and do coordinate transformations) and then use RDPADCP.
This current version does have some handling of VMDAS, WINRIVER, and WINRIVER2 output files, but it is still ‘beta’. There are (inadequately documented) timestamps of various kinds from VMDAS, for example, and caveat emptor on WINRIVER2 NMEA data.
(ADCP,CFG)=RDRADCP(…) returns configuration data in a separate data structure.
Various options can be specified on input:
(..)=RDRADCP(NAME,NUMAV) averages NUMAV ensembles together in the result.
(..)=RDRADCP(NAME,NUMAV,NENS) reads only NENS ensembles (-1 for all).
(..)=RDRADCP(NAME,NUMAV,(NFIRST NEND)) reads only the specified range of ensembles. This is useful if you want to get rid of bad data before/after the deployment period.
Notes:
- sometimes the ends of files are filled with garbage. In this case you may have to rerun things explicitly specifying how many records to read (or the last record to read). I don’t handle bad data very well. Also - in Aug/2007 I discovered that WINRIVER-2 files can have a varying number of bytes per ensemble. Thus the estimated number of ensembles in a file (based on the length of the first ensemble and file size) can be too high or too low.
- I don’t read in absolutely every parameter stored in the binaries; just the ones that are ‘most’ useful. Look through the code if you want to get other things.
- chaining of files does not occur (i.e. read .000, .001, etc.). Sometimes a ping is split between the end of one file and the beginning of another. The only way to get this data is to concatenate the files, using
cat file1.000 file1.001 > file1 (unix)
copy file1.000/B+file2.001/B file3.000/B (DOS/Windows)
(as of Dec 2005 we can probably read a .001 file)
- velocity fields are always called east/north/vertical/error for all coordinate systems even though they should be treated as 1/2/3/4 in beam coordinates etc.
String parameter/option pairs can be added after these initial parameters:
‘baseyear’: Base century for BB/v8WH firmware (default to 2000).
‘despike’: ‘no’ | ‘yes’ | 3-element vector
Controls ensemble averaging. With ‘no’ a simple mean is used (default). With ‘yes’ a mean is applied to all values that fall within a window around the median (giving some outlier rejection). This is useful for noisy data. Window sizes are [.3 .3 .3] m/s for [ horiz_vel vert_vel error_vel ] values. If you want to change these values, set ‘despike’ to the 3-element vector.
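The difference between the two averaging modes can be shown on a handful of values. This is a sketch of the idea only: it treats the window as a half-width around the median and handles a single variable, both of which are assumptions, and windowed_mean is a hypothetical name rather than a function in rdradcp.

```python
import statistics


def windowed_mean(values, window=None):
    """Average values; with a window, keep only those within
    window of the median before averaging."""
    if window is None:
        return sum(values) / len(values)
    med = statistics.median(values)
    kept = [v for v in values if abs(v - med) <= window]
    if not kept:
        return med   # nothing near the median, fall back to it
    return sum(kept) / len(kept)


pings = [0.10, 0.12, 0.11, 2.50, 0.09]   # one outlier among five ensembles
plain = windowed_mean(pings)              # simple mean, like despike='no'
robust = windowed_mean(pings, window=0.3) # median-windowed mean
```

The single outlier pulls the simple mean well away from the other four values, while the windowed mean ignores it.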
R. Pawlowicz (rich@eos.ubc.ca) - 17/09/99
R. Pawlowicz - 17/Oct/99
5/july/00 - handled byte offsets (and mysterious ‘extra’ bytes) slightly better, Y2K
5/Oct/00 - bug fix - size of ens stayed 2 when NUMAV==1 due to initialization, hopefully this is now fixed.
10/Mar/02 - #bytes per record changes mysteriously, tried a more robust workaround. Guess that we have an extra 2 bytes if the record length is even?
28/Mar/02 - added more firmware-dependent changes to format; hopefully this works for everything now (put previous changes on firmer footing?)
30/Mar/02 - made cfg output more intuitive by decoding things. An early version of WAVESMON and PARSE which split out this data from a wave recorder inserted an extra two bytes per record. I have removed the code to handle this but if you need it see line 509
29/Nov/02 - A change in the bottom-track block for version 4.05 (very old!).
29/Jan/03 - Status block in v4.25 150khzBB two bytes short?
14/Oct/03 - Added code to at least ‘ignore’ WinRiver GPS blocks.
11/Nov/03 - VMDAS navigation block, added hooks to output navigation data.
26/Mar/04 - better decoding of nav blocks - better handling of weird bytes at beginning and end of file - (code fixes due to Matt Drennan).
25/Aug/04 - fixes to “junk bytes” handling.
27/Jan/05 - even more fixes to junk byte handling (move 1 byte at a time rather than two for odd lengths).
29/Sep/2005 - median windowing done slightly incorrectly in a way which biases results in a negative way if data is very noisy. Now fixed.
28/Dec/2005 - redid code for recovering from ensembles that mysteriously change length, added ‘checkheader’ to make a complete check of ensembles.
Feb/2006 - handling of firmware version 9 (navigator)
23/Aug/2006 - more firmware updates (16.27)
23/Aug/2006 - output some bt QC stuff
29/Oct/2006 - winriver bottom track block had errors in it - now fixed.
30/Oct/2006 - pitch_std, roll_std now uint8 and not int8 (thanks Felipe Pimenta)
13/Aug/2007 - added Rio Grande (firmware v 10), better handling of those cursed winriver ASCII NMEA blocks whose lengths change unpredictably. skipping the inadequately documented 2022 WINRIVER-2 NMEA block
13/Mar/2010 - firmware version 50 for WH.
31/Aug/2012 - Rusty Holleman / RMA - ported to python
Python port details:
log_fp: a file-like object - the messages are the same as in the matlab code, but this allows them to be redirected elsewhere.