Quick Start
Installation
pip install multitables
Alternatively, to install from HEAD, run:
pip install git+https://github.com/ghcollin/multitables.git
You can also download or clone the repository and run:
python setup.py install
multitables depends on tables (the pytables package), numpy, msgpack, and wrapt.
The package requires a recent version of Python 3. Python 2 is not supported, because pytables no longer supports it.
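As a quick sanity check after installation, the sketch below reports which of the dependencies listed above cannot be found for import. It uses only the standard library. The helper name `missing_dependencies` is hypothetical and not part of multitables, and the import names are assumed to be `tables` (for pytables), `numpy`, `msgpack`, and `wrapt`.

```python
import importlib.util

# Import names of the dependencies listed above. Note that the
# pytables package is imported as `tables`.
REQUIRED_MODULES = ("tables", "numpy", "msgpack", "wrapt")


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found for import.

    Hypothetical helper for illustration; not part of multitables.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All multitables dependencies are importable.")
```

An empty list means every dependency was located. This only confirms that the modules can be found, not that their versions are compatible.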
Quick start: Streaming
import multitables
stream = multitables.Streamer(filename='/path/to/h5/file')
for row in stream.get_generator(path='/internal/h5/path'):
    do_something(row)
Quick start: Random access
import numpy as np  # Needed for the boolean mask example further down
import multitables
reader = multitables.Reader(filename='/path/to/h5/file')
dataset = reader.get_dataset(path='/internal/h5/path')
stage = dataset.create_stage(10)  # Size of the shared memory stage, in rows
req = dataset['col_A'][30:35]  # Create a request as you would index normally
future = reader.request(req, stage) # Schedule the request
with future.get_unsafe() as data:
    do_something(data)
# Always set data to None after get_unsafe to prevent a dangling reference
data = None
# ... or use a safer proxy method
req = dataset.col('col_A')[30:35,...,:100]
future = reader.request(req, stage)
with future.get_proxy() as data:
    do_something(data)
# ... or provide a function to run on the data
req = dataset.read_sorted('col_C', checkCSI=True, start=200, stop=300)
future = reader.request(req, stage)
future.get_direct(do_something)
# ... or get a copy of the data
req = dataset['col_A'][30:35,np.arange(500) > 45]
future = reader.request(req, stage)
do_something(future.get())
# once done, close the reader
reader.close(wait=True)
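If `do_something` raises, the final `reader.close(wait=True)` call above is never reached. One way to guard against that is a small context manager, sketched below. `closing_reader` is a hypothetical helper, not part of multitables. It assumes only that the reader exposes `close(wait=...)` as used in the quick start.

```python
from contextlib import contextmanager


@contextmanager
def closing_reader(reader, wait=True):
    """Yield `reader`, then call reader.close(wait=wait), even on error.

    Hypothetical helper for illustration; not part of multitables.
    """
    try:
        yield reader
    finally:
        # Runs whether the with-body finished normally or raised.
        reader.close(wait=wait)


# Intended usage, mirroring the quick start above (not executed here):
#
#     with closing_reader(multitables.Reader(filename='/path/to/h5/file')) as reader:
#         dataset = reader.get_dataset(path='/internal/h5/path')
#         ...
```

The standard library's `contextlib.closing` is not used here because it calls `close()` with no arguments, whereas the quick start passes `wait=True`.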
Examples
See the How-To for more in-depth documentation, and the unit tests for complete examples.