Quick Start

Installation

pip install multitables

Alternatively, to install from HEAD, run

pip install git+https://github.com/ghcollin/multitables.git

You can also download or clone the repository and run

python setup.py install

multitables depends on tables (the PyTables package), numpy, msgpack, and wrapt. The package targets recent versions of Python 3; Python 2 is not supported because PyTables has dropped it.
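
If an install fails or an import error shows up later, it can help to check which of the dependencies listed above are importable. The following is a minimal sketch using only the standard library; `missing_dependencies` is a hypothetical helper written for this illustration, not part of multitables.

```python
from importlib.util import find_spec

# Import names of the dependencies listed above ("tables" is PyTables).
REQUIRED = ("tables", "numpy", "msgpack", "wrapt")


def missing_dependencies(names=REQUIRED):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    print("missing:", ", ".join(missing) if missing else "none")
```

Any name reported as missing can then be installed with pip before retrying.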

Quick start: Streaming

import multitables
stream = multitables.Streamer(filename='/path/to/h5/file')
for row in stream.get_generator(path='/internal/h5/path'):
    do_something(row)
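
Since the loop above consumes the stream as an ordinary Python iterable, rows can be grouped into batches with standard tools. The sketch below uses a hypothetical `batched` helper and a plain `range` as a stand-in for `stream.get_generator(...)`; it assumes each yielded row remains valid after the next one has been fetched.

```python
from itertools import islice


def batched(rows, size):
    """Yield lists of up to `size` consecutive items from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for stream.get_generator(path=...); any iterable of rows works.
rows = range(7)
batches = list(batched(rows, 3))
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```

The final batch may be shorter than `size` when the number of rows is not an exact multiple.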

Quick start: Random access

import numpy as np  # needed for the boolean mask request further down
import multitables
reader = multitables.Reader(filename='/path/to/h5/file')

dataset = reader.get_dataset(path='/internal/h5/path')
stage = dataset.create_stage(10) # Size of the shared
                                    # memory stage in rows

req = dataset['col_A'][30:35] # Create a request as you
                                 # would index normally.

future = reader.request(req, stage) # Schedule the request
with future.get_unsafe() as data:
    do_something(data)
data = None # Always set data to None after get_unsafe to
            # prevent a dangling reference

# ... or use a safer proxy method

req = dataset.col('col_A')[30:35,...,:100]

future = reader.request(req, stage)
with future.get_proxy() as data:
    do_something(data)

# ... or provide a function to run on the data

req = dataset.read_sorted('col_C', checkCSI=True, start=200, stop=300)

future = reader.request(req, stage)
future.get_direct(do_something)

# ... or get a copy of the data

req = dataset['col_A'][30:35,np.arange(500) > 45]

future = reader.request(req, stage)
do_something(future.get())

# once done, close the reader
reader.close(wait=True)
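
The requests above are built with ordinary indexing syntax. As a rough illustration of what those index expressions select, here they are applied to an in-memory NumPy array. The array `col_a` and its shape (1000 rows of 500 values) are assumptions made for this sketch, and it presumes that requests follow NumPy indexing semantics.

```python
import numpy as np

# Hypothetical stand-in for col_A: 1000 rows, each holding 500 values.
col_a = np.arange(1000 * 500).reshape(1000, 500)

rows = col_a[30:35]             # rows 30..34, all values
head = col_a[30:35, ..., :100]  # rows 30..34, first 100 values of each
mask = np.arange(500) > 45      # True for positions 46..499
masked = col_a[30:35, mask]     # rows 30..34, values at positions 46..499

print(rows.shape, head.shape, masked.shape)  # (5, 500) (5, 100) (5, 454)
```

Trying an index expression on a small array like this is a cheap way to confirm the expected shape before sizing a stage.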

Examples

See the How-To for more in-depth documentation, and the unit tests for complete examples.