DEMutilities.postprocessing.h5storage

The h5storage module interfaces Mpacts simulations with HDF5 files through the Python interface h5py. The class H5Storage can be used both in a simulation session and in a postprocessing session, and adds information about the simulation, its environment, and any desired ‘results’ to the HDF5 file. The classes HdfFrameReader and HdfReader provide convenient read access to individual .hdf5 files and to collections of .hdf5 files.

To use this module, a working installation of h5py is required. On Debian-based systems, you can install it with:

$ sudo apt install python-h5py

or on other systems, using pip:

$ sudo pip install h5py

In order to use this module, import it like this:

import DEMutilities.postprocessing.h5storage
#or assign it to a shorter name
import DEMutilities.postprocessing.h5storage as h5s

H5BaseEntry

class DEMutilities.postprocessing.h5storage.H5BaseEntry(f, name, overwrite)

Bases: object

Base class for data entries in HDF5 files, such as H5Entry and H5DataSection. The default way of accessing elements within an H5BaseEntry is the __call__ operator (()), which tries to return the requested data first as another H5Entry/H5DataSection, or as an h5py.Dataset/h5py.Group otherwise.

Parameters:
  • f (either an actual h5py.HLObject, something that derives from H5BaseEntry, or an HdfFrameReader instance) – HDF5 file object that will be accessed
  • name (str) – From the current location (or in absolute path), the name in the file that this entry must point towards. If the file is accessed in write-mode, a new entry will be created with this name, should it not exist
  • overwrite (bool) – If True and the file is accessed in write-mode, any existing entry with the same name will be deleted and replaced with a new, empty one.
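The kind of path-based access that H5BaseEntry wraps can be illustrated with plain h5py; the file layout below is a made-up example:

```python
import os
import tempfile

import h5py

# Build a tiny HDF5 file to browse; the layout is a made-up example.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(path, "w") as f:
    f.create_group("results")
    f["results"].create_dataset("t", data=[0.0, 0.1, 0.2])

# h5py exposes groups and datasets through dictionary-style paths;
# H5BaseEntry.__call__ layers the H5Entry/H5DataSection wrappers on
# top of exactly this kind of lookup.
with h5py.File(path, "r") as f:
    t = f["results/t"][...]  # read the dataset into a numpy array
print(list(t))
```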
H5BaseEntry(name, parent, **kwargs)
assert_write_mode()

Asserts that the current file is opened in write-compatible mode. Throws otherwise.

get(value)

Gets the HDF5 Group/Dataset with the given name. None is returned if it does not exist.

get_attr(name)

Gets this entry’s attribute with given name. None is returned if it does not exist.

ls(pad=5)

Allows easy browsing through hdf5 files in interactive terminals by listing all contents of the current entry.

print_entries(pad=5)
set_attr(name, value)

Sets this entry’s attribute with name to value. The attribute is created if it did not exist.
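In terms of plain h5py, set_attr and get_attr correspond to the .attrs mapping of a group or dataset; a minimal sketch with a hypothetical file:

```python
import os
import tempfile

import h5py

# set_attr/get_attr map onto h5py's .attrs; get_attr returns None for a
# missing attribute instead of raising. The file layout here is made up.
path = os.path.join(tempfile.mkdtemp(), "attrs_demo.h5")
with h5py.File(path, "a") as f:
    g = f.create_group("entry")
    g.attrs["unit"] = "N/m"           # like set_attr("unit", "N/m")
    unit = g.attrs.get("unit")        # like get_attr("unit")
    missing = g.attrs.get("no_such")  # get_attr of a missing name
print(unit, missing)
```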

H5DataSection

class DEMutilities.postprocessing.h5storage.H5DataSection(f, name, overwrite=False)

Bases: DEMutilities.postprocessing.h5storage.H5BaseEntry

Class used for results sections in an H5Storage class instance. H5DataSection provides functions that convert value- or array-like data into a commonly agreed upon data format inside the HDF5 file. H5DataSection should usually not be instantiated directly, but through the data_section() function of an H5Storage object.

Parameters:
  • f (either an actual h5py.HLObject, something that derives from H5BaseEntry, or an HdfFrameReader instance) – HDF5 file object that will be accessed
  • name (str) – From the current location (or in absolute path), the name in the file that this entry must point towards. If the file is accessed in write-mode, a new entry will be created with this name, should it not exist
  • overwrite (bool) – If True and the file is accessed in write-mode, any existing entry with the same name will be deleted and replaced with a new, empty one.
H5DataSection(name, parent, **kwargs)
add_data(name, data, axes=None, description='', axis_label='', label='', unit='', **kwargs)

General all-purpose method to add ‘any’ data to an HDF5 file using a common syntax. Adds a new data set group to this H5DataSection with a given name. In this group, a member with name "data" is created which contains the values given in the argument data.

Parameters:
  • name (str) – Name of the new data entry in the HDF5 file
  • data – Any HDF5-compatible data that will be stored in the group with name name. Rule-of-thumb: If something can be converted into a numpy array it can probably be saved.
  • axes (list) – List of strings referring to other (existing) data set entries specified by either a relative or an absolute path in the HDF5 file. By specifying axes, you can indicate that the given data is a dependent variable that is a function of the given independent variables. For example, if the data stored in F contains forces that vary with time t, we can specify axes = ['t'] to capture this relationship. For multidimensional data, more than one axis can be specified. If this is done, the actual data will be reshaped such that shape=(len(a1),len(a2),..,len(an)) for n dimensions. For example, if the data stored in F depends on time t and particle index idx, we can specify axes=['t','idx']. While optional, providing axes is crucial to prevent multidimensional data from later being plotted or represented improperly.
  • description (str) – (optional) description of the data that will be stored
  • axis_label (str) – (optional) label of the axis along which given data can vary. This field will be consulted e.g. by plotting tools to generate axis labels. LaTeX syntax is accepted e.g. one can write "$F_n$" to generate F_n.
  • unit (str) – (optional) unit of the values in the given data. This field will be consulted e.g. by plotting tools to generate axis labels. LaTeX syntax is accepted e.g. one can write "J/m$^2$".
  • **kwargs – Any additional key word argument is added to the data set entry with name = ‘key’ and value = ‘value’. For example, specifying foo=2 in the argument list will create an additional attribute named "foo" with value 2.
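The layout that add_data() produces can be sketched with plain h5py. The member name "data" is documented above; the attribute names ("description", "unit", "axis_label", "axes") are assumptions based on add_data's argument names:

```python
import os
import tempfile

import h5py
import numpy as np

# Sketch of the layout add_data() plausibly produces; the exact on-disk
# convention (attribute names etc.) is an assumption.
path = os.path.join(tempfile.mkdtemp(), "add_data_demo.h5")
with h5py.File(path, "w") as f:
    results = f.create_group("results")

    # Independent variable: time
    t = results.create_group("t")
    t.create_dataset("data", data=np.linspace(0.0, 0.4, 5))

    # Dependent variable: force, a function of t
    F = results.create_group("F")
    F.create_dataset("data", data=np.array([1.0, 1.5, 2.0, 2.5, 3.0]))
    F.attrs["description"] = "normal force"
    F.attrs["unit"] = "N"
    F.attrs["axis_label"] = "$F_n$"
    F.attrs["axes"] = ["t"]  # marks F as dependent on the entry 't'

with h5py.File(path, "r") as f:
    shape = f["results/F/data"].shape
    unit = f["results/F"].attrs["unit"]
print(shape, unit)
```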
apply_function_on_data_axis(source_name, result_name, func, axis, *args, **kwargs)
apply_function_to_data(source_name, result_name, func, *args, **kwargs)
copy_all_data_recursive(source, destination, **kwargs)

Recursively copy all dataset entries in ‘source’ to ‘destination’. Data set entries with the same name and dataset tree structure as exist in source are created and put in destination. Key word arguments as accepted by copy_data() can be provided.

Parameters:
  • source (str) – Path from which all sub-entries will be copied
  • destination (str) – Path (created if necessary) to which all new sub-entries will be copied
  • kwargs – Key word arguments accepted by copy_data
copy_data(source, destination, overwrite=True, **kwargs)

Copy the data entry at path ‘source’ to an entry with name ‘destination’. By default, all data elements are copied exactly; only the dataset entry ‘data’ can optionally be ‘filtered’ with the data indices to be copied.

Parameters:
  • source (str) – Path to data set entry that should be copied
  • destination (str) – Path to new data set entry which should be a copy of source
  • overwrite (bool) – If True, the copy will fully erase any similarly named dataset entry that already existed. If False, no data will be deleted, but an error might be raised instead if the copy cannot be performed.
  • kwargs – Key word arguments are extracted as follows: if ‘data_indices’ is present, it is simply used to extract from data all array values with these indices. Otherwise, for any other single key word argument, a dataset entry with that precise name is searched for, and the data is ‘masked’ with the content provided in the given key word argument.
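What a filtered copy with ‘data_indices’ amounts to can be sketched with plain h5py and numpy; the entry layout (the ‘data’ member) follows the documentation above, the rest is illustrative:

```python
import os
import tempfile

import h5py
import numpy as np

# Illustrative sketch of a filtered copy with 'data_indices'.
path = os.path.join(tempfile.mkdtemp(), "copy_demo.h5")
with h5py.File(path, "a") as f:
    src = f.create_group("results/F")
    src.create_dataset("data", data=np.arange(10.0))

    data_indices = [0, 2, 4]
    dst = f.create_group("results/F_filtered")
    # Only the 'data' member is filtered; other members and attributes
    # would be copied verbatim by copy_data.
    dst.create_dataset("data", data=src["data"][...][data_indices])
    filtered = list(dst["data"][...])
print(filtered)
```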
data_section(name, overwrite=False)

Adds a new data section, or accesses an existing one with given name.

scale(name, value, **kwargs)
set_attrs(name, **kwargs)

Sets all attributes in the entry with name name with given key word arguments.

shift(name, value, **kwargs)

H5Entry

class DEMutilities.postprocessing.h5storage.H5Entry(f, name, overwrite=True)

Bases: DEMutilities.postprocessing.h5storage.H5BaseEntry

Common interface to data entries in HDF5 files, with a common and recognizable data layout. All H5Entry objects have a [‘data’] member that contains the actual data, and are preferentially initialized using the add_data() method on H5DataSection objects. On top of h5py functionality, H5Entry implements some convenient methods to change and manipulate the underlying data, and to manage its location in the HDF5 file.

Parameters:
  • f (either an actual h5py.HLObject, something that derives from H5BaseEntry, or an HdfFrameReader instance) – HDF5 file object that will be accessed
  • name (str) – From the current location (or in absolute path), the name in the file that this entry must point towards. If the file is accessed in write-mode, a new entry will be created with this name, should it not exist
  • overwrite (bool) – If True and the file is accessed in write-mode, any existing entry with the same name will be deleted and replaced with a new, empty one.

Tip

Using the [] operator on an H5Entry, one has direct access to array elements if the passed value is of integer type.

H5Entry(name, parent, **kwargs)
assert_data()

Asserts that a valid "data" exists in this data entry. If not, a new one with an empty list [] is created.

copy(name=None, **kwargs)

Copy this data entry to name. If name is a data section which is not our own data section, a copy is made with the same name in that data section. If name is a data entry other than ourselves, this given entry is replaced by our own data entry. If name is a string, it must refer to a valid path in the hdf5 file, toward which this entry will be copied. If name is None, a new entry is created in our own data section, with the same name as ourselves and -copy appended to it. Returns the copied data entry.

Example usage:

d1 = H5Entry( f, "test1" )
d2 = H5Entry( f, "test2" )
s1 = H5DataSection( f, "section" )

d3 = d1.copy()  # Name will be assigned automatically
d4 = d1.copy( d2 )
d5 = d1.copy( s1 )
filter_on_axis(result_name, value, axis, **kwargs)
function(function, *args)

Applies a given function to the data of this data entry and stores the result in-place. Optional arguments to the function are passed as additional arguments.

Example usage:

import numpy

d1 = H5Entry(f, "test1")
d1.set_data( [1,2,3,4,5,6,7,8] )
d1.function( numpy.multiply, 2  )
function_on_axis(result_name, func, axis, *args, **kwargs)

Applies a function to this data entry, but along a given axis (assuming it contains a multidimensional array). A new data entry is always created with result_name. Any ‘axis’ members that this data entry contains will be automatically updated for the created entry, properly reduced with the given axis. For more information on this functionality, see apply_function_on_data_axis() of H5DataSection.

get_data()

Returns the underlying data as a numpy array. If no data exists, an empty array is returned.

move(name)

Move this data entry to name. If name is a data section which is not our own data section, it is moved to an entry with the same name in that data section. If name is a data entry other than ourselves, this given entry is replaced by our own data entry. If name is a string, it must refer to a valid path in the hdf5 file, toward which this entry will be moved. Returns the moved data entry.

Example usage:

d1 = H5Entry( f, "test1" )
d2 = H5Entry( f, "test2" )
s1 = H5DataSection( f, "section" )

d3 = d1.move( s1 )
d4 = d1.move( d2 )
remove_data()

Remove the "data" in this data entry.

scale(value)

Scales (multiplies with) the data of this data entry with given value

set_attrs(**kwargs)

Sets attributes passed as key word arguments.

Example usage:

d1 = H5Entry( f, "test1" )
d1.set_attrs( a=1, b=2, unit='N/m')
set_data(data)

Sets the "data" of this entry to the given value. All previous existing data is removed

shift(value)

Shifts (adds to) the data of this data entry with given value

value()

Returns the raw value contained in the underlying h5 file.

H5Storage

class DEMutilities.postprocessing.h5storage.H5Storage(filename, openmode='a', scriptname=None, overwrite=False)

Bases: DEMutilities.postprocessing.h5storage.H5BaseEntry

Class for storing mpacts simulation and postprocessing information in an HDF5 file

Parameters:
  • filename (str) – Name of the HDF5 file to be created
  • openmode (str) – File open mode. Options are:
      - "w": write mode; any new write action will overwrite the existing data
      - "a": append mode; any new write action will append to the original file
      - "r": read mode; no write action is allowed
      - "wa": write-append mode; the file is initially opened in write mode, thereby overwriting any existing HDF5 file with the same name, but individual write actions will later be performed in append mode, ensuring that stored data is maintained. This mode is recommended for use within simulations.
  • scriptname (str) – Name of the original Python script that was used to perform the simulations. If specified, this script will be added to the HDF5 file when calling add_simulation_info().
  • overwrite (bool) – If True, adding sub-groups with the same name (e.g. identical results sections) will not trigger an error, but simply replace the original data.

Example usage within a simulation script:

import mpacts.core.simulation as sim
import mpacts.core.valueproperties as v

storage = H5Storage( 'simulation.h5', 'wa', overwrite=True )
mysim = sim.Simulation( "simulation" )
#Add an Mpacts Variable to the simulation. Its value will be saved automatically in the HDF5 file!
v.Variable( "g", mysim('params'), value=(0,-9.81,0) )
storage.add_simulation_info(mysim)

#... Add all necessary elements for a full mpacts simulation

#Run the simulation
mysim.run_n(100)

#This will add some performance and environment information to the HDF5 file:
storage.simulation_completed()

For post-processing, the member function data_section() can be called to add objects of H5DataSection, which provide ‘save’ functions for specific data types. Example usage in a post-processing step:

#Ensure that we use 'append' mode 'a' now, since otherwise our original data is lost:
storage = H5Storage( 'simulation.h5', 'a' )

#Add a 'results' section to our data:
results = storage.data_section("results", overwrite=True)

#Assume we want to store a single value called 'value':
results.add_data("value_name", value, description="The best value"
                unit="kg", axis_label='$v_\mathrm{best}$')
H5Storage(name, parent, **kwargs)
add_mpacts_info()

Add information about the version of mpacts used to perform this simulation. The following information will be added:

  • git revision of current mpacts version
  • C++ compiler used for building the current mpacts version
  • whether the current build of mpacts is a debug version
add_performance_info()

Add general information about the computational process used in this simulation. The following information will be added:

  • peak memory usage in MB
  • user time and system time
  • elapsed time and process elapsed time
  • end time of the simulation as a human readable date
  • end time of the simulation in seconds from the current epoch.
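The numbers listed above can be gathered with the standard library alone; a sketch (Unix-only, via the resource module), not the library's actual implementation:

```python
import resource
import time

# Standard-library sketch (Unix-only) of the numbers add_performance_info()
# records; this is not the library's actual implementation.
usage = resource.getrusage(resource.RUSAGE_SELF)
info = {
    "peak_memory_MB": usage.ru_maxrss / 1024.0,  # ru_maxrss is in kB on Linux
    "user_time_s": usage.ru_utime,
    "system_time_s": usage.ru_stime,
    "end_time": time.ctime(),         # human readable date
    "end_time_epoch_s": time.time(),  # seconds since the epoch
}
print(sorted(info))
```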
add_simulation_info(mysim, name='settings', store_parameters=True)

Add some information about an Mpacts simulation to an H5Storage object

Parameters:
  • mysim (mpacts.core.simulation.Simulation) – A valid mpacts simulation (root of the tree)
  • name (str) – Name under which the parameters of the simulation will be stored
This function will add the following information under "./simulation_info":
  • start time of the simulation as a human readable date
  • start time of the simulation in seconds from the start of the epoch
  • a unique identifier for this simulation
  • the number of threads used for this simulation
  • the type of contact data storage used for this simulation
  • the (Python) working directory when the simulation was originally performed
  • optionally, the Python script that specifies this simulation
  • all properties used in the simulation. These will be located under name.
add_simulation_metrics(mysim)
add_system_info()

Add information about the system used to perform this simulation. The following information will be added:

  • Distribution of Linux (if any)
  • Implementation of the Python interpreter
  • Version of Python
  • Information about the CPU
  • Version of standard C library libc
  • Host name of the used computing node
  • System architecture family
  • OS kernel version
  • OS family (e.g. Windows, Linux, or Darwin=Mac).
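Most of these fields are available from Python's standard platform module; a sketch of how such information can be gathered, not the library's actual implementation:

```python
import platform

# Standard-library sketch of gathering system information; the key names
# are illustrative, not the library's actual field names.
info = {
    "python_implementation": platform.python_implementation(),  # e.g. CPython
    "python_version": platform.python_version(),
    "cpu": platform.processor(),
    "libc_version": " ".join(platform.libc_ver()),
    "hostname": platform.node(),
    "architecture": platform.machine(),  # e.g. x86_64
    "kernel_version": platform.release(),
    "os_family": platform.system(),      # e.g. Windows, Linux, or Darwin
}
print(info["os_family"])
```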
close()

Close the existing HDF5 file object if it was not already closed

data_section(name, overwrite=False)
simulation_completed(sim=None)
store_parameters(sim, group)

HdfFrameReader

class DEMutilities.postprocessing.h5storage.HdfFrameReader(fname, index=-1)

Bases: object

Reader class that provides data access to an HDF5 file

Parameters:
  • fname (str) – Name of a valid HDF5 file that should be accessed.

Example usage:

#Make an HdfFrameReader object:
data = HdfFrameReader( 'simulation.h5' )

#Data access has the same interface as DataFrameReaders and VTPDataFrameReaders:
start_time = data('simulation_info/start_time')
print( start_time )
HdfFrameReader(name, parent, **kwargs)
close()
DEMutilities.postprocessing.h5storage.call_h5_item(h5file, item_name)
DEMutilities.postprocessing.h5storage.filt(array, value)
DEMutilities.postprocessing.h5storage.h5todict(*dsargs, **kwargs)
DEMutilities.postprocessing.h5storage.h5topandas(*dsargs, **kwargs)
DEMutilities.postprocessing.h5storage.import_csv(ds, fname, **kwargs)

In a given data section, import the named data field entries from a CSV file with the given file name.

Parameters:
  • ds – Data section entry (H5DataSection)
  • fname (str) – Name of the CSV file which will be imported into the HDF5 file.
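Conceptually, import_csv turns each named column of the CSV file into a data set entry in the given data section; a self-contained sketch with plain csv and h5py (layout details are assumptions):

```python
import csv
import io
import os
import tempfile

import h5py

# Conceptual sketch of import_csv: each named column of the csv becomes a
# data set entry (with a 'data' member) in the data section. The exact
# layout and any extra attributes are assumptions.
csv_text = "t,F\n0.0,1.5\n0.1,1.7\n0.2,1.9\n"

columns = {}
for row in csv.DictReader(io.StringIO(csv_text)):
    for key, value in row.items():
        columns.setdefault(key, []).append(float(value))

path = os.path.join(tempfile.mkdtemp(), "import_demo.h5")
with h5py.File(path, "w") as f:
    ds = f.create_group("results")
    for name, values in columns.items():
        ds.create_group(name).create_dataset("data", data=values)

with h5py.File(path, "r") as f:
    n = len(f["results/F/data"])
print(n)
```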
DEMutilities.postprocessing.h5storage.import_h5(ds, fname, **kwargs)
DEMutilities.postprocessing.h5storage.neutralfunc(val)