Package etomica.data

Package etomica.data Description

Defines structures that represent, generate, process, and output data from a simulation. The data managed by these classes are typically those that would be considered the "results" of the simulation; data needed to conduct the simulation (such as the atom positions) are not normally handled by these classes.

The abstract class Data is the generic class for encapsulating data. Its major subclasses are defined in etomica.data.types and include, for example, classes that encapsulate primitives and primitive arrays (such as double and double[]). In addition to the actual data values, the Data class holds an instance of DataInfo. The DataInfo holds information about the values held by the Data instance; in particular, it has a descriptive String label and an instance of Dimension that indicates the physical dimensions of the data values. DataInfo also holds a DataFactory that can be used to construct new Data instances having the same type and structure as the Data holding the DataInfo (for more information about the structure of a Data instance, see the types package). DataInfo is declared final in Data, and is itself immutable. To change the information in DataInfo, it is necessary to change it in the DataSource, which should then generate a new Data instance with the updated information.
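The relationship between Data, DataInfo, and DataFactory described above can be pictured with a minimal sketch. The classes SketchInfo and SketchData below are hypothetical, simplified stand-ins written for this illustration; they are not the etomica classes, and the dimension is reduced to a plain String as an assumption.

```java
// Hypothetical stand-ins for illustration; not the actual etomica API.
final class DataInfoDemo {
    public static void main(String[] args) {
        SketchInfo info = new SketchInfo("Pressure", "pressure", 3);
        SketchData original = info.makeData();
        original.values[0] = 1.5;
        // A new instance with the same type and structure, built from the shared info.
        SketchData scratch = original.info.makeData();
        System.out.println(scratch.info.label + " length=" + scratch.values.length);
    }
}

/** Immutable description of the values: label, dimension name, and array length. */
final class SketchInfo {
    final String label;
    final String dimension; // assumption: a String stands in for a Dimension object
    final int length;

    SketchInfo(String label, String dimension, int length) {
        this.label = label;
        this.dimension = dimension;
        this.length = length;
    }

    /** Plays the role of the factory: builds a new data object of the same structure. */
    SketchData makeData() {
        return new SketchData(this);
    }
}

/** Holds the values plus a final reference to the immutable info. */
final class SketchData {
    final SketchInfo info;
    final double[] values;

    SketchData(SketchInfo info) {
        this.info = info;
        this.values = new double[info.length];
    }
}
```

Because every field of SketchInfo is final, the only way to change the label or dimension in this sketch is to build a new SketchInfo and a new SketchData from it, which mirrors the rule stated above.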

A flow model is used to process data. Data instances are generated by a DataSource. A DataPump retrieves the Data from the DataSource and passes it to a DataSink; DataPump implements Action, so it can move Data in response to a user action or (more likely) an IntegratorEvent. The DataSink will process the Data (accumulate an average, for example), and if it implements DataPipe it may itself have a DataSink to which it pushes the output Data from its processing step (the output Data might be a different instance than the input). Thus the Data moves down the line until it reaches a DataSink that doesn't pass it further.
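As a rough illustration of this flow, the following sketch wires a source, a pump, an averaging pipe, and a terminal sink. All interface and class names (FlowSource, FlowSink, FlowPump, AveragingPipe, LastValueSink) are hypothetical, plain double[] arrays stand in for Data, and Runnable stands in for Action; none of this is the etomica API itself.

```java
// Hypothetical stand-ins for illustration; not the actual etomica API.
final class FlowDemo {
    /** Pumps the values 2, 4, 6 through an averaging pipe and returns the last average. */
    static double run() {
        int[] calls = {0};
        FlowSource source = () -> new double[] {2.0 * ++calls[0]};
        AveragingPipe average = new AveragingPipe();
        LastValueSink end = new LastValueSink();
        average.setDataSink(end);
        FlowPump pump = new FlowPump(source, average);
        for (int i = 0; i < 3; i++) {
            pump.run(); // in a simulation this would be triggered by an event
        }
        return end.last[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

interface FlowSource { double[] getData(); }

interface FlowSink { void putData(double[] data); }

/** Pulls from the source and pushes to the sink each time it is run. */
final class FlowPump implements Runnable {
    private final FlowSource source;
    private final FlowSink sink;

    FlowPump(FlowSource source, FlowSink sink) {
        this.source = source;
        this.sink = sink;
    }

    @Override
    public void run() {
        sink.putData(source.getData());
    }
}

/** A sink that keeps a running average and forwards it to its own downstream sink. */
final class AveragingPipe implements FlowSink {
    private FlowSink downstream;
    private double[] sum;
    private int count;

    void setDataSink(FlowSink downstream) { this.downstream = downstream; }

    @Override
    public void putData(double[] data) {
        if (sum == null) sum = new double[data.length];
        count++;
        double[] avg = new double[data.length]; // output is a different instance than the input
        for (int i = 0; i < data.length; i++) {
            sum[i] += data[i];
            avg[i] = sum[i] / count;
        }
        if (downstream != null) downstream.putData(avg);
    }
}

/** Terminal sink: stores what it receives and passes it no further. */
final class LastValueSink implements FlowSink {
    double[] last;

    @Override
    public void putData(double[] data) { last = data; }
}
```

The pump holds no processing logic of its own; it only moves data, so the same pump class can drive any combination of source and sink.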

Before a Data instance can be sent through a sequence of data processing elements, the DataInfo for that Data must first be sent through the same sequence. This procedure alerts each element to the type and structure of the Data it can expect to receive. Some elements need to prepare by creating "scratch" Data instances, which they use to conduct their calculations; they do this using the DataFactory held by the DataInfo. The DataPump performs this preparation automatically.
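The two-phase protocol (structure first, values second) might look like the sketch below. The names PreparedSink, SquaringSink, and PreparingPump are hypothetical, and the "info" is reduced to an array length as a simplifying assumption; the real DataInfo carries more than this.

```java
// Hypothetical stand-ins for illustration; not the actual etomica API.
final class PrepareDemo {
    static double[] run() {
        SquaringSink sink = new SquaringSink();
        PreparingPump pump = new PreparingPump();
        pump.pump(new double[] {3.0, 4.0}, sink);
        return sink.scratch;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}

interface PreparedSink {
    /** Announces the structure of the data that will follow. */
    void putDataInfo(int length);

    void putData(double[] data);
}

/** Allocates its scratch array once, when told what structure to expect. */
final class SquaringSink implements PreparedSink {
    double[] scratch;

    @Override
    public void putDataInfo(int length) {
        scratch = new double[length];
    }

    @Override
    public void putData(double[] data) {
        for (int i = 0; i < data.length; i++) {
            scratch[i] = data[i] * data[i];
        }
    }
}

/** Sends the structure information before the first data, then only data. */
final class PreparingPump {
    private boolean infoSent;

    void pump(double[] data, PreparedSink sink) {
        if (!infoSent) {
            sink.putDataInfo(data.length);
            infoSent = true;
        }
        sink.putData(data);
    }
}
```

Allocating the scratch storage in putDataInfo rather than in putData keeps the per-step path free of allocation, which is one plausible motivation for the protocol.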

Sometimes a Data instance must be cast to another Data type, one that a data processing element is configured to handle. To handle this case, there are Data casters defined in etomica.data.types; these are DataProcessor subclasses that perform the transformation. The casters are inserted automatically, if needed, when a DataSink is added to a DataPipe.
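The idea of a caster as an adapter between two data types can be sketched as follows. ScalarSink, ArraySink, ScalarToArrayCaster, and CollectingArraySink are hypothetical names invented for this example, and the automatic insertion performed by the library is not modeled here; the caster is wired in by hand.

```java
// Hypothetical stand-ins for illustration; not the actual etomica API.
final class CasterDemo {
    static double[] run() {
        CollectingArraySink sink = new CollectingArraySink();
        // The caster lets a scalar-producing upstream feed an array-consuming sink.
        ScalarSink entry = new ScalarToArrayCaster(sink);
        entry.putScalar(2.5);
        return sink.received;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}

interface ScalarSink { void putScalar(double value); }

interface ArraySink { void putArray(double[] values); }

/** Accepts a scalar and forwards it as a one-element array. */
final class ScalarToArrayCaster implements ScalarSink {
    private final ArraySink downstream;

    ScalarToArrayCaster(ArraySink downstream) {
        this.downstream = downstream;
    }

    @Override
    public void putScalar(double value) {
        downstream.putArray(new double[] {value});
    }
}

/** An element that is only configured to handle arrays. */
final class CollectingArraySink implements ArraySink {
    double[] received;

    @Override
    public void putArray(double[] values) { received = values; }
}
```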

Most DataSources are classes that perform some measurement on a Box; a DataSource of this variety is termed a Meter, and many are defined in the package etomica.data.meter. Some DataSources are not connected to a Box; examples include sources that report the simulation time or yield the acceptance probability for a Monte Carlo trial. Some such classes are defined in this package (i.e., etomica.data).

Several Data instances can be bundled and processed as a single Data instance. DataGroup is configured to hold heterogeneous Data instances (having different type or structure); DataArray can hold multiple Data instances of the same type and structure (and sharing the same DataInfo). These classes are defined in etomica.data.types. Often Data collected this way must be "unpacked" at some point downstream; this is not easily done in a general way, and the developer must insert a DataGroupExtractor or DataGroupFilter that specifies how such Data are handled.
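A minimal sketch of bundling and unpacking is given below. SketchGroup and SketchExtractor are hypothetical stand-ins: the group holds heterogeneous members as plain Objects, and the extractor is told by the developer which member to pull out, which is the kind of explicit choice the paragraph above refers to.

```java
// Hypothetical stand-ins for illustration; not the actual etomica API.
final class GroupDemo {
    static Object run() {
        // Members of different type and structure travel as one unit.
        SketchGroup group = new SketchGroup(Double.valueOf(1.0), new double[] {2.0, 3.0});
        SketchExtractor extractor = new SketchExtractor(1); // developer picks member 1
        return extractor.extract(group);
    }

    public static void main(String[] args) {
        double[] member = (double[]) run();
        System.out.println(member.length);
    }
}

/** Bundles heterogeneous members so they can be passed along as a single item. */
final class SketchGroup {
    private final Object[] members;

    SketchGroup(Object... members) {
        this.members = members.clone();
    }

    Object get(int index) { return members[index]; }

    int size() { return members.length; }
}

/** Unpacks one member, chosen by index when the extractor is built. */
final class SketchExtractor {
    private final int index;

    SketchExtractor(int index) {
        this.index = index;
    }

    Object extract(SketchGroup group) {
        return group.get(index);
    }
}
```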
