utils#
data_mask#
- ensure_id(mask: List[str])#
- filter_data(data: dict, mask: dict | None)#
- masks_overlap(pub: dict | None, sub: dict | None)#
Calculates whether the pub and sub filters of two models overlap. This function assumes that both filters have been validated using validate_filter.
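The overlap check can be sketched as a walk over the nested mask structure. This is a minimal illustration and not the library's implementation: the name masks_overlap_sketch is hypothetical, and it assumes that None at any level means "no restriction" and therefore overlaps with anything.

```python
from typing import Optional


def masks_overlap_sketch(pub: Optional[dict], sub: Optional[dict]) -> bool:
    """Return True if two already-validated masks share at least one attribute.

    Assumption (not stated in the reference text): None at any level means
    "no restriction", so it overlaps with whatever is on the other side.
    """
    if pub is None or sub is None:
        return True
    # Only datasets present in both masks can contribute to an overlap.
    for dataset in pub.keys() & sub.keys():
        pub_groups, sub_groups = pub[dataset], sub[dataset]
        if pub_groups is None or sub_groups is None:
            return True
        for group in pub_groups.keys() & sub_groups.keys():
            pub_attrs, sub_attrs = pub_groups[group], sub_groups[group]
            if pub_attrs is None or sub_attrs is None:
                return True
            if set(pub_attrs) & set(sub_attrs):
                return True
    return False
```

Because empty containers are ruled out by validation, the sketch does not need a special case for them.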
- validate_mask(data_mask: dict | None)#
Determines whether the dataset filter has the correct shape: a dictionary of dictionaries of lists, e.g. {"some_dataset": {"some_entity_group": ["attribute1", "attribute2"]}}
Also, at every level, the filter must either be filled or be None. It cannot be an empty container. For example, the following are invalid:
{"some_dataset": {}}
{"some_dataset": {"some_entity_group": ["attribute1"], "empty_group": []}}
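The shape rules above can be sketched as a small checker. This is an illustration under stated assumptions: the name is_valid_mask is hypothetical, and returning a bool is a choice made for this sketch (the documented function may report invalid masks differently).

```python
from typing import Optional


def is_valid_mask(data_mask: Optional[dict]) -> bool:
    """Check the dict -> dict -> list shape, allowing None at every level.

    Hypothetical helper: returns False instead of raising, purely for
    illustration.
    """
    if data_mask is None:
        return True
    # An empty container is never allowed; use None instead.
    if not isinstance(data_mask, dict) or not data_mask:
        return False
    for groups in data_mask.values():
        if groups is None:
            continue
        if not isinstance(groups, dict) or not groups:
            return False
        for attributes in groups.values():
            if attributes is None:
                continue
            if not isinstance(attributes, list) or not attributes:
                return False
    return True
```

With this sketch, {"some_dataset": None} passes while {"some_dataset": {}} does not.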
lifecycle#
- deprecated(obj=None, alternative: str | None = None)#
- has_deprecations(cls)#
logging#
- captureWarnings(logger)#
If logger is an instance of logging.Logger, redirect all warnings to that logger. If logger is None, ensure that warnings are not redirected to logging but to their original destinations.
path#
- DatasetPath(*args, **kwargs)#
DatasetPath is a subclass of pathlib.Path that points to a Movici format dataset file. It has one additional method, read_dict, that returns a dictionary of the dataset.
- Parameters:
path – The location of the dataset file
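The general idea of a pathlib.Path subclass with an extra read_dict method can be sketched as follows. The class name JsonDatasetPath is hypothetical, and the sketch assumes the dataset file is plain JSON; the real class may support other encodings.

```python
import json
from pathlib import Path


# Subclassing the concrete type returned by Path() works across Python
# versions, including those where Path itself cannot be subclassed directly.
class JsonDatasetPath(type(Path())):
    """Hypothetical stand-in: a Path that can read itself as a dictionary."""

    def read_dict(self) -> dict:
        # Assumption: the dataset file is UTF-8 encoded JSON.
        return json.loads(self.read_text(encoding="utf-8"))
```

All regular Path operations (joining with /, exists(), and so on) remain available on instances of the subclass.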
strategies#
- get_instance(strat: Type[T], **kwargs) → T#
- get_type(strat: Type[T]) → Type[T]#
- reset()#
- set(strat)#
time#
- string_to_datetime(datetime_str: str, max_year=5000, **kwargs) → datetime#
Convert a string into a datetime. datetime_str can be one of the following:
A year (e.g. '2025')
A unix timestamp in seconds (e.g. '1626684322')
A dateutil parsable string
- Parameters:
max_year – int. The cutoff that decides whether a datetime_str representing a single integer is interpreted as a year or as a unix timestamp
kwargs – Additional parameters passed directly to dateutil.parser to customize parsing, for example dayfirst=True.
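The three input forms and the max_year cutoff can be sketched like this. It is an illustration only: the name string_to_datetime_sketch is hypothetical, and the sketch assumes that the cutoff is inclusive, that a bare year maps to 1 January of that year, and that timestamps are interpreted as UTC. None of these details are confirmed by the text above.

```python
from datetime import datetime, timezone


def string_to_datetime_sketch(datetime_str: str, max_year: int = 5000, **kwargs) -> datetime:
    """Interpret a string as a year, a unix timestamp, or a free-form date."""
    try:
        value = int(datetime_str)
    except ValueError:
        # Not a single integer: defer to dateutil (third-party), imported
        # lazily so the integer branches work without it installed.
        from dateutil import parser

        return parser.parse(datetime_str, **kwargs)
    if value <= max_year:
        # Assumption: small integers are years, mapped to 1 January.
        return datetime(year=value, month=1, day=1)
    # Assumption: larger integers are unix timestamps in seconds, read as UTC.
    return datetime.fromtimestamp(value, tz=timezone.utc)
```

Lowering max_year shifts the boundary: with max_year=2500, the string '3000' is treated as a timestamp 3000 seconds after the epoch rather than as the year 3000.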
unicode#
- determine_new_unicode_dtype(a: ndarray, b: ndarray | str, max_size=256) → dtype | None#
Determine the new unicode dtype for array a if it needs to be updated with data coming from b.
Returns: a new np.dtype if required or None if the dtype can remain the same. A new dtype is the first power of 2 that fits the dtype of b
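The "first power of 2 that fits" rule can be sketched with NumPy as below. The function names are hypothetical, and raising ValueError when the required size exceeds the maximum is an assumption of this sketch, not documented behaviour.

```python
import numpy as np


def next_power_of_two_sketch(val: int, max_val: int = 256) -> int:
    """Smallest power of two that is >= val, capped at max_val."""
    power = 1
    while power < val:
        power *= 2
    if power > max_val:
        # Assumption: exceeding the cap is treated as an error.
        raise ValueError(f"required size {power} exceeds maximum {max_val}")
    return power


def new_unicode_dtype_sketch(a: np.ndarray, b, max_size: int = 256):
    """Return a wider unicode dtype for `a` if data from `b` would not fit."""
    # NumPy stores unicode as 4 bytes per character.
    current = a.dtype.itemsize // 4
    needed = len(b) if isinstance(b, str) else b.dtype.itemsize // 4
    if needed <= current:
        return None
    return np.dtype(f"U{next_power_of_two_sketch(needed, max_size)}")
```

For example, an array of dtype U2 that has to hold a five-character string would be widened to U8 under this rule.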
- equal_str_dtypes(a: ndarray, b: ndarray)#
- get_unicode_dtype(size, max_size=256)#
- largest_unicode_dtype(a: ndarray, b: ndarray | str, max_size=256)#
Determines whether the dtype of unicode arrays a and/or b must be upcast to the larger dtype of the two so that both can be used in numba jit-compiled functions. Numba requires unicode arrays to have the same itemsize for certain operations, such as comparisons.
- Returns:
The largest dtype of the two, or None if no upcasting has to be done (or when the arrays involved are not unicode or bytes)
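A simplified version of this check, for two arrays only, might look like the following. The name largest_unicode_dtype_sketch is hypothetical, and the sketch leaves out the str input and the max_size parameter of the documented signature.

```python
import numpy as np


def largest_unicode_dtype_sketch(a: np.ndarray, b: np.ndarray):
    """Return the wider of two string dtypes, or None if no upcast is needed."""
    # Only unicode ("U") or bytes ("S") arrays of the same kind are considered.
    if a.dtype.kind not in "US" or a.dtype.kind != b.dtype.kind:
        return None
    if a.dtype.itemsize == b.dtype.itemsize:
        return None
    return max(a.dtype, b.dtype, key=lambda dt: dt.itemsize)
```

A caller would then bring both arrays to the same itemsize with a.astype(dtype) and b.astype(dtype) before passing them on.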
- next_power_of_two(val, max_val=256)#
Module contents#
- DatasetPath(*args, **kwargs)#
DatasetPath is a subclass of pathlib.Path that points to a Movici format dataset file. It has one additional method, read_dict, that returns a dictionary of the dataset.
- Parameters:
path – The location of the dataset file
- determine_new_unicode_dtype(a: ndarray, b: ndarray | str, max_size=256) → dtype | None#
Determine the new unicode dtype for array a if it needs to be updated with data coming from b.
Returns: a new np.dtype if required or None if the dtype can remain the same. A new dtype is the first power of 2 that fits the dtype of b
- filter_data(data: dict, mask: dict | None)#
- largest_unicode_dtype(a: ndarray, b: ndarray | str, max_size=256)#
Determines whether the dtype of unicode arrays a and/or b must be upcast to the larger dtype of the two so that both can be used in numba jit-compiled functions. Numba requires unicode arrays to have the same itemsize for certain operations, such as comparisons.
- Returns:
The largest dtype of the two, or None if no upcasting has to be done (or when the arrays involved are not unicode or bytes)
- masks_overlap(pub: dict | None, sub: dict | None)#
Calculates whether the pub and sub filters of two models overlap. This function assumes that both filters have been validated using validate_filter.
- string_to_datetime(datetime_str: str, max_year=5000, **kwargs) → datetime#
Convert a string into a datetime. datetime_str can be one of the following:
A year (e.g. '2025')
A unix timestamp in seconds (e.g. '1626684322')
A dateutil parsable string
- Parameters:
max_year – int. The cutoff that decides whether a datetime_str representing a single integer is interpreted as a year or as a unix timestamp
kwargs – Additional parameters passed directly to dateutil.parser to customize parsing, for example dayfirst=True.
- validate_mask(data_mask: dict | None)#
Determines whether the dataset filter has the correct shape: a dictionary of dictionaries of lists, e.g. {"some_dataset": {"some_entity_group": ["attribute1", "attribute2"]}}
Also, at every level, the filter must either be filled or be None. It cannot be an empty container. For example, the following are invalid:
{"some_dataset": {}}
{"some_dataset": {"some_entity_group": ["attribute1"], "empty_group": []}}