Table
heavylight.Table
Table provides high-performance table lookup with support for multiple keys.
__init__(df: pd.DataFrame, rectify: Union[bool, None] = False, safe: Union[bool, None] = True)
Initialise a table from a dataframe.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | DataFrame | the pandas dataframe used to initialise the table | required |
rectify | Union[bool, None] | force table to be rectangular (default False) | False |
safe | Union[bool, None] | validates that integers are between bounds (default True) | True |
Tables should be in long format:
- the final column contains the values to look up
- all other columns contain the keys to look up
- tables should be contiguous, i.e. no gaps in integer keys.
- tables should be complete when viewed as rectangular arrays (i.e. all combinations of keys are present). If not, fill any gaps with np.nan or a suitable value.
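The contiguity and completeness requirements can be checked before building a table. The following is a minimal sketch in plain Python, not part of heavylight: the row data and the helper names `missing_combinations` and `is_contiguous` are hypothetical, and rows are held as tuples rather than in a dataframe to keep the example self-contained.

```python
from itertools import product

# Hypothetical long-format rows: (age, sex, value); the final field is the value.
rows = [
    (20, "M", 0.0010),
    (20, "F", 0.0008),
    (21, "M", 0.0011),
    (21, "F", 0.0009),
    (22, "M", 0.0012),
    # (22, "F") is deliberately missing
]


def missing_combinations(rows):
    """Return the key combinations that are absent from the rows."""
    keys = [row[:-1] for row in rows]
    # Distinct values seen in each key column, in sorted order.
    levels = [sorted(set(col)) for col in zip(*keys)]
    present = set(keys)
    return [combo for combo in product(*levels) if combo not in present]


def is_contiguous(int_keys):
    """True if the distinct integer keys form a run with no gaps."""
    distinct = sorted(set(int_keys))
    return distinct == list(range(distinct[0], distinct[-1] + 1))


gaps = missing_combinations(rows)
ages_ok = is_contiguous(row[0] for row in rows)
```

Here `gaps` lists the combinations that would need a fill value such as np.nan, and `ages_ok` reports whether the integer key has any holes.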
The type of key is determined by the suffix on the dataframe df column names:
- |int: integers (...0, 1, 2, 3...), can start and end anywhere, but must be consecutive.
- |int_bound: as |int, but lookup values are constrained to the lowest and highest keys.
- |int_cat: as |str, categorical integers.
- |str: keys are interpreted as strings, e.g. 'M' and 'F'.
- |band: key is numeric and treated as the upper bound on a lookup.
- |float: not currently available due to floating point equality; use int or band depending on use case.
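To make the suffix semantics concrete, here is a small sketch in plain Python. It is an illustration of the behaviour described above, not heavylight's implementation: the helper names, column names, and sample values are all hypothetical. In particular, the source does not say whether a lookup equal to a band's upper bound falls in that band or the next, so the sketch exposes that choice as an `inclusive` flag.

```python
from bisect import bisect_left, bisect_right


def split_key_column(name):
    """Split a column name such as 'age|int' into ('age', 'int')."""
    label, _, key_type = name.rpartition("|")
    return label, key_type


def clamp_int_key(key, lowest, highest):
    """Constrain an integer key to [lowest, highest], as described for |int_bound."""
    return max(lowest, min(key, highest))


def band_position(key, upper_bounds, inclusive=True):
    """Index of the first band whose upper bound covers the key.

    upper_bounds must be sorted ascending. Treating a key equal to a bound
    as inside that band (inclusive=True) is an assumption of this sketch.
    """
    search = bisect_left if inclusive else bisect_right
    pos = search(upper_bounds, key)
    if pos == len(upper_bounds):
        raise KeyError(f"{key} is above the highest band")
    return pos


upper_bounds = [20, 40, 60]   # hypothetical 'age_to|band' keys
rates = [0.01, 0.02, 0.05]    # hypothetical values, one per band

parsed = split_key_column("age_to|band")
rate_35 = rates[band_position(35, upper_bounds)]
rate_40_incl = rates[band_position(40, upper_bounds)]
rate_40_excl = rates[band_position(40, upper_bounds, inclusive=False)]
clamped = [clamp_int_key(k, 18, 65) for k in (10, 30, 99)]
```

A lookup of 35 lands in the band whose upper bound is 40, while a lookup of exactly 40 depends on the inclusivity convention, which is worth confirming against the library before relying on it.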
rectify(df: pd.DataFrame, fill=np.nan) -> pd.DataFrame
staticmethod
Convert a triangular (incomplete) dataframe into a valid rectangular dataframe. Any missing points are filled with fill (default: np.nan).
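The idea behind rectification can be sketched without pandas. The function below is a hypothetical stand-in named `rectify_rows`, operating on tuples rather than a dataframe, and is not the library's code: it enumerates every combination of the key values seen and fills any combination that has no value.

```python
from itertools import product
import math


def rectify_rows(rows, fill=math.nan):
    """Return rows covering every key combination, filling absent values."""
    # Map each key tuple to its value (the final field of the row).
    lookup = {row[:-1]: row[-1] for row in rows}
    # Distinct values seen in each key position, in sorted order.
    levels = [sorted(set(col)) for col in zip(*lookup)]
    return [combo + (lookup.get(combo, fill),) for combo in product(*levels)]


# Hypothetical triangular table: (year, duration, value) with duration <= year.
triangular = [
    (1, 1, 0.9),
    (2, 1, 0.8),
    (2, 2, 0.7),
]

rectangular = rectify_rows(triangular)
zero_filled = rectify_rows(triangular, fill=0.0)
```

The three input rows become four, with the missing (1, 2) combination carrying the fill value.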
read_excel(spreadsheet_path, sheet_name)
classmethod
Read in a table from an Excel sheet. For more control, pass in the dataframe using __init__.
read_csv(csv_path)
classmethod
Read in a table from a CSV file. For more control, pass in the dataframe using __init__.