Core
pyqvd.qvd
This module contains the core classes and functions for dealing with QVD files. The main class is the QvdTable class, which represents the internal data table of a QVD file.
QvdValue
- class pyqvd.QvdValue
Base class for all QVD data types. All values in a QVD file must inherit from this class.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__()
- abstract property calculation_value: object
Returns the calculation value of this QVD value. This value is used for calculations and sorting operations.
- Returns:
The calculation value.
- abstract property display_value: str
Returns the representational value of this QVD value. This value is used for display purposes.
- Returns:
The display value.
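The note on immutability and stable hashing can be illustrated with a small standard-library sketch. SketchValue is a hypothetical stand-in built on a frozen dataclass; it is not a pyqvd class, and the library's own value types may be implemented differently.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class SketchValue:
    """Hypothetical immutable value; not part of pyqvd."""

    calculation_value: int
    display_value: str


a = SketchValue(1, "1")
b = SketchValue(1, "1")

# Equal immutable values hash identically, so they collapse in a set
# and can be used interchangeably as dictionary keys.
unique = {a, b}
lookup = {a: "first"}

try:
    a.calculation_value = 2  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

If the attribute could be changed after the value was placed in a set or dictionary, its hash would no longer match the bucket it was stored in, which is the situation the immutability rule avoids.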
IntegerValue
- class pyqvd.IntegerValue(value: int)
Represents an integer value in a QVD file.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__(value: int)
Constructs a new integer value.
- Parameters:
value – The integer value.
- property calculation_value: int
Returns the calculation value of this QVD value. This value is used for calculations and sorting operations.
- Returns:
The calculation value.
- property display_value: str
Returns the representational value of this QVD value. This value is used for display purposes.
- Returns:
The display value.
DoubleValue
- class pyqvd.DoubleValue(value: float)
Represents a double value in a QVD file.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__(value: float)
Constructs a new double value.
- Parameters:
value – The double value.
- property calculation_value: float
Returns the calculation value of this QVD value. This value is used for calculations and sorting operations.
- Returns:
The calculation value.
- property display_value: str
Returns the representational value of this QVD value. This value is used for display purposes.
- Returns:
The display value.
StringValue
- class pyqvd.StringValue(value: str)
Represents a string value in a QVD file.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__(value: str)
Constructs a new string value.
- Parameters:
value – The string value.
- property calculation_value: str
Returns the calculation value of this QVD value. This value is used for calculations and sorting operations.
- Returns:
The calculation value.
- property display_value: str
Returns the representational value of this QVD value. This value is used for display purposes.
- Returns:
The display value.
DualIntegerValue
- class pyqvd.DualIntegerValue(int_value: int, string_value: str)
Represents a dual value with an integer value and a string value in a QVD file.
Dual values are used to store both a display value and a calculation value in a single field. This is useful when the display representation of a value is different from the calculation representation. For example, you may want to display a date as “MM/DD/YYYY” but store it as an integer value representing the number of days since a certain date.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__(int_value: int, string_value: str)
Constructs a new dual integer value.
- Parameters:
int_value – The integer value.
string_value – The string value.
- property calculation_value: int
Returns the calculation value of this QVD value. This value is used for calculations and sorting operations.
- Returns:
The calculation value.
- property display_value: str
Returns the representational value of this QVD value. This value is used for display purposes.
- Returns:
The display value.
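The difference between the two parts of a dual value shows up when sorting. The sketch below uses a hypothetical DualSketch dataclass (not a pyqvd class) to compare ordering by calculation value with ordering by display string for dates written as "MM/DD/YYYY".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DualSketch:
    """Hypothetical pair of calculation and display value; not part of pyqvd."""

    calculation_value: int
    display_value: str


values = [
    DualSketch(45293, "01/02/2024"),
    DualSketch(45000, "03/15/2023"),
    DualSketch(45292, "01/01/2024"),
]

# Sorting by the calculation value gives chronological order.
by_calculation = [
    v.display_value for v in sorted(values, key=lambda v: v.calculation_value)
]

# Sorting the display strings compares text, so the month dominates.
by_display = sorted(v.display_value for v in values)
```

Here by_calculation starts with the 2023 date, while by_display places it last because "03" sorts after "01" as text.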
DualDoubleValue
- class pyqvd.DualDoubleValue(double_value: float, string_value: str)
Represents a dual value with a double value and a string value in a QVD file.
Dual values are used to store both a display value and a calculation value in a single field. This is useful when the display representation of a value is different from the calculation representation. For example, you may want to display a monetary value as “$1,000.00” but store it as a double value representing the number of cents.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__(double_value: float, string_value: str)
Constructs a new dual double value.
- Parameters:
double_value – The double value.
string_value – The string value.
- property calculation_value: float
Returns the calculation value of this QVD value. This value is used for calculations and sorting operations.
- Returns:
The calculation value.
- property display_value: str
Returns the representational value of this QVD value. This value is used for display purposes.
- Returns:
The display value.
TimeValue
- class pyqvd.TimeValue(double_value: float, string_value: str)
Represents a time value in a QVD file.
Times are stored as dual double values where the double value represents the fraction of a day and the string value represents the time in a human-readable format. This data type does not exist in QVD files and is provided for convenience. In QVD files, times are stored as dual double values with a number format of “TIME” if the column is a uniform time column.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__(double_value: float, string_value: str)
Constructs a new time value.
- Parameters:
double_value – The double value.
string_value – The string value.
- static from_serial_number(serial_number: float) TimeValue
Creates a new time value from a serial number.
- Parameters:
serial_number – The serial number representing the time.
- Returns:
The time value.
- static from_time(time: time) TimeValue
Creates a new time value from a time.
- Parameters:
time – The time value.
- Returns:
The time value.
- property time: time
Returns the time value.
- Returns:
The time value.
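The "fraction of a day" representation can be sketched with the standard library alone. The helper names fraction_to_time and time_to_fraction are hypothetical, and rounding to whole seconds is an assumption made for this example; pyqvd's own conversion may differ.

```python
from datetime import time

SECONDS_PER_DAY = 24 * 60 * 60


def fraction_to_time(fraction: float) -> time:
    """Convert a fraction of a day to a time of day (whole seconds)."""
    total_seconds = round(fraction * SECONDS_PER_DAY) % SECONDS_PER_DAY
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return time(hours, minutes, seconds)


def time_to_fraction(value: time) -> float:
    """Convert a time of day to a fraction of a day (ignoring microseconds)."""
    seconds = value.hour * 3600 + value.minute * 60 + value.second
    return seconds / SECONDS_PER_DAY


noon = fraction_to_time(0.5)
evening = fraction_to_time(0.75)
quarter = time_to_fraction(time(6, 0))
```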
DateValue
- class pyqvd.DateValue(int_value: int, string_value: str)
Represents a date value in a QVD file.
Dates are stored as dual integer values where the integer value represents the number of days since the base date (December 30, 1899) and the string value represents the date in a human-readable format. This data type does not exist in QVD files and is provided for convenience. In QVD files, dates are stored as dual integer values with a number format of “DATE” if the column is a uniform date column.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__(int_value: int, string_value: str)
Constructs a new date value.
- Parameters:
int_value – The integer value.
string_value – The string value.
- property date: date
Returns the date value.
- Returns:
The date value.
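The day count relative to the base date (December 30, 1899) can be reproduced with datetime arithmetic. The helpers serial_to_date and date_to_serial are hypothetical names for this sketch and are not part of pyqvd.

```python
from datetime import date, timedelta

BASE_DATE = date(1899, 12, 30)


def serial_to_date(days: int) -> date:
    """Convert a day count since the base date to a calendar date."""
    return BASE_DATE + timedelta(days=days)


def date_to_serial(value: date) -> int:
    """Convert a calendar date to a day count since the base date."""
    return (value - BASE_DATE).days


new_year = serial_to_date(45292)
first_of_1900 = date_to_serial(date(1900, 1, 1))
```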
TimestampValue
- class pyqvd.TimestampValue(double_value: float, string_value: str)
Represents a timestamp value in a QVD file.
Timestamps are stored as dual double values where the integer part of the double value represents the number of days since the base date (December 30, 1899), the fractional part represents the time of day as a fraction of a day, and the string value represents the timestamp in a human-readable format. This data type does not exist in QVD files and is provided for convenience. In QVD files, timestamps are stored as dual double values with a number format of “DATETIME” if the column is a uniform timestamp column.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__(double_value: float, string_value: str)
Constructs a new timestamp value.
- Parameters:
double_value – The double value.
string_value – The string value.
- static from_serial_number(serial_number: float) TimestampValue
Creates a new timestamp value from a serial number.
- Parameters:
serial_number – The serial number representing the timestamp.
- Returns:
The timestamp value.
- static from_timestamp(timestamp: datetime) TimestampValue
Creates a new timestamp value from a timestamp.
- Parameters:
timestamp – The timestamp or time value.
- Returns:
The timestamp value.
- property timestamp: datetime
Returns the timestamp value.
- Returns:
The timestamp value.
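A serial number for a timestamp can be turned into a datetime with the standard library. This sketch assumes the serial number counts days since December 30, 1899 (the base date given for DateValue) with the time of day as the fractional part; serial_to_datetime and datetime_to_serial are hypothetical helper names, not pyqvd functions.

```python
from datetime import datetime, timedelta

BASE_TIMESTAMP = datetime(1899, 12, 30)
ONE_DAY = timedelta(days=1)


def serial_to_datetime(serial: float) -> datetime:
    """Convert days since the base date (with fractional part) to a datetime."""
    return BASE_TIMESTAMP + timedelta(days=serial)


def datetime_to_serial(value: datetime) -> float:
    """Convert a naive datetime to days since the base date."""
    return (value - BASE_TIMESTAMP) / ONE_DAY


midday = serial_to_datetime(45292.5)
serial = datetime_to_serial(datetime(2024, 1, 1, 12, 0))
```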
IntervalValue
- class pyqvd.IntervalValue(double_value: float, string_value: str)
Represents an interval value in a QVD file.
Intervals are stored as dual double values where the double value represents the fraction of a day and the string value represents the interval in a human-readable format. This data type does not exist in QVD files and is provided for convenience. In QVD files, intervals are stored as dual double values with a number format of “INTERVAL” if the column is a uniform interval column.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__(double_value: float, string_value: str)
Constructs a new interval value.
- Parameters:
double_value – The double value.
string_value – The string value.
- static from_interval(interval: timedelta) IntervalValue
Creates a new interval value from an interval.
- Parameters:
interval – The interval value.
- Returns:
The interval value.
- static from_serial_number(serial_number: float) IntervalValue
Creates a new interval value from a serial number.
- Parameters:
serial_number – The serial number representing the interval.
- Returns:
The interval value.
- property interval: timedelta
Returns the interval value.
- Returns:
The interval value.
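An interval expressed in days maps directly onto datetime.timedelta. The helper names interval_to_days and days_to_interval below are hypothetical and only illustrate the arithmetic; they are not pyqvd functions.

```python
from datetime import timedelta

ONE_DAY = timedelta(days=1)


def interval_to_days(interval: timedelta) -> float:
    """Express an interval as a (possibly fractional) number of days."""
    return interval / ONE_DAY


def days_to_interval(days: float) -> timedelta:
    """Build an interval from a (possibly fractional) number of days."""
    return timedelta(days=days)


day_and_a_half = interval_to_days(timedelta(hours=36))
six_hours = days_to_interval(0.25)
```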
MoneyValue
- class pyqvd.MoneyValue(double_value: float, string_value: str)
Represents a money value in a QVD file.
Money values are stored as dual double values where the double value represents the monetary value and the string value represents the money in a human-readable format. This data type does not exist in QVD files and is provided for convenience. In QVD files, money values are stored as dual double values with a number format of “MONEY” if the column is a uniform money column.
Important
It is important to note that Python does not have a built-in money data type. This class is provided as a convenience for working with money values in QVD files. It is recommended to use the decimal.Decimal class for monetary calculations in Python. Because it is not possible to distinguish between a decimal.Decimal value that represents money and a decimal.Decimal value that represents a non-monetary value, all decimal.Decimal values are considered to be monetary values and are therefore converted to MoneyValue objects when importing data, for example from a dictionary or a pandas DataFrame.
Important
Instances of this class are immutable. After initialization, attributes should not be modified. This guarantees stable hashing and allows instances to be safely used as dictionary keys or set members.
- __init__(double_value: float, string_value: str)
Constructs a new money value.
- Parameters:
double_value – The double value.
string_value – The string value.
- static from_money(money: Decimal) MoneyValue
Creates a new money value from a decimal.Decimal value.
- Parameters:
money – The money value.
- Returns:
The money value.
- static from_serial_number(serial_number: float) MoneyValue
Creates a new money value from a serial number.
- Parameters:
serial_number – The serial number representing the money.
- Returns:
The money value.
- property money: Decimal
Returns the money value.
- Returns:
The money value.
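The recommendation to use decimal.Decimal for monetary calculations can be seen in a short standard-library comparison. This sketch does not use pyqvd; it only contrasts binary floating point with decimal arithmetic.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum differs slightly from 0.3.
float_sum = 0.1 + 0.2
float_is_exact = float_sum == 0.3

# Decimal keeps the amounts exactly as written, including trailing zeros.
decimal_sum = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = decimal_sum == Decimal("0.30")

# Converting back to a double, as a dual double value would store it.
as_double = float(decimal_sum)
```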
QvdTable
- class pyqvd.QvdTable(data: List[List[QvdValue]], columns: List[str])
Core class for representing a QVD data table.
- append(row: List[any]) None
Appends a new row to the data table.
- Parameters:
row – The row to append.
- at(row: int, column: str) QvdValue
Returns the value at the specified row and column, where row refers to the current nth record.
- Parameters:
row – The row index.
column – The column name.
- Returns:
The value at the specified row and column.
- property columns: List[str]
Returns the columns of the data table. This property is read-only and immutable.
- Returns:
The column names.
- concat(*args: QvdTable, inplace: bool = False) QvdTable
Concatenates multiple data tables into a single data table. The data tables are concatenated row-wise. If a column is missing in a data table, the values for its rows are filled with None and the column is added to the concatenated data table.
Important
Internally, this method uses the copy.deepcopy function to create a deep copy of the current data table and the data tables to concatenate. This can be very slow for large data tables with many concatenations. For better performance, consider using the inplace parameter to modify the current data table instead of returning a new data table. This will avoid the overhead of creating deep copies of the current data table.
- Parameters:
tables – The data tables to concatenate.
inplace – Instead of returning a new data table, modify the current data table. This may be faster for large data tables with many concatenations.
- Returns:
The concatenated data table.
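The row-wise concatenation with None filling described above can be sketched on plain lists. concat_tables is a hypothetical helper operating on (columns, rows) tuples; it only mirrors the documented behavior and is not how pyqvd implements concat.

```python
from typing import Any, List, Tuple

Table = Tuple[List[str], List[List[Any]]]  # (column names, rows)


def concat_tables(*tables: Table) -> Table:
    """Concatenate tables row-wise, filling missing columns with None."""
    # Collect the union of column names in order of first appearance.
    columns: List[str] = []
    for table_columns, _ in tables:
        for name in table_columns:
            if name not in columns:
                columns.append(name)

    rows: List[List[Any]] = []
    for table_columns, table_rows in tables:
        position = {name: i for i, name in enumerate(table_columns)}
        for row in table_rows:
            rows.append(
                [row[position[name]] if name in position else None for name in columns]
            )
    return columns, rows


left = (["A", "B"], [[1, 2]])
right = (["A", "C"], [[3, 4]])
combined = concat_tables(left, right)
```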
- copy(deep: bool = True) QvdTable
Returns a copy of the data table.
- Parameters:
deep – Whether to perform a deep copy.
- Returns:
The copy of the data table.
- property data: List[List[QvdValue]]
Returns the internally stored data. This property is read-only and immutable.
- Returns:
The data.
- drop(key: int | str | List[int] | List[str], axis: Literal['rows', 'columns'] = 'rows', inplace: bool = False) QvdTable
Drops the specified rows or columns from the data table.
- Parameters:
key – The key to drop.
axis – The axis to drop along. Must be either ‘rows’ or ‘columns’.
inplace – Instead of returning a new data table, modify the current data table.
- Returns:
The data table with the specified rows or columns dropped.
Examples
You can drop a single row by passing an integer:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.drop(1)
>>> tbl
A    B    C
---  ---  ---
1    2    3
7    8    9
You can drop multiple rows by passing a list of integers:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.drop([0, 2])
>>> tbl
A    B    C
---  ---  ---
4    5    6
You can drop a single column by passing a string:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.drop("B", axis="columns")
>>> tbl
A    C
---  ---
1    3
4    6
7    9
You can drop multiple columns by passing a list of strings:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.drop(["A", "C"], axis="columns")
>>> tbl
B
---
2
5
8
- property empty: bool
Returns whether the data table is empty.
- Returns:
True if the data table is empty; otherwise, False.
- filter_by(column: str, condition: Callable[[QvdValue], bool], inplace: bool = False) QvdTable
Filters the data table by the specified column and condition. By default a new data table is constructed with the filtered data.
- Parameters:
column – The column to filter by.
condition – The condition to filter by.
inplace – Instead of returning a new data table, modify the current data table.
- Returns:
The filtered data table.
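The condition parameter is a callable that receives each value of the chosen column and returns a boolean. The following sketch shows that pattern on plain lists; filter_rows is a hypothetical helper, not the pyqvd method, and it works on raw Python values rather than QvdValue objects.

```python
from typing import Any, Callable, List


def filter_rows(
    columns: List[str],
    rows: List[List[Any]],
    column: str,
    condition: Callable[[Any], bool],
) -> List[List[Any]]:
    """Keep only the rows whose value in `column` satisfies `condition`."""
    index = columns.index(column)
    return [row for row in rows if condition(row[index])]


columns = ["A", "B", "C"]
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Keep rows where column "A" is greater than 2; the input list is untouched.
filtered = filter_rows(columns, rows, "A", lambda value: value > 2)
```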
- static from_dict(data: Dict[str, any]) QvdTable
Constructs a new QVD data table from a raw value dictionary.
- Parameters:
data – The dictionary representation of the data table.
- Returns:
The QVD data table.
Examples
You can construct a data table from a dictionary:
>>> tbl = QvdTable.from_dict({
...     "columns": ["A", "B", "C"],
...     "data": [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
... })
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
- static from_pandas(df: pd.DataFrame, vectorized=True) QvdTable
Constructs a new QVD data table from a pandas data frame.
Important
This method requires the pandas library to be installed. See pandas for more information.
- Parameters:
df – The pandas data frame.
vectorized – Optional flag to enable vectorized conversion.
- Returns:
The QVD data table.
- static from_qvd(path: str | Path, chunk_size: int = None) QvdTable | Iterator[QvdTable]
Loads a QVD file and returns its data table.
- Parameters:
path – The path to the QVD file (str or Path object).
chunk_size – Optional chunk size, as number of records, to read the QVD file in chunks.
- Returns:
The data table of the QVD file or an iterator over the slices of the data table.
- static from_stream(source: BinaryIO, chunk_size: int = None) QvdTable | Iterator[QvdTable]
Constructs a new QVD data table from a binary stream.
- Parameters:
source – The binary stream to read the QVD file from.
chunk_size – Optional chunk size, as number of records, to read the QVD file in chunks.
- Returns:
The data table of the QVD file or an iterator over the slices of the data table.
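Reading in chunks means consuming an iterator of fixed-size slices instead of one large table. The generator below shows that consumption pattern on an arbitrary iterable of records; iter_chunks is a hypothetical helper for illustration and does not read QVD files.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_chunks(records: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `chunk_size` records."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Five records read two at a time; the last chunk is shorter.
chunks = list(iter_chunks(range(5), 2))
```

Only one chunk needs to be held in memory at a time, which is the usual reason to pass a chunk size for large files.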
- get(key: str | int | slice | Tuple[int, str]) QvdValue | List[QvdValue] | List[List[QvdValue]]
Returns the values for the specified key. As a shorthand, you can also use the indexing operator to get values.
- Parameters:
key – The key to retrieve.
- Returns:
The values for the specified key.
Examples
You can pass a single integer to get a row at the specified index:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.get(0)  # Alias tbl[0]
[1, 2, 3]
You can pass a single string to get a column with the specified name:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.get("A")  # Alias tbl["A"]
[1, 4, 7]
You can pass a tuple with an integer and a string to get a value at the specified row and column:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.get((0, "A"))  # Alias tbl[0, "A"]
1
You can pass a slice to get a subset of the data table:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.get(slice(0, 2))  # Alias tbl[0:2]
[[1, 2, 3], [4, 5, 6]]
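The four key forms shown in the examples above amount to a dispatch on the key's type. The function below is a simplified sketch of that idea on plain lists; get_by_key is a hypothetical name and this is not pyqvd's implementation.

```python
from typing import Any, List, Tuple, Union

Key = Union[int, str, slice, Tuple[int, str]]


def get_by_key(columns: List[str], rows: List[List[Any]], key: Key) -> Any:
    """Return a row, a column, a cell or a row slice depending on the key type."""
    if isinstance(key, int):
        return rows[key]  # single row
    if isinstance(key, str):
        index = columns.index(key)
        return [row[index] for row in rows]  # single column
    if isinstance(key, slice):
        return rows[key]  # subset of rows
    if isinstance(key, tuple):
        row_index, column = key
        return rows[row_index][columns.index(column)]  # single cell
    raise TypeError(f"Unsupported key type: {type(key).__name__}")


columns = ["A", "B", "C"]
rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```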
- head(n: int = 5) QvdTable
Returns the first n rows of the data table.
- Parameters:
n – The number of rows to return.
- Returns:
The first n rows.
- insert(index: int, row: List[QvdValue]) None
Inserts a new row at the specified index.
- Parameters:
index – The index to insert the row.
row – The row to insert.
- join(other: QvdTable, on: str | List[str], how: Literal['inner', 'left', 'right', 'outer'] = 'outer', lsuffix: str | None = None, rsuffix: str | None = None, inplace: bool = False) QvdTable
Joins the data table with another data table. By default a new data table is constructed with the joined data.
- Parameters:
other – The other data table to join with.
on – The column(s) to join on.
how – The type of join to perform.
lsuffix – The suffix to append to overlapping column names from the left table.
rsuffix – The suffix to append to overlapping column names from the right table.
inplace – Instead of returning a new data table, modify the current data table.
- Returns:
The joined data table.
Examples
You can perform an inner join between two data tables:
>>> tbl1
A    B
---  ---
1    2
3    4
5    6
>>> tbl2
A    C
---  ---
1    7
3    8
7    9
>>> tbl1.join(tbl2, on="A", how="inner")
A    B    C
---  ---  ---
1    2    7
3    4    8
You can also use suffixes for overlapping column names:
>>> tbl1
A    B
---  ---
1    2
3    4
5    6
>>> tbl2
A    B
---  ---
1    7
3    8
7    9
>>> tbl1.join(tbl2, on="A", how="inner", lsuffix="_left", rsuffix="_right")
A    B_left  B_right
---  ---     ---
1    2       7
3    4       8
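The inner join from the first example above can be reproduced on plain lists with a lookup dictionary keyed on the join column. inner_join is a hypothetical helper covering a single join column only; it is a sketch of the technique, not pyqvd's join implementation, and it ignores suffix handling.

```python
from typing import Any, Dict, List, Tuple


def inner_join(
    left_columns: List[str],
    left_rows: List[List[Any]],
    right_columns: List[str],
    right_rows: List[List[Any]],
    on: str,
) -> Tuple[List[str], List[List[Any]]]:
    """Hash join: keep only rows whose key appears in both tables."""
    left_key = left_columns.index(on)
    right_key = right_columns.index(on)
    # Positions of the right-hand columns other than the join column.
    extra = [i for i in range(len(right_columns)) if i != right_key]

    # Index the right table by key value.
    by_key: Dict[Any, List[List[Any]]] = {}
    for row in right_rows:
        by_key.setdefault(row[right_key], []).append(row)

    columns = left_columns + [right_columns[i] for i in extra]
    rows = [
        left_row + [right_row[i] for i in extra]
        for left_row in left_rows
        for right_row in by_key.get(left_row[left_key], [])
    ]
    return columns, rows


joined = inner_join(
    ["A", "B"], [[1, 2], [3, 4], [5, 6]],
    ["A", "C"], [[1, 7], [3, 8], [7, 9]],
    on="A",
)
```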
- rename(columns: Dict[str, str]) QvdTable
Renames the columns of the data table.
- Parameters:
columns – A dictionary mapping the old column names to the new column names.
- Returns:
The data table with the renamed columns.
- rows(*args: int) QvdTable
Returns the specified rows of the data table.
- Parameters:
args – The row indices.
- Returns:
The specified rows.
- select(*columns: str) QvdTable
Returns a new data table with only the specified columns.
- Parameters:
columns – The column names.
- Returns:
The new data table.
- set(key: str | int | slice | Tuple[int, str], value: any | List[any] | List[List[any]]) None
Sets the value for the specified key. As a shorthand, you can also use the indexing operator to set values.
It is possible to set single values, add columns, and overwrite rows or columns with a list of values. Values can also be native Python types, which are automatically converted to QvdValue objects.
- Parameters:
key – The key to set.
value – The value to set.
Examples
You can pass a single integer to overwrite a row at the specified index:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.set(0, [10, 11, 12])  # Alias tbl[0] = [10, 11, 12]
>>> tbl
A    B    C
---  ---  ---
10   11   12
4    5    6
7    8    9
You can pass a single string to overwrite a column with the specified name:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.set("A", [13, 14, 15])  # Alias tbl["A"] = [13, 14, 15]
>>> tbl
A    B    C
---  ---  ---
13   2    3
14   5    6
15   8    9
If you pass a column name that does not exist, a new column is added:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.set("D", [16, 17, 18])  # Alias tbl["D"] = [16, 17, 18]
>>> tbl
A    B    C    D
---  ---  ---  ---
1    2    3    16
4    5    6    17
7    8    9    18
You can pass a tuple with an integer and a string to overwrite a value at the specified row and column:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.set((0, "A"), 16)  # Alias tbl[0, "A"] = 16
>>> tbl
A    B    C
---  ---  ---
16   2    3
4    5    6
7    8    9
You can pass a slice to overwrite a subset of the data table:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.set(slice(0, 2), 17)  # Alias tbl[0:2] = 17
>>> tbl
A    B    C
---  ---  ---
17   17   17
17   17   17
7    8    9
- property shape: Tuple[int, int]
Returns the shape of the data table.
- Returns:
The shape, which is a tuple containing the number of rows and columns.
- property size: int
Returns the number of elements in the data table.
- Returns:
The number of elements in the data table.
- sort_by(column: str, ascending: bool = True, comparator: Callable[[QvdValue, QvdValue], int] | None = None, na_position: Literal['first', 'last'] = 'first', inplace: bool = False) QvdTable
Sorts the data table by the specified column. By default a new data table is constructed with the sorted data.
- Parameters:
column – The column to sort by.
ascending – Whether to sort in ascending order.
comparator – The comparator function to use for sorting.
na_position – Where to place missing values in the sorted data.
inplace – Instead of returning a new data table, modify the current data table.
- Returns:
The sorted data table.
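A comparator of the form Callable[[a, b], int] returns a negative number, zero or a positive number, which is the convention functools.cmp_to_key expects. The sketch below combines such a comparator with separate placement of missing values; sort_values and by_length are hypothetical names, and the handling of None here is an assumption about what na_position means rather than pyqvd's actual code.

```python
from functools import cmp_to_key
from typing import Any, Callable, List


def sort_values(
    values: List[Any],
    comparator: Callable[[Any, Any], int],
    ascending: bool = True,
    na_position: str = "first",
) -> List[Any]:
    """Sort non-missing values with a comparator and place None at one end."""
    missing = [value for value in values if value is None]
    present = sorted(
        (value for value in values if value is not None),
        key=cmp_to_key(comparator),
        reverse=not ascending,
    )
    return missing + present if na_position == "first" else present + missing


def by_length(a: str, b: str) -> int:
    """Negative if a is shorter than b, zero if equal, positive if longer."""
    return len(a) - len(b)


ascending_first = sort_values(["ccc", None, "a", "bb"], by_length)
descending_last = sort_values(
    ["ccc", None, "a", "bb"], by_length, ascending=False, na_position="last"
)
```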
- tail(n: int = 5) QvdTable
Returns the last n rows of the data table.
- Parameters:
n – The number of rows to return.
- Returns:
The last n rows.
- to_dict() Dict[str, any]
Converts the data table to a dictionary.
- Returns:
The dictionary representation of the data table.
Examples
You can convert the data table to a dictionary:
>>> tbl
A    B    C
---  ---  ---
1    2    3
4    5    6
7    8    9
>>> tbl.to_dict()
{'columns': ['A', 'B', 'C'], 'data': [[1, 2, 3], [4, 5, 6], [7, 8, 9]]}
- to_pandas() pd.DataFrame
Converts the data table to a pandas data frame. For value conversion, the calculation value is used.
Important
This method requires the pandas library to be installed. See pandas for more information.
- Returns:
The pandas data frame.
- to_qvd(path: str | Path, options: QvdFileWriterOptions = None)
Persists the data table to a QVD file.
- Parameters:
path – The path to the QVD file (str or Path object).
- to_stream(target: BinaryIO, options: QvdFileWriterOptions = None)
Writes the QVD file to a binary stream.
- Parameters:
target – The binary stream to write to.