BaseSQLSource type: None#
- class lumen.sources.base.BaseSQLSource(*, excluded_tables, load_schema, cache_data, cache_dir, cache_metadata, cache_per_query, cache_schema, cache_with_dask, metadata_func, root, shared, name)#
The BaseSQLSource implements the additional API required by a SQL-based data source.
Parameters#
excluded_tables
type: list
default: []
List of table names that should be excluded from the results. Supports:
- Fully qualified name: 'DATABASE.SCHEMA.TABLE'
- Schema-qualified name: 'SCHEMA.TABLE'
- Table name only: 'TABLE'
- Wildcards: 'SCHEMA.*'
load_schema
type: bool
default: True
Whether to load the schema
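The four pattern forms accepted by excluded_tables can be illustrated with a small standalone sketch. It is not Lumen's implementation: the helper `is_excluded`, the use of `fnmatch`, and the table names are assumptions chosen only to show how each form could match a fully qualified table name.

```python
from fnmatch import fnmatchcase


def is_excluded(qualified_name: str, patterns: list[str]) -> bool:
    """Hypothetical helper: True if any pattern matches the table name."""
    parts = qualified_name.split(".")
    # Compare each pattern against the full name, the schema-qualified
    # name and the bare table name.
    candidates = {".".join(parts[i:]) for i in range(len(parts))}
    return any(fnmatchcase(c, p) for c in candidates for p in patterns)


# One pattern per supported form (all names are made up for illustration).
excluded_tables = [
    "ANALYTICS.RAW.tmp_orders",  # fully qualified name
    "PUBLIC.secrets",            # schema-qualified name
    "audit_log",                 # table name only
    "STAGING.*",                 # wildcard over a schema
]

tables = [
    "ANALYTICS.RAW.tmp_orders",
    "ANALYTICS.PUBLIC.secrets",
    "ANALYTICS.PUBLIC.audit_log",
    "ANALYTICS.STAGING.events",
    "ANALYTICS.PUBLIC.events",
]

visible = [t for t in tables if not is_excluded(t, excluded_tables)]
print(visible)
```

Only the table that matches none of the patterns remains in `visible`. The actual matching rules (for example case sensitivity) depend on the concrete source.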
Methods#
- BaseSQLSource.clear_cache(*events: Event)#
Clears any cached data.
- BaseSQLSource.create_sql_expr_source(tables: dict[str, str], **kwargs)#
Creates a new SQL Source given a set of table names and corresponding SQL expressions.
- BaseSQLSource.execute(sql_query: str, *args, **kwargs) DataFrame #
Executes a SQL query and returns the result as a DataFrame.
- Parameters:
sql_query (str) – The SQL Query to execute
*args (list) – Positional arguments to pass to the SQL query
**kwargs (dict) – Keyword arguments to pass to the SQL query
- Returns:
The result as a pandas DataFrame
- Return type:
pd.DataFrame
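The calling convention of execute can be sketched with a stand-in class. `TinySQLSource` below is hypothetical and not part of Lumen: it is backed by the standard-library `sqlite3` module, it returns a list of dictionaries rather than a DataFrame so the sketch has no third-party dependencies, and it assumes that the extra arguments are bound as query parameters, which is one plausible reading of the description above.

```python
import sqlite3


class TinySQLSource:
    """Hypothetical stand-in that mimics the execute() calling convention."""

    def __init__(self, connection: sqlite3.Connection):
        self._conn = connection

    def execute(self, sql_query: str, *args, **kwargs) -> list[dict]:
        # Assumption: positional arguments fill '?' placeholders and
        # keyword arguments fill ':name' placeholders (one style per query).
        params = kwargs if kwargs else args
        cursor = self._conn.execute(sql_query, params)
        if cursor.description is None:  # statement produced no result set
            return []
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE penguins (species TEXT, mass_g INTEGER)")
conn.executemany(
    "INSERT INTO penguins VALUES (?, ?)",
    [("Adelie", 3700), ("Gentoo", 5000), ("Chinstrap", 3800)],
)

source = TinySQLSource(conn)
heavy = source.execute(
    "SELECT species FROM penguins WHERE mass_g > ? ORDER BY species", 4000
)
count = source.execute(
    "SELECT COUNT(*) AS n FROM penguins WHERE species = :species",
    species="Adelie",
)
print(heavy, count)
```

How a concrete Lumen source forwards `*args` and `**kwargs` depends on its underlying database driver.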
- BaseSQLSource.get(table: str, **query) DataFrame #
Returns a table, optionally filtered by the given query.
- Parameters:
table (str) – The name of the table to query
query (dict) – A dictionary containing all the query parameters
- Returns:
A DataFrame containing the queried table.
- Return type:
DataFrame
- BaseSQLSource.get_metadata(table: str | list[str] | None) dict #
Returns metadata for one, multiple or all tables provided by the source.
The metadata for a table is structured as:
{
    "description": …,
    "columns": {
        <COLUMN>: {
            "description": …,
            "data_type": …,
        }
    },
    **other_metadata
}
If a list of tables or no table is provided, the metadata is nested one additional level:
{
    "table_name": {
        "description": …,
        "columns": {
            <COLUMN>: {"description": …, "data_type": …}
        },
        **other_metadata
    }
}
- Parameters:
table (str | list[str] | None) – The name of the table, or a list of table names, to return the metadata for. If None, returns the metadata for all available tables.
- Returns:
metadata – Dictionary of metadata indexed by table (if a list of tables or no table was provided), or the metadata of an individual table.
- Return type:
dict
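Because the return value has two shapes, calling code may want to normalize it before use. The sketch below works on plain dictionaries that follow the structure described above. The helper names `by_table` and `column_types`, and the sample descriptions and data types, are illustrative assumptions rather than part of Lumen.

```python
from __future__ import annotations

from typing import Any


def by_table(
    metadata: dict[str, Any], table: str | list[str] | None
) -> dict[str, dict[str, Any]]:
    """Hypothetical helper: always return metadata keyed by table name."""
    if isinstance(table, str):
        # A single table was requested, so the metadata is not nested yet.
        return {table: metadata}
    return metadata


def column_types(table_metadata: dict[str, Any]) -> dict[str, str]:
    """Map each column to its data_type entry."""
    return {
        column: info["data_type"]
        for column, info in table_metadata["columns"].items()
    }


# Shape returned for a single table (values are made up).
single = {
    "description": "Penguin measurements",
    "columns": {
        "species": {"description": "Species name", "data_type": "VARCHAR"},
        "mass_g": {"description": "Body mass in grams", "data_type": "INTEGER"},
    },
}

# Shape returned for a list of tables or for table=None.
nested = {"penguins": single}

assert by_table(single, "penguins") == by_table(nested, None)
print(column_types(by_table(single, "penguins")["penguins"]))
```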
- BaseSQLSource.get_schema(table: str | None = None, limit: int | None = None, shuffle: bool = False) dict[str, dict[str, Any]] | dict[str, Any] #
Returns JSON schema describing the tables returned by the Source.
- Parameters:
table (str | None) – The name of the table to return the schema for. If None, returns the schema for all available tables.
limit (int | None) – Limits the number of rows considered for the schema calculation
- Returns:
JSON schema(s) for one or all the tables.
- Return type:
dict
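The effect of limit can be illustrated with a toy schema calculation over in-memory rows. This is a sketch under stated assumptions: `sketch_schema` is a hypothetical function, the output uses generic JSON Schema keywords (`type`, `minimum`, `maximum`), and the exact schema that a Lumen source produces may differ.

```python
from __future__ import annotations

from typing import Any


def sketch_schema(
    rows: list[dict[str, Any]], limit: int | None = None
) -> dict[str, Any]:
    """Hypothetical schema inference over at most `limit` rows.

    Assumes every column holds values of a single type.
    """
    sample = rows if limit is None else rows[:limit]
    schema: dict[str, Any] = {}
    for row in sample:
        for column, value in row.items():
            if isinstance(value, bool):  # check bool before int
                schema.setdefault(column, {"type": "boolean"})
            elif isinstance(value, (int, float)):
                entry = schema.setdefault(
                    column, {"type": "number", "minimum": value, "maximum": value}
                )
                entry["minimum"] = min(entry["minimum"], value)
                entry["maximum"] = max(entry["maximum"], value)
            else:
                schema.setdefault(column, {"type": "string"})
    return schema


rows = [
    {"species": "Adelie", "mass_g": 3700},
    {"species": "Gentoo", "mass_g": 5000},
    {"species": "Chinstrap", "mass_g": 3800},
]

full = sketch_schema(rows)              # considers every row
partial = sketch_schema(rows, limit=1)  # considers only the first row
print(full["mass_g"], partial["mass_g"])
```

With a limit, the numeric range reflects only the sampled rows, which is the trade-off between speed and completeness that the parameter controls.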
- BaseSQLSource.get_sql_expr(table: str | dict)#
Returns the SQL expression corresponding to a particular table.
- BaseSQLSource.get_tables() list[str] #
Returns the list of tables available on this source.
- Returns:
The list of available tables on this source.
- Return type:
list
- BaseSQLSource.normalize_table(table: str) str #
Hook for implementing table name normalization, which allows fuzzy matching of table names that differ only in minor ways, such as quoting.
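A normalization of this kind might look like the following sketch. The specific rules (stripping double quotes, backticks and square brackets, then lowercasing) are assumptions for illustration. A concrete source would apply whatever rules suit its SQL dialect.

```python
from __future__ import annotations


def normalize_table(table: str) -> str:
    """Hypothetical normalization: drop quoting and ignore case."""
    # Note: splitting on '.' does not handle quoted names that contain dots.
    parts = [part.strip().strip('"`[]') for part in table.split(".")]
    return ".".join(parts).lower()


def find_table(requested: str, available: list[str]) -> str | None:
    """Return the available table whose normalized name matches, if any."""
    target = normalize_table(requested)
    for name in available:
        if normalize_table(name) == target:
            return name
    return None


available = ['"sales"."Orders"', '"sales"."Customers"']
print(find_table("sales.orders", available))
print(find_table("`sales`.`Customers`", available))
print(find_table("sales.refunds", available))
```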
- BaseSQLSource.to_spec(context: dict[str, Any] | None = None) dict[str, Any] #
Exports the full specification to reconstruct this component.
- Return type:
dict[str, Any]