Source
- class lumen.sources.base.Source(*, cache_dir, cache_per_query, cache_with_dask, root, shared, name)
Source components allow querying all kinds of data.
A Source exposes three methods: .get_tables lists the tables it can return, .get_schema describes the data in each table in the form of a JSON schema, and .get returns a table, optionally filtered by a query.
The Source base class also implements both in-memory and disk caching which can be enabled if a cache_dir is provided. Data cached to disk is stored as parquet files.
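The three-method contract described above can be illustrated with a small stand-in. This is only a sketch: ToySource is a hypothetical class, not part of Lumen, and it returns plain lists of dictionaries where a real Source returns a DataFrame. It also assumes that each keyword in the query means "this column equals this value".

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


class ToySource:
    """Hypothetical stand-in that mimics the Source interface (not Lumen code)."""

    def __init__(self, tables: Dict[str, List[Row]]):
        self._tables = tables

    def get_tables(self) -> List[str]:
        # Names of all tables this source can return.
        return list(self._tables)

    def get(self, table: str, **query: Any) -> List[Row]:
        # Keep only the rows whose columns equal every value given in the query.
        # With an empty query, every row is returned.
        return [
            row for row in self._tables[table]
            if all(row.get(col) == value for col, value in query.items())
        ]


source = ToySource({
    "penguins": [
        {"species": "Adelie", "mass_g": 3750},
        {"species": "Gentoo", "mass_g": 5000},
        {"species": "Adelie", "mass_g": 3800},
    ]
})
adelie = source.get("penguins", species="Adelie")
```

Here source.get_tables() lists the single table name, and adelie holds only the rows whose species column matches the query.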
Parameters
cache_dir
type: str
default: None
Directory in which cache files are written. Providing a value enables the local disk cache.
cache_per_query
type: bool
default: True
Whether to cache individual queries (True) or the whole dataset (False).
cache_with_dask
type: bool
default: True
Whether to read and write cache files with dask if available.
name
type: str
default: 'Source'
String identifier for this object.
root
type: pathlib.Path
default: None
Root folder of the cache_dir; defaults to config.root.
shared
type: bool
default: False
Whether the Source can be shared across all instances of the dashboard. If set to True, the Source will be loaded on initial server load.
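To make the difference between whole-dataset and per-query caching concrete, the sketch below derives a cache file path from root, cache_dir, a table name and an optional query. The function cache_path and its file naming scheme are assumptions made for illustration; they are not Lumen's actual implementation.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Dict, Optional


def cache_path(root: Path, cache_dir: str, table: str,
               query: Optional[Dict[str, Any]] = None) -> Path:
    """Build a hypothetical Parquet path for a cached table or query result."""
    if not query:
        # Whole-dataset caching: one file per table.
        return root / cache_dir / f"{table}.parquet"
    # Per-query caching: serialize the query with sorted keys so that
    # equal queries map to the same file regardless of argument order.
    payload = json.dumps(query, sort_keys=True, default=str)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return root / cache_dir / f"{table}_{digest}.parquet"


whole = cache_path(Path("/data"), "cache", "penguins")
per_query = cache_path(Path("/data"), "cache", "penguins",
                       {"species": "Adelie", "year": 2008})
```

Under this scheme, a per-query cache produces one file per distinct query, while a whole-dataset cache produces a single file per table that every query can reuse.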
Methods
- Source.clear_cache(*events: Event)
Clears any cached data.
- Source.get(table: str, **query) → DataFrame
Returns a table, optionally filtered by the given query.
- Parameters:
table (str) – The name of the table to query
query (dict) – A dictionary containing all the query parameters
- Returns:
A DataFrame containing the queried table.
- Return type:
DataFrame
- Source.get_schema(table: str | None = None, limit: int | None = None) → Dict[str, Dict[str, Any]] | Dict[str, Any]
Returns the JSON schema describing the tables returned by the Source.
- Parameters:
table (str | None) – The name of the table to return the schema for. If None, returns the schema for all available tables.
limit (int | None) – Limits the number of rows considered for the schema calculation.
- Returns:
JSON schema(s) for one or all the tables.
- Return type:
dict
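As a rough illustration of how a schema can be computed from table rows, and of what limit changes, the sketch below infers a JSON schema from a list of dictionaries. The helper infer_schema is hypothetical, it uses the standard JSON Schema keywords type, enum, minimum and maximum, and it assumes each column holds only booleans, numbers or strings. The exact keywords that Lumen emits may differ.

```python
from typing import Any, Dict, List, Optional


def _json_type(value: Any) -> str:
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


def infer_schema(rows: List[Dict[str, Any]],
                 limit: Optional[int] = None) -> Dict[str, Dict[str, Any]]:
    """Infer a per-column JSON schema from at most `limit` rows."""
    sample = rows if limit is None else rows[:limit]
    schema: Dict[str, Dict[str, Any]] = {}
    for row in sample:
        for col, value in row.items():
            kind = _json_type(value)
            entry = schema.setdefault(col, {"type": kind})
            if entry["type"] != kind:
                raise TypeError(f"column {col!r} mixes {entry['type']} and {kind}")
            if kind == "number":
                # Track the observed range of numeric columns.
                entry["minimum"] = min(entry.get("minimum", value), value)
                entry["maximum"] = max(entry.get("maximum", value), value)
            elif kind == "string":
                # Collect the distinct values of string columns.
                values = entry.setdefault("enum", [])
                if value not in values:
                    values.append(value)
    return schema


rows = [
    {"species": "Adelie", "mass_g": 3750},
    {"species": "Gentoo", "mass_g": 5000},
    {"species": "Adelie", "mass_g": 3800},
]
full = infer_schema(rows)
limited = infer_schema(rows, limit=1)
```

With limit=1 only the first row is inspected, so the limited schema is cheaper to compute but can miss values and understate ranges that appear in later rows.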
- Source.get_tables() → List[str]
Returns the list of tables available on this source.
- Returns:
The list of available tables on this source.
- Return type:
list
- Source.to_spec(context: Dict[str, Any] | None = None) → Dict[str, Any]
Exports the full specification to reconstruct this component.
- Returns:
The specification from which this component can be reconstructed.
- Return type:
dict
Types
RESTSource
allows querying REST endpoints conforming to the Lumen REST specification.
FileSource
loads CSV, Excel and Parquet files using pandas and dask read_* functions.
WebsiteSource
queries whether a website responds with a 400 status code.
PanelSessionSource
queries the session_info endpoint of a Panel application.
JoinedSource
performs a join on tables from one or more sources.
DerivedSource
applies filtering and transforms to tables from other sources.
JSONSource
is very similar to the FileSource but loads JSON files.
IntakeSource
loads data from an Intake catalog.
IntakeSQLSource
extends the IntakeSource with support for SQL data.