Source

class lumen.sources.base.Source(*, cache_dir, cache_per_query, cache_with_dask, root, shared, name)

Source components allow querying all kinds of data.

A Source exposes three methods: .get_tables lists the tables it can return, .get_schema describes the data in each table as a JSON schema, and .get returns a table, optionally filtered by a query.

The Source base class also implements both in-memory and disk caching which can be enabled if a cache_dir is provided. Data cached to disk is stored as parquet files.
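To make the three-method contract concrete, here is a minimal sketch in plain Python. ToySource is a hypothetical stand-in and not Lumen's implementation: it stores tables as lists of dicts rather than DataFrames, and both its schema format and its equality-only filtering are simplifying assumptions.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


class ToySource:
    """Hypothetical stand-in for the Source contract; not Lumen's implementation."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self._tables = tables

    def get_tables(self) -> List[str]:
        # Names of every table this source can return.
        return list(self._tables)

    def get_schema(self, table: str) -> Dict[str, Dict[str, Any]]:
        # Assumption: describe each column by the JSON type of its first value.
        json_types = {str: "string", int: "integer", float: "number", bool: "boolean"}
        first_row = self._tables[table][0]
        return {col: {"type": json_types[type(val)]} for col, val in first_row.items()}

    def get(self, table: str, **query: Any) -> List[Row]:
        # Assumption: each query parameter is an equality filter on a column.
        return [
            row for row in self._tables[table]
            if all(row.get(col) == val for col, val in query.items())
        ]


source = ToySource({
    "penguins": [
        {"species": "Adelie", "mass_g": 3750},
        {"species": "Gentoo", "mass_g": 5000},
    ]
})
tables = source.get_tables()
schema = source.get_schema("penguins")
adelie = source.get("penguins", species="Adelie")
```

A concrete Source would read from files, a REST endpoint or a catalog instead of an in-memory dict, but callers would still use the same three methods.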


Parameters

cache_dir

type: str
default: None
Directory in which cached files are written to disk. Providing a value enables the local cache.

cache_per_query

type: bool
default: True
Whether to cache the result of each individual query (True) or to query and cache the whole dataset (False).

cache_with_dask

type: bool
default: True
Whether to read and write cache files with dask if available.

name

type: str
default: 'Source'
String identifier for this object.

root

type: pathlib.Path
default: None
Root folder of the cache_dir. Defaults to config.root.

shared

type: bool
default: False
Whether the Source can be shared across all instances of the dashboard. If set to True, the Source will be loaded on initial server load.
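The difference between per-query and whole-dataset caching can be sketched with a small in-memory cache. CachingToySource, its loads counter and its equality-only filtering are hypothetical, and the reading of cache_per_query shown here is an assumption based on the parameter name, not a description of Lumen's internals.

```python
import json
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


class CachingToySource:
    """Hypothetical in-memory cache sketch; not Lumen's implementation."""

    def __init__(self, rows: List[Row], cache_per_query: bool = True) -> None:
        self._rows = rows
        self.cache_per_query = cache_per_query
        self._cache: Dict[Tuple[str, str], List[Row]] = {}
        self.loads = 0  # counts how often the underlying data is read

    def _load(self, **query: Any) -> List[Row]:
        self.loads += 1
        return [r for r in self._rows if all(r.get(k) == v for k, v in query.items())]

    def get(self, table: str, **query: Any) -> List[Row]:
        if self.cache_per_query:
            # One cache entry per distinct (table, query) combination.
            key = (table, json.dumps(query, sort_keys=True))
            if key not in self._cache:
                self._cache[key] = self._load(**query)
            return self._cache[key]
        # One cache entry per table; each query filters the cached full table.
        key = (table, "")
        if key not in self._cache:
            self._cache[key] = self._load()
        return [r for r in self._cache[key] if all(r.get(k) == v for k, v in query.items())]

    def clear_cache(self) -> None:
        self._cache.clear()


rows = [{"kind": "a", "n": 1}, {"kind": "b", "n": 2}]

per_query = CachingToySource(rows, cache_per_query=True)
per_query.get("t", kind="a")
per_query.get("t", kind="a")  # served from the cache
per_query.get("t", kind="b")  # new query, so the data is read again

whole_table = CachingToySource(rows, cache_per_query=False)
whole_table.get("t", kind="a")
whole_table.get("t", kind="b")  # filtered from the cached full table
```

Under these assumptions, per-query caching suits backends that can filter cheaply on their side, while whole-dataset caching trades one larger read for fast local filtering afterwards.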


Methods

Source.clear_cache(*events: Event)

Clears any cached data.

Source.get(table: str, **query) → DataFrame

Returns a table, optionally filtered by the given query.

Parameters:
  • table (str) – The name of the table to query

  • query (dict) – A dictionary containing all the query parameters

Returns:

A DataFrame containing the queried table.

Return type:

DataFrame
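How a query value is interpreted depends on the concrete Source. As an illustration only, the sketch below assumes one plausible convention: a scalar means equality, a list means membership, and a 2-tuple means an inclusive range. The matches and filter_rows helpers and the convention itself are assumptions that are not stated in this reference, and rows are plain dicts rather than a DataFrame.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def matches(value: Any, condition: Any) -> bool:
    """Check one value against one query condition (assumed convention)."""
    if isinstance(condition, tuple) and len(condition) == 2:
        low, high = condition
        return low <= value <= high  # inclusive range
    if isinstance(condition, list):
        return value in condition  # membership
    return value == condition  # scalar equality


def filter_rows(rows: List[Row], **query: Any) -> List[Row]:
    """Keep rows that satisfy every query condition; ignore unknown columns."""
    return [
        row for row in rows
        if all(matches(row[col], cond) for col, cond in query.items() if col in row)
    ]


rows = [
    {"species": "Adelie", "mass_g": 3750},
    {"species": "Gentoo", "mass_g": 5000},
    {"species": "Chinstrap", "mass_g": 3700},
]
by_species = filter_rows(rows, species=["Adelie", "Gentoo"])
by_mass = filter_rows(rows, mass_g=(3700, 4000))
combined = filter_rows(rows, species="Adelie", mass_g=(3700, 4000))
```

Multiple keyword arguments are combined with a logical AND in this sketch, so combined keeps only rows that satisfy both conditions.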

Source.get_schema(table: str | None = None, limit: int | None = None) → Dict[str, Dict[str, Any]] | Dict[str, Any]

Returns JSON schema describing the tables returned by the Source.

Parameters:
  • table (str | None) – The name of the table to return the schema for. If None, returns the schema for all available tables.

  • limit (int | None) – Limits the number of rows considered when calculating the schema.

Returns:

JSON schema(s) for one or all the tables.

Return type:

dict
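The two return shapes and the effect of limit can be sketched as follows. The TABLES data, the helper functions and the exact schema keys (minimum and maximum for numbers, enum for strings) are assumptions chosen for illustration; the schema a real Source produces may use different keys.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]
TABLES: Dict[str, List[Row]] = {
    "penguins": [
        {"species": "Adelie", "mass_g": 3750},
        {"species": "Gentoo", "mass_g": 5000},
        {"species": "Chinstrap", "mass_g": 3700},
    ],
}


def column_schema(values: List[Any]) -> Dict[str, Any]:
    # Assumption: numbers are summarised by range, everything else by distinct values.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return {"type": "number", "minimum": min(values), "maximum": max(values)}
    return {"type": "string", "enum": sorted({str(v) for v in values})}


def table_schema(table: str, limit: Optional[int] = None) -> Dict[str, Any]:
    rows = TABLES[table][:limit]  # limit=None keeps every row
    if not rows:
        return {}
    return {col: column_schema([row[col] for row in rows]) for col in rows[0]}


def get_schema(table: Optional[str] = None, limit: Optional[int] = None) -> Dict[str, Any]:
    if table is None:
        # No table given: one schema per available table, keyed by table name.
        return {name: table_schema(name, limit) for name in TABLES}
    return table_schema(table, limit)


full = get_schema("penguins")
sampled = get_schema("penguins", limit=1)
everything = get_schema()
```

Note that sampled only sees the first row, so its ranges and enums are narrower than those in full. A small limit makes the schema cheaper to compute but potentially incomplete.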

Source.get_tables() → List[str]

Returns the list of tables available on this source.

Returns:

The list of available tables on this source.

Return type:

list

Source.to_spec(context: Dict[str, Any] | None = None) → Dict[str, Any]

Exports the full specification to reconstruct this component.

Return type:

dict

Types

RESTSource type: rest

RESTSource allows querying REST endpoints conforming to the Lumen REST specification.

FileSource type: file

FileSource loads CSV, Excel and Parquet files using pandas and dask read_* functions.

WebsiteSource type: live

WebsiteSource queries whether a website responds with a 200 status code.

PanelSessionSource type: session_info

PanelSessionSource queries the session_info endpoint of a Panel application.

JoinedSource type: join

JoinedSource performs a join on tables from one or more sources.

DerivedSource type: derived

DerivedSource applies filtering and transforms to tables from other sources.

JSONSource type: json

The JSONSource is very similar to the FileSource but loads JSON files.

IntakeSource type: intake

An IntakeSource loads data from an Intake catalog.

IntakeSQLSource type: intake_sql

IntakeSQLSource extends the IntakeSource with support for SQL data.

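Each type string above identifies one Source class. One way such a lookup could work is a registry keyed by type string, sketched below. RegisteredSource, ToyFileSource, ToyJSONSource and the registry mechanism are hypothetical illustrations, and the spec layout with a type key is an assumption; none of this is Lumen's actual implementation.

```python
from typing import Any, Dict, Type


class RegisteredSource:
    """Hypothetical base class; subclasses register under their type string."""

    source_type: str = ""
    registry: Dict[str, Type["RegisteredSource"]] = {}

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        if cls.source_type:
            RegisteredSource.registry[cls.source_type] = cls

    def __init__(self, **params: Any) -> None:
        self.params = params

    @classmethod
    def from_spec(cls, spec: Dict[str, Any]) -> "RegisteredSource":
        # Copy so the caller's spec is left untouched, then pop the type key.
        params = dict(spec)
        source_type = params.pop("type")
        try:
            source_cls = cls.registry[source_type]
        except KeyError:
            raise ValueError(f"Unknown source type: {source_type!r}") from None
        return source_cls(**params)


class ToyFileSource(RegisteredSource):
    source_type = "file"


class ToyJSONSource(RegisteredSource):
    source_type = "json"


source = RegisteredSource.from_spec({"type": "file", "cache_dir": "cache"})
```

In this sketch the remaining keys of the spec are passed to the selected class as parameters, so adding a new type only requires defining a subclass with its own type string.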