JSONSource  type: json

class lumen.sources.base.JSONSource(*, chunk_size, dask, kwargs, tables, use_dask, cache_dir, cache_per_query, cache_with_dask, root, shared, name)

The JSONSource is very similar to the FileSource but loads JSON files.

Both local and remote JSON files can be fetched by declaring them as a list or dictionary of tables.
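As a minimal sketch of such a declaration, the snippet below builds a dictionary of tables and passes it to the constructor using the keyword-only parameters from the signature above. The `build_source` helper and both file locations are hypothetical placeholders, and the import is deferred so the declarations can be inspected without Lumen installed.

```python
# Table declarations: a dict maps explicit table names to paths or URLs.
# Both locations are placeholders for illustration, not real files.
tables = {
    "local": "/home/user/local_file.json",
    "remote": "https://test.com/test.json",
}


def build_source(tables):
    """Create a JSONSource for the given table declarations (hypothetical helper).

    The import is deferred so this sketch can be read, and the declarations
    inspected, without Lumen installed.
    """
    from lumen.sources.base import JSONSource

    # use_dask=False requests that files are not loaded with dask.
    return JSONSource(tables=tables, use_dask=False)
```

With Lumen installed and the files in place, `build_source(tables).get_tables()` would be expected to list the declared names, and `build_source(tables).get("local")` to return the corresponding DataFrame.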


Parameters

chunk_size

type: int
default: 0
bounds: None
Number of items to load per chunk if a template variable is provided.

dask

type: bool
default: False
Whether to return a Dask dataframe.

kwargs

type: dict
default: None
Keyword arguments to the pandas/dask loading function.

tables

type: list | dict
default: None
List or dictionary of tables to load. If a list is supplied, the names are computed from the filenames; otherwise the keys are the names. The values must be filepaths or URLs to the data:

    {
        'local' : '/home/user/local_file.csv',
        'remote': 'https://test.com/test.csv'
    }
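The naming rule for the list form can be illustrated with a small standalone sketch. It assumes the table name is the filename with its extension removed; the `name_tables` helper is hypothetical and is not Lumen's own implementation, which may differ in detail.

```python
import os
from urllib.parse import urlparse


def name_tables(tables):
    """Return a {name: location} mapping for a list or dict of tables."""
    if isinstance(tables, dict):
        # Keys are already the table names.
        return dict(tables)
    named = {}
    for location in tables:
        # Take the last path component of a local path or URL ...
        filename = os.path.basename(urlparse(location).path)
        # ... and drop the extension to get the table name (an assumption).
        name, _ext = os.path.splitext(filename)
        named[name] = location
    return named


print(name_tables(["/home/user/local_file.json", "https://test.com/test.json"]))
# {'local_file': '/home/user/local_file.json', 'test': 'https://test.com/test.json'}
```

Passing a dictionary instead leaves the names untouched, which is the safer choice when two files share a basename.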

use_dask

type: bool
default: True
Whether to use dask to load files.


Methods

JSONSource.clear_cache(*events: Event)

Clears any cached data.

JSONSource.get(table: str, **query) → DataFrame

Returns a table, optionally filtered by the given query.

Parameters:
  • table (str) – The name of the table to query

  • query (dict) – A dictionary containing all the query parameters

Returns:

A DataFrame containing the queried table.

Return type:

DataFrame
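The reference above does not spell out how query parameters are interpreted, so the following is only a sketch of one plausible convention, written against plain lists of dicts rather than DataFrames. The `filter_rows` helper is hypothetical, and the three constraint forms (scalar for equality, list for membership, 2-tuple for an inclusive range) are assumptions for illustration.

```python
def filter_rows(rows, **query):
    """Keep the rows that satisfy every column constraint in ``query``."""

    def matches(value, constraint):
        if isinstance(constraint, list):
            return value in constraint      # membership (assumed convention)
        if isinstance(constraint, tuple) and len(constraint) == 2:
            low, high = constraint
            return low <= value <= high     # inclusive range (assumed convention)
        return value == constraint          # equality

    return [
        row
        for row in rows
        if all(matches(row[col], constraint) for col, constraint in query.items())
    ]


rows = [
    {"city": "Oslo", "temp": 4},
    {"city": "Lima", "temp": 19},
    {"city": "Cairo", "temp": 28},
]
print(filter_rows(rows, city=["Oslo", "Lima"], temp=(10, 30)))
# [{'city': 'Lima', 'temp': 19}]
```

The keyword-argument form mirrors the `**query` signature: each keyword names a column and its value describes the constraint on that column.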

JSONSource.get_schema(table: str | None = None, limit: int | None = None) → Dict[str, Dict[str, Any]] | Dict[str, Any]

Returns JSON schema describing the tables returned by the Source.

Parameters:
  • table (str | None) – The name of the table to return the schema for. If None, returns the schema for all available tables.

  • limit (int | None) – Limits the number of rows considered for the schema calculation.

Returns:

JSON schema(s) for one or all the tables.

Return type:

dict
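The two return shapes, one schema for a named table or a dictionary of schemas when `table` is None, can be sketched without Lumen as follows. The helpers and the simplified per-column `{"type": ...}` entries are hypothetical; the schema an actual source produces may carry more detail.

```python
def sketch_schema(rows, limit=None):
    """Build a JSON-schema-like description of each column in ``rows``."""
    # Only the first ``limit`` rows are inspected when a limit is given.
    sample = rows if limit is None else rows[:limit]
    schema = {}
    for row in sample:
        for column, value in row.items():
            if isinstance(value, bool):
                kind = "boolean"
            elif isinstance(value, (int, float)):
                kind = "number"
            else:
                kind = "string"
            # The first value seen for a column decides its type.
            schema.setdefault(column, {"type": kind})
    return schema


def get_schema(tables, table=None, limit=None):
    """Return one table's schema, or a {table: schema} dict if table is None."""
    if table is None:
        return {name: sketch_schema(rows, limit) for name, rows in tables.items()}
    return sketch_schema(tables[table], limit)


tables = {"weather": [{"city": "Oslo", "temp": 4}, {"city": "Lima", "temp": 19}]}
print(get_schema(tables, "weather"))
# {'city': {'type': 'string'}, 'temp': {'type': 'number'}}
```

Calling `get_schema(tables)` without a table name wraps the same result under the `"weather"` key, matching the nested return type in the signature.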

JSONSource.get_tables() → List[str]

Returns the list of tables available on this source.

Returns:

The list of available tables on this source.

Return type:

list

JSONSource.to_spec(context: Dict[str, Any] | None = None) → Dict[str, Any]

Exports the full specification to reconstruct this component.

Return type:

dict