SQLSample  type: sql_sample

class lumen.transforms.sql.SQLSample(*, percent, sample_kwargs, seed, size, comments, error_level, identify, optimize, pretty, read, unsupported_level, write, controls, name)

Samples rows from a SQL query using TABLESAMPLE or similar functionality, depending on the dialect’s support.


Parameters

comments

type: bool
default: False
Whether to include comments in the output SQL

error_level

type: sqlglot.ErrorLevel
default: <ErrorLevel.RAISE: 'RAISE'>
Error level for parsing

identify

type: bool
default: False
Delimit all identifiers, e.g. turn FROM database.table into FROM "database"."table". This is useful for dialects that don’t support unquoted identifiers.

optimize

type: bool
default: False
Whether to optimize the generated SQL query; may produce invalid results, especially with DuckDB’s read_* functions.

percent

type: Number
default: 10.0
bounds: (0.0, 100.0)
Percentage of rows to sample. Must be between 0 and 100.

pretty

type: bool
default: False
Prettify output SQL, i.e. add newlines and indentation

read

type: str
default: None
Source dialect for parsing; if None, the dialect is detected automatically.

sample_kwargs

type: dict
default: {}
Other keyword arguments, like method, bucket_numerator, bucket_denominator, bucket_field.

seed

type: int
default: None
bounds: None
Random seed for reproducible sampling.

size

type: int
default: None
bounds: None
Absolute number of rows to sample. If specified, takes precedence over percent.

unsupported_level

type: sqlglot.ErrorLevel
default: <ErrorLevel.WARN: 'WARN'>
How to_sql handles features that the target dialect does not support.

write

type: str
default: None
Target dialect for output; if None, defaults to the read dialect.
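
The interplay of percent, size, and seed can be illustrated with a small standalone sketch. It is not Lumen's implementation: build_sample_query is a hypothetical helper, the PostgreSQL-style TABLESAMPLE BERNOULLI ... REPEATABLE clause and the ORDER BY RANDOM() LIMIT fallback are assumptions chosen for illustration, and the SQL that SQLSample actually emits depends on the dialect. The sketch only mirrors the documented rules: percent must lie between 0 and 100, size takes precedence over percent, and seed makes the sample reproducible.

```python
from typing import Optional


def build_sample_query(
    table: str,
    percent: float = 10.0,
    size: Optional[int] = None,
    seed: Optional[int] = None,
) -> str:
    """Hypothetical helper (not part of Lumen) that builds a sampling query."""
    if size is not None:
        # An absolute row count takes precedence over percent.
        if size < 0:
            raise ValueError("size must be non-negative")
        # Assumed fallback: random ordering plus LIMIT; seed is ignored here.
        return f"SELECT * FROM {table} ORDER BY RANDOM() LIMIT {size}"

    # Mirror the documented bounds of the percent parameter.
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")

    query = f"SELECT * FROM {table} TABLESAMPLE BERNOULLI ({percent:g})"
    if seed is not None:
        # REPEATABLE pins the random seed in PostgreSQL-style syntax.
        query += f" REPEATABLE ({seed})"
    return query


print(build_sample_query("events", percent=20, seed=42))
print(build_sample_query("events", percent=20, size=500))
```

In the second call size is set, so the percent value is not used at all.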


Methods

SQLSample.apply(sql_in: str) → str

Given an SQL statement, manipulate it, and return a new SQL statement.

Parameters:

sql_in (string) – The initial SQL query to be manipulated.

Returns:

New SQL query derived from the above query.

Return type:

string

SQLSample.parse_sql(sql_in: str) → Expression

Parse SQL string into sqlglot AST.

Parameters:

sql_in (string) – SQL string to parse

Returns:

Parsed SQL expression

Return type:

sqlglot.Expression

SQLSample.to_spec(context: dict[str, Any] | None = None) → dict[str, Any]

Exports the full specification to reconstruct this component.

Returns:

Specification dictionary from which this component can be reconstructed.

Return type:

dict[str, Any]

SQLSample.to_sql(expression: Expression) → str

Convert sqlglot expression back to SQL string.

Parameters:

expression (sqlglot.Expression) – Expression to convert to SQL

Returns:

SQL string representation

Return type:

string