Application
Finally, you need a CLI that lets you configure and run your pipeline, that is, a Ralsei application:
from ralsei import Ralsei
import sqlalchemy

class App(Ralsei):
    def __init__(self, url: sqlalchemy.URL) -> None:
        super().__init__(url, MyPipeline())

if __name__ == "__main__":
    App.run_cli()
Here, Ralsei.run_cli() is a classmethod that turns your class into a click command-line app.
The url parameter is parsed from the command line and passed into your constructor as its first argument. Pass it on to the parent's constructor along with your own custom pipeline:
- class Ralsei(url: sqlalchemy.engine.URL, pipeline: ralsei.graph.Pipeline)
  The pipeline-running CLI application.
  Decorate your subclass with the click.option() decorator to add custom CLI options. Positional arguments are not allowed.
  Parameters:
  - url: sqlalchemy.engine.URL
    When the class constructor is called by the CLI, the URL is provided as the first argument.
  - pipeline: ralsei.graph.Pipeline
    The CLI does not give you the pipeline; you must create one in your subclass and pass it to super().__init__().
Example
import click
import sqlalchemy
from ralsei import Ralsei

@click.option("-s", "--schema", help="Database schema")
class App(Ralsei):
    def __init__(
        self,
        url: sqlalchemy.URL,  # First argument must always be the url
        schema: str | None,  # Custom argument added with the click decorator
    ) -> None:
        super().__init__(url, MyPipeline(schema))

if __name__ == "__main__":
    App.run_cli()
Custom initialization
If necessary, hook into connection or environment initialization, for example to create schemas automatically or to inject values into jinja templates:
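from typing import Optional

import click
import sqlalchemy
from ralsei import Ralsei
from ralsei.connection import ConnectionEnvironment
from ralsei.jinja import SqlEnvironment

@click.option("-s", "--schema", help="Database schema")
class App(Ralsei):
    def __init__(self, url: sqlalchemy.URL, schema: Optional[str]) -> None:
        self.schema = schema
        super().__init__(url, MyPipeline(schema))

    def _prepare_env(self, env: SqlEnvironment):
        # Make custom_function available to every template as my_function
        env.globals["my_function"] = custom_function

    def _on_connect(self, conn: ConnectionEnvironment):
        # Create the schema once the connection is established
        conn.render_execute(
            "CREATE SCHEMA IF NOT EXISTS {{schema | identifier}}",
            {"schema": self.schema},
        )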
- _prepare_env(env: ralsei.jinja.SqlEnvironment)
  Here you can add your own filters/globals to the jinja environment.
- _on_connect(conn: ralsei.connection.ConnectionEnvironment)
  Run custom code after the database connection is established.
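To make the hook mechanism concrete, here is a minimal, self-contained sketch of the pattern. It is not Ralsei's actual implementation: MiniApp, its start() method, and the dict and list standing in for the jinja environment and the database connection are all hypothetical, and the order in which the hooks fire (environment first, then connection) is an assumption.

```python
class MiniApp:
    """Hypothetical stand-in for a base class that exposes two hooks."""

    def _prepare_env(self, env: dict) -> None:
        """Hook: subclasses may add globals or filters to the environment."""

    def _on_connect(self, conn: list) -> None:
        """Hook: subclasses may run code right after connecting."""

    def start(self) -> tuple[dict, list]:
        env: dict = {}  # stands in for the jinja environment
        self._prepare_env(env)
        conn: list = []  # stands in for a connection; records "executed" SQL
        self._on_connect(conn)
        return env, conn


class MyApp(MiniApp):
    def __init__(self, schema: str) -> None:
        self.schema = schema

    def _prepare_env(self, env: dict) -> None:
        # Expose a function to templates under the name "my_function"
        env["my_function"] = len

    def _on_connect(self, conn: list) -> None:
        # Pretend to execute a statement on the fresh connection
        conn.append(f'CREATE SCHEMA IF NOT EXISTS "{self.schema}"')


env, conn = MyApp("staging").start()
```

The base class owns the lifecycle and calls the hooks at fixed points, so a subclass only overrides the methods it cares about and never has to re-implement startup.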
CLI Arguments
Usage: app.py [COMMON OPTIONS] COMMAND [COMMAND ARGS]
Common Options
| | SQLAlchemy database url |
| | Custom arguments |
Commands
run, delete, redo
All three commands have the same set of arguments:
| --one | Filter to run only this task |
| --from | Filter to run only this task and its descendants |
The filtered sets are then added together, so
--one records --from orgs --from export.person
is read as
The task "records"
AND the task "orgs" and its descendants
AND the task "export.person" and its descendants
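The union rule can be sketched in plain Python. This is only an illustration under stated assumptions: the CHILDREN graph, its edges, and the descendants() and select_tasks() helpers are hypothetical and are not part of Ralsei. Only the task names "records", "orgs" and "export.person" come from the example above.

```python
from collections import deque

# Hypothetical task graph: each task maps to the tasks that directly depend on it.
# The edges are made up purely for illustration.
CHILDREN = {
    "records": ["orgs"],
    "orgs": ["export.org"],
    "persons": ["export.person"],
    "export.org": [],
    "export.person": [],
}


def descendants(task: str) -> set[str]:
    """Return the task together with every task reachable from it."""
    seen = {task}
    queue = deque([task])
    while queue:
        current = queue.popleft()
        for child in CHILDREN[current]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def select_tasks(one: list[str], from_: list[str]) -> set[str]:
    """Union of the --one tasks and the --from tasks with their descendants."""
    selected = set(one)
    for task in from_:
        selected |= descendants(task)
    return selected


selected = select_tasks(one=["records"], from_=["orgs", "export.person"])
```

With this made-up graph, the selection contains records, orgs, export.org and export.person, while persons is left out because no filter names it or any of its ancestors.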
describe
Positional argument: TASK
Print SQL scripts rendered by this task, useful for debugging templates
graph
Show a visualization of the task graph (requires Graphviz to be installed)