Local filesystem
Sample setup
If you are running the samples on macOS or Linux, you may skip this section.
To run the samples locally on Windows, you must first build the Docker image:
$ docker build --build-arg SL_VERSION=0.7.2.2 -t starlake .
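Optionally, you can check that the image was built by listing your local Docker images (starlake is the tag passed to -t above):
$ docker images starlake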
Once the Docker image is built locally, run it:
$ docker run -it starlake bash
Running the samples
Inside the Docker container, make sure you are in the samples/local folder.
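For example, assuming the samples folder sits under the container's current working directory (the exact path inside the image may differ), you can change into it with:
$ cd samples/local  # adjust the path if the samples live elsewhere in the image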
The userguide-template is first duplicated into the samples/local folder to create a startup project:
$ ./0.data-init.sh
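If you want to inspect the startup project that was just created, you can list its contents; the userguide folder name is taken from the paths used later in this guide:
$ ls userguide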
Then import the files located in userguide/incoming into the correct pending folder, depending on the domain they belong to:
$ ./1.data-import.sh
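To check where the imported files landed, you can list the dataset tree; the exact pending location per domain is an assumption based on the description above:
$ find userguide/datasets -type f  # imported files are expected under a per-domain pending folder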
To start the ingestion process, run the load command. The resulting tables should be available in the userguide/datasets/accepted folder:
$ ./2.data-load.sh
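Once the load completes, you can list the accepted data directly, for example:
$ ls -R userguide/datasets/accepted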
To join multiple datasets using the KPI job example located in userguide/metadata/jobs/kpi.sql, run the corresponding transformation:
$ ./3.data-transform.sh
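If you want to see the SQL executed by this transformation, print the job definition mentioned above:
$ cat userguide/metadata/jobs/kpi.sql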
To view the ingested data, stored as Parquet files:
$ ./4.data-view-results.sh
To exit the Spark shell above, type :quit
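If you prefer not to start a Spark shell, you can also locate the Parquet files directly from the command line:
$ find userguide/datasets/accepted -name "*.parquet"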
To view the audit log produced:
$ ./4.data-view-audit.sh
To exit the Spark shell above, type :quit
Optional
You may view the relationships between your tables by generating a Graphviz diagram using the command below:
$ ./1.data-visualization.sh
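Assuming the script produces a Graphviz .dot file (the relations.dot filename below is hypothetical), you can render it to an image with the dot tool if Graphviz is installed:
$ dot -Tpng relations.dot -o relations.png  # relations.dot is a placeholder for the generated file name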