Installation
Starlake CLI
Prerequisites
Make sure you have Java 11 or later installed on your machine.
You can check your Java version by typing java -version in a terminal.
If you don't have Java 11+ installed, you can download it from Oracle JDK or OpenJDK.
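The version check can also be scripted. The following is a minimal sketch, not part of Starlake: the helper names java_major and java_ok are made up for this example, and it assumes the usual java -version output, where the version is the first quoted token.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with Starlake.

# Extract the major version from a Java version string.
# Legacy scheme "1.8.0_392" yields 8; "11.0.21" yields 11; "17-ea" yields 17.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;
  esac
}

# Succeed when the given version string denotes Java 11 or later.
java_ok() {
  [ "$(java_major "$1")" -ge 11 ] 2>/dev/null
}

# `java -version` writes to stderr, hence the 2>&1 redirection.
if command -v java >/dev/null 2>&1; then
  found=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  if java_ok "$found"; then
    echo "Java $found is recent enough"
  else
    echo "Java $found is too old; Java 11+ is required"
  fi
else
  echo "java not found on PATH"
fi
```

Parsing the version string in a separate function keeps the comparison testable on machines where Java is not installed.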
Install Starlake
To install Starlake, download and run the setup script from GitHub. The script in turn downloads the required dependencies and copies them to the bin subdirectory.
- Linux/MacOS:
sh <(curl https://raw.githubusercontent.com/starlake-ai/starlake/master/distrib/setup.sh)
- Windows:
Invoke-Expression (Invoke-WebRequest -Uri "https://raw.githubusercontent.com/starlake-ai/starlake/master/distrib/setup.ps1").Content
- Docker: pull the Docker image from Docker Hub:
$ docker pull starlakeai/starlake:VERSION
You may also run Starlake from a Docker image that you build yourself. To do so, get the Dockerfile by cloning the Starlake repository, then build the image:
$ git clone git@github.com:starlake-ai/starlake.git
$ cd starlake
$ docker build -t starlakeai/starlake .
$ docker run -it starlakeai/starlake:VERSION help
To build the docker image with a specific version of Starlake or a specific branch, you can use the following command:
$ docker build -t starlakeai/starlake:VERSION --build-arg SL_VERSION=1.2.0 .
To run a Starlake command against a local project, mount the project directory into the container:
$ docker run -it -v /path/to/starlake/project:/starlake starlakeai/starlake:VERSION <command>
Note that you can pass environment variables to the Docker container to configure the CLI. For example, to run against AWS Redshift, you can pass the following environment variables:
$ SL_ROOT="s3a://my-bucket/my-starlake-project-base-dir/"
$ docker run -e SL_ROOT=$SL_ROOT \
-e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
-e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
-e AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN \
-e REDSHIFT_PASSWORD=$REDSHIFT_PASSWORD \
-it starlakeai/starlake:VERSION <command>
The following folders should now have been created and contain Starlake dependencies.
starlake
└── bin
    ├── deps
    ├── sl
    └── spark
Any extra library you may need (an Oracle client, for example) must be copied into the bin/deps folder.
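As an illustration, the copy step could be wrapped in a small helper. This is a sketch under assumptions: add_dep is a hypothetical function, not a Starlake command, and the paths in the usage comment are placeholders for your own installation directory and library file.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Starlake.
# Copy an extra library file into the bin/deps folder of an installation.
#   $1: Starlake installation directory (the one containing bin/)
#   $2: path to the library file to add
add_dep() {
  deps_dir="$1/bin/deps"
  if [ ! -d "$deps_dir" ]; then
    echo "no such folder: $deps_dir" >&2
    return 1
  fi
  cp "$2" "$deps_dir/" && echo "copied $(basename "$2") to $deps_dir"
}

# Usage (placeholder paths):
#   add_dep "$HOME/starlake" "$HOME/Downloads/ojdbc11.jar"
```

Checking that bin/deps exists first avoids silently creating a file named deps when the installation path is mistyped.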
Starlake is now installed with all its dependencies. You can run the CLI by typing starlake.
This will display the commands supported by the CLI.
Starlake Version 1.2.0
Usage:
starlake [command]
Available commands =>
lineage
bootstrap
bq2yml or bq-info
compare
cnxload
esload
extract-data
extract-schema
import
infer-schema
kafkaload
load
metrics
parquet2csv
transform
watch
xls2yml
yml2ddl
table-dependencies
yml2xls
That's it! We now need to bootstrap a new project.
Graph Visualization
Starlake provides features to visualize the lineage of a table, the relationships between tables, and table-level and row-level access policies.
To use these features, you need to install GraphViz, on top of which the Starlake graph generator is built.
- Linux: sudo apt install graphviz (or sudo yum install graphviz, depending on your distribution)
- MacOS: brew install graphviz
- Windows: download and install one of the packages from the GraphViz website.
- Docker: GraphViz comes pre-installed with the Starlake Docker image.
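To confirm that GraphViz is usable after installation, you can check that its dot executable is on your PATH. The sketch below is illustrative only: have_cmd is a hypothetical helper, and it assumes Starlake finds GraphViz through the dot command, which the source does not state explicitly.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd dot; then
  # `dot -V` prints the GraphViz version on stderr.
  dot -V 2>&1
else
  echo "GraphViz 'dot' not found on PATH; install it before generating graphs"
fi
```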
VS Code extension
Starlake comes with a VS Code extension that allows you to interact with the Starlake CLI. You can install it from the VS Code Marketplace.