Deliver high-quality data with
Discover how the Starlake® contract-based ingestion and transformation framework accelerates data pipeline delivery
Features
Focus more on business value, less on pipelines
No-code Ingestion
Through declarative configuration, data is validated, transformed and loaded into your data warehouse without writing a single line of code.
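As an illustrative sketch, a declarative load contract might look like the following YAML. Field names here (`pattern`, `metadata`, `attributes`, and so on) are assumptions made for the sake of example, not the exact Starlake configuration schema:

```yaml
# Illustrative load contract for a "customers" dataset.
# Field names are assumptions for the sake of example,
# not the authoritative Starlake configuration schema.
table:
  name: customers
  pattern: "customers-.*.csv"   # incoming files matching this pattern are picked up
  metadata:
    format: "DSV"               # delimiter-separated values
    separator: ";"
    withHeader: true
  attributes:
    - name: id
      type: string
      required: true            # rows missing an id are rejected
    - name: signup_date
      type: date
    - name: email
      type: string
```

From a contract like this, the schema is validated and the data loaded into the warehouse without any hand-written pipeline code.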
Low-code Transformations
Declare the datasets you need, the transformations to apply, the write strategy, and the rules you want to enforce, and let Starlake do the rest.
Automated Workflows
Let Starlake infer your task dependencies and apply predefined and custom Airflow® or Dagster® templates to automate your workflows.
Benefits
The Starlake Ingestion & Transform Platform
When the schema is enforced at the ingestion stage, the transformation logic is governed by the rules specified in your contracts, your metrics are tracked and aligned with the SLAs those contracts define, and your pipelines are automated and tested, you deliver value faster and at lower cost.
Code less, deliver more
Declare the ingestion and transformation outcomes and let Starlake and your data warehouse take care of the underlying logic. Use our declarative YAML syntax or browser-based UI to build and maintain your data warehouse more efficiently.
- Don't code; declare your intent using YAML or our browser-based UI
- Infer dependencies and automate workflows
- Reuse orchestration templates among tasks and projects
Apply Software Engineering practices to Data Engineering
Develop and test your workload locally and deploy globally. Thanks to our state-of-the-art SQL transpiler, use your native SQL dialect in both your test and production environments. BigQuery, Databricks, Snowflake, Redshift, DuckDB, and more are supported.
- Test your load and transformation logic locally or in your CI before deploying
- Validate your pipelines on a small amount of data before running them on the full dataset
Data Engineering / Analytics is not about Load, Transform, or Orchestration alone; it's about all of them!
Starlake covers the entire data lifecycle, from data ingestion to data monitoring, including data validation, transformation and orchestration.
- Extract from source database or middleware in full or incremental mode
- Infer schema and data types from your inputs and load them into your data warehouse
- Apply transformations using SQL SELECT statements and materialize using declarative write strategies (append, overwrite, upsert by key and/or timestamp, slow changing dimension, etc.)
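As a sketch of the write strategies mentioned above, a transform pairing a SQL SELECT with a declarative upsert might be expressed like this. The field names (`task`, `sql`, `writeStrategy`, `key`, `timestamp`) are assumptions for illustration, not the exact Starlake schema:

```yaml
# Illustrative transform: a plain SQL SELECT materialized with
# a declarative write strategy. Field names are assumptions
# for the sake of example, not the authoritative schema.
task:
  name: daily_orders
  sql: |
    SELECT order_id, customer_id, amount, updated_at
    FROM staging.orders
  writeStrategy:
    # Upsert: match existing rows on the key and keep the
    # freshest version according to the timestamp column.
    type: UPSERT_BY_KEY_AND_TIMESTAMP
    key: [order_id]
    timestamp: updated_at
```

Swapping the strategy type (append, overwrite, slowly changing dimension, etc.) changes how the result is materialized without touching the SQL itself.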
Enforce data governance with Starlake data contracts
Keep your lakehouse from becoming a data swamp: automated testing, schema enforcement, and validation and transformation rules ensure data integrity and consistency across all stages.
- Schema enforcement ensuring data consistency
- Transformation logic governed by rules
- Comprehensive documentation of all contracts
- Data availability, freshness and quality tracked and aligned with SLAs
Commitments on data availability and freshness
Tracking Service Level Agreements (SLAs) is crucial for ensuring that data services meet the expected standards of quality, availability, and performance as agreed upon between data producers and consumers.
- Clear availability, freshness, accuracy and completeness metrics
- Real time monitoring, logging and auditing
- Historical analysis to identify recurring issues
Single codebase, multiple deployment options
Deploy Starlake on your own on-premise or serverless cloud infrastructure. When you want to focus on your business, use our SaaS offering.
- Run Starlake as a serverless SaaS offering
- Run Starlake as a Docker image on your own infrastructure
- Ultra-light footprint: no need for a dedicated cluster or database server
How it works
Experience the power of the all-in-one, user-friendly UI and declarative YAML
Load
Transform
Test
Ready to Transform Your Data Workflow?
Join the waitlist for Starlake Cloud and be among the first to experience the future of data analytics.
Frequently Asked Questions
- In addition to Transformations, Starlake also covers data Extraction and data Loading.
- No more `jinja ref` macros. Write plain SQL and the Starlake Cloud SQL parser will infer the dependencies and apply substitutions when required.
- In addition to the browser-based YAML editor, Starlake Cloud provides a user-friendly UI to build and deploy your data pipelines.
- Starlake Cloud natively generates Airflow DAGs and Dagster pipelines.
- Not only does Starlake Cloud work as a SaaS service, it can also be installed on your laptop, your cloud, or your on-premise infrastructure.
- Starlake Cloud runs your tests in-memory, not on your data warehouse, enabling you to run your unit tests locally and in your CI, reducing costs and speeding up development.
Starlake Cloud is built on top of the Starlake UI and the Starlake Core Open Source Software project.
- Running Starlake Cloud on your own laptop is free, period.
- Running Starlake Cloud on your own cloud or on-premise servers requires a license.
- Starlake Cloud is always free for all read-only users.
- Starlake Cloud subscriptions, whether on starlake.ai or installed on your company servers, are licensed on a per-developer basis. Pricing will be per developer per month.
We provide two levels of support.
- Community support is available on our GitHub repository and Slack channel.
- Enterprise support is available through a subscription model.
- We also provide training and professional services to help you with your data projects.
Even though Starlake is based on YAML and versioned with Git, you don't need to know anything about YAML or Git to use it, thanks to the user-friendly Starlake UI.
- Although you can use the YAML editor, you can also use the advanced UI that manages the YAML for you.
- Git stays behind the scenes; you don't need to know it to use Starlake.