Great Expectations: Always know what to expect from your data.
Great Expectations helps data teams eliminate pipeline debt through data testing, documentation, and profiling.
Software developers have long known that testing and documentation are essential for managing complex codebases. Great Expectations brings the same confidence, integrity, and acceleration to data science and data engineering teams.
See "Down with Pipeline Debt!" for an introduction to the philosophy of pipeline testing:
https://medium.com/@expectgreatdata/down-with-pipeline-debt-introducing-great-expectations-862ddc46782a

Key features:
- Expectations, or assertions for data: the workhorse abstraction in Great Expectations, covering all kinds of common data issues
- Batteries-included data validation
- Tests are docs and docs are tests: many data teams struggle to maintain up-to-date data documentation. Great Expectations solves this problem by rendering Expectations directly into clean, human-readable documentation
- Automated data profiling: wouldn't it be great if your tests could write themselves? Run your data through one of Great Expectations' data profilers and it will automatically generate Expectations and data documentation
- Pluggable and extensible
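
To make the ideas in the feature list concrete, here is a minimal, standard-library-only sketch of what an "expectation" could look like, how the same object could serve as both a test and a line of documentation, and how a naive profiler might generate expectations from a sample. Every name here (`Expectation`, `expect_not_null`, `expect_between`, `profile`, `render_docs`) is hypothetical and chosen for illustration; this is not the Great Expectations API, which is considerably richer.

```python
# Illustrative sketch only: all names are hypothetical, not the Great Expectations API.
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Expectation:
    """A declarative assertion about one column of tabular data."""
    column: str
    description: str              # human-readable text, reused as documentation
    check: Callable[[Any], bool]  # returns True when a single value is acceptable

    def validate(self, rows: list[dict]) -> dict:
        # Collect every value that violates the expectation.
        unexpected = [row.get(self.column) for row in rows
                      if not self.check(row.get(self.column))]
        return {"success": not unexpected, "unexpected_values": unexpected}


def expect_not_null(column: str) -> Expectation:
    return Expectation(column, f"values in `{column}` must not be null",
                       lambda v: v is not None)


def expect_between(column: str, low: float, high: float) -> Expectation:
    return Expectation(column,
                       f"values in `{column}` must be between {low} and {high}",
                       lambda v: v is not None and low <= v <= high)


def profile(rows: list[dict], column: str) -> list[Expectation]:
    """Naive profiler: derive expectations from the values seen in a sample."""
    values = [row[column] for row in rows if row.get(column) is not None]
    suite = []
    if len(values) == len(rows):
        # No nulls observed in the sample, so expect none in future batches.
        suite.append(expect_not_null(column))
    if values:
        suite.append(expect_between(column, min(values), max(values)))
    return suite


def render_docs(suite: list[Expectation]) -> str:
    """Render the expectations used for testing as plain-text documentation."""
    return "\n".join(f"- {e.description}" for e in suite)


# Profile a trusted sample, render docs, then validate a new batch.
sample = [{"age": 31}, {"age": 47}, {"age": 25}]
suite = profile(sample, "age")
docs = render_docs(suite)

new_batch = [{"age": 40}, {"age": None}, {"age": 130}]
results = [e.validate(new_batch) for e in suite]
```

Because the description and the check live on the same object, the documentation cannot drift away from the tests, which is the point of "tests are docs and docs are tests". A real profiler would be more cautious than taking the observed minimum and maximum as hard bounds.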
https://github.com/great-expectations/great_expectations

#python #ds #docops