# Assessing Data Quality

One of the biggest difficulties in ML is dealing with messy data. It is a common, recurring problem.

### CleanLab

[CleanLab](https://docs.cleanlab.ai/stable/index.html) is a product that uses statistical methods to find and clean up problems in data and labels. I still need to read more about exactly how it works.

They have a [tutorial](https://docs.cleanlab.ai/stable/tutorials/text.html) on using their system to clean up text data before processing.
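
To make "statistical methods for cleaning labels" a bit more concrete, here is a minimal sketch of the general idea as I understand it: compare each example's given label against a classifier's predicted probabilities, and flag the ones where the model confidently prefers a different class. This is a simplified illustration in plain Python, not CleanLab's implementation or API. The function name `flag_suspect_labels`, the per-class mean threshold, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def flag_suspect_labels(labels, pred_probs):
    """Return indices of examples whose given label looks doubtful.

    labels: list of int class ids (the possibly noisy labels).
    pred_probs: list of per-class probability lists from any classifier,
                ideally computed out-of-sample (e.g. via cross-validation).
    Hypothetical helper for illustration only.
    """
    # Per-class threshold: the mean predicted probability of class k
    # over the examples that are labeled k.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for y, probs in zip(labels, pred_probs):
        sums[y] += probs[y]
        counts[y] += 1
    thresholds = {k: sums[k] / counts[k] for k in counts}

    suspects = []
    for i, (y, probs) in enumerate(zip(labels, pred_probs)):
        # Classes whose probability clears that class's own threshold.
        confident = [k for k in thresholds if probs[k] >= thresholds[k]]
        if confident:
            best = max(confident, key=lambda k: probs[k])
            # Flag the example if the most likely confident class
            # disagrees with the given label.
            if best != y:
                suspects.append(i)
    return suspects


# Toy data: example 2 is labeled 0 but the model strongly predicts 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]
print(flag_suspect_labels(labels, pred_probs))  # prints [2]
```

Flagged indices are candidates for manual review rather than confirmed errors. A real workflow would use the library itself and out-of-sample predictions from a properly trained model.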