Assessing Data Quality
One of the biggest difficulties in ML is dealing with messy data, such as mislabeled examples, duplicates, and missing values. It is a common, recurring problem.
CleanLab
CleanLab is a product that uses statistical methods to detect and clean up problems in data and labels. I still need to read more about exactly how it works.
They have some tutorials on how to use their system to clean up text for processing here
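Until I have read up on the details, here is a rough sketch of the general idea as I understand it: use a classifier's predicted probabilities to flag examples whose given label disagrees with what the model is confident about. This is not CleanLab's API or its actual implementation. The function name `flag_suspect_labels`, the per-class threshold rule, and the toy data are all my own assumptions for illustration.

```python
def flag_suspect_labels(labels, pred_probs):
    """Return indices of examples whose given label looks doubtful.

    Hypothetical helper, not part of CleanLab.
    labels: list of int class ids (the possibly noisy labels).
    pred_probs: one list of per-class probabilities per example, from any
        classifier (ideally out-of-sample, e.g. via cross-validation).
    """
    num_classes = len(pred_probs[0])

    # Assumed rule: the threshold for class k is the mean predicted
    # probability of k over the examples that are labeled k.
    thresholds = []
    for k in range(num_classes):
        probs_k = [p[k] for p, y in zip(pred_probs, labels) if y == k]
        thresholds.append(sum(probs_k) / len(probs_k) if probs_k else 1.0)

    suspects = []
    for i, (p, y) in enumerate(zip(pred_probs, labels)):
        # Classes the model is "confident" about for this example.
        confident = [k for k in range(num_classes) if p[k] >= thresholds[k]]
        if confident:
            best = max(confident, key=lambda k: p[k])
            # Flag the example if the confident class is not the given label.
            if best != y:
                suspects.append(i)
    return suspects


# Toy data: the third example is labeled 0 but the model strongly prefers 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]
print(flag_suspect_labels(labels, pred_probs))
```

Flagged indices are only candidates for manual review. A real tool would also need to handle class imbalance and rank the issues by severity.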