[Solved] Validating data with Pandas DataFrame [closed]


After going through the Pandas documentation, I found a way to validate the data.

Suppose you have a custom validation function:

def validate_rating(rating):
    """
    Description:
        Validate that a hotel rating is an integer from 0 to 5.

    Args:
        rating (str): rating to be validated

    Returns:
        True if the rating is valid
        False if the rating is not valid
    """

    try:
        rating = int(rating)
        if rating < 0 or rating > 5:
            return False
        return True
    except (TypeError, ValueError):
        # int() raises ValueError for non-numeric text or NaN, TypeError for None
        return False

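Once you’ve read the CSV into a DataFrame, apply the function to the column. The result is a boolean Series with one True/False value per row: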

import pandas as pd

df = pd.read_csv(input_file)
df['stars'].apply(validate_rating)  # boolean Series: True where the rating is valid
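
The boolean result can also be used as a mask to separate good rows from bad ones. Here is a minimal sketch of that idea, not part of the original answer: the sample CSV text, the `hotel` column, and the names `is_valid`, `valid_rows`, and `invalid_rows` are made up for illustration, and the validation function is restated in compact form so the snippet runs on its own.

```python
import io

import pandas as pd


def validate_rating(rating):
    """Return True if rating converts to an integer from 0 to 5."""
    try:
        return 0 <= int(rating) <= 5
    except (TypeError, ValueError):
        return False


# Hypothetical sample data standing in for input_file.
csv_text = "hotel,stars\nAlpha,4\nBravo,7\nCharlie,abc\nDelta,0\nEcho,\n"
df = pd.read_csv(io.StringIO(csv_text))

# Boolean Series: True where the rating passed validation.
is_valid = df["stars"].apply(validate_rating)

# Boolean indexing keeps the rows where the mask is True; ~ inverts the mask.
valid_rows = df[is_valid]
invalid_rows = df[~is_valid]

print(invalid_rows)
```

With this sample, the out-of-range, non-numeric, and empty ratings all end up in `invalid_rows`, which makes it easy to report or drop them before further processing.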
