fr.splayce.rel

package cleaners

Value Members

  1. object AllWhiteSpaceCleaner extends Cleaner

    Replace each run of Unicode whitespace characters with a single space.

  2. object CamelCaseSplitFilter extends Cleaner

    Split CamelCase words.

  3. object DiacriticCleaner extends Cleaner

    Pseudo ASCII folding: remove diacritical marks on characters, and handle some common variants and ligatures.

  4. object DiacriticFolder extends AnyRef

  5. object DoubleQuoteNormalizer extends Cleaner

    Normalize frequent Unicode double quotes to ASCII quotation mark U+0022 / ".

  6. object FullwidthNormalizer extends Cleaner

    Normalize CJK Fullwidth characters to their ASCII equivalents.

  7. object IdentityCleaner extends Cleaner

    No-op cleaner.

  8. object LineSeparatorNormalizer extends Cleaner

    Normalize all Unicode line breaks and vertical tabs to ASCII new line U+000A / \n.

  9. object LowerCaseFilter extends Cleaner

    Transform text to lowercase.

  10. object QuoteNormalizer extends Cleaner

    Combine SingleQuoteNormalizer and DoubleQuoteNormalizer.

  11. object SingleQuoteNormalizer extends Cleaner

    Normalize frequent Unicode single quotes / apostrophes to ASCII apostrophe U+0027 / '.

  12. object TrimFilter extends Cleaner

    Trim text.

  13. object WhiteSpaceCleaner extends Cleaner

    Replace each run of regular whitespace (\s+) with a single space.

  14. object WhiteSpaceNormalizer extends Cleaner

    Normalize all Unicode spaces and horizontal tabs to ASCII spaces U+0020.
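
The descriptions above can be illustrated with a minimal sketch. It does not use the library's Cleaner type, whose signature is not shown on this page. Every name below (CleanerSketch, collapseWhitespace, normalizeLineSeparators, stripDiacritics, foldFullwidth, pipeline) is hypothetical, and each function is a plain String => String built on the JDK only. The actual cleaners may cover more characters or behave differently at the edges.

```scala
import java.text.Normalizer
import java.util.Locale

// Hypothetical stand-ins for a few of the cleaners listed above.
// These are plain functions, NOT the library's own implementations.
object CleanerSketch {

  // Similar in spirit to WhiteSpaceCleaner: collapse each \s+ run to one space.
  val collapseWhitespace: String => String =
    _.replaceAll("\\s+", " ")

  // Similar in spirit to LineSeparatorNormalizer: the regex \R (Java 8+)
  // matches \r\n, \n, \r, vertical tab, form feed, U+0085, U+2028 and U+2029.
  val normalizeLineSeparators: String => String =
    _.replaceAll("\\R", "\n")

  // Similar in spirit to DiacriticCleaner: decompose to NFD, then drop
  // combining marks. Assumption: ligatures and variants are not handled here.
  val stripDiacritics: String => String =
    s => Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}", "")

  // Similar in spirit to FullwidthNormalizer: fullwidth forms U+FF01..U+FF5E
  // sit at a fixed offset of 0xFEE0 from ASCII U+0021..U+007E.
  val foldFullwidth: String => String =
    _.map(c => if (c >= 0xFF01 && c <= 0xFF5E) (c - 0xFEE0).toChar else c)

  // Stand-ins for TrimFilter and LowerCaseFilter.
  val trim: String => String = _.trim
  val lowerCase: String => String = _.toLowerCase(Locale.ROOT)

  // Cleaners of this kind compose naturally as a chain of functions.
  val pipeline: String => String =
    stripDiacritics andThen collapseWhitespace andThen trim andThen lowerCase
}
```

In this sketch the order of the chain matters: whitespace is collapsed before trimming, so leading and trailing runs become single spaces that trim then removes.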