Regex Filter Transform
The Regex Filter transform applies regular expression patterns to text columns in your dataset, so you can search, replace, or transform text based on specific patterns.
Basic Usage
To apply a regex filter to your dataset:
- Select the Regex Filter transform from the transform menu.
- Choose one or more text columns to filter.
- Enter your regex pattern.
- Specify the replacement text or function.
- Apply the transformation.
Configuration Options
Basic Options
- Select Column(s): Choose one or more text columns to apply the regex filter to. Only string-type columns are available for selection.
- Regex Pattern: Enter the regular expression pattern to match within the selected column(s).
- Replacement: Specify the text that replaces each match of the pattern. This can be a static string or a more complex replacement pattern.
If you're not familiar with regex patterns, consider using our AI assistant to help formulate the appropriate regex for your needs.
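Conceptually, the three options map onto a per-cell substitution. The sketch below uses Python's `re` module as a stand-in. The transform's actual regex engine and replacement syntax are not specified on this page, so the helper name `regex_filter`, the list-of-dicts row layout, and the `\1`-style group references are assumptions for illustration only.

```python
import re

def regex_filter(rows, columns, pattern, replacement):
    """Return new rows with `pattern` replaced in each selected column.

    Hypothetical helper: models the transform's three options
    (columns, pattern, replacement) using Python's `re` module.
    """
    compiled = re.compile(pattern)
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for col in columns:
            value = new_row.get(col)
            if isinstance(value, str):  # only string values are transformed
                new_row[col] = compiled.sub(replacement, value)
        result.append(new_row)
    return result

rows = [{"id": 1, "code": "AB-123"}, {"id": 2, "code": "CD-456"}]

# Static replacement: every match becomes the same literal text.
static = regex_filter(rows, ["code"], r"-", "_")

# Pattern replacement: \2 and \1 refer to captured groups (Python syntax).
swapped = regex_filter(rows, ["code"], r"([A-Z]+)-(\d+)", r"\2-\1")
```

Group-reference syntax differs between regex engines (for example `\1` versus `$1`), so confirm which form the transform accepts by previewing the result.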
Examples
Here are some examples of how to use the Regex Filter transform:
Example 1: Masking Phone Numbers
Input Dataset:
| Name | Phone |
|---|---|
| Alice | 123-456-7890 |
| Bob | (987) 654-3210 |
| Carol | 555.123.4567 |
Configuration:
- Select Column(s): Phone
- Regex Pattern: `\d`
- Replacement: `X`
Result:
| Name | Phone |
|---|---|
| Alice | XXX-XXX-XXXX |
| Bob | (XXX) XXX-XXXX |
| Carol | XXX.XXX.XXXX |
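The same masking can be reproduced outside the tool to check the pattern. This is a minimal sketch using Python's `re` module, which is an assumption here and may not be the engine the transform uses.

```python
import re

phones = ["123-456-7890", "(987) 654-3210", "555.123.4567"]

# \d matches a single digit; every digit is replaced and punctuation is kept.
masked = [re.sub(r"\d", "X", phone) for phone in phones]

for original, result in zip(phones, masked):
    print(f"{original} -> {result}")
```

Because the pattern matches one digit at a time, the separators and overall layout of each number are preserved.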
Example 2: Standardizing Email Domains
Input Dataset:
| Employee | Email |
|---|---|
| John | john@olddomain.com |
| Sarah | sarah@anotherdomain.net |
| Mike | mike@olddomain.org |
Configuration:
- Select Column(s): Email
- Regex Pattern: `@.*$`
- Replacement: `@newdomain.com`
Result:
| Employee | Email |
|---|---|
| John | john@newdomain.com |
| Sarah | sarah@newdomain.com |
| Mike | mike@newdomain.com |
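A rough equivalent of this example in Python's `re` module (an assumed stand-in for the transform's engine), using addresses modeled on the input table:

```python
import re

emails = [
    "john@olddomain.com",
    "sarah@anotherdomain.net",
    "mike@olddomain.org",
]

# @.*$ matches from the first "@" through the end of the value,
# so the whole domain part is replaced regardless of what it was.
standardized = [re.sub(r"@.*$", "@newdomain.com", email) for email in emails]

for original, result in zip(emails, standardized):
    print(f"{original} -> {result}")
```

Note that this pattern rewrites every domain, not only the old one. That is the intent in this example, but a narrower pattern is needed if some addresses should stay unchanged.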
Regex filters are powerful but can also be complex. Always preview your results to ensure the transformation behaves as expected, especially when working with critical data.
Best Practices
- Test Your Regex: Before applying a regex filter to your entire dataset, test it on a small sample to ensure it produces the desired results.
- Be Specific: Create regex patterns that are as specific as possible to avoid unintended matches.
- Consider Edge Cases: Think about potential edge cases in your data that might produce unexpected results with your regex pattern.
- Preserve Original Data: When possible, create new columns for transformed data rather than overwriting existing ones, especially when working with sensitive information.
- Document Your Patterns: Keep a record of the regex patterns you use and their purposes for future reference and reproducibility.
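Several of these practices can be tried out in a few lines before configuring the transform. The sketch below is illustrative only: it assumes Python's `re` module, a hypothetical `preview` helper, and a hypothetical `Email_new` column name, none of which come from the transform itself.

```python
import re

def preview(values, pattern, replacement, sample_size=5):
    """Return (original, transformed) pairs for the first few values."""
    compiled = re.compile(pattern)
    return [(v, compiled.sub(replacement, v)) for v in values[:sample_size]]

rows = [
    {"Employee": "John", "Email": "john@olddomain.com"},
    {"Employee": "Sarah", "Email": "sarah@anotherdomain.net"},
    {"Employee": "Mike", "Email": "mike@olddomain.org"},
]

# Be specific: match only the domain that should change, with the dot
# escaped and the pattern anchored to the end of the value.
pattern = r"@olddomain\.(?:com|org)$"
replacement = "@newdomain.com"

# Test on a small sample before touching the full dataset.
sample = preview([row["Email"] for row in rows], pattern, replacement, sample_size=2)

# Preserve the original: write the result to a new column.
for row in rows:
    row["Email_new"] = re.sub(pattern, replacement, row["Email"])
```

Here the narrower pattern leaves `sarah@anotherdomain.net` untouched, and the original `Email` values remain available for comparison.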