Abstract
Diet Parselantro is a data annotation tool that helps users parse, label, and organize textual data into a taxonomy. It integrates with the Python Jupyter notebook environment, a web-based development environment commonly used by data scientists, researchers, and analysts. Diet Parselantro uses regular expressions (regexes) to parse and label data. Users can define new categories and specify a corresponding regex for each. The tool automatically categorizes each piece of textual data according to the regexes it matches. Categories and their corresponding regexes can be hierarchical and deeply nested. These hierarchies are visualized in an icicle plot, allowing the user to quickly overview, navigate, and refine their categories. The final output of the Diet Parselantro tool is a dataset in which each data row is labeled with the category (or categories) it matches. This dataset can be downloaded and used by the data analyst in further analysis tasks.
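The regex-based labeling described above can be illustrated with a minimal sketch. This is not Diet Parselantro's actual API or implementation; the names `Category`, `match_categories`, and `label_rows`, the example patterns, and the rule that a child category is tested only when its parent matches are all assumptions made for illustration, using only Python's standard `re` module.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Category:
    """Hypothetical category node: a name, a regex, and nested subcategories."""
    name: str
    pattern: str
    children: list = field(default_factory=list)


def match_categories(text, categories, prefix=""):
    """Return the path of every category whose regex matches `text`.

    Assumption: a subcategory is tested only if its parent matched.
    """
    labels = []
    for cat in categories:
        if re.search(cat.pattern, text, flags=re.IGNORECASE):
            path = f"{prefix}/{cat.name}" if prefix else cat.name
            labels.append(path)
            labels.extend(match_categories(text, cat.children, path))
    return labels


def label_rows(rows, categories):
    """Label each row with the (possibly empty) list of matching categories."""
    return [{"text": row, "labels": match_categories(row, categories)} for row in rows]


# Illustrative two-level taxonomy (made-up categories and patterns).
taxonomy = [
    Category("fruit", r"\b(apple|orange|lemon)s?\b", [
        Category("citrus", r"\b(orange|lemon)s?\b"),
    ]),
    Category("price", r"\$\d+(\.\d{2})?"),
]

rows = ["Lemons cost $2.50 each", "An apple a day", "No match here"]
labeled = label_rows(rows, taxonomy)

for item in labeled:
    print(item["text"], "->", item["labels"])
```

In this sketch a row can carry several labels at once, and a row that matches nothing keeps an empty label list, which mirrors the labeled dataset the abstract describes as the tool's output.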
Exploratory License
Not eligible for exploratory license
Market Sector
Data Sciences