Senior Data Scientist Serena Peruzzo has published a blog post on the OpenDataScience website about the challenges of applying natural language processing (NLP) and machine learning (ML) to unstructured data such as legislative texts.
From the blog:
“In absence of labeled data, the first stage of the analysis, where the legislation is parsed and the burdens are extracted, relies on a lightweight ontology and some business rules to extract potential burdens from the text.”
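As a rough illustration of what such a rule-based first stage could look like, here is a minimal Python sketch. It is not the authors' implementation: the `ONTOLOGY` categories, the trigger phrases, the `OBLIGATION` rule and the `extract_burdens` function are all hypothetical, chosen only to show how a small ontology plus a business rule can flag candidate burdens without labeled data.

```python
import re

# Hypothetical mini-ontology: burden category -> trigger phrases.
# Categories and phrases are illustrative assumptions, not taken from the blog.
ONTOLOGY = {
    "reporting": ["submit a report", "file a return", "notify"],
    "licensing": ["obtain a licence", "apply for a permit"],
    "record_keeping": ["keep records", "maintain a register"],
}

# Hypothetical business rule: a sentence is a candidate burden only if it
# contains an obligation marker such as "shall" or "must".
OBLIGATION = re.compile(r"\b(shall|must|is required to)\b", re.IGNORECASE)


def extract_burdens(text):
    """Return (category, sentence) pairs for sentences that look like burdens."""
    # Naive sentence split: break after "." or ";" followed by whitespace.
    sentences = re.split(r"(?<=[.;])\s+", text.strip())
    found = []
    for sentence in sentences:
        # Apply the business rule first; skip sentences with no obligation marker.
        if not OBLIGATION.search(sentence):
            continue
        lowered = sentence.lower()
        # Then look up ontology phrases to assign a candidate category.
        for category, phrases in ONTOLOGY.items():
            if any(phrase in lowered for phrase in phrases):
                found.append((category, sentence))
    return found
```

In a sketch like this, a sentence such as "Every operator shall keep records of each sale." would be flagged under the assumed `record_keeping` category, while a permissive sentence such as "The Minister may issue guidelines." would be skipped because it carries no obligation marker.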
Peruzzo and Bardess Lead Data Scientist Daniel Parton will speak about their work applying NLP and ML to legislative texts of the Government of Ontario, Canada. The work analyzes legal documents, including statutes and regulations, to identify and categorize legislative burdens.
Read the full blog here.