Unstructured data is the most voluminous form of data in the world, and analysts rarely receive it in perfect condition for processing. In other words, textual data needs to be cleaned, transformed, and enhanced before value can be derived from it. Unstructured Data Analysis: Entity Resolution and Regular Expressions in SAS® shows SAS programmers of virtually all skill levels how to harness the power of regular expressions and entity resolution within the SAS programming language for a wide array of everyday unstructured data analysis applications.
This book uses a practical, examples-based approach to present techniques for unstructured data processing and provides the foundational information needed to perform advanced applications. Beginning with regular expressions in SAS, readers progress to the building blocks of Entity Resolution Analytics, including entity extraction, ETL (extract, transform, load), entity resolution, entity network mapping and analysis, and entity management. Filled with motivating examples and helpful guidelines, this book is a critical reference for every analytics professional who works with unstructured data.
Table of Contents
Chapter 1: Getting Started with Regular Expressions
Chapter 2: Using Regular Expressions in SAS
Chapter 3: Entity Resolution Analytics
Chapter 4: Entity Extraction
Chapter 5: Extract, Transform, Load
Chapter 6: Entity Resolution
Chapter 7: Entity Network Mapping and Analysis
Chapter 8: Entity Management
Appendix A: Additional Resources