Text Mining

Text mining, or digital text analysis, is the process of using digital tools to search for, extract and analyse text data.

"Text Mining is the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources... The difference between regular data mining and text mining is that in text mining the patterns are extracted from natural language text rather than from structured databases of facts." - from What is Text Mining? by Marti Hearst

Why text mining? Automated search strategies can provide an overview of patterns and tendencies in large volumes of text. This can yield insights that would be difficult or time-consuming, or both, to obtain through conventional qualitative methods.

  • Text mining types
    • A presentation of various text mining methods, such as plotting lexical dispersion and frequency over time, and searching for concordances.
  • Text mining tools and software bundles
    • A presentation of leading programming languages and tools for text mining, with suggestions for how to get started with them.
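
To give a rough sense of what two of the methods mentioned above involve, here is a minimal Python sketch of word frequency counting and concordance (keyword-in-context) search. It uses only the standard library. The sample text, the function names `tokenize` and `concordance`, and the simple regular-expression tokenizer are assumptions made for illustration; dedicated text mining tools handle tokenization and context display far more carefully.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence.

    `width` is the number of tokens of context kept on each side.
    """
    keyword = keyword.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Hypothetical sample text, standing in for a much larger corpus.
sample = (
    "Text mining extracts patterns from text. "
    "Data mining extracts patterns from structured databases."
)

tokens = tokenize(sample)
frequencies = Counter(tokens)          # how often each word occurs
hits = concordance(tokens, "mining", width=2)

print(frequencies.most_common(3))
for left, word, right in hits:
    print(f"{left:>12} [{word}] {right}")
```

On a real corpus the same idea scales up: frequency counts give an overview of which terms dominate, while concordance lines let you read each occurrence of a term in its immediate context.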