Tools and software bundles for text mining

There are numerous tools, software packages and platforms that can be used for text mining. A simple Google search for the keywords “text mining tools” will give you a current view of what is available and what is popular. A Wikipedia article lists text mining tools and may be a good starting point. Below are examples of programs, programming languages and software that we have selected and that may prove useful.

Voyant Tools

Voyant Tools is a web-based digital text analysis tool that is both functional and user-friendly.

In Voyant Tools, you can upload files from your computer. Voyant Tools accepts a number of file formats, such as docx, pdf and txt. Press "upload" and select the files you want to upload. Within a few seconds, the application registers the files as a corpus, and you immediately see information about it and can run a number of analyses.

Screenshot of Voyant Tools

You may read more about Voyant Tools under our section on getting started with tools and software bundles.

AntConc

AntConc is text analysis software that offers a relatively simple entry point to several types of text analysis.

Screenshot of how to download and install AntConc.
AntConc's front page

You may read more about AntConc under our section on getting started with tools and software bundles.

Jupyter Notebook

Jupyter Notebook lets you carry out a wide range of data analysis tasks, including text mining and digital text analysis.

Jupyter Notebook is an open-source application developed by Project Jupyter. Open source describes software whose source code anyone may freely obtain and modify. With proprietary software, such freedoms are usually restricted to protect financial interests, copyright and other interests. Jupyter Notebook allows you to write and run code on the fly, chiefly in Python, though it can be configured to accept other languages as well. To “run code” means that the program executes the commands you have written.
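To make "running code" concrete, here is a minimal sketch of what a single notebook cell might contain. The variable names and the sample sentence are made up for illustration. When the cell is run, Python executes each line in order and shows the printed result directly below the cell.

```python
# Contents of one hypothetical notebook cell.
# The sentence is invented sample data, not taken from any real corpus.
sentence = "Text mining turns text into data"

# Split the sentence on whitespace to get a list of words.
words = sentence.split()

# Count the words and display the result below the cell.
word_count = len(words)
print(word_count)
```

You can then change the sentence and run the cell again to see a new result straight away, which is what writing and running code on the fly means in practice.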

You may read more about Project Jupyter and the Jupyter Notebook, as well as related products and their applications, on the Project Jupyter website. Jupyter Notebooks are used in just about any field where data is involved. A gallery of interesting applications and uses is available on GitHub.

Jupyter Notebook is not in and of itself a tool for text mining, but an application in which programs designed for text mining can be written and run. The Norwegian National Library has made large parts of its digitised collections available for analysis through Jupyter Notebook. Some examples are available at the Norwegian National Library DH-Lab.
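As an illustration of the kind of small text mining program that could be written and run in a notebook, the sketch below counts the most frequent words in a short text. It uses only Python's standard library. The function name `word_frequencies` and the sample text are hypothetical, and the code is not taken from the National Library's notebooks.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=3):
    """Return the top_n most frequent words in text as (word, count) pairs."""
    # Lowercase the text so that "The" and "the" are counted as the same word.
    # \w+ matches runs of letters, digits and underscores, which is a rough
    # but serviceable definition of a "word" for a sketch like this.
    words = re.findall(r"\w+", text.lower())
    return Counter(words).most_common(top_n)


# Invented sample text, used only to demonstrate the function.
sample = "The cat sat on the mat. The mat was flat."
print(word_frequencies(sample, top_n=2))
```

A real analysis would usually also remove very common function words such as "the" and would work on much larger texts, but the basic pattern of splitting a text into words and counting them is the same.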

There is an introduction to Jupyter Notebook’s basic features under our section on getting started with tools and software bundles.

R and RStudio

R is both a piece of software and a programming language, while RStudio is the graphical user interface you (generally) work in. Together, R and RStudio give you the tools required for text analysis, statistical analysis and graphical visualisation of data.

The RStudio interface

You may read more about R under our section on getting started with tools and software bundles.

Published Sep. 7, 2021 10:42 AM - Last modified June 28, 2022 12:28 PM