Getting started with tools and software bundles

Below you will find an introduction to installing and using the tools and software bundles we have elected to promote.

Voyant Tools – an introduction

Voyant Tools is a web application – no installation is necessary.

There are multiple ways to collect text for analysis in Voyant. You may, for instance, build a corpus using documents you have stored on your personal computer (see how in the presentation of Voyant Tools under tools and software bundles for text mining).

When you’ve uploaded a corpus to Voyant, you’ll see a fair bit of information about the corpus, such as a word cloud showing the most frequent words, an overview of the total number of words and symbols used, how many documents make up the corpus, and so on.

A “stop word” filter is activated automatically. This means that the most frequent words in the corpus, generally function words, are filtered out. The filter can be disabled or swapped out for another one, such as a Norwegian stop word filter.
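For readers curious about what a stop word filter does in principle, here is a minimal Python sketch. It is not Voyant's implementation; the tiny stop word list and the sample sentence are made up for illustration, and real stop word lists are much longer.

```python
from collections import Counter
import re

# Hypothetical mini stop word list; real lists contain hundreds of words.
STOP_WORDS = {"the", "a", "of", "and", "in", "is", "on"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def word_frequencies(text, stop_words=frozenset()):
    """Count tokens, skipping any token found in stop_words."""
    return Counter(t for t in tokenize(text) if t not in stop_words)

sample = "The cat sat on the mat and the dog sat in the garden."
unfiltered = word_frequencies(sample)
filtered = word_frequencies(sample, STOP_WORDS)

print(unfiltered.most_common(3))  # function words dominate the top of the list
print(filtered.most_common(3))    # content words become visible
```

Without the filter, a word like "the" tops the list; with it, content words such as "sat" surface instead, which is why word clouds are usually drawn from filtered counts.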

In the lower right corner of the Voyant Tools application you will find a window with an overview of keywords and/or concordances (see concordance analysis under types of text mining). This window can be resized. The default view consists of concordances for the word(s) that you are interested in.
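The idea behind a concordance (also called keyword in context) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name, the sample text and the word-based context width are invented here and do not reflect how Voyant builds its view.

```python
import re

def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for every hit.

    width is the number of words shown on each side of the keyword.
    """
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits

sample = "The old library lends books. A new library opened near the old harbour."
for left, kw, right in concordance(sample, "library", width=2):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>20} | {kw} | {right}")
```

Lining the hits up around the keyword makes it easy to scan how a word is used across the corpus.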

In the top right corner of the application, you can see an overview of more “corpus tools”, such as “collocates” and “topics” (see collocation analysis and topic modelling under types of text mining).
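To give a rough idea of what a collocation tool counts, the sketch below tallies the words that appear within a small window around a keyword. It is a simplified illustration with an invented function name and sample text: it uses raw co-occurrence counts only, whereas dedicated tools generally offer more refined measures.

```python
from collections import Counter
import re

def collocates(text, keyword, window=2):
    """Count the words occurring within `window` words of the keyword."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            start = max(0, i - window)
            # Words before and after the hit, excluding the hit itself.
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts

sample = "Strong tea and strong coffee. She likes strong tea in the morning."
print(collocates(sample, "strong", window=1).most_common(3))
```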

See Voyant Help for more information, or contact us.

AntConc – installation and introduction

Download and install AntConc.

With AntConc you can perform concordance analysis, collocation analysis and corpus comparison (see more under types of text mining).
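Corpus comparison can be illustrated with a small Python sketch that ranks words by how much more frequent they are in one text than in another. This is a deliberately simplified stand-in: dedicated tools usually rely on statistical measures rather than a plain difference in relative frequency, and the function names and sample texts below are made up.

```python
from collections import Counter
import re

def relative_frequencies(text):
    """Map each word to its share of all tokens in the text."""
    tokens = re.findall(r"\w+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(tokens).items()}

def compare(corpus_a, corpus_b):
    """Rank words by how much more frequent they are in corpus_a."""
    freq_a = relative_frequencies(corpus_a)
    freq_b = relative_frequencies(corpus_b)
    words = set(freq_a) | set(freq_b)
    diffs = {w: freq_a.get(w, 0.0) - freq_b.get(w, 0.0) for w in words}
    # Largest difference first; ties are broken alphabetically.
    return sorted(diffs.items(), key=lambda item: (-item[1], item[0]))

text_a = "the sea the ship the storm"
text_b = "the farm the field the cow"
for word, diff in compare(text_a, text_b)[:3]:
    print(f"{word}: {diff:+.3f}")
```

Words shared equally by both texts, such as "the" here, end up near zero, while words typical of only one text rise to the top or sink to the bottom of the ranking.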

See also this thorough introduction to AntConc by The Programming Historian.

Contact us if you have any further questions.

Jupyter Notebook – installation and introduction

We recommend installing Jupyter Notebook through Anaconda, a distribution platform for software, modules and more for the programming language Python. This way you automatically get most, if not all, of the basic software and modules that you will need. There are alternative methods of installation, but these are not practical for first-time users.

Once Anaconda has been installed, go to Programs/Applications, then Anaconda, then Launch Jupyter Notebook. Alternatively, use Jupyter Lab under the aforementioned programs. Thirdly, you may simply type “Jupyter” in the program search bar to launch either of these.

For example, to perform a concordance analysis in the Norwegian National Library’s digitized collections, first go to their DH-Lab and download the example notebook for concordance analysis. If this is your first time using their resources, you are advised to have a look at the general information and to visit their introduction pages. Save the file to a known location on your device, then open Jupyter Notebook. A browser page will appear with a view of your device’s files and folders. Navigate to the notebook file that you downloaded and open it.

A Jupyter Notebook consists of two types of cells – code cells and markdown cells.

Code cells are used for machine instructions, or code, that tell the computer how to perform a certain task.

Code cell example
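As a small illustration of what a code cell might contain, the lines below could be typed into a cell and run. The sample sentence and variable names are made up for this sketch.

```python
# A code cell: everything here is Python that runs when the cell is executed.
text = "Text mining turns text into data"  # hypothetical sample sentence
words = text.lower().split()
unique_words = sorted(set(words))

print(len(words), "words,", len(unique_words), "unique")
```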

Markdown cells are used for notes and text that is not part of the machine’s instructions.

Markdown cell example

The code in a code cell does nothing until the cell is run. To run every cell in the notebook, go to the menu in the notebook that you’ve opened, then choose “cell” and “run all”. The regular “run” command runs only the cell(s) that you’ve selected with your mouse cursor. If you change a cell that has already been run, it must be run again for the change to take effect, and any later cells that depend on it must be re-run as well. With this in mind, if you change something that affects the code below, such as a date or keyword, it is advisable to select the “run this cell and below” option rather than running only the changed cell.
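Why later cells need to be re-run can be shown with a short sketch. The comments mark where one notebook cell would end and the next begin; here everything is written as a single script so the effect can be followed from top to bottom. The variable names and keywords are made up for illustration.

```python
# Cell 1: set the search parameter.
keyword = "bibliotek"

# Cell 2: build a result from the parameter (depends on Cell 1).
message = f"Searching for '{keyword}'"

# Now Cell 1 is edited and only that cell is run again.
keyword = "universitet"

# Cell 2 has not been re-run, so its result still reflects the old keyword.
stale_message = message
print(stale_message)

# Running Cell 2 again picks up the change.
message = f"Searching for '{keyword}'"
print(message)
```

A notebook keeps the results of earlier runs in memory, so changing a value does not update anything computed from it until the dependent cells are run again.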

Jupyter Notebook and Python enable you to perform many, many kinds of text analysis – collocation, name recognition, n-grams, concordances and so forth – provided that you have some time and drive to familiarize yourself with Python.
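As a taste of what such an analysis can look like, here is a minimal sketch that counts n-grams (sequences of n consecutive words) using only Python's standard library. The function name and sample text are invented for illustration; for real projects, dedicated text analysis libraries offer far more.

```python
from collections import Counter
import re

def ngrams(text, n=2):
    """Return all sequences of n consecutive words in the text."""
    tokens = re.findall(r"\w+", text.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sample = "to be or not to be"
bigram_counts = Counter(ngrams(sample, 2))
print(bigram_counts.most_common(3))
```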

Contact us for more information or guidance and keep an eye on Carpentry@UiO for workshops in Python, Jupyter and more.

RStudio – installation and introduction

Important! Both R and RStudio must be installed.

RStudio is a workspace in which you can perform various types of text mining such as concordance and collocation analysis, name recognition, topic modelling, corpus comparison and more.

Contact us for more information or guidance and keep an eye on Carpentry@UiO for workshops in Python, Jupyter and more.

Published Sep. 7, 2021 10:42 AM - Last modified June 28, 2022 12:26 PM