
Resources & Materials

Fall 2022 Mini-Course Session Recordings

Class #1

An Introduction to AI and Natural Language Processing

Class #2

Working with Large Datasets

Class #3

An Introduction to Python

Additional Resources

  

Butter

Butter is a free text analysis program that allows you to build your own natural language processing pipelines without having to write a single line of code.

  

ConText

ConText stands for Connections and Texts. ConText supports (a) the construction of network data from natural language text data, a process also known as relation extraction, and (b) the joint analysis of text data and network data.



  

LIWC

Linguistic Inquiry and Word Count (LIWC) is software for analyzing word use. It can be used to study a single individual, groups of people over time, or all of social media.



Gephi

Gephi is the leading visualization and exploration software for all kinds of graphs and networks.

Python



Python is a general-purpose programming language that lets you work quickly and integrate systems more effectively.

Tools

The following tools extend the functionality of Python. You will need to install them on your own computer after you have installed Python.
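Python packages are usually installed with pip from a terminal (for example, `python -m pip install pandas`). As a rough sketch, the snippet below checks which of the tools in this section can already be imported on your computer. The `TOOLS` dictionary and the `installed_tools` function are illustrative names made up for this example, and the import names are assumptions that can differ from the install names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the tools listed in this section. These are the names used
# in `import` statements, which can differ from the names used to install them.
TOOLS = {
    "Django": "django",
    "Flask": "flask",
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "NLTK": "nltk",
}


def installed_tools(tools=TOOLS):
    """Return a dict mapping each tool name to True if it can be imported."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in tools.items()
    }


if __name__ == "__main__":
    for name, found in installed_tools().items():
        print(f"{name}: {'installed' if found else 'not installed'}")
```

Any tool reported as not installed can then be installed with pip before you try the examples for it.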

Django web framework

If you would like to set up complex Python-based projects online, Django is a great tool. One caveat is that you need a hosting package that can host Django projects, and most web hosting packages can't. A nice cheap or free option is PythonAnywhere.

Flask web framework

Like Django, Flask is a web framework that allows you to build web projects using Python. In my opinion, Flask was easier to learn than Django, but it doesn't include as much built-in functionality. For example, Django has user authentication built in. Flask is great for building simple web projects, whereas Django is probably better for building full-featured websites.

Scikit-learn machine learning library

Scikit-learn is a machine learning library for Python. I found it complicated to learn, but there are many resources available online explaining how to do simple machine learning tasks. Depending on what you are doing, machine learning can take a lot of computer processing power. If you are a PC gamer and have an NVIDIA graphics card, it is well suited to complex machine learning tasks, although scikit-learn itself runs mostly on the CPU, so the graphics card matters mainly for deep learning libraries.

Pandas

Pandas is a really great data analysis and manipulation tool. I use this library in my research more than any other Python library. It allows you to read in spreadsheets, CSV and JSON files and easily manipulate them for use in research. I find it much easier to manipulate my data using Pandas than in Excel or SPSS.
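As a minimal sketch of what this looks like in practice, the example below reads a small CSV table and does two common manipulations: filtering rows and averaging by group. The data and the column names (`participant`, `condition`, `score`) are made up for illustration and are inlined so that the example runs without any files. With a real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical study data, inlined so the example needs no external file.
csv_text = """participant,condition,score
p1,control,4
p2,treatment,7
p3,control,5
p4,treatment,9
"""

# pd.read_csv accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

# Keep only the rows with a score above 4.
high_scores = df[df["score"] > 4]

# Average the score within each condition.
mean_by_condition = df.groupby("condition")["score"].mean()

print(high_scores)
print(mean_by_condition)
```

Pandas also provides `pd.read_excel` and `pd.read_json` for spreadsheets and JSON files. Reading Excel files typically requires an additional package to be installed.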

NLTK (Natural Language Toolkit)

The Natural Language Toolkit is a great library for doing natural language processing in Python. It is used quite a bit when doing machine learning tasks with text. 
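A typical first step in this kind of work is splitting text into words and counting them. The sketch below does that for a made-up sentence. It uses `wordpunct_tokenize`, a simple regular-expression tokenizer, on the assumption that you want to avoid the extra data downloads that some other NLTK tokenizers need. It is one simple option among several, not the only way to tokenize with NLTK.

```python
from nltk import FreqDist
from nltk.tokenize import wordpunct_tokenize

# Made-up example text.
text = "The cat sat on the mat. The dog sat on the log."

# Split the text into word and punctuation tokens, then keep only
# alphabetic tokens and lowercase them so "The" and "the" count together.
tokens = [t.lower() for t in wordpunct_tokenize(text) if t.isalpha()]

# FreqDist counts how often each token occurs.
freq = FreqDist(tokens)

print(freq.most_common(3))
```

Word counts like these are often used as input features for machine learning models that work with text.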

Libraries

There are countless libraries for Python that extend its functionality. The biggest repository of those libraries is PyPI (the Python Package Index). It is a bit overwhelming, as there are over 300,000 projects on the site.

If you are looking for a more curated list, here are some options: