LLAW6313 & JDOC6313

General Course Information

1.1 Course details

Course code: LLAW6313 / JDOC6313
Course name: Law as Data
Programme offered under: LLM Programme / JD Programme
Semester: First
Prerequisites / Co-requisites: None
Credit point value: 9 credits / 6 credits

1.2 Course description

Law is created, transmitted, and performed through speech. By summarizing and extracting information from large amounts of text, we can better understand legal behaviour and institutions. This course has three objectives. First, to introduce some of the building blocks for treating legal text as data. Second, to gain some hands-on experience in analysing text data using the Python programming language. Third, to explore how quantitative methods for text analysis can yield social scientific insights. Motivating examples are provided throughout. No knowledge of Python is necessary, although prior exposure to programming will be very helpful. Knowledge of calculus and linear algebra is highly recommended.

Topics to be covered include:

Introduction and Basic Python Syntax. What can text tell us? This module looks at how text data can illuminate questions in law and social science. Basic concepts of Python coding are reviewed, alongside regular expressions.
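
By way of illustration, a regular expression can pull structured references out of free text. The sketch below is a minimal example only: the sample sentence, the statute names, and the helper name find_sections are hypothetical, and the pattern covers just two common ways of writing a section reference.

```python
import re

# Hypothetical sample sentence; the statute names are illustrative only.
text = (
    "The claim arises under section 4 of the Contracts Ordinance "
    "and s. 12 of the Sale of Goods Ordinance."
)

# Match "section 4" or "s. 12" and capture the section number,
# allowing an optional letter suffix such as "33A".
SECTION_PATTERN = re.compile(r"\b(?:section|s\.)\s*(\d+[A-Z]?)", re.IGNORECASE)


def find_sections(passage):
    """Return every section number cited in the passage, in order of appearance."""
    return SECTION_PATTERN.findall(passage)


sections = find_sections(text)
print(sections)
```

A pattern like this will miss other citation styles (for example "ss. 4-6"), which is one reason regular expressions are usually refined against real examples.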

Machine Learning. How do machines learn patterns from data? By optimizing an objective function, of course! We will undertake a high-level survey of machine learning techniques, beginning with linear models such as regression and progressing to non-linear models such as random forests and neural networks.
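
To make "optimizing an objective function" concrete, the following sketch fits a straight line by gradient descent on the mean squared error. It is an illustration under stated assumptions rather than course material: the toy data, the learning rate, the number of steps, and the function names are all chosen for this example.

```python
# Toy data generated from y = 2x + 1 with no noise; purely illustrative.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mean_squared_error(w, b):
    """Objective function: average squared gap between prediction and truth."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(learning_rate=0.05, steps=5000):
    """Learn slope w and intercept b by gradient descent on the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step downhill, against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit_line()
print(round(w, 3), round(b, 3))
print(mean_squared_error(w, b))
```

On this data the fitted slope and intercept should end up close to 2 and 1. More flexible models change the form of the prediction, but the idea of adjusting parameters to reduce an objective carries over.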

Pre-Processing Text. Text comprises many characters and symbols and comes in many shapes and sizes. How can we standardize and clean text documents to make them fit for purpose? We will discuss why and how to pre-process text documents and the consequences of pre-processing choices.
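
A minimal pre-processing pipeline might look like the sketch below. The stop-word list, the sample sentence, and the function name preprocess are hypothetical, and the particular steps (lowercasing, dropping non-letters, removing stop words) are only one of many reasonable choices.

```python
import re

# A tiny, hypothetical stop-word list; real projects typically use a longer one.
STOP_WORDS = {"the", "of", "and", "a", "to", "in", "is", "that"}


def preprocess(document, remove_stop_words=True):
    """Lowercase, replace non-letters with spaces, split on whitespace,
    and optionally drop stop words."""
    lowered = document.lower()
    letters_only = re.sub(r"[^a-z\s]", " ", lowered)
    tokens = letters_only.split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


sample = "The Court held that s. 12 of the Ordinance is VOID."
print(preprocess(sample))
print(preprocess(sample, remove_stop_words=False))
```

Note that stripping non-letters discards the section number "12" and leaves a stray "s" token. For legal text that may be an unacceptable loss, which is the kind of consequence a pre-processing choice can have.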

Bag-of-Words Representations. We have to represent text numerically to perform computational operations on it. How can we turn documents into vectors? In this module, we study one of the most basic numerical representations of text: the bag-of-words (BOW) model. The BOW model is, in essence, a word frequency count that disregards order. Despite its simplicity, the BOW model, including weighted variants like TF-IDF, performs well in many applications.
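
The idea can be sketched in a few lines of Python. The three tokenised documents below are hypothetical, and the TF-IDF weighting shown (raw count multiplied by the log of the inverse document frequency) is one common variant among several.

```python
import math
from collections import Counter

# Three hypothetical, already-tokenised documents.
docs = [
    ["court", "grants", "appeal"],
    ["court", "dismisses", "appeal"],
    ["court", "court", "adjourns"],
]

# Shared vocabulary, sorted so every vector uses the same column order.
vocabulary = sorted({word for doc in docs for word in doc})


def bag_of_words(doc):
    """Count vector over the shared vocabulary; word order is discarded."""
    counts = Counter(doc)
    return [counts[word] for word in vocabulary]


def tf_idf(doc):
    """Weight each raw count by log(N / document frequency)."""
    n_docs = len(docs)
    weights = []
    for word, count in zip(vocabulary, bag_of_words(doc)):
        doc_freq = sum(1 for d in docs if word in d)
        weights.append(count * math.log(n_docs / doc_freq))
    return weights


print(vocabulary)
print(bag_of_words(docs[2]))
print(tf_idf(docs[2]))
```

Under this weighting, "court" receives a weight of zero because it appears in every document, while the rarer "adjourns" is weighted up. That is the intuition behind TF-IDF: words found everywhere tell documents apart poorly.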

Topic Modelling. Can a machine identify topics in a text corpus without any human supervision? In this module, we will examine how empirical relationships between words, topics, and documents can be exploited to classify and describe the content of large corpora. Models to be considered include Latent Dirichlet Allocation and Non-negative Matrix Factorization. The topics estimated by these models are weightings over words (probability distributions, in the case of Latent Dirichlet Allocation) and must be interpreted by the researcher.
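
As a rough sketch of Non-negative Matrix Factorization, the code below factorises a small document-term count matrix into document-topic and topic-word matrices using multiplicative updates, a standard textbook update rule. The counts, the vocabulary, the choice of two topics, and all function names are assumptions made for this example; in practice one would normally rely on an established library.

```python
import random

# Hypothetical document-term counts: rows are documents, columns are words.
vocabulary = ["contract", "breach", "damages", "murder", "sentence", "parole"]
counts = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 1, 2],
]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap the rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]


def nmf(v, n_topics, steps=500, seed=0, eps=1e-9):
    """Approximate V (docs x words) as W (docs x topics) times H (topics x words),
    keeping every entry non-negative, via multiplicative updates."""
    rng = random.Random(seed)
    n_docs, n_words = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_words)] for _ in range(n_topics)]
    for _ in range(steps):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_words)]
             for i in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_topics)]
             for i in range(n_docs)]
    return w, h


def reconstruction_error(v, w, h):
    """Sum of squared differences between V and its approximation W H."""
    approx = matmul(w, h)
    return sum((v[i][j] - approx[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def top_words(topic_row, k=3):
    """Return the k highest-weighted words for one topic."""
    ranked = sorted(zip(topic_row, vocabulary), reverse=True)
    return [word for _, word in ranked[:k]]


doc_topics, topic_words = nmf(counts, n_topics=2)
for topic in topic_words:
    print(top_words(topic))
print(reconstruction_error(counts, doc_topics, topic_words))
```

The model returns only weighted word lists. Deciding whether a topic is "about" contract disputes or criminal sentencing remains a judgment for the researcher.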

Word Embeddings. Apples and oranges are both fruits. But apples are red and oranges are, well, orange. Can we represent words as vectors in a way that captures such similarities and differences? We will study how algorithms such as word2vec generate word embeddings by training shallow neural networks to predict a word from its neighbouring words, or the neighbouring words from the word. Word embeddings have improved performance on many natural language processing tasks.
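
Once words are vectors, similarity can be measured geometrically, most often with cosine similarity. The vectors in the sketch below are hand-made for illustration and are not the output of word2vec or any trained model; real embeddings are learned from a corpus and usually have a hundred or more dimensions with no directly readable meaning.

```python
import math

# Hand-made 4-dimensional vectors, for illustration only.
# Dimensions loosely read as (fruit, red, orange-coloured, legal).
embeddings = {
    "apple": [0.9, 0.8, 0.1, 0.0],
    "orange": [0.9, 0.1, 0.9, 0.0],
    "statute": [0.0, 0.1, 0.0, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 means same direction,
    0 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


fruit_pair = cosine_similarity(embeddings["apple"], embeddings["orange"])
mixed_pair = cosine_similarity(embeddings["apple"], embeddings["statute"])
print(round(fruit_pair, 3), round(mixed_pair, 3))
```

With these toy vectors, "apple" is far closer to "orange" than to "statute", while the second and third dimensions still separate the two fruits by colour.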

1.3 Course teachers

Course convenor: Benjamin Chen
E-mail address: benched@hku.hk
Office: CCT 512
Consultation: By email

Learning Outcomes

2.1 Course Learning Outcomes (CLOs) for this course

CLO 1 Describe and explain basic approaches to text as data.

CLO 2 Describe and explain how quantitative methods for analysing text can be used to illuminate empirical relationships germane to law and legal institutions.

CLO 3 Apply their knowledge and skills to assess the viability of academic studies or commercial applications that apply quantitative methods for analysing text.

CLO 4 Demonstrate their knowledge of quantitative methods for analysing text by ideating plausible studies or applications and recognizing their limitations.

2.2 LLM and JD Programme Learning Outcomes (PLOs)

Please refer to the following links:

LLM – https://course.law.hku.hk/llm-plo/

JD – https://course.law.hku.hk/jd-plo/

2.3 Programme Learning Outcomes to be achieved in this course

PLO A PLO B PLO C PLO D PLO E PLO F
CLO 1
CLO 2
CLO 3
CLO 4

Assessment(s)

3.1 Assessment Summary

Assessment task Due date Weighting Feedback method* Course learning outcomes
Five in-class quizzes TBC 20% 1, 2, 3, 4
Programming assignment TBC 30% 1, 2, 3, 4
Proposal that applies law as data methods TBC 50% 1, 2, 3, 4
*Feedback method (to be determined by course teacher)
1 A general course report to be disseminated through Moodle
2 Individual feedback to be disseminated by email / through Moodle
3 Individual review meeting upon appointment
4 Group review meeting
5 In-class verbal feedback

3.2 Assessment Detail

To be advised by course convenor(s).

3.3 Grading Criteria

Please refer to the following link: https://www.law.hku.hk/_files/law_programme_grade_descriptors.pdf

Learning Activities

4.1 Learning Activity Plan

Seminar: 3 hours / week for 12 teaching weeks
Private study time: 9.5 hours / week for 12 teaching weeks

Remarks: The normative student study load per credit unit is 25 ± 5 hours (i.e. 150 ± 30 hours for a 6-credit course), which includes all learning activities and experiences inside and outside the classroom, as well as all assessment tasks, examinations, and associated preparation.

4.2 Details of Learning Activities

To be advised by course convenor(s).

Learning Resources

5.1 Resources

Reading materials: Posted on Moodle
Core reading list: TBA
Recommended reading list: TBA

5.2 Links

Please refer to the following link: http://www.law.hku.hk/course/learning-resources/