Announcing Introduction to Japanese NLP


I'm working on a book about Japanese natural language processing (NLP) with Masato Hagiwara. Introduction to Japanese Language Processing will be available in Japanese and English, and preorders should open this month.

A crop of the amazing cover by Nomi - check the book site for the whole thing.

I first became aware of Masato through his post reflecting on his first year as a freelance engineer. I contacted him earlier this year about working together on something, and we decided to write this book to fill what we see as a gap in Japanese-specific NLP resources for practitioners, particularly in English, where information is often limited and out of date.

I'll be writing the first half of the book, covering linguistic concepts, text representation and its quirks, tokenization and morphological analysis, and common datasets. In the second half, Masato will cover word embeddings and using Transformers to generate text, convert kana to kanji, classify documents, and do named entity recognition (NER). The full table of contents is on the book site.

There are many other books about Japanese NLP, but a lot of them are written for an advanced audience and delve into specific topics in depth. Our goal in this book is to help the everyman programmer accomplish their goals by giving an overview of the current landscape of tools and research, with working example code that can be used as a starting point for real applications.

Our plan is to publish on Leanpub, meaning we'll start selling the book before it's done. The early release will include sample chapters, and we'll update it as the other chapters are written. If you're interested, you can sign up for the mailing list at the official site to be notified of the release. And if you have any questions or anything you'd like to see in the book, just send Masato or me an email - we'd love to hear from you. Ψ