I have a keen interest in the intersection of software and language, so I've been collecting these resources on speech recognition and natural language processing. I hope you find them useful too.
Libraries and Tools
- Natural Language Toolkit (NLTK)
- Getting Started with Visual Studio
- Alexa Voice Service
- API.ai
- Wit.ai
- SyntaxNet
- NLP Compromise
Voice Banks and Corpora
- Forvo Dialect Database
- Switchboard phone conversations
- Shakespeare
- Google N-grams
Courses and Tutorials
- Behind the Mic: The Science of Talking to Computers
- Interacting with Particle Device Using Slack and NLP
- Get Started with Speech Recognition (Hackaday)
- Microsoft Introduction to Speech Recognition
- MIT OCW Automatic Speech Recognition
- The Scientist and Engineer’s Guide to Digital Signal Processing, Steven W. Smith, Ph.D.
- Stanford Natural Language Processing Course
- Columbia Natural Language Processing
- University of Michigan NLP Course
Concepts
Hidden Markov model (HMM)
Lemma: words that share the same stem, part of speech, and rough word sense
Token vs. type: a token is one occurrence of a word in running text; a type is a distinct word in the vocabulary
N = number of tokens
V = vocabulary = set of types
|V| is the size of the vocabulary
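To make the token and type counts concrete, here is a minimal Python sketch that computes N and |V| for a short sentence. The function name `token_type_counts`, the sample sentence, and the naive regex tokenizer (lowercase, runs of letters and apostrophes) are assumptions made for illustration; a real project would more likely use a proper tokenizer such as the one in NLTK.

```python
import re

def token_type_counts(text):
    """Return (N, V): the number of tokens and the set of types in text."""
    # Naive tokenizer (an assumption): lowercase the text, then keep
    # runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    vocabulary = set(tokens)  # V: each distinct token is one type
    return len(tokens), vocabulary

sentence = "The cat sat on the mat and the dog sat too"
N, V = token_type_counts(sentence)

print("N   =", N)        # number of tokens (every occurrence counts)
print("|V| =", len(V))   # number of types (distinct words only)
```

For this sentence N is 11 and |V| is 8, because "the" appears three times and "sat" twice. Note that how you tokenize and whether you fold case both change the counts, so N and |V| are only comparable across texts processed the same way.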
Papers / Research
- A Bunch of Papers on Philippine Linguistics
- Computational Approach to Filipino Speech Rhythm
- FiliText: A Filipino Hands-Free Text Messaging Application
- A Challenge Dataset for the Machine Comprehension of Text
- Automatic Speech Recognition for the Filipino Language Using the HTK System
- ACL Anthology