Crawling the Web With Lynx
Introduction: There are a few reasons you might want to use a text-based browser to crawl the web. For example, it makes natural language processing on web pages easier. I was doing this a year or two ago, and at the time I was unable to find a Python library that would […]
In: Python · Tagged with: nlp, web crawling
NLTK Regular Expression Parser (RegexpParser)
The Natural Language Toolkit (NLTK) provides a variety of tools for dealing with natural language. One such tool is the Regular Expression Parser. If you’re familiar with regular expressions, it can be a useful tool in natural language processing. Background Information: You must first be familiar with regular expressions to be able to fully utilize […]
In: Python · Tagged with: nlp, nltk
Spell Checking in Python
I was looking into spell checking in Python. I found spell4py and downloaded the zip, but couldn’t get it to build on my system. I might have managed it had I tried a bit longer, but in the end my own solution worked out fine. The library was overkill for my needs anyway. I found this article: http://code.activestate.com/recipes/117221/ This […]
In: Python · Tagged with: nlp, spelling checker
NLTK vs MontyLingua Part of Speech Taggers
This is a comparison of the part-of-speech taggers available in Python. As far as I know, these are the most prominent Python taggers. Let me know if you think another tagger should be added to the comparison. MontyLingua includes several natural language processing (NLP) tools. The ones that I used in this comparison […]
In: Python · Tagged with: benchmarks, nlp, taggers