Showing posts from August, 2018

How to solve 90% of NLP problems: a step-by-step guide

Text data is everywhere

Whether you are an established company or working to launch a new service, you can always leverage text data to validate, improve, and expand the functionalities of your product. The science of extracting meaning and learning from text data is an active topic of research called Natural Language Processing (NLP). NLP produces new and exciting results on a daily basis and is a very large field. However, having worked with hundreds of companies, the Insight team has seen a few key practical applications come up much more frequently than any other:

- Identifying different cohorts of users/customers (e.g. predicting churn, lifetime value, product preferences)
- Accurately detecting and extracting different categories of feedback (positive and negative reviews/opinions, mentions of particular attributes such as clothing size/fit…)
- Classifying text according to intent (e.g. request for basic help, urgent problem)

While many NLP papers and tutorials exist onlin

Basics of Image Processing in Python

Writing today’s article was a fascinating experience for me, and I hope reading it will be one for the readers of this blog. What’s so different? Two things: first, the article is about something I have wanted to do since I was five years old; second, the topic, tools, and algorithms are new to me as well. I am in no way a master of image processing, but the utility of this field simply blew my mind. Imagine being able to build an application that auto-tags photographs the way Facebook does, or to create your own face-recognition password for your laptop. In this article, I will pick a very simple but interesting application of image processing. We will use Python to do the image processing. In the next few articles, I will take on more complex examples of image processing. Here is the problem I will be working on in this article:

Problem Statement

Can you guess what this image is? You are right, it’s a typical picture of an open sky. I took this picture the night before last in Bangalore from my terrace

Introduction to Libraries of NLP in Python — NLTK vs. spaCy

Two significant libraries used in NLP are NLTK and spaCy. There are substantial differences between them, which are as follows:

- NLTK provides a plethora of algorithms to choose from for a particular problem, which is a boon for a researcher but a bane for a developer. spaCy, by contrast, keeps the best algorithm for a problem in its toolkit and keeps it updated as the state of the art improves.
- NLTK supports various languages, whereas spaCy has statistical models for 7 languages (English, German, Spanish, French, Portuguese, Italian, and Dutch). It also offers multi-language support for named entities.
- NLTK is a string processing library. It takes strings as input and returns strings or lists of strings as output. spaCy, by contrast, uses an object-oriented approach. When we parse a text, spaCy returns a document object whose words and sentences are objects themselves.
- spaCy has support for word vectors, whereas NLTK does not.
- As spaCy uses the latest and best algorithms, its performance is usually
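
The difference between a string-in/strings-out interface and an object-oriented one can be illustrated with a toy sketch in plain Python. This is not the API of NLTK or spaCy: the names `tokenize_to_strings`, `parse_to_doc`, `Token`, and `Doc` are hypothetical and exist only to show the two styles side by side.

```python
import re
from dataclasses import dataclass
from typing import Iterator, List

# Hypothetical tokenizer rule: runs of word characters, or single punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize_to_strings(text: str) -> List[str]:
    """String-processing style: a string goes in, a list of strings comes out."""
    return TOKEN_RE.findall(text)


@dataclass
class Token:
    """Object style: each word is an object that carries its own attributes."""
    text: str
    start: int  # character offset of the token in the original text

    @property
    def is_alpha(self) -> bool:
        return self.text.isalpha()


@dataclass
class Doc:
    """A parsed document that keeps the original text and its token objects."""
    text: str
    tokens: List[Token]

    def __iter__(self) -> Iterator[Token]:
        return iter(self.tokens)


def parse_to_doc(text: str) -> Doc:
    """Object style: a string goes in, a document object comes out."""
    tokens = [Token(m.group(), m.start()) for m in TOKEN_RE.finditer(text)]
    return Doc(text, tokens)


sentence = "spaCy returns objects."

# With plain strings, any extra information has to be recomputed by the caller.
strings = tokenize_to_strings(sentence)

# With objects, each token can be asked about itself.
doc = parse_to_doc(sentence)
alpha_words = [token.text for token in doc if token.is_alpha]
offsets = [token.start for token in doc]
```

With the string style, the caller gets only the token text. With the object style, position and other attributes travel with each token, at the cost of a heavier data structure.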
