Given a text or string, you can create an image of its words in which the size of each word depends on how often it occurs in the text. Such an image is called a Word Cloud. With Python's wordcloud and matplotlib libraries, a Data Scientist can generate a Word Cloud from an input string or text in a few lines of code.
The following is a Word Cloud generated for the Mother Goose poem "Hey, diddle, diddle".
Uses of a Word Cloud:
Word Clouds make great visualization tools. They give a quick picture of which words are used most and least frequently in a text or document. By highlighting important (frequently occurring) words, that information can guide writers, bloggers, and companies that receive feedback and comments from clients. Keep in mind that a little text preprocessing before generating a Word Cloud, such as lowercasing the text and removing common stop words, makes a big difference.
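To make the preprocessing point concrete, here is a minimal sketch that uses only the Python standard library. It lowercases a text, drops a few stop words, and counts what remains, which is roughly the information a Word Cloud turns into word sizes. The helper names, the tiny stop-word list, and the wording of the rhyme (one common version) are assumptions made for illustration, not part of the original code.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "and", "the", "to", "with"}

def preprocess(text):
    """Lowercase the text, split it into words, and drop stop words."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]

def word_frequencies(text):
    """Count how often each remaining word occurs."""
    return Counter(preprocess(text))

# One common wording of the rhyme (assumed here).
poem = (
    "Hey, diddle, diddle, the cat and the fiddle, "
    "the cow jumped over the moon; "
    "the little dog laughed to see such sport, "
    "and the dish ran away with the spoon."
)

freqs = word_frequencies(poem)
print(freqs.most_common(1))  # [('diddle', 2)]
```

Without the stop-word step, "the" would be the most frequent word and would dominate the picture, which is why even light preprocessing changes the result so much.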
Python code needed for generating Word Clouds:
Import Libraries and Set System Options
Define Functions
Generate Word Cloud for the Mother Goose poem "Humpty Dumpty sat on a wall"
Generate Word Cloud for the Mother Goose poem "How much wood could a woodchuck chuck ..."
Generate Word Cloud for the Mother Goose poem "Jack be nimble"
Conclusion:
Word Clouds are colorful and informative, and they give a quick visual of the most and least frequently occurring words in a text or document.
Happy Learning!